WaterCrawl Ecosystem

Discover the plugins and libraries that extend WaterCrawl's capabilities, from AI-powered content extraction to API client libraries.

Official Plugins

Enhance your crawling capabilities with our official plugins. Each plugin is designed to add specific functionality to your WaterCrawl projects.

WaterCrawl Dify Plugin

v0.3.1 | MIT License

This plugin allows you to easily connect Dify with WaterCrawl.

Key Features

  • Scrape Tool - Single URL content extraction in various formats (markdown, HTML, JSON, screenshot)
  • Crawl Tool - Web crawling with configurable URL patterns and image alt text generation
  • Crawl Job Tool - Retrieve and manage scraping results
  • API key-based authentication

Installation

Download the .difypkg file from GitHub Releases, or install the plugin from the Dify Marketplace.

WaterCrawl Plugin Library

v1.0.0 | MIT License

Base library for creating plugins for the WaterCrawl web crawling framework.

Key Features

  • Abstract base classes for plugin development
  • JSON Schema-based input validation
  • Pipeline processing support
  • Spider and Downloader middleware integration
  • Type hints and comprehensive documentation

Installation

pip install watercrawl-plugin
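
To make the feature list above more concrete, here is a minimal sketch of how an abstract base class, schema-based input validation, and pipeline processing can fit together. It uses only the Python standard library and does not reproduce the watercrawl-plugin API: the names BasePlugin, validate_input, WordCountPlugin, and run_pipeline are hypothetical, and the validator covers only a small subset of JSON Schema (required keys and primitive types).

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any


class PluginInputError(ValueError):
    """Raised when plugin options do not match the declared schema."""


# Maps the JSON Schema type names used in this sketch to Python types.
_JSON_TYPES = {"string": str, "integer": int, "boolean": bool, "object": dict}


def validate_input(schema: dict[str, Any], data: dict[str, Any]) -> None:
    """Check required keys and primitive types (a tiny subset of JSON Schema)."""
    for key in schema.get("required", []):
        if key not in data:
            raise PluginInputError(f"missing required field: {key}")
    for key, spec in schema.get("properties", {}).items():
        expected = _JSON_TYPES.get(spec.get("type"))
        if key in data and expected and not isinstance(data[key], expected):
            raise PluginInputError(f"field {key!r} must be of type {spec['type']}")


class BasePlugin(ABC):
    """Hypothetical base class: subclasses declare a schema and one processing step."""

    input_schema: dict[str, Any] = {}

    def __init__(self, options: dict[str, Any]) -> None:
        # Options are validated once, when the plugin is constructed.
        validate_input(self.input_schema, options)
        self.options = options

    @abstractmethod
    def process_item(self, item: dict[str, Any]) -> dict[str, Any]:
        """Transform one crawled item and return it for the next stage."""


class WordCountPlugin(BasePlugin):
    """Example plugin that adds a word count for a configurable text field."""

    input_schema = {
        "type": "object",
        "properties": {"field": {"type": "string"}},
        "required": ["field"],
    }

    def process_item(self, item: dict[str, Any]) -> dict[str, Any]:
        text = item.get(self.options["field"], "")
        return {**item, "word_count": len(text.split())}


def run_pipeline(plugins: list[BasePlugin], item: dict[str, Any]) -> dict[str, Any]:
    """Pass the item through each plugin in order."""
    for plugin in plugins:
        item = plugin.process_item(item)
    return item


result = run_pipeline(
    [WordCountPlugin({"field": "content"})],
    {"url": "https://example.com", "content": "hello crawling world"},
)
print(result["word_count"])  # prints 3
```

Validating options in the constructor means a misconfigured plugin fails before any page is crawled, rather than partway through a job. For real projects, consult the library's own documentation for the actual base classes and middleware hooks.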

WaterCrawl-OpenAI Plugin

v0.0.1 | MIT License

Use OpenAI's LLM to extract information from crawled content.

Key Features

  • OpenAI integration for content extraction
  • Configurable system prompts
  • Environment-based configuration
  • MIT licensed

Installation

pip install watercrawl-openai
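
As an illustration of environment-based configuration with a configurable system prompt, the sketch below loads settings from environment variables into a small config object. It is not the plugin's actual configuration code. OPENAI_API_KEY is the variable the OpenAI SDK conventionally reads; EXTRACT_MODEL, EXTRACT_SYSTEM_PROMPT, the "example-model" placeholder, and the ExtractionConfig class are assumptions made for this example.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass

# Fallback prompt used when no override is provided (illustrative wording).
DEFAULT_PROMPT = "Extract the main facts from the page as concise bullet points."


@dataclass(frozen=True)
class ExtractionConfig:
    """Hypothetical settings object for an LLM extraction step."""

    api_key: str
    model: str
    system_prompt: str


def load_config(env: Mapping[str, str] = os.environ) -> ExtractionConfig:
    """Build the config from environment variables, failing early if the key is missing."""
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return ExtractionConfig(
        api_key=api_key,
        # EXTRACT_MODEL and EXTRACT_SYSTEM_PROMPT are hypothetical variable names.
        model=env.get("EXTRACT_MODEL", "example-model"),
        system_prompt=env.get("EXTRACT_SYSTEM_PROMPT", DEFAULT_PROMPT),
    )


# Passing a plain dict instead of os.environ keeps the example reproducible.
config = load_config(
    {"OPENAI_API_KEY": "sk-test", "EXTRACT_SYSTEM_PROMPT": "Return the page title only."}
)
print(config.system_prompt)  # prints Return the page title only.
```

Keeping the prompt in the environment lets the same deployment serve different extraction tasks without code changes. Check the plugin's README for the variable names it actually reads.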

Core Libraries

Build and extend WaterCrawl with our core libraries. These libraries provide the foundation for creating plugins and integrating with the WaterCrawl API.

WaterCrawl Python Client

v1.0.0 | MIT License

Python client library for interacting with the WaterCrawl API.

Key Features

  • Simple and intuitive API client
  • Support for both synchronous and asynchronous crawling
  • Comprehensive crawling options
  • Built-in request monitoring
  • Efficient session management

Installation

pip install watercrawl-py

WaterCrawl Node.js Client

v1.0.0 | MIT License

Node.js client library for interacting with the WaterCrawl API.

Key Features

  • Simple and intuitive API client
  • Support for both synchronous and asynchronous crawling
  • Comprehensive crawling options
  • Built-in request monitoring
  • Efficient session management

Installation

npm install @watercrawl/nodejs