AnyCrawl

Provides advanced web scraping, crawling, and search capabilities, integrating with large language model clients via the Model Context Protocol.

About

AnyCrawl Server is a backend for LLM clients such as Cursor and Claude that lets them extract data from the live web. It scrapes content from single pages, crawls entire websites to a configurable depth, and runs search-engine queries and scrapes the results. It supports several scraping engines (Playwright, Cheerio, and Puppeteer) and returns output as Markdown, HTML, JSON, or screenshots, giving LLMs access to up-to-date web information.

Key Features

  • Comprehensive Web Scraping and Website Crawling
  • Integrated Web Search with Result Scraping
  • Support for Multiple Scraping Engines (Playwright, Cheerio, Puppeteer)
  • Flexible Output Formats (Markdown, HTML, Text, JSON, Screenshots)
  • Multiple Deployment Modes (STDIO, SSE Server, HTTP Streamable Server)
  • 1 GitHub star
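
To make the engine and output-format options above concrete, here is a minimal sketch of how a client might assemble and validate a scrape request before handing it to the server. The class name `ScrapeRequest`, the field names (`url`, `engine`, `formats`), the lowercase identifiers, and the default values are all assumptions made for illustration; they are not taken from AnyCrawl's documented API, so check the project's own documentation for the real tool schema.

```python
from dataclasses import dataclass, field

# Engines and formats named in the feature list above.
# The lowercase spellings are an assumption, not AnyCrawl's documented values.
ENGINES = {"playwright", "cheerio", "puppeteer"}
FORMATS = {"markdown", "html", "text", "json", "screenshot"}


@dataclass
class ScrapeRequest:
    """Hypothetical request shape; field names are illustrative only."""

    url: str
    engine: str = "cheerio"
    formats: list = field(default_factory=lambda: ["markdown"])

    def __post_init__(self):
        # Reject options that are not in the feature list.
        if self.engine not in ENGINES:
            raise ValueError(f"unknown engine: {self.engine}")
        unknown = set(self.formats) - FORMATS
        if unknown:
            raise ValueError(f"unknown formats: {sorted(unknown)}")

    def to_payload(self) -> dict:
        # Plain dict that could be serialized as tool-call arguments.
        return {"url": self.url, "engine": self.engine, "formats": list(self.formats)}


request = ScrapeRequest(
    "https://example.com",
    engine="playwright",
    formats=["markdown", "screenshot"],
)
payload = request.to_payload()
```

Validating on the client side like this catches a mistyped engine or format name before any network call is made; the server remains the authority on which options it actually accepts.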

Use Cases

  • Enhancing LLM client capabilities with real-time web access and content extraction.
  • Conducting comprehensive website analysis and bulk data collection for research.
  • Automating information gathering from search engine results.