AnyCrawl

Provides advanced web scraping, crawling, and search capabilities, integrating with large language model clients via the Model Context Protocol.

About

AnyCrawl Server is a backend for LLM clients such as Cursor and Claude that lets them interact with the live web for data extraction. It can scrape content from single pages, crawl entire websites to a configurable depth, and query search engines and scrape the results. It supports multiple scraping engines (Playwright, Cheerio, and Puppeteer) and several output formats, including Markdown, HTML, text, JSON, and screenshots, giving LLMs access to up-to-date web information.
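
To illustrate what crawling "with configurable depth" means in general, here is a minimal sketch of a breadth-first crawl that stops expanding links at a depth limit. It is not AnyCrawl's API or implementation: the `crawl` function, the `get_links` callback, and the in-memory `SITE` link map are all hypothetical names chosen for this example, and no network access is performed.

```python
from collections import deque

def crawl(start, get_links, max_depth):
    """Breadth-first crawl from `start`, following links at most `max_depth` hops.

    `get_links` is a hypothetical callback returning the outgoing links of a page.
    Returns a list of (url, depth) pairs in visit order.
    """
    seen = {start}
    visited = []
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth == max_depth:
            continue  # pages at the depth limit are visited but not expanded
        for link in get_links(url):
            if link not in seen:  # skip pages already queued or visited
                seen.add(link)
                queue.append((link, depth + 1))
    return visited

# Hypothetical site structure standing in for real pages (assumption for the demo).
SITE = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/api", "/"],
    "/blog": ["/blog/post-1"],
    "/docs/api": [],
    "/blog/post-1": [],
}

pages = crawl("/", lambda url: SITE.get(url, []), max_depth=1)
```

With `max_depth=1` only the start page and the pages it links to directly are visited; raising the limit to 2 also reaches `/docs/api` and `/blog/post-1`. A real crawler would replace `get_links` with a fetch-and-parse step performed by one of the scraping engines.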

Key Features

  • Comprehensive Web Scraping and Website Crawling
  • Integrated Web Search with Result Scraping
  • Support for Multiple Scraping Engines (Playwright, Cheerio, Puppeteer)
  • Flexible Output Formats (Markdown, HTML, Text, JSON, Screenshots)
  • Multiple Deployment Modes (STDIO, SSE Server, HTTP Streamable Server)
  • 1 GitHub star

Use Cases

  • Enhancing LLM client capabilities with real-time web access and content extraction.
  • Conducting comprehensive website analysis and bulk data collection for research.
  • Automating information gathering from search engine results.