FreeCrawl
Provides a self-hosted Model Context Protocol server for JavaScript-enabled web scraping and versatile document processing.
Introduction
FreeCrawl is a self-hosted Model Context Protocol (MCP) server for web scraping and document processing. Positioned as a direct replacement for Firecrawl, it provides JavaScript-enabled scraping with anti-detection measures, SQLite-backed caching, concurrent batch processing, and error handling. It is designed for production environments, extracts and transforms data from web pages and documents, and integrates with AI tools such as Claude Code.
Key Features
- Rate limiting per domain
- Document processing with fallback support for various formats
- JavaScript-enabled web scraping with Playwright and anti-detection measures
- Concurrent batch processing with configurable limits
- Intelligent caching with SQLite backend
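As a rough illustration of two of the features above, per-domain rate limiting and SQLite-backed caching, here is a minimal sketch. It is not FreeCrawl's implementation, which this listing does not show. The class names `DomainRateLimiter` and `SqliteCache`, the table layout, and the TTL policy are all assumptions made for the example.

```python
import sqlite3
import time
from urllib.parse import urlparse


class DomainRateLimiter:
    """Hypothetical helper: enforce a minimum interval between requests per domain."""

    def __init__(self, min_interval: float, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self._next_allowed = {}  # domain -> earliest time the next request may start

    def wait_time(self, url: str) -> float:
        """Reserve the next slot for the URL's domain and return the seconds to wait."""
        domain = urlparse(url).netloc
        now = self.clock()
        start = max(now, self._next_allowed.get(domain, now))
        self._next_allowed[domain] = start + self.min_interval
        return start - now


class SqliteCache:
    """Hypothetical helper: URL -> content cache with a time-to-live, stored in SQLite."""

    def __init__(self, path: str = ":memory:", ttl: float = 3600.0, clock=time.time):
        self.ttl = ttl
        self.clock = clock
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "url TEXT PRIMARY KEY, content TEXT NOT NULL, stored_at REAL NOT NULL)"
        )

    def get(self, url: str):
        """Return cached content, or None if the entry is missing or expired."""
        row = self.db.execute(
            "SELECT content, stored_at FROM cache WHERE url = ?", (url,)
        ).fetchone()
        if row is None or self.clock() - row[1] > self.ttl:
            return None
        return row[0]

    def put(self, url: str, content: str) -> None:
        """Store or overwrite the content for a URL with the current timestamp."""
        self.db.execute(
            "INSERT OR REPLACE INTO cache (url, content, stored_at) VALUES (?, ?, ?)",
            (url, content, self.clock()),
        )
        self.db.commit()
```

A scraper built on these helpers would typically check `SqliteCache.get(url)` first, and only on a miss sleep for `DomainRateLimiter.wait_time(url)` seconds before fetching and calling `put`. Both classes take an injectable clock so the behaviour can be tested without real delays.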
Use Cases
- Scraping dynamic web content for data collection
- Processing and converting various document formats (PDF, DOCX) to structured data
- Extracting structured data from web pages using JSON schema
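To make the schema-driven use case above concrete, the following sketch pulls a title and links out of an HTML page and checks the result against a JSON schema. It uses only the Python standard library and supports a small subset of JSON Schema (`type`, `required`, `properties`, `items`). The names `TitleAndLinks`, `conforms`, and `extract` are hypothetical, and FreeCrawl's own extraction mechanism may work quite differently.

```python
from html.parser import HTMLParser


class TitleAndLinks(HTMLParser):
    """Collect the <title> text and every <a href=...> value from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Mapping from JSON Schema type names to Python types (subset only).
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def conforms(value, schema: dict) -> bool:
    """Check a value against a small subset of JSON Schema."""
    kind = schema.get("type")
    if kind is not None:
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        if isinstance(value, bool) and kind != "boolean":
            return False
        if not isinstance(value, JSON_TYPES[kind]):
            return False
    if kind == "object":
        if any(key not in value for key in schema.get("required", [])):
            return False
        return all(
            conforms(value[key], sub)
            for key, sub in schema.get("properties", {}).items()
            if key in value
        )
    if kind == "array" and "items" in schema:
        return all(conforms(item, schema["items"]) for item in value)
    return True


def extract(html: str, schema: dict) -> dict:
    """Build a record from the page and raise if it does not match the schema."""
    parser = TitleAndLinks()
    parser.feed(html)
    record = {"title": parser.title.strip(), "links": parser.links}
    if not conforms(record, schema):
        raise ValueError("extracted record does not match schema")
    return record
```

For full JSON Schema support, a dedicated validation library would be a better fit than this hand-rolled subset; the sketch only shows the shape of the validate-after-extract step.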