Web Scraping & Data Collection MCP Servers
Discover our curated collection of MCP servers for web scraping and data collection. Browse 425 servers to find the ones that fit your needs.
Firecrawl
Gives LLMs advanced web scraping capabilities for content extraction, crawling, and search.
Browserbase
Enables LLMs to control cloud browsers for web interaction, data extraction, and task automation using Browserbase and Stagehand.
Fetcher
Fetches web page content using a Playwright headless browser, enabling JavaScript execution and intelligent content extraction.
DevDocs
Crawls, extracts, and organizes technical documentation into an LLM-ready format, streamlining research and implementation for developers.
YouTube
Downloads YouTube subtitles using yt-dlp and connects to claude.ai via the Model Context Protocol.
Fetch
Fetches and transforms web content into various formats.
Tavily
Integrates Tavily's search and data extraction capabilities with AI assistants via the Model Context Protocol.
Web Research
Enables Claude to access real-time information from the web for enhanced research capabilities.
Google Search
Bypasses search engine anti-scraping mechanisms to execute Google searches and extract results.
Hyperbrowser
Scrapes webpages, extracts structured data, and crawls websites while providing access to browser agents.
Financial Datasets
Provides access to real-time and historical stock market data for AI assistants through the Model Context Protocol (MCP).
Apify Actors
Enables AI applications to use Apify Actors as tools for performing specific tasks like data extraction and web scraping.
Graphlit
Enables integration between Model Context Protocol (MCP) clients and the Graphlit service for content ingestion and knowledge retrieval.
ReAct Web Search
Integrates web search capabilities into AI assistant frameworks using the Exa API for real-time, markdown-formatted results.
Search1API
Provides search and crawl functionality using Search1API through a Model Context Protocol (MCP) server.
Playwright
Enables browser automation capabilities using Playwright for LLMs to interact with web pages.
YouTube Transcript
Retrieves transcripts directly from YouTube videos using a simple interface.
Rag Web Browser
Enables AI agents and LLMs to interact with the web and extract information from web pages via the RAG Web Browser Actor.
Fetch
Fetches URLs and YouTube video transcripts through an MCP server.
A-Stock Data
Provides A-share (China stock market) data to large language models via the Model Context Protocol (MCP).