Web Scraping & Data Collection MCP Servers

Discover our curated collection of MCP servers for web scraping & data collection. Browse 425 servers and find the ones that fit your needs.

Firecrawl

2,119

Empowers LLMs with advanced web scraping capabilities for content extraction, crawling, and search functionalities.

Browserbase

652

Enables LLMs to control cloud browsers for web interaction, data extraction, and task automation using Browserbase and Stagehand.

Fetcher

490

Fetches web page content using a Playwright headless browser, enabling JavaScript execution and intelligent content extraction.

DevDocs

458

Crawls, extracts, and organizes technical documentation into an LLM-ready format, streamlining research and implementation for developers.

YouTube

271

Downloads YouTube subtitles using yt-dlp and connects to claude.ai via the Model Context Protocol.

Fetch

218

Fetches and transforms web content into various formats.

Tavily

214

Integrates Tavily's search and data extraction capabilities with AI assistants via the Model Context Protocol.

Web Research

211

Enables Claude to access real-time information from the web for enhanced research capabilities.

Google Search

184

Bypasses search engine anti-scraping mechanisms to execute Google searches and extract results.

Hyperbrowser

181

Scrapes webpages, extracts structured data, and crawls websites while providing access to browser agents.

Financial Datasets

172

Provides access to real-time and historical stock market data for AI assistants through the Model Context Protocol (MCP).

Apify Actors

132

Enables AI applications to use Apify Actors as tools for performing specific tasks like data extraction and web scraping.

Graphlit

131

Enables integration between Model Context Protocol (MCP) clients and the Graphlit service for content ingestion and knowledge retrieval.

ReAct Web Search

124

Integrates web search capabilities into AI assistant frameworks using the Exa API for real-time, markdown-formatted results.

Search1API

110

Provides search and crawl functionality using Search1API through a Model Context Protocol (MCP) server.

Playwright

109

Enables browser automation capabilities using Playwright for LLMs to interact with web pages.

YouTube Transcript

106

Retrieves transcripts directly from YouTube videos using a simple interface.

Rag Web Browser

97

Enables AI agents and LLMs to interact with the web and extract information from web pages via the RAG Web Browser Actor.

Fetch

95

Fetches URLs and YouTube video transcripts through an MCP server.

A-Stock Data

92

Provides A-share (China stock market) data to large language models via the Model Context Protocol (MCP).
