Web Scraping & Data Collection MCP Servers

Discover our curated collection of MCP servers for web scraping & data collection. Browse 534 servers and find the perfect MCPs for your needs.

GPT Researcher

21,325

Conducts in-depth web and local research on any topic, generating comprehensive reports with citations.

Skyvern

13,306

Automates browser-based workflows using LLMs and computer vision for robust and adaptable web interactions.

Trafilatura

4,220

Extracts text and metadata from web pages and online resources, offering various output formats.

YouTube Transcript

3,872

Retrieves transcripts and subtitles from YouTube videos, including automatically generated ones, without requiring an API key or headless browser.

ENScan Go

3,591

Collects and aggregates information on companies registered in China from various sources to aid in reconnaissance tasks.

Firecrawl

3,042

Empowers LLMs with advanced web scraping capabilities for content extraction, crawling, and search functionalities.

Agent Twitter Client

1,638

Automates interactions with Twitter, including scraping data, sending tweets, and engaging with Grok AI, all without needing the official Twitter API.

Browserbase

1,609

Enables LLMs to control cloud browsers for web interaction, data extraction, and task automation using Browserbase and Stagehand.

DevDocs

1,483

Crawls, extracts, and organizes technical documentation into an LLM-ready format, streamlining research and implementation for developers.

Browser

1,401

Enables AI applications to control a user's existing browser instance.

Zenfeed

800

Empowers RSS with AI to automatically filter, summarize, and deliver important information, reducing information overload.

Fetcher

669

Fetches web page content using a Playwright headless browser, enabling JavaScript execution and intelligent content extraction.

Mobile Next

558

Enables scalable mobile automation through a platform-agnostic interface for interacting with native iOS/Android applications and devices.

RedNote

440

Accesses content from the RedNote (XiaoHongShu) platform via an MCP server.

Tavily

370

Integrates Tavily's search and data extraction capabilities with AI assistants via the Model Context Protocol.

Fetch

359

Fetches and transforms web content into various formats.

YouTube

331

Downloads YouTube subtitles using yt-dlp and connects to claude.ai via the Model Context Protocol.

Hyperbrowser

261

Scrapes webpages, extracts structured data, and crawls websites while providing access to browser agents.

Web Research

248

Enables Claude to access real-time information from the web for enhanced research capabilities.

Graphlit

248

Enables integration between Model Context Protocol (MCP) clients and the Graphlit service for content ingestion and knowledge retrieval.
