Discover our curated collection of MCP servers for web scraping & data collection. Browse 2646 servers and find the perfect MCPs for your needs.
Automates LinkedIn job applications and feed exploration through an MCP server.
Provides access to browse, filter, analyze, and download datasets hosted on the Hugging Face Hub.
Extracts content from PDF files using a local file path.
Automates web browser interactions via Playwright, driven by Azure OpenAI and the Model Context Protocol.
Integrates with SearXNG to provide privacy-focused meta search capabilities.
Scrapes Weibo user information and feeds, and performs content searches.
Extracts transcripts from YouTube videos, enabling content analysis and processing.
Extracts and formats web content using the Jina AI Reader API for use with large language models.
Provides access to Jina AI's web services through Claude, enabling web page reading, web search, and fact-checking.
Enables interaction between web pages and a local server by establishing a WebSocket connection, allowing access to browser APIs and DOM elements.
Enables searching Google Patents information via the SerpApi Google Patents API.
Integrates local and remote tools with AI agents using the Model Context Protocol (MCP) within the AutoGen framework.
Automates Google Chrome through AI agents via a WebSocket connection to a browser extension.
Enables large language models to retrieve and parse web content, converting it into clean Markdown.
Enables AI agents and applications to access and extract real-time web data.
Fetches and parses standard RSS/Atom feeds, including specialized support for RSSHub, to deliver structured content to language models and other Model Context Protocol clients.
Enables AI clients such as Claude Desktop to interact with Telegram channels and groups for content scraping and analysis.
Enables AI assistants to extract structured information from unstructured text using Google's langextract library through an optimized Model Context Protocol interface.
Provides programmatic access to Kagi's search and summarization services using a session token, ideal for integration with AI agents and LLMs.
Compresses web page HTML into a structured, actionable map, enabling AI agents to efficiently read and interact with web content at significantly reduced token counts.