Web Scraping & Data Collection MCP Servers
Discover our curated collection of MCP servers for web scraping & data collection. Browse 1045 servers and find the perfect MCPs for your needs.
Marginalia
Accesses the Marginalia Search engine, focusing on independent and non-commercial web content.
Sonata
Orchestrates browser automation for government services, transforming bureaucratic web interfaces into programmable APIs for LLM agents.
Jason's Servers
Sets up and configures a collection of Model Context Protocol (MCP) servers for various tasks.
Perplexity
Enables AI assistants to access and utilize the Perplexity API for search and information retrieval.
Playwright
Enables LLM-powered clients to control a browser for automation tasks via the Model Context Protocol.
MVG Störung
Provides cached access to Munich Public Transport (MVG) disruption data.
Webtools
Provides web analysis tools, including HTML extraction, markdown conversion, screenshot capture, performance analysis, and Lighthouse audits.
Research Server
Searches, extracts, and locally stores information about research papers from arXiv.
Chef Agent
Leverages a Neo4j-backed knowledge graph and AI agents to provide real-time recipe querying, web scraping, dynamic graph updates, and personalized cooking assistance.
CleanWeb
Extracts and cleans core web content, filtering ads and converting it into a pristine Markdown format.
CGV Cinema API
Provides a Python client for interacting with the CGV Cinema mobile API to access movie listings, locations, schedules, and seat maps.
Job Search Node
Scrapes LinkedIn job listings, performs AI-driven analysis against a candidate profile, persistently indexes relevant jobs, and offers an API for management and retrieval.
StockScreener
Analyzes stock data locally using a language model and web scraping.
Query Blogger
Provides secure, efficient access to Blogger blog content, enabling large language models to retrieve blog information and the latest posts.
Youtube Transcript
Retrieves transcripts from YouTube videos using the Model Context Protocol.
Master Puppeteer
Orchestrates advanced browser automation tasks using Puppeteer, providing token-efficient and comprehensive web data for AI agents.
Web Reader
Enables large language models to effortlessly retrieve and parse web content, converting it into clean Markdown format.
Perplexity
Enables web searches using the Perplexity AI API.
Google Search & Webpage Reader
Provides web search capabilities via Google Custom Search API and extracts content from any webpage.
Scrappey
Controls a browser through an LLM using Scrappey's web automation and scraping capabilities.