Discover our curated collection of MCP servers for web scraping and data collection. Browse 2,298 servers and find the right ones for your needs.
Automates browser interactions using Pyppeteer for tasks like navigation, screenshot capture, and element interaction.
Enables AI models to search the web and retrieve up-to-date information using the Tavily API.
Proxies requests to the Google Programmable Search Engine, providing caching, rate-limiting, and metrics.
Automates web browser interactions for Large Language Models using structured accessibility data instead of visual data.
Enables real-time web search, intelligent data extraction, structured website mapping, and systematic web crawling through Tavily's suite of tools, supporting both SSE and STDIO transport protocols.
Transforms UK census and other bulk data into a unified statistical lookup table, enabling multi-resolution geospatial and demographic analysis through a flexible query interface.
Manages AI-powered web search collections and data using Exa's Websets API.
Provides real-time weather alerts and detailed forecasts from the US National Weather Service via a Model Context Protocol (MCP) server.
Provides real-time information on Singapore hawker centre closures and cleaning schedules using official government data.
Provides comprehensive Google search functionality through a plugin-based architecture, leveraging external APIs like Serper and SerpApi.
Accesses real-time environmental data from the Facultad de Informática, UNLP weather station in La Plata, Argentina.
Automates the collection of daily KPI reports from a specified internal system, generating structured Markdown reports for a given date range.
Provides an intelligent data pipeline for high-fidelity crawling and extraction of technical documentation, optimized for AI agents.
Processes arXiv LaTeX sources to enable precise interpretation of mathematical expressions by Large Language Models.
Fetches YouTube video subtitles and transcripts, converting them into multiple formats including SRT, VTT, TXT, and JSON.
Extracts time tracking data from Time Doctor and generates CSV reports, with AI assistant integration.
Extracts comprehensive track metadata, audio preview URLs, and cover artwork from Beatport track URLs.
Ingests and searches documentation from GitHub repositories and web pages in real time to provide up-to-date context for AI assistants.
Fetches page text from a specified URL, offering options for character limits and request timeouts.
Provides comprehensive video tools for transcript retrieval, downloading, and AI-driven subtitle generation across various platforms.