Discover Agent Skills for web scraping & data collection. Browse 15 skills for Claude, ChatGPT & Codex.
Automates the end-to-end processing and metadata curation of genome assembly datasets for VEuPathDB resources.
Performs structured web searches via DuckDuckGo to retrieve real-time documentation, library information, and technical solutions.
Fetches and downloads content from any URL using the wget command-line utility.
Crawls entire websites and builds searchable full-text indexes of content converted into Markdown format.
Automates the end-to-end lifecycle of discovering, validating, building, and publishing Model Context Protocol (MCP) servers and automation tools.
Performs concurrent, recursive web scraping to extract clean text content while maintaining URL-based directory structures and respecting domain boundaries.
Optimizes web searching and source reliability filtering by delegating to Gemini and categorizing results into authority tiers.
Aggregates and preprocesses high-quality AI programming tips and best practices from major developer platforms.
Automates information gathering from web searches and authoritative sources to generate and save structured research reports.
Fetches web documentation, extracts specific topics using AI subagents, and generates structured Markdown summaries.
Conducts comprehensive market analysis and trend forecasting across the consumer, technology, healthcare, and finance sectors.
Conducts enterprise-grade company research, competitive analysis, and market intelligence using professional web scraping and search tools.
Installs and configures the YouTube Info MCP server to enable automated extraction of video metadata and details within Claude Code.
Automates the installation and configuration of the Pure.md MCP server for seamless web-to-markdown conversion within Claude Code.
Extracts and correlates high-quality food images from delivery platforms and restaurant websites to populate digital catalogs and menus.
Integrates Gemini AI models with real-time web content through Google Search grounding to provide verified information and automatic citations.
Converts entire PDF documents into clean, structured Markdown while preserving formatting, tables, and images for seamless context loading.
Scrapes web pages and WeChat articles to produce clean, noise-free Markdown content for processing, translation, or archival.
Downloads audio and video from thousands of websites with advanced control over formats, subtitles, and metadata.
Crawls entire websites and extracts clean, structured content into Markdown files with AI-enriched metadata.
Orchestrates the discovery, retrieval, and organization of academic literature to facilitate theoretical pattern extraction and qualitative research synthesis.
Isolates specific XML elements from large source files while maintaining structural integrity and formatting.
Processes, analyzes, and transforms various file formats into structured data or new document types using a standardized CLI.
Extracts and converts YouTube transcripts or auto-generated captions into clean, readable text files.
Downloads high-quality videos and extracts audio from over 1,000 platforms, including YouTube, Bilibili, and TikTok, using yt-dlp.
Enforces a rigorous, test-driven development workflow for building and maintaining web scrapers and data extraction agents.
Retrieves real-time information, news, images, and videos from the web using DuckDuckGo to provide up-to-date data and resources.
Searches the internet and converts live webpage content into Markdown for real-time information retrieval and analysis.
Analyzes AI tool URLs to extract metadata and automatically categorizes and adds them to the awesome-ai-tools repository.
Scrapes Australian creative writing competitions and automatically manages them as structured GitHub issues with intelligent duplicate detection.