Discover Agent Skills for web scraping & data collection. Browse 17 skills for Claude, ChatGPT & Codex.
Extracts content from websites using the Scrape.do API to bypass anti-bot protections and render dynamic JavaScript.
Implements production-proven Playwright web scraping patterns with a selector-first approach and robust error handling.
Extracts and formats transcripts from YouTube videos, playlists, and channels using a single unified command.
Automates web content retrieval using a four-tier escalation strategy to bypass bot detection and CAPTCHAs.
Automates the extraction of structured data from websites using optimized scraping patterns.
Extracts and saves subtitles or transcripts from any YouTube video URL directly to your local workspace.
Extracts clean, distraction-free text from web articles and blog posts for easy reading and archiving.
Converts interview recordings, academic PDFs, and various document formats into structured markdown for qualitative analysis.
Extracts audio, subtitles, and cover images from MP4 video files using MCP services and ffmpeg.
Automates video metadata extraction and media downloading by processing structured task lists through MCP services.
Orchestrates systematic web research through automated planning, parallel subagent delegation, and comprehensive report synthesis.
Conducts systematic web research by planning subtasks, delegating to research agents, and synthesizing findings into comprehensive reports.
Aggregates and preprocesses high-quality AI programming tips and best practices from major developer platforms.
Processes, analyzes, and transforms various file formats into structured data or new document types using a standardized CLI.
Conducts real-time, AI-optimized web searches and content extraction to provide up-to-date information beyond Claude's knowledge cutoff.
Ensures rigorous factual accuracy through systematic, multi-pass evidence validation and source tiering.
Crawls entire websites and builds searchable full-text indexes of content converted into Markdown format.
Performs structured web searches via DuckDuckGo to retrieve real-time documentation, library information, and technical solutions.
Exports comprehensive TripIt travel data including trips, flights, and lodging into a structured JSON format via browser automation.
Parses and monitors RSS/Atom feeds to extract structured updates from news sites, blogs, and social platforms.
Streamlines the development of Python-based video classification systems with optimized scraping and incremental database management.
Automates the installation and configuration of the VeyraX MCP server for enhanced web search and content extraction.
Extracts and analyzes competitor advertisements from major ad libraries to uncover successful messaging patterns and creative strategies.
Downloads high-quality videos and extracts audio from over 1000 platforms including YouTube, Bilibili, and TikTok using yt-dlp.
Extracts and converts YouTube transcripts or auto-generated captions into clean, readable text files.
Curates specialized AI technology news and technical insights using targeted search strategies and quality filtering rules.
Downloads high-quality video and audio content from YouTube and other platforms directly through the Claude Code interface.
Performs intelligent web searches via the Zhipu search engine with automated relative date resolution.
Extracts and processes comprehensive data from GitHub repositories for ingestion into RAG pipelines and LLM knowledge bases.
Downloads videos, audio, and subtitles from YouTube and other platforms using yt-dlp with optimized settings.