Discover Agent Skills for web scraping & data collection. Browse 16 skills for Claude, ChatGPT & Codex.
Extracts text and structural data from complex Microsoft Word documents containing nested tables, checkboxes, and multi-layered cell layouts.
Extracts clean source code from GitHub file URLs using the GitHub CLI to bypass web scraping noise and HTML clutter.
Empowers Claude with real-time internet research capabilities by integrating Gemini's Google Search tool directly into the terminal workflow.
Searches and retrieves life sciences preprints from the bioRxiv server using keywords, authors, date ranges, and categories.
Accesses and queries the ClinicalTrials.gov API v2 to retrieve detailed medical study data, recruitment status, and eligibility criteria for clinical research.
Extracts typed commands and sequential text inputs from screen recordings and terminal sessions using optimized OCR workflows.
Automates the extraction, transcription, and cleaning of YouTube video subtitles and captions into readable text.
Accesses official USPTO APIs to perform comprehensive patent and trademark searches, intellectual property analysis, and prosecution history tracking.
Extracts clean, readable text from web URLs by removing advertisements, navigation menus, and distractions.
Extracts structured data from financial documents using OCR and text extraction while enforcing rigorous data safety and verification protocols.
Automates the classification and extraction of data from financial documents while ensuring data integrity through rigorous safety and verification protocols.
Extracts text commands, terminal inputs, and gameplay moves from screen recordings using optimized OCR and image preprocessing techniques.
Extracts and implements code or algorithms from images by utilizing OCR tools, image preprocessing, and systematic verification strategies.
Ensures uninterrupted research capabilities by delegating web searches to autonomous agents when primary search APIs fail or hit limits.
Streamlines web scraping, data collection, and Actor development using the official Apify JavaScript SDK and platform documentation.
Extracts clean, readable content from web articles and blog posts directly into Markdown format by removing clutter and ads.
Extracts and transforms unstructured information from various sources into structured, AI-interpretable formats.
Automates the extraction, structuring, and organization of unstructured data into AI-ready formats from web and local sources.
Fetches and processes YouTube video transcripts, subtitles, and captions using automated tools and AI transcription.
Downloads videos and audio from YouTube and other platforms for offline viewing, archival, and content repurposing.
Performs real-time web and local searches using the Brave Search API directly via curl commands to retrieve current information and technical solutions.
Extracts and converts YouTube video transcripts into clean, readable text files directly within the terminal.
Scrapes web content, maps site structures, and extracts structured data using advanced crawling and search capabilities.
Extracts clean web content, crawls entire domains, and searches the web directly through the Firecrawl API using terminal commands.
Analyzes website structures and debugs web scraping issues using Chrome DevTools to improve data extraction accuracy.
Integrates the Perplexity API to conduct deep web research, capture real-time data, and generate structured reports with verifiable citations.
Optimizes online research by applying structured query patterns and advanced search techniques for precise information retrieval.
Extracts clean, readable text from blog posts and articles by removing ads, navigation, and clutter.
Extracts clean, distraction-free text content from web URLs and saves it as readable text files.
Downloads high-quality video and audio content from YouTube and other platforms for offline viewing, editing, or archival.