Web Scraping & Data Collection Agent Skills

Discover Agent Skills for web scraping & data collection. Browse 17 skills for Claude, ChatGPT & Codex.

Parallel Web Content Extraction

Extracts high-fidelity, verbatim content from URLs, PDFs, and JavaScript-heavy sites using a token-efficient forked context.

Parallel Web Search

Conducts fast, cost-effective web research and information lookups with automated inline citations and structured data output.

Exa Search CLI

Enables real-time web searching and high-quality programming context retrieval using the Exa neural search engine.

Anysite Market Research

Conducts deep market analysis and competitive intelligence by aggregating data from Y Combinator, SEC filings, and social media platforms.

Competitor Intelligence Analysis

Gathers comprehensive competitive intelligence across LinkedIn, social media, and the web to track market movements and hiring trends.

Anysite CLI Data Extraction & Pipelines

Automates web data extraction, multi-source dataset pipelines, and LLM-powered data analysis through a unified command-line interface.

Person Intelligence Analyzer

Conducts deep multi-platform background research to generate comprehensive professional intelligence reports and outreach strategies.

Competitor Intelligence & Analysis

Conducts deep-dive competitive research across web, social media, and professional networks to generate actionable market intelligence.

YouTube Video Downloader

Downloads YouTube videos and extracts audio with optimized quality presets for easy sharing and local playback.

Incremental API Fetching

Builds resilient, state-aware data ingestion pipelines for paginated APIs using advanced watermark tracking.

Incremental API Fetcher

Builds resilient data ingestion pipelines that handle paginated API results with state tracking and historical backfills.

Universal Search & Data Retrieval

Executes autonomous search missions across local codebases and the web to return structured, attributed data.

Firecrawl Self-Hosted Manager

Manages and troubleshoots self-hosted Firecrawl instances for high-performance web-to-markdown scraping.

Web Search Fallback Agent

Ensures uninterrupted research capabilities by delegating web searches to autonomous agents when primary search APIs fail or hit limits.

YouTube Data API Wrapper

Accesses and manages YouTube data including video metadata, channel statistics, and comment threads through the official v3 API.

Intelligent Web Scraper

Automates complex web data extraction using self-learning algorithms to navigate pagination, bypass blocks, and analyze page structures autonomously.

Global Patents Search

Searches global patent databases using natural language queries to discover prior art and track innovation landscapes.

Stock Data Verifier

Implements a rigorous cross-verification protocol for stock and ETF data, using multi-source validation to prevent AI hallucinations.

Web Search Verifier

Standardizes multi-source web search protocols to eliminate AI hallucinations when collecting critical macroeconomic and financial data.

Data Collection Quality Guide

Establishes rigorous quality standards and verification methods for market research, patent analysis, and professional data collection tasks.

LitSearch Academic Review Agent

Automates systematic literature reviews and builds structured scholarship databases using the OpenAlex API.

Review Analyst Agent

Automates product review collection and sentiment analysis to identify prioritized product improvements and actionable insights.

Brand Research Agent

Analyzes brand websites to extract visual identity, voice, and market positioning into a reusable JSON profile for consistent content creation.

Web Scraper & Data Extractor

Extracts web content into structured markdown files using recursive and parallel scraping techniques.

Research & Content Analysis

Conducts comprehensive multi-source research, deep content extraction, and intelligent analysis using parallel agents and specialized patterns.

Perplexity Search for Claude Code

Enhances Claude with real-time web search capabilities using Perplexity models to access current information and scientific citations.

MQL5 Article Extractor

Extracts and organizes technical trading documentation and articles from mql5.com for research and training data collection.

Exa Semantic Search

Empowers Claude with neural web search capabilities to find high-quality information through semantic understanding rather than simple keyword matching.

Apify SDK & Actor Development

Streamlines web scraping, data collection, and Actor development using the official Apify JavaScript SDK and platform documentation.

YouTube Transcript Fetcher

Extracts and saves YouTube video transcripts as timestamped text files for analysis, documentation, and content review.
