Web Scraping & Data Collection Agent Skills

Discover Agent Skills for web scraping & data collection. Browse 17 skills for Claude, ChatGPT & Codex.

Bright Data Web Scraper

Automates web content retrieval using a progressive four-tier fallback strategy to bypass bot detection and access restrictions.

Video Downloader

Downloads high-quality videos and audio from YouTube and other platforms for offline access and archival.

Structured Web Research

Automates multi-step information gathering and synthesis using structured planning and parallel subagents.

Bright Data Progressive Scraper

Implements a four-tier progressive escalation strategy to reliably scrape web content and bypass advanced bot detection.

Web Research Agent

Conducts deep, multi-faceted web research by orchestrating parallel subagents to plan, gather, and synthesize complex information.

Structured Web Research

Conducts systematic web research through autonomous subagent delegation and multi-source synthesis.

Advanced Progressive Web Scraper

Automates web content extraction using a four-tier fallback strategy to bypass bot detection and CAPTCHAs.

MarkItDown Document Converter

Converts complex file formats including PDF, Office documents, and media into clean Markdown optimized for LLM processing.

BioRxiv Database Search

Searches and retrieves life sciences preprints from the bioRxiv database with advanced filtering and PDF download capabilities.

Video Downloader

Downloads high-quality video and audio content from YouTube and other platforms directly through your terminal workspace.

Matrix Repomix

Packs external GitHub or local repositories into a token-efficient format for deep context analysis within Claude Code.

Bright Data Progressive Scraper

Retrieves web content through a four-tier progressive fallback strategy to bypass bot detection and access restrictions.

Bright Data Progressive Scraper

Implements a four-tier progressive scraping strategy to bypass bot detection and reliably extract web content.

Structured Web Research

Conducts deep web investigations by delegating tasks to specialized subagents and synthesizing findings into organized reports.

Working Nomads Job Scraper

Scrapes and organizes remote job listings from workingnomads.com with advanced filtering and multi-format export capabilities.

Claude Community Insights & Feature Research

Analyzes Reddit community discussions to identify feature requests, user pain points, and emerging use cases for Claude AI and Claude Code.

Context Gatherer

Acquires and stabilizes information from URLs, web searches, and local codebases into reusable Markdown artifacts for AI reasoning.

Browser Content Capture

Captures web content from JavaScript-heavy, login-protected, and multi-page sites using the agent-browser CLI.

YouTube Transcriber

Extracts subtitles and transcripts from YouTube videos directly into local text files using command-line tools or browser automation.

Browser Content Capture

Captures web content from JavaScript-rendered pages and authenticated sites using the agent-browser CLI.

Documentation Scraper

Transforms documentation websites into structured, categorized reference files optimized for AI context and offline archives.

llms.txt Support

Detects and ingests LLM-optimized documentation via the llms.txt standard to accelerate context gathering for autonomous agents.

Documentation Scraper

Scrapes documentation websites and transforms them into organized, categorized reference files for AI context and offline archives.

Google Places Search & Data

Queries the Google Places API to retrieve detailed location information, reviews, and search results directly within the Claude Code environment.

Yandex Search API Integration

Retrieves and parses Yandex search results using the official Yandex Cloud Search API v2.

Dev Opinions Scan

Aggregates and synthesizes technical opinions and developer reactions from major online communities like Reddit and Hacker News.

Z.AI CLI Multi-Tool

Enhances Claude with advanced vision analysis, real-time web searching, and deep GitHub repository exploration capabilities.

Nia - Intelligent Context & Documentation

Indexes and searches external repositories, documentation, and research papers to provide Claude with high-fidelity context for development tasks.

MiniMax Web Search

Enables real-time internet search capabilities using the MiniMax MCP for legal research and general information retrieval.

Exa Semantic Search & Research

Performs high-precision semantic search and structured content retrieval using the Exa AI API for deep research and code documentation.
