Discover Agent Skills for web scraping & data collection. Browse 17 skills for Claude, ChatGPT & Codex.
Automates the discovery, extraction, and organization of academic literature for qualitative research and theoretical pattern identification.
Searches multiple torrent trackers and automates content downloading via magnet links and WebTorrent.
Extracts web page content into clean Markdown using the Gemini CLI as a robust alternative to native browsing tools.
Conducts real-time web searches, deep multi-source research, and high-fidelity page scraping using Perplexity and Firecrawl.
Executes autonomous multi-step research and information synthesis using the Google Gemini Deep Research Agent.
Fetches and converts web page content into clean Markdown using the Gemini CLI as a robust alternative to native browsing tools.
Extracts clean, readable content from web articles and blog posts by removing ads, navigation menus, and distracting clutter.
Downloads high-quality videos and audio from YouTube and other platforms for offline viewing, editing, or archival.
Fetches and cleans transcripts from YouTube videos using yt-dlp with optional Whisper transcription fallback.
Checks the archival status and availability of URLs within the Internet Archive's Wayback Machine.
Automates systematic, multi-agent research workflows to generate validated, structured JSON data from web sources.
Converts JavaScript-rendered web pages into clean, readable Markdown files using Puppeteer and the Readability algorithm.
Diagnoses and resolves web scraping failures for precious metal retail vendors using Firecrawl and Playwright diagnostics.
Deploys a local, privacy-respecting metasearch engine to aggregate web, package repository, and code results in structured JSON.
Lists and manages archived snapshots from the Wayback Machine to track website history and recover lost content.
Extracts and organizes technical trading articles and documentation from mql5.com for research and training data collection.
Downloads high-quality videos and audio from YouTube and other platforms for offline access and archival.
Discovers hidden APIs, browser automation recipes, and structured data extraction patterns for any website to streamline AI agent interactions.
Extracts complete text content from complex, dynamically-loaded, and canvas-rendered web pages where standard tools fail.
Discovers hidden APIs, structured data functions, and browser automation recipes to streamline web scraping and data extraction for AI agents.
Automates company discovery, market research, and lead enrichment using Extruct AI's semantic and Deep Search capabilities.
Conducts deep-dive research into SEC EDGAR filings to extract financial data, officer information, and risk factor analysis.
Manages local API response caching for Wayback Machine operations to optimize performance and ensure data freshness.
Retrieves comprehensive GitHub user and organization profile data including repository counts, follower statistics, and account metadata.
Locates and retrieves the most recent archived version of any URL from the Internet Archive's Wayback Machine.
Archives URLs to the Internet Archive's Wayback Machine for permanent digital preservation and snapshot tracking.
Automates the collection and organization of AI and data-related job listings from Zighang into Obsidian-compatible Markdown.
Automates the collection of bookmarked job postings from Zighang and synchronizes them into Obsidian as structured Markdown files.
Retrieves and manages historical visual snapshots of websites using the Internet Archive's Wayback Machine.
Fetches and converts web content into clean Markdown using the Gemini CLI as a reliable fallback for Claude's native tools.