Discover Claude skills in the web scraping & data collection category. Browse 17 skills and find the perfect capabilities for your AI workflow.
Automates the discovery, extraction, and organization of academic literature for qualitative research and theoretical pattern identification.
Automates the discovery, extraction, and organization of academic literature for qualitative research and theoretical pattern extraction.
Searches multiple torrent trackers and automates content downloading via magnet links and WebTorrent.
Automates the discovery, retrieval, and organization of academic literature for qualitative research and theoretical pattern extraction.
Implements ethical, resilient, and legally compliant web scraping strategies to extract high-quality data while avoiding bot detection.
Optimizes data extraction from websites and APIs using specialized Python scripts to maximize performance and minimize token consumption.
Executes autonomous multi-step research and information synthesis using the Google Gemini Deep Research Agent.
Extracts and ingests social graph data and content from the AT Protocol and Bluesky into structured formats.
Converts batches of images and scanned documents into structured markdown files using local DeepSeek-OCR models via Ollama.
Downloads high-quality videos and audio from YouTube and other platforms for offline access and archival.
Conducts comprehensive market intelligence, company analysis, and competitive research using structured methodologies and automated data collection.
Orchestrates large-scale data acquisition and ingestion from the Bluesky/AT Protocol social graph for downstream analysis.
Retrieves and manages historical visual snapshots of websites using the Internet Archive's Wayback Machine.
Performs neural, context-aware web searches and deep research tasks to find high-quality information that keyword matching misses.
Extracts and organizes technical trading articles and documentation from mql5.com for research and training data collection.
Crawls global AI news sources to generate deduplicated, Chinese-language summaries in a structured JSON format.
Converts JavaScript-rendered web pages into clean, readable Markdown files using Puppeteer and the Readability algorithm.
Orchestrates the extraction, validation, and database loading of comprehensive fighter data from UFCStats.com using Scrapy spiders.
Downloads videos, extracts high-quality audio, and generates clean, paragraph-style transcripts from YouTube and other media platforms.
Extracts and analyzes large PDF documents locally with semantic chunking to minimize token usage and maximize context efficiency.
Integrates Google Programmable Search Engine capabilities directly into Claude Code for programmatic web and image retrieval.
Converts any webpage into clean, formatted Markdown using Chrome CDP for full JavaScript rendering and metadata extraction.
Searches for media and automates torrent downloads across multiple sources using a local API.
Retrieves comprehensive GitHub user and organization profile data including repository counts, follower statistics, and account metadata.
Automates systematic, multi-agent research workflows to generate validated, structured JSON data from web sources.
Extracts and analyzes competitor advertisements from ad libraries to uncover winning messaging, pain points, and creative strategies.
Extracts clean, clutter-free article and blog content from URLs by stripping away ads, navigation, and unnecessary UI elements.
Performs high-speed, read-only extraction of Markdown content from documentation, blogs, and static websites.
Extracts deep web content, captures screenshots, and parses PDFs using the powerful Firecrawl API.
Orchestrates parallel subagents to perform structured, multi-source web investigations and synthesize findings into comprehensive reports.