Discover Claude skills in the Web Scraping & Data Collection category. Browse 16 skills to find the right capabilities for your AI workflow.
Overcomes web access restrictions and rate limits by performing federated searches and intelligent content extraction from blocked or challenging URLs.
Scrapes websites, extracts structured data, and automates web data collection pipelines using the Crawl4AI library.
Extracts and structures metadata from PDF form fields into JSON format to facilitate automated document processing and form filling.
Executes comprehensive web searches using the Gemini command to gather real-time data and detailed information.
Extracts and validates structured data from scientific literature collections to create analysis-ready datasets for systematic reviews and meta-analyses.
Extracts specific data from JSON files efficiently to minimize token usage and improve processing speed.
Conducts deep, iterative web research to generate comprehensive reports with verified citations and source tracking.
Performs intelligent web searches using a prioritized MCP strategy to find the most relevant documentation and live technical data.
Synthesizes real-world data from multiple search tools into structured narratives and citations for documentation and drafting.
Extracts Twitter posts and comments to organize viewpoints and generate professional narration scripts for content production.
Extracts YouTube video transcripts, metadata, and chapters into formatted Markdown files for knowledge management systems.
Extracts and organizes technical trading articles and documentation from mql5.com for research and training data collection.
Generates fact-based answers and structured data from the web using AI-powered search and synthesis.
Adds and configures Instagram accounts and web aggregators to local media event tracking systems.
Extracts event data from Instagram, Facebook, and web aggregators to power local media newsletters.
Conducts deep technical research by gathering multi-source evidence, analyzing GitHub repositories, and documenting implementation options.
Converts websites into LLM-ready markdown or structured data using the Firecrawl v2 API.
Automates the periodic search and refresh of Exa.ai websets to keep your data collections continuously updated.
Orchestrates a multi-source image pipeline to download, validate, and normalize fighter photos from Wikimedia, Sherdog, and Bing.
Discovers related web content, articles, and research papers using AI-powered similarity matching via Exa.ai.
Empowers Claude with AI-powered semantic search to find web content, research papers, and code repositories by meaning rather than keywords.
Orchestrates the extraction, validation, and database loading of comprehensive fighter data from UFCStats.com using Scrapy spiders.
Crawls global AI news sources to generate deduplicated, Chinese-language summaries in a structured JSON format.
Converts batches of images and scanned documents into structured markdown files using local DeepSeek-OCR models via Ollama.
Deploys a local, privacy-respecting metasearch engine to aggregate web, package repository, and code results in structured JSON.
Implements ethical, resilient, and legally compliant web scraping strategies to extract high-quality data while avoiding bot detection.
Extracts structured data and AI-generated summaries from any URL with high token efficiency and live crawling.
Manages automated web searches, structured data enrichment, and entity-based collection building using the Exa.ai engine.
Conducts complex, multi-step asynchronous research and deep analysis using Exa's AI-driven search engine.
Extracts clean, clutter-free content from web articles and blog posts for readable text storage.