Discover web scraping & data collection Claude skills. Browse 17 skills and find the perfect capability for your AI workflow.
Transforms web URLs into clean, distraction-free Markdown files using reader-mode heuristics and automated filename sanitization.
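Filename sanitization of the kind this skill describes can be sketched in a few lines. This is an illustrative implementation, not the skill's actual code; the specific character rules and length cap are assumptions.

```python
import re

def sanitize_filename(title, max_len=80):
    """Turn a page title into a safe Markdown filename (illustrative rules)."""
    # Drop characters that are unsafe in filenames, then normalize separators.
    name = re.sub(r"[^\w\s-]", "", title).strip().lower()
    name = re.sub(r"[\s_-]+", "-", name)
    return name[:max_len].rstrip("-") + ".md"
```

For example, a title like `"Hello, World!"` becomes `hello-world.md`.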
Performs high-precision semantic searches and intent-based research using the Exa AI search engine.
Automates deep web scraping, PDF parsing, and visual snapshots using the Firecrawl API.
Performs semantic web searches and similar content discovery using the Exa API to retrieve high-quality research data.
Converts interview recordings, academic PDFs, and various document formats into structured markdown for qualitative analysis.
Executes parallel web searches across multiple providers like Tavily, Perplexity, Gemini, and Exa for high-density, verified information.
Orchestrates the discovery, retrieval, and organization of academic literature to facilitate theoretical pattern extraction and qualitative research synthesis.
Implements a high-performance decision framework for parsing structured text by combining deterministic regex with LLM-powered edge case validation.
Performs neural, semantic web searches and content discovery using the Exa API to find highly relevant data and research.
Extracts clean, structured data from any website using the powerful Firecrawl API for deep web crawling and scraping.
Empowers Claude to perform advanced web searches, crawl websites, and extract high-quality content using the Tavily AI search engine.
Implements a hybrid decision framework for parsing structured text using Regex for efficiency and LLMs for complex edge cases.
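The regex-first, LLM-fallback pattern behind this and the related parsing skills can be sketched as follows. The function and field names are hypothetical; the point is the control flow: a cheap deterministic path handles the common case, and an optional LLM callback is invoked only when the regex cannot match.

```python
import re

# Fast deterministic path: strict ISO date pattern.
DATE_RE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")

def parse_date(text, llm_fallback=None):
    """Try regex first; defer to an LLM only for edge cases it cannot parse."""
    m = DATE_RE.match(text.strip())
    if m:
        return {"year": int(m.group(1)), "month": int(m.group(2)),
                "day": int(m.group(3)), "source": "regex"}
    # Edge case: hand off to a caller-supplied (hypothetical) LLM validator.
    if llm_fallback is not None:
        return llm_fallback(text)
    return None
```

Because well-formed inputs never reach the fallback, LLM cost scales with the edge-case rate rather than the total volume of text parsed.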
Performs concurrent, recursive web scraping to extract clean text content while maintaining URL-based directory structures and respecting domain boundaries.
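Two of the constraints this skill mentions, respecting domain boundaries and mirroring URLs into a directory layout, reduce to small helpers like these. This is a sketch of the idea, not the skill's implementation; the output-path conventions are assumptions.

```python
from urllib.parse import urlparse
from pathlib import PurePosixPath

def same_domain(url, root):
    """Respect domain boundaries: only follow links on the root's host."""
    return urlparse(url).netloc == urlparse(root).netloc

def local_path(url, out_dir="out"):
    """Mirror the URL's path as a directory structure under out_dir."""
    parsed = urlparse(url)
    path = parsed.path.strip("/") or "index"
    return str(PurePosixPath(out_dir) / parsed.netloc / (path + ".txt"))
```

A crawler worker would check `same_domain` before enqueueing a link and write extracted text to `local_path(url)`, so the on-disk tree matches the site's URL structure.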
Queries and manages workforce data including company tracking, semantic job searching, and data refresh monitoring.
Fetches web documentation, extracts specific topics using AI subagents, and generates structured markdown summaries.
Converts entire PDF documents into clean, structured Markdown while preserving formatting, tables, and images for seamless context loading.
Conducts deep market research, competitive analysis, and investor due diligence with cited sources and actionable business insights.
Orchestrates systematic web research through automated planning, parallel subagent delegation, and comprehensive report synthesis.
Exports comprehensive TripIt travel data including trips, flights, and lodging into a structured JSON format via browser automation.
Downloads videos, audio, and subtitles from YouTube and other platforms using yt-dlp with optimized settings.
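A typical yt-dlp invocation of the kind this skill wraps can be assembled as a command list. The flags below (`-f`, `--write-subs`, `--sub-langs`, `--restrict-filenames`, `-o`) are real yt-dlp options; which ones the skill actually considers "optimized settings" is an assumption.

```python
def build_ytdlp_cmd(url, sub_langs="en"):
    """Assemble a yt-dlp command: best video+audio, subtitles, safe filenames."""
    return [
        "yt-dlp",
        "-f", "bv*+ba/b",               # best video + best audio, fallback to best
        "--write-subs", "--sub-langs", sub_langs,
        "--restrict-filenames",          # avoid shell-unsafe characters in names
        "-o", "%(title)s.%(ext)s",
        url,
    ]
```

The resulting list can be passed to `subprocess.run` without shell quoting concerns, since each argument is a separate element.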
Integrates Gemini AI models with real-time web content through Google Search grounding to provide verified information and automatic citations.
Simplifies querying the Google Places API for search, location details, and business reviews directly from the terminal.
Automates the collection, normalization, and deduplication of Request for Proposal (RFP) opportunities from government and private data sources.
Architects scalable Firecrawl integrations using validated monolith, service layer, and microservice patterns.
Extracts structured data from complex websites using a robust, three-phase Playwright automation workflow.
Extracts and correlates high-quality food images from delivery platforms and restaurant websites to populate digital catalogs and menus.
Conducts deep-dive market analysis, competitive research, and investor due diligence with sourced evidence and decision-ready summaries.
Scrapes comprehensive vehicle configurations and pricing from the BYD Australian website and converts the data into ready-to-use MySQL INSERT statements.
Optimizes structured text extraction by combining efficient regex patterns with LLM validation for high-accuracy, low-cost data parsing.
Implements a high-efficiency hybrid pipeline that prioritizes regex for structured text extraction while reserving LLMs for low-confidence edge cases.