Discover web scraping & data collection Claude skills. Browse 17 skills and find the perfect capability for your AI workflow.
Transforms web URLs into clean, distraction-free Markdown files using reader-mode heuristics and automated filename sanitization.
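Filename sanitization of the kind this skill describes can be sketched in a few lines. This is an illustrative implementation, not the skill's actual code; the specific character rules and length cap are assumptions.

```python
import re

def sanitize_filename(title, max_len=80):
    """Turn a page title into a safe Markdown filename (illustrative rules)."""
    # Drop characters that are unsafe in filenames, then normalize separators.
    name = re.sub(r"[^\w\s-]", "", title).strip().lower()
    name = re.sub(r"[\s_-]+", "-", name)
    return name[:max_len].rstrip("-") + ".md"
```

For example, a title like `"Hello, World!"` becomes `hello-world.md`.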
Performs high-precision semantic searches and intent-based research using the Exa AI search engine.
Automates deep web scraping, PDF parsing, and visual snapshots using the Firecrawl API.
Performs semantic web searches and similar content discovery using the Exa API to retrieve high-quality research data.
Converts interview recordings, academic PDFs, and various document formats into structured markdown for qualitative analysis.
Executes parallel web searches across multiple providers like Tavily, Perplexity, Gemini, and Exa for high-density, verified information.
Orchestrates the discovery, retrieval, and organization of academic literature to facilitate theoretical pattern extraction and qualitative research synthesis.
Implements a high-performance decision framework for parsing structured text by combining deterministic regex with LLM-powered edge case validation.
Performs neural, semantic web searches and content discovery using the Exa API to find highly relevant data and research.
Extracts clean, structured data from any website using the powerful Firecrawl API for deep web crawling and scraping.
Empowers Claude to perform advanced web searches, crawl websites, and extract high-quality content using the Tavily AI search engine.
Implements a hybrid decision framework for parsing structured text using Regex for efficiency and LLMs for complex edge cases.
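The regex-first, LLM-fallback pattern behind this and the related parsing skills can be sketched as follows. The function and field names are hypothetical; the point is the control flow: a cheap deterministic path handles the common case, and an optional LLM callback is invoked only when the regex cannot match.

```python
import re

# Fast deterministic path: strict ISO date pattern.
DATE_RE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")

def parse_date(text, llm_fallback=None):
    """Try regex first; defer to an LLM only for edge cases it cannot parse."""
    m = DATE_RE.match(text.strip())
    if m:
        return {"year": int(m.group(1)), "month": int(m.group(2)),
                "day": int(m.group(3)), "source": "regex"}
    # Edge case: hand off to a caller-supplied (hypothetical) LLM validator.
    if llm_fallback is not None:
        return llm_fallback(text)
    return None
```

Because well-formed inputs never reach the fallback, LLM cost scales with the edge-case rate rather than the total volume of text parsed.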
Performs concurrent, recursive web scraping to extract clean text content while maintaining URL-based directory structures and respecting domain boundaries.
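Two of the constraints this skill mentions, respecting domain boundaries and mirroring URLs into a directory layout, reduce to small helpers like these. This is a sketch of the idea, not the skill's implementation; the output-path conventions are assumptions.

```python
from urllib.parse import urlparse
from pathlib import PurePosixPath

def same_domain(url, root):
    """Respect domain boundaries: only follow links on the root's host."""
    return urlparse(url).netloc == urlparse(root).netloc

def local_path(url, out_dir="out"):
    """Mirror the URL's path as a directory structure under out_dir."""
    parsed = urlparse(url)
    path = parsed.path.strip("/") or "index"
    return str(PurePosixPath(out_dir) / parsed.netloc / (path + ".txt"))
```

A crawler worker would check `same_domain` before enqueueing a link and write extracted text to `local_path(url)`, so the on-disk tree matches the site's URL structure.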
Queries and manages workforce data including company tracking, semantic job searching, and data refresh monitoring.
Fetches web documentation, extracts specific topics using AI subagents, and generates structured markdown summaries.
Converts entire PDF documents into clean, structured Markdown while preserving formatting, tables, and images for seamless context loading.
Conducts deep market research, competitive analysis, and investor due diligence with cited sources and actionable business insights.
Orchestrates systematic web research through automated planning, parallel subagent delegation, and comprehensive report synthesis.
Exports comprehensive TripIt travel data including trips, flights, and lodging into a structured JSON format via browser automation.
Downloads videos, audio, and subtitles from YouTube and other platforms using yt-dlp with optimized settings.
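A typical yt-dlp invocation of the kind this skill wraps can be assembled as a command list. The flags below (`-f`, `--write-subs`, `--sub-langs`, `--restrict-filenames`, `-o`) are real yt-dlp options; which ones the skill actually considers "optimized settings" is an assumption.

```python
def build_ytdlp_cmd(url, sub_langs="en"):
    """Assemble a yt-dlp command: best video+audio, subtitles, safe filenames."""
    return [
        "yt-dlp",
        "-f", "bv*+ba/b",               # best video + best audio, fallback to best
        "--write-subs", "--sub-langs", sub_langs,
        "--restrict-filenames",          # avoid shell-unsafe characters in names
        "-o", "%(title)s.%(ext)s",
        url,
    ]
```

The resulting list can be passed to `subprocess.run` without shell quoting concerns, since each argument is a separate element.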
Integrates Gemini AI models with real-time web content through Google Search grounding to provide verified information and automatic citations.
Simplifies querying the Google Places API for search, location details, and business reviews directly from the terminal.
Automates the collection, normalization, and deduplication of Request for Proposal (RFP) opportunities from government and private data sources.
Architects scalable Firecrawl integrations using validated monolith, service layer, and microservice patterns.
Extracts structured data from complex websites using a robust, three-phase Playwright automation workflow.
Extracts and correlates high-quality food images from delivery platforms and restaurant websites to populate digital catalogs and menus.
Conducts deep-dive market analysis, competitive research, and investor due diligence with sourced evidence and decision-ready summaries.
Scrapes comprehensive vehicle configurations and pricing from the BYD Australian website and converts the data into ready-to-use MySQL INSERT statements.
Optimizes structured text extraction by combining efficient regex patterns with LLM validation for high-accuracy, low-cost data parsing.
Implements a high-efficiency hybrid pipeline that prioritizes regex for structured text extraction while reserving LLMs for low-confidence edge cases.