Crawls websites and extracts content from multiple pages into structured JSON or local markdown files.
Tavily Web Crawler is a specialized Claude Code skill designed for bulk content extraction and deep web research. It enables users to crawl entire domains or specific sub-paths, automatically converting web pages into clean, local markdown files for offline use or AI training. With advanced controls for depth, breadth, and regex-based path filtering, it allows developers to download full documentation sets, extract semantic context for LLMs, and automate data collection tasks without manual scraping configuration.
Key Features
01 Configurable multi-page crawling with depth and breadth controls
02 Bulk documentation downloading for offline reference and RAG workflows
03 213 GitHub stars
04 Advanced path filtering using include and exclude regex patterns
05 Semantic extraction that prioritizes relevant content chunks for AI context
06 Automatic conversion of web content to formatted local markdown files
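To make the include/exclude path filtering concrete, here is a minimal Python sketch of how regex patterns might narrow a set of discovered URLs. It is an illustration only, not the skill's actual implementation: the function name `filter_paths`, its parameters, and the example.com URLs are all hypothetical.

```python
import re
from urllib.parse import urlparse


def filter_paths(urls, include=None, exclude=None):
    """Keep URLs whose path matches at least one include pattern
    (if any are given) and matches none of the exclude patterns.

    Hypothetical helper for illustration; not part of any Tavily API.
    """
    include_res = [re.compile(p) for p in (include or [])]
    exclude_res = [re.compile(p) for p in (exclude or [])]
    kept = []
    for url in urls:
        path = urlparse(url).path or "/"
        # With include patterns present, the path must match one of them.
        if include_res and not any(r.search(path) for r in include_res):
            continue
        # Any exclude match drops the URL, even if it was included.
        if any(r.search(path) for r in exclude_res):
            continue
        kept.append(url)
    return kept


# Made-up URLs standing in for pages discovered during a crawl.
urls = [
    "https://example.com/docs/intro",
    "https://example.com/docs/api/v1/users",
    "https://example.com/blog/launch",
    "https://example.com/docs/changelog",
]
selected = filter_paths(urls, include=[r"^/docs/"], exclude=[r"/changelog$"])
print(selected)
```

In this sketch exclude patterns take precedence over include patterns, which is a common convention but an assumption here rather than documented behavior of the skill.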
Use Cases
01 Bulk data collection for research and competitive analysis across multiple domains
02 Extracting specific API guides from a domain to feed into an LLM context window
03 Downloading an entire documentation site (e.g., /docs) for local reference
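As a rough sketch of the "download for local reference" use case, the Python below writes crawled pages to one markdown file per URL. It assumes each crawl result is a dict with `url` and `raw_content` keys; that shape, along with the helper names `url_to_filename` and `save_results`, is an assumption made for illustration and should be checked against the actual crawl output.

```python
import re
from pathlib import Path
from urllib.parse import urlparse


def url_to_filename(url):
    """Turn a URL into a filesystem-safe markdown filename (hypothetical helper)."""
    parsed = urlparse(url)
    # Collapse every run of non-alphanumeric characters into a single dash.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", parsed.netloc + parsed.path).strip("-")
    return f"{slug or 'index'}.md"


def save_results(results, out_dir):
    """Write each result's content to its own .md file and return the paths.

    Assumes `results` is a list of dicts with "url" and "raw_content" keys.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for item in results:
        content = item.get("raw_content")
        if not content:
            continue  # skip pages that yielded no extractable content
        target = out / url_to_filename(item["url"])
        # Record the source URL in an HTML comment at the top of the file.
        target.write_text(
            f"<!-- source: {item['url']} -->\n\n{content}\n", encoding="utf-8"
        )
        written.append(target)
    return written


# Made-up sample data standing in for real crawl output.
sample_results = [
    {"url": "https://example.com/docs/intro", "raw_content": "# Intro\n\nWelcome."},
    {"url": "https://example.com/docs/empty", "raw_content": ""},
]
```

Calling `save_results(sample_results, "docs_out")` would create `docs_out/example-com-docs-intro.md` and skip the empty page. Deriving filenames from the full host and path keeps pages from different domains from colliding in one output directory.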