CodeDox
Crawl documentation websites, extract code snippets, and provide fast, searchable access to technical content.
About
CodeDox turns scattered online documentation into a searchable knowledge base that AI assistants can consume. It crawls documentation websites, extracts code snippets together with their surrounding context, and detects the language of each snippet. Search runs on PostgreSQL full-text search, and Model Context Protocol (MCP) integration makes the indexed content available to AI assistants. Developers can find specific code examples across documentation sources through a web UI.
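To make "extracting snippets while preserving context" concrete, here is a minimal sketch in Python. It is not CodeDox's actual extractor: it assumes the page has already been converted to Markdown with fenced code blocks, and the names `Snippet` and `extract_snippets` are hypothetical. Each snippet keeps its fence label and the last prose line that precedes it.

```python
import re
from dataclasses import dataclass

# Matches a fenced block: an opening fence with an optional language label,
# the code body, and a closing fence on its own line.
FENCE = re.compile(r"```(\w*)\n(.*?)\n```", re.DOTALL)


@dataclass
class Snippet:
    language: str  # fence label, or "" when the page gave none
    code: str      # body of the fenced block
    context: str   # last non-empty line of text before the block


def extract_snippets(markdown: str) -> list[Snippet]:
    """Hypothetical helper: collect fenced code blocks with one line of context."""
    snippets = []
    for match in FENCE.finditer(markdown):
        # Everything before the fence, trimmed so the last line is non-empty.
        before = markdown[:match.start()].strip().splitlines()
        context = before[-1].strip() if before else ""
        snippets.append(Snippet(match.group(1), match.group(2), context))
    return snippets
```

A real extractor would likely keep more context (headings, page title, source URL) and would fall back to language detection when the fence label is missing.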
Key Features
- Smart code and content extraction with context preservation and LLM-based language detection
- Modern React-based web dashboard for managing crawls, searching, and monitoring
- Fast full-text search for code snippets via PostgreSQL with sub-100ms response times
- Controlled web crawling with configurable depth and content deduplication
- Model Context Protocol (MCP) integration that exposes CodeDox to AI assistants as tools
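The crawling feature in the list above (configurable depth plus content deduplication) can be sketched as a breadth-first traversal. This is an illustration under stated assumptions, not CodeDox's crawler: the `crawl` function and its `fetch` callback are hypothetical, and deduplication here is a plain SHA-256 hash of the page text.

```python
import hashlib
from collections import deque
from typing import Callable


def crawl(
    start_url: str,
    fetch: Callable[[str], tuple[str, list[str]]],
    max_depth: int,
) -> list[str]:
    """Return URLs whose content is new, visiting pages up to max_depth links away.

    fetch(url) is a hypothetical callback returning (page_text, outgoing_links).
    """
    seen_urls = {start_url}        # never queue the same URL twice
    seen_hashes: set[str] = set()  # content hashes of pages already kept
    kept: list[str] = []
    queue = deque([(start_url, 0)])

    while queue:
        url, depth = queue.popleft()
        content, links = fetch(url)

        # Content deduplication: skip pages whose text was already stored.
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue
        seen_hashes.add(digest)
        kept.append(url)

        # Depth control: only follow links while below the configured limit.
        if depth < max_depth:
            for link in links:
                if link not in seen_urls:
                    seen_urls.add(link)
                    queue.append((link, depth + 1))
    return kept
```

Passing `fetch` in as a parameter keeps the traversal logic testable without network access; a real crawler would also need rate limiting, URL normalization, and domain restrictions.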
Use Cases
- Provide AI assistants with context-rich documentation and code snippets via MCP
- Create a centralized, searchable repository of code examples from various online sources
- Efficiently manage and monitor web crawling jobs for technical documentation
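For the searchable-repository use case, a PostgreSQL full-text query could look like the sketch below. The SQL functions (`to_tsvector`, `plainto_tsquery`, `ts_rank`) and the `@@` match operator are standard PostgreSQL; the table `snippets`, its columns `id`, `title`, and `code`, and the helper `build_search_query` are assumptions made for illustration and do not reflect CodeDox's actual schema. The `%s` placeholders follow the parameter style used by psycopg.

```python
def build_search_query(terms: str, limit: int = 10) -> tuple[str, tuple]:
    """Hypothetical helper: build a ranked full-text search over a snippets table."""
    sql = (
        "SELECT id, title, "
        "ts_rank(to_tsvector('english', code), query) AS rank "
        # Table and column names are assumed, not taken from CodeDox.
        "FROM snippets, plainto_tsquery('english', %s) AS query "
        "WHERE to_tsvector('english', code) @@ query "
        "ORDER BY rank DESC "
        "LIMIT %s"
    )
    # Parameters are passed separately so the driver handles escaping.
    return sql, (terms, limit)
```

The returned pair can be handed to a database cursor, for example `cursor.execute(sql, params)`. Fast response times generally depend on storing the `tsvector` in its own column with a GIN index rather than computing it per row as this sketch does.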