Crawler API
by jawa7
Provides a modular web crawler and a REST API for managing and storing webpage data, integrated with an MCP server.
About
This project serves as a comprehensive learning resource, featuring a web crawler designed for modularity and efficient task queuing. It is complemented by a REST API providing full CRUD operations on webpage data, and an MCP server that enables integration with external LLMs and AI agents. With automatic Swagger documentation, TypeORM/PostgreSQL integration, and unit test coverage, it offers a solid foundation for building intelligent data collection and management systems.
Key Features
- Modular web crawler with depth control and task queue
- PostgreSQL integration via TypeORM
- MCP server for integration with external LLMs/agents via HTTP
- REST API for managing and storing webpage data (CRUD)
- Automatic API documentation and validation via Swagger and class-validator
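To illustrate the first feature, here is a minimal TypeScript sketch of a depth-limited crawler driven by a FIFO task queue. The names (`CrawlTask`, `crawl`) and the in-memory link graph are illustrative assumptions, not the project's actual API; a real crawler would fetch pages over HTTP and extract links from the HTML.

```typescript
// Illustrative sketch only: an in-memory link graph stands in for HTTP fetching.
type LinkGraph = Record<string, string[]>;

// One unit of work in the task queue: a URL plus its distance from the start page.
interface CrawlTask {
  url: string;
  depth: number;
}

// Breadth-first crawl up to maxDepth; returns the set of visited URLs.
function crawl(graph: LinkGraph, start: string, maxDepth: number): string[] {
  const queue: CrawlTask[] = [{ url: start, depth: 0 }]; // FIFO task queue
  const visited = new Set<string>();

  while (queue.length > 0) {
    const task = queue.shift()!;
    if (visited.has(task.url)) continue; // skip URLs already crawled
    visited.add(task.url);

    // Depth control: only enqueue outgoing links below the depth limit.
    if (task.depth < maxDepth) {
      for (const link of graph[task.url] ?? []) {
        queue.push({ url: link, depth: task.depth + 1 });
      }
    }
  }
  return [...visited];
}

// Example: with maxDepth = 1, the crawl reaches "/" and its direct links,
// but never follows links found on those pages.
const pages: LinkGraph = {
  "/": ["/a", "/b"],
  "/a": ["/c"],
};
const found = crawl(pages, "/", 1);
```

Keeping the queue and depth check separate from page fetching is what makes the crawler modular: the traversal logic stays the same whether links come from a test fixture or live HTTP responses.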
Casos de Uso
- Automating web content collection and indexing for AI agents or LLMs
- Developing custom integrations for agents to access and process web information
- Persisting and managing crawled webpage data in a structured database
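For the last use case, this sketch models the CRUD semantics the REST API exposes over webpage data. The `Webpage` shape and `WebpageStore` are assumptions for illustration, using an in-memory map rather than the project's actual TypeORM entities and PostgreSQL repository.

```typescript
// Illustrative record shape for a crawled page (assumed fields, not the real entity).
interface Webpage {
  id: number;
  url: string;
  title: string;
  content: string;
}

// In-memory stand-in for a database-backed repository, showing the CRUD surface.
class WebpageStore {
  private pages = new Map<number, Webpage>();
  private nextId = 1;

  create(data: Omit<Webpage, "id">): Webpage {
    const page: Webpage = { id: this.nextId++, ...data };
    this.pages.set(page.id, page);
    return page;
  }

  read(id: number): Webpage | undefined {
    return this.pages.get(id);
  }

  update(id: number, patch: Partial<Omit<Webpage, "id">>): Webpage | undefined {
    const page = this.pages.get(id);
    if (!page) return undefined;
    const updated: Webpage = { ...page, ...patch };
    this.pages.set(id, updated);
    return updated;
  }

  delete(id: number): boolean {
    return this.pages.delete(id);
  }
}

// Usage: persist a crawled page, then update and delete it.
const store = new WebpageStore();
const saved = store.create({
  url: "https://example.com",
  title: "Example",
  content: "Example Domain",
});
const renamed = store.update(saved.id, { title: "Example (updated)" });
const removed = store.delete(saved.id);
```

In the real project, each of these methods would correspond to a REST endpoint, with class-validator checking the incoming payloads before they reach the database.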