01 Generates embeddings with `nomic-embed-text` through Ollama to capture the meaning of dense technical terms.
02 Stores vectors in a lightweight, portable JSON file that needs no configuration, avoiding heavy dependencies such as ChromaDB or Docker.
03 Extracts text with `pymupdf4llm`, converting complex scientific documents into clean Markdown.
04 Provides MCP tools for intelligent PDF indexing, high-precision knowledge search, and secure database management.
05 Batch-indexes up to 10 scientific papers at once, with optimized chunking.
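
To make the portable JSON vector database and the chunked indexing more concrete, here is a minimal sketch of how such a store could work. It is an illustration under stated assumptions, not the project's actual implementation: the names `chunk_text` and `JsonVectorStore`, the record fields, and the chunk sizes are all hypothetical, and the embedding vectors (which would come from `nomic-embed-text` via Ollama in practice) are passed in as plain lists of floats.

```python
import json
import math
from pathlib import Path


def chunk_text(text, size=200, overlap=50):
    """Split text into overlapping character windows.

    The size and overlap defaults are hypothetical, chosen only for illustration.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class JsonVectorStore:
    """Hypothetical single-file vector store: a JSON list of records."""

    def __init__(self, path):
        self.path = Path(path)
        # Load existing records if the file is already there; otherwise start empty.
        self.records = json.loads(self.path.read_text()) if self.path.exists() else []

    def add(self, text, vector, source):
        """Append one chunk with its embedding and the document it came from."""
        self.records.append({"text": text, "vector": list(vector), "source": source})

    def save(self):
        """Persist all records to the JSON file."""
        self.path.write_text(json.dumps(self.records))

    def search(self, query_vector, top_k=3):
        """Return the top_k records ranked by cosine similarity to the query."""
        ranked = sorted(
            self.records,
            key=lambda r: cosine(query_vector, r["vector"]),
            reverse=True,
        )
        return ranked[:top_k]
```

Because the whole index is one JSON file, it can be copied or versioned like any other document. The trade-off is that search is a linear scan over every record, which is reasonable for a handful of papers but would not scale to large corpora.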