01Smart Deduplication: Pre-calculates MD5 hashes to bypass Gemini API calls for identical, unmodified files, optimizing token usage.
02Ultimate Multimodality: Natively scans, embeds, and retrieves images, video, and audio without extracting text.
03Visual PDF RAG: Parses PDFs page-by-page as high-quality images, embedding visual data while preserving text for LLM citation.
041 GitHub stars
05Local Privacy: Utilizes ChromaDB entirely locally, ensuring files never go to a 3rd party database.
06Agentic Guardrails: Features automatic junk filtering, wildcard blacklisting, API exponential backoff, and ghost file pruning for autonomous AI agents.