01 Automatic OCR fallback for scanned and image-based PDFs
02 Detection and extraction of tables into clean markdown format via pdfplumber
03 Local semantic search across all ingested documents with similarity scoring
04 Layout-preserving text extraction for multi-column documents using PyMuPDF
05 Full compatibility with MCP clients like Claude Desktop, Claude Code, and Cursor
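
As an illustration of the table feature listed above, the following is a minimal sketch of how rows extracted by pdfplumber might be rendered as markdown. It is not taken from the project's source: the function names `rows_to_markdown` and `pdf_tables_to_markdown` are hypothetical, and the choice to treat the first row as the header is an assumption. The only pdfplumber calls used are `pdfplumber.open` and `page.extract_tables()`, which returns each table as a list of rows whose cells are strings or `None`.

```python
from typing import List, Optional, Sequence


def rows_to_markdown(rows: Sequence[Sequence[Optional[str]]]) -> str:
    """Render table rows as a markdown table (assumes the first row is the header)."""
    if not rows:
        return ""
    # Ragged rows are padded to the width of the widest row.
    width = max(len(row) for row in rows)

    def clean(cell: Optional[str]) -> str:
        # Empty cells arrive as None; collapse internal newlines and runs of
        # whitespace, and escape pipes so they do not break the table.
        text = "" if cell is None else str(cell)
        return " ".join(text.split()).replace("|", "\\|")

    def fmt(row: Sequence[Optional[str]]) -> str:
        cells = [clean(c) for c in row] + [""] * (width - len(row))
        return "| " + " | ".join(cells) + " |"

    lines = [fmt(rows[0]), "| " + " | ".join(["---"] * width) + " |"]
    lines.extend(fmt(row) for row in rows[1:])
    return "\n".join(lines)


def pdf_tables_to_markdown(path: str) -> List[str]:
    """Hypothetical helper: one markdown string per table found in the PDF."""
    import pdfplumber  # third-party dependency, imported lazily

    rendered = []
    with pdfplumber.open(path) as pdf:
        for page in pdf.pages:
            for table in page.extract_tables():
                rendered.append(rows_to_markdown(table))
    return rendered
```

Keeping the rendering step separate from the PDF access makes it possible to check the markdown output without a sample file, for example by passing `[["Name", "Qty"], ["Widget", None]]` directly to `rows_to_markdown`.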