01 Structured output delivery including Markdown references and image assets
02 Parallel processing and chunking for high-performance extraction of large PDFs
03 Comprehensive metadata inspection and grounding checks
04 High-fidelity text and table extraction using pdfplumber
05 67 GitHub stars
06 Integrated OCR support for scanned documents and images
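
The parallel processing and chunking item could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual implementation: the names `chunk_pages`, `extract_chunk`, and `extract_parallel` are hypothetical, and `extract_chunk` is a placeholder standing in for real per-page extraction (for example with pdfplumber).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

PageRange = Tuple[int, int]


def chunk_pages(total_pages: int, chunk_size: int) -> List[PageRange]:
    """Split [0, total_pages) into half-open (start, end) page ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, total_pages))
        for start in range(0, total_pages, chunk_size)
    ]


def extract_chunk(page_range: PageRange) -> List[str]:
    """Hypothetical placeholder: a real worker would open the PDF and
    extract text or tables for the pages in this range."""
    start, end = page_range
    return [f"page {i}" for i in range(start, end)]


def extract_parallel(
    total_pages: int,
    chunk_size: int,
    worker: Callable[[PageRange], List[str]] = extract_chunk,
    max_workers: int = 4,
) -> List[str]:
    """Run the worker over each page chunk concurrently and flatten the
    per-chunk results back into document order."""
    chunks = chunk_pages(total_pages, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of its inputs,
        # so page order is preserved even if chunks finish out of order.
        results = list(pool.map(worker, chunks))
    return [item for chunk in results for item in chunk]


if __name__ == "__main__":
    print(chunk_pages(10, 4))
    print(extract_parallel(5, 2))
```

Threads are used here only to keep the sketch short. CPU-bound extraction would more likely use a process pool, with each worker opening the PDF itself rather than sharing an open handle.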