01 Performance optimization via parallel processing and granular feature toggling
02 Formula and mathematical content identification within technical and academic documents
03 High-fidelity table extraction for complex, merged, or multi-page table structures
04 Granite vision-language model support for superior OCR and layout understanding
05 Automated multi-column layout detection and logical reading order preservation