01. Checkpoint support for large document processing with resume capabilities
02. Structured extraction of complex tables and visual grid patterns
03. High-resolution image extraction with page-relative asset mapping
04. Robust error recovery for corrupted files and password-protected documents
05. 67 GitHub stars
06. Automated detection of text-based vs. scanned (OCR) PDF formats
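
As a rough illustration of the checkpoint-and-resume feature listed above, the sketch below saves progress after every page so that an interrupted run can continue where it stopped. It is not the project's actual implementation: the function name `process_with_checkpoint`, the `handle_page` callback, and the JSON checkpoint layout are all hypothetical, and only the Python standard library is used.

```python
import json
import os
from typing import Callable


def process_with_checkpoint(
    num_pages: int,
    handle_page: Callable[[int], str],
    checkpoint_path: str,
) -> dict[str, str]:
    """Process pages 0..num_pages-1, persisting results after each page.

    Hypothetical helper: `handle_page` stands in for whatever per-page
    extraction the real tool performs.
    """
    results: dict[str, str] = {}

    # Resume: load whatever an earlier, interrupted run already finished.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, "r", encoding="utf-8") as fh:
            results = json.load(fh)

    for page in range(num_pages):
        key = str(page)  # JSON object keys are strings
        if key in results:
            continue  # finished in a previous run, skip it

        results[key] = handle_page(page)

        # Write to a temporary file, then rename over the checkpoint,
        # so a crash mid-write cannot leave a truncated checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as fh:
            json.dump(results, fh)
        os.replace(tmp_path, checkpoint_path)

    return results
```

If `handle_page` raises partway through, calling the function again with the same `checkpoint_path` skips the pages already recorded and processes only the remainder. Writing the checkpoint after every page trades some I/O for the smallest possible amount of repeated work; a real tool might batch writes instead.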