01 Multi-format conversion between PDF, DOCX, XLSX, HTML, and images
02 Structural data extraction for tables and plain text
04 High-accuracy OCR for scanned documents in 100+ languages
05 Programmatic PDF form filling and digital CMS signatures
06 Automated PII redaction using preset patterns or custom regex
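
To illustrate the PII redaction item, here is a minimal sketch of pattern-based redaction on plain text, using only Python's standard `re` module. It is not the product's API: the `redact` function, the `PRESET_PATTERNS` table, and the two example patterns are hypothetical names chosen for illustration, and real PII detection usually needs stricter patterns and validation.

```python
import re

# Hypothetical preset patterns for illustration only; they are deliberately
# simple and are not taken from any specific product.
PRESET_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text, custom_patterns=None):
    """Replace every match of a preset or custom pattern with a [LABEL] tag.

    custom_patterns maps a label to a regex string; a custom label that
    matches a preset label overrides the preset.
    """
    patterns = dict(PRESET_PATTERNS)
    for label, pattern in (custom_patterns or {}).items():
        patterns[label] = re.compile(pattern)

    # Apply each pattern in turn, substituting the label for the match.
    for label, compiled in patterns.items():
        text = compiled.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(redact("Mail jane.doe@example.com, SSN 123-45-6789"))
    print(redact("Ticket ABC-1234", {"TICKET": r"\b[A-Z]{3}-\d{4}\b"}))
```

Keeping presets and custom patterns in one label-to-regex mapping means a caller can extend or override the defaults without touching the substitution loop.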