01 1 GitHub stars
02 Multi-format conversion between PDF, Office (DOCX, XLSX, PPTX), HTML, and image formats
03 Automated PII redaction using presets for SSNs, email addresses, and credit card numbers
04 Structured data and table extraction from PDFs directly to Excel or text
05 Advanced OCR supporting over 100 languages for digitizing scanned documents
06 Professional document finishing, including digital signatures and form filling
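
To make the preset-based redaction in item 03 concrete, here is a minimal sketch of the idea applied to plain text. It is not the tool's implementation: the `PRESETS` table, the `redact` function, the preset names, and the regular expressions are all hypothetical, and the patterns are deliberately simple (for example, card numbers are matched by digit count only, with no checksum validation).

```python
import re

# Hypothetical preset table. The names and patterns are illustrative
# assumptions, not the presets shipped by the tool described above.
PRESETS = {
    # US Social Security number written as 123-45-6789
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # Simplified email pattern; real-world addresses can be more varied
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    # 13 to 19 digits, optionally separated by single spaces or hyphens
    "credit_card": re.compile(r"\b(?:\d[ -]?){12,18}\d\b"),
}


def redact(text, presets=("ssn", "email", "credit_card"), mask="[REDACTED]"):
    """Replace every match of each selected preset with the mask string."""
    for name in presets:
        text = PRESETS[name].sub(mask, text)
    return text


if __name__ == "__main__":
    sample = "SSN 123-45-6789, mail jane.doe@example.com, card 4111 1111 1111 1111."
    print(redact(sample))
```

Passing a subset such as `presets=("email",)` masks only email addresses and leaves other matches untouched. Redacting an actual PDF would additionally require removing the underlying text objects from the file, which this text-only sketch does not attempt.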