01 AI-enhanced extraction for tables, forms, and LaTeX mathematical equations
02 Support for PDF, EPUB, PPTX, DOCX, XLSX, HTML, and image formats
03 Multiple output modes including Markdown, HTML, JSON, and RAG-optimized chunks
04 Integration with Claude, GPT-4o, Google Gemini, and local Ollama models
05 0 GitHub stars
06 Configurable OCR processing for scanned documents and handwriting