01. Structured output identifying what was found, what was missing, and uncertainties
02. Quote-grounding technique to ensure every claim traces back to exact line numbers
03. Type-specific extraction for code, JSON/YAML, CSV, and Markdown
04. 20 GitHub stars
05. Binary file metadata detection and PDF text extraction capabilities
06. Size-aware strategies including full-read, section-based, and map-reduce
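
As a rough illustration of two of the features listed above, the sketch below picks a size-aware reading strategy and attaches a line number to a quoted claim. It is a minimal sketch under stated assumptions, not the tool's actual implementation: the function names (`choose_strategy`, `ground_claim`), the `GroundedClaim` type, and both line-count thresholds are hypothetical, since the list gives no cutoffs or API details.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical thresholds; the feature list does not state any cutoffs.
FULL_READ_MAX_LINES = 500
SECTION_MAX_LINES = 5000


def choose_strategy(line_count: int) -> str:
    """Pick a reading strategy from the file's line count."""
    if line_count <= FULL_READ_MAX_LINES:
        return "full-read"
    if line_count <= SECTION_MAX_LINES:
        return "section-based"
    return "map-reduce"


@dataclass
class GroundedClaim:
    claim: str
    quote: str
    line: Optional[int]  # 1-based line number, or None if the quote was not found


def ground_claim(claim: str, quote: str, lines: List[str]) -> GroundedClaim:
    """Attach the first line containing the exact quote to the claim."""
    for number, text in enumerate(lines, start=1):
        if quote in text:
            return GroundedClaim(claim, quote, number)
    # An ungrounded claim keeps line=None so it can be reported as an uncertainty.
    return GroundedClaim(claim, quote, None)


if __name__ == "__main__":
    sample = ["import os", "", "def main():", "    print(os.getcwd())"]
    print(choose_strategy(len(sample)))
    print(ground_claim("The script prints the working directory", "os.getcwd()", sample))
```

Returning `line=None` instead of raising keeps unverifiable claims visible, which fits the structured output's separate "uncertainties" bucket.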