01. 2 GitHub stars
02. Multi-stage fallback OCR for PDF, PPT, and PPTX files
03. Duplicate image detection and similarity auditing tools
04. Automated LaTeX math delimiter repair and Markdown cleanup
05. Reference book metadata and directory page generation
06. Strict integrity auditing and verification scripts to prevent data loss
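
As an illustration of the duplicate image detection item, here is a minimal sketch that groups byte-identical image files by SHA-256 digest. It is not the project's actual implementation: the function names `file_digest` and `find_exact_duplicates` and the `IMAGE_SUFFIXES` set are hypothetical, and it covers only exact duplicates. Similarity auditing of near-duplicates would need a different technique, such as perceptual hashing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Hypothetical set of extensions treated as images; adjust as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 64 KiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group image files under root that have identical byte content."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            groups[file_digest(path)].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates(Path("images"))` on a hypothetical `images` directory returns one list per group of identical files. The function deletes nothing, so the result can be reviewed before any cleanup.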