High-fidelity PDF extraction for tables, forms, and charts
Comprehensive audio transcription with timestamps and speaker ID
Advanced vision tasks including object detection and segmentation
Long-form video analysis and scene detection for videos up to 6 hours long
Text-to-image generation and iterative image refinement