01 Seamless integration with Google AI Studio and Gemini API
02 Comprehensive multi-modal file processing (images, video, audio, documents, text)
03 PDF-to-Markdown conversion with structure and formatting preservation
04 Detailed image analysis for visual content, diagrams, and screenshots
05 Accurate audio transcription with speaker identification and punctuation
06 5 GitHub stars