Automated music identification and timestamping via Shazam and Gemini API integration.
Native video editing capabilities, including clipping, merging, and splitting via FFmpeg automation.
Intelligent frame extraction using either time-based intervals or change-based scene detection.
Foreign language translation and non-speech audio analysis (e.g., detecting applause or ambient noise).
Multi-modal analysis combining visual frame extraction with local Whisper speech-to-text transcription.
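
The two frame-extraction modes listed above can be illustrated with a minimal sketch. This is not the project's actual implementation: the function names, the `Frame` representation (a flat sequence of 0-255 grayscale values), and the 0.3 threshold are all assumptions chosen for illustration. A real pipeline would more likely decode frames through FFmpeg and might use a different change metric.

```python
from typing import List, Sequence

# Hypothetical frame representation: a flat sequence of grayscale values (0-255).
Frame = Sequence[int]


def time_based_indices(num_frames: int, fps: float, interval_s: float) -> List[int]:
    """Pick one frame index every `interval_s` seconds of video."""
    if fps <= 0 or interval_s <= 0:
        raise ValueError("fps and interval_s must be positive")
    # Convert the time interval into a whole number of frames (at least 1).
    step = max(1, round(fps * interval_s))
    return list(range(0, num_frames, step))


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference, normalised to the range 0.0-1.0."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / (255 * len(a))


def change_based_indices(frames: Sequence[Frame], threshold: float = 0.3) -> List[int]:
    """Keep a frame only when it differs enough from the last kept frame."""
    if not frames:
        return []
    kept = [0]  # always keep the first frame as the reference
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[kept[-1]], frames[i]) > threshold:
            kept.append(i)
    return kept


if __name__ == "__main__":
    # Tiny synthetic clip: dark, slightly brighter, bright, bright, dark again.
    clip = [[0] * 4, [10] * 4, [200] * 4, [205] * 4, [0] * 4]
    print(time_based_indices(num_frames=10, fps=2.0, interval_s=1.5))
    print(change_based_indices(clip, threshold=0.3))
```

Time-based sampling gives predictable coverage regardless of content, while the change-based variant skips near-duplicate frames and keeps only those that differ noticeably from the last one retained. Comparing against the last kept frame, rather than the immediately preceding one, is a design choice in this sketch so that slow gradual changes still eventually trigger a new frame.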