01 0 GitHub stars
02 MCP Client–Server architecture for distributed multimodal processing
03 Vector search with multimodal embeddings via ChromaDB
04 Low-latency multilingual audio transcription powered by Faster-Whisper
05 Interactive Gradio UI for seamless user interaction
06 Detailed latency metrics for transcription, MCP tool, and post-processing
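
The per-stage latency metrics in the last item could be collected roughly as in the sketch below. This is only an illustration, not the project's actual code: `timed`, `run_pipeline`, and the three stage functions (`transcribe`, `call_mcp_tool`, `post_process`) are hypothetical stand-ins, and the real stages would call Faster-Whisper, the MCP server, and the post-processing step.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage, metrics):
    """Add the wall-clock seconds spent in the block to metrics[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        metrics[stage] = metrics.get(stage, 0.0) + (time.perf_counter() - start)


# Hypothetical stand-ins for the real stages (assumptions, not project code).
def transcribe(audio):
    return "hello world"


def call_mcp_tool(text):
    return {"query": text, "hits": []}


def post_process(result):
    return f"{len(result['hits'])} results for '{result['query']}'"


def run_pipeline(audio):
    """Run the three stages in order and return (answer, per-stage latencies)."""
    metrics = {}
    with timed("transcription", metrics):
        text = transcribe(audio)
    with timed("mcp_tool", metrics):
        result = call_mcp_tool(text)
    with timed("post_processing", metrics):
        answer = post_process(result)
    return answer, metrics


if __name__ == "__main__":
    answer, metrics = run_pipeline(b"")
    for stage, seconds in metrics.items():
        print(f"{stage}: {seconds * 1000:.3f} ms")
```

Timing each stage separately, rather than only the end-to-end request, makes it possible to tell whether a slow response comes from transcription, the tool call, or post-processing.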