Generate images, videos, music, and speech with Google Gemini models.
Media is an MCP server that gives AI agents media generation capabilities backed by Google Gemini models. It creates and edits images with Nano Banana models, generates videos with native audio, dialogue, and sound effects with Veo models, composes instrumental music from weighted prompts with Lyria RealTime, and converts text to speech with voice and style control via Gemini TTS. The server exposes these capabilities through a standardized API, so agents can call them from within existing AI workflows.
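
As a rough illustration of how an agent might reach one of these capabilities, the sketch below builds an MCP `tools/call` request, which is a JSON-RPC 2.0 message. The envelope (`jsonrpc`, `id`, `method`, `params.name`, `params.arguments`) follows the MCP specification; the tool name `generate_music` and the `prompts` argument shape are assumptions made for this example and are not taken from this server's documentation. Check the server's tool listing for the real names and schemas.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument schema (assumptions for illustration):
# each weighted prompt pairs a text description with a relative weight.
request = build_tool_call(
    1,
    "generate_music",
    {
        "prompts": [
            {"text": "ambient piano", "weight": 0.7},
            {"text": "soft strings", "weight": 0.3},
        ]
    },
)

# Serialize the message as it would be sent over the MCP transport.
print(json.dumps(request, indent=2))
```

In practice an MCP client library would construct and send this message for you; the sketch only shows the shape of the request an agent ends up making.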
