Integrates Fish Audio's Text-to-Speech API with large language models, enabling natural language-driven speech synthesis through the Model Context Protocol.
The Fish Audio server is an MCP (Model Context Protocol) server that connects Fish Audio's Text-to-Speech (TTS) API to LLMs such as Claude. It lets a large language model generate natural-sounding speech from text, with support for real-time audio streaming, voice model selection via reference IDs, and several audio formats (MP3, WAV, PCM, and Opus). Developers and AI agents can use it for applications ranging from interactive conversational agents to dynamic content generation, and its behavior is configured through environment variables.
Key Features
01 2 GitHub stars
02 Support for custom voice models via reference IDs
03 High-quality TTS leveraging Fish Audio's models
04 Multiple audio formats including MP3, WAV, PCM, and Opus
05 Real-time audio streaming for low-latency applications
06 Flexible configuration via environment variables
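To make the environment-variable configuration concrete, here is a minimal sketch of how such settings could be read and validated. The variable names (FISH_API_KEY, FISH_REFERENCE_ID, FISH_OUTPUT_FORMAT, FISH_STREAMING) and the defaults are assumptions chosen for illustration and are not taken from the server's documentation; only the list of audio formats comes from the features above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Audio formats named in the feature list above.
SUPPORTED_FORMATS = {"mp3", "wav", "pcm", "opus"}


@dataclass(frozen=True)
class TTSConfig:
    api_key: str
    reference_id: Optional[str]  # selects a custom voice model, if set
    audio_format: str
    streaming: bool


def load_config(env: Mapping[str, str] = os.environ) -> TTSConfig:
    """Build a config from environment variables.

    All variable names here are hypothetical placeholders.
    """
    api_key = env.get("FISH_API_KEY", "")
    if not api_key:
        raise ValueError("FISH_API_KEY is required")

    audio_format = env.get("FISH_OUTPUT_FORMAT", "mp3").lower()
    if audio_format not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported audio format: {audio_format!r}")

    # Treat common truthy spellings as "streaming enabled".
    streaming = env.get("FISH_STREAMING", "false").lower() in {"1", "true", "yes"}

    # An empty reference ID is treated the same as an unset one.
    reference_id = env.get("FISH_REFERENCE_ID") or None

    return TTSConfig(api_key, reference_id, audio_format, streaming)
```

Passing a plain dictionary instead of os.environ, for example load_config({"FISH_API_KEY": "..."}), makes the sketch easy to exercise without touching the real environment.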
Use Cases
01 Synthesizing speech using specific custom voice models
02 Performing real-time, low-latency audio streaming for interactive applications
03 Generating speech from text for AI conversational agents
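As a rough illustration of these use cases, an MCP client invokes a server tool by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below builds such a message. The tool name fish_audio_tts and the argument names (text, reference_id, format, streaming) are hypothetical, since this listing does not document the server's actual tool schema; only the JSON-RPC envelope follows the Model Context Protocol.

```python
import json
from typing import Any, Dict, Optional


def build_tool_call(
    request_id: int,
    text: str,
    reference_id: Optional[str] = None,
    audio_format: str = "mp3",
    streaming: bool = False,
) -> Dict[str, Any]:
    """Build an MCP tools/call request for a hypothetical TTS tool."""
    # Argument names are assumptions, not the server's documented schema.
    arguments: Dict[str, Any] = {
        "text": text,
        "format": audio_format,
        "streaming": streaming,
    }
    if reference_id is not None:
        # Select a specific custom voice model.
        arguments["reference_id"] = reference_id

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "fish_audio_tts",  # hypothetical tool name
            "arguments": arguments,
        },
    }


if __name__ == "__main__":
    request = build_tool_call(1, "Hello from an AI agent", reference_id="my-voice-id")
    print(json.dumps(request, indent=2))
```

In practice an MCP client library would construct and send this message itself; the sketch only shows what a request for a custom voice or a streaming response might carry.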