Fish Audio FAQs

Question 1

What is Fish Audio MCP Server?

Accepted Answer

Fish Audio MCP Server is a developer tool that connects Fish Audio's Text-to-Speech (TTS) API to large language models (LLMs) such as Claude. It uses the Model Context Protocol (MCP) so that an LLM can generate speech in response to natural language requests.

Question 2

How does Fish Audio MCP Server integrate with LLMs?

Accepted Answer

It exposes a 'fish_audio_tts' tool that LLMs can call to generate speech from text in response to natural language commands. The integration runs over the Model Context Protocol, so TTS can be added to any LLM-powered application that supports MCP.
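
To make the tool call concrete, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client sends when an LLM invokes a tool. The "tools/call" method and the name/arguments layout come from the MCP specification, and the tool name comes from the answer above. The argument name "text" and the helper build_tts_tool_call are assumptions for illustration, not the server's documented schema.

```python
import json
from itertools import count

# Simple incrementing request IDs, as JSON-RPC requires a unique id per request.
_request_ids = count(1)


def build_tts_tool_call(text: str) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' request for the fish_audio_tts tool.

    The 'text' argument name is an assumption; check the schema the server
    actually advertises (via 'tools/list') for the real parameter names.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {
            "name": "fish_audio_tts",
            "arguments": {"text": text},  # assumed argument name
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_tts_tool_call("Hello from an MCP client."), indent=2))
```

In practice the MCP client library builds this message for you; the sketch only shows what travels over the wire.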

Question 3

How do I get started with Fish Audio MCP Server?

Accepted Answer

Install the server with npm, or run it directly with npx. Then set your Fish Audio API key in an environment variable (e.g., FISH_API_KEY) and add the server to your MCP settings configuration.
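
As a rough illustration of the configuration step, the sketch below generates an MCP settings entry in the "mcpServers" layout used by several MCP clients, such as Claude Desktop. Your client's format may differ. The server label "fish-audio", the package placeholder, and the helper build_mcp_settings are hypothetical; substitute the package name given in the server's own documentation.

```python
import json
import os


def build_mcp_settings(api_key: str, package: str = "<fish-audio-mcp-package>") -> dict:
    """Return an MCP settings dict that launches the server through npx.

    'package' is a placeholder, not the real npm package name.
    """
    if not api_key:
        raise ValueError("a Fish Audio API key is required")
    return {
        "mcpServers": {
            "fish-audio": {  # hypothetical label for this server entry
                "command": "npx",
                "args": ["-y", package],  # -y skips npx's install prompt
                "env": {"FISH_API_KEY": api_key},
            }
        }
    }


if __name__ == "__main__":
    # Read the key from the environment so it is not hard-coded in source.
    key = os.environ.get("FISH_API_KEY", "your-api-key-here")
    print(json.dumps(build_mcp_settings(key), indent=2))
```

Paste the printed JSON into your client's MCP settings file and restart the client so it picks up the new server.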

Question 4

What key audio features does this server offer?

Accepted Answer

The server provides high-quality TTS using Fish Audio's state-of-the-art models, real-time audio streaming for low-latency applications, support for custom voice models via reference IDs, and output in various formats including MP3, WAV, PCM, and Opus.
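
The format and voice options above could be validated on the client side before a request is sent. This is a minimal sketch under stated assumptions: the argument names "text", "format", and "reference_id" and the helper build_tts_arguments are hypothetical, and the format list contains only the four formats named in this answer.

```python
from typing import Optional

# Only the formats named in the FAQ answer; the server may accept others.
SUPPORTED_FORMATS = ("mp3", "wav", "pcm", "opus")


def build_tts_arguments(
    text: str,
    audio_format: str = "mp3",
    reference_id: Optional[str] = None,
) -> dict:
    """Assemble tool arguments, rejecting formats not listed above.

    All key names here are assumed for illustration.
    """
    fmt = audio_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {audio_format!r}")
    arguments = {"text": text, "format": fmt}
    if reference_id is not None:
        # A reference ID selects a custom voice model.
        arguments["reference_id"] = reference_id
    return arguments
```

For example, build_tts_arguments("Hello", "WAV", reference_id="my-voice-id") normalizes the format to lowercase and includes the voice reference only when one is supplied.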

Question 5

Does it support real-time audio synthesis and playback?

Accepted Answer

Yes, Fish Audio MCP Server supports real-time audio streaming over both HTTP and WebSocket. It also offers a 'realtime_play' feature, enabling immediate audio playback during WebSocket streaming for interactive experiences.
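
Because 'realtime_play' is described as working during WebSocket streaming, a client could check that combination before making a request. The sketch below assumes hypothetical option names ("streaming", "transport") alongside the 'realtime_play' name from the answer above. It illustrates the constraint and is not the server's actual parameter schema.

```python
VALID_TRANSPORTS = ("http", "websocket")


def build_streaming_options(transport: str = "http", realtime_play: bool = False) -> dict:
    """Return streaming options, enforcing that realtime_play needs WebSocket.

    Key names other than 'realtime_play' are assumed for illustration.
    """
    if transport not in VALID_TRANSPORTS:
        raise ValueError(f"unknown transport: {transport!r}")
    if realtime_play and transport != "websocket":
        raise ValueError("realtime_play is only available with WebSocket streaming")
    return {
        "streaming": True,
        "transport": transport,
        "realtime_play": realtime_play,
    }
```

Failing early on the client keeps an invalid combination from reaching the server at all.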
