Fish Audio MCP Server
An MCP server that connects Fish Audio's Text-to-Speech API to LLMs such as Claude, providing speech synthesis with voice cloning, multilingual support, and streaming.
Installation
NPX (Direct Run)
npx @alanse/fish-audio-mcp-server
Global Installation via NPM
npm install -g @alanse/fish-audio-mcp-server
From Source
git clone https://github.com/da-okazaki/mcp-fish-audio-server.git
cd mcp-fish-audio-server
npm install
npm run build
npm run dev
Configuration
Single Voice Mode
{
"mcpServers": {
"fish-audio": {
"command": "npx",
"args": ["-y", "@alanse/fish-audio-mcp-server"],
"env": {
"FISH_API_KEY": "your_fish_audio_api_key_here",
"FISH_MODEL_ID": "speech-1.6",
"FISH_REFERENCE_ID": "your_voice_reference_id_here",
"FISH_OUTPUT_FORMAT": "mp3",
"FISH_STREAMING": "false",
"FISH_LATENCY": "balanced",
"FISH_MP3_BITRATE": "128",
"FISH_AUTO_PLAY": "false",
"AUDIO_OUTPUT_DIR": "~/.fish-audio-mcp/audio_output"
}
}
}
}
Multiple Voices Mode
{
"mcpServers": {
"fish-audio": {
"command": "npx",
"args": ["-y", "@alanse/fish-audio-mcp-server"],
"env": {
"FISH_API_KEY": "your_fish_audio_api_key_here",
"FISH_MODEL_ID": "speech-1.6",
"FISH_REFERENCES": "[{'reference_id':'id1','name':'Alice','tags':['female','english']},{'reference_id':'id2','name':'Bob','tags':['male','japanese']},{'reference_id':'id3','name':'Carol','tags':['female','japanese','anime']}]",
"FISH_DEFAULT_REFERENCE": "id1",
"FISH_OUTPUT_FORMAT": "mp3",
"FISH_STREAMING": "false",
"FISH_LATENCY": "balanced",
"FISH_MP3_BITRATE": "128",
"FISH_AUTO_PLAY": "false",
"AUDIO_OUTPUT_DIR": "~/.fish-audio-mcp/audio_output"
}
}
}
}
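The FISH_REFERENCES value above is a JSON-like array written with single quotes inside a JSON string. The sketch below shows one way such a value could be turned into typed voice entries. It is not the server's actual parser: the `VoiceReference` interface and `parseReferences` function are hypothetical names, and the single-quote fallback is an assumption that would break on names containing apostrophes.

```typescript
// Hypothetical shape of one FISH_REFERENCES entry, taken from the example config.
interface VoiceReference {
  reference_id: string;
  name: string;
  tags: string[];
}

// Parse a FISH_REFERENCES value. Standard JSON is tried first; if that fails,
// single quotes are swapped for double quotes (an assumption, not confirmed
// server behavior).
function parseReferences(raw: string): VoiceReference[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    parsed = JSON.parse(raw.replace(/'/g, '"'));
  }
  if (!Array.isArray(parsed)) {
    throw new Error("FISH_REFERENCES must be an array");
  }
  return parsed.map((entry) => {
    const e = entry as Partial<VoiceReference>;
    if (typeof e.reference_id !== "string" || typeof e.name !== "string") {
      throw new Error("each reference needs a string reference_id and name");
    }
    // Missing tags are treated as an empty list.
    const tags = Array.isArray(e.tags) ? e.tags.map(String) : [];
    return { reference_id: e.reference_id, name: e.name, tags };
  });
}

const exampleReferences = parseReferences(
  "[{'reference_id':'id1','name':'Alice','tags':['female','english']}," +
    "{'reference_id':'id2','name':'Bob','tags':['male','japanese']}]"
);
console.log(exampleReferences.map((r) => r.name));
```

Running a value through a check like this before pasting it into the client config can catch a missing bracket or quote early.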
Available Tools
| Tool | Description |
|---|---|
| fish_audio_tts | Generates speech from text using Fish Audio's TTS API with support for custom voices, streaming... |
| fish_audio_list_references | Displays all configured voice references with their IDs, names, and tags |
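MCP clients normally issue tool calls for you, but it can help to see roughly what a call to these tools looks like on the wire. The sketch below builds the JSON-RPC 2.0 `tools/call` request that MCP uses. The `buildToolCall` helper is hypothetical, and the `text` argument name is an assumption; `reference_name` is taken from the voice selection priority listed in the Notes section.

```typescript
// JSON-RPC 2.0 envelope used by MCP for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and arguments in the envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// "text" is an assumed argument name, not confirmed by this document.
const ttsCall = buildToolCall(1, "fish_audio_tts", {
  text: "Hello from Alice",
  reference_name: "Alice",
});
const listCall = buildToolCall(2, "fish_audio_list_references", {});

console.log(JSON.stringify(ttsCall, null, 2));
console.log(JSON.stringify(listCall, null, 2));
```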
Features
- High-quality TTS with modern speech synthesis
- Streaming support for real-time, low-latency audio output
- Multiple voice support with custom voice models via reference ID
- Intelligent voice selection by ID, name, or tags
- Voice library management for configuring and managing multiple voice references
- Support for multiple audio formats (MP3, WAV, PCM, Opus)
- Voice cloning capabilities for creating custom voice models
- Multilingual support including English, Japanese, Chinese, and more
- Real-time WebSocket streaming with live playback
- Flexible configuration via environment variables
Environment Variables
Required
- FISH_API_KEY: Your Fish Audio API key
Optional
- FISH_MODEL_ID: TTS model to use (s1, speech-1.5, speech-1.6)
- FISH_REFERENCE_ID: Default voice reference ID (single reference mode)
- FISH_REFERENCES: Multiple voice references in JSON format
- FISH_DEFAULT_REFERENCE: Default reference ID when using multiple references
- FISH_OUTPUT_FORMAT: Default audio format (mp3, wav, pcm, opus)
- FISH_STREAMING: Enable streaming mode (HTTP/WebSocket)
- FISH_LATENCY: Latency mode (normal, balanced)
- FISH_MP3_BITRATE: MP3 bitrate (64, 128, 192)
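Several of these variables accept only a fixed set of values, so a quick check before starting the server can save a debugging round trip. The following is a minimal sketch of such a check, not part of the server itself: `checkFishEnv` and `ALLOWED_VALUES` are hypothetical names, and the allowed values are copied from the list above, which may not be exhaustive.

```typescript
// Allowed values copied from the environment variable list; the checker is hypothetical.
const ALLOWED_VALUES: Record<string, readonly string[]> = {
  FISH_MODEL_ID: ["s1", "speech-1.5", "speech-1.6"],
  FISH_OUTPUT_FORMAT: ["mp3", "wav", "pcm", "opus"],
  FISH_LATENCY: ["normal", "balanced"],
  FISH_MP3_BITRATE: ["64", "128", "192"],
};

// Return a list of human-readable problems; an empty list means the env looks fine.
function checkFishEnv(env: Record<string, string | undefined>): string[] {
  const problems: string[] = [];
  if (!env["FISH_API_KEY"]) {
    problems.push("FISH_API_KEY is required");
  }
  for (const name of Object.keys(ALLOWED_VALUES)) {
    const value = env[name];
    const allowed = ALLOWED_VALUES[name];
    // Optional variables are only checked when they are set.
    if (value !== undefined && allowed.indexOf(value) === -1) {
      problems.push(name + '="' + value + '" is not one of: ' + allowed.join(", "));
    }
  }
  return problems;
}

// Example: one unsupported latency value is reported.
console.log(checkFishEnv({ FISH_API_KEY: "key", FISH_LATENCY: "fast" }));
```

In a Node script the same function could be called with `process.env` to check the variables the server would actually see.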
Usage Examples
Generate speech with the text 'Hello, world! Welcome to Fish Audio TTS.'
Generate speech with voice model xyz123 saying 'This is a custom voice test'
Use Alice's voice to say 'Hello from Alice'
Generate Japanese speech with the text 'こんにちは' in an anime voice
What voices are available?
Notes
Requires a Fish Audio API key. Supports up to 10,000 characters per TTS request. Voice selection priority: reference_id > reference_name > reference_tag > default. Includes error handling for API limits, network issues, and invalid parameters.
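The selection priority above (reference_id > reference_name > reference_tag > default) can be sketched as a small lookup function. This is an illustration under stated assumptions rather than the server's implementation: `pickVoice`, `Voice`, and `VoiceRequest` are hypothetical names, matching is assumed to be case-insensitive, and a name or tag with no match is assumed to fall through to the next rule.

```typescript
interface Voice {
  reference_id: string;
  name: string;
  tags: string[];
}

// Hypothetical request fields, named after the priority list in the Notes.
interface VoiceRequest {
  reference_id?: string;
  reference_name?: string;
  reference_tag?: string;
}

function pickVoice(
  req: VoiceRequest,
  voices: Voice[],
  defaultId?: string
): string | undefined {
  // 1. An explicit reference_id wins, even if it is not in the configured list.
  if (req.reference_id) {
    return req.reference_id;
  }
  // 2. Then a configured voice whose name matches (case-insensitive: an assumption).
  if (req.reference_name) {
    const wantedName = req.reference_name.toLowerCase();
    const byName = voices.find((v) => v.name.toLowerCase() === wantedName);
    if (byName) return byName.reference_id;
  }
  // 3. Then the first configured voice carrying the requested tag.
  if (req.reference_tag) {
    const wantedTag = req.reference_tag.toLowerCase();
    const byTag = voices.find((v) => v.tags.some((t) => t.toLowerCase() === wantedTag));
    if (byTag) return byTag.reference_id;
  }
  // 4. Otherwise fall back to the default reference, if one is set.
  return defaultId;
}

const configuredVoices: Voice[] = [
  { reference_id: "id1", name: "Alice", tags: ["female", "english"] },
  { reference_id: "id2", name: "Bob", tags: ["male", "japanese"] },
  { reference_id: "id3", name: "Carol", tags: ["female", "japanese", "anime"] },
];

console.log(pickVoice({ reference_tag: "anime" }, configuredVoices, "id1"));
```

With the three voices from the Multiple Voices Mode example, a request tagged "anime" resolves to Carol's reference, while a request that carries both a reference_id and a name resolves to the ID.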