MCP

Generate Speech from Text and Upload to S3

Name: Generate Speech from Text and Upload to S3
Availability: OnlineOnly
Author: thewh1teagle

Kokoro TTS MCP Server generates MP3s from text with optional S3 upload. Install via UV or source.

Works with: s3, ffmpeg

Updated 4 months ago

Version model-files-v1.0

Models

claude

Why it matters

Automate the creation of audio content by converting text into MP3 files with the Kokoro TTS model. Optionally upload the generated audio to S3 storage for easy access and distribution.

Outcomes

What it gets done

Convert text input into speech using the Kokoro TTS model.

Generate MP3 audio files from synthesized speech.

Automatically upload generated MP3 files to an S3 bucket.

Configure voice, language, and speech speed for audio generation.

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/vb-kokoro-tts | bash

Capabilities

Tools your agent gets

text_to_speech

Generate MP3 files from text using Kokoro TTS with configurable voice, speed, and language settings.

upload_to_s3

Automatically upload generated MP3 files to S3 storage with optional local file cleanup.
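
The upload-then-cleanup behavior described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the function name `upload_mp3`, the bucket name, and the key scheme are hypothetical. The only external call assumed is boto3's `upload_file(Filename, Bucket, Key)` on an S3 client, which is passed in so the sketch stays independent of credentials.

```python
from pathlib import Path


def upload_mp3(s3_client, mp3_path, bucket, key=None, delete_local=False):
    """Upload one MP3 and optionally delete the local copy afterwards.

    `s3_client` is any object exposing boto3's
    `upload_file(Filename, Bucket, Key)` method, e.g. `boto3.client("s3")`.
    The bucket name and key scheme are hypothetical.
    """
    path = Path(mp3_path)
    if not path.is_file():
        raise FileNotFoundError(path)
    # Default the object key to the local file name (an assumption).
    key = key or path.name
    s3_client.upload_file(str(path), bucket, key)
    # Remove the local file only after the upload call returned without raising.
    if delete_local:
        path.unlink()
    return f"s3://{bucket}/{key}"
```

With real credentials configured, a call might look like `upload_mp3(boto3.client("s3"), "speech.mp3", "my-audio-bucket", delete_local=True)`. Deleting only after a successful upload avoids losing audio when the transfer fails.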

Overview

Kokoro TTS MCP Server

What it does

The Kokoro TTS MCP Server is a connector that synthesizes text into MP3 audio files using the Kokoro TTS model. It offers features such as optional automatic upload of generated MP3s to S3 storage, support for multiple voices and languages, configurable speech speed, and automatic cleanup of old audio files.
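
The listing does not say how the automatic cleanup of old audio files works. One plausible approach, shown here purely as a sketch, is to delete MP3 files whose modification time is older than a cutoff. The function name `remove_old_mp3s` and the 24-hour default are assumptions, not the server's documented settings.

```python
import time
from pathlib import Path


def remove_old_mp3s(directory, max_age_hours=24.0, now=None):
    """Delete *.mp3 files in `directory` last modified more than
    `max_age_hours` ago and return the sorted names of the deleted files.

    The 24-hour default is an assumption for illustration.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    removed = []
    for path in Path(directory).glob("*.mp3"):
        # Compare the file's modification time against the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Passing `now` explicitly makes the cutoff deterministic, which is convenient when testing the function.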

How it connects

Use this server when you need to integrate text-to-speech functionality into an AI client or application. It is particularly useful for scenarios requiring programmatic audio generation, with the added benefit of cloud storage integration for generated audio assets.
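
An MCP client invokes a server tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for the `text_to_speech` tool listed under Capabilities. The argument names (`text`, `voice`, `speed`, `lang`) and their default values are assumptions for illustration; the server's published tool schema is the authority on the real parameter names.

```python
import json


def build_tts_call(text, voice="af_sarah", speed=1.0, lang="en-us", request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for the text_to_speech tool.

    The argument names and defaults are assumptions; check the server's
    tool schema before relying on them.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "text_to_speech",
            "arguments": {
                "text": text,
                "voice": voice,
                "speed": speed,
                "lang": lang,
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_tts_call("Hello from Kokoro"), indent=2))
```

In practice an MCP client library usually constructs this envelope for you; the sketch only shows what travels over the wire.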

Source README

kokoro-onnx

TTS with ONNX Runtime, based on Kokoro-TTS

🚀 Version 1.0 models are out now! 🎉

https://github.com/user-attachments/assets/00ca06e8-bbbd-4e08-bfb7-23c0acb10ef9

Features

Supports multiple languages
Fast: near real-time performance on macOS M1
Offers multiple voices
Lightweight: ~300MB (quantized: ~80MB)

Setup

pip install -U kokoro-onnx

Instructions

Install uv for an isolated Python environment (recommended).

pip install uv

Create a new project folder (use any name).
Run the following in the project folder:

uv init -p 3.12
uv add kokoro-onnx soundfile

Paste the contents of examples/save.py into hello.py.
Download the files kokoro-v1.0.onnx and voices-v1.0.bin and place them in the same directory.
Run:

uv run hello.py

You can edit the text in hello.py.

That's it! audio.wav should be created.
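
For orientation, a hello.py along these lines should produce audio.wav once the two model files are in place. This is a sketch based on the kokoro-onnx `Kokoro(model_path, voices_path)` constructor and its `create(text, voice=..., speed=..., lang=...)` method, not a verbatim copy of examples/save.py. The helper names `missing_model_files` and `synthesize`, the sample text, and the voice `af_sarah` are assumptions chosen for illustration.

```python
from pathlib import Path

REQUIRED_FILES = ("kokoro-v1.0.onnx", "voices-v1.0.bin")


def missing_model_files(directory="."):
    """Return the required model files that are not present in `directory`."""
    return [name for name in REQUIRED_FILES if not (Path(directory) / name).is_file()]


def synthesize(text, out_path="audio.wav", voice="af_sarah", speed=1.0, lang="en-us"):
    """Synthesize `text` to a WAV file with kokoro-onnx (voice is an assumed default)."""
    # Imported lazily so the file check above works without the packages installed.
    import soundfile as sf
    from kokoro_onnx import Kokoro

    kokoro = Kokoro("kokoro-v1.0.onnx", "voices-v1.0.bin")
    samples, sample_rate = kokoro.create(text, voice=voice, speed=speed, lang=lang)
    sf.write(out_path, samples, sample_rate)
    return out_path


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("Created", synthesize("Hello. This audio was generated by kokoro!"))
```

Checking for the model files first gives a clearer message than the loader error you would otherwise see when a download was skipped.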

Examples

See examples

Voices

See the latest voices and languages in Kokoro-82M/VOICES.md

Note: As of v1.0, using the misaki G2P package is recommended; see the examples.

Contribute

See CONTRIBUTE.md

Discussion