Coqui TTS (Open Source Speech Studio API)

Coqui TTS (Open Source Speech Studio API) MCP Connector for Claude

Generate high-quality speech from text using Coqui TTS. List available models and synthesize audio directly from your AI agent.

2 tools · Official Vinkius Partner · Updated Jun 28, 2026

Connect your Coqui TTS server to any AI agent to enable advanced text-to-speech capabilities. This MCP server allows you to interact with your self-hosted or cloud-based Coqui Speech Studio API through natural conversation.

What you can do

  • Model Discovery — List all available text-to-speech models currently loaded on your Coqui server
  • Speech Synthesis — Convert any text string into spoken audio instantly using the server's synthesis engine
  • Metadata Retrieval — Get detailed information about generated audio files and model configurations

How it works

  1. Subscribe to this server
  2. Enter your Coqui Server URL (e.g., your local or hosted API endpoint)
  3. Start generating high-quality voices from Claude, Cursor, or any MCP-compatible client
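
As a rough illustration of what the Coqui Server URL in step 2 points at, the sketch below builds a synthesis request URL from a base server address. It assumes the open-source `tts-server` demo that ships with Coqui TTS, which by default listens on port 5002 and accepts a `text` parameter at `/api/tts`. This page does not say which routes the connector itself calls, so the `route` default and the helper name `build_tts_url` are illustrative only.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_tts_url(server_url: str, text: str, route: str = "/api/tts") -> str:
    """Return a GET URL asking a Coqui server to synthesize `text`.

    `route` is an assumption based on the open-source tts-server demo;
    a hosted deployment may expose a different path.
    """
    parts = urlsplit(server_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not a usable server URL: {server_url!r}")
    # Append the route to any base path on the server URL, avoiding a double slash.
    path = parts.path.rstrip("/") + route
    query = urlencode({"text": text})
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))


print(build_tts_url("http://localhost:5002", "Welcome to the future of open source AI"))
# http://localhost:5002/api/tts?text=Welcome+to+the+future+of+open+source+AI
```

Validating the scheme and host up front means a mistyped server URL fails immediately rather than at the first synthesis request.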

Who is this for?

  • Developers — integrate voice synthesis into applications without leaving the code editor
  • Content Creators — quickly generate voiceovers or speech samples using open-source models
  • AI Researchers — test and compare different TTS models directly through an agent interface

Tags: text-to-speech, tts, coqui-ai, speech-synthesis, voice-generation

2 tools expose this connector's capabilities to your AI agent.

list_models

List available TTS models

synthesize_speech

Synthesize speech from text. Returns metadata about the generated audio file.
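
For readers curious what an MCP client sends when it invokes these two tools, here is a minimal sketch of the JSON-RPC 2.0 `tools/call` requests defined by the Model Context Protocol. The tool names come from the list above. The `text` argument name is an assumption, since this page does not publish the input schema for `synthesize_speech`, and `tool_call` is a hypothetical helper rather than part of any SDK.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

# Each JSON-RPC request needs an id so the response can be matched to it.
_request_ids = count(1)


def tool_call(name: str, arguments: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build an MCP `tools/call` request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# list_models is shown here without arguments.
list_req = tool_call("list_models")

# Assumed argument name: "text". Check the tool's published schema before relying on it.
synth_req = tool_call(
    "synthesize_speech",
    {"text": "Welcome to the future of open source AI"},
)

print(json.dumps(synth_req, indent=2))
```

In normal use the client builds these messages for you, so the sketch is mainly useful for reading logs or debugging a connection.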

See how to talk to your AI agent using Coqui TTS (Open Source Speech Studio API).

List all available TTS models on my Coqui server.

I've retrieved the available models from your Coqui server. You have access to models like 'tts_models/en/ljspeech/vits' and 'tts_models/multilingual/multi-dataset/xtts_v2'. Which one would you like to use?

Synthesize the text 'Welcome to the future of open source AI' into speech.

I am generating the audio for that text now... Done! The speech has been synthesized successfully using your default Coqui model. You can find the audio metadata and file details here.

What models can I use for speech synthesis?

Let me check your Coqui server... It currently reports 3 active models: a standard English model, a multilingual XTTS model, and a fast VITS model. Would you like to see the full technical IDs for these?

You can use the `list_models` tool. Your agent will query the Coqui server and return a list of all available TTS models ready for synthesis.
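
The model IDs quoted in the example conversation, such as 'tts_models/en/ljspeech/vits', follow Coqui's four-part naming layout of type, language, dataset, and model name. A small sketch of splitting such an ID into its parts follows; `ModelId` and `parse_model_id` are hypothetical names, and the exact shape of the list returned by `list_models` is not documented on this page.

```python
from typing import NamedTuple


class ModelId(NamedTuple):
    kind: str          # e.g. "tts_models"
    language: str      # e.g. "en" or "multilingual"
    dataset: str       # e.g. "ljspeech"
    architecture: str  # e.g. "vits" or "xtts_v2"


def parse_model_id(model_id: str) -> ModelId:
    """Split a Coqui-style model ID into its four slash-separated parts."""
    parts = model_id.split("/")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected type/language/dataset/model, got {model_id!r}")
    return ModelId(*parts)


for raw in (
    "tts_models/en/ljspeech/vits",
    "tts_models/multilingual/multi-dataset/xtts_v2",
):
    parsed = parse_model_id(raw)
    print(f"{parsed.architecture}: language={parsed.language}, dataset={parsed.dataset}")
```

Splitting the ID this way makes it straightforward to filter a model list by language or architecture before choosing one for synthesis.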