XTTS-v2

XTTS-v2 provides cross-lingual text-to-speech with voice cloning, allowing the same voice to speak multiple languages.

Overview

Property	Value
Provider	Coqui
Cost	80 credits
Modality	Audio
Duration	Variable
Prompt Required	Yes

What It's Best For

Multilingual content — Same voice, different languages
Localization — Global content distribution
Voice cloning — Custom voice models
International brands — Consistent global voice
Dubbing — Same voice for translations

Inputs

Text (Required)

The text to speak.

Connection Color: 🟡 Yellow

Voice Sample (Optional)

A reference audio clip of the voice to clone.

Connection Color: 🟠 Orange

Configuration

Language

Type: Select

Options: English, Spanish, French, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, Japanese, Korean, Hindi
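
When the language is chosen from a script rather than from the dropdown, the display names above have to be mapped to language codes. The following is a minimal sketch, assuming the codes used by the open-source Coqui XTTS-v2 model (two-letter codes, with `zh-cn` for Chinese). The `LANGUAGE_CODES` table and the `to_language_code` helper are hypothetical names for illustration, not part of this node.

```python
# Display name -> language code.
# Assumption: these are the codes used by Coqui's open-source XTTS-v2 model;
# the node itself may use different identifiers internally.
LANGUAGE_CODES = {
    "English": "en",
    "Spanish": "es",
    "French": "fr",
    "German": "de",
    "Italian": "it",
    "Portuguese": "pt",
    "Polish": "pl",
    "Turkish": "tr",
    "Russian": "ru",
    "Dutch": "nl",
    "Czech": "cs",
    "Arabic": "ar",
    "Chinese": "zh-cn",
    "Japanese": "ja",
    "Korean": "ko",
    "Hindi": "hi",
}


def to_language_code(name: str) -> str:
    """Return the code for a display name, ignoring case and surrounding spaces."""
    key = name.strip().title()
    try:
        return LANGUAGE_CODES[key]
    except KeyError:
        raise ValueError(f"Unsupported language: {name!r}") from None
```

Rejecting unknown names up front keeps a typo from reaching the model and costing credits on a failed run.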

Speed

Type: Slider
Range: 0.5 - 2.0
Default: 1.0

Seed

Type: Number
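
The three settings above can be checked before a run is submitted. This is a sketch only: the `XttsSettings` class is a hypothetical container, and treating an unset seed as "let the model pick one" is an assumption, since the page does not describe the seed's default behavior. Only the speed range and default come from the documented values.

```python
from dataclasses import dataclass
from typing import Optional

SPEED_MIN, SPEED_MAX = 0.5, 2.0  # documented slider range


@dataclass(frozen=True)
class XttsSettings:
    """Hypothetical container for the node's configuration values."""

    language: str
    speed: float = 1.0          # documented default
    seed: Optional[int] = None  # assumption: None means no fixed seed

    def __post_init__(self) -> None:
        # Reject speeds outside the slider range instead of silently clamping.
        if not SPEED_MIN <= self.speed <= SPEED_MAX:
            raise ValueError(
                f"speed must be between {SPEED_MIN} and {SPEED_MAX}, got {self.speed}"
            )
        if self.seed is not None and not isinstance(self.seed, int):
            raise TypeError("seed must be an integer or None")
```

For example, `XttsSettings("French", speed=1.25, seed=42)` is accepted, while `XttsSettings("French", speed=2.5)` raises `ValueError`.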

Output

Type: Audio
Connection Color: 🟠 Orange

Use Cases

Multilingual Narration

Narrate the same script in English, Spanish, and French with one voice,
keeping the brand voice consistent across languages.
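
For readers who want to reproduce this use case outside the node, the sketch below plans one synthesis job per language, all sharing a single voice sample. It assumes the open-source Coqui `TTS` Python package and its `tts_to_file` method; whether this node calls the model the same way is not stated on this page. The function names, output paths, and the `brand_voice.wav` file name are hypothetical.

```python
from pathlib import Path

# Model identifier used by the open-source Coqui TTS package for XTTS-v2.
MODEL_NAME = "tts_models/multilingual/multi-dataset/xtts_v2"


def plan_narration(scripts: dict[str, str], voice_sample: str,
                   out_dir: str = "out") -> list[dict]:
    """Build one set of tts_to_file() keyword arguments per language.

    `scripts` maps a language code (e.g. "en") to the text in that language.
    Every job reuses the same `voice_sample`, which keeps the voice consistent.
    """
    return [
        {
            "text": text,
            "speaker_wav": voice_sample,
            "language": code,
            "file_path": str(Path(out_dir) / f"narration_{code}.wav"),
        }
        for code, text in scripts.items()
    ]


def synthesize(jobs: list[dict]) -> None:
    """Run the planned jobs locally.

    Requires the Coqui `TTS` package and downloads the model on first use.
    """
    from TTS.api import TTS  # imported lazily so planning works without the package

    tts = TTS(MODEL_NAME)
    for job in jobs:
        Path(job["file_path"]).parent.mkdir(parents=True, exist_ok=True)
        tts.tts_to_file(**job)
```

A typical call would be `synthesize(plan_narration({"en": "Welcome.", "es": "Bienvenidos.", "fr": "Bienvenue."}, "brand_voice.wav"))`. Splitting planning from synthesis lets the job list be reviewed, or the text checked by a native speaker, before any audio is generated.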

Global Marketing

Announce a product in multiple languages
with the same spokesperson voice
and localized content.

Dubbing

Translate video narration into a new language
while maintaining the original voice characteristics.

Supported Languages

Language	Quality
English	⭐⭐⭐⭐⭐
Spanish	⭐⭐⭐⭐⭐
French	⭐⭐⭐⭐⭐
German	⭐⭐⭐⭐
Chinese	⭐⭐⭐⭐
Japanese	⭐⭐⭐⭐
Others	⭐⭐⭐

Tips for Best Results

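Match language — A voice sample recorded in the target language helps
Clear samples — Clean, noise-free recordings give better cloning results
Native text — Have a native speaker write or review the text
Test languages — Quality varies by language, so test each one you plan to use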

Related Nodes

Voice Cloning — Single language
Dia TTS — Multi-speaker
Kokoro TTS — Fast generation