MARS 8 TTS Models

The MARS 8 family consists of zero-shot, multilingual Text-to-Speech (TTS) models designed to cover a wide range of production needs. The models differ in latency, quality, controllability, and ideal use case, so you can choose the right fit for your application. MARS-Pro and MARS-Flash are also available on all major cloud providers.

Model Summary

Model	Sample Rate	Notes
mars-8.1-flash-beta	48kHz	Beta MARS 8.1 model, optimized for faster generation. Try this model when quality is the priority, as it may perform much better for pronunciation, expressiveness with high-pitch references, overall prosody, accent control, accent coverage, and quality across different accents.
mars-8.1-pro-beta	48kHz	Beta MARS Pro model. Try this model when quality is the priority, as it may perform much better than mars-pro for pronunciation, expressiveness with high-pitch references, overall prosody, accent control, accent coverage, and quality across different accents.
mars-flash	22.05kHz/48kHz	Ultra-low latency TTS for real-time agents, call center agents, and live conversational assistants.
mars-pro	48kHz	Balanced speed and fidelity for expressive real-time speech, dubbing, audiobooks, and digital media.
mars-instruct	22.05kHz	Director-level emotional and prosodic control with embedded expressive text tags.
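
As a rough illustration, the summary table can be mirrored in a small lookup structure so that an application picks a model by output sample rate. This is a minimal sketch: the `ModelInfo` class and the `models_for_sample_rate` helper are hypothetical names introduced here, not part of any SDK. Only the model IDs and sample rates come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    """One row of the model summary table (hypothetical helper type)."""
    model_id: str
    sample_rates_hz: tuple


# Model IDs and sample rates copied from the summary table.
MODELS = (
    ModelInfo("mars-8.1-flash-beta", (48000,)),
    ModelInfo("mars-8.1-pro-beta", (48000,)),
    ModelInfo("mars-flash", (22050, 48000)),
    ModelInfo("mars-pro", (48000,)),
    ModelInfo("mars-instruct", (22050,)),
)


def models_for_sample_rate(rate_hz: int) -> list:
    """Return the IDs of all models that list the given sample rate."""
    return [m.model_id for m in MODELS if rate_hz in m.sample_rates_hz]


if __name__ == "__main__":
    print(models_for_sample_rate(22050))
    print(models_for_sample_rate(48000))
```

A lookup like this only narrows the candidates. Latency and controllability, described per model below, still decide the final choice.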

MARS-8.1-Flash-Beta (48kHz)

Faster preview access to MARS 8.1 quality improvements

Model ID: mars-8.1-flash-beta
Primary Use Cases:
- Faster evaluation of MARS 8.1 generation quality
- Lower-latency workflows that still need stronger accent handling
- Comparing output quality and speed against mars-flash and mars-8.1-pro-beta
Quality Improvements: Provides the same quality as MARS 8.1, including much better quality across different accents, while being faster than mars-8.1-pro-beta.
Notes: This is a beta model. For best results with either MARS 8.1 beta model, use a reference voice in the same language and accent as your target language. Test with your target voices, languages, accents, and audio formats before using either beta model in production workflows.

MARS-8.1-Pro-Beta (48kHz)

Preview access to the newest Pro-quality speech model

Model ID: mars-8.1-pro-beta
Primary Use Cases:
- Evaluating the latest MARS Pro generation quality
- High-fidelity TTS experiments before production rollout
- Comparing output quality against mars-pro
Quality Improvements: May perform much better than mars-pro for pronunciation, overall prosody, accent control, accent coverage, and quality across different accents. It is also designed to improve expressiveness, especially for high-pitch reference voices.
Notes: This is a beta model. For best results with MARS 8.1 beta models, use a reference voice in the same language and accent as your target language. Test it with your target voices, languages, accents, and audio formats before using it in production workflows.

MARS-Flash (22.05kHz/48kHz)

Ultra-low latency TTS for real-time agents and assistants

Parameters: 600M
TTFB (time to first byte): As low as 150 ms on certain GPUs, such as Blackwell
Primary Use Cases:
- Agentic conversations
- Call center agents
- Live conversational assistants

MARS-Pro (48kHz)

Balanced speed and fidelity for expressive real-time speech

Parameters: 600M
TTFB: 800 ms – 2 s
Primary Use Cases:
- Real-time translation with voice and emotion transfer
- Expressive dubbing
- Audiobooks and digital media
Notes: Delivers best overall performance when speed is not the primary constraint, especially with short or challenging reference audio.

MARS-Instruct (22.05kHz)

Director-level emotional and prosodic control

Parameters: 1.2B
TTFB: Higher latency (not intended for real-time use)
Primary Use Cases:
- High-end TV and film production
- Movie dubbing and post-production editing
Capabilities:
- Independent control of speaker identity and prosody
- Style and emotion can be tuned using both:
  - A reference audio sample
  - A textual description of desired prosody

MARS-Nano

Highly efficient TTS for on-device deployment

Parameters: 50M
TTFB: 500 ms – 2 s, depending on available compute
Primary Use Cases:
- On-device applications
- Environments with strict memory and compute constraints
Deployment Notes: Currently deployed with partners and providers such as Broadcom.

Tips For Best Results:

For text that contains numbers, expand the numbers into words. For example, write “123” as “one hundred twenty three” or “one two three”, depending on how it should be read.
For code-switched sentences, transliterate the text into your chosen TTS language. We are improving the model's handling of both cases, but in practice most LLM outputs fed to the model already include these conversions, so we have focused more on other quality-related parameters.
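
The number-expansion tip can be sketched as a small preprocessing step that runs before text is sent to the model. This is a minimal sketch assuming English text and plain non-negative integers. The helper names (`int_to_words`, `digits_to_words`, `expand_numbers`) are hypothetical and not part of any MARS SDK, and decimals, currencies, and dates are out of scope.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def int_to_words(n: int) -> str:
    """Spell out 0 <= n < 1,000,000 in English, e.g. 123 -> 'one hundred twenty three'."""
    if not 0 <= n < 1_000_000:
        raise ValueError("only 0..999999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + ("" if rest == 0 else " " + int_to_words(rest))
    thousands, rest = divmod(n, 1000)
    return int_to_words(thousands) + " thousand" + ("" if rest == 0 else " " + int_to_words(rest))


def digits_to_words(digits: str) -> str:
    """Read a digit string one digit at a time, e.g. '123' -> 'one two three'."""
    return " ".join(ONES[int(d)] for d in digits)


def expand_numbers(text: str, digit_by_digit: bool = False) -> str:
    """Replace each run of ASCII digits in text with its spoken form."""
    def replace(match):
        digits = match.group(0)
        # Codes with leading zeros and very long numbers are read digit by digit.
        has_leading_zero = len(digits) > 1 and digits.startswith("0")
        if digit_by_digit or has_leading_zero or len(digits) > 6:
            return digits_to_words(digits)
        return int_to_words(int(digits))
    return re.sub(r"[0-9]+", replace, text)


if __name__ == "__main__":
    print(expand_numbers("Your order 123 has shipped."))
    print(expand_numbers("Your code is 123.", digit_by_digit=True))
```

Whether a number is read as a quantity or digit by digit depends on context (an amount versus a confirmation code), so the choice is left to the caller through `digit_by_digit`.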

Advanced Customization

Fine-tune the audio with additional parameters that control the performance, style, and quality of the generated speech. Send them in the request payload. More details are available in the API Reference.

user_instructions: Guide the voice’s delivery (e.g., “Warm, clear, and conversational”). Only supported with mars-instruct.
output_configuration: Set the audio format (wav, mp3), and apply enhancements.
voice_settings: Enhance reference audio quality, maintain the source accent, or adjust the speaking rate.
inference_options: Adjust stability, temperature, and speaker similarity for unique results.
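
As a hedged illustration of how these parameters might be assembled, the sketch below builds a request payload as a Python dictionary. Only the four top-level parameter names and the `mars-instruct` restriction come from the list above. The `text` and `model` keys, every nested key (`format`, `speaking_rate`, `temperature`), and the `build_tts_payload` helper are assumptions made for illustration, so check the API Reference for the actual schema.

```python
import json


def build_tts_payload(text, model, user_instructions=None,
                      output_configuration=None, voice_settings=None,
                      inference_options=None):
    """Assemble a request payload, omitting any option that was not set.

    The "text" and "model" key names are assumptions, not confirmed API fields.
    """
    # user_instructions is documented as supported only with mars-instruct.
    if user_instructions is not None and model != "mars-instruct":
        raise ValueError("user_instructions is only supported with mars-instruct")

    payload = {"text": text, "model": model}
    optional = {
        "user_instructions": user_instructions,
        "output_configuration": output_configuration,
        "voice_settings": voice_settings,
        "inference_options": inference_options,
    }
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


if __name__ == "__main__":
    payload = build_tts_payload(
        "Welcome back. How can I help you today?",
        "mars-instruct",
        user_instructions="Warm, clear, and conversational",
        output_configuration={"format": "wav"},   # nested key is hypothetical
        voice_settings={"speaking_rate": 1.0},    # nested key is hypothetical
        inference_options={"temperature": 0.7},   # nested key is hypothetical
    )
    print(json.dumps(payload, indent=2))
```

Validating the `user_instructions` restriction on the client side surfaces the mistake before a request is sent.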

Language Support

MARS-8.1-Flash-Beta, MARS-8.1-Pro-Beta, MARS-Flash, MARS-Pro, and MARS-Instruct are released across multiple languages, collectively covering 99% of the world’s speaking population. Coverage varies by model:

Model	Languages
`mars-flash`, `mars-pro`	33
`mars-8.1-flash-beta`, `mars-8.1-pro-beta`	312
`mars-instruct`	141

See the full per-model locale list in Language Support.
