Emotional Voice Control

MARS 8.1 Beta and mars-instruct support expressive text controls: pronunciation overrides, non-verbal sounds, emotion tags, delivery hints, and SSML-style pauses. This page describes the syntax each model accepts.

MARS 8.1 Beta Text Controls

With mars-8.1-flash-beta and mars-8.1-pro-beta, you can add pronunciation and non-verbal controls directly in the text.

Pronunciation Control (English)

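To override the default English pronunciation of a word, replace it with its CMU Pronouncing Dictionary (ARPAbet) phonemes, written in uppercase, separated by spaces, and wrapped in square brackets. A digit after a vowel marks stress, as in AY1.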

Python

response = client.text_to_speech.tts(
    text="Please [W AY1 N D] the clock before the strong [W IH1 N D] starts.",
    voice_id=147320,
    language="en-us",
    speech_model="mars-8.1-flash-beta",
    output_configuration=StreamTtsOutputConfiguration(format="wav")
)

TypeScript

const response = await client.textToSpeech.tts({
  text: 'He plays the [B EY1 S] guitar while catching a [B AE1 S] fish.',
  voice_id: 147320,
  language: 'en-us',
  speech_model: 'mars-8.1-flash-beta',
  output_configuration: { format: 'wav' }
});
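The bracketed overrides above can be built and checked before sending a request. The following is a minimal sketch, not part of the SDK: `phoneme_tag` is a hypothetical helper, and the symbol sets are the ARPAbet inventory used by the CMU Pronouncing Dictionary. Whether the API requires a stress digit on every vowel is not stated on this page, so the sketch treats it as optional.

```python
import re

# ARPAbet symbols as used by the CMU Pronouncing Dictionary.
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER", "EY",
          "IH", "IY", "OW", "OY", "UH", "UW"}
CONSONANTS = {"B", "CH", "D", "DH", "F", "G", "HH", "JH", "K", "L", "M",
              "N", "NG", "P", "R", "S", "SH", "T", "TH", "V", "W", "Y",
              "Z", "ZH"}


def phoneme_tag(phonemes):
    """Hypothetical helper: validate phonemes and wrap them in brackets."""
    for p in phonemes:
        m = re.fullmatch(r"([A-Z]+)([012]?)", p)
        if m is None:
            raise ValueError(f"not an uppercase ARPAbet symbol: {p!r}")
        base, stress = m.groups()
        if base in VOWELS:
            continue  # vowels may carry a stress digit (0, 1, or 2)
        if base in CONSONANTS and stress == "":
            continue  # consonants never carry a stress digit
        raise ValueError(f"unknown or mis-stressed phoneme: {p!r}")
    return "[" + " ".join(phonemes) + "]"


# Build the text string that would be passed as `text` in the request.
text = f"Please {phoneme_tag(['W', 'AY1', 'N', 'D'])} the clock."
print(text)
```

Validating locally catches lowercase or misspelled symbols that would otherwise be read aloud or ignored.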

Non-verbal Symbols

Insert supported tags directly in the text to add expressive non-verbal sounds.

Python

response = client.text_to_speech.tts(
    text="[laughter] You really got me. I didn't see that coming at all.",
    voice_id=147320,
    language="en-us",
    speech_model="mars-8.1-flash-beta",
    output_configuration=StreamTtsOutputConfiguration(format="wav")
)

TypeScript

const response = await client.textToSpeech.tts({
  text: '[laughter] You really got me. I didn\'t see that coming at all.',
  voice_id: 147320,
  language: 'en-us',
  speech_model: 'mars-8.1-flash-beta',
  output_configuration: { format: 'wav' }
});

Supported Tags for MARS 8.1

MARS 8.1 Beta supports the following tags: [laughter], [sigh], [confirmation], [question], [surprise], [dissatisfaction].
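Because mars-instruct uses different tag names (for example [laughing] rather than [laughter]), it can help to check text for tags outside the MARS 8.1 list before sending it. This is a minimal sketch: `unsupported_tags` is a hypothetical helper, and it assumes tags are lowercase so that uppercase phoneme brackets are left alone.

```python
import re

# Tag names copied from the supported list for MARS 8.1 Beta.
SUPPORTED_TAGS = {"laughter", "sigh", "confirmation",
                  "question", "surprise", "dissatisfaction"}


def unsupported_tags(text):
    """Hypothetical helper: return lowercase bracketed tags not in the list."""
    # Only lowercase tokens are matched, so [W AY1 N D] is not treated as a tag.
    found = re.findall(r"\[([a-z_ ,]+)\]", text)
    return [tag for tag in found if tag not in SUPPORTED_TAGS]


print(unsupported_tags("[laughing] You really got me. [sigh]"))
```

An empty list means every lowercase tag in the text is in the supported set.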

Emotion Tags for MARS 8 Instruct

mars-instruct accepts a richer set of tags for emotion, delivery, and sound effects.

Emotion Tone Tags

For emotional tone (happy, sad, angry), use tags and match your text content to the emotion:

Tag	Example Text
`[happy]`	“[happy] We won the match! This is the best day ever!”
`[sad]`	“[sad] I… I don’t know if I can do this anymore…”

Important: The text content and punctuation must match the emotion for best results.

Sound Effect Tags

Sound effect tags go within your sentence where the action naturally occurs:

Tag	Example	Notes
`[laughing]`	“That’s ridiculous! [laughing] I can’t believe that!”	Produces laughter sound
`[sighing]`	“I guess we have to start over. [sighing] Alright, let’s begin.”	Produces sigh sound
`ahem ahem`	“So what I was going to say is… ahem ahem… never mind.”	Produces throat-clearing sound

Delivery Tags

Delivery tags provide tone guidance for the words that follow them.

Tag	Effect
`[shouting, angry, threatening]`	Agitated, confrontational delivery
`[whispering, secretive]`	Quiet, intimate delivery
`[empathetic, helpful]`	Caring, supportive delivery
`[happy, excited, promotional]`	Upbeat, promotional delivery
`[patient, teaching]`	Educational, measured delivery

Emotion Tag Gradation Guide

How To Use This

Read each tag list from left to right.
Left side means more balanced, subtle, or restrained.
Right side means more extreme, forceful, or obvious.
If you want the strongest controllable result, start from the rightmost tag.
If you want a more natural or less exaggerated result, move one or two steps left.

This is a practical TTS guide, not a dictionary guide. Some tags are ordered by how strongly they tend to push delivery, not just by literal meaning.

Examples Of Use

[angry] Who stole my cash!
[trembling] I don't know who did it...
[cheerful] Welcome back. I saved you a seat.
[commanding] Stop right there and listen carefully.

Tag Ladders

Balanced -> Extreme

Nervousness: [uneasy] -> [nervous] -> [anxious] -> [trembling]
Fear: [fearful] -> [scared] -> [terrified] -> [panicked]
Anger: [irritated] -> [angry] -> [furious] -> [enraged]
Sadness: [down] -> [sad] -> [melancholic] -> [depressed]
Joy: [cheerful] -> [happy] -> [joyful] -> [delighted]
Excitement: [energetic] -> [excited] -> [thrilled] -> [hyped]
Calmness: [relaxed] -> [calm] -> [peaceful] -> [serene]
Confidence: [assured] -> [confident] -> [certain] -> [bold]
Doubt: [uncertain] -> [doubtful] -> [hesitant] -> [skeptical]
Surprise: [surprised] -> [startled] -> [shocked] -> [astonished]
Disgust: [grossed_out] -> [disgusted] -> [repulsed] -> [revolted]
Pride: [satisfied] -> [accomplished] -> [proud] -> [fulfilled]
Shame: [embarrassed] -> [guilty] -> [ashamed] -> [humiliated]
Love: [warm] -> [affectionate] -> [loving] -> [tender]
Flirtation: [charming] -> [playful] -> [flirty] -> [teasing]
Sarcasm: [dry] -> [ironic] -> [sarcastic] -> [mocking]
Determination: [focused] -> [determined] -> [driven] -> [resolute]
Frustration: [annoyed] -> [irritated] -> [frustrated] -> [exasperated]
Relief: [calmed] -> [reassured] -> [relieved] -> [grateful]
Curiosity: [interested] -> [curious] -> [inquiring] -> [intrigued]
Boredom: [dull] -> [uninterested] -> [bored] -> [apathetic]
Awe: [inspired] -> [amazed] -> [awed] -> [wonderstruck]
Suspicion: [wary] -> [suspicious] -> [guarded] -> [distrustful]
Urgency: [urgent] -> [rushed] -> [intense] -> [pressured]
Authority: [firm] -> [authoritative] -> [directive] -> [commanding]
Politeness: [polite] -> [courteous] -> [respectful] -> [formal]
Gratitude: [appreciative] -> [thankful] -> [grateful] -> [warm]
Confusion: [uncertain] -> [puzzled] -> [confused] -> [lost]
Hopelessness: [resigned] -> [defeated] -> [hopeless] -> [despairing]
Playfulness: [lighthearted] -> [playful] -> [fun] -> [silly]

Practical Rule Of Thumb

Use the leftmost tag when you want the emotion to be present but not overpower the sentence.
Use the middle tags when you want clear emotional color without sounding theatrical.
Use the rightmost tag when you need the emotion to come through strongly and consistently.

Example:

Nervousness, subtle: [uneasy]
Nervousness, clear: [anxious]
Nervousness, strongest: [trembling]
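The rule of thumb can be expressed as a lookup over a ladder. The sketch below is illustrative only: `LADDERS` holds three of the ladders listed on this page, `pick_tag` is a hypothetical helper, and mapping "clear" to the second-from-right tag is an assumption chosen to match the nervousness example.

```python
# A few ladders copied from this page, ordered balanced -> extreme.
LADDERS = {
    "nervousness": ["uneasy", "nervous", "anxious", "trembling"],
    "anger": ["irritated", "angry", "furious", "enraged"],
    "joy": ["cheerful", "happy", "joyful", "delighted"],
}


def pick_tag(family, strength):
    """Hypothetical helper: strength is 'subtle', 'clear', or 'strongest'."""
    ladder = LADDERS[family]
    index = {
        "subtle": 0,                   # leftmost tag
        "clear": len(ladder) - 2,      # a middle tag (assumed choice)
        "strongest": len(ladder) - 1,  # rightmost tag
    }[strength]
    return f"[{ladder[index]}]"


print(pick_tag("nervousness", "clear"), "I don't know who did it...")
```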

How To Generalize This To New Emotions

The same approach works for emotions that are not listed above:

Start with 3 to 4 tags for the same emotional family.
Arrange them from balanced to extreme.
Test them on the same sentence.
Keep the tag that gives the clearest emotional control without distorting the sentence too much.
When in doubt, the most extreme tag often gives the strongest controllability.

General rule: same emotion family + left-to-right intensity ladder + same test sentence = reliable controllable TTS
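One way to run this comparison is to generate one input per tag from the same sentence and listen to each result. The following is a minimal sketch with a hypothetical helper, `ladder_variants`; it only builds the text strings and leaves the synthesis call to your own code.

```python
def ladder_variants(ladder, sentence):
    """Hypothetical helper: prefix the same sentence with each tag in a ladder."""
    return [f"[{tag}] {sentence}" for tag in ladder]


# Anger ladder from this page, ordered balanced -> extreme.
variants = ladder_variants(
    ["irritated", "angry", "furious", "enraged"],
    "Who stole my cash!",
)
for text in variants:
    print(text)  # pass each string as `text` in a separate request
```

Keeping the sentence fixed means any difference between the outputs comes from the tag alone.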

Combining Tags

For finer control, combine several descriptors inside one bracketed tag, separated by commas, and place a new tag wherever the delivery should change:

Python

response = client.text_to_speech.tts(
    text="[sighing, secretive] I have a secret to tell you... [happy, excited] We're going to Paris!",
    voice_id=147320,
    language="en-us",
    speech_model="mars-instruct",
    output_configuration=StreamTtsOutputConfiguration(format="wav")
)

TypeScript

const response = await client.textToSpeech.tts({
  text: '[sighing, secretive] I have a secret to tell you... [happy, excited] We\'re going to Paris!',
  voice_id: 147320,
  language: 'en-us',
  speech_model: 'mars-instruct',
  output_configuration: { format: 'wav' }
});
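When the text is assembled programmatically, the combined tags can be generated from structured segments. This is a sketch under stated assumptions: `combined_tag` and `tagged` are hypothetical helpers, and the comma-plus-space separator simply mirrors the examples on this page.

```python
def combined_tag(*descriptors):
    """Hypothetical helper: join descriptors into one bracketed tag."""
    cleaned = [d.strip().lower() for d in descriptors if d.strip()]
    if not cleaned:
        raise ValueError("at least one descriptor is required")
    return "[" + ", ".join(cleaned) + "]"


def tagged(segments):
    """Join (descriptors, sentence) pairs into one text string."""
    return " ".join(
        f"{combined_tag(*descriptors)} {sentence}"
        for descriptors, sentence in segments
    )


text = tagged([
    (("sighing", "secretive"), "I have a secret to tell you..."),
    (("happy", "excited"), "We're going to Paris!"),
])
print(text)
```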

Pauses

Add SSML-style breaks anywhere in your text for dramatic pauses:

You... must... understand... this. <break time='600ms'/> The future begins NOW.
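Break tags can also be generated rather than typed by hand. The sketch below uses two hypothetical helpers, `break_tag` and `join_with_pause`, and assumes durations are given in whole milliseconds as in the example above; the accepted range and any other units are not stated on this page.

```python
def break_tag(ms):
    """Hypothetical helper: build an SSML-style break of `ms` milliseconds."""
    if not isinstance(ms, int) or ms <= 0:
        raise ValueError("duration must be a positive integer of milliseconds")
    return f"<break time='{ms}ms'/>"


def join_with_pause(parts, ms):
    """Join text parts with the same pause between each pair."""
    return f" {break_tag(ms)} ".join(parts)


text = join_with_pause(
    ["You... must... understand... this.", "The future begins NOW."],
    600,
)
print(text)
```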

Best Practices

Use specific tags - Place concise delivery tags near the sentence they should affect
Match content to emotion - Text and punctuation should reflect the emotional tone
Place sound effects naturally - Tags like [laughing] and [sighing] work best within a sentence, at the point where the sound would occur
Keep tags short - Tags like [happy], [sad], or [whispering] work best when focused
Add pauses - Use <break time='600ms'/> for dramatic effect

Next Steps

Text to Speech

Get started with basic TTS using the Python or TypeScript SDK.

Choosing a Model

Compare mars-instruct with mars-flash and mars-pro.

Voice Cloning

Create custom voices for your emotional speech.

TTS with Accents

Generate speech in 140+ language accents.
