Overview
Transform text from information to experience. With mars-instruct, you can craft speech that captures subtle emotional states, dramatic pacing, and conversational dynamics: not just reading text, but performing it. You control expression by adding concise tags directly in the text, such as [happy], [sad], [laughing], or [sighing].
Emotion Tags
Emotion Tone Tags
For emotional tone (happy, sad, angry), use tags and match your text content to the emotion:

| Tag | Example Text |
|---|---|
| [happy] | "[happy] We won the match! This is the best day ever!" |
| [sad] | "[sad] I... I don't know if I can do this anymore..." |
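As a minimal sketch of the convention above, the helper below prepends a single emotion tag to a line of text. The name `tag_text` is hypothetical and not part of any SDK; it only builds the string you would then send to mars-instruct.

```python
def tag_text(tag: str, text: str) -> str:
    """Prefix text with one bracketed tag (hypothetical helper, not an SDK call)."""
    # Accept both "happy" and "[happy]" by stripping whitespace and brackets.
    name = tag.strip().strip("[]").strip()
    if not name:
        raise ValueError("tag must not be empty")
    return f"[{name}] {text.strip()}"


if __name__ == "__main__":
    print(tag_text("happy", "We won the match! This is the best day ever!"))
```

Keeping the tag and the sentence together in one helper makes it harder to pair a tag with text whose content contradicts the emotion.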
Sound Effect Tags
Sound effect tags go within your sentence where the action naturally occurs:

| Tag | Example | Notes |
|---|---|---|
| [laughing] | "That's ridiculous! [laughing] I can't believe that!" | Produces laughter sound |
| [sighing] | "I guess we have to start over. [sighing] Alright, let's begin." | Produces sigh sound |
| ahem ahem | "So what I was going to say is... ahem ahem... never mind." | Produces throat-clearing sound |
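The placement rule above can be sketched as a small string helper that drops a sound effect tag between two parts of a sentence. `insert_sound` is a hypothetical name used for illustration only; it does not call any API.

```python
def insert_sound(before: str, sound: str, after: str) -> str:
    """Place a sound effect tag between two text fragments (hypothetical helper)."""
    name = sound.strip().strip("[]").strip()
    if not name:
        raise ValueError("sound must not be empty")
    # The tag sits inline, at the point where the sound should occur.
    return f"{before.strip()} [{name}] {after.strip()}"


if __name__ == "__main__":
    print(insert_sound("That's ridiculous!", "laughing", "I can't believe that!"))
```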
Delivery Tags
Delivery tags provide tone guidance for the words that follow them.

| Tag | Effect |
|---|---|
| [shouting, angry, threatening] | Agitated, confrontational delivery |
| [whispering, secretive] | Quiet, intimate delivery |
| [empathetic, helpful] | Caring, supportive delivery |
| [happy, excited, promotional] | Upbeat, promotional delivery |
| [patient, teaching] | Educational, measured delivery |
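Delivery tags in the table above hold several comma-separated descriptors inside one pair of brackets. A minimal sketch of building such a tag follows; `delivery_tag` and `with_delivery` are hypothetical helper names, and the sample sentence is made up for illustration.

```python
def delivery_tag(*descriptors: str) -> str:
    """Join descriptors into one bracketed delivery tag (hypothetical helper)."""
    cleaned = [d.strip() for d in descriptors if d.strip()]
    if not cleaned:
        raise ValueError("at least one descriptor is required")
    return "[" + ", ".join(cleaned) + "]"


def with_delivery(text: str, *descriptors: str) -> str:
    """Put the delivery tag before the words it should affect."""
    return f"{delivery_tag(*descriptors)} {text.strip()}"


if __name__ == "__main__":
    print(with_delivery("Meet me by the back door.", "whispering", "secretive"))
```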
Emotion Tag Gradation Guide
How To Use This
- Read each tag list from left to right.
- Left side means more balanced, subtle, or restrained.
- Right side means more extreme, forceful, or obvious.
- If you want the strongest controllable result, start from the rightmost tag.
- If you want a more natural or less exaggerated result, move one or two steps left.
Examples Of Use
- [angry] Who stole my cash!
- [trembling] I don't know who did it...
- [cheerful] Welcome back. I saved you a seat.
- [commanding] Stop right there and listen carefully.
Tag Ladders
Balanced -> Extreme

- Nervousness: [uneasy] -> [nervous] -> [anxious] -> [trembling]
- Fear: [fearful] -> [scared] -> [terrified] -> [panicked]
- Anger: [irritated] -> [angry] -> [furious] -> [enraged]
- Sadness: [down] -> [sad] -> [melancholic] -> [depressed]
- Joy: [cheerful] -> [happy] -> [joyful] -> [delighted]
- Excitement: [energetic] -> [excited] -> [thrilled] -> [hyped]
- Calmness: [relaxed] -> [calm] -> [peaceful] -> [serene]
- Confidence: [assured] -> [confident] -> [certain] -> [bold]
- Doubt: [uncertain] -> [doubtful] -> [hesitant] -> [skeptical]
- Surprise: [surprised] -> [startled] -> [shocked] -> [astonished]
- Disgust: [grossed_out] -> [disgusted] -> [repulsed] -> [revolted]
- Pride: [satisfied] -> [accomplished] -> [proud] -> [fulfilled]
- Shame: [embarrassed] -> [guilty] -> [ashamed] -> [humiliated]
- Love: [warm] -> [affectionate] -> [loving] -> [tender]
- Flirtation: [charming] -> [playful] -> [flirty] -> [teasing]
- Sarcasm: [dry] -> [ironic] -> [sarcastic] -> [mocking]
- Determination: [focused] -> [determined] -> [driven] -> [resolute]
- Frustration: [annoyed] -> [irritated] -> [frustrated] -> [exasperated]
- Relief: [calmed] -> [reassured] -> [relieved] -> [grateful]
- Curiosity: [interested] -> [curious] -> [inquiring] -> [intrigued]
- Boredom: [dull] -> [uninterested] -> [bored] -> [apathetic]
- Awe: [inspired] -> [amazed] -> [awed] -> [wonderstruck]
- Suspicion: [wary] -> [suspicious] -> [guarded] -> [distrustful]
- Urgency: [urgent] -> [rushed] -> [intense] -> [pressured]
- Authority: [firm] -> [authoritative] -> [directive] -> [commanding]
- Politeness: [polite] -> [courteous] -> [respectful] -> [formal]
- Gratitude: [appreciative] -> [thankful] -> [grateful] -> [warm]
- Confusion: [uncertain] -> [puzzled] -> [confused] -> [lost]
- Hopelessness: [resigned] -> [defeated] -> [hopeless] -> [despairing]
- Playfulness: [lighthearted] -> [playful] -> [fun] -> [silly]
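One way to work with these ladders in code is to store each one as an ordered list and count steps back from the rightmost tag, following the "start from the rightmost tag, then move one or two steps left" guidance. The sketch below copies three ladders from the list above; `LADDERS` and `pick_tag` are hypothetical names, not part of any SDK.

```python
# A few ladders copied from the list above, ordered balanced -> extreme.
LADDERS = {
    "nervousness": ["uneasy", "nervous", "anxious", "trembling"],
    "anger": ["irritated", "angry", "furious", "enraged"],
    "joy": ["cheerful", "happy", "joyful", "delighted"],
}


def pick_tag(family: str, steps_left: int = 0) -> str:
    """Return a tag from a ladder, counted leftwards from the most extreme tag."""
    if steps_left < 0:
        raise ValueError("steps_left must be non-negative")
    ladder = LADDERS[family]
    # Clamp so that large step counts stop at the most balanced tag.
    index = max(0, len(ladder) - 1 - steps_left)
    return f"[{ladder[index]}]"


if __name__ == "__main__":
    print(pick_tag("anger"), "Who stole my cash!")      # strongest tag
    print(pick_tag("anger", 2), "Who stole my cash!")   # two steps more restrained
```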
Practical Rule Of Thumb
- Use the leftmost tag when you want the emotion to be present but not overpower the sentence.
- Use the middle tags when you want clear emotional color without sounding theatrical.
- Use the rightmost tag when you need the emotion to come through strongly and consistently.

For example, on the nervousness ladder:

- Nervousness, subtle: [uneasy]
- Nervousness, clear: [anxious]
- Nervousness, strongest: [trembling]
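The rule of thumb above can be sketched as a lookup from an intensity level to a ladder position. The level names and the `tag_for_level` helper are hypothetical. Mapping "clear" to the upper-middle tag is one interpretation, chosen so that the nervousness ladder yields [anxious] as in the example above.

```python
NERVOUSNESS = ["uneasy", "nervous", "anxious", "trembling"]  # balanced -> extreme


def tag_for_level(ladder: list[str], level: str) -> str:
    """Map 'subtle' / 'clear' / 'strongest' to a tag on an ordered ladder."""
    if not ladder:
        raise ValueError("ladder must not be empty")
    if level == "subtle":
        name = ladder[0]                  # leftmost: present but not overpowering
    elif level == "clear":
        name = ladder[len(ladder) // 2]   # a middle tag: clear but not theatrical
    elif level == "strongest":
        name = ladder[-1]                 # rightmost: strong and consistent
    else:
        raise ValueError(f"unknown level: {level!r}")
    return f"[{name}]"


if __name__ == "__main__":
    for level in ("subtle", "clear", "strongest"):
        print(level, tag_for_level(NERVOUSNESS, level))
```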
How To Generalize This To New Emotions
This same principle generalizes well to new emotions:

- Start with 3 to 4 tags for the same emotional family.
- Arrange them from balanced to extreme.
- Test them on the same sentence.
- Keep the tag that gives the clearest emotional control without distorting the sentence too much.
- When in doubt, the most extreme tag often gives the strongest controllability.
same emotion family + left-to-right intensity ladder + same test sentence = reliable controllable TTS
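A minimal sketch of the "same test sentence" step: generate one tagged variant per candidate tag so the outputs can be synthesized and compared side by side. `ladder_variants` is a hypothetical helper, and the anger ladder stands in for whatever candidate tags you are evaluating; the synthesis call itself is left out because it depends on the SDK you use.

```python
def ladder_variants(ladder: list[str], sentence: str) -> list[str]:
    """Return the same sentence once per candidate tag, in ladder order."""
    return [f"[{tag}] {sentence.strip()}" for tag in ladder]


if __name__ == "__main__":
    candidates = ["irritated", "angry", "furious", "enraged"]  # balanced -> extreme
    for line in ladder_variants(candidates, "Who stole my cash!"):
        # Each line would be sent to the TTS model and the audio compared by ear.
        print(line)
```

Holding the sentence fixed means any difference you hear comes from the tag rather than from the wording.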
Combining Tags
For precise control, combine multiple embedded emotion and delivery tags in the same text.
Pauses
Add SSML-style breaks, such as <break time='600ms'/>, anywhere in your text for dramatic pauses.
Best Practices
- Use specific tags - Place concise delivery tags near the sentence they should affect
- Match content to emotion - Text and punctuation should reflect the emotional tone
- Place sound effects naturally - Tags like [laughing], [sighing] work best within sentences
- Keep tags short - Tags like [happy], [sad], or [whispering] work best when focused
- Add pauses - Use <break time='600ms'/> for dramatic effect
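The guidance on combining tags and adding pauses can be sketched as plain string assembly. The `pause` helper and the sample script are hypothetical; only the `<break time='600ms'/>` form and the tags themselves come from this page.

```python
def pause(ms: int) -> str:
    """Build an SSML-style break of the form shown above (hypothetical helper)."""
    if ms <= 0:
        raise ValueError("pause length must be positive")
    return f"<break time='{ms}ms'/>"


# Each segment carries its own delivery or emotion tag; a break separates them.
segments = [
    "[whispering, secretive] I have something to tell you.",
    pause(600),
    "[happy] We won the match! [laughing] This is the best day ever!",
]
script = " ".join(segments)

if __name__ == "__main__":
    print(script)
```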
Next Steps
Text to Speech
Get started with basic TTS using the Python or TypeScript SDK.
Choosing a Model
Compare mars-instruct with mars-flash and mars-pro.
Voice Cloning
Create custom voices for your emotional speech.
TTS with Accents
Generate speech in 140+ language accents.