Create Text-to-Speech
Convert text to natural-sounding speech with customizable voice settings, allowing you to generate audio files for playback in your applications.
This endpoint turns written text into natural-sounding speech for narration, accessibility features, and interactive experiences in your applications. When you provide text along with voice preferences, our model creates an audio file that captures the nuances and natural flow of human speech.
Personalizing Your Voice
Voice Personality Options
You can shape the character of your generated voice by specifying these key attributes:
- Gender: Select the voice gender with a numeric code.
- Age: Set the voice age to reflect the intended audience or context.
Selecting Your Voice
Our platform offers two approaches to finding the perfect voice:
- Choose from our voice library: Browse our collection of ready-to-use voices through the `/list-voices` endpoint to find the perfect match for your project.
- Use your signature voice: Use the `/create-custom-voice` endpoint to clone an audio recording containing your signature voice.
Language Support
Our technology supports numerous languages and dialects for global reach. To see what’s available:
- Check the `/source-languages` or `/target-languages` endpoints for a complete list of supported languages with their corresponding IDs.
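As a rough illustration, the language lookup could be scripted as follows. This is a sketch under assumptions, not the official client: the base URL `https://api.example.com` is a placeholder, the listing endpoints are assumed to accept GET, and the helper names are hypothetical. The response shape is not described here, so the parsed JSON is returned unchanged.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not the real one


def build_language_request(kind: str, api_key: str) -> urllib.request.Request:
    """Build a request for /source-languages or /target-languages."""
    if kind not in ("source", "target"):
        raise ValueError("kind must be 'source' or 'target'")
    return urllib.request.Request(
        f"{BASE_URL}/{kind}-languages",
        headers={"x-api-key": api_key},
        method="GET",  # assumption: the listing endpoints use GET
    )


def fetch_languages(kind: str, api_key: str):
    """Send the request and return the decoded JSON body as-is."""
    with urllib.request.urlopen(build_language_request(kind, api_key)) as resp:
        return json.load(resp)
```

Calling `fetch_languages("target", api_key)` would then return whatever structure the endpoint provides, from which the language IDs can be read.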
How the Process Works
This endpoint uses an asynchronous workflow, which means your application can keep doing other things while the speech is being generated. Here’s how it works, step-by-step:
Submit Your Request
Start by sending in your text, along with voice preferences such as language, age, and gender. You can also specify a voice from our library or a custom voice you’ve created.
Receive Your `task_id`
Once your request is received, our system immediately returns a unique `task_id`. Use it to check on progress later, much like tracking a package.
Track Progress of Your Speech Generation
Use your `task_id` to check the current status of your request by calling the `/tts/{id}` endpoint. The response tells you whether your audio is still being generated, has completed, or has failed.
Retrieve the Final Audio File
When your task is marked as complete (`SUCCESS`), you’ll get a `run_id`. Use this ID to download your final audio file from the `/tts-result/{run_id}` endpoint.
This approach works particularly well for longer passages or when processing multiple requests simultaneously.
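The status-tracking step can be sketched as a small polling loop. The code below is illustrative only: the `status` and `run_id` response keys are assumed field names, the failure status names are guesses because this page only names `SUCCESS`, and `get_status` is a hypothetical callable you would supply to call `/tts/{id}` and return the decoded JSON.

```python
import time
from typing import Callable, Dict

# Assumption: failure status names are not listed on this page.
FAILURE_STATES = {"FAILURE", "FAILED"}


def wait_for_run_id(
    get_status: Callable[[str], Dict],
    task_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
) -> str:
    """Poll until the task reports SUCCESS, then return its run_id."""
    deadline = time.monotonic() + timeout
    while True:
        body = get_status(task_id)
        status = body.get("status")  # assumed key name
        if status == "SUCCESS":
            return body["run_id"]  # assumed key name
        if status in FAILURE_STATES:
            raise RuntimeError(f"task {task_id} failed: {body}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not finished after {timeout}s")
        time.sleep(interval)
```

Passing the status call in as a function keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned responses.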
Example: Creating Your First Audio
Here’s a practical example showing how to generate speech about Mars:
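A minimal sketch of such a request in Python follows. Everything not named elsewhere on this page is an assumption made for illustration: the base URL, the `POST /tts` creation path (inferred from the `/tts/{id}` status path), the body field names (`text`, `language`, `gender`, `age`), and the sample values.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host
CREATE_PATH = "/tts"                  # assumed creation path


def build_tts_request(api_key: str, text: str, language: int,
                      gender: int, age: int) -> urllib.request.Request:
    """Build the POST request that submits text for speech generation."""
    payload = {  # field names are illustrative assumptions
        "text": text,
        "language": language,  # ID from /source-languages or /target-languages
        "gender": gender,      # numeric gender code
        "age": age,
    }
    return urllib.request.Request(
        BASE_URL + CREATE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def submit(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


# Placeholder values; substitute a real key, language ID, and codes.
request = build_tts_request(
    api_key="YOUR_API_KEY",
    text="Mars is the fourth planet from the Sun.",
    language=1, gender=1, age=30,
)
```

Calling `submit(request)` would send the request and return the decoded response, from which the `task_id` can be read.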
With this approach, you can create engaging audio content that brings your text to life while your application continues performing other tasks during processing.
Authorizations
The `x-api-key` header is a custom header required to authenticate requests to our API. Include it in each request, with your API key as its value, to securely access our endpoints. You can find your API key(s) in the 'API' section of our studio website.
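One way to supply the header without hard-coding the key is to read it from the environment. This is a sketch only; the environment variable name `TTS_API_KEY` and the helper name are hypothetical choices, not part of the API.

```python
import os


def auth_headers(env_var: str = "TTS_API_KEY") -> dict:
    """Return request headers carrying the API key from the environment."""
    api_key = os.environ.get(env_var)  # env var name is a hypothetical choice
    if not api_key:
        raise RuntimeError(f"set {env_var} to your API key")
    return {"x-api-key": api_key}
```

The returned dictionary can be passed as the headers of any request to the API.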
Body
Response
A JSON object that contains the unique identifier for the task. It is returned when a text-to-speech create request is made, and is used to query the status of the running task.
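Reading the identifier out of that response might look like the sketch below. The key name `task_id` is assumed from the identifier's name used in the workflow description, and the helper name is hypothetical.

```python
import json


def extract_task_id(raw_body: str) -> str:
    """Return the task identifier from a create-request response body."""
    body = json.loads(raw_body)
    # Assumed shape: a JSON object with a "task_id" key.
    if not isinstance(body, dict) or not body.get("task_id"):
        raise ValueError(f"no task_id in response: {body!r}")
    return str(body["task_id"])
```

Failing loudly when the key is missing avoids polling with an empty identifier later on.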