Stream Text-to-Speech Audio
Convert text to speech in real-time with customizable voice characteristics, delivering audio content as it is generated for immediate use in your applications.
Because the audio is delivered in chunks as it is produced, playback can begin before synthesis of the full text has finished, which shortens the delay between sending a request and hearing the result.
Personalizing Your Voice
Selecting Voice Characteristics
When creating your speech, you can customize the voice characteristics to match your needs:
- Gender: Choose the voice gender by providing a simple number:
Gender | Corresponding Value |
---|---|
Not Known | 0 |
Male | 1 |
Female | 2 |
Not Applicable | 9 |
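As a small illustration, the codes in the table above can be kept in an enum so that request code does not rely on bare numbers. This is only a sketch: the field name "gender" in the request fragment is an assumption, since this page does not list the body parameters.

```python
from enum import IntEnum


class Gender(IntEnum):
    """Numeric gender codes taken from the table above."""

    NOT_KNOWN = 0
    MALE = 1
    FEMALE = 2
    NOT_APPLICABLE = 9


# Hypothetical request fragment: the field name "gender" is an assumption.
payload_fragment = {"gender": int(Gender.FEMALE)}
```

Using an enum also lets you convert a stored number back to a readable name, for example Gender(9).name.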
Finding Your Perfect Voice
You have two options for voices:
- Select from our extensive library of pre-made voices using the /list-voices endpoint.
- Create your own custom voice with the /create-custom-voice endpoint for a truly unique sound.
Choosing Your Language
Our platform supports multiple languages for your speech needs. Discover all available options through our /source-languages or /target-languages endpoints.
Understanding the Response
- When you call this endpoint, you will receive the audio content immediately as it is generated. The audio streams in the format you have specified (default is adts, but you can request other formats).
- Each response includes an X-Credits-Required header that transparently shows you the resource cost of your request.
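A minimal sketch of reading the credit cost from the response headers is shown below. The helper name credits_required is hypothetical, and the value is returned as a raw string because this page does not say how the header value is formatted.

```python
from typing import Mapping, Optional


def credits_required(headers: Mapping[str, str]) -> Optional[str]:
    """Return the raw X-Credits-Required value, or None if it is absent."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("x-credits-required")
```

With the requests library, response.headers is already case-insensitive, so response.headers.get("X-Credits-Required") works directly; the helper is only useful when the headers arrive as a plain dict.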
Available Output Formats
You can specify your preferred audio format using the output_format parameter:
Format | Description | Use Case |
---|---|---|
wav | Uncompressed audio format with excellent quality | High-fidelity applications, audio editing |
flac | Lossless compressed audio format that maintains high quality | Storage-efficient high-quality audio |
adts | AAC audio in an ADTS container, a standard streaming format with good quality | Default format, streaming applications |
pcm_s16le | Raw signed 16-bit little-endian PCM, standard quality, no file header | Basic audio processing, lower memory usage |
pcm_s32le | Raw signed 32-bit little-endian PCM, high quality, no file header | Professional audio work, highest quality |
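Raw PCM output carries no header, so most players cannot open it directly. The sketch below wraps pcm_s16le bytes in a WAV container using Python's standard wave module. The sample rate and channel count are not stated on this page, so they are parameters here and must be confirmed against the actual output before use.

```python
import io
import wave


def pcm_s16le_to_wav(pcm: bytes, sample_rate: int, channels: int = 1) -> bytes:
    """Wrap raw signed 16-bit little-endian PCM in a WAV container.

    sample_rate and channels are assumptions supplied by the caller;
    a wrong value changes playback speed or channel layout.
    Note: wave treats the input as native-endian, so this is correct
    as written on little-endian hosts (the common case).
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 16 bits = 2 bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

If you do not need to process the samples yourself, requesting wav directly avoids this step.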
Example Code
The streaming approach gives you immediate access to your audio, making it well suited to interactive applications, voice assistants, and real-time communication tools. By using Python's requests library with stream=True, you can write audio to disk or to a player chunk by chunk instead of holding the whole response in memory, which keeps generation efficient even for longer text passages.
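The approach described above can be sketched as follows. Treat it as an illustration under stated assumptions rather than the official sample: the URL is a placeholder, the body is assumed to be JSON, and the field names text, voice_id, language, and gender are hypothetical. Only the x-api-key header, the output_format parameter, and the X-Credits-Required header are named on this page.

```python
import requests

# Hypothetical values: replace with the real endpoint URL and your own key.
API_URL = "https://api.example.com/tts-stream"
API_KEY = "your-api-key"


def build_request(text, voice_id, language, gender=0, output_format="adts"):
    """Return (headers, payload) for a streaming request.

    Only "x-api-key" and "output_format" come from this page; the other
    field names are assumptions made for illustration.
    """
    headers = {"x-api-key": API_KEY}
    payload = {
        "text": text,
        "voice_id": voice_id,
        "language": language,
        "gender": gender,
        "output_format": output_format,
    }
    return headers, payload


def write_stream(response, path, chunk_size=8192):
    """Write response chunks to disk as they arrive; return bytes written."""
    written = 0
    with open(path, "wb") as audio_file:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                audio_file.write(chunk)
                written += len(chunk)
    return written


def stream_speech(text, voice_id, language, path, **options):
    """Send the request with stream=True and save the audio to path."""
    headers, payload = build_request(text, voice_id, language, **options)
    with requests.post(
        API_URL, headers=headers, json=payload, stream=True, timeout=60
    ) as response:
        response.raise_for_status()
        print("Credits required:", response.headers.get("X-Credits-Required"))
        return write_stream(response, path)
```

A call might look like stream_speech("Hello there", voice_id=1, language="en", path="speech.aac"), where the argument values are again placeholders. Splitting the file writing into write_stream keeps the network call separate, so the chunk handling can be swapped for a player or a socket without touching the request code.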
Authorizations
The x-api-key is a custom header required for authenticating requests to our API. Include this header in your request with the appropriate API key value to securely access our endpoints. You can find your API key(s) in the 'API' section of our studio website.
Body
Response
The response is of type file.