Convert text to speech in real time with customizable voice characteristics, delivering audio as it is generated for immediate use in your applications.
| Gender | Corresponding Value |
|---|---|
| Not Known | 0 |
| Male | 1 |
| Female | 2 |
| Not Applicable | 9 |
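
On the client side, the codes in the table above can be kept in one place. The following is a minimal sketch; the `Gender` enum, its member names, and the `parse_gender` helper are illustrative and not part of the API.

```python
from enum import IntEnum


class Gender(IntEnum):
    """Gender codes from the table above (names are illustrative)."""
    NOT_KNOWN = 0
    MALE = 1
    FEMALE = 2
    NOT_APPLICABLE = 9


def parse_gender(value: int) -> Gender:
    """Return the matching Gender, or raise ValueError for an unknown code."""
    return Gender(value)
```

Because `Gender` is an `IntEnum`, its members serialize as plain integers, so `Gender.FEMALE` can be placed directly in a JSON request body.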
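- Choose from the voices returned by the `/list-voices` endpoint.
- Create your own voice with the `/create-custom-voice` endpoint for a truly unique sound.
- Look up supported languages with the `/source-languages` or `/target-languages` endpoints.
- Audio is delivered in the format you choose (the default is `adts`, but you can request other formats).
- An `X-Credits-Required` header transparently shows you the resource cost of your request.

Select the audio format with the `output_format` parameter: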
| Format | Description | Use Case |
|---|---|---|
| `wav` | Uncompressed audio format with excellent quality | High-fidelity applications, audio editing |
| `flac` | Lossless compressed audio format that maintains high quality | Storage-efficient high-quality audio |
| `adts` | Standard streaming audio format with good quality | Default format, streaming applications |
| `pcm_s16le` | Standard quality raw audio format | Basic audio processing, lower memory usage |
| `pcm_s32le` | High-quality raw audio format | Professional audio work, highest quality |
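
The two raw formats carry no header, so most players need them wrapped in a container first. The sketch below wraps raw PCM bytes in a WAV container using Python's standard `wave` module. The sample widths follow from the format names (16-bit and 32-bit little-endian). The sample rate and channel count are not stated in this reference, so they are assumptions the caller must supply, and `pcm_to_wav` is a hypothetical helper name.

```python
import io
import wave

# Bytes per sample implied by the format names:
# s16le = signed 16-bit little-endian, s32le = signed 32-bit little-endian.
SAMPLE_WIDTH = {"pcm_s16le": 2, "pcm_s32le": 4}


def pcm_to_wav(pcm: bytes, output_format: str,
               sample_rate: int, channels: int = 1) -> bytes:
    """Wrap headerless PCM bytes in a WAV container and return the result.

    sample_rate and channels are assumptions supplied by the caller;
    this reference does not document them.
    """
    width = SAMPLE_WIDTH[output_format]
    frame_size = width * channels
    if len(pcm) % frame_size:
        raise ValueError("PCM data does not contain a whole number of frames")
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

If you do not need to process samples yourself, requesting `wav` or `flac` directly avoids this step.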
With `stream=True`, you can handle audio generation efficiently even for longer text passages.

The `x-api-key` is a custom header required for authenticating requests to our API. Include this header in your request with the appropriate API key value to securely access our endpoints. You can find your API key(s) in the 'API' section of our studio website.
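
Streaming and authentication can be combined roughly as follows. This is a sketch under several assumptions: the URL is a placeholder, the use of `POST` with a JSON body is a guess at the request shape, and `X-Credits-Required` is assumed to arrive as a response header. It relies on the third-party `requests` library, where `stream=True` defers downloading the body until it is iterated. The helper names are hypothetical.

```python
from typing import Iterable, Optional

# Hypothetical URL: substitute the real endpoint for this API.
TTS_STREAM_URL = "https://api.example.com/tts-stream"


def auth_headers(api_key: str) -> dict:
    """Build the request headers; x-api-key carries the API key."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    return {"x-api-key": api_key}


def save_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write audio chunks to path as they arrive; return bytes written."""
    written = 0
    with open(path, "wb") as out:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                out.write(chunk)
                written += len(chunk)
    return written


def stream_speech(api_key: str, payload: dict, path: str) -> Optional[str]:
    """POST the payload, stream the audio to path, return the credits header."""
    import requests  # third-party; imported here so the helpers above work without it

    # Assumption: the endpoint accepts POST with a JSON body.
    with requests.post(TTS_STREAM_URL, json=payload,
                       headers=auth_headers(api_key),
                       stream=True, timeout=60) as resp:
        resp.raise_for_status()
        save_chunks(resp.iter_content(chunk_size=8192), path)
        # Assumed to be a response header; None if it is absent.
        return resp.headers.get("X-Credits-Required")
```

Writing each chunk as it arrives keeps memory use flat regardless of how long the input text is, which is the main reason to stream.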
The content you want converted into spoken audio. This can be anything from a single sentence to paragraphs of text, supporting punctuation for natural speech patterns.
The unique identifier for your selected voice profile. You can obtain available voice IDs from the /list-voices endpoint or create custom voices with the /create-custom-voice endpoint.
The source language of your input text. This helps the system apply the correct pronunciation rules and speech patterns. Allowed values: 1-71, 73-76, 78-88, 90-104, 106-136, 139-146, 148.
The preferred gender characteristics of the synthesized voice (0 = Not Known, 1 = Male, 2 = Female, 9 = Not Applicable). Defaults to null.
The approximate age (between 1 and 100 years) to be reflected in the voice characteristics. This parameter helps fine-tune the timbre and speech patterns to match different age groups.
The audio file format for the generated speech stream. Allowed values: `wav`, `flac`, `adts`, `pcm_s16le`, `pcm_s32le`.
Generated speech audio in the specified format.
The response is of type file.
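
The parameter constraints described above can be checked before a request is sent. The sketch below is one way to do that. Only `output_format` is named in this reference; the other field names (`text`, `voice_id`, `language`, `gender`, `age`) are assumptions chosen for illustration, as is the `build_payload` helper. Leaving out `gender` and `age` when they are unset is also an assumption about how the API treats missing fields.

```python
from typing import Any, Optional

# Allowed values as listed in this reference.
ALLOWED_LANGUAGES = {
    *range(1, 72), *range(73, 77), *range(78, 89), *range(90, 105),
    *range(106, 137), *range(139, 147), 148,
}
ALLOWED_GENDERS = {0, 1, 2, 9}
ALLOWED_FORMATS = {"wav", "flac", "adts", "pcm_s16le", "pcm_s32le"}


def build_payload(text: str, voice_id: Any, language: int,
                  gender: Optional[int] = None, age: Optional[int] = None,
                  output_format: str = "adts") -> dict:
    """Validate the parameters and return a request body.

    Field names other than output_format are assumed, not documented.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    if language not in ALLOWED_LANGUAGES:
        raise ValueError(f"unsupported language id: {language}")
    if gender is not None and gender not in ALLOWED_GENDERS:
        raise ValueError(f"gender must be one of {sorted(ALLOWED_GENDERS)}")
    if age is not None and not 1 <= age <= 100:
        raise ValueError("age must be between 1 and 100")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format}")

    payload = {"text": text, "voice_id": voice_id,
               "language": language, "output_format": output_format}
    # Optional fields are left out entirely when not set.
    if gender is not None:
        payload["gender"] = gender
    if age is not None:
        payload["age"] = age
    return payload
```

Rejecting an invalid value locally gives a clearer error than a failed request and avoids spending credits on a call that cannot succeed.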