audio_type.
The Generation Process
- Task Creation: submitting a request returns a task_id you’ll use to track progress and retrieve results (via the returned run_id).
- Text Analysis: … music, or acoustic characteristics (materials, space, motion) for sound.
- Audio Synthesis: … duration.
  - Music duration behavior: When audio_type is set to music, the duration parameter is not enforced. Music generations currently return a fixed-length clip depending on the prompt provided. If you need a precise length for music, trim, loop, or stitch the result in post-production. The duration parameter continues to apply to sound (SFX) requests.
- … the /text-to-sound/{task_id} endpoint with the task_id provided in your initial response.
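The last step, polling with the task_id, might look like the following sketch. It is an illustration under stated assumptions, not the official client: the base URL is a placeholder, the helper names fetch_status and wait_for_run_id are hypothetical, and the code assumes that run_id is missing or null until the task has finished. Only the /text-to-sound/{task_id} path, the x-api-key header, and the run_id field come from this page.

```python
import json
import time
import urllib.request

# Placeholder base URL: an assumption, replace it with the real API host.
BASE_URL = "https://api.example.com"


def fetch_status(task_id, api_key):
    """GET /text-to-sound/{task_id} and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{BASE_URL}/text-to-sound/{task_id}",
        headers={"x-api-key": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_run_id(task_id, api_key, fetch=fetch_status, interval=2.0, max_polls=60):
    """Poll until the status body carries a run_id, or give up after max_polls.

    Assumes run_id is missing or null while the task is still processing.
    """
    for attempt in range(max_polls):
        body = fetch(task_id, api_key)
        run_id = body.get("run_id")
        if run_id is not None:
            return run_id
        if attempt < max_polls - 1:
            time.sleep(interval)  # wait before asking again
    raise TimeoutError(f"task {task_id} produced no run_id after {max_polls} polls")
```

The fetch argument is injectable so the loop can be exercised without network access, and max_polls keeps a stuck task from blocking the caller indefinitely.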
Creating Your First Request
A sound generation task is initiated from Python by sending your prompt and parameters to the API, authenticated with your x-api-key header.
Monitoring Your Sound Generation Progress
After submission, your sound generation task enters our processing pipeline. You can monitor its progress by polling the /text-to-sound/{task_id} status endpoint with your task_id.
Prompting Tips
The quality of your generated audio depends significantly on how well you craft your text prompts. Here are some professional recommendations for creating effective descriptions:
- Be Specific:
  - Music: “Upbeat indie rock with clean electric guitars, driving drums, and a catchy 4-bar hook.”
  - SFX: “Single metal door slam in a concrete hallway, short decay, slight echo.”
- Include Context:
  - Music: “Mellow coffee-shop background loop, no vocals, relaxed vibe.”
  - SFX: “Footsteps on wet gravel at night, occasional splashes.”
- Describe Dynamics:
  - Music: “8-second loop, soft intro hit, steady groove, light ending tail.”
  - SFX: “Starts distant, approaches quickly, then passes left to right.”
- Mention Emotional Qualities:
  - Music: “Dreamy and nostalgic, lo-fi texture, gentle swing.”
  - SFX: “Eerie, tense drone with subtle mechanical whir.”
- Reference Familiar Sounds:
  - Music: “Similar to a chillhop beat with muted trumpet accents.”
  - SFX: “Like a UI success chime with a soft shimmer tail.”
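One way to apply these tips consistently is to assemble each prompt from the same five ingredients. The helper below is a hypothetical convenience, not part of the API: build_prompt and its parameter names are invented for illustration, and the service only ever sees the final string.

```python
def build_prompt(subject, context="", dynamics="", mood="", reference=""):
    """Join the non-empty prompt ingredients into one comma-separated sentence."""
    if not subject or not subject.strip():
        raise ValueError("subject is required")
    parts = [subject, context, dynamics, mood, reference]
    # Trim whitespace and trailing punctuation so the pieces join cleanly.
    trimmed = (p.strip().rstrip(".,").strip() for p in parts if p)
    cleaned = [p for p in trimmed if p]
    return ", ".join(cleaned) + "."


sfx_prompt = build_prompt(
    "Single metal door slam",
    context="in a concrete hallway",
    dynamics="short decay, slight echo",
)
```

Keeping the ingredients separate makes it easier to vary one aspect (for example the mood) while holding the rest of the description fixed.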
Authorizations
The x-api-key is a custom header required for authenticating requests to our API. Include this header in your request with the appropriate API key value to securely access our endpoints. You can find your API key(s) in the 'API' section of our studio website.
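As a rough sketch of attaching the header with Python's standard library: the environment variable name CAMB_API_KEY and both helper names are assumptions made for this example, and any URL passed in is up to the caller. Only the x-api-key header name comes from this page.

```python
import os
import urllib.request


def api_key_from_env(var="CAMB_API_KEY"):
    """Read the API key from an environment variable (the name is hypothetical)."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"environment variable {var} is not set")
    return key


def authorized_request(url, api_key, data=None):
    """Build a urllib Request that carries the x-api-key header.

    With data=None the request is a GET; with a bytes body it becomes a POST
    and is marked as JSON.
    """
    headers = {"x-api-key": api_key}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers)
```

Reading the key from the environment keeps it out of source control. Note that urllib normalizes the header's capitalization when storing it; HTTP header names are case-insensitive, so this does not affect authentication.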
Body
A textual description of the sound you want to generate. This required field should contain a clear, descriptive explanation of the desired audio effect. While our system can process lengthy descriptions, concise prompts typically yield more accurate results.
Specify how long you want your generated audio to be, measured in seconds. This optional parameter defaults to 8.0 seconds if not explicitly set. The duration value directly impacts how the audio evolves over time, with longer durations allowing for more complex sonic development.
Controls the kind of audio to generate. Use music to create musical content (melody, harmony, rhythm). Use sound to create non-musical sound effects (foley, ambience, UI cues, impacts).
Available options: sound, music
Enter a distinctive name for your project that reflects its purpose or content. This name will be displayed in your CAMB.AI workspace dashboard and used to organize related assets, transcriptions, etc. Choose something memorable that helps you quickly identify this specific project among your other voice, audio, and localization tasks.
Required string length: 3 - 255
Provide details about your project's goals and specifications. Include information such as the target languages for translation or dubbing, desired voice characteristics, emotional tones to capture, or specific audio processing requirements. Outlining the workflow here can also serve as valuable documentation for organizational purposes.
Required string length: 3 - 5000
Specify the organizational folder within your CAMB.AI workspace where this task should be created and stored. The folder must already exist in your workspace and be accessible through your current API key authentication. This helps maintain project organization by grouping related tasks together, making it easier to manage and locate your projects.
Required range: x >= 1
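The constraints above can be checked client-side before a request is sent. The sketch below is illustrative only: this page names duration and audio_type, while the keys prompt, project_name, project_description, and folder_id are assumed field names, and build_payload is a hypothetical helper. It leaves duration out of music requests because the parameter is not enforced for them.

```python
ALLOWED_AUDIO_TYPES = ("sound", "music")


def build_payload(prompt, duration=8.0, audio_type="sound",
                  project_name=None, project_description=None, folder_id=None):
    """Validate the documented constraints and return a JSON-ready dict."""
    if not prompt or not prompt.strip():
        raise ValueError("a prompt is required")
    if audio_type not in ALLOWED_AUDIO_TYPES:
        raise ValueError(f"audio_type must be one of {ALLOWED_AUDIO_TYPES}")

    # Key names other than duration and audio_type are assumptions.
    payload = {"prompt": prompt.strip(), "audio_type": audio_type}
    if audio_type == "sound":
        # duration only applies to sound (SFX) requests.
        payload["duration"] = float(duration)

    if project_name is not None:
        if not 3 <= len(project_name) <= 255:
            raise ValueError("project name must be 3 to 255 characters")
        payload["project_name"] = project_name
    if project_description is not None:
        if not 3 <= len(project_description) <= 5000:
            raise ValueError("project description must be 3 to 5000 characters")
        payload["project_description"] = project_description
    if folder_id is not None:
        if folder_id < 1:
            raise ValueError("folder id must be >= 1")
        payload["folder_id"] = folder_id
    return payload
```

Validating locally surfaces a bad length or an unknown audio_type immediately, rather than after a round trip to the server.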
Response
Successful Response
A JSON object containing the unique identifier for the task (task_id). Use it to query the status of the running Sound and Music task. It is returned when a create request is made to generate sound from text.
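Reading the identifier out of that response could look like this. It is a minimal sketch that assumes the body is a JSON object with a top-level task_id key; the function name extract_task_id is hypothetical.

```python
import json


def extract_task_id(response_body):
    """Return the task_id from a create-response body (str or bytes)."""
    data = json.loads(response_body)
    if not isinstance(data, dict) or "task_id" not in data:
        raise ValueError("response does not contain a task_id")
    return data["task_id"]
```

Failing loudly when the key is absent avoids polling with a missing identifier later on.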