POST /text-to-sound
Create Sound and Music
curl --request POST \
  --url https://client.camb.ai/apis/text-to-sound \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '{
  "project_name": "<string>",
  "project_description": "<string>",
  "folder_id": 2,
  "prompt": "<string>",
  "duration": 8,
  "audio_type": "sound"
}'
{
  "task_id": "<string>"
}
Turn descriptive text into expressive music or sound effects. This endpoint generates either structured musical audio (melodies, harmonies, beats) or non-musical SFX (foley, UI beeps, ambience, impacts) based on your prompt and selected audio_type.

The Generation Process

1. Task Creation

Your request is registered and a processing task is created. The response includes a unique task_id that you use to track progress and retrieve results (via the returned run_id).

2. Text Analysis

We analyze your prompt to identify sonic intent: musical attributes (tempo, instrumentation, groove) for music, or acoustic characteristics (materials, space, motion) for sound.

3. Audio Synthesis

Specialized models synthesize the audio to match your description and requested duration.
  • Music duration behavior: When audio_type is set to music, the duration parameter is not enforced. Music generations currently return a clip whose length depends on the prompt, not on duration. If you need a precise length for music, trim, loop, or stitch the result in post-production. The duration parameter continues to apply to sound (SFX) requests.
The maximum duration per request is 10 seconds. The default duration is 8.0 seconds.
Throughout this process, you can monitor the status of your generation task by polling the /text-to-sound/{task_id} endpoint with the task_id provided in your initial response.
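
The music duration note above suggests trimming or looping the result in post-production. The following is a minimal sketch of that idea, assuming the downloaded clip has already been decoded to raw PCM bytes (this page does not state the output format). The function name fit_pcm_to_duration and its parameters are hypothetical and not part of the API.

```python
def fit_pcm_to_duration(pcm: bytes, sample_rate: int, frame_size: int,
                        target_seconds: float) -> bytes:
    """Loop and/or trim raw PCM audio so it lasts exactly target_seconds.

    frame_size is the number of bytes per frame (channels * bytes per sample).
    """
    if not pcm or len(pcm) % frame_size != 0:
        raise ValueError("pcm must be a non-empty whole number of frames")

    # Number of bytes needed for the requested length, aligned to whole frames.
    target_bytes = round(target_seconds * sample_rate) * frame_size

    # Repeat the clip enough times to cover the target (ceiling division),
    # then cut it at the exact frame boundary.
    repeats = -(-target_bytes // len(pcm))
    return (pcm * repeats)[:target_bytes]
```

A hard loop like this can produce an audible click at the seam; a short crossfade in an audio editor or library is usually preferable for finished work.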

Creating Your First Request

Let’s examine how to initiate a sound generation task using Python:
import requests
import json
from typing import Literal

# Your API authentication
headers = {
    "x-api-key": "your-api-key",  # Replace with your actual API key
    "Content-Type": "application/json"
}

def create_text_to_sound(prompt: str, audio_type: Literal["sound", "music"], duration=8.0):
    """
    Submits a new text-to-sound generation task and returns the task ID for tracking.

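    Parameters:
    - prompt: A descriptive text explaining the sound to generate
    - audio_type: "sound" for sound effects or "music" for musical content
    - duration: The desired length of the audio in seconds (default: 8.0, maximum: 10).
      Not enforced when audio_type is "music".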
    """
    try:
        # Prepare the request body
        payload = {
            "prompt": prompt,
            "duration": duration,
            "audio_type": audio_type
        }

        # Submit the generation request
        response = requests.post(
            "https://client.camb.ai/apis/text-to-sound",
            headers=headers,
            data=json.dumps(payload)
        )

        # Verify the request was successful
        response.raise_for_status()

        # Extract the task ID from the response
        result = response.json()
        task_id = result.get("task_id")

        print(f"Sound generation task submitted successfully! Task ID: {task_id}")
        return task_id

    except requests.exceptions.RequestException as e:
        print(f"Error submitting text-to-sound task: {e}")
        if hasattr(e, 'response') and e.response is not None:
            print(f"Response content: {e.response.text}")
        return None

# SFX example
prompt = "A gentle rainfall on a tin roof that swells briefly with a distant thunder roll"
task_id = create_text_to_sound(prompt, audio_type="sound", duration=10.0)

# Music example (duration is not enforced for music, so the default is left in place)
prompt = "Warm lo-fi beat at 80 BPM with soft piano chords, vinyl crackle, and mellow bass"
task_id = create_text_to_sound(prompt, audio_type="music")

Monitoring Your Sound Generation Progress

After submission, your sound generation task enters our processing pipeline. You can monitor the progress by polling the status endpoint:
def check_sound_generation_status(task_id):
    """
    Checks the status of a text-to-sound generation task.
    Returns the current status and any available result information.

    Parameters:
    - task_id: The ID of the generation task to check
    """
    if not task_id:
        print("No task ID provided.")
        return None

    try:
        response = requests.get(
            f"https://client.camb.ai/apis/text-to-sound/{task_id}",
            headers=headers
        )

        # Verify the request was successful
        response.raise_for_status()

        # Parse the status information
        status_data = response.json()
        print(f"Current status: {status_data['status']}")

        # If the generation is complete, display the results
        if status_data['status'] == "SUCCESS":
            print("Sound generation completed successfully!")
            print(f"Audio URL: {status_data.get('audio_url')}")

        return status_data

    except requests.exceptions.RequestException as e:
        print(f"Error checking generation status: {e}")
        return None

# Check the status of your generation task
status_info = check_sound_generation_status(task_id)
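
A single status check rarely catches the task at completion, so in practice the check is repeated. The sketch below wraps any status function in a polling loop with a timeout. The helper name poll_until_done, the interval and timeout defaults, and the "ERROR" failure status are assumptions for illustration; only the "SUCCESS" status appears on this page.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    fetch_status: Callable[[str], Optional[dict]],
    task_id: str,
    interval: float = 2.0,
    timeout: float = 120.0,
    failure_statuses: tuple = ("ERROR",),  # assumption: not confirmed by this page
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status(task_id) until SUCCESS, a failure status, or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch_status(task_id)
        # A None result (for example, a transient network error) is retried.
        if data is not None:
            status = data.get("status")
            if status == "SUCCESS":
                return data
            if status in failure_statuses:
                raise RuntimeError(f"Task {task_id} ended with status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task {task_id} did not finish within {timeout} seconds")
        sleep(interval)
```

Because check_sound_generation_status returns either a dictionary or None, it can be passed directly as fetch_status, for example poll_until_done(check_sound_generation_status, task_id).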

Prompting Tips

The quality of your generated audio depends significantly on how well you craft your text prompts. Here are some professional recommendations for creating effective descriptions:
  1. Be Specific:
    • Music: “Upbeat indie rock with clean electric guitars, driving drums, and a catchy 4-bar hook.”
    • SFX: “Single metal door slam in a concrete hallway, short decay, slight echo.”
  2. Include Context:
    • Music: “Mellow coffee-shop background loop, no vocals, relaxed vibe.”
    • SFX: “Footsteps on wet gravel at night, occasional splashes.”
  3. Describe Dynamics:
    • Music: “8-second loop, soft intro hit, steady groove, light ending tail.”
    • SFX: “Starts distant, approaches quickly, then passes left to right.”
  4. Mention Emotional Qualities:
    • Music: “Dreamy and nostalgic, lo-fi texture, gentle swing.”
    • SFX: “Eerie, tense drone with subtle mechanical whir.”
  5. Reference Familiar Sounds:
    • Music: “Similar to a chillhop beat with muted trumpet accents.”
    • SFX: “Like a UI success chime with a soft shimmer tail.”
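
The five tips above can be applied consistently by assembling prompts from named parts. This is only an illustrative sketch: compose_prompt and its parameters are hypothetical, and the API itself accepts any free-form prompt string.

```python
from typing import Optional


def compose_prompt(subject: str,
                   context: Optional[str] = None,
                   dynamics: Optional[str] = None,
                   mood: Optional[str] = None,
                   reference: Optional[str] = None) -> str:
    """Join the parts of a sound description into one comma-separated prompt."""
    parts = [subject, context, dynamics, mood]
    if reference:
        parts.append(f"similar to {reference}")
    # Skip empty parts and strip trailing punctuation so the pieces join cleanly.
    cleaned = [p.strip().rstrip(".,") for p in parts if p and p.strip()]
    return ", ".join(cleaned) + "."


prompt = compose_prompt(
    "Footsteps on wet gravel",
    context="at night",
    dynamics="starts distant, approaches quickly",
)
```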

Authorizations

x-api-key
string
header
required

The x-api-key is a custom header required for authenticating requests to our API. Include this header in your request with the appropriate API key value to securely access our endpoints. You can find your API key(s) in the 'API' section of our studio website.
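
To keep the key out of source code, the headers can be built from an environment variable. This is a sketch under stated assumptions: the variable name CAMB_API_KEY and the helper build_headers are hypothetical, and only the x-api-key and Content-Type headers come from this page.

```python
import os


def build_headers(env_var: str = "CAMB_API_KEY") -> dict:
    """Build request headers, reading the API key from an environment variable."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {
        "x-api-key": api_key,
        "Content-Type": "application/json",
    }
```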

Body

application/json
prompt
string

A textual description of the sound you want to generate. This required field should contain a clear, descriptive explanation of the desired audio effect. While our system can process lengthy descriptions, concise prompts typically yield more accurate results.

duration
number
default:8

The desired length of the generated audio, in seconds. This optional parameter defaults to 8.0 seconds and accepts a maximum of 10 seconds. Longer durations leave more room for the sound to develop. The value is not enforced when audio_type is music.

audio_type
enum<string>
default:sound

Controls the kind of audio to generate. Use music to create musical content (melody, harmony, rhythm). Use sound to create non-musical sound effects (foley, ambience, UI cues, impacts).

Available options:
sound,
music
project_name
string | null

Enter a distinctive name for your project that reflects its purpose or content. This name is displayed in your CAMB.AI workspace dashboard and used to organize related assets, transcriptions, and so on. Choose something memorable that helps you quickly identify this specific project among your other voice, audio, and localization tasks.

Required string length: 3 - 255
project_description
string | null

Provide details about your project's goals and specifications. Include information such as the target languages for translation or dubbing, desired voice characteristics, emotional tones to capture, or specific audio processing requirements. Outlining the workflow here can also serve as valuable documentation for organizational purposes.

Required string length: 3 - 5000
folder_id
integer | null

Specify the organizational folder within your CAMB.AI workspace where this task should be created and stored. The folder must already exist in your workspace and be accessible through your current API key authentication. This helps maintain project organization by grouping related tasks together, making it easier to manage and locate your projects.

Required range: x >= 1
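
The constraints listed in this section can be checked on the client before a request is sent. The sketch below is one way to do that; build_payload is a hypothetical helper, and the lower bound on duration (greater than zero) is an assumption, since this page documents only the 10-second maximum.

```python
from typing import Optional

MAX_DURATION_SECONDS = 10.0  # documented per-request maximum


def build_payload(prompt: str,
                  audio_type: str = "sound",
                  duration: float = 8.0,
                  project_name: Optional[str] = None,
                  project_description: Optional[str] = None,
                  folder_id: Optional[int] = None) -> dict:
    """Validate the documented body constraints and return the request body."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if audio_type not in ("sound", "music"):
        raise ValueError("audio_type must be 'sound' or 'music'")
    # Assumption: durations must be positive; only the maximum is documented.
    if not 0 < duration <= MAX_DURATION_SECONDS:
        raise ValueError("duration must be greater than 0 and at most 10 seconds")

    payload = {"prompt": prompt, "audio_type": audio_type, "duration": duration}

    # Optional fields are included only when provided.
    if project_name is not None:
        if not 3 <= len(project_name) <= 255:
            raise ValueError("project_name must be 3 to 255 characters")
        payload["project_name"] = project_name
    if project_description is not None:
        if not 3 <= len(project_description) <= 5000:
            raise ValueError("project_description must be 3 to 5000 characters")
        payload["project_description"] = project_description
    if folder_id is not None:
        if folder_id < 1:
            raise ValueError("folder_id must be at least 1")
        payload["folder_id"] = folder_id
    return payload
```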

Response

Successful Response

A JSON object that contains a unique identifier for the task. It is returned when a create request is made to generate sound from text, and it is used to query the status of the running Sound and Music task.

task_id
string
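
A client may want to fail early if the response body does not contain a usable identifier. The following is a small sketch of such a check; extract_task_id is a hypothetical helper, and it assumes the body has already been parsed into a dictionary.

```python
def extract_task_id(body: dict) -> str:
    """Return the task_id from a parsed create response, or raise if it is missing."""
    task_id = body.get("task_id")
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("Response does not contain a valid task_id")
    return task_id
```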