# Create Sound and Music

> Creates a task that generates an audio file, either music or sound effects, from a text prompt.

Turn descriptive text into expressive music or sound effects. This endpoint generates either structured musical audio (melodies, harmonies, beats) or non-musical SFX (foley, UI beeps, ambience, impacts) based on your prompt and selected `audio_type`.

## The Generation Process

<Steps>
  <Step title="Task Creation">
    Your request is registered and a processing task is created. The response includes a unique `task_id` that you use to track progress and, via the returned `run_id`, retrieve results.
  </Step>

  <Step title="Text Analysis">
    We analyze your prompt to identify sonic intent—musical attributes (tempo, instrumentation, groove) for `music`, or acoustic characteristics (materials, space, motion) for `sound`.
  </Step>

  <Step title="Audio Synthesis">
    Specialized models synthesize the audio to match your description and requested `duration`.
  </Step>
</Steps>

<Note>
  * **Music `duration` behavior:**
    When `audio_type` is set to `music`, the `duration` parameter is **not enforced**. Music generations currently return a clip whose length depends on the prompt rather than on `duration`. If you need a precise length for music, trim, loop, or stitch the result in post-production. The `duration` parameter continues to apply to `sound` (SFX) requests.
</Note>

<Warning>
  Maximum duration per request is **10 seconds**. The default duration is **8.0 seconds**.
</Warning>
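
The limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the CAMB.AI API: `build_payload` and the constant names are hypothetical, and the lower bound on `duration` (greater than zero) is an assumption, since only the 10-second maximum and the 8.0-second default are documented here.

```python
# Documented limits for the text-to-sound endpoint.
MAX_DURATION_SECONDS = 10.0
DEFAULT_DURATION_SECONDS = 8.0
AUDIO_TYPES = ("sound", "music")


def build_payload(prompt, audio_type="sound", duration=DEFAULT_DURATION_SECONDS):
    """Validate the inputs locally and return the JSON body for the request.

    Hypothetical helper: the API itself performs its own validation.
    """
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if audio_type not in AUDIO_TYPES:
        raise ValueError(f"audio_type must be one of {AUDIO_TYPES}")
    # Assumption: duration must be positive; only the upper bound is documented.
    if not 0 < duration <= MAX_DURATION_SECONDS:
        raise ValueError(
            f"duration must be greater than 0 and at most {MAX_DURATION_SECONDS} seconds"
        )
    return {
        "prompt": prompt,
        "duration": float(duration),
        "audio_type": audio_type,
    }
```

Validating locally avoids a round trip that would likely end in a `422` validation error. For `music` requests the `duration` value is still sent, but as noted above it is not enforced.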

Throughout this process, you can monitor the status of your generation task by polling the [`/text-to-sound/{task_id}`](get-text-to-sound-status) endpoint with the `task_id` provided in your initial response.

## Creating Your First Request

Let's examine how to initiate a sound generation task using Python:

```python [expandable] theme={null}
import requests
import json
from typing import Literal

# Your API authentication
headers = {
    "x-api-key": "your-api-key",  # Replace with your actual API key
    "Content-Type": "application/json"
}

def create_text_to_sound(prompt: str, audio_type: Literal["sound", "music"], duration=8.0):
    """
    Submits a new text-to-sound generation task and returns the task ID for tracking.

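    Parameters:
    - prompt: A descriptive text explaining the sound to generate
    - audio_type: "sound" for sound effects or "music" for musical content
    - duration: The desired length of the audio in seconds (default: 8.0, maximum: 10);
      not enforced when audio_type is "music"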
    """
    try:
        # Prepare the request body
        payload = {
            "prompt": prompt,
            "duration": duration,
            "audio_type": audio_type
        }

        # Submit the generation request
        response = requests.post(
            "https://client.camb.ai/apis/text-to-sound",
            headers=headers,
            data=json.dumps(payload)
        )

        # Verify the request was successful
        response.raise_for_status()

        # Extract the task ID from the response
        result = response.json()
        task_id = result.get("task_id")

        print(f"Sound generation task submitted successfully! Task ID: {task_id}")
        return task_id

    except requests.exceptions.RequestException as e:
        print(f"Error submitting text-to-sound task: {e}")
        if hasattr(e, 'response') and e.response is not None:
            print(f"Response content: {e.response.text}")
        return None

# SFX example
prompt = "A gentle rainfall on a tin roof that swells briefly with a distant thunder roll"
task_id = create_text_to_sound(prompt, audio_type="sound", duration=10.0)

# Music example (duration is not enforced for music, so it is omitted)
prompt = "Warm lo-fi beat at 80 BPM with soft piano chords, vinyl crackle, and mellow bass"
task_id = create_text_to_sound(prompt, audio_type="music")
```

## Monitoring Your Sound Generation Progress

After submission, your sound generation task enters our processing pipeline. You can monitor the progress by polling the status endpoint:

```python [expandable] theme={null}
def check_sound_generation_status(task_id):
    """
    Checks the status of a text-to-sound generation task.
    Returns the current status and any available result information.

    Parameters:
    - task_id: The ID of the generation task to check
    """
    if not task_id:
        print("No task ID provided.")
        return None

    try:
        response = requests.get(
            f"https://client.camb.ai/apis/text-to-sound/{task_id}",
            headers=headers
        )

        # Verify the request was successful
        response.raise_for_status()

        # Parse the status information
        status_data = response.json()
        print(f"Current status: {status_data['status']}")

        # If the generation is complete, display the results
        if status_data['status'] == "SUCCESS":
            print("Sound generation completed successfully!")
            print(f"Audio URL: {status_data.get('audio_url')}")

        return status_data

    except requests.exceptions.RequestException as e:
        print(f"Error checking generation status: {e}")
        return None

# Check the status of your generation task
status_info = check_sound_generation_status(task_id)
```
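
A single status check is rarely enough, so polling is usually wrapped in a loop with a delay and a timeout. The sketch below shows one way to do that. `wait_for_completion` is a hypothetical helper, and the interval and timeout defaults are arbitrary. Only the `SUCCESS` status appears in this guide; any failure statuses the API may return are not documented here, so they must be passed in by the caller.

```python
import time


def wait_for_completion(
    fetch_status,
    task_id,
    terminal_statuses=("SUCCESS",),
    interval=2.0,
    timeout=120.0,
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Poll until the task reaches a terminal status or the timeout expires.

    fetch_status: callable taking a task ID and returning a dict with a
        "status" key, or None if the check failed.
    terminal_statuses: statuses that stop polling. Only "SUCCESS" is taken
        from this guide; add failure statuses once you know their names.
    Returns the final status dict, or None on timeout.
    """
    deadline = clock() + timeout
    while True:
        status_data = fetch_status(task_id)
        if status_data is not None and status_data.get("status") in terminal_statuses:
            return status_data
        # Stop if the next wait would run past the deadline.
        if clock() + interval > deadline:
            return None
        sleep(interval)
```

With the function defined earlier in this section, a call could look like `wait_for_completion(check_sound_generation_status, task_id)`. Passing the status function as an argument keeps the loop independent of the HTTP client, which also makes it easy to test with a stub.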

## Prompting Tips

The quality of your generated audio depends significantly on how well you craft your text prompts. Here are some professional recommendations for creating effective descriptions:

1. **Be Specific**:
   * **Music**: “Upbeat indie rock with clean electric guitars, driving drums, and a catchy 4-bar hook.”
   * **SFX**: “Single metal door slam in a concrete hallway, short decay, slight echo.”

2. **Include Context**:
   * **Music**: “Mellow coffee-shop background loop, no vocals, relaxed vibe.”
   * **SFX**: “Footsteps on wet gravel at night, occasional splashes.”

3. **Describe Dynamics**:
   * **Music**: “8-second loop, soft intro hit, steady groove, light ending tail.”
   * **SFX**: “Starts distant, approaches quickly, then passes left to right.”

4. **Mention Emotional Qualities**:
   * **Music**: “Dreamy and nostalgic, lo-fi texture, gentle swing.”
   * **SFX**: “Eerie, tense drone with subtle mechanical whir.”

5. **Reference Familiar Sounds**:
   * **Music**: “Similar to a chillhop beat with muted trumpet accents.”
   * **SFX**: “Like a UI success chime with a soft shimmer tail.”


## OpenAPI

````yaml post /text-to-sound
openapi: 3.1.0
info:
  title: FastAPI
  version: 0.1.0
servers:
  - url: https://client.camb.ai/apis
security: []
paths:
  /text-to-sound:
    post:
      tags:
        - Apis
        - Text-to-Audio
      summary: Create Sound and Music
      operationId: create_text_to_audio_text_to_audio_post
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateTextToAudioRequestPayload'
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/TaskID'
                description: >-
                  A JSON object that contains the unique identifier for the
                  task. Use this identifier to query the status of the running
                  Sound and Music task. It is returned when a create request is
                  made to generate sound from text.
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
      security:
        - APIKeyHeader: []
components:
  schemas:
    CreateTextToAudioRequestPayload:
      allOf:
        - $ref: '#/components/schemas/CreateTaskWithNameAndDescription'
        - type: object
          properties:
            prompt:
              type: string
              title: Prompt
              required: true
              description: >-
                A textual description of the sound you want to generate. This
                required field should contain a clear, descriptive explanation
                of the desired audio effect. While our system can process
                lengthy descriptions, concise prompts typically yield more
                accurate results.
            duration:
              type: number
              title: Duration
              description: >-
                Specify how long you want your generated audio to be, measured
                in seconds. This optional parameter defaults to `8.0` seconds if
                not explicitly set. The duration value directly impacts how the
                audio evolves over time, with longer durations allowing for more
                complex sonic development.
              default: 8
            audio_type:
              $ref: '#/components/schemas/TextToAudioType'
      title: CreateTextToAudioRequestPayload
    TaskID:
      properties:
        task_id:
          type: string
          title: Task ID
      type: object
      title: Task ID
    HTTPValidationError:
      properties:
        detail:
          items:
            $ref: '#/components/schemas/ValidationError'
          type: array
          title: Detail
      type: object
      title: HTTPValidationError
    CreateTaskWithNameAndDescription:
      properties:
        project_name:
          type: string
          title: Project Name
          description: >-
            Enter a distinctive name for your project that reflects its purpose
            or content. This name will be displayed in your CAMB.AI workspace
            dashboard and used to organize related assets, transcriptions, etc.
            Choose something memorable that helps you quickly identify this
            specific project among your other voice, audio and localization
            tasks.
          minLength: 3
          maxLength: 255
          nullable: true
        project_description:
          type: string
          title: Project Description
          description: >-
            Provide details about your project's goals and specifications.
            Include information such as the target languages for translation or
            dubbing, desired voice characteristics, emotional tones to capture,
            or specific audio processing requirements. Outlining the workflow
            here can serve as valuable documentation for organizational
            purposes.
          minLength: 3
          maxLength: 5000
          nullable: true
    TextToAudioType:
      type: string
      enum:
        - sound
        - music
      description: >-
        Controls the kind of audio to generate. Use `music` to create musical
        content (melody, harmony, rhythm). Use `sound` to create non-musical
        sound effects (foley, ambience, UI cues, impacts).
      default: sound
    ValidationError:
      properties:
        loc:
          items:
            anyOf:
              - type: string
              - type: integer
          type: array
          title: Location
        msg:
          type: string
          title: Message
        type:
          type: string
          title: Error Type
      type: object
      required:
        - loc
        - msg
        - type
      title: ValidationError
  securitySchemes:
    APIKeyHeader:
      type: apiKey
      in: header
      name: x-api-key
      description: >-
        The `x-api-key` is a custom header required for authenticating requests
        to our API. Include this header in your request with the appropriate API
        key value to securely access our endpoints. You can find your API key(s)
        in the 'API' section of our studio website.

````