# Create Voice from Description

> Submits a request to initiate a task for creating a human-like voice from a given text prompt.

This endpoint generates a custom, human-like synthetic voice from a descriptive text prompt. Rather than selecting from pre-defined voice options, you describe the voice characteristics you want. The endpoint starts an asynchronous process and returns a `task_id` that you can use to monitor generation progress and eventually retrieve your custom voice.

<Note>A description consisting of at least 18 words (100+ characters) is required for voice generation. Requests with shorter prompts will not be processed successfully.</Note>

## How Voice Generation Works

The voice generation process follows these steps:

1. You submit a detailed description of the voice you want to create, along with sample text for the voice to speak.
2. The system returns a `task_id` that you can use to track the generation process.
3. In the background, the system analyzes your description and generates two sample voices matching those characteristics for you to choose from.
4. You periodically check the status using the [`/text-to-voice/{task_id}`](get-text-to-voice-status) endpoint.
5. Once complete, you can access and use the generated voice in your applications.

## Creating Effective Voice Descriptions

The quality and specificity of your voice description directly impacts the resulting voice. When crafting your description, consider including details about:

* **Gender and age range**: "A middle-aged woman" or "An elderly man"
* **Accent and regional characteristics**: "With a mild Scottish accent" or "Speaking American English with Southern inflections"
* **Emotional qualities**: "A warm, nurturing tone" or "An authoritative, confident delivery"
* **Speaking style**: "Who speaks slowly and deliberately" or "With an energetic, rapid-fire delivery"
* **Cultural context**: "A voice that would be at home narrating documentaries" or "Like a friendly teacher explaining concepts"
* **Vocal characteristics**: "With a slightly raspy quality" or "With a deep, resonant tone"

The more detailed and vivid your description, the more precisely the system can match your desired voice characteristics. Remember that your description must contain at least 18 words (100+ characters) to provide sufficient guidance for the voice generation system.
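
A client-side check can catch descriptions that are too short before a request is sent. The sketch below is a minimal Python example, not part of the API: the helper name `check_voice_description` is hypothetical, and it assumes that words are counted by splitting on whitespace and that "100+ characters" means at least 100 characters, which may differ from how the service counts.

```python
# Minimums taken from the requirement stated on this page.
MIN_WORDS = 18
MIN_CHARS = 100


def check_voice_description(description: str) -> list[str]:
    """Return a list of problems; an empty list means both minimums are met.

    Assumption: words are whitespace-separated tokens and characters are
    counted after trimming leading and trailing whitespace.
    """
    problems = []
    word_count = len(description.split())
    char_count = len(description.strip())
    if word_count < MIN_WORDS:
        problems.append(f"only {word_count} words, need at least {MIN_WORDS}")
    if char_count < MIN_CHARS:
        problems.append(f"only {char_count} characters, need at least {MIN_CHARS}")
    return problems
```

Returning a list of problems rather than a boolean makes it easy to show the user exactly which minimum was missed.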

## Example Request

```json theme={null}
{
  "text": "Welcome to our application. I'll be your guide through all the features and capabilities available to you.",
  "voice_description": "A warm and friendly middle-aged woman with a slight British accent. She speaks clearly and articulately, with a soothing tone that conveys expertise and trustworthiness. Her voice has a natural musical quality without being overly dramatic."
}
```
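
As a rough illustration, a request like the one above could be sent from Python using only the standard library. The URL combines the server and path from the OpenAPI definition on this page, and the `x-api-key` header comes from its security scheme. The function names and the 30-second timeout are assumptions made for this sketch.

```python
import json
import urllib.request

# Server URL plus path, as listed in the OpenAPI definition on this page.
API_URL = "https://client.camb.ai/apis/text-to-voice"


def build_create_voice_request(
    api_key: str, text: str, voice_description: str
) -> urllib.request.Request:
    """Build (but do not send) the POST request for creating a voice."""
    payload = {"text": text, "voice_description": voice_description}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def create_voice(api_key: str, text: str, voice_description: str) -> dict:
    """Send the request and return the parsed JSON response body.

    Raises urllib.error.HTTPError on non-2xx responses such as 422.
    The timeout value is an arbitrary choice for this sketch.
    """
    request = build_create_voice_request(api_key, text, voice_description)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Separating request construction from sending keeps the payload and headers easy to inspect without making a network call.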

## Response

Upon successful submission, the endpoint returns a `task_id` that you can use to check the status of your voice generation task:

```json theme={null}
{
  "task_id": "your_task_id",
  "status": "PENDING"
}
```

## Monitoring Generation Progress

Voice generation is computationally intensive and does not complete immediately. To check the status of your generation task, periodically poll the [`/text-to-voice/{task_id}`](get-text-to-voice-status) endpoint using the `task_id` received in the initial response.
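
The polling pattern might look like the following sketch. The status endpoint's response format is not described on this page, so the code assumes it returns a JSON object with a `status` field and treats any value other than `PENDING` as finished. The names `wait_for_voice` and `fetch_status`, the 5-second interval, and the attempt limit are all hypothetical choices.

```python
import time
from typing import Callable


def wait_for_voice(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    pending_statuses: tuple = ("PENDING",),
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the task leaves a pending state.

    Assumption: the status response is a dict with a "status" key, and any
    value outside pending_statuses (including a missing key) is terminal.
    """
    for attempt in range(max_attempts):
        result = fetch_status()
        if result.get("status") not in pending_statuses:
            return result
        # Skip the wait after the final attempt; we are about to give up.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"task still pending after {max_attempts} checks")
```

Here `fetch_status` would be a function you write that queries `/text-to-voice/{task_id}` with your `x-api-key` header and returns the parsed JSON body. Passing it in as a callable keeps the loop independent of any particular HTTP client.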

## Best Practices

1. **Be specific in your descriptions**: The more detailed your voice description, the better the system can match your expectations.
2. **Consider the context**: Tailor your voice to match the content and audience of your application.
3. **Start with longer descriptions**: While 18 words is the minimum, starting with more detailed descriptions (30-50 words) often yields better results.
4. **Test variations**: If your first voice isn't exactly what you need, try adjusting specific aspects of your description to refine the results.
5. **Include emotional context**: Describing the emotional quality of the voice significantly improves the naturalness of the generated speech.

## Limitations

* Voice descriptions must be at least 18 words (100+ characters) long.
* Very unusual or contradictory voice descriptions may yield unpredictable results.

Used well, this endpoint lets you create custom voices that match your brand identity, content needs, and user expectations without professional voice talent or recording studios.


## OpenAPI

````yaml post /text-to-voice
openapi: 3.1.0
info:
  title: FastAPI
  version: 0.1.0
servers:
  - url: https://client.camb.ai/apis
security: []
paths:
  /text-to-voice:
    post:
      tags:
        - Apis
        - Text-to-Voice
      summary: Create Voice from Description
      operationId: create_text_to_voice_text_to_voice_post
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateTextToVoiceRequestPayload'
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/TaskID'
                description: >-
                  A JSON that contains the unique identifier for the task. This
                  is used to query the status of the text to voice task that is
                  running. It is returned when a create request is made for
                  creating a text to voice task.
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
      security:
        - APIKeyHeader: []
components:
  schemas:
    CreateTextToVoiceRequestPayload:
      properties:
        text:
          title: Text
          description: >-
            The text content that will be converted into synthesized speech.
            This text will be spoken by your generated voice and serves as a
            sample of the voice's capabilities.
          type: string
        voice_description:
          title: Voice Description
          description: >-
            A detailed description (minimum 18 words/100+ characters) of the
            desired voice characteristics. Be specific about gender, age,
            accent, emotional tone, speaking style, or cultural context to guide
            the synthesis engine in creating an authentic voice.
          type: string
      required:
        - text
        - voice_description
    TaskID:
      properties:
        task_id:
          type: string
          title: Task ID
      type: object
      title: Task ID
    HTTPValidationError:
      properties:
        detail:
          items:
            $ref: '#/components/schemas/ValidationError'
          type: array
          title: Detail
      type: object
      title: HTTPValidationError
    ValidationError:
      properties:
        loc:
          items:
            anyOf:
              - type: string
              - type: integer
          type: array
          title: Location
        msg:
          type: string
          title: Message
        type:
          type: string
          title: Error Type
      type: object
      required:
        - loc
        - msg
        - type
      title: ValidationError
  securitySchemes:
    APIKeyHeader:
      type: apiKey
      in: header
      name: x-api-key
      description: >-
        The `x-api-key` is a custom header required for authenticating requests
        to our API. Include this header in your request with the appropriate API
        key value to securely access our endpoints. You can find your API key(s)
        in the 'API' section of our studio website.

````
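
If a request is rejected with the `422` response defined above, the `HTTPValidationError` body lists each problem with a `loc`, `msg`, and `type`. A small helper along these lines could turn that body into readable lines. The function name is hypothetical, and the sample body is made up to match the schema rather than captured from the service.

```python
def summarize_validation_error(body: dict) -> list[str]:
    """Format each entry of an HTTPValidationError body as one line.

    Follows the schema on this page: "detail" is a list of objects with
    "loc" (strings or integers), "msg", and "type".
    """
    lines = []
    for item in body.get("detail", []):
        location = ".".join(str(part) for part in item["loc"])
        lines.append(f"{location}: {item['msg']} ({item['type']})")
    return lines


# Illustrative body only; real messages and types may differ.
sample_body = {
    "detail": [
        {"loc": ["body", "voice_description"], "msg": "field required", "type": "missing"}
    ]
}
summary = summarize_validation_error(sample_body)
```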