Documentation Index
Fetch the complete documentation index at: https://docs.camb.ai/llms.txt
Use this file to discover all available pages before exploring further.
Overview
While SDKs and frameworks provide convenience, sometimes you need direct control over API calls. This tutorial shows how to call the Camb.ai TTS API directly using HTTP requests.
When to Use Direct API
- Building integrations in languages without an SDK
- Need fine-grained control over request/response handling
- Debugging or testing API behavior
- Building custom streaming implementations
Prerequisites
Create an account
Sign up at CAMB.AI Studio if you haven’t already.
Get your API key
Go to Settings → API Keys in Studio and copy your key. See Authentication for details.
Basic TTS Request
POST /tts-stream returns a binary audio byte stream (for example audio/wav or audio/mpeg), not Server-Sent Events or JSON chunks. The server sends the Content-Type that matches your output_configuration.format. You can buffer the full body for short clips, or read in chunks for lower latency—see Stream Text-to-Speech Audio.
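A minimal sketch of the basic request using only the Python standard library. The base URL (`https://client.camb.ai/apis`), the `x-api-key` header name, and the voice ID are assumptions here; confirm all three against the API reference before relying on them.

```python
# Hedged sketch: base URL, x-api-key header, and voice ID are assumptions,
# not confirmed values -- check the API reference.
import json
import os
import urllib.request

BASE_URL = "https://client.camb.ai/apis"  # assumed base URL


def build_tts_request(text: str, language: str, voice_id: int,
                      api_key: str) -> urllib.request.Request:
    """Build a POST /tts-stream request with the three required fields."""
    payload = {
        "text": text,                    # 3-3000 characters
        "language": language,            # lowercase BCP-47, e.g. "en-us"
        "voice_id": voice_id,            # a profile ID from /list-voices
        "output_configuration": {"format": "wav"},  # wav is the default
    }
    return urllib.request.Request(
        f"{BASE_URL}/tts-stream",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    api_key = os.environ.get("CAMB_API_KEY")
    if api_key:  # only hit the network when a key is configured
        req = build_tts_request("Hello from the direct API.", "en-us", 1234, api_key)
        with urllib.request.urlopen(req) as resp, open("out.wav", "wb") as f:
            f.write(resp.read())  # buffer the whole clip (fine for short text)
```

Buffering the full body, as above, is the simplest approach for short clips; the streaming variant below trades that simplicity for lower time-to-first-audio.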
Streaming Response
For real-time playback, iterate over the raw response body as chunks arrive. Always validate the status before reading the stream—non-success responses may return JSON (for example validation errors), not audio. Responses can include the X-Credits-Required header for usage tracking (see the API reference).
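The pattern above can be sketched with `urllib`: catch the HTTP error first (its body may be a JSON validation message rather than audio), then yield fixed-size chunks from the open response. The `X-Credits-Required` header read is taken from the note above; the endpoint and auth details remain assumptions.

```python
# Hedged sketch: status is validated before the body is consumed, since
# error responses may carry JSON instead of audio bytes.
import urllib.error
import urllib.request


def stream_chunks(req: urllib.request.Request, chunk_size: int = 4096):
    """Yield raw audio chunks; surface the error body on failure."""
    try:
        resp = urllib.request.urlopen(req)
    except urllib.error.HTTPError as err:
        # Non-success responses may be JSON (e.g. 422 validation errors).
        detail = err.read().decode("utf-8", errors="replace")
        raise RuntimeError(f"TTS request failed ({err.code}): {detail}") from err
    with resp:
        credits = resp.headers.get("X-Credits-Required")  # usage tracking
        if credits:
            print(f"credits required: {credits}")
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

Feeding each yielded chunk straight into an audio sink is what gives the latency win over buffering the whole clip.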
Request Parameters
These fields match Stream Text-to-Speech Audio and the OpenAPI schema for POST /tts-stream.
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| text | string | Text to synthesize (3–3000 characters) |
| language | string | BCP-47 locale (e.g. en-us). Case-sensitive lowercase. Unsupported locales for the chosen speech_model return 422. |
| voice_id | integer | Voice profile ID from /list-voices |
Optional Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| speech_model | string | mars-8.1-flash-beta | mars-8.1-flash-beta, mars-8.1-pro-beta, mars-flash, mars-pro, mars-instruct. MARS 8.1 beta models support inline pronunciation and non-verbal tags in text; mars-instruct uses a different expressive tag set (API reference). |
| output_configuration | object | format defaults to wav | format, optional sample_rate. Supported formats depend on speech_model (see the Output format support by model table in the API reference). |
| voice_settings | object | — | speaking_rate, reference quality, accent controls (API reference). |
| inference_options | object | — | e.g. inference_steps where applicable (API reference). |
| enhance_named_entities_pronunciation | boolean | false | Improves named-entity pronunciation when supported. Not supported for mars-8.1-flash-beta or mars-8.1-pro-beta (per the API reference note). |
mars-instruct does not support mp3 or pcm_s16be (see Output format support by model in the API reference).
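A small payload builder can encode the two constraints from the tables above (the enhance flag is unsupported on the beta models; mars-instruct rejects mp3 and pcm_s16be) so bad combinations fail before a request is ever sent. This is a local sketch, not server behavior; the server remains the source of truth.

```python
# Hedged sketch: mirrors constraints stated in the parameter tables.
# Anything beyond those tables (field names aside) is assumption.
BETA_MODELS = {"mars-8.1-flash-beta", "mars-8.1-pro-beta"}


def build_payload(text, language, voice_id, *,
                  speech_model="mars-8.1-flash-beta",
                  fmt="wav", sample_rate=None, enhance_entities=False):
    if not 3 <= len(text) <= 3000:
        raise ValueError("text must be 3-3000 characters")
    if enhance_entities and speech_model in BETA_MODELS:
        raise ValueError("enhance_named_entities_pronunciation is not "
                         f"supported for {speech_model}")
    if speech_model == "mars-instruct" and fmt in {"mp3", "pcm_s16be"}:
        raise ValueError(f"mars-instruct does not support {fmt}")
    output_configuration = {"format": fmt}
    if sample_rate is not None:
        output_configuration["sample_rate"] = sample_rate
    return {
        "text": text,
        "language": language,
        "voice_id": voice_id,
        "speech_model": speech_model,
        "output_configuration": output_configuration,
        "enhance_named_entities_pronunciation": enhance_entities,
    }
```

Catching these mistakes client-side saves a round trip, but a 422 from the server is still possible for conditions the tables do not cover (for example an unsupported locale).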
Expressive and pronunciation controls
- mars-8.1-flash-beta / mars-8.1-pro-beta: English CMU phoneme overrides (e.g. [B EY1 S]) and non-verbal tags such as [laughter]; see MARS 8.1 Beta Text Controls in the API reference.
- mars-instruct: Emotion and pacing tags and SSML-style pauses in text; examples below.
With speech_model set to "mars-instruct", you can encode expression directly in the text field.
English examples:
- [speaking slowly] This is very important. Please pay close attention.
- [excited] We shipped the feature, and the response has been fantastic!
- Let's pause for a moment <break time="400ms"/> and continue clearly.
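Because the tags ride inside the text field itself, the request body needs nothing beyond switching speech_model to mars-instruct. A sketch (the voice ID is hypothetical):

```python
# Hedged sketch: expressive tags are plain characters in "text";
# the voice_id here is a placeholder, not a real profile.
import json

payload = {
    "text": ('[speaking slowly] This is very important. '
             'Let\'s pause <break time="400ms"/> and continue clearly.'),
    "language": "en-us",
    "voice_id": 1234,                 # hypothetical voice ID
    "speech_model": "mars-instruct",  # required for this tag set
}
body = json.dumps(payload)
```

Note that json.dumps escapes nothing inside the tags that would confuse the server; they arrive exactly as written.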
For a comprehensive guide on emotional expression, pauses, and prosody control, see the Emotional Voice Control tutorial.
Listing Voices
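A hedged sketch of the voices call. The /list-voices route, the x-api-key header, and the response field names (`id`, `voice_name`) are all assumptions; verify them against the API reference.

```python
# Hedged sketch: route, auth header, and response field names are
# assumptions -- confirm against the API reference.
import urllib.request


def build_list_voices_request(api_key: str) -> urllib.request.Request:
    return urllib.request.Request(
        "https://client.camb.ai/apis/list-voices",  # assumed base URL + route
        headers={"x-api-key": api_key},
    )


def pick_voice(voices, name):
    """Return the id of the first voice whose (assumed) voice_name matches."""
    for v in voices:
        if v.get("voice_name") == name:
            return v.get("id")
    return None
```

Whatever the exact field names turn out to be, the integer you pass as voice_id in /tts-stream comes from this list.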
Get available voices:
Playing Audio
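Once the WAV bytes are on disk (or in memory), the stdlib wave module can sanity-check them before you hand them to a player; actual playback is platform-specific and left to your audio library of choice. A minimal sketch:

```python
# Hedged sketch: inspects WAV bytes with the stdlib wave module.
# Playback itself depends on your platform's audio stack.
import io
import wave


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Duration of an in-memory WAV clip, from its frame count and rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getnframes() / w.getframerate()
```

A quick duration check like this is a cheap way to confirm you received audio, not an error body, before wiring up real playback.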
Next Steps
Python SDK
Use the SDK for simpler integration
API Reference
Complete API documentation
Voice Agents
Build real-time voice applications
Voice Library
Browse available voices