Direct API Text-to-Speech

Overview

While SDKs and frameworks provide convenience, sometimes you need direct control over API calls. This tutorial shows how to call the Camb.ai TTS API directly using HTTP requests.

When to Use Direct API

Building integrations in languages without an SDK
Needing fine-grained control over request and response handling
Debugging or testing API behavior
Building custom streaming implementations

Authentication

All API requests require your API key in the x-api-key header:

x-api-key: your_camb_api_key

Get your API key from CAMB.AI Studio.
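
As a minimal sketch, the header construction can be wrapped in a small helper that fails early when no key is available. The name `build_headers` is an illustrative assumption and not part of the Camb.ai API; the `CAMB_API_KEY` environment variable is the one the examples in this tutorial read.

```python
import os
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return request headers carrying the API key.

    Hypothetical helper: falls back to the CAMB_API_KEY environment
    variable when no key is passed explicitly.
    """
    key = api_key or os.getenv("CAMB_API_KEY")
    if not key:
        raise ValueError("No API key given and CAMB_API_KEY is not set")
    return {
        "x-api-key": key,
        "Content-Type": "application/json",
    }
```

Raising before the request is sent avoids a less obvious failure later, when the HTTP client is handed an empty header value.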

Basic TTS Request

Python with aiohttp

import asyncio
import os
import aiohttp

async def text_to_speech(text: str, voice_id: int = 147320):
    """Convert text to speech using direct API call."""
    api_key = os.getenv("CAMB_API_KEY")
    url = "https://client.camb.ai/apis/text-to-speech/tts"

    headers = {
        "x-api-key": api_key,
        "Content-Type": "application/json",
    }

    payload = {
        "text": text,
        "voice_id": voice_id,
        "language": "en-us",
        "speech_model": "mars-flash",
        "output_configuration": {
            "format": "pcm_s16le"
        },
    }

    async with aiohttp.ClientSession() as session:
        async with session.post(url, headers=headers, json=payload) as resp:
            if resp.status == 200:
                return await resp.read()
            else:
                error = await resp.text()
                raise Exception(f"API error {resp.status}: {error}")


async def main():
    audio_data = await text_to_speech("Hello from the direct API!")

    with open("output.pcm", "wb") as f:
        f.write(audio_data)

    print(f"Saved {len(audio_data)} bytes to output.pcm")


if __name__ == "__main__":
    asyncio.run(main())

cURL

curl -X POST "https://client.camb.ai/apis/text-to-speech/tts" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello from the command line!",
    "voice_id": 147320,
    "language": "en-us",
    "speech_model": "mars-flash",
    "output_configuration": {
      "format": "pcm_s16le"
    }
  }' \
  --output output.pcm

Python with requests (Synchronous)

import os
import requests

def text_to_speech(text: str, voice_id: int = 147320) -> bytes:
    """Synchronous TTS API call."""
    api_key = os.getenv("CAMB_API_KEY")
    url = "https://client.camb.ai/apis/text-to-speech/tts"

    headers = {
        "x-api-key": api_key,
        "Content-Type": "application/json",
    }

    payload = {
        "text": text,
        "voice_id": voice_id,
        "language": "en-us",
        "speech_model": "mars-flash",
        "output_configuration": {"format": "pcm_s16le"},
    }

    response = requests.post(url, headers=headers, json=payload)
    response.raise_for_status()
    return response.content


if __name__ == "__main__":
    audio = text_to_speech("Hello world!")
    with open("output.pcm", "wb") as f:
        f.write(audio)

Streaming Response

For real-time applications, stream audio chunks as they’re generated:

import asyncio
import os
import time
import aiohttp


async def stream_tts(text: str, voice_id: int = 147320):
    """Stream TTS audio with timing metrics."""
    api_key = os.getenv("CAMB_API_KEY")
    url = "https://client.camb.ai/apis/text-to-speech/tts"

    headers = {
        "x-api-key": api_key,
        "Content-Type": "application/json",
    }

    payload = {
        "text": text,
        "voice_id": voice_id,
        "language": "en-us",
        "speech_model": "mars-flash",
        "output_configuration": {"format": "pcm_s16le"},
    }

    async with aiohttp.ClientSession() as session:
        start_time = time.perf_counter()
        ttfb = None

        async with session.post(url, headers=headers, json=payload) as resp:
            print(f"Status: {resp.status}")
            print(f"Content-Type: {resp.headers.get('Content-Type')}")

            # Stop early on errors so an error body is never treated as audio
            if resp.status != 200:
                error = await resp.text()
                raise Exception(f"API error {resp.status}: {error}")

            chunk_count = 0
            total_bytes = 0

            async for chunk in resp.content.iter_chunked(4096):
                if ttfb is None:
                    ttfb = (time.perf_counter() - start_time) * 1000
                    print(f"Time to First Byte: {ttfb:.0f}ms")

                chunk_count += 1
                total_bytes += len(chunk)
                yield chunk

            total_time = (time.perf_counter() - start_time) * 1000
            print(f"\nTotal: {chunk_count} chunks, {total_bytes} bytes")
            # ttfb stays None if the response body was empty
            if ttfb is not None:
                print(f"TTFB: {ttfb:.0f}ms, Total: {total_time:.0f}ms")


async def main():
    with open("streamed_output.pcm", "wb") as f:
        async for chunk in stream_tts("This is streamed audio generation."):
            f.write(chunk)


if __name__ == "__main__":
    asyncio.run(main())
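
One practical detail when consuming `pcm_s16le` chunks: network chunk boundaries are not guaranteed to fall on 2-byte sample boundaries, so a chunk can end in the middle of a sample. The sketch below uses a hypothetical helper, `align_pcm_chunks`, which is not part of any SDK. It carries leftover bytes over to the next chunk so that every yielded block holds whole 16-bit samples. It is written as a plain generator for clarity; the same logic could be applied inside an `async for` loop.

```python
from typing import Iterable, Iterator


def align_pcm_chunks(chunks: Iterable[bytes], sample_width: int = 2) -> Iterator[bytes]:
    """Yield blocks whose length is a multiple of sample_width bytes."""
    remainder = b""
    for chunk in chunks:
        data = remainder + chunk
        # Largest prefix that contains only complete samples
        usable = len(data) - (len(data) % sample_width)
        if usable:
            yield data[:usable]
        remainder = data[usable:]
    # A trailing partial sample cannot be decoded, so it is dropped.
```

This matters mainly when each block is decoded on arrival, for example with `np.frombuffer(block, dtype=np.int16)`, which rejects buffers whose size is not a multiple of the sample width.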

Request Parameters

Required Parameters

Parameter	Type	Description
`text`	string	Text to convert to speech (min 3 characters)
`voice_id`	integer	Voice ID to use

Optional Parameters

Parameter	Type	Default	Description
`language`	string	`"en-us"`	BCP-47 language code
`speech_model`	string	`"mars-flash"`	Model: `mars-flash`, `mars-pro`, `mars-instruct`
`output_configuration`	object	`{}`	Output format settings
`enhance_named_entities`	boolean	`false`	Better pronunciation for names/places

Output Configuration

{
  "output_configuration": {
    "format": "pcm_s16le"
  }
}

Supported formats:

pcm_s16le - Raw PCM, 16-bit signed little-endian (recommended for streaming)
wav - WAV file format
mp3 - MP3 compressed audio
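
The parameters and formats listed in this section can be folded into a small payload builder that validates inputs before any request is sent. This is a sketch: `build_tts_payload` is a hypothetical helper, and the allowed values are copied from this tutorial rather than from an authoritative schema, so treat them as assumptions that may lag behind the API.

```python
from typing import Any, Dict

# Values copied from this tutorial (assumed, not an authoritative schema)
SPEECH_MODELS = {"mars-flash", "mars-pro", "mars-instruct"}
OUTPUT_FORMATS = {"pcm_s16le", "wav", "mp3"}


def build_tts_payload(
    text: str,
    voice_id: int,
    language: str = "en-us",
    speech_model: str = "mars-flash",
    output_format: str = "pcm_s16le",
    enhance_named_entities: bool = False,
) -> Dict[str, Any]:
    """Validate arguments locally and return a JSON-serializable payload."""
    if len(text) < 3:
        raise ValueError("text must be at least 3 characters")
    if speech_model not in SPEECH_MODELS:
        raise ValueError(f"unknown speech_model: {speech_model!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {output_format!r}")
    return {
        "text": text,
        "voice_id": voice_id,
        "language": language,
        "speech_model": speech_model,
        "output_configuration": {"format": output_format},
        "enhance_named_entities": enhance_named_entities,
    }
```

The returned dictionary can be passed as the `json=` argument in any of the request examples in this tutorial.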

Listing Voices

Get available voices:

import asyncio
import os
import aiohttp


async def list_voices():
    """List all available voices."""
    api_key = os.getenv("CAMB_API_KEY")
    url = "https://client.camb.ai/apis/list-voices"

    headers = {"x-api-key": api_key}

    async with aiohttp.ClientSession() as session:
        async with session.get(url, headers=headers) as resp:
            if resp.status == 200:
                voices = await resp.json()
                return voices
            else:
                raise Exception(f"Error: {resp.status}")


async def main():
    voices = await list_voices()

    print(f"Found {len(voices)} voices:\n")
    for voice in voices[:10]:  # Print first 10
        print(f"ID: {voice['id']}, Name: {voice['name']}, Gender: {voice['gender']}")


if __name__ == "__main__":
    asyncio.run(main())

cURL:

curl -X GET "https://client.camb.ai/apis/list-voices" \
  -H "x-api-key: YOUR_API_KEY"
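
Once the list is fetched, picking a voice is ordinary list filtering. The sketch below assumes each voice is a dictionary with at least the `id` and `name` keys used in the Python example; `find_voices` and the sample entries are hypothetical and only illustrate the shape.

```python
from typing import Any, Dict, List


def find_voices(voices: List[Dict[str, Any]], name_contains: str) -> List[Dict[str, Any]]:
    """Return voices whose name contains the given text, ignoring case."""
    needle = name_contains.lower()
    return [v for v in voices if needle in str(v.get("name", "")).lower()]


# Hypothetical sample data, not real voices from the API
sample = [
    {"id": 1, "name": "Alice Narrator"},
    {"id": 2, "name": "Bob Casual"},
]
matches = find_voices(sample, "alice")
print([v["id"] for v in matches])  # prints [1]
```

The `id` of a matching entry is what the `voice_id` request parameter expects.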

Error Handling

Handle common errors gracefully:

import asyncio
import os
import aiohttp


class CambAPIError(Exception):
    def __init__(self, status: int, message: str):
        self.status = status
        self.message = message
        super().__init__(f"API Error {status}: {message}")


async def text_to_speech_safe(text: str, voice_id: int = 147320):
    """TTS with proper error handling."""
    api_key = os.getenv("CAMB_API_KEY")

    if not api_key:
        raise ValueError("CAMB_API_KEY environment variable not set")

    if len(text) < 3:
        raise ValueError("Text must be at least 3 characters")

    url = "https://client.camb.ai/apis/text-to-speech/tts"

    headers = {
        "x-api-key": api_key,
        "Content-Type": "application/json",
    }

    payload = {
        "text": text,
        "voice_id": voice_id,
        "language": "en-us",
        "speech_model": "mars-flash",
        "output_configuration": {"format": "pcm_s16le"},
    }

    timeout = aiohttp.ClientTimeout(total=60)

    async with aiohttp.ClientSession(timeout=timeout) as session:
        try:
            async with session.post(url, headers=headers, json=payload) as resp:
                if resp.status == 200:
                    return await resp.read()
                elif resp.status == 401:
                    raise CambAPIError(401, "Invalid API key")
                elif resp.status == 400:
                    error = await resp.json()
                    raise CambAPIError(400, error.get("message", "Bad request"))
                elif resp.status == 429:
                    raise CambAPIError(429, "Rate limit exceeded")
                else:
                    raise CambAPIError(resp.status, await resp.text())

        except asyncio.TimeoutError as e:
            raise CambAPIError(408, "Request timed out") from e
        except aiohttp.ClientError as e:
            # No HTTP response was received; 500 is a client-side placeholder
            raise CambAPIError(500, f"Connection error: {e}") from e


async def main():
    try:
        audio = await text_to_speech_safe("Hello world!")
        print(f"Success! Got {len(audio)} bytes")
    except ValueError as e:
        print(f"Validation error: {e}")
    except CambAPIError as e:
        print(f"API error: {e}")


if __name__ == "__main__":
    asyncio.run(main())
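
Rate-limit errors (429) are usually worth retrying after a pause. The following is a minimal sketch of exponential backoff with jitter, not a documented Camb.ai recommendation: `retry_with_backoff` and the `RateLimited` exception are hypothetical names, and the attempt count and delays are arbitrary defaults.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Stand-in for an error carrying HTTP status 429 (hypothetical)."""


async def retry_with_backoff(
    call: Callable[[], Awaitable[T]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
) -> T:
    """Await call(), retrying on RateLimited with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller
            # Delay doubles on each attempt; jitter spreads out retries
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            await asyncio.sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

To combine this with the `CambAPIError` class from the error-handling example, the wrapped call could catch `CambAPIError`, check `e.status == 429`, and re-raise it as `RateLimited`.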

Playing Audio

Using sounddevice (Python)

import asyncio
import os
import aiohttp
import numpy as np
import sounddevice as sd


async def play_tts(text: str):
    """Generate and play TTS audio."""
    api_key = os.getenv("CAMB_API_KEY")
    url = "https://client.camb.ai/apis/text-to-speech/tts"

    headers = {
        "x-api-key": api_key,
        "Content-Type": "application/json",
    }

    payload = {
        "text": text,
        "voice_id": 147320,
        "speech_model": "mars-flash",
        "output_configuration": {"format": "pcm_s16le"},
    }

    async with aiohttp.ClientSession() as session:
        async with session.post(url, headers=headers, json=payload) as resp:
            # Fail on HTTP errors instead of playing an error body as audio
            resp.raise_for_status()
            audio_bytes = await resp.read()

    # Convert to numpy array
    audio_data = np.frombuffer(audio_bytes, dtype=np.int16)

    # Play at 22050Hz (mars-flash sample rate)
    sd.play(audio_data, samplerate=22050)
    sd.wait()


if __name__ == "__main__":
    asyncio.run(play_tts("Hello! This audio is playing directly."))

Converting to WAV

import wave

def save_as_wav(pcm_data: bytes, filename: str, sample_rate: int = 22050):
    """Save PCM data as WAV file."""
    with wave.open(filename, "wb") as wav_file:
        wav_file.setnchannels(1)  # Mono
        wav_file.setsampwidth(2)  # 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_data)

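# Usage: text_to_speech() is the async aiohttp function defined earlier,
# so it must be run through an event loop rather than awaited at top level
import asyncio

audio_bytes = asyncio.run(text_to_speech("Hello world!"))
save_as_wav(audio_bytes, "output.wav")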
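
Because raw PCM has no header, the playback length has to be computed from the byte count. The sketch below uses the same assumptions as `save_as_wav` (mono, 16-bit, 22050 Hz by default); `pcm_duration_seconds` is a hypothetical helper name.

```python
def pcm_duration_seconds(
    pcm_data: bytes,
    sample_rate: int = 22050,
    channels: int = 1,
    sample_width: int = 2,
) -> float:
    """Return playback length in seconds for raw PCM bytes."""
    bytes_per_second = sample_rate * channels * sample_width
    return len(pcm_data) / bytes_per_second


# 44100 bytes of 16-bit mono audio at 22050 Hz is exactly one second
print(pcm_duration_seconds(b"\x00" * 44100))  # prints 1.0
```

If the audio was generated at a different sample rate, pass that rate explicitly; otherwise the reported duration will be off by the same ratio.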

Next Steps

Python SDK

Use the SDK for simpler integration

API Reference

Complete API documentation

Voice Agents

Build real-time voice applications

Voice Library

Browse available voices
