Beta. Realtime speech-to-speech translation is available for testing
in the Python SDK. Event shapes, configuration options, audio formats,
and error semantics may change in backwards-incompatible ways before GA.
Pin to the SDK version you test against.
Overview
Speak (or stream a file) in one language and receive the translation as live text and synthesized speech over a single WebSocket. The session exposes a typed event dispatcher, a built-in microphone helper, and a forward-compatible on_any subscription for events the server may add in future releases.
Key features:
- Translated text and audio: receive incremental translated text (response.text.delta / response.text.done) and translated speech audio (response.audio.delta) as you speak.
- Typed events: a single ServerEventType enum with per-event typed payloads.
- Microphone + file helpers: Microphone (via sounddevice) and FileAudioSource ship with the SDK.
- Easy extensibility: a new server event needs one enum entry, one payload type, and one parser entry.
Prerequisites
Create an account
Sign up at CAMB.AI Studio if you haven't already.
Get your API key
Go to Settings → API Keys in Studio and copy your key. See Authentication for details.
Install the SDK
Models
| Model | Cold boot | Notes |
|---|---|---|
| iris | None (ready in ~1s) | Low-latency. Recommended for interactive use. |
| lilac (default) | ~30s+ | |
| violet | ~30s+ | |
| orchid | ~30s+ | |
Non-iris models (lilac, violet, orchid) cold-boot for 30+ seconds on the
first connection; the server emits session.starting (and WebSocket
keepalives) during that window. session.wait_until_ready() blocks until
the session is active (up to 90s), so you don't need to handle the boot
wait yourself.
Supported languages
source_language and target_language accept the BCP-47 tags below
(case-insensitive). The set is per model: lilac, violet, and
orchid support all 22 languages, while iris supports the 14-language
subset marked below. See the
WebSocket API reference
for the authoritative list.
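The per-model rule above can be checked on the client before a session is opened. The following is a minimal sketch, not part of the SDK: `check_language_pair` and the constants are hypothetical names, and the language sets are transcribed from this section's table, so the WebSocket API reference wins if they ever disagree.

```python
# Language sets transcribed from this guide's table (assumption: still current).
IRIS_LANGUAGES = frozenset({
    "ar-ae", "ar-eg", "ar-sa", "de-de", "en-gb", "en-us", "es-es",
    "fr-ca", "fr-fr", "hi-in", "ja-jp", "ko-kr", "pt-br", "zh-cn",
})
ALL_LANGUAGES = IRIS_LANGUAGES | frozenset({
    "cs-cz", "fi-fi", "no-no", "pl-pl", "sv-se", "tr-tr", "uk-ua", "ur-in",
})
SUPPORTED_BY_MODEL = {
    "iris": IRIS_LANGUAGES,
    "lilac": ALL_LANGUAGES,
    "violet": ALL_LANGUAGES,
    "orchid": ALL_LANGUAGES,
}


def check_language_pair(model: str, source_language: str, target_language: str) -> tuple[str, str]:
    """Hypothetical helper: return the lowercased pair or raise ValueError."""
    try:
        supported = SUPPORTED_BY_MODEL[model]
    except KeyError:
        raise ValueError(f"unknown model: {model!r}") from None
    # Tags are case-insensitive, so normalize before comparing.
    pair = (source_language.lower(), target_language.lower())
    for tag in pair:
        if tag not in supported:
            raise ValueError(f"{tag!r} is not supported by model {model!r}")
    return pair
```

For example, `check_language_pair("iris", "en-us", "cs-cz")` raises `ValueError`, because Czech is only available on lilac, violet, and orchid.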
All supported realtime languages (22)
| Code | Language | Models |
|---|---|---|
| ar-ae | Arabic (United Arab Emirates) | all |
| ar-eg | Arabic (Egypt) | all |
| ar-sa | Arabic (Saudi Arabia) | all |
| cs-cz | Czech (Czechia) | lilac, violet, orchid |
| de-de | German (Germany) | all |
| en-gb | English (United Kingdom) | all |
| en-us | English (United States) | all |
| es-es | Spanish (Spain) | all |
| fi-fi | Finnish (Finland) | lilac, violet, orchid |
| fr-ca | French (Canada) | all |
| fr-fr | French (France) | all |
| hi-in | Hindi (India) | all |
| ja-jp | Japanese (Japan) | all |
| ko-kr | Korean (Korea) | all |
| no-no | Norwegian | lilac, violet, orchid |
| pl-pl | Polish (Poland) | lilac, violet, orchid |
| pt-br | Portuguese (Brazil) | all |
| sv-se | Swedish (Sweden) | lilac, violet, orchid |
| tr-tr | Turkish (Turkey) | lilac, violet, orchid |
| uk-ua | Ukrainian (Ukraine) | lilac, violet, orchid |
| ur-in | Urdu (India) | lilac, violet, orchid |
| zh-cn | Chinese (Mandarin, Simplified) | all |
iris supports: ar-ae, ar-eg, ar-sa, de-de, en-gb, en-us,
es-es, fr-ca, fr-fr, hi-in, ja-jp, ko-kr, pt-br, zh-cn.
Get Started
Create an API Key
Generate a key at CAMB.AI Studio and export it as CAMB_API_KEY for the snippets below.
Install
sounddevice ships with camb-sdk, so the Microphone and Speaker
helpers work out of the box. On Linux you may need PortAudio system
libraries (e.g. apt install libportaudio2).
Quickstart (microphone)
Speak into your mic; the translated speech plays back through your speakers and the translated text prints as it arrives.
Quickstart (file → file)
Useful on machines with no microphone (CI, servers). The input WAV must be 16-bit PCM, mono, 24 kHz; the translated audio is written to an output WAV.
Feed the session clear speech. Music, silence, or noisy/low-quality
audio may not be recognized by the speech model, in which case no
transcript or translation is produced for that audio.
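Because the file quickstart expects a specific input format, it can help to verify a WAV before streaming it. The sketch below uses only Python's standard `wave` module; `wav_format_problems` is a hypothetical helper name, not an SDK function, and it checks only the header fields named above (16-bit, mono, 24 kHz).

```python
import wave

REQUIRED_CHANNELS = 1         # mono
REQUIRED_SAMPLE_WIDTH = 2     # bytes per sample, i.e. 16-bit PCM
REQUIRED_SAMPLE_RATE = 24_000  # Hz


def wav_format_problems(path: str) -> list[str]:
    """Return a list of format mismatches; an empty list means the file fits."""
    problems = []
    # wave.open raises wave.Error for files it cannot parse as PCM WAV.
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != REQUIRED_CHANNELS:
            problems.append(f"expected mono, got {wav.getnchannels()} channels")
        if wav.getsampwidth() != REQUIRED_SAMPLE_WIDTH:
            problems.append(f"expected 16-bit samples, got {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() != REQUIRED_SAMPLE_RATE:
            problems.append(f"expected 24000 Hz, got {wav.getframerate()} Hz")
    return problems
```

A file that fails the check would need to be resampled or downmixed with an audio tool of your choice before being passed to the session.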
Events and Payloads
Supported events
All events are exposed through the ServerEventType enum.
| Event | Wire type | Notes |
|---|---|---|
| SESSION_STARTING | session.starting | Pipeline is booting (non-iris cold boot). Not yet ready for audio. |
| SESSION_CREATED | session.created | Session is authorized and ready. wait_until_ready() resolves here. |
| SESSION_UPDATED | session.updated | Echo of the active session configuration. |
| TRANSCRIPT_COMPLETED | conversation.item.input_audio_transcription.completed | Final transcript of a user utterance (source language). |
| TEXT_DELTA | response.text.delta | Incremental translated text; additive within one response. |
| TEXT_DONE | response.text.done | Complete translated text for the current response. |
| AUDIO_DELTA | response.audio.delta (or binary frame) | Chunk of synthesized translated speech (event.data is raw PCM16 bytes). |
| AUDIO_DONE | response.audio.done | Current translated audio response is complete. |
| ERROR | error | Server error, or a handler exception surfaced by the SDK. |
| CLOSED | Closed | Synthetic: emitted by the SDK when the WebSocket closes. Carries code and reason. |
Which text events fire depends on the model. iris emits translated
text (TEXT_DELTA / TEXT_DONE); lilac and orchid also emit the
source-language transcript (TRANSCRIPT_COMPLETED). All models emit
translated audio (AUDIO_DELTA).
Events the SDK does not yet recognize are passed to session.on_any(...)
with the raw payload, so applications stay forward-compatible.
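To illustrate how a catch-all subscription keeps an application forward-compatible, here is a small stand-alone sketch. It is not the SDK's dispatcher: `DemoDispatcher` is a hypothetical class, the wire types are copied from the event table, and it assumes that JSON messages carry a `type` field and that catch-all handlers see every message, known or not.

```python
import json
from collections import defaultdict

# Wire types copied from the event table. The SDK-synthesized close event
# is left out because it never arrives from the server.
KNOWN_WIRE_TYPES = {
    "session.starting",
    "session.created",
    "session.updated",
    "conversation.item.input_audio_transcription.completed",
    "response.text.delta",
    "response.text.done",
    "response.audio.delta",
    "response.audio.done",
    "error",
}


class DemoDispatcher:
    """Illustrative only; the real SDK session has its own implementation."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self._any_handlers = []

    def on(self, wire_type, handler):
        # Reject typos early instead of silently never firing.
        if wire_type not in KNOWN_WIRE_TYPES:
            raise ValueError(f"unknown wire type: {wire_type!r}")
        self._handlers[wire_type].append(handler)

    def on_any(self, handler):
        self._any_handlers.append(handler)

    def dispatch(self, raw_message: str):
        payload = json.loads(raw_message)
        wire_type = payload.get("type")  # assumption: messages carry "type"
        for handler in self._handlers.get(wire_type, []):
            handler(payload)
        # Catch-all handlers also receive types this client does not know yet.
        for handler in self._any_handlers:
            handler(wire_type, payload)
```

With this shape, a message whose type is added by a later server release still reaches the catch-all handler with its raw payload, while typed handlers are unaffected.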
Subscribing to events
Configuration
| Option | Default | Description |
|---|---|---|
| source_language | (required) | BCP-47 tag of the input speech, e.g. en-us. Must be a supported language for the model. |
| target_language | (required) | BCP-47 tag of the translation, e.g. de-de. Must be a supported language for the model. |
| model | lilac | One of lilac, violet, iris, orchid. |
| output_modalities | ["text", "audio"] | Subset of text and audio. |
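The defaults and constraints in the options table can be captured in a small value object for early validation. This is a sketch under stated assumptions: `DemoSessionConfig` is a hypothetical class rather than an SDK type, and it encodes only what the table states (required languages, default model lilac, modalities drawn from text and audio).

```python
from dataclasses import dataclass, field

MODELS = ("lilac", "violet", "iris", "orchid")
MODALITIES = ("text", "audio")


@dataclass
class DemoSessionConfig:
    """Hypothetical container mirroring the options table; not an SDK class."""

    source_language: str  # required, BCP-47 tag such as "en-us"
    target_language: str  # required, BCP-47 tag such as "de-de"
    model: str = "lilac"
    output_modalities: list[str] = field(default_factory=lambda: ["text", "audio"])

    def __post_init__(self):
        if self.model not in MODELS:
            raise ValueError(f"model must be one of {MODELS}, got {self.model!r}")
        unknown = set(self.output_modalities) - set(MODALITIES)
        if unknown:
            raise ValueError(f"unsupported output modalities: {sorted(unknown)}")
```

Constructing `DemoSessionConfig("en-us", "de-de")` yields the documented defaults, while an unknown model name or modality fails immediately instead of at connection time.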
More Information
- Speech To Speech WebSocket reference: the underlying wire protocol and full event list.
- Python SDK: the full SDK guide.
- Source: cambai-python-sdk (examples/realtime_translation_microphone.py, examples/realtime_translation_file.py).