The Realtime WebSocket API creates a real-time speech translation session over a single WebSocket connection. The client sends a session configuration first, then streams microphone audio as JSON events that contain base64-encoded audio. The server responds with session lifecycle events, transcripts, translated text, and translated audio. For the interactive WebSocket playground, see Realtime (WebSocket).
Endpoint
The endpoint is `wss://realtime.camb.ai/v1/realtime`. The `model` query parameter is optional.
| Model | Notes |
|---|---|
| lilac | Default model. |
| violet | Supported realtime model. |
| iris | Supported realtime model. |
| orchid | Supported realtime model. |
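
As a rough illustration of the optional `model` query parameter, the sketch below builds the connection URL and rejects model names that are not in the table. The helper name `realtime_url` is hypothetical; only the endpoint, the parameter name, and the four model names come from this page.

```python
from urllib.parse import urlencode

REALTIME_URL = "wss://realtime.camb.ai/v1/realtime"
SUPPORTED_MODELS = ("lilac", "violet", "iris", "orchid")


def realtime_url(model=None):
    """Return the endpoint URL, adding ?model=... only when a model is given."""
    if model is None:
        # No query parameter: the documented fallback rules decide the model.
        return REALTIME_URL
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported realtime model: {model!r}")
    return f"{REALTIME_URL}?{urlencode({'model': model})}"
```

Calling `realtime_url("violet")` yields the endpoint with `?model=violet` appended, while `realtime_url()` returns the bare endpoint.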
If `session.update.session.model` is omitted, the server uses the `model` query parameter. If both are omitted, the server uses `lilac`.

Session lifecycle
Open the WebSocket
Connect to `wss://realtime.camb.ai/v1/realtime`. Send your API key in the `x-api-key` request header when your WebSocket client supports custom headers.

Send `session.update` first
The first WebSocket message must be a JSON `session.update` event. The server waits up to 10 seconds for this initial event.

Wait for session activation
After authorization and activation, the server sends `session.created`, followed by `session.updated`.

Stream microphone audio
Send base64-encoded audio chunks with `input_audio_buffer.append`. The server forwards decoded audio bytes into the realtime pipeline.

Authentication
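Prefer the `x-api-key` WebSocket request header. Clients that cannot set custom headers can instead pass the credential in the `auth` object of the first `session.update` event.
If the request header and `auth` object are both present, the request header credential is used.

Limits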
| Limit | Value |
|---|---|
| Initial session.update timeout | 10 seconds |
| Maximum client event text size | 1 MiB |
| Maximum decoded audio payload per input_audio_buffer.append | 256 KiB |
| Audio encoding in input_audio_buffer.append.audio | Base64-encoded audio bytes |
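
The size limits in the table can be checked on the client before a message is sent. The following is a minimal sketch, not the server's validation logic: the function name `check_client_event` is hypothetical, the `type` key on the event is an assumption about the wire format, and the 1 MiB limit is assumed to be measured in UTF-8 bytes.

```python
import base64
import json

MAX_EVENT_TEXT_BYTES = 1024 * 1024    # 1 MiB of serialized event text
MAX_DECODED_AUDIO_BYTES = 256 * 1024  # 256 KiB of audio after base64 decoding


def check_client_event(event_text):
    """Raise ValueError if a serialized client event exceeds a documented limit."""
    if len(event_text.encode("utf-8")) > MAX_EVENT_TEXT_BYTES:
        raise ValueError("event text exceeds 1 MiB")
    event = json.loads(event_text)
    # Assumption: events carry their name in a "type" key.
    if event.get("type") == "input_audio_buffer.append":
        audio = base64.b64decode(event["audio"], validate=True)
        if len(audio) > MAX_DECODED_AUDIO_BYTES:
            raise ValueError("decoded audio exceeds 256 KiB")
    return event
```

A payload of exactly 256 KiB encodes to roughly 350,000 base64 characters, so an event that respects the audio limit also stays well under the 1 MiB text limit.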
Session configuration
Send the active session settings in `session.update.session`.
| Field | Type | Required | Notes |
|---|---|---|---|
| model | string | No | Defaults to the model query parameter, then lilac. Must be one of lilac, violet, iris, or orchid. |
| source_language | string | Yes | Source language tag, for example en-US. |
| target_language | string | Yes | Target language tag, for example de-DE. |
| output_modalities | string[] | No | Defaults to ["text", "audio"]. |
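
To make the field table concrete, here is a small sketch that builds a `session.update` payload and mirrors the documented model precedence. The helper names `resolve_model` and `session_update` are hypothetical, the top-level `type` key is an assumption about the wire format, and `resolve_model` only restates the documented fallback order on the client; it is not the server's implementation.

```python
import json

SUPPORTED_MODELS = ("lilac", "violet", "iris", "orchid")


def resolve_model(session_model=None, query_model=None):
    """Mirror the documented precedence: session field, query parameter, lilac."""
    return session_model or query_model or "lilac"


def session_update(source_language, target_language, model=None, output_modalities=None):
    """Build a session.update event as JSON text, omitting unset optional fields."""
    session = {
        "source_language": source_language,
        "target_language": target_language,
    }
    if model is not None:
        if model not in SUPPORTED_MODELS:
            raise ValueError(f"unsupported realtime model: {model!r}")
        session["model"] = model
    if output_modalities is not None:
        session["output_modalities"] = list(output_modalities)
    # Assumption: the event name travels in a top-level "type" key.
    return json.dumps({"type": "session.update", "session": session})
```

Leaving out `model` and `output_modalities` lets the documented defaults apply instead of hard-coding them on the client.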
Client events
| Event | When to send it | Support |
|---|---|---|
| session.update | First message on the connection. | Required for activation. |
| input_audio_buffer.append | After the session is active. | Supported. |
| input_audio_buffer.clear | Not available in this version. | Recognized, returns error. |
| input_audio_buffer.commit | Not available in this version. | Recognized, returns error. |
| response.cancel | Not available in this version. | Recognized, returns error. |
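
A client can encode the table above as a small guard so that it never sends an event the server would reject. This is only a sketch of one possible client-side check; the function name `may_send` and the constant names are hypothetical.

```python
SUPPORTED_AFTER_ACTIVATION = {"input_audio_buffer.append"}
RECOGNIZED_BUT_UNSUPPORTED = {
    "input_audio_buffer.clear",
    "input_audio_buffer.commit",
    "response.cancel",
}


def may_send(event_type, session_active):
    """Return True if the table says the event is usable in the current state."""
    if not session_active:
        # Before activation, only the initial session.update is valid.
        return event_type == "session.update"
    # After activation, only audio appends are supported in this version.
    return event_type in SUPPORTED_AFTER_ACTIVATION
```

Events in `RECOGNIZED_BUT_UNSUPPORTED` always return `False` here, which spares a round trip that would only produce an `error` event.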
session.update
Initializes the realtime session. This must be the first client event.
Once the session is active, a further `session.update` is recognized but not supported. The server responds with an `error` event.
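
The activation handshake can be sketched independently of any particular WebSocket library. In the example below, `send` and `recv` are hypothetical coroutines supplied by the caller (for instance thin wrappers around a WebSocket client), the `type` key is an assumption about the wire format, and `reply_timeout` is a client-side choice unrelated to the server's 10-second wait.

```python
import asyncio
import json


async def activate(send, recv, session, reply_timeout=10.0):
    """Send session.update first, then expect session.created and session.updated."""
    await send(json.dumps({"type": "session.update", "session": session}))

    created = json.loads(await asyncio.wait_for(recv(), reply_timeout))
    if created.get("type") != "session.created":
        raise RuntimeError(f"expected session.created, got {created.get('type')!r}")

    updated = json.loads(await asyncio.wait_for(recv(), reply_timeout))
    if updated.get("type") != "session.updated":
        raise RuntimeError(f"expected session.updated, got {updated.get('type')!r}")

    # session.id is documented as present on session.created.
    return created["session"]["id"], updated["session"]
```

Passing the transport in as two coroutines keeps the sketch testable with in-memory fakes and avoids tying it to one client library's connection API.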
input_audio_buffer.append
Appends audio bytes to the realtime input stream.
The server base64-decodes `audio` and forwards the decoded bytes into the realtime pipeline. Each decoded payload can be up to 256 KiB.
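
One way to stay under the per-event audio limit is to split captured audio into fixed-size chunks before encoding. The sketch below assumes a top-level `type` key on the event; the generator name `append_events` and the 32 KiB default chunk size are illustrative choices, not values from this page.

```python
import base64
import json

MAX_DECODED_AUDIO_BYTES = 256 * 1024  # documented per-event limit


def append_events(audio, chunk_bytes=32 * 1024):
    """Yield input_audio_buffer.append events as JSON text, one per audio chunk."""
    if not 0 < chunk_bytes <= MAX_DECODED_AUDIO_BYTES:
        raise ValueError("chunk_bytes must be between 1 and 256 KiB")
    for start in range(0, len(audio), chunk_bytes):
        chunk = audio[start:start + chunk_bytes]
        yield json.dumps({
            "type": "input_audio_buffer.append",  # assumed event-name key
            "audio": base64.b64encode(chunk).decode("ascii"),
        })
```

Smaller chunks generally reduce latency at the cost of more messages; the right size depends on the capture pipeline.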
Unsupported client events
These events are recognized so that clients receive a structured error instead of a silent failure: `input_audio_buffer.clear`, `input_audio_buffer.commit`, and `response.cancel`.
Server events
| Event | Description |
|---|---|
| session.created | Sent after the server authorizes, starts, and activates the realtime session. |
| session.updated | Sent immediately after session.created with the active session configuration. |
| conversation.item.input_audio_transcription.completed | Sent when the pipeline produces a completed user transcript. |
| response.text.delta | Sent for incremental assistant translation text. Deltas are additive for the current response. |
| response.text.done | Sent when the final assistant translation text is available. |
| response.audio.delta | Sent when synthesized output audio bytes are available. |
| response.audio.done | Sent when the current assistant audio response is complete. |
| error | Sent for recognized unsupported client events and billing stop decisions. |
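
Because text deltas are additive for the current response, a client can concatenate them until `response.text.done` arrives. The sketch below shows that accumulation under loudly stated assumptions: this page does not list the payload fields of these events, so the `type` and `delta` keys are guesses, and the class name `TranslationText` is hypothetical.

```python
import json


class TranslationText:
    """Accumulate additive text deltas into one string per response."""

    def __init__(self):
        self.partial = ""    # text of the response currently in progress
        self.finished = []   # one entry per completed response

    def handle(self, raw):
        event = json.loads(raw)
        kind = event.get("type")  # assumed event-name key
        if kind == "response.text.delta":
            # Assumption: the incremental text is carried in a "delta" key.
            self.partial += event.get("delta", "")
        elif kind == "response.text.done":
            self.finished.append(self.partial)
            self.partial = ""
        return kind
```

Other event types pass through untouched, so the same `handle` call can sit in front of separate audio or transcript handlers.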
Session events
session.created includes the durable realtime session ID.
session.updated confirms the active session configuration.
| Field | Type | Notes |
|---|---|---|
| session.id | UUID string | Durable realtime session ID. Present on session.created. |
| session.model | string | Active realtime model. |
| session.source_language | string | Source language tag. |
| session.target_language | string | Target language tag. |
| session.output_modalities | string[] | Active output modalities. |
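
The session fields in the table map naturally onto a small record type. The following sketch parses either session event into a dataclass; the names `SessionInfo` and `parse_session_event` are hypothetical, the `type` key is an assumption about the wire format, and every field is treated as optional because this page does not say which fields appear on which event beyond `session.id`.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SessionInfo:
    id: Optional[str] = None
    model: Optional[str] = None
    source_language: Optional[str] = None
    target_language: Optional[str] = None
    output_modalities: Optional[List[str]] = None


def parse_session_event(raw):
    """Turn a session.created or session.updated event into a SessionInfo."""
    event = json.loads(raw)
    if event.get("type") not in ("session.created", "session.updated"):
        raise ValueError(f"not a session event: {event.get('type')!r}")
    session = event.get("session", {})
    return SessionInfo(
        id=session.get("id"),
        model=session.get("model"),
        source_language=session.get("source_language"),
        target_language=session.get("target_language"),
        output_modalities=session.get("output_modalities"),
    )
```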
Transcript and response events
Errors and billing stops
The server sends `error` for recognized but unsupported client events.
On a billing stop, the server sends an `error` event whose `error.message` is the billing close reason, then ends the realtime loop. Active sessions are charged in billing windows and finalized on close, failure, or billing stop.
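
Since both unsupported events and billing stops arrive as `error` events, a client may want one helper that extracts `error.message` for logging or display. This is a minimal sketch; the function name `error_message` is hypothetical and the `type` key is an assumption about the wire format.

```python
import json


def error_message(raw):
    """Return error.message for an error event, or None for any other event."""
    event = json.loads(raw)
    if event.get("type") != "error":
        return None
    # Tolerate a missing or null "error" object rather than raising.
    return (event.get("error") or {}).get("message")
```

This page does not describe a field that distinguishes a billing stop from an unsupported-event error, so the sketch returns the message as-is and leaves interpretation to the caller.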