The Realtime WebSocket API creates a real-time speech translation session over a single WebSocket connection. The client sends a session configuration first, then streams microphone audio with JSON events that contain base64-encoded audio. The server responds with session lifecycle events, transcripts, translated text, and translated audio. For the interactive WebSocket playground, see Realtime (WebSocket).

Endpoint

GET /v1/realtime?model=lilac
Host: realtime.camb.ai
x-api-key: <YOUR_API_KEY>
The request upgrades to a WebSocket connection. The model query parameter is optional.
| Model | Notes |
| --- | --- |
| lilac | Default model. |
| violet | Supported realtime model. |
| iris | Supported realtime model. |
| orchid | Supported realtime model. |

If session.update.session.model is omitted, the server uses the model query parameter. If both are omitted, the server uses lilac.
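The precedence above can be mirrored on the client to predict which model a session will run on. This is a minimal sketch: `realtime_url` and `effective_model` are hypothetical helper names, not part of the API, and the code only restates the documented fallback order (session field, then query parameter, then lilac).

```python
from typing import Optional
from urllib.parse import urlencode

REALTIME_URL = "wss://realtime.camb.ai/v1/realtime"
DEFAULT_MODEL = "lilac"


def realtime_url(query_model: Optional[str] = None) -> str:
    """Build the connection URL, adding ?model=... only when a model is given."""
    if query_model is None:
        return REALTIME_URL
    return REALTIME_URL + "?" + urlencode({"model": query_model})


def effective_model(session_model: Optional[str], query_model: Optional[str]) -> str:
    """Client-side mirror of the documented precedence (hypothetical helper)."""
    if session_model is not None:
        return session_model  # session.update.session.model wins
    if query_model is not None:
        return query_model    # then the model query parameter
    return DEFAULT_MODEL      # then lilac
```

For example, connecting to `realtime_url("iris")` and sending a session.update without a model field should, per the rule above, run the session on iris.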

Session lifecycle

1. Open the WebSocket

   Connect to wss://realtime.camb.ai/v1/realtime. Send your API key in the x-api-key request header when your WebSocket client supports custom headers.

2. Send `session.update` first

   The first WebSocket message must be a JSON session.update event. The server waits up to 10 seconds for this initial event.

3. Wait for session activation

   After authorization and activation, the server sends session.created, followed by session.updated.

4. Stream microphone audio

   Send base64-encoded audio chunks with input_audio_buffer.append. The server forwards decoded audio bytes into the realtime pipeline.

5. Read realtime output

   The server emits completed input transcripts, translated text deltas, final translated text, translated audio chunks, and audio completion events.

Only text WebSocket messages are parsed as realtime events. Binary messages and Pong frames are ignored. Ping frames receive Pong replies.
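The lifecycle above can be tracked with a small client-side state machine. The sketch below is illustrative only: `Phase` and `SessionTracker` are hypothetical names, waiting for session.updated before streaming audio is a conservative choice rather than a documented requirement, and skipping non-text messages from the server is an assumption (the documentation only states how the server treats client frames).

```python
import json
from enum import Enum
from typing import Union


class Phase(Enum):
    OPEN = "open"                # socket connected, nothing sent yet
    CONFIGURING = "configuring"  # session.update sent, waiting for the server
    CREATED = "created"          # session.created received
    ACTIVE = "active"            # session.updated received, audio may flow


class SessionTracker:
    """Hypothetical client-side helper; not part of the API."""

    def __init__(self) -> None:
        self.phase = Phase.OPEN

    def before_send(self, event_type: str) -> None:
        """Raise if sending this client event would be out of order."""
        if self.phase is Phase.OPEN:
            if event_type != "session.update":
                raise RuntimeError("the first message must be session.update")
            self.phase = Phase.CONFIGURING
        elif event_type == "input_audio_buffer.append" and self.phase is not Phase.ACTIVE:
            raise RuntimeError("wait for session.updated before streaming audio")

    def on_message(self, message: Union[str, bytes]) -> None:
        """Advance the phase from a server message."""
        if not isinstance(message, str):
            return  # assumption: server events arrive only as text frames
        event_type = json.loads(message).get("type")
        if event_type == "session.created" and self.phase is Phase.CONFIGURING:
            self.phase = Phase.CREATED
        elif event_type == "session.updated" and self.phase is Phase.CREATED:
            self.phase = Phase.ACTIVE
```

Call `before_send` just before each outgoing event and `on_message` for each incoming frame; the tracker does not enforce the 10-second timeout, which remains the server's responsibility.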

Authentication

Prefer the x-api-key WebSocket request header.
x-api-key: <YOUR_API_KEY>
You can also send credentials in the initial session.update event.
{
  "type": "session.update",
  "session": {
    "model": "lilac",
    "source_language": "en-US",
    "target_language": "de-DE",
    "output_modalities": ["text", "audio"]
  },
  "auth": {
    "api_key": "<YOUR_API_KEY>"
  }
}
If the request header and auth object are both present, the request header credential is used.
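For clients that cannot set custom headers, the initial event can carry the credential instead. The following is a minimal sketch of building that first message; `build_session_update` is a hypothetical helper name, and only the fields shown in the example above are used.

```python
import json
from typing import Optional, Sequence


def build_session_update(
    source_language: str,
    target_language: str,
    model: Optional[str] = None,
    output_modalities: Optional[Sequence[str]] = None,
    api_key: Optional[str] = None,
) -> str:
    """Serialize the first client message (hypothetical helper)."""
    session = {
        "source_language": source_language,
        "target_language": target_language,
    }
    if model is not None:
        session["model"] = model
    if output_modalities is not None:
        session["output_modalities"] = list(output_modalities)

    event = {"type": "session.update", "session": session}
    if api_key is not None:
        # Only needed when the x-api-key header cannot be sent; if both are
        # present, the documentation says the header credential is used.
        event["auth"] = {"api_key": api_key}
    return json.dumps(event)
```

Leaving `api_key` unset omits the auth object entirely, which suits clients that already authenticate through the request header.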

Limits

| Limit | Value |
| --- | --- |
| Initial session.update timeout | 10 seconds |
| Maximum client event text size | 1 MiB |
| Maximum decoded audio payload per input_audio_buffer.append | 256 KiB |
| Audio encoding in input_audio_buffer.append.audio | Base64-encoded audio bytes |
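
The size limits can be checked on the client before a message is sent. This is a sketch under stated assumptions: `check_append_event` is a hypothetical helper, and measuring the 1 MiB text limit in UTF-8 bytes is an assumption, since the table does not say whether the limit counts bytes or characters.

```python
import base64
import json

MAX_EVENT_TEXT_BYTES = 1024 * 1024    # 1 MiB per client event
MAX_DECODED_AUDIO_BYTES = 256 * 1024  # 256 KiB of decoded audio per append


def check_append_event(event_text: str) -> None:
    """Raise ValueError if a serialized client event exceeds a documented limit."""
    # Assumption: the text size limit is measured in UTF-8 encoded bytes.
    if len(event_text.encode("utf-8")) > MAX_EVENT_TEXT_BYTES:
        raise ValueError("client event text exceeds 1 MiB")

    event = json.loads(event_text)
    if event.get("type") != "input_audio_buffer.append":
        return

    # validate=True rejects characters outside the base64 alphabet.
    decoded = base64.b64decode(event["audio"], validate=True)
    if len(decoded) > MAX_DECODED_AUDIO_BYTES:
        raise ValueError("decoded audio payload exceeds 256 KiB")
```

A 256 KiB payload base64-encodes to 349,528 characters (about 341 KiB), so for input_audio_buffer.append the decoded audio limit is reached well before the 1 MiB text limit.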

Session configuration

Send the active session settings in session.update.session.
{
  "model": "lilac",
  "source_language": "en-US",
  "target_language": "de-DE",
  "output_modalities": ["text", "audio"]
}
| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| model | string | No | Defaults to the model query parameter, then lilac. Must be one of lilac, violet, iris, or orchid. |
| source_language | string | Yes | Source language tag, for example en-US. |
| target_language | string | Yes | Target language tag, for example de-DE. |
| output_modalities | string[] | No | Defaults to ["text", "audio"]. |
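
The field rules in the table can be checked locally before the session object is sent. The sketch below is a hypothetical client-side validator, not the server's validation logic; it checks only what the table states and does not restrict output_modalities values, because the table does not list the allowed set.

```python
from typing import Any, Dict, List

SUPPORTED_MODELS = ("lilac", "violet", "iris", "orchid")


def session_config_problems(session: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the object looks valid."""
    problems: List[str] = []

    # source_language and target_language are the two required fields.
    for name in ("source_language", "target_language"):
        value = session.get(name)
        if not isinstance(value, str) or not value:
            problems.append(f"{name} is required and must be a non-empty string")

    # model is optional, but must be a supported name when present.
    model = session.get("model")
    if model is not None and model not in SUPPORTED_MODELS:
        problems.append(f"model must be one of {', '.join(SUPPORTED_MODELS)}")

    # output_modalities is optional and typed as string[].
    modalities = session.get("output_modalities")
    if modalities is not None:
        if not isinstance(modalities, list) or not all(isinstance(m, str) for m in modalities):
            problems.append("output_modalities must be a list of strings")

    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue at once before opening a connection.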

Client events

| Event | When to send it | Support |
| --- | --- | --- |
| session.update | First message on the connection. | Required for activation. |
| input_audio_buffer.append | After the session is active. | Supported. |
| input_audio_buffer.clear | Not available in this version. | Recognized, returns error. |
| input_audio_buffer.commit | Not available in this version. | Recognized, returns error. |
| response.cancel | Not available in this version. | Recognized, returns error. |

session.update

Initializes the realtime session. This must be the first client event.
{
  "type": "session.update",
  "session": {
    "model": "lilac",
    "source_language": "en-US",
    "target_language": "de-DE",
    "output_modalities": ["text", "audio"]
  }
}
After activation, sending another session.update is recognized but not supported. The server responds with an error event.
{
  "type": "error",
  "error": {
    "message": "session.update after activation is not supported in this version"
  }
}

input_audio_buffer.append

Appends audio bytes to the realtime input stream.
{
  "type": "input_audio_buffer.append",
  "audio": "<base64_audio_bytes>"
}
The server base64-decodes audio and forwards the decoded bytes into the realtime pipeline. Each decoded payload can be up to 256 KiB.
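Captured audio usually has to be split so that each event stays under the decoded payload limit. The following sketch shows one way to do that; `iter_append_events` is a hypothetical helper, the 32 KiB default chunk size is an arbitrary illustration, and the audio format itself (sample rate, encoding) is not covered here because this section does not specify it.

```python
import base64
import json
from typing import Iterator

MAX_DECODED_AUDIO_BYTES = 256 * 1024  # documented per-event limit


def iter_append_events(audio: bytes, chunk_size: int = 32 * 1024) -> Iterator[str]:
    """Yield serialized input_audio_buffer.append events for a byte buffer."""
    if not 0 < chunk_size <= MAX_DECODED_AUDIO_BYTES:
        raise ValueError("chunk_size must be between 1 byte and 256 KiB")
    for start in range(0, len(audio), chunk_size):
        chunk = audio[start:start + chunk_size]
        yield json.dumps({
            "type": "input_audio_buffer.append",
            # The audio field carries base64 text, never raw bytes.
            "audio": base64.b64encode(chunk).decode("ascii"),
        })
```

Each yielded string is meant to be sent as one text WebSocket message, in order, once the session is active.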

Unsupported client events

These events are recognized so clients can receive a structured error instead of silent failure.
{
  "type": "input_audio_buffer.clear"
}
Server response:
{
  "type": "error",
  "error": {
    "message": "input_audio_buffer.clear is not supported in this version"
  }
}
{
  "type": "input_audio_buffer.commit"
}
Server response:
{
  "type": "error",
  "error": {
    "message": "input_audio_buffer.commit is not supported in this version"
  }
}
{
  "type": "response.cancel"
}
Server response:
{
  "type": "error",
  "error": {
    "message": "response.cancel is not supported in this version"
  }
}

Server events

| Event | Description |
| --- | --- |
| session.created | Sent after the server authorizes, starts, and activates the realtime session. |
| session.updated | Sent immediately after session.created with the active session configuration. |
| conversation.item.input_audio_transcription.completed | Sent when the pipeline produces a completed user transcript. |
| response.text.delta | Sent for incremental assistant translation text. Deltas are additive for the current response. |
| response.text.done | Sent when the final assistant translation text is available. |
| response.audio.delta | Sent when synthesized output audio bytes are available. |
| response.audio.done | Sent when the current assistant audio response is complete. |
| error | Sent for recognized unsupported client events and billing stop decisions. |
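
Because every server event carries a type field, incoming messages can be routed with a small lookup table. This is an illustrative sketch; `EventRouter` is a hypothetical class, and returning False for unregistered types is a design choice so that event types not listed above are skipped rather than treated as failures.

```python
import json
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], None]


class EventRouter:
    """Route parsed server events to callbacks by their type field."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def on(self, event_type: str, handler: Handler) -> None:
        """Register the callback for one event type, replacing any earlier one."""
        self._handlers[event_type] = handler

    def dispatch(self, message: str) -> bool:
        """Parse one text message; return True if a handler ran."""
        event = json.loads(message)
        handler = self._handlers.get(event.get("type"))
        if handler is None:
            return False  # no handler registered for this type
        handler(event)
        return True
```

A caller might register one handler per row of the table, for example `router.on("response.text.delta", lambda e: print(e["delta"], end=""))`.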

Session events

session.created includes the durable realtime session ID.
{
  "type": "session.created",
  "session": {
    "id": "2a4d3cb0-ff62-4e02-a37c-9fcf4e49c8cc",
    "model": "lilac",
    "source_language": "en-US",
    "target_language": "de-DE",
    "output_modalities": ["text", "audio"]
  }
}
session.updated confirms the active session configuration.
{
  "type": "session.updated",
  "session": {
    "model": "lilac",
    "source_language": "en-US",
    "target_language": "de-DE",
    "output_modalities": ["text", "audio"]
  }
}
| Field | Type | Notes |
| --- | --- | --- |
| session.id | UUID string | Durable realtime session ID. Present on session.created. |
| session.model | string | Active realtime model. |
| session.source_language | string | Source language tag. |
| session.target_language | string | Target language tag. |
| session.output_modalities | string[] | Active output modalities. |

Transcript and response events

{
  "type": "conversation.item.input_audio_transcription.completed",
  "transcript": "Hello, how are you?"
}
{
  "type": "response.text.delta",
  "delta": "Guten"
}
{
  "type": "response.text.done",
  "text": "Guten Tag, wie geht es Ihnen?"
}
{
  "type": "response.audio.delta",
  "delta": "<base64_audio_bytes>"
}
{
  "type": "response.audio.done"
}
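The events above can be folded into a single record per translated utterance. The sketch below assumes that text deltas are concatenated as-is (they are described as additive), that response.audio.delta carries base64 text as the placeholder suggests, and that events for one response are not interleaved with another, since the payloads shown carry no response identifier. `TranslationTurn` is a hypothetical name.

```python
import base64
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class TranslationTurn:
    """Accumulates the transcript, translated text, and audio of one response."""

    transcript: Optional[str] = None
    text_parts: List[str] = field(default_factory=list)
    final_text: Optional[str] = None
    audio: bytearray = field(default_factory=bytearray)
    audio_done: bool = False

    def apply(self, event: Dict[str, Any]) -> None:
        """Update the record from one parsed server event; other types are ignored."""
        kind = event.get("type")
        if kind == "conversation.item.input_audio_transcription.completed":
            self.transcript = event["transcript"]
        elif kind == "response.text.delta":
            self.text_parts.append(event["delta"])
        elif kind == "response.text.done":
            self.final_text = event["text"]
        elif kind == "response.audio.delta":
            self.audio.extend(base64.b64decode(event["delta"]))
        elif kind == "response.audio.done":
            self.audio_done = True

    @property
    def text(self) -> str:
        """Final text once available, otherwise the deltas received so far."""
        if self.final_text is not None:
            return self.final_text
        return "".join(self.text_parts)
```

Preferring the response.text.done payload over the joined deltas means a display can show partial text while streaming and switch to the final text when it arrives.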

Errors and billing stops

The server sends error for recognized but unsupported client events.
{
  "type": "error",
  "error": {
    "message": "input_audio_buffer.commit is not supported in this version"
  }
}
Billing can also stop a session. In that case, the server sends an error event whose error.message is the billing close reason, then ends the realtime loop. Active sessions are charged in billing windows and finalized on close, failure, or billing stop.
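Since the error payload shown here has only a message field, a client that wants to tell the two cases apart has to inspect the text. The sketch below is a heuristic, not a documented contract: it assumes unsupported-event messages keep the "is not supported in this version" wording from the examples and treats every other error as a possible billing stop. `classify_error` and its return values are hypothetical.

```python
from typing import Any, Dict

# Wording taken from the documented examples; it may change between versions.
UNSUPPORTED_SUFFIX = "is not supported in this version"


def classify_error(event: Dict[str, Any]) -> str:
    """Return 'unsupported_event' or 'possible_billing_stop' for an error event."""
    if event.get("type") != "error":
        raise ValueError("not an error event")
    message = event.get("error", {}).get("message", "")
    if message.endswith(UNSUPPORTED_SUFFIX):
        return "unsupported_event"      # the session keeps running
    return "possible_billing_stop"      # expect the realtime loop to end
```

Matching on message text is fragile, so a client may prefer to treat any error followed by the connection ending as a stop, and use this classification only for logging.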

Example message flow

{
  "type": "session.update",
  "session": {
    "model": "lilac",
    "source_language": "en-US",
    "target_language": "de-DE",
    "output_modalities": ["text", "audio"]
  }
}