Realtime WebSocket events

The Realtime WebSocket API creates a real-time speech translation session over a single WebSocket connection. The client sends a session configuration first, then streams microphone audio with JSON events that contain base64-encoded audio. The server responds with session lifecycle events, transcripts, translated text, and translated audio. For the interactive WebSocket playground, see Realtime (WebSocket).

Endpoint

GET /v1/realtime?model=lilac
Host: realtime.camb.ai
x-api-key: <YOUR_API_KEY>

The request upgrades to a WebSocket connection. The model query parameter is optional.

Model	Notes
`lilac`	Default model.
`violet`	Supported realtime model.
`iris`	Supported realtime model.
`orchid`	Supported realtime model.

If session.update.session.model is omitted, the server uses the model query parameter. If both are omitted, the server uses lilac.
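The precedence above can be mirrored on the client to predict which model a session will use. The sketch below is illustrative only: `resolve_model` and `build_realtime_url` are hypothetical helper names, and rejecting an unknown model on the client side is an assumption about what is convenient, not documented server behavior.

```python
from typing import Optional
from urllib.parse import urlencode

SUPPORTED_MODELS = ("lilac", "violet", "iris", "orchid")
DEFAULT_MODEL = "lilac"
BASE_URL = "wss://realtime.camb.ai/v1/realtime"


def resolve_model(session_model: Optional[str], query_model: Optional[str]) -> str:
    """Apply the documented order: session field, then query parameter, then lilac."""
    model = session_model or query_model or DEFAULT_MODEL
    if model not in SUPPORTED_MODELS:
        # Client-side guard (assumption): fail early instead of waiting for the server.
        raise ValueError(f"unsupported model: {model!r}")
    return model


def build_realtime_url(query_model: Optional[str] = None) -> str:
    """Return the WebSocket URL, adding ?model=... only when a model is given."""
    if query_model is None:
        return BASE_URL
    return f"{BASE_URL}?{urlencode({'model': query_model})}"
```

For example, `resolve_model("iris", "violet")` picks the session value, and `resolve_model(None, None)` falls back to the default.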

Session lifecycle

Open the WebSocket

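Connect to wss://realtime.camb.ai/v1/realtime. If your WebSocket client supports custom request headers, send your API key in the x-api-key header. Otherwise, send it in the auth object of the initial session.update event, as described under Authentication.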

Send `session.update` first

The first WebSocket message must be a JSON session.update event. The server waits up to 10 seconds for this initial event.

Wait for session activation

After authorization and activation, the server sends session.created, followed by session.updated.

Stream microphone audio

Send base64-encoded audio chunks with input_audio_buffer.append. The server forwards decoded audio bytes into the realtime pipeline.

Read realtime output

The server emits completed input transcripts, translated text deltas, final translated text, translated audio chunks, and audio completion events.

Only text WebSocket messages are parsed as realtime events. Binary messages and Pong frames are ignored. Ping frames receive Pong replies.
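One way to keep a client from streaming audio too early is to track the handshake explicitly. The following is a minimal sketch under stated assumptions: `SessionTracker` and its state names are hypothetical, it contains no networking, and it expects the caller to pass in each text message received from the server.

```python
import json


class SessionTracker:
    """Client-side view of the handshake: session.update, session.created, session.updated."""

    def __init__(self) -> None:
        self.state = "connected"  # socket open, nothing sent yet
        self.session_id = None

    def mark_update_sent(self) -> None:
        """Call right after sending the initial session.update."""
        if self.state != "connected":
            raise RuntimeError("session.update must be the first message")
        self.state = "awaiting_created"

    def on_server_text(self, raw: str) -> None:
        """Feed one text message received from the server."""
        event = json.loads(raw)
        kind = event.get("type")
        if kind == "session.created" and self.state == "awaiting_created":
            self.session_id = event["session"]["id"]
            self.state = "awaiting_updated"
        elif kind == "session.updated" and self.state == "awaiting_updated":
            self.state = "active"

    @property
    def can_stream_audio(self) -> bool:
        """True only once session.created and session.updated have both arrived."""
        return self.state == "active"
```

A sender loop would check `can_stream_audio` before emitting any `input_audio_buffer.append` event.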

Authentication

Prefer the x-api-key WebSocket request header.

x-api-key: <YOUR_API_KEY>

You can also send credentials in the initial session.update event.

{
  "type": "session.update",
  "session": {
    "model": "lilac",
    "source_language": "en-US",
    "target_language": "de-DE",
    "output_modalities": ["text", "audio"]
  },
  "auth": {
    "api_key": "<YOUR_API_KEY>"
  }
}

If the request header and auth object are both present, the request header credential is used.
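For clients that cannot set request headers, the initial event can carry the key instead. The helper below is a sketch: `build_session_update` is a hypothetical name, and it includes the `auth` object only when a key is passed, so the same function works for header-based clients.

```python
import json
from typing import Optional


def build_session_update(
    source_language: str,
    target_language: str,
    api_key: Optional[str] = None,
    model: Optional[str] = None,
) -> str:
    """Serialize the first client message, optionally embedding credentials."""
    session = {
        "source_language": source_language,
        "target_language": target_language,
    }
    if model is not None:
        session["model"] = model
    event = {"type": "session.update", "session": session}
    if api_key is not None:
        # Only needed when the x-api-key header cannot be sent.
        event["auth"] = {"api_key": api_key}
    return json.dumps(event)
```

Because the header credential takes precedence when both are present, passing `api_key` here has no effect on a connection that already authenticated through `x-api-key`.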

Limits

Limit	Value
Initial `session.update` timeout	10 seconds
Maximum client event text size	1 MiB
Maximum decoded audio payload per `input_audio_buffer.append`	256 KiB
Audio encoding in `input_audio_buffer.append.audio`	Base64-encoded audio bytes
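These limits can be checked before a message leaves the client. The sketch below uses a hypothetical `check_client_event` helper and assumes that the 1 MiB text limit is measured on the UTF-8 encoded message; the table does not state the unit of measurement beyond the size itself.

```python
import base64
import json

MAX_EVENT_TEXT_BYTES = 1024 * 1024    # 1 MiB per client event
MAX_DECODED_AUDIO_BYTES = 256 * 1024  # 256 KiB per decoded audio payload


def check_client_event(raw: str) -> None:
    """Raise ValueError if a serialized client event would exceed a documented limit."""
    # Assumption: the text limit applies to the UTF-8 byte length of the message.
    if len(raw.encode("utf-8")) > MAX_EVENT_TEXT_BYTES:
        raise ValueError("event text exceeds 1 MiB")
    event = json.loads(raw)
    if event.get("type") == "input_audio_buffer.append":
        decoded = base64.b64decode(event["audio"], validate=True)
        if len(decoded) > MAX_DECODED_AUDIO_BYTES:
            raise ValueError("decoded audio exceeds 256 KiB")
```

A 256 KiB payload grows to roughly 342 KiB once base64-encoded, so a maximum-size audio chunk still fits comfortably inside the 1 MiB event limit.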

Session configuration

Send the active session settings in session.update.session.

{
  "model": "lilac",
  "source_language": "en-US",
  "target_language": "de-DE",
  "output_modalities": ["text", "audio"]
}

Field	Type	Required	Notes
`model`	string	No	Defaults to the `model` query parameter, then `lilac`. Must be one of `lilac`, `violet`, `iris`, or `orchid`.
`source_language`	string	Yes	Source language tag, for example `en-US`.
`target_language`	string	Yes	Target language tag, for example `de-DE`.
`output_modalities`	string[]	No	Defaults to `["text", "audio"]`.
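The field rules in the table can be checked locally before the session object is sent. This is a sketch with a hypothetical `normalize_session` helper; it leaves `model` unset when the caller omits it, so the server-side default still applies, and it does not validate language tags beyond requiring a non-empty string.

```python
from typing import Any, Dict

SUPPORTED_MODELS = {"lilac", "violet", "iris", "orchid"}
DEFAULT_MODALITIES = ["text", "audio"]


def normalize_session(session: Dict[str, Any]) -> Dict[str, Any]:
    """Check required fields and fill the documented output_modalities default."""
    for name in ("source_language", "target_language"):
        value = session.get(name)
        if not isinstance(value, str) or not value:
            raise ValueError(f"{name} is required and must be a non-empty string")
    model = session.get("model")
    if model is not None and model not in SUPPORTED_MODELS:
        raise ValueError(f"model must be one of {sorted(SUPPORTED_MODELS)}")
    normalized = dict(session)
    # Copy the default list so callers cannot mutate the shared constant.
    normalized.setdefault("output_modalities", list(DEFAULT_MODALITIES))
    return normalized
```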

Client events

Event	When to send it	Support
`session.update`	First message on the connection.	Required for activation.
`input_audio_buffer.append`	After the session is active.	Supported.
`input_audio_buffer.clear`	Not available in this version.	Recognized, returns `error`.
`input_audio_buffer.commit`	Not available in this version.	Recognized, returns `error`.
`response.cancel`	Not available in this version.	Recognized, returns `error`.

`session.update`

Initializes the realtime session. This must be the first client event.

{
  "type": "session.update",
  "session": {
    "model": "lilac",
    "source_language": "en-US",
    "target_language": "de-DE",
    "output_modalities": ["text", "audio"]
  }
}

A session.update sent after activation is recognized but not supported. The session configuration stays unchanged, and the server responds with an error event.

{
  "type": "error",
  "error": {
    "message": "session.update after activation is not supported in this version"
  }
}

`input_audio_buffer.append`

Appends audio bytes to the realtime input stream.

{
  "type": "input_audio_buffer.append",
  "audio": "<base64_audio_bytes>"
}

The server base64-decodes audio and forwards the decoded bytes into the realtime pipeline. Each decoded payload can be up to 256 KiB.
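A captured audio buffer can be split into several append events so that no decoded payload exceeds the limit. The sketch below is illustrative: `append_events` is a hypothetical helper, the 32 KiB default chunk size is an arbitrary choice, and the audio format itself (sample rate, encoding) is not specified in this section, so the bytes are treated as opaque.

```python
import base64
import json
from typing import Iterator

MAX_DECODED_AUDIO_BYTES = 256 * 1024  # documented per-event limit


def append_events(audio: bytes, chunk_size: int = 32 * 1024) -> Iterator[str]:
    """Yield serialized input_audio_buffer.append events covering `audio` in order."""
    if not 0 < chunk_size <= MAX_DECODED_AUDIO_BYTES:
        raise ValueError("chunk_size must be between 1 byte and 256 KiB")
    for start in range(0, len(audio), chunk_size):
        chunk = audio[start:start + chunk_size]
        yield json.dumps({
            "type": "input_audio_buffer.append",
            "audio": base64.b64encode(chunk).decode("ascii"),
        })
```

Each yielded string is ready to send as one text WebSocket message. Smaller chunks generally mean lower latency at the cost of more messages.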

Unsupported client events

These events are recognized so clients can receive a structured error instead of silent failure.

`input_audio_buffer.clear`

{
  "type": "input_audio_buffer.clear"
}

Server response:

{
  "type": "error",
  "error": {
    "message": "input_audio_buffer.clear is not supported in this version"
  }
}

`input_audio_buffer.commit`

{
  "type": "input_audio_buffer.commit"
}

Server response:

{
  "type": "error",
  "error": {
    "message": "input_audio_buffer.commit is not supported in this version"
  }
}

`response.cancel`

{
  "type": "response.cancel"
}

Server response:

{
  "type": "error",
  "error": {
    "message": "response.cancel is not supported in this version"
  }
}

Server events

Event	Description
`session.created`	Sent after the server authorizes, starts, and activates the realtime session.
`session.updated`	Sent immediately after `session.created` with the active session configuration.
`conversation.item.input_audio_transcription.completed`	Sent when the pipeline produces a completed user transcript.
`response.text.delta`	Sent for incremental assistant translation text. Deltas are additive for the current response.
`response.text.done`	Sent when the final assistant translation text is available.
`response.audio.delta`	Sent when synthesized output audio bytes are available.
`response.audio.done`	Sent when the current assistant audio response is complete.
`error`	Sent for recognized unsupported client events and billing stop decisions.

Session events

session.created includes the durable realtime session ID.

{
  "type": "session.created",
  "session": {
    "id": "2a4d3cb0-ff62-4e02-a37c-9fcf4e49c8cc",
    "model": "lilac",
    "source_language": "en-US",
    "target_language": "de-DE",
    "output_modalities": ["text", "audio"]
  }
}

session.updated confirms the active session configuration.

{
  "type": "session.updated",
  "session": {
    "model": "lilac",
    "source_language": "en-US",
    "target_language": "de-DE",
    "output_modalities": ["text", "audio"]
  }
}

Field	Type	Notes
`session.id`	UUID string	Durable realtime session ID. Present on `session.created`.
`session.model`	string	Active realtime model.
`session.source_language`	string	Source language tag.
`session.target_language`	string	Target language tag.
`session.output_modalities`	string[]	Active output modalities.

Transcript and response events

{
  "type": "conversation.item.input_audio_transcription.completed",
  "transcript": "Hello, how are you?"
}

{
  "type": "response.text.delta",
  "delta": "Guten"
}

{
  "type": "response.text.done",
  "text": "Guten Tag, wie geht es Ihnen?"
}

{
  "type": "response.audio.delta",
  "delta": "<base64_audio_bytes>"
}

{
  "type": "response.audio.done"
}
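A client typically folds these events into per-response state. The following is a minimal sketch, assuming one response at a time: `TranslationTurn` and `apply_event` are hypothetical names, text deltas are concatenated exactly as received, and audio deltas are base64-decoded and appended in arrival order.

```python
import base64
import json
from dataclasses import dataclass, field


@dataclass
class TranslationTurn:
    """Accumulated output for one translated utterance."""
    transcript: str = ""
    partial_text: str = ""
    final_text: str = ""
    audio: bytearray = field(default_factory=bytearray)
    audio_done: bool = False


def apply_event(turn: TranslationTurn, raw: str) -> None:
    """Update `turn` from one server text message; unknown event types are ignored."""
    event = json.loads(raw)
    kind = event.get("type")
    if kind == "conversation.item.input_audio_transcription.completed":
        turn.transcript = event["transcript"]
    elif kind == "response.text.delta":
        turn.partial_text += event["delta"]  # deltas are additive
    elif kind == "response.text.done":
        turn.final_text = event["text"]
    elif kind == "response.audio.delta":
        turn.audio.extend(base64.b64decode(event["delta"]))
    elif kind == "response.audio.done":
        turn.audio_done = True
```

Keeping `partial_text` separate from `final_text` lets a UI show live text while still preferring the final string once `response.text.done` arrives.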

Errors and billing stops

The server sends error for recognized but unsupported client events.

{
  "type": "error",
  "error": {
    "message": "input_audio_buffer.commit is not supported in this version"
  }
}

Billing can also stop a session. In that case, the server sends an error event whose error.message is the billing close reason, then ends the realtime loop. Active sessions are charged in billing windows and finalized on close, failure, or billing stop.
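Since both cases arrive as the same `error` event type, a client may want to tell them apart. The sketch below is a heuristic, not a documented contract: `classify_error` is a hypothetical helper that matches the wording of the unsupported-event examples shown on this page, and it treats every other error message as a possible billing stop because the billing close reason text is not specified here.

```python
import json

# Wording taken from the unsupported-event examples; matching on it is an assumption.
UNSUPPORTED_SUFFIX = "is not supported in this version"


def classify_error(raw: str) -> str:
    """Return 'not_error', 'unsupported_event', or 'other' for one server message."""
    event = json.loads(raw)
    if event.get("type") != "error":
        return "not_error"
    message = event.get("error", {}).get("message", "")
    if message.endswith(UNSUPPORTED_SUFFIX):
        return "unsupported_event"
    # May be a billing stop; the client should expect the session to end.
    return "other"
```

A cautious client would keep streaming after `unsupported_event` and stop sending audio after `other`, since a billing stop ends the realtime loop.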

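Example message flow

The messages below appear in order: the client sends session.update, the server replies with session.created and session.updated, the client streams input_audio_buffer.append, and the server returns the transcript, translated text, and translated audio events.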

{
  "type": "session.update",
  "session": {
    "model": "lilac",
    "source_language": "en-US",
    "target_language": "de-DE",
    "output_modalities": ["text", "audio"]
  }
}

{
  "type": "session.created",
  "session": {
    "id": "2a4d3cb0-ff62-4e02-a37c-9fcf4e49c8cc",
    "model": "lilac",
    "source_language": "en-US",
    "target_language": "de-DE",
    "output_modalities": ["text", "audio"]
  }
}

{
  "type": "session.updated",
  "session": {
    "model": "lilac",
    "source_language": "en-US",
    "target_language": "de-DE",
    "output_modalities": ["text", "audio"]
  }
}

{
  "type": "input_audio_buffer.append",
  "audio": "<base64_audio_bytes>"
}

{
  "type": "conversation.item.input_audio_transcription.completed",
  "transcript": "Hello, how are you?"
}

{
  "type": "response.text.delta",
  "delta": "Guten Tag"
}

{
  "type": "response.text.done",
  "text": "Guten Tag, wie geht es Ihnen?"
}

{
  "type": "response.audio.delta",
  "delta": "<base64_audio_bytes>"
}

{
  "type": "response.audio.done"
}
