Beta. Live transcription is available for testing in the Python and
TypeScript SDKs. Event shapes, configuration options, and error
semantics may change in backwards-incompatible ways before GA. Pin to
the SDK versions you test against.
Overview
Stream raw audio bytes to CAMB over a single WebSocket and receive cumulative transcripts in real time. The session exposes a typed event dispatcher (Ready, Results, Error, Closed), a built-in
microphone helper, and a forward-compatible onAny subscription for events
the server may add in future releases.
Key features:
- Interim and final results: interim Results (is_final: false) carry the cumulative transcript so far for the current utterance; render them as a live preview. A final Results (is_final: true) closes the utterance; commit it so each utterance is preserved instead of being overwritten by the next one.
- Word-level timing: every Results payload includes per-word start/end timestamps and confidence.
- Typed events: the same typed event surface in both SDKs, with per-event typed payloads.
- Easy extensibility: new server event = one enum entry + one payload type + one parser entry. Nothing else changes.
- Microphone helpers: sounddevice in Python, AudioWorklet in the browser, node-record-lpcm16 in Node.
How segments and cumulative updates work
Speech arrives as a series of utterances. Within one utterance the server streams cumulative interim Results (is_final: false); each one
adds whole words to the transcript so far (the model emits complete
words, never partial-word fragments). After a short pause in the audio
the server finalizes the utterance with a Results whose is_final is
true, carrying the complete utterance; the next Results then starts a
brand-new utterance from an empty string.
Because the transcript resets on every new utterance, replacing your UI
with only the latest transcript makes each utterance erase the previous
one. This is the bug that shows up as text “rewriting the same line”
after a pause. Instead, show the interim frames as a live preview, then
commit the text when is_final is true (print it on its own line, or
append it to a list) so finished utterances are preserved. When the
client stops sending audio, the connection closes cleanly with
WebSocket close code 1000.
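The preview-then-commit pattern described above can be sketched without the SDK at all. This is a minimal illustration, assuming each Results frame has already been reduced to a (transcript, is_final) pair; TranscriptView and its methods are hypothetical names, not part of either SDK.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TranscriptView:
    """Keeps finished utterances separate from the in-progress preview."""

    committed: List[str] = field(default_factory=list)
    preview: str = ""

    def on_results(self, transcript: str, is_final: bool) -> None:
        if is_final:
            # Final frame: preserve the utterance, then clear the preview.
            if transcript:
                self.committed.append(transcript)
            self.preview = ""
        else:
            # Interim frame: the cumulative text replaces the old preview.
            self.preview = transcript

    def render(self) -> str:
        lines = list(self.committed)
        if self.preview:
            lines.append(self.preview)
        return "\n".join(lines)


view = TranscriptView()
frames = [
    ("hello", False),
    ("hello world", False),
    ("hello world", True),   # first utterance finalized
    ("how", False),          # transcript restarts for the next utterance
    ("how are you", True),
]
for transcript, is_final in frames:
    view.on_results(transcript, is_final)
print(view.render())
```

Wiring on_results into a Results handler and branching on the final flag there keeps the first utterance on screen when the second one starts.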
Live Transcription SDK vs Async Transcription SDK
| | Live Transcription | Async Transcription |
|---|---|---|
| Transport | WebSocket (/transcription/listen) | REST (/transcription/transcribe) |
| Input | Streamed PCM | File URL or upload |
| Latency | ~hundreds of ms per partial | Job-based polling |
| Use when | Live captioning, voice UX, agents | Recordings, batch jobs |
Prerequisites
Create an account
Sign up at CAMB.AI Studio if you haven’t already.
Get your API key
Go to Settings → API Keys in Studio and copy your key. See Authentication for details.
Install the SDK
Get Started
Create an API Key
Generate a key at the CAMB API portal and export it as CAMB_API_KEY for the snippets below.
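One way to pick the key up in Python is to read it from the environment and fail early when it is missing. The helper name load_api_key is hypothetical; only the CAMB_API_KEY variable name comes from this guide.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return CAMB_API_KEY from the environment, or fail with a clear message."""
    key = env.get("CAMB_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CAMB_API_KEY is not set; export it before running the snippets."
        )
    return key
```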
Install
The Python SDK declares sounddevice as a regular dependency, so the
Microphone helper works out of the box. In Node the microphone
adapter additionally requires the host sox binary. The browser adapter
needs no extra packages: it uses getUserMedia and an inlined
AudioWorklet.
Quickstart
Events and Payloads
Supported events
Both SDKs expose the typed events below through a single ServerMessageType enum. The Source column tells you who emits each one.
UtteranceEnd is a raw wire event with no dedicated enum member; it
arrives through the onAny catch-all.
| Event | Wire type | Source | Notes |
|---|---|---|---|
| Ready | "Ready" | Server | Fires once, immediately after the WebSocket upgrade. |
| Results | "Results" | Server | Fires many times. Carries the cumulative transcript for the current utterance, plus word-level timing and confidence. is_final is false for interim refinements and true on the frame that finalizes an utterance; the next Results after a final begins a new utterance. Replace your in-progress line with transcript, then commit it when is_final is true. |
| UtteranceEnd | "UtteranceEnd" | Server (VAD) | Boundary marker emitted around finals. No dedicated typed event; delivered via on_any / onAny. Prefer is_final on Results as your commit signal; use this only if you need the raw boundary. |
| Error | "Error" | Server or SDK | Server-side: protocol errors (invalid encoding, model failure, etc.). SDK-side: handler exceptions and transport-level failures are re-emitted through the same channel so applications have one place to look. |
| Closed | "Closed" | SDK (synthetic) | Emitted by the SDK when the underlying WebSocket closes. Carries the close code and reason (e.g. 1000 for a clean CloseStream, 1008 for an auth failure). |
Events without a typed member are delivered to session.on_any(...) (Python) /
session.onAny(...) (TypeScript) with the raw payload. Applications
stay forward-compatible without forking the SDK.
How events work
The session reads JSON frames off the WebSocket, looks up the wire type in a parser registry, builds the typed payload, and fans out to
every handler registered for that event. Unknown event types are still
delivered through onAny so applications keep working when the server
adds new messages.
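The read, look up, build, fan out flow can be illustrated with a small stand-alone dispatcher. This is a sketch of the idea only, not the SDK's implementation: the Dispatcher class, the PARSERS table, and the JSON key names are assumptions, and here the catch-all receives every frame, which may differ from how the SDK treats typed events.

```python
import json
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Results:
    transcript: str
    is_final: bool


# Hypothetical registry: wire type -> function that builds a typed payload.
PARSERS: Dict[str, Callable[[dict], Any]] = {
    "Results": lambda raw: Results(
        transcript=raw.get("transcript", ""),
        is_final=bool(raw.get("is_final", False)),
    ),
}


class Dispatcher:
    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Any], None]]] = defaultdict(list)
        self._any_handlers: List[Callable[[str, dict], None]] = []

    def on(self, wire_type: str, handler: Callable[[Any], None]) -> None:
        self._handlers[wire_type].append(handler)

    def on_any(self, handler: Callable[[str, dict], None]) -> None:
        self._any_handlers.append(handler)

    def feed(self, frame: str) -> None:
        raw = json.loads(frame)
        wire_type = raw.get("type")
        parser = PARSERS.get(wire_type)
        if parser is not None:
            payload = parser(raw)
            for handler in self._handlers[wire_type]:
                handler(payload)
        # Assumption: the catch-all sees every frame, including unknown types.
        for handler in self._any_handlers:
            handler(wire_type, raw)


typed, raw_types = [], []
dispatcher = Dispatcher()
dispatcher.on("Results", typed.append)
dispatcher.on_any(lambda wire_type, raw: raw_types.append(wire_type))
dispatcher.feed('{"type": "Results", "transcript": "hi", "is_final": true}')
dispatcher.feed('{"type": "UtteranceEnd"}')  # no parser: catch-all only
```

Because unknown wire types fall through to the catch-all instead of raising, a server-side addition does not break existing handlers.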
Event payloads
A final Results (is_final: true) has exactly the same shape as an
interim one; only the flag differs. There is no separate Final frame
on the wire; finals arrive through the same Results handler, so branch
on msg.is_final (Python) / msg.isFinal (TypeScript) to decide when to
commit an utterance.
Subscribing to events
Adding a custom event
If you fork the SDK or wrap it for an internal use-case, adding a new server event is a three-step change in either language:
- Add a new member to ServerMessageType.
- Define the payload (a Pydantic model in Python, an interface in TypeScript).
- Register a parser in PARSER_REGISTRY.
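The three steps might look roughly like the following in Python. This is a stand-alone sketch rather than the SDK source: the SpeakerChanged event and its speaker_id field are invented for illustration, a dataclass stands in for the Pydantic model to keep the example dependency-free, and the real shapes of ServerMessageType and PARSER_REGISTRY may differ.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict


class ServerMessageType(str, Enum):
    READY = "Ready"
    RESULTS = "Results"
    SPEAKER_CHANGED = "SpeakerChanged"  # Step 1: new member (hypothetical event)


@dataclass
class SpeakerChanged:  # Step 2: payload type (Pydantic model in the real SDK)
    speaker_id: int


PARSER_REGISTRY: Dict[ServerMessageType, Callable[[dict], Any]] = {
    # Step 3: register a parser for the new wire type.
    ServerMessageType.SPEAKER_CHANGED: lambda raw: SpeakerChanged(
        speaker_id=int(raw["speaker_id"])
    ),
}


def parse(raw: dict) -> Any:
    """Return a typed payload, or the raw dict when no parser is registered."""
    try:
        kind = ServerMessageType(raw.get("type"))
    except ValueError:
        return raw  # unknown wire type: hand back the raw payload
    parser = PARSER_REGISTRY.get(kind)
    return parser(raw) if parser else raw
```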
Basic Configuration
Every option below is optional. Omit any option to inherit the server default documented in api-reference/websockets/asyncapi.json.
| Option | Default | Description |
|---|---|---|
| model | boli-v5 | Transcription model. boli-v5 and boli-v5-transcribe are supported. |
| language | en-us | Source language hint. Accepts BCP-47 codes (en-us, pt-br), the Languages enum name (EN_US), or its numeric ID. The model also auto-detects. |
| encoding | linear16 | Audio encoding of the bytes you send. One of linear16, linear32, alaw, mulaw. |
| sample_rate / sampleRate | 16000 | Sample rate of the bytes you send. The server resamples internally. |
| channels | 1 | Channel count of the bytes you send. Multi-channel input is downmixed to mono. |
| base_url / baseUrl | wss://client.camb.ai/apis | Override the WebSocket base URL. |
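The encoding, sample rate, and channel count together determine how many bytes a given slice of audio occupies, which is useful when sizing the chunks you stream. The helper below is a sketch: chunk_size_bytes and the 100 ms default are illustrative choices, not SDK settings, and the byte widths assume linear16 and linear32 mean 16-bit and 32-bit samples while alaw and mulaw use one byte per sample.

```python
# Bytes per sample for each encoding listed in the options table.
BYTES_PER_SAMPLE = {"linear16": 2, "linear32": 4, "alaw": 1, "mulaw": 1}


def chunk_size_bytes(
    encoding: str = "linear16",
    sample_rate: int = 16000,
    channels: int = 1,
    chunk_ms: int = 100,
) -> int:
    """Bytes needed to hold chunk_ms milliseconds of audio in this format."""
    if encoding not in BYTES_PER_SAMPLE:
        raise ValueError(f"unsupported encoding: {encoding!r}")
    frames = sample_rate * chunk_ms // 1000
    return frames * channels * BYTES_PER_SAMPLE[encoding]
```

With the table's defaults (linear16, 16000 Hz, mono), 100 ms of audio is 3200 bytes.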
Basic configuration example
Advanced Configuration
KeepAlive
Some intermediaries (load balancers, browser proxies) close idle WebSocket connections after a few seconds of silence. If your audio pipeline can be bursty, send a KeepAlive frame between bursts.
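A simple way to decide when a KeepAlive is due is to track the time of the last outgoing audio. The sketch below is transport-agnostic and makes two assumptions that this guide does not confirm: the frame is a JSON text message shaped like {"type": "KeepAlive"} (by analogy with CloseStream), and a 3-second idle interval is short enough for your intermediaries. KeepAlivePacer and send_text are hypothetical names.

```python
import json
import time
from typing import Callable


class KeepAlivePacer:
    """Sends a KeepAlive when no audio has gone out for `interval` seconds."""

    def __init__(
        self,
        send_text: Callable[[str], None],
        interval: float = 3.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send_text = send_text
        self._interval = interval
        self._clock = clock
        self._last_activity = clock()

    def note_audio_sent(self) -> None:
        """Call after every audio chunk you send."""
        self._last_activity = self._clock()

    def tick(self) -> bool:
        """Call periodically; returns True if a KeepAlive was sent."""
        if self._clock() - self._last_activity < self._interval:
            return False
        # Assumed wire shape, by analogy with {"type": "CloseStream"}.
        self._send_text(json.dumps({"type": "KeepAlive"}))
        self._last_activity = self._clock()
        return True
```

Injecting the clock keeps the pacing logic testable without real sleeps.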
CloseStream
session.close() (same name in both SDKs) sends {"type": "CloseStream"}
and waits for the server’s clean 1000 close. Always prefer this over
simply dropping the connection: it ensures the server flushes any pending transcript.
Bring-your-own transport
The Python SDK’s connect() accepts a transport argument
implementing the Transport protocol. The TypeScript client accepts a
transport: () => Transport factory. Use this to inject a mock during
testing or to plug in a custom WebSocket implementation.
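For testing, an in-memory transport that records what was sent and replays scripted server frames is often enough. The sketch below shows the general shape only: the method set on Transport (send, recv, close) is an assumption, and the SDK's actual Transport protocol may use different names or be asynchronous, so adapt it to the real interface before passing it to connect().

```python
from collections import deque
from typing import Deque, List, Optional, Protocol, Union


class Transport(Protocol):
    # Hypothetical method set; check the SDK for the real protocol.
    def send(self, data: Union[bytes, str]) -> None: ...
    def recv(self) -> Optional[str]: ...
    def close(self) -> None: ...


class FakeTransport:
    """In-memory transport that replays scripted server frames."""

    def __init__(self, scripted_frames: List[str]) -> None:
        self.sent: List[Union[bytes, str]] = []
        self.closed = False
        self._inbox: Deque[str] = deque(scripted_frames)

    def send(self, data: Union[bytes, str]) -> None:
        if self.closed:
            raise ConnectionError("transport is closed")
        self.sent.append(data)

    def recv(self) -> Optional[str]:
        # None signals that the scripted frames are exhausted.
        return self._inbox.popleft() if self._inbox else None

    def close(self) -> None:
        self.closed = True


def drain(transport: Transport) -> List[str]:
    """Read frames until the transport has nothing left."""
    frames = []
    while (frame := transport.recv()) is not None:
        frames.append(frame)
    return frames
```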
Microphone Helpers
Python — sounddevice
sounddevice ships with camb-sdk, so no extra install step is needed.
On Linux you may need to install PortAudio system libraries (e.g.
apt install libportaudio2) — sounddevice’s docs cover platform
prerequisites.
TypeScript — browser
The browser adapter captures audio through getUserMedia, then downsamples to the requested rate inside an
AudioWorklet so the server always sees little-endian PCM16.
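The last step of that pipeline, turning floating-point samples into little-endian 16-bit PCM, is easy to get subtly wrong (clipping, byte order). The following Python sketch shows the conversion in isolation; it is not the adapter's code, and float_to_pcm16le is a hypothetical helper name. It assumes input samples nominally in the range -1.0 to 1.0, as Web Audio produces.

```python
import struct
from typing import Iterable


def float_to_pcm16le(samples: Iterable[float]) -> bytes:
    """Convert float samples in [-1.0, 1.0] to little-endian signed 16-bit PCM."""
    out = bytearray()
    for sample in samples:
        clamped = max(-1.0, min(1.0, sample))  # clamp so the scaled value fits
        out += struct.pack("<h", int(round(clamped * 32767)))  # "<h" = LE int16
    return bytes(out)
```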
TypeScript — Node
The Node adapter uses node-record-lpcm16, declared in
package.json as an optionalDependencies entry. The host machine also
needs the sox binary on PATH.
Error Handling and Close Codes
Server errors
Whenever the server cannot continue, it emits an Error frame and
closes with a non-1000 code.
Transport errors
Connection-level failures (DNS, TLS, mid-stream drops) are surfaced through the same Error event with code: "transport_error" (TypeScript)
or code: "handler_exception" (Python), keeping a single observable
channel for application code.
Close codes
| Code | Meaning |
|---|---|
| 1000 | Normal close. Either side sent CloseStream / closed cleanly. |
| 1006 | Abnormal close (transport dropped without a frame). |
| 4000+ | Application-specific. The server may use these for auth failures and quota errors. |
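A Closed handler typically maps the numeric code onto a small set of outcomes. The sketch below encodes the table above plus the 1008 auth-failure case mentioned in the events table; the category names and the reconnect policy are illustrative assumptions, not SDK behavior.

```python
def describe_close(code: int) -> str:
    """Map a WebSocket close code to a coarse category."""
    if code == 1000:
        return "normal"
    if code == 1006:
        return "abnormal"      # transport dropped without a close frame
    if code == 1008:
        return "policy"        # this guide cites 1008 for an auth failure
    if 4000 <= code <= 4999:
        return "application"   # server-defined (auth, quota, ...)
    return "other"


def should_reconnect(code: int) -> bool:
    # Assumed policy: retry only dropped connections; auth and quota
    # problems will not fix themselves on a retry.
    return describe_close(code) == "abnormal"
```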
More Information
- /transcription/listen WebSocket reference: the underlying wire protocol.
- Python SDK · TypeScript SDK: the full SDK guides.
- Source: cambai-python-sdk · cambai-typescript-sdk.