

Overview

Convert spoken audio or video into accurate, timestamped text with speaker labels. The pipeline is asynchronous: submit a job with a local file or a public URL, poll until it completes, then retrieve the segmented transcript.

Prerequisites

1. Create an account

   Sign up at CAMB.AI Studio if you haven't already.

2. Get your API key

   Go to Settings → API Keys in Studio and copy your key. See Authentication for details.

3. Install the SDK

   pip install camb-sdk

   Skip this step if you're using the direct API.

4. Set your API key

   Export it as an environment variable so your code can read it:

   export CAMB_API_KEY="your_api_key_here"

Code

import os
import time
from camb.client import CambAI

client = CambAI(api_key=os.getenv("CAMB_API_KEY"))

def transcribe_audio():
    # Step 1: Submit the transcription job
    response = client.transcription.create_transcription(
        language=1,  # English (US)
        media_url="https://example.com/meeting.mp3"
    )

    task_id = response.task_id
    print(f"Transcription task created: {task_id}")

    # Step 2: Poll until complete
    while True:
        status = client.transcription.get_transcription_task_status(task_id=task_id)
        print(f"Status: {status.status}")

        if status.status == "SUCCESS":
            # Step 3: Retrieve the transcript
            result = client.transcription.get_transcription_result(run_id=status.run_id)
            for segment in result.transcript[:3]:
                print(f"[{segment.start:.2f}-{segment.end:.2f}] {segment.speaker}: {segment.text}")
            break
        elif status.status == "ERROR":
            print(f"Transcription failed: {status.exception_reason}")
            break

        time.sleep(5)

transcribe_audio()

Transcribing a local file

Pass media_file instead of media_url to upload a local file:
with open("meeting.mp3", "rb") as audio_file:
    response = client.transcription.create_transcription(
        language=1,
        media_file=audio_file
    )

Parameters

Required

  • language (integer): Source-language ID for the spoken content (e.g. 1 for English). See Language IDs.
  • media_file or media_url (file or string): Provide exactly one: a local audio/video file, or a publicly accessible URL.

Optional

  • project_name (string): Label for the job in your dashboard
  • project_description (string): Additional notes for the job
  • folder_id (integer): Place the run inside a specific folder
  • word_level_timestamps (boolean): Passed to get_transcription_result / getTranscriptionResult to return per-word timing in addition to segment-level timing
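The optional parameters above can be combined in a single request. The sketch below shows one way to assemble them; the keyword names come from the tables above, but the helper itself (build_transcription_kwargs) is purely illustrative and not part of the SDK, and the project name and folder ID are made-up examples.

```python
def build_transcription_kwargs(language, media_url=None, media_file=None,
                               project_name=None, project_description=None,
                               folder_id=None):
    """Assemble keyword arguments for create_transcription (hypothetical helper)."""
    # Enforce the "exactly one of media_url / media_file" rule
    if (media_url is None) == (media_file is None):
        raise ValueError("Provide exactly one of media_url or media_file")
    kwargs = {"language": language}
    if media_url is not None:
        kwargs["media_url"] = media_url
    else:
        kwargs["media_file"] = media_file
    # Only include optional fields the caller actually set
    for key, value in [("project_name", project_name),
                       ("project_description", project_description),
                       ("folder_id", folder_id)]:
        if value is not None:
            kwargs[key] = value
    return kwargs

kwargs = build_transcription_kwargs(
    language=1,
    media_url="https://example.com/meeting.mp3",
    project_name="Weekly sync",
    folder_id=42,
)
# response = client.transcription.create_transcription(**kwargs)
```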

Result shape

get_transcription_result returns a TranscriptionResult with a transcript array. Each segment contains:
  • start (float): Segment start time in seconds
  • end (float): Segment end time in seconds
  • text (string): Transcribed text for the segment
  • speaker (string): Speaker label (e.g. Speaker 1)
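These fields map directly onto subtitle formats. Below is a minimal sketch that renders segments as SRT, assuming each segment exposes start, end, text, and speaker as described above (plain dicts stand in for the SDK's segment objects here, and the sample text is invented).

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments):
    """Render transcript segments as an SRT subtitle string."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['speaker']}: {seg['text']}\n"
        )
    return "\n".join(blocks)

segments = [
    {"start": 0.0, "end": 2.5, "speaker": "Speaker 1", "text": "Welcome, everyone."},
    {"start": 2.5, "end": 5.0, "speaker": "Speaker 2", "text": "Thanks for joining."},
]
print(segments_to_srt(segments))
```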

Language IDs

Unlike dubbing, transcription’s language parameter is a raw integer source-language ID. Some commonly used values:
  • English: 1
  • Spanish: 54
  • French: 76
  • German: 31
  • Mandarin Chinese: 139
  • Japanese: 88
  • Arabic: 4

For the full list of supported source languages and their IDs, see the Source Languages reference.
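If you prefer not to scatter magic integers through your code, a small lookup map keeps language choices readable. The IDs below are taken from the table above; the LANGUAGE_IDS map and language_id helper are a hypothetical convenience, not part of the SDK.

```python
# Source-language IDs from the table above (partial list)
LANGUAGE_IDS = {
    "english": 1,
    "spanish": 54,
    "french": 76,
    "german": 31,
    "mandarin chinese": 139,
    "japanese": 88,
    "arabic": 4,
}

def language_id(name):
    """Look up a source-language ID by name (case-insensitive)."""
    try:
        return LANGUAGE_IDS[name.strip().lower()]
    except KeyError:
        raise ValueError(f"Unknown language {name!r}; see the Source Languages reference")

print(language_id("Japanese"))  # 88
```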

Tips

  • Supported formats: .mp3, .wav, .aac, .flac, .mp4, .mov, and .mxf (MXF is enterprise-only). For best quality use a lossless format like WAV or FLAC.
  • File size: Uploaded files must be under 20 MB. For longer recordings, host the file and pass a media_url instead, or split the recording into chunks.
  • Polling timeout: For long media, cap your polling loop (e.g. 60 attempts x 5s = 5 minutes) and handle the timeout gracefully.
  • Word-level timestamps: Set word_level_timestamps=true when fetching the result to get precise per-word timing — useful for karaoke-style highlighting and subtitle alignment.
  • Pick the right language: Specifying the correct source language ID significantly improves accuracy. For multilingual content, choose the predominant language.
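The polling-timeout tip above can be factored into a small helper. This is one possible sketch: poll_until_done is a hypothetical utility (not an SDK function) that wraps any zero-argument status callable, such as a lambda around client.transcription.get_transcription_task_status, and gives up after a fixed number of attempts.

```python
import time

def poll_until_done(get_status, max_attempts=60, interval=5.0):
    """Poll a status callable until it reports SUCCESS or ERROR.

    get_status: zero-argument callable returning an object with a .status
    attribute. Raises TimeoutError once max_attempts polls have elapsed.
    """
    for _ in range(max_attempts):
        status = get_status()
        if status.status in ("SUCCESS", "ERROR"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"Job still pending after {max_attempts * interval:.0f}s")

# Usage (assuming a client and task_id from the example above):
# status = poll_until_done(
#     lambda: client.transcription.get_transcription_task_status(task_id=task_id)
# )
```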

Next Steps

Create Transcription API

Full API reference for the transcription endpoint.

Poll Transcription Result

Status polling endpoint reference.

Get Transcription Run Result

Retrieve the transcript segments and timing data.

Dubbing

Translate a video into another language while preserving the original voice.