The official Python SDK for Camb.ai provides convenient access to text-to-speech, dubbing, translation, transcription, audio separation, voice cloning, and audio generation. It ships synchronous and asynchronous clients with streaming audio helpers and typed language enums.
Get your API key from Camb.ai Studio and set it as an environment variable. The examples on this page read it with os.getenv("CAMB_API_KEY"), so your key never has to appear in source code.

The SDK ships two clients. Use CambAI for scripts, data pipelines, and thread-based code. Use AsyncCambAI for web servers (FastAPI, Sanic) and high-concurrency applications where blocking would hurt throughput.
import os

from camb.client import CambAI, AsyncCambAI

# Synchronous: for scripts and data pipelines
client = CambAI(api_key=os.getenv("CAMB_API_KEY"))

# Asynchronous: for web servers and high-concurrency applications
async_client = AsyncCambAI(api_key=os.getenv("CAMB_API_KEY"))
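Because os.getenv returns None when the variable is unset, a missing key may only surface later, when the first request fails. A minimal sketch of failing fast instead; require_api_key is a hypothetical helper, not part of the SDK:

```python
import os


def require_api_key(var_name: str = "CAMB_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper, not part of the SDK.
    """
    key = os.getenv(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it before creating a client."
        )
    return key


# Usage, with the constructor shown above:
# client = CambAI(api_key=require_api_key())
```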
Camb.ai offers MARS models tuned for different quality and latency requirements. Pass the model name as speech_model in any TTS call. If you omit it, the API uses a default model.
| Model | Sample Rate | Best For |
| --- | --- | --- |
| mars-8.1-flash-beta | 48 kHz | Faster MARS 8.1 generation; same quality improvements as mars-8.1-pro-beta |
| mars-8.1-pro-beta | 48 kHz | Improved pronunciation, expressiveness, and prosody over mars-pro |
| mars-flash | 22.05 kHz | Low-latency real-time applications and conversational AI |
| mars-pro | 48 kHz | High-fidelity audio production and long-form content |
| mars-instruct | 22.05 kHz | Fine-grained tone and style control via text instructions |
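When the model is chosen at runtime, the table above can be kept as data. The sketch below copies the sample rates from the table; choose_model is a hypothetical helper with deliberately coarse rules, not an SDK function:

```python
# Sample rates in Hz, copied from the model table above.
MODEL_SAMPLE_RATES = {
    "mars-8.1-flash-beta": 48_000,
    "mars-8.1-pro-beta": 48_000,
    "mars-flash": 22_050,
    "mars-pro": 48_000,
    "mars-instruct": 22_050,
}


def choose_model(low_latency: bool = False, style_instructions: bool = False) -> str:
    """Hypothetical helper: map coarse requirements to a speech_model name."""
    if style_instructions:
        return "mars-instruct"  # tone and style control via text instructions
    if low_latency:
        return "mars-flash"  # real-time and conversational use
    return "mars-pro"  # high-fidelity, long-form content


model = choose_model(low_latency=True)
sample_rate = MODEL_SAMPLE_RATES[model]
```

The returned name is what you would pass as speech_model; the sample rate is useful when configuring whatever plays or resamples the audio downstream.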
MARS Flash

stream = client.text_to_speech.tts(
    text="Hey, I can respond much faster.",
    language="en-us",
    voice_id=147320,
    speech_model="mars-flash",
)

Best for: Voice agents, real-time applications
Sample rate: 22.05 kHz

MARS Pro

stream = client.text_to_speech.tts(
    text="High-fidelity audio for production use.",
    language="en-us",
    voice_id=147320,
    speech_model="mars-pro",
)

Best for: Audio production, dubbing, long-form content
Sample rate: 48 kHz

MARS Instruct

stream = client.text_to_speech.tts(
    text="[warm, friendly] Great to meet you!",
    language="en-us",
    voice_id=147320,
    speech_model="mars-instruct",
    user_instructions="Speak warmly and with enthusiasm.",
)

Best for: Fine-grained control over tone and delivery
Sample rate: 22.05 kHz
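Each call above returns a stream rather than a finished file. The sketch below shows one way to persist it, assuming the stream is an iterable of bytes chunks (an assumption; this section does not state the stream type). The SDK's own save_stream_to_file helper, imported from camb.client elsewhere on this page, covers the same task; write_audio_chunks is a hypothetical stand-in for illustration:

```python
from pathlib import Path
from typing import Iterable


def write_audio_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write byte chunks to `path` and return the number of bytes written.

    Assumes `chunks` yields bytes objects; hypothetical helper, not part of the SDK.
    """
    total = 0
    with Path(path).open("wb") as out:
        for chunk in chunks:
            if chunk:  # ignore empty chunks
                out.write(chunk)
                total += len(chunk)
    return total


# Usage, under the assumption above:
# write_audio_chunks(stream, "output.wav")
```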
Call list_voices() to retrieve all voices available on your account, including pre-built library voices and any custom voices you have created. The id field is what you pass as voice_id in TTS calls.
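A common follow-up is looking up a voice by name to get its id. The sketch below works on whatever list_voices() returns, assuming each entry is a dict or object with an id field (named above) and a voice_name field (an assumption, mirroring the voice_name parameter used when creating a custom voice). find_voice_id is a hypothetical helper:

```python
from typing import Any, Iterable, Optional


def _field(item: Any, name: str) -> Any:
    """Read `name` from a dict or from an attribute-style object."""
    if isinstance(item, dict):
        return item.get(name)
    return getattr(item, name, None)


def find_voice_id(voices: Iterable[Any], name: str) -> Optional[int]:
    """Return the id of the first voice whose name matches, ignoring case."""
    wanted = name.casefold()
    for voice in voices:
        voice_name = _field(voice, "voice_name")  # assumed field name
        if isinstance(voice_name, str) and voice_name.casefold() == wanted:
            return _field(voice, "id")
    return None


# Usage; the sub-client that exposes list_voices() is an assumption here:
# voices = client.voice_cloning.list_voices()
# voice_id = find_voice_id(voices, "My Custom Voice")
```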
To clone a voice, upload a short reference audio sample alongside a name and gender. Setting enhance_audio=True applies preprocessing that improves cloning quality on recordings with background noise.
import os

from camb.client import CambAI

client = CambAI(api_key=os.getenv("CAMB_API_KEY"))

with open("reference.wav", "rb") as f:
    result = client.voice_cloning.create_custom_voice(
        voice_name="My Custom Voice",
        gender="female",
        file=f,
        description="Warm and conversational.",
        enhance_audio=True,
    )

print(f"Voice created: {result}")
Camb.ai supports up to 158 languages depending on the MARS model; see Language Support for the full per-model locale list. You can pass locale strings directly, but using the Languages enum is recommended because it provides autocomplete in your editor and prevents typos in language codes.
To fetch the full list of supported languages at runtime, use the languages sub-client. Source languages are what you can transcribe or translate from; target languages are what you can translate or dub into.
import os

from camb.client import CambAI

client = CambAI(api_key=os.getenv("CAMB_API_KEY"))

source_languages = client.languages.get_source_languages()
target_languages = client.languages.get_target_languages()

for lang in source_languages:
    print(lang)
Languages supported per model are listed at MARS Models.
The dubbing pipeline takes a publicly accessible video URL, translates the audio track into your target language, and synthesizes speech using a clone of the original speaker's voice. Dubbing is asynchronous: you submit the job, poll for completion, and then fetch the result.
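The submit, poll, fetch pattern recurs for every asynchronous job on this page, so it can help to factor the polling step out. The sketch below is a generic helper that adds a timeout and stops on failure. It assumes the status object exposes a status attribute, as in the examples on this page; the failure status names ("ERROR", "FAILED") are assumptions, and wait_for_task is hypothetical, not an SDK function:

```python
import time
from typing import Any, Callable, Iterable


def wait_for_task(
    get_status: Callable[[], Any],
    interval: float = 2.0,
    timeout: float = 600.0,
    success: str = "SUCCESS",
    failures: Iterable[str] = ("ERROR", "FAILED"),  # assumed status names
) -> Any:
    """Poll get_status() until success, a failure status, or the timeout."""
    failure_set = set(failures)
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        state = getattr(status, "status", None)
        if state == success:
            return status
        if state in failure_set:
            raise RuntimeError(f"Task ended with status {state!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task still {state!r} after {timeout} seconds")
        time.sleep(interval)


# Usage with the translation calls that appear on this page
# (task_id comes from the job you submitted):
# status = wait_for_task(
#     lambda: client.translation.get_translation_task_status(task_id=task_id)
# )
# result = client.translation.get_translation_result(run_id=status.run_id)
```

The dubbing method names are not shown in this section, so the usage note borrows the translation calls instead; the helper itself only needs a callable that returns the current status.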
The translation API accepts a list of text strings and returns them translated into the target language in the same order. Like dubbing, translation is asynchronous. You submit a job and poll until the status reaches SUCCESS.
import os
import time

from camb.client import CambAI
from camb.types.language_enums import Languages

client = CambAI(api_key=os.getenv("CAMB_API_KEY"))

response = client.translation.create_translation(
    texts=["Hello, how are you?", "Welcome to Camb.ai."],
    source_language=Languages.EN_US,
    target_language=Languages.FR_FR,
)

while True:
    status = client.translation.get_translation_task_status(task_id=response["task_id"])
    if status.status == "SUCCESS":
        result = client.translation.get_translation_result(run_id=status.run_id)
        for text in result.texts:
            print(text)
        break
    time.sleep(2)
create_translation accepts a list of strings under texts. All strings are translated in a single job and returned in order.
Submit an audio or video file for transcription and retrieve the result once processing completes. You can pass a remote URL via media_url or upload a local file via media_file. The result supports optional word-level timestamps.
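Since the source can be either a URL or a local file, a small wrapper can choose the right keyword and manage the file handle. A minimal sketch, using only the media_url and media_file parameter names given above; media_kwargs is a hypothetical helper, and the name of the call that submits the transcription job is not shown in this section:

```python
from contextlib import contextmanager
from typing import Any, Dict, Iterator


@contextmanager
def media_kwargs(source: str) -> Iterator[Dict[str, Any]]:
    """Yield {'media_url': ...} for http(s) sources, else {'media_file': <open file>}."""
    if source.startswith(("http://", "https://")):
        yield {"media_url": source}
    else:
        with open(source, "rb") as handle:
            yield {"media_file": handle}  # closed when the with-block exits


# Usage: spread the kwargs into whichever call submits the job.
# with media_kwargs("interview.mp3") as kwargs:
#     ...  # pass **kwargs to the transcription submit call
```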
Audio separation splits a mixed audio track into its vocal and background components. After the job completes, the result contains separate download URLs for each stem so you can use them independently.
import os
import time

from camb.client import CambAI

client = CambAI(api_key=os.getenv("CAMB_API_KEY"))

with open("track.mp3", "rb") as f:
    response = client.audio_separation.create_audio_separation(media_file=f)

while True:
    status = client.audio_separation.get_audio_separation_status(task_id=response.task_id)
    if status.status == "SUCCESS":
        result = client.audio_separation.get_audio_separation_run_info(run_id=status.run_id)
        print(result)
        break
    time.sleep(3)
Text-to-Voice creates a brand-new synthetic voice from a written description of the desired vocal characteristics. The API generates several audio samples so you can audition variations before deciding which voice ID to use in production.
import os
import time

from camb.client import CambAI

client = CambAI(api_key=os.getenv("CAMB_API_KEY"))

response = client.text_to_voice.create_text_to_voice(
    text="A confident narrator introducing a documentary.",
    voice_description="Deep, measured baritone with a slight gravel. Calm and authoritative.",
)

while True:
    status = client.text_to_voice.get_text_to_voice_status(task_id=response.task_id)
    if status.status == "SUCCESS":
        result = client.text_to_voice.get_text_to_voice_result(run_id=status.run_id)
        print(result)
        break
    time.sleep(3)
The result contains multiple sample audio URLs. Preview them and use the voice_id of your preferred sample with client.text_to_speech.tts.
Text-to-Audio generates sound effects or ambient soundscapes from a descriptive text prompt. The job is asynchronous, and the result is a streamable audio file you can save with save_stream_to_file once processing completes.
import os
import time

from camb.client import CambAI, save_stream_to_file

client = CambAI(api_key=os.getenv("CAMB_API_KEY"))

response = client.text_to_audio.create_text_to_audio(
    prompt="Heavy rain on a tin roof at night with distant thunder.",
    duration=15,
    audio_type="sound",
)

while True:
    status = client.text_to_audio.get_text_to_audio_status(task_id=response.task_id)
    if status.status == "SUCCESS":
        stream = client.text_to_audio.get_text_to_audio_result(run_id=status.run_id)
        save_stream_to_file(stream, "soundscape.mp3")
        print("Saved to soundscape.mp3")
        break
    time.sleep(3)
The Stories API takes a document file, structures its content into a narrative, and returns a fully narrated audio output. You can provide a custom narrator voice ID; if omitted, the API selects a default voice appropriate for the source language.
import os
import time

from camb.client import CambAI
from camb.types.language_enums import Languages

client = CambAI(api_key=os.getenv("CAMB_API_KEY"))

with open("story.pdf", "rb") as f:
    response = client.story.create_story(
        file=f,
        source_language=Languages.EN_US,
        title="My Story",
    )

while True:
    status = client.story.get_story_status(task_id=response.task_id)
    if status.status == "SUCCESS":
        result = client.story.get_story_run_info(run_id=status.run_id)
        print(result)
        break
    time.sleep(5)
Translated TTS combines translation and speech synthesis into a single asynchronous job. The text is translated into the target language and then spoken using the voice you specify, without needing to run translation and TTS as separate steps.
Dictionaries contain custom term mappings that the API applies automatically when running dubbing or translation jobs. They are particularly useful for brand names, product terminology, and proper nouns that need consistent handling across languages.
Use add_term_to_dictionary to insert individual term translations and delete_dictionary_term to remove them by ID. Each term takes a source, a target, and the language the target is in:
import os

from camb.client import CambAI
from camb.types import TermTranslationInput

client = CambAI(api_key=os.getenv("CAMB_API_KEY"))

# Add a term
client.dictionaries.add_term_to_dictionary(
    dictionary_id="dict_123",
    translations=[
        TermTranslationInput(source="Camb.ai", target="कैम्ब.एआई", language="hi-in")
    ],
)

# Delete a term
client.dictionaries.delete_dictionary_term(
    dictionary_id="dict_123",
    term_id="term_456",
)

# Delete the dictionary
client.dictionaries.delete_dictionary(dictionary_id="dict_123")
If you are running MARS on your own infrastructure through Baseten, you can initialize the client with a provider configuration instead of a Camb.ai API key. See Custom Cloud Providers for deployment instructions.
Deploy the MARS model to your Baseten account, then point the client at your deployment URL. Baseten calls require a base64-encoded reference audio sample to be passed with every request:
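Producing the base64 string needs only the standard library. The sketch below covers the encoding step alone; the provider configuration and the parameter that carries the encoded audio are not shown in this section, so they are left out rather than guessed. encode_reference_audio is a hypothetical helper:

```python
import base64
from pathlib import Path


def encode_reference_audio(path: str) -> str:
    """Return the file's bytes as an ASCII base64 string.

    Hypothetical helper, not part of the SDK.
    """
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


# Usage:
# reference_b64 = encode_reference_audio("reference.wav")
```

Encoding once and reusing the string avoids re-reading the file on every request.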