The official Java SDK for Camb.ai provides convenient access to text-to-speech, dubbing, translation, transcription, audio separation, voice cloning, and audio generation. Requests use a fluent builder pattern; async jobs follow a typed submit-poll-fetch workflow with TaskStatus enums.
The SDK returns an InputStream for TTS audio so you can stream the response directly to disk:
The generated SDK classes in this repository are in the default Java package (no package ...; declaration). To keep this snippet runnable without extra packaging changes, put Main in the default package too.
import resources.texttospeech.requests.CreateStreamTtsRequestPayload;
import resources.texttospeech.types.CreateStreamTtsRequestPayloadLanguage;
import resources.texttospeech.types.CreateStreamTtsRequestPayloadSpeechModel;
import types.OutputFormat;
import types.StreamTtsOutputConfiguration;

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class Main {
    // Copies the stream to a file and closes both when done.
    private static void saveStreamToFile(InputStream stream, String filename) throws IOException {
        try (InputStream in = stream; FileOutputStream out = new FileOutputStream(filename)) {
            byte[] buffer = new byte[4096];
            int bytesRead;
            while ((bytesRead = in.read(buffer)) != -1) {
                out.write(buffer, 0, bytesRead);
            }
        }
    }

    public static void main(String[] args) {
        String apiKey = System.getenv("CAMB_API_KEY");
        if (apiKey == null || apiKey.isEmpty()) {
            throw new IllegalStateException("Missing CAMB_API_KEY environment variable.");
        }

        CambApiClient client = CambApiClient.builder()
            .apiKey(apiKey)
            .build();

        InputStream audioStream = client.textToSpeech().tts(
            CreateStreamTtsRequestPayload.builder()
                .text("Hello! Welcome to Camb.ai text-to-speech.")
                .language(CreateStreamTtsRequestPayloadLanguage.EN_US)
                .voiceId(147320)
                .speechModel(CreateStreamTtsRequestPayloadSpeechModel.MARSFLASH)
                .outputConfiguration(StreamTtsOutputConfiguration.builder().format(OutputFormat.WAV).build())
                .build()
        );

        try {
            saveStreamToFile(audioStream, "output.wav");
            System.out.println("Audio saved to output.wav");
        } catch (IOException e) {
            throw new RuntimeException("Failed to save TTS output file", e);
        }
    }
}
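The Quick Start helper copies bytes with a manual buffer loop. As an alternative, the JDK's `Files.copy` can do the same work in one call. The following is a minimal sketch that does not depend on the SDK; the class name `StreamSaver` and method `save` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the Camb.ai SDK.
final class StreamSaver {
    private StreamSaver() {}

    /**
     * Copies the whole stream to the target path, replacing any existing file,
     * closes the stream, and returns the number of bytes written.
     */
    static long save(InputStream stream, Path target) throws IOException {
        try (InputStream in = stream) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Returning the byte count makes it easy to log or sanity-check the size of the downloaded audio.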
The snippets in the sections below assume client is already initialized as shown above.
The textToSpeech().tts(...) call streams audio as an InputStream. Add userInstructions when using MARSINSTRUCT to control delivery style:
InputStream audioStream = client.textToSpeech().tts(
    CreateStreamTtsRequestPayload.builder()
        .text("A warm greeting, delivered naturally.")
        .language(CreateStreamTtsRequestPayloadLanguage.EN_US)
        .voiceId(147320)
        .speechModel(CreateStreamTtsRequestPayloadSpeechModel.MARSINSTRUCT)
        .userInstructions("Speak with a friendly, upbeat tone.")
        .outputConfiguration(StreamTtsOutputConfiguration.builder().format(OutputFormat.WAV).build())
        .build()
);
// Write audioStream to a file using saveStreamToFile as shown in Quick Start.
Text-to-audio is asynchronous. Submit a prompt, poll until it succeeds, then download the resulting audio stream:
// Submit the prompt.
OrchestratorPipelineCallResult submitted = client.textToAudio().createTextToAudio(
    CreateTextToAudioRequestPayload.builder()
        .prompt("Heavy rain on a tin roof at night with distant thunder.")
        .duration(15.0)
        .audioType(TextToAudioType.SOUND)
        .build()
);
String taskId = submitted.getTaskId().orElseThrow();

// Poll until the task succeeds or fails.
// Thread.sleep throws InterruptedException; declare or handle it in the enclosing method.
Integer runId = null;
while (true) {
    OrchestratorPipelineResult status = client.textToAudio().getTextToAudioStatus(taskId);
    if (status.getStatus() == TaskStatus.SUCCESS) {
        runId = status.getRunId().orElseThrow();
        break;
    }
    if (status.getStatus() == TaskStatus.ERROR) throw new RuntimeException("Text-to-audio failed");
    Thread.sleep(3000);
}

// Fetch the result (requires java.util.Optional).
InputStream audioStream = client.textToAudio().getTextToAudioResult(Optional.of(runId));
saveStreamToFile(audioStream, "soundscape.wav"); // saveStreamToFile defined in Quick Start
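The polling loop above runs until the task reaches a terminal status, with no upper bound on waiting time. One way to bound it is a small generic helper that takes a status supplier, a completion predicate, an interval, and a timeout. This is a sketch under stated assumptions: `Poller` and `pollUntil` are hypothetical names, not part of the SDK, and the helper knows nothing about Camb.ai types.

```java
import java.time.Duration;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper, not part of the Camb.ai SDK.
final class Poller {
    private Poller() {}

    /**
     * Calls fetch repeatedly until isDone accepts the result, sleeping for
     * interval between attempts. Throws IllegalStateException if the next
     * sleep would run past the timeout, or if the thread is interrupted.
     */
    static <T> T pollUntil(Supplier<T> fetch, Predicate<T> isDone, Duration interval, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            T result = fetch.get();
            if (isDone.test(result)) {
                return result;
            }
            // Subtraction keeps the comparison safe if nanoTime wraps around.
            if (deadline - System.nanoTime() < interval.toNanos()) {
                throw new IllegalStateException("Timed out after " + timeout + " while polling");
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
                throw new IllegalStateException("Interrupted while polling", e);
            }
        }
    }
}
```

With the client from this page, the supplier could be `() -> client.textToAudio().getTextToAudioStatus(taskId)` and the predicate could accept both `TaskStatus.SUCCESS` and `TaskStatus.ERROR`, leaving the caller to inspect the returned status.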
The Stories endpoint ingests a document file and generates narrated audio asynchronously. The client returns a union response for submission, so you extract task_id with visit(...):
CreateStoryStoryPostResponse submitted = client.story().createStory(
    new File("story.pdf"),
    BodyCreateStoryStoryPost.builder()
        .sourceLanguage(Languages.EN_US.getValue())
        .title("My Story")
        .build()
);

// The submission response is a union type; use a visitor to pull out the task ID.
final String[] taskIdHolder = new String[1];
submitted.visit(new CreateStoryStoryPostResponse.Visitor<Void>() {
    @Override
    public Void visit(OrchestratorPipelineCallResult value) {
        taskIdHolder[0] = value.getTaskId().orElseThrow();
        return null;
    }

    @Override
    public Void visit(GetSetupStoryResultResponse value) {
        throw new RuntimeException("Unexpected setup response");
    }
});
String taskId = taskIdHolder[0];

// Poll until the story task succeeds or fails.
Integer runId = null;
while (true) {
    OrchestratorPipelineResult status = client.story().getStoryStatus(taskId);
    if (status.getStatus() == TaskStatus.SUCCESS) {
        runId = status.getRunId().orElseThrow();
        break;
    }
    if (status.getStatus() == TaskStatus.ERROR) throw new RuntimeException("Stories task failed");
    Thread.sleep(5000);
}

Map<String, Object> runInfo = client.story().getStoryRunInfo(Optional.of(runId));
System.out.println("Story run info: " + runInfo);
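The snippet above uses a `Visitor<Void>` and a one-element array to carry the task ID out of the anonymous class. Because the visitor is generic, a `Visitor<String>` can plausibly return the ID directly instead. The sketch below illustrates that pattern with stand-in types (`SubmitResponse`, `TaskSubmitted`, `SetupResult`, `TaskIds`), all hypothetical; it assumes the SDK's `visit(...)` returns the visitor's result, which this page does not confirm.

```java
import java.util.Optional;

// Stand-in for the "task submitted" variant; the real SDK class may differ.
final class TaskSubmitted {
    private final String taskId;
    TaskSubmitted(String taskId) { this.taskId = taskId; }
    Optional<String> getTaskId() { return Optional.ofNullable(taskId); }
}

// Stand-in for the other variant of the union.
final class SetupResult {}

// Stand-in union type exposing a generic visitor, mirroring the shape used above.
final class SubmitResponse {
    interface Visitor<T> {
        T visit(TaskSubmitted value);
        T visit(SetupResult value);
    }

    private final Object value;
    SubmitResponse(Object value) { this.value = value; }

    <T> T visit(Visitor<T> visitor) {
        if (value instanceof TaskSubmitted) {
            return visitor.visit((TaskSubmitted) value);
        }
        return visitor.visit((SetupResult) value);
    }
}

final class TaskIds {
    private TaskIds() {}

    /** Returns the task ID from the visitor itself, so no holder array is needed. */
    static String extract(SubmitResponse response) {
        return response.visit(new SubmitResponse.Visitor<String>() {
            @Override
            public String visit(TaskSubmitted value) {
                return value.getTaskId()
                    .orElseThrow(() -> new IllegalStateException("Missing task ID"));
            }

            @Override
            public String visit(SetupResult value) {
                throw new IllegalStateException("Unexpected setup response");
            }
        });
    }
}
```

Returning the value from the visitor keeps the extraction in a single expression and avoids mutable state shared with the anonymous class.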
Translated TTS is asynchronous. Create a translated TTS task, poll until success, then inspect the success payload via OrchestratorPipelineResult.getAdditionalProperties():
CreateTranslatedTtsOut created = client.translatedTts().createTranslatedTts(
    CreateTranslatedTtsRequestPayload.builder()
        .text("Good morning, welcome to our service.")
        .voiceId(147320)
        .sourceLanguage(Languages.EN_US.getValue())
        .targetLanguage(Languages.HI_IN.getValue())
        .build()
);

while (true) {
    OrchestratorPipelineResult status =
        client.translatedTts().getTranslatedTtsTaskStatus(created.getTaskId());
    if (status.getStatus() == TaskStatus.SUCCESS) {
        System.out.println(status.getAdditionalProperties());
        break;
    }
    if (status.getStatus() == TaskStatus.ERROR) throw new RuntimeException("Translated TTS failed");
    Thread.sleep(3000);
}
The dictionaries client creates a dictionary from a file, adds terms to an existing dictionary, and deletes individual terms:
// Create a dictionary from a CSV file.
Object created = client.dictionaries().createDictionaryFromFile(
    new File("terms.csv"),
    BodyCreateDictionaryFromFileDictionariesCreateFromFilePost.builder()
        .dictionaryName("Product Terms")
        .dictionaryDescription("Brand-specific terminology.")
        .build()
);

// Add a term to an existing dictionary.
int dictionaryId = 123;
client.dictionaries().addTermToDictionary(
    dictionaryId,
    AddDictionaryTermPayload.builder()
        .translations(Arrays.asList(
            TermTranslationInput.builder()
                .translation("Camb.ai")
                .language(Languages.HI_IN.getValue())
                .build()
        ))
        .build()
);

// Remove a specific term.
client.dictionaries().deleteDictionaryTerm(dictionaryId, /* termId */ 456);
Custom hosting providers are implemented as ITtsProvider instances. You call provider.tts(request, requestOptions) directly instead of routing through CambApiClient:
import core.RequestOptions;
import resources.texttospeech.requests.CreateStreamTtsRequestPayload;
import resources.texttospeech.types.CreateStreamTtsRequestPayloadLanguage;

import com.fasterxml.jackson.databind.ObjectMapper;
import okhttp3.MediaType;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;

import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Minimal Baseten provider implementation (based on the SDK example).
class BasetenProvider implements ITtsProvider {
    private final String apiKey;
    private final String url;
    private final String referenceAudio;
    private final String referenceLanguage;
    private final OkHttpClient httpClient;
    private final ObjectMapper objectMapper;

    public BasetenProvider(String apiKey, String url, String referenceAudio, String referenceLanguage) {
        this.apiKey = apiKey;
        this.url = url;
        this.referenceAudio = referenceAudio;
        this.referenceLanguage = referenceLanguage;
        this.httpClient = new OkHttpClient();
        this.objectMapper = new ObjectMapper();
    }

    @Override
    public InputStream tts(CreateStreamTtsRequestPayload request, RequestOptions requestOptions) {
        // Locale.ROOT keeps lowercasing independent of the JVM's default locale.
        String language = request.getLanguage().toString().toLowerCase(Locale.ROOT).replace("_", "-");

        Map<String, Object> payload = new HashMap<>();
        payload.put("text", request.getText());
        payload.put("language", language);
        payload.put("output_duration", null);
        payload.put("reference_audio", referenceAudio);
        payload.put("reference_language", referenceLanguage);
        payload.put("output_format", "flac"); // default; overridden below if the request sets a format
        payload.put("apply_ner_nlp", false);

        request.getOutputConfiguration().ifPresent(config -> {
            config.getFormat().ifPresent(f -> payload.put("output_format", f.toString().toLowerCase(Locale.ROOT)));
        });

        try {
            String json = objectMapper.writeValueAsString(payload);
            RequestBody body = RequestBody.create(json, MediaType.parse("application/json"));
            Request req = new Request.Builder()
                .url(this.url)
                .addHeader("Authorization", "Api-Key " + this.apiKey)
                .post(body)
                .build();

            Response response = httpClient.newCall(req).execute();
            if (!response.isSuccessful()) {
                String errorBody = response.body() != null ? response.body().string() : "<no body>";
                throw new RuntimeException("Baseten API error " + response.code() + ": " + errorBody);
            }
            // The caller owns the returned stream and must close it.
            return response.body().byteStream();
        } catch (IOException e) {
            throw new RuntimeException("Network error calling Baseten: " + e.getMessage(), e);
        }
    }
}

// Usage: instantiate the provider and call tts() directly.
ITtsProvider provider = new BasetenProvider(
    System.getenv("BASETEN_API_KEY"),
    System.getenv("BASETEN_URL"),
    System.getenv("BASETEN_REFERENCE_AUDIO"),
    "en-us"
);

InputStream audioStream = provider.tts(
    CreateStreamTtsRequestPayload.builder()
        .text("Hello from Java via Baseten Mars8-Flash!")
        .language(CreateStreamTtsRequestPayloadLanguage.EN_US)
        .voiceId(1)
        .build(),
    null
);
// No outputConfiguration is set, so the provider requests its default "flac" format.
saveStreamToFile(audioStream, "baseten_output.flac");
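The language-tag conversion inside `tts(...)` (enum-style name to a lowercase, hyphenated tag) is easy to factor out and check on its own. The sketch below uses a hypothetical helper, `LanguageTags.toProviderTag`, and passes `Locale.ROOT` so that lowercasing does not depend on the JVM's default locale; under a Turkish default locale, for example, a plain `toLowerCase()` maps `I` to a dotless `ı`. Whether the SDK's language enum prints as `EN_US` or `en-us` is not stated on this page, so the helper accepts either form.

```java
import java.util.Locale;

// Hypothetical helper, not part of the Camb.ai SDK.
final class LanguageTags {
    private LanguageTags() {}

    /** Converts a name such as "EN_US" (or "en-us") to the tag form "en-us". */
    static String toProviderTag(String name) {
        return name.toLowerCase(Locale.ROOT).replace('_', '-');
    }
}
```

For instance, `LanguageTags.toProviderTag("HI_IN")` yields `hi-in` regardless of the default locale.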