POST /transcribe
curl --request POST \
  --url https://client.camb.ai/apis/transcribe \
  --header 'Content-Type: multipart/form-data' \
  --header 'x-api-key: <api-key>' \
  --form language=1
{
  "task_id": "<string>"
}

Converting Speech to Text with Precision

Our transcription service transforms spoken content into accurate, readable text, enabling you to make your audio content searchable, accessible, and analytically valuable. This endpoint initiates a transcription task, processing your audio file and returning a unique identifier that allows you to track and retrieve your results.

Understanding Speech Transcription

Speech transcription technology analyzes audio recordings of human speech and converts them into written text. This process employs sophisticated machine learning models trained on diverse speech patterns, accents, and linguistic contexts to deliver high-quality text outputs. Our system handles various audio formats and speaking situations, from clear studio recordings to more challenging environments with background noise.

When you submit an audio file for transcription, our system:

  1. Analyzes the audio signal to identify speech segments
  2. Processes these segments through advanced recognition models
  3. Applies language-specific rules and context awareness
  4. Generates a readable text transcript that captures the spoken content

This transformation creates valuable text assets from your audio content, enabling new ways to search, analyze, and repurpose your spoken material.

Supported Languages

Our transcription service supports a wide range of languages. Some of the most commonly used include:

  • English (1)
  • Spanish (54)
  • French (76)
  • German (31)
  • Mandarin Chinese (139)
  • Japanese (88)
  • Arabic (4)

For a complete list of supported languages and their respective language codes, refer to our Language Support Documentation.
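As a small illustration, the codes listed above can be kept in a lookup table so that calling code passes a language name instead of a bare number. This is only a sketch: `LANGUAGE_IDS` and `language_id` are hypothetical names, and the table covers only the seven languages shown in this section.

```python
# Language codes copied from the list in this section. The complete set
# lives in the Language Support Documentation and is not reproduced here.
LANGUAGE_IDS = {
    "english": 1,
    "spanish": 54,
    "french": 76,
    "german": 31,
    "mandarin chinese": 139,
    "japanese": 88,
    "arabic": 4,
}


def language_id(name):
    """Return the numeric language code for a language name (case-insensitive)."""
    key = name.strip().lower()
    try:
        return LANGUAGE_IDS[key]
    except KeyError:
        raise ValueError(
            f"Unknown language {name!r}; add its code from the language documentation"
        ) from None
```

For example, `language_id("Spanish")` returns `54`, which can then be sent as the `language` form field.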

Supported Audio Formats

For optimal transcription quality, we recommend using high-quality audio with clear speech and minimal background noise. Our service accepts the following audio formats:

  • MP3 (.mp3)
  • WAV (.wav)
  • AAC (.aac)
  • FLAC (.flac)

Request Example

You can create a transcription task by either uploading an audio file or providing a URL to an audio resource. Here are examples using cURL:

  • Using a local audio file:
curl -X POST "https://client.camb.ai/apis/transcribe" \
  -H "x-api-key: YOUR_API_KEY" \
  -F "language=1" \
  -F "file=@/path/to/your/audio_file.mp3"
  • Using a remote audio URL:
curl -X POST "https://client.camb.ai/apis/transcribe" \
  -H "x-api-key: YOUR_API_KEY" \
  -F "language=1" \
  -F "audio_url=https://example.com/audio_file.mp3"

Here’s how to handle both scenarios in Python with the requests library:

import os

import requests

def create_transcription_task(api_key, language=1, audio_file_path=None, audio_url=None):
    """
    Initiates a transcription task using either an audio file or URL.

    Args:
        api_key (str): API authentication key
        language (int): Numeric language code (default: 1, English)
        audio_file_path (str, optional): Path to local audio file
        audio_url (str, optional): URL of remote audio resource

    Returns:
        dict: Response containing task_id for status tracking, or None on error
    """
    url = "https://client.camb.ai/apis/transcribe"
    headers = {"x-api-key": api_key}
    data = {"language": language}

    # Validate input: exactly one audio source must be provided
    if not (audio_file_path or audio_url):
        raise ValueError("Must provide either audio_file_path or audio_url")
    if audio_file_path and audio_url:
        raise ValueError("Provide only one of audio_file_path or audio_url")

    if audio_file_path:
        # Send the request while the file is still open; once the "with"
        # block exits, the handle is closed and can no longer be read.
        with open(audio_file_path, "rb") as audio_file:
            files = {"file": (os.path.basename(audio_file_path), audio_file)}
            response = requests.post(url, headers=headers, files=files, data=data)
    else:
        # A (None, value) tuple sends a plain form field, so the body stays
        # multipart/form-data just like the cURL -F example.
        files = {"audio_url": (None, audio_url)}
        response = requests.post(url, headers=headers, files=files, data=data)

    if response.ok:
        return response.json()
    else:
        print(f"Error {response.status_code}: {response.text}")
        return None

# Example with file upload (1 = English)
file_result = create_transcription_task(
    api_key="your_api_key",
    audio_file_path="meeting_recording.mp3",
    language=1
)

# Example with audio URL (54 = Spanish)
url_result = create_transcription_task(
    api_key="your_api_key",
    audio_url="https://storage.example.com/interview.mp3",
    language=54
)

Processing Time Considerations

Transcription processing time depends on several factors:

  • Audio Duration: Longer files naturally take more time to process
  • Audio Quality: Clear, high-quality recordings process more efficiently
  • Language Complexity: Some languages may require more processing time
  • System Load: Processing time can vary based on current system demand

Next Steps: Monitoring Your Transcription Task

After submitting your transcription request, you’ll want to monitor its progress and retrieve the results once processing completes. To do this:

  1. Use the /transcribe/{task-id} endpoint to check your task’s status
  2. Poll the status endpoint at reasonable intervals (we recommend 5-15 second intervals for most cases)
  3. Once the status shows as SUCCESS, you can retrieve your full transcript
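The polling steps above can be sketched as follows. This is a minimal sketch, not a definitive client, and it rests on assumptions: the status endpoint is assumed to share the `https://client.camb.ai/apis` base URL, the response is assumed to be JSON with a `status` field, and `fetch_status` and `wait_for_task` are hypothetical helper names. Only the `SUCCESS` value comes from this page, so any failure status names must be supplied by the caller.

```python
import json
import time
import urllib.request


def fetch_status(api_key, task_id, base_url="https://client.camb.ai/apis"):
    """GET /transcribe/{task_id} and return the decoded JSON body.

    Assumption: the status endpoint lives under the same base URL as the
    create endpoint and uses the same x-api-key header.
    """
    request = urllib.request.Request(
        f"{base_url}/transcribe/{task_id}",
        headers={"x-api-key": api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_task(get_status, interval=10.0, timeout=600.0,
                  failure_statuses=(), sleep=time.sleep, clock=time.monotonic):
    """Call get_status() every `interval` seconds until the task succeeds.

    Returns the final payload on SUCCESS. Raises RuntimeError if the status
    is one of `failure_statuses`, or TimeoutError if `timeout` seconds would
    be exceeded before the next poll.
    """
    deadline = clock() + timeout
    while True:
        payload = get_status()
        status = payload.get("status")  # assumed field name
        if status == "SUCCESS":
            return payload
        if status in failure_statuses:
            raise RuntimeError(f"Transcription task ended with status {status!r}")
        if clock() + interval > deadline:
            raise TimeoutError(f"Task still {status!r} after {timeout} seconds")
        sleep(interval)
```

A call might look like `wait_for_task(lambda: fetch_status("your_api_key", task_id), interval=10)`. Passing the fetcher as a callable keeps the waiting logic independent of the HTTP library, so the same loop works with `requests` as well.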

Best Practices for Optimal Results

To get the most accurate transcriptions from our service:

  1. Use High-Quality Audio: Whenever possible, use audio recorded in quiet environments with minimal background noise.

  2. Appropriate Audio Format: Submit lossless audio formats like WAV or FLAC for best quality, or high-bitrate MP3s if file size is a concern.

  3. Speaker Clarity: Encourage clear speaking with moderate pace for best recognition accuracy.

  4. Specify the Correct Language: Always provide the correct language parameter to ensure our models apply the right language patterns.

  5. Segment Longer Content: For very long recordings (over 2 hours), consider splitting into multiple smaller files for more efficient processing.
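To illustrate the last recommendation, the sketch below plans where to cut a long recording so that no part exceeds two hours. It only computes time boundaries and does not touch the audio itself, which you would cut with an audio tool of your choice. `plan_segments` is a hypothetical helper, and splitting into equal parts is one possible policy, not something the service requires.

```python
import math

TWO_HOURS = 2 * 60 * 60  # seconds; the threshold suggested above


def plan_segments(total_seconds, max_seconds=TWO_HOURS):
    """Split a duration into the fewest equal parts no longer than max_seconds.

    Returns a list of (start, end) pairs in seconds.
    """
    if total_seconds <= 0 or max_seconds <= 0:
        raise ValueError("durations must be positive")
    count = math.ceil(total_seconds / max_seconds)
    size = total_seconds / count
    # The last segment ends exactly at total_seconds to avoid float drift.
    return [
        (i * size, total_seconds if i == count - 1 else (i + 1) * size)
        for i in range(count)
    ]
```

For a five-hour recording (18,000 seconds) this yields three segments of 100 minutes each, rather than two full segments and a short remainder.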

Handling Common Issues

If you encounter problems with your transcription tasks, these troubleshooting steps may help:

Rejected File Uploads

Problem: Your request returns an error about the file upload.

Potential Solutions:

  • Verify your file is in one of our supported formats (.mp3, .wav, .aac, .flac)
  • Check that your file isn’t corrupted or empty
  • Ensure your file doesn’t exceed our size limit (20 MB)
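These checks can also be run locally before uploading. The sketch below is a hedged example: `check_audio_file` is a hypothetical helper, the 20 MB limit is read as 20 × 1024 × 1024 bytes (an assumption, since the exact byte count is not stated here), and the check looks only at the extension and size, so it cannot detect a corrupted file.

```python
import os

# Extensions listed under Supported Audio Formats.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".aac", ".flac"}
# Assumption: "20 MB" interpreted as 20 MiB.
MAX_FILE_BYTES = 20 * 1024 * 1024


def check_audio_file(path):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension {extension!r}")
    if not os.path.isfile(path):
        problems.append("file not found")
        return problems
    size = os.path.getsize(path)
    if size == 0:
        problems.append("file is empty")
    elif size > MAX_FILE_BYTES:
        problems.append(f"file is {size} bytes, over the {MAX_FILE_BYTES}-byte limit")
    return problems
```

Running `check_audio_file("meeting_recording.mp3")` before the upload lets you report every problem at once instead of waiting for the API to reject the request.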

Incorrect Language Specification

Problem: Transcription results appear inaccurate or contain many errors.

Potential Solutions:

  • Verify you specified the correct language code
  • For multilingual content, choose the predominant language

Taking Your Transcriptions Further

Once you’ve successfully transcribed your audio content, consider these next steps:

  1. Semantic Analysis: Extract key topics, sentiments, and entities from your transcribed text
  2. Content Indexing: Make your audio searchable by indexing the transcript content
  3. Accessibility Compliance: Use transcripts to make your audio content accessible to all users
  4. Translation: Convert your transcript into other languages for global reach
  5. Summary Generation: Create concise summaries of longer transcribed content

Authorizations

x-api-key
string
header
required

The x-api-key is a custom header required for authenticating requests to our API. Include this header in your request with the appropriate API key value to securely access our endpoints. You can find your API key(s) in the 'API' section of our studio website.

Body

multipart/form-data

Response

200
application/json

Successful Response

A JSON object containing the unique identifier (task_id) for the task. It is returned when a create request is made to transcribe speech into text, and is used to query the status of the running transcription task.