The Speech to Text (STT) service allows you to transcribe audio files into text. It supports multiple Ugandan languages and English.

Supported Languages

Language      Code
Acholi        ach
Ateso         teo
English       eng
Luganda       lug
Lugbara       lgg
Runyankole    nyn

Audio Requirements

  • Formats: MP3, WAV, OGG, M4A, AAC
  • Duration Limit: Only the first 10 minutes of a file are transcribed. Longer audio is trimmed to its first 10 minutes.
  • File Size: Direct uploads are supported for files up to 100 MB.
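
The requirements above can be checked on the client before any upload is attempted. The following is a minimal sketch, not part of the service itself: the helper name choose_upload_route is hypothetical, the format check relies only on the file extension, and "100 MB" is assumed to mean 100 * 1024 * 1024 bytes. Duration is not checked, since the service trims long audio on its own.

```python
from pathlib import Path

# Extensions matching the formats listed above (assumption: the format can be
# inferred from the file extension alone).
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".m4a", ".aac"}

# Assumption: "100 MB" is interpreted as 100 * 1024 * 1024 bytes.
DIRECT_UPLOAD_LIMIT_BYTES = 100 * 1024 * 1024


def choose_upload_route(filename: str, size_bytes: int) -> str:
    """Return "direct" or "upload_url" for a file, or raise ValueError.

    "direct" means the file fits the direct-upload size limit;
    "upload_url" means it should go through the large-file workflow.
    """
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported audio format: {extension or '(none)'}")
    if size_bytes <= DIRECT_UPLOAD_LIMIT_BYTES:
        return "direct"
    return "upload_url"
```

For a file on disk, the size can be taken from os.path.getsize(path) and passed in as size_bytes.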

Transcribing Audio

Use the /tasks/stt endpoint for standard uploads.

Example Request

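import requests

url = "https://api.sunbird.ai/tasks/stt"
headers = {
    "Authorization": "Bearer <YOUR_TOKEN>"
}

# Open the file in a with-block so the handle is closed after the upload.
with open("recording.mp3", "rb") as audio_file:
    files = {"audio": audio_file}
    response = requests.post(url, files=files, headers=headers)

# Raise an exception on HTTP errors before reading the JSON body.
response.raise_for_status()
print(response.json())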

Response

{
  "output": {
    "text": "This is the transcribed text from the audio file.",
    "language": "eng"
  }
}
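
A small helper can pull the transcript out of a response shaped like the one above. This is a sketch that assumes only the two fields shown in the sample ("output.text" and "output.language"); the function name extract_transcript is hypothetical, and real responses may carry additional fields or a different error shape.

```python
from typing import Any, Dict, Tuple


def extract_transcript(payload: Dict[str, Any]) -> Tuple[str, str]:
    """Return (text, language) from a parsed STT response.

    Raises ValueError when the payload does not have the expected
    "output" object with a "text" field.
    """
    output = payload.get("output")
    if not isinstance(output, dict) or "text" not in output:
        raise ValueError("Response has no output.text field")
    # "language" is treated as optional; an empty string means "not reported".
    return output["text"], output.get("language", "")
```

With the requests example above, this would be called as extract_transcript(response.json()).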

Handling Large Files

For files larger than 100 MB, or to avoid server timeouts on slow uploads, we recommend using the Upload URL workflow:
  1. Generate an Upload URL: Call /tasks/generate-upload-url to get a signed Google Cloud Storage URL.
  2. Upload File: PUT the file directly to that URL.
  3. Process: Call /tasks/stt_from_gcs with the blob name to start transcription.
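
The three steps above could be chained roughly as follows. This is a sketch under stated assumptions, not the documented contract: the endpoint paths come from the steps above, but the request and response field names ("file_name", "content_type", "upload_url", "blob_name", "gcs_blob_name") are guesses made for illustration, as is sending step 3 as form data. Check the API reference for the actual names. The http argument is any object with requests-style post and put methods, such as a requests.Session().

```python
import mimetypes
import os
from typing import Any, Dict

BASE_URL = "https://api.sunbird.ai"


def transcribe_large_file(http: Any, path: str, token: str) -> Dict[str, Any]:
    """Upload a file via a signed URL, then request transcription.

    `http` must offer requests-style .post() and .put() methods.
    All JSON/form field names below are assumptions, not documented names.
    """
    headers = {"Authorization": f"Bearer {token}"}
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"

    # Step 1: ask the API for a signed Google Cloud Storage upload URL.
    response = http.post(
        f"{BASE_URL}/tasks/generate-upload-url",
        json={"file_name": os.path.basename(path), "content_type": content_type},
        headers=headers,
    )
    response.raise_for_status()
    upload = response.json()  # assumed keys: "upload_url", "blob_name"

    # Step 2: PUT the file bytes directly to the signed URL.
    with open(path, "rb") as audio_file:
        put_response = http.put(
            upload["upload_url"],
            data=audio_file,
            headers={"Content-Type": content_type},
        )
    put_response.raise_for_status()

    # Step 3: start transcription by referencing the uploaded blob.
    response = http.post(
        f"{BASE_URL}/tasks/stt_from_gcs",
        data={"gcs_blob_name": upload["blob_name"]},
        headers=headers,
    )
    response.raise_for_status()
    return response.json()
```

Passing the HTTP client in as an argument keeps the function easy to exercise with a stub, and lets callers reuse one requests.Session() across all three calls.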

Large File Upload Guide

See the API reference for detailed steps on handling large files.