Text to Speech

Generate Text-to-Speech Audio

curl --request POST \
  --url https://api.sunbird.ai/tasks/modal/tts \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "text": "<string>",
  "speaker_id": 248,
  "response_mode": "url"
}
'

{
  "success": true,
  "audio_url": "<string>",
  "expires_at": "2023-11-07T05:31:56Z",
  "file_name": "<string>",
  "duration_estimate_seconds": 123,
  "text_length": 123,
  "speaker_id": 248,
  "speaker_name": "<string>"
}

POST /tasks/modal/tts

Authorizations

Authorization (string, header, required)
The access token received from the authorization server in the OAuth 2.0 flow.
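
As a rough illustration of how the token is attached, the sketch below builds the same request as the curl example using only the Python standard library. The function name `build_tts_request` is hypothetical, and the sketch only constructs the request; it does not send it.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
TTS_URL = "https://api.sunbird.ai/tasks/modal/tts"


def build_tts_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request.

    `build_tts_request` is a hypothetical helper, not part of any SDK.
    """
    if not token:
        raise ValueError("an access token is required")
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The token goes in the Authorization header as a Bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; error handling and retries are left out of this sketch.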

Body

application/json

Request model for TTS generation.

text (string, required)
Text to convert to speech.
Required string length: 1 to 10000
Example: "Hello, this is a text-to-speech test."

speaker_id (enum<integer>, default: 248)
Speaker voice for TTS generation.
Available options: 241, 242, 243, 245, 246, 248
Examples: 248, 246

response_mode (enum<string>, default: url)
How to return the audio: 'url' for a signed URL, 'stream' for streaming, 'both' for streaming with a final URL.
Available options: url, stream, both
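
The constraints above can be checked on the client before a request is sent. The following is a minimal sketch under that assumption; `make_tts_body` and the two constants are hypothetical names, and the server remains the authority on what it accepts.

```python
# Allowed values copied from the parameter descriptions above.
ALLOWED_SPEAKER_IDS = {241, 242, 243, 245, 246, 248}
ALLOWED_RESPONSE_MODES = {"url", "stream", "both"}


def make_tts_body(text, speaker_id=248, response_mode="url"):
    """Return a request body dict, raising ValueError on invalid input.

    Hypothetical helper: defaults mirror the documented defaults
    (speaker_id 248, response_mode "url").
    """
    if not isinstance(text, str) or not 1 <= len(text) <= 10000:
        raise ValueError("text must be a string of length 1 to 10000")
    if speaker_id not in ALLOWED_SPEAKER_IDS:
        raise ValueError(f"unsupported speaker_id: {speaker_id!r}")
    if response_mode not in ALLOWED_RESPONSE_MODES:
        raise ValueError(f"unsupported response_mode: {response_mode!r}")
    return {
        "text": text,
        "speaker_id": speaker_id,
        "response_mode": response_mode,
    }
```

For example, `make_tts_body("Hello, this is a text-to-speech test.")` yields a body with the default speaker and URL response mode, ready to be serialized as JSON.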

Response

Audio generated successfully

Response model for TTS generation (URL mode).

success (boolean, required)
Whether the request was successful.

audio_url (string, required)
Signed URL to access the audio file.

expires_at (string<date-time>, required)
When the signed URL expires.

file_name (string, required)
Name of the audio file in storage.

duration_estimate_seconds (number | null)
Estimated audio duration in seconds.

text_length (integer | null)
Length of the input text.

speaker_id (enum<integer> | null)
Speaker voice used for generation.
Available options: 241, 242, 243, 245, 246, 248

speaker_name (string | null)
Human-readable speaker name.
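
Because the signed URL expires, a client may want to check `expires_at` before reusing a stored response. The sketch below shows one way to do that for a URL-mode response that has already been decoded from JSON. The helper names are hypothetical, and treating a timestamp without an offset as UTC is an assumption rather than documented behavior.

```python
from datetime import datetime, timezone


def parse_expiry(expires_at: str) -> datetime:
    """Parse an ISO 8601 timestamp such as "2023-11-07T05:31:56Z"."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    parsed = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def usable_audio_url(response: dict, now: datetime = None):
    """Return audio_url if the request succeeded and the URL has not expired.

    Returns None otherwise. `now` can be injected for testing.
    """
    if not response.get("success"):
        return None
    if now is None:
        now = datetime.now(timezone.utc)
    if parse_expiry(response["expires_at"]) <= now:
        return None
    return response["audio_url"]
```

When `usable_audio_url` returns `None` for a stored response, requesting the audio again is the simplest way to obtain a fresh signed URL.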
