Convert Speech to text

Speech to Text

curl --request POST \
  --url https://api.pawa-ai.com/v1/voice/speech-to-text \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form files='@example-file' \
  --form model=pawa-stt-v1-20240701 \
  --form language=English \
  --form is_speaker_diarization=false \
  --form 'prompt=Nipe maneno yaliyokwenye hii audio' \
  --form temperature=0.1

{
  "success": true,
  "message": "Audio transcribed succesfully",
  "data": {
    "transcriptions": [
      {
        "filename": "innocent.wav",
        "transcript": "Hello, my name is Innocent Charles, welcome to Pawa AI"
      }
    ]
  }
}
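The curl call above can be sketched in Python using only the standard library. This is a minimal illustration rather than an official client: the helper names `encode_multipart`, `build_request`, and `transcribe` are hypothetical, and the `audio/wav` default content type is an assumption chosen from the accepted types listed for `files`.

```python
import json
import urllib.request
import uuid
from pathlib import Path

API_URL = "https://api.pawa-ai.com/v1/voice/speech-to-text"


def encode_multipart(fields, filename, file_bytes, content_type="audio/wav"):
    """Encode text fields plus one audio file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        header = '--' + boundary + '\r\nContent-Disposition: form-data; name="' + name + '"\r\n\r\n'
        chunks.append((header + str(value) + '\r\n').encode("utf-8"))
    # The file part uses the form field name "files", as in the curl example.
    file_header = (
        '--' + boundary + '\r\n'
        'Content-Disposition: form-data; name="files"; filename="' + filename + '"\r\n'
        'Content-Type: ' + content_type + '\r\n\r\n'
    )
    chunks.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    chunks.append(("--" + boundary + "--\r\n").encode("utf-8"))
    return b"".join(chunks), "multipart/form-data; boundary=" + boundary


def build_request(token, filename, file_bytes):
    """Build (but do not send) the POST request with the documented example values."""
    fields = {
        "model": "pawa-stt-v1-20240701",
        "language": "English",
        "is_speaker_diarization": "false",
        "prompt": "Nipe maneno yaliyokwenye hii audio",
        "temperature": "0.1",
    }
    body, content_type = encode_multipart(fields, filename, file_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Authorization": "Bearer " + token, "Content-Type": content_type},
    )


def transcribe(token, path):
    """Send the audio file at `path` and return the decoded JSON response."""
    audio = Path(path)
    request = build_request(token, audio.name, audio.read_bytes())
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `transcribe("<token>", "innocent.wav")` would perform the network request; `build_request` is kept separate so the request can be inspected without sending it.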

POST /v1/voice/speech-to-text


Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

multipart/form-data

files (file, required)

Audio files to be transcribed (maximum 10 per request). Note: a real application can upload multiple files as a list, but the interactive docs accept only one file for testing. Accepted types: audio/mp3, audio/mpeg, audio/x-mp3, audio/wav, audio/wave, audio/x-wav, audio/x-pn-wav, audio/aac, audio/m4a, audio/x-m4a, audio/x-mp4, audio/ogg, audio/opus, audio/x-ms-wma, audio/wma.
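The file limits above can be checked client-side before uploading. The following is a minimal sketch: `check_audio_files` and the `EXTENSION_TO_TYPE` mapping are hypothetical, and the pairing of file extensions with MIME types is an assumption; only the MIME type names and the limit of 10 files come from the parameter description.

```python
from pathlib import Path

MAX_FILES = 10  # documented maximum number of files per request

# Assumed mapping from file extension to one of the documented accepted types.
EXTENSION_TO_TYPE = {
    ".mp3": "audio/mpeg",
    ".wav": "audio/wav",
    ".aac": "audio/aac",
    ".m4a": "audio/m4a",
    ".ogg": "audio/ogg",
    ".opus": "audio/opus",
    ".wma": "audio/x-ms-wma",
}


def check_audio_files(paths):
    """Return (path, mime_type) pairs, or raise ValueError if the batch is invalid."""
    if not paths:
        raise ValueError("at least one audio file is required")
    if len(paths) > MAX_FILES:
        raise ValueError("at most %d files are allowed, got %d" % (MAX_FILES, len(paths)))
    checked = []
    for path in paths:
        suffix = Path(path).suffix.lower()
        if suffix not in EXTENSION_TO_TYPE:
            raise ValueError("unsupported audio file: %s" % path)
        checked.append((path, EXTENSION_TO_TYPE[suffix]))
    return checked
```

For example, `check_audio_files(["innocent.wav"])` yields a single pair with the type `audio/wav`, while a batch of 11 files raises `ValueError` before any request is made.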

model (enum<string>, required)

Available options: pawa-stt-v1-20240701

Example: "pawa-stt-v1-20240701"

language (enum<string>, required)

Available options: English, Swahili, Luo, Meru, Kamba, Kikuyu, Hausa, Igbo, Yoruba, Pidgin, Zulu, Tswana, Afrikaans, Xhosa, Nyankole, Ganda, Lugbara

Example: "English"

is_speaker_diarization (enum<boolean>)

Available options: true, false

Example: false

prompt (string)

Example: "Nipe maneno yaliyokwenye hii audio"

temperature (number, default: 0.1)

Example: 0.1
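The body parameters above can be assembled and validated before sending. This is a minimal sketch under stated assumptions: `build_form_fields` is a hypothetical helper, and treating `is_speaker_diarization`, `prompt`, and `temperature` as optional is inferred from the absence of a "required" marker on those fields.

```python
MODELS = {"pawa-stt-v1-20240701"}
LANGUAGES = {
    "English", "Swahili", "Luo", "Meru", "Kamba", "Kikuyu", "Hausa", "Igbo",
    "Yoruba", "Pidgin", "Zulu", "Tswana", "Afrikaans", "Xhosa", "Nyankole",
    "Ganda", "Lugbara",
}


def build_form_fields(language, model="pawa-stt-v1-20240701",
                      is_speaker_diarization=None, prompt=None, temperature=None):
    """Return the text form fields as strings, rejecting undocumented enum values."""
    if model not in MODELS:
        raise ValueError("unknown model: %r" % model)
    if language not in LANGUAGES:
        raise ValueError("unsupported language: %r" % language)
    fields = {"model": model, "language": language}
    # Fields assumed optional are only included when the caller sets them.
    if is_speaker_diarization is not None:
        fields["is_speaker_diarization"] = "true" if is_speaker_diarization else "false"
    if prompt is not None:
        fields["prompt"] = prompt
    if temperature is not None:
        fields["temperature"] = str(temperature)
    return fields
```

Booleans are serialized as the lowercase strings `true` and `false` to match the curl example, since multipart form values are sent as text.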

Response

Audio files transcribed successfully

success (boolean, required)

Example: true

message (string, required)

Example: "Audio transcribed succesfully"

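data (object, required)

In the example response, data contains a transcriptions array. Each entry has a filename (string) and a transcript (string).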
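A response with the shape shown in the example can be reduced to a filename-to-transcript mapping. This is a minimal sketch: `extract_transcripts` is a hypothetical helper, and the handling of `success: false` is an assumption, since the error response format is not documented here.

```python
import json


def extract_transcripts(payload):
    """Map each filename to its transcript from a decoded response body."""
    # Assumption: a failed call sets "success" to false and explains why in "message".
    if not payload.get("success"):
        raise RuntimeError(payload.get("message", "transcription failed"))
    items = payload.get("data", {}).get("transcriptions", [])
    return {item["filename"]: item["transcript"] for item in items}


# The example response from this page, used as sample input.
sample = json.loads("""
{
  "success": true,
  "message": "Audio transcribed succesfully",
  "data": {
    "transcriptions": [
      {
        "filename": "innocent.wav",
        "transcript": "Hello, my name is Innocent Charles, welcome to Pawa AI"
      }
    ]
  }
}
""")

transcripts = extract_transcripts(sample)
```

With the sample above, `transcripts["innocent.wav"]` holds the transcribed sentence for that file.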
