Text-to-Speech (TTS)

Convert text responses into natural, expressive speech across multiple languages and voices.
Ideal for voice assistants, narration, accessibility, and interactive experiences.
Features:
  • Expressive, natural voices
  • Low-latency streaming
  • Multilingual support (English, Swahili, and other African languages)
  • Flexible output formats (MP3, WAV, streaming response)
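
As a rough illustration of how an output format from the list above might be selected, here is a minimal sketch that assembles a TTS request body. The field names (`text`, `language`, `output_format`, `stream`), the format strings, and the helper `build_tts_request` are all hypothetical; they are not taken from the Pawa API reference, which should be consulted for the real schema.

```python
import json

# Formats named in the feature list above; the exact string values are assumptions.
SUPPORTED_FORMATS = {"mp3", "wav"}


def build_tts_request(text, language="sw", output_format="mp3", stream=False):
    """Assemble a hypothetical TTS request body as a plain dict."""
    if not text.strip():
        raise ValueError("text must not be empty")
    fmt = output_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    return {
        "text": text,
        "language": language,   # hypothetical language code field
        "output_format": fmt,   # normalized to lowercase
        "stream": stream,       # hypothetical flag for a streaming response
    }


if __name__ == "__main__":
    # Print the body that would be sent; no network call is made here.
    print(json.dumps(build_tts_request("Habari ya asubuhi", stream=True), indent=2))
```

Validating the format on the client before sending keeps a typo such as `"mp4"` from turning into a round trip to the server.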

Speech-to-Text (STT)

Accurately transcribe audio into text in real time or in batch mode. Well suited to meeting notes, captions, accessibility, and voice commands.
Features:
  • High transcription accuracy
  • Real-time streaming or async batch processing
  • Multilingual speech recognition (optimized for African accents and Swahili)
  • Works in noisy environments
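
To make the difference between streaming and batch processing concrete, the sketch below splits raw audio into small fixed-size chunks, which is how a client would typically feed a streaming transcriber instead of uploading one whole file. The helper `iter_audio_chunks` is hypothetical, and the default chunk size assumes 16 kHz, 16-bit mono PCM (3200 bytes is about 100 ms); the audio format the Pawa STT service actually expects is not stated here.

```python
def iter_audio_chunks(audio: bytes, chunk_size: int = 3200):
    """Yield successive chunks of raw audio bytes.

    The default of 3200 bytes is an assumption: 16000 samples/s * 2 bytes
    per sample * 0.1 s, i.e. about 100 ms of 16 kHz 16-bit mono PCM.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        # The final chunk may be shorter than chunk_size.
        yield audio[start:start + chunk_size]


if __name__ == "__main__":
    fake_audio = b"\x00" * 7000  # placeholder silence, not real speech
    for i, chunk in enumerate(iter_audio_chunks(fake_audio)):
        print(f"chunk {i}: {len(chunk)} bytes")
```

In batch mode the same bytes would be submitted in a single request, trading latency for simplicity.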

Voice models include:

  • Pawa Text-to-Speech (TTS) → Natural, expressive multilingual voices (supports Swahili, English, and African accents). Optimized for real-time streaming and low latency.
  • Pawa Speech-to-Text (STT) → Accurate, multilingual speech recognition tuned for African languages and noisy environments. Supports streaming transcription and batch mode.