Text-to-Speech (TTS)
Convert text responses into natural, expressive speech across multiple languages and voices. Ideal for voice assistants, narration, accessibility, and interactive experiences.
Features:
- Expressive, natural voices
- Low-latency streaming
- Multilingual support (English, Swahili, and other African languages)
- Flexible output formats (MP3, WAV, streaming response)
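To illustrate the low-latency streaming and output-format points above, here is a minimal sketch of how a client might write streamed audio to disk as chunks arrive. It is not the service's API: `save_audio_stream` is a hypothetical helper, and `chunks` stands in for whatever streaming response body your HTTP client exposes (assumed here to be an iterable of `bytes`). How the service itself selects MP3 or WAV is not specified in this section, so the sketch only checks the file extension.

```python
from pathlib import Path
from typing import Iterable

# Formats named in the feature list above.
SUPPORTED_FORMATS = {"mp3", "wav"}


def save_audio_stream(chunks: Iterable[bytes], path: str) -> int:
    """Write audio chunks to `path` as they arrive and return the byte count.

    `chunks` is a stand-in for a streaming response body (an assumption);
    any iterable of bytes objects works.
    """
    fmt = Path(path).suffix.lstrip(".").lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output format: {fmt!r}")

    total = 0
    with open(path, "wb") as out:
        for chunk in chunks:
            if not chunk:  # skip empty or keep-alive chunks
                continue
            out.write(chunk)
            total += len(chunk)
    return total
```

Because each chunk is written as soon as it is received, a player reading the same data could begin playback before synthesis finishes, which is the practical benefit of a streaming response over waiting for a complete file.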
Speech-to-Text (STT)
Accurately transcribe audio into text in real time or in batch mode. Perfect for meeting notes, captions, accessibility, and voice commands.
Features:
- High transcription accuracy
- Real-time streaming or async batch processing
- Multilingual speech recognition (optimized for African accents and Swahili)
- Works in noisy environments
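The difference between real-time streaming and batch processing mostly comes down to how audio is handed to the recognizer: batch mode sends a whole recording at once, while streaming sends small frames as they are captured. The sketch below shows the client-side framing step only. It assumes 16 kHz, 16-bit mono PCM and 20 ms frames, which are common choices but are not stated in this section; `iter_pcm_frames` is a hypothetical helper, not part of any Pawa SDK.

```python
from typing import Iterator


def iter_pcm_frames(
    pcm: bytes,
    sample_rate: int = 16000,  # assumed sample rate in Hz
    sample_width: int = 2,     # assumed bytes per sample (16-bit)
    frame_ms: int = 20,        # assumed frame duration in milliseconds
) -> Iterator[bytes]:
    """Yield consecutive frames of `frame_ms` milliseconds from mono PCM audio.

    The final frame may be shorter if the audio length is not a multiple
    of the frame size.
    """
    frame_bytes = (sample_rate * frame_ms // 1000) * sample_width
    if frame_bytes <= 0:
        raise ValueError("frame size must be positive")
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


# One second of silence at the assumed settings, split for streaming.
one_second = bytes(16000 * 2)
frames = list(iter_pcm_frames(one_second))
```

In a streaming setup each frame would be sent as soon as it is produced; in batch mode the unsplit recording would be uploaded in a single request instead.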
Voice models include:
- Pawa Text-to-Speech (TTS) → Natural, expressive multilingual voices (supports Swahili, English, and African accents). Optimized for real-time streaming and low latency.
- Pawa Speech-to-Text (STT) → Accurate, multilingual speech recognition tuned for African languages and noisy environments. Supports streaming transcription and batch mode.