Our Language is Our Strength
1,800 Audio Hours
1.9M+ Authentic Sentences
2.0M+ Clips
NaijaVoices | Our Language is Our Strength
Across the globe, major speech recognition platforms like Amazon’s Alexa, Apple’s Siri, and Google Home dominate the market, yet they neglect the rich tapestry of African languages. Not a single native African language is supported by today’s widely used voice technologies. Our groundbreaking project and community, NaijaVoices, aims to change this narrative by compiling extensive audio datasets in African languages. So far, our community has created 1,800 hours of authentic speech (from over 5,000 diverse speakers!) and expert-curated text in Igbo, Hausa, and Yoruba.
This dataset will not only fuel advances in machine learning but also catalyze the development of cutting-edge speech technologies in artificial intelligence. From education and healthcare to agriculture and finance, the impact of this initiative will be felt across diverse sectors, accelerating AI-driven innovation.
Explore Our Resources
NaijaVoices Dataset (DOI: 10.57967/hf/3257)
Description: The largest Nigerian speech dataset, encompassing more than 5,000 speakers.
Features: 1,800+ hours, 3 languages, 5,000+ speakers.