Open Voice Technology in Rwanda

Deutsche Gesellschaft für Internationale Zusammenarbeit (GIZ)

THE REPRESENTATION PROBLEM OF VOICE TECHNOLOGY

The future of human-machine interaction lies in voice control. According to Gartner, 30% of web browsing sessions will be conducted by voice by 2020. However, developers, researchers and startups around the globe who work on voice-recognition technology share one problem: a lack of freely available voice data in their own languages to train AI-powered speech-to-text engines.

Indeed, while there are approximately 7,100 living languages in the world today, more than 50% of Internet content is in English. Local languages are already severely underrepresented, and voice-enabled applications and services could further deepen this inequality of access and inclusion.

“Currently, neither Amazon’s Alexa, Apple’s Siri, nor Google Home, the main players in the global voice assistants market, support a single native African language.”

Remy Muhire

Community Lead, Voice Technology at Mozilla

BUILDING COMMON VOICE-DATA INFRASTRUCTURES

Although machine-learning algorithms like Mozilla’s Deep Speech are openly available, training data is limited. Most of the voice data used by large corporations is unavailable to the majority of people, expensive to obtain, or simply non-existent for languages that are not widely spoken globally. The innovative potential of this technology remains largely untapped.

By providing open datasets, this project aims to remove the onerous tasks of collecting and annotating data. This lowers one of the main barriers to voice-based technologies and makes cutting-edge innovations accessible to more entrepreneurs.

With voice interaction available in their own language, this project may give millions of people access to information, make technologies more inclusive and ultimately foster a just, locally rooted yet global digital transformation.

The GIZ Innovation Fund Team and Mozilla co-hosted an ideation hackathon in Kigali on how to create the right incentives for data collection in Kinyarwanda and to lay the foundation for local voice-recognition applications. The hackathon gave rise to Digital Umuganda, a Rwandan startup that proposed a solution built on a mechanism rooted in Rwandan culture: ‘Umuganda’, a concept of self-help and cooperation. On the last Saturday of every month, people gather in their communities and pool their efforts to build physical infrastructure such as roads and schools. Digital Umuganda brought this concept into the digital age to help build digital infrastructure such as voice data.

Innovation Fund Team – Local project coordination	Jan Krewer
FAIR Forward – Artificial Intelligence for All (GIZ)	Lea Gimpel / Balthas Seibold
Common Voice (Mozilla Foundation)	Jane Polak Scowcroft
Digital Umuganda	Audace Niyonkuru
Mainlevel Consulting	Daniel Brumund

1,211 hours of Kinyarwanda voice data were collected by Digital Umuganda in less than a year, from a diverse set of over 420 contributors.

CONTRIBUTE TO MAKING VOICE TECH MORE OPEN!