The Accent Gap: What it is, and how to tackle it

Here is a quick introduction to the Accent Gap.
What is the Accent Gap?

The accent gap is the inability of voice technologies to understand speakers with non-native or regional accents with the same accuracy as the majority of speakers.
It results from the under-representation of certain segments of the population in the voice datasets used for training the underlying AI models.
The accent gap has been widely documented for speakers of African-American English. However, it is present in other languages as well, for instance Spanish and French.
As a result, non-native speakers use dialogue-based interactions less often than native speakers do, and report only moderate levels of satisfaction with them.
For instance, a study has shown that non-native English speakers in Thailand are more satisfied performing simple tasks on their voice-based devices than complex ones that involve dialogue-based scenarios and multiple levels of interaction.
Poorly representative datasets can exhibit other biases that also jeopardise the accuracy of voice technologies. The most notable is gender bias.
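One way to make the gap concrete is to compare the word error rate (WER) of a speech recogniser across speaker groups. The following is a minimal sketch in Python, not the measurement used by any particular study: the group labels, the function names (word_edit_distance, wer_by_group) and the sample transcripts are all made up for illustration.

```python
from collections import defaultdict


def word_edit_distance(reference, hypothesis):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_group(samples):
    """Return WER per group.

    samples: iterable of (group, reference_text, recognised_text).
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, reference, hypothesis in samples:
        ref_words = reference.lower().split()
        hyp_words = hypothesis.lower().split()
        errors[group] += word_edit_distance(ref_words, hyp_words)
        words[group] += len(ref_words)
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


# Hypothetical transcripts, for illustration only (not real data).
samples = [
    ("majority", "turn on the kitchen lights", "turn on the kitchen lights"),
    ("majority", "what is the weather today", "what is the weather today"),
    ("regional", "turn on the kitchen lights", "turn on the chicken light"),
    ("regional", "what is the weather today", "what is the weather to day"),
]

rates = wer_by_group(samples)
# The gap is the difference in error rate between the two groups.
gap = rates["regional"] - rates["majority"]
print(rates, gap)
```

The same comparison works for any grouping of speakers, for example by gender, provided each test utterance is labelled with its group.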

How to Tackle the Accent Gap

The situation can only be improved by collecting more voice data from under-represented segments of the population.
Yet collecting such data may compromise users’ privacy.
To address this, COMPRISE trains privacy-preserving models from anonymised voice data and adapts them to each user’s accent locally on the user’s device.
The novel COMPRISE approach improves the accuracy of speech and dialogue recognition for all users, thus boosting user experience and inclusiveness.

Developer survey: Since you are here and interested in our project, could you please spare a moment to share your concerns and answer 12 questions related to developing voice-enabled apps?