The Accent Gap: What it is, and how to tackle it.
Here is a quick introduction to the Accent Gap.
What is the Accent Gap?
- The accent gap is the inability of voice technologies to understand speakers with non-native or regional accents with the same accuracy as the majority of speakers.
- It results from the under-representation of certain segments of the population in the voice datasets used for training the underlying AI models.
- The accent gap affecting speakers of African-American English has been widely discussed. However, it is present in other languages as well, for instance Spanish or French.
- As a result, non-native speakers engage in dialogue-based interactions less often than native speakers, and report only moderate levels of satisfaction when they do.
- For instance, a study has shown that non-native English speakers in Thailand are more satisfied performing simple tasks on their voice-based devices than complex ones that involve dialogue-based scenarios and multiple levels of interaction.
- Poorly representative datasets can exhibit other biases that also jeopardise the accuracy of voice technologies. The most notable is gender bias.
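One common way to put a number on the gap described above is to compare the word error rate (WER) of a recogniser across speaker groups. The sketch below is illustrative only: the group labels, the `wer_by_group` helper and the transcripts are made up for this example and do not come from COMPRISE or from any real measurement.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_group(samples):
    """Return {group: WER}, pooling errors and reference words per group."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, reference, hypothesis in samples:
        ref, hyp = reference.lower().split(), hypothesis.lower().split()
        errors[group] += edit_distance(ref, hyp)
        words[group] += len(ref)
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


# Hypothetical (reference, recognised) transcripts, for illustration only.
samples = [
    ("majority", "turn on the kitchen lights", "turn on the kitchen lights"),
    ("majority", "play some jazz music", "play some jazz music"),
    ("regional", "turn on the kitchen lights", "turn on the chicken light"),
    ("regional", "play some jazz music", "play some just music"),
]

rates = wer_by_group(samples)
gap = rates["regional"] - rates["majority"]
print(rates, gap)
```

A positive `gap` means the recogniser makes more errors per reference word for the second group. In practice the comparison would need many speakers per group and matched recording conditions before any conclusion could be drawn.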
How to Tackle the Accent Gap.
- The situation can only be improved by collecting more voice data from under-represented segments of the population. Yet this may compromise users’ privacy.
- However, COMPRISE trains privacy-preserving models from anonymised voice data and adapts them to each user’s accent locally on the user’s device.
- The novel COMPRISE approach improves the accuracy of speech and dialogue recognition for all users, thus boosting user experience and inclusiveness.
Developer survey: Since you are here and interested in our project, could you please spare a moment to share your concerns and answer 12 questions about developing voice-enabled apps?