With the COMPRISE SDK, developers can create multilingual, voice-enabled applications in a faster, more cost-effective, and privacy-driven way. The SDK is made for Smartphone applications developed with the Ionic framework, with Angular as a foundation. It consists of:
- the COMPRISE Personal Server, which allows the execution of large related services outside the Smartphone, while still preserving privacy;
- the COMPRISE App Wizard, which helps developers to do all necessary configuration in a quick and easy way;
- the COMPRISE Client Library, which can be deployed on any Android or iOS device and integrates all required voice functionalities (Speech-to-Text, Spoken Language Understanding, Dialog Management, Spoken Language Generation, Text-to-Speech) together with Machine Translation.
These three components must be installed and configured in the order listed above.
COMPRISE PLATFORM
The COMPRISE Platform is a cloud-based platform designed to:
- collect anonymized speech and text data from Smartphone apps that use the COMPRISE SDK,
- curate and label this data,
- train Speech-to-Text and Spoken Language Understanding models on this data,
- provide access to these models.
All these functionalities can be accessed via a web service API and interfaces for Developers, Data Annotators, and Administrators.
COMPRISE VOICE TRANSFORMER
COMPRISE Voice Transformer is part of the COMPRISE SDK. It increases privacy by converting each person’s voice into another person’s voice, while preserving the spoken message. It:
- ensures that any information extracted from the transformed voice can hardly be traced back to the original speaker, as validated through state-of-the-art biometric protocols;
- preserves the utility of the transformed data for training Speech-to-Text models;
- leverages cutting-edge deep learning and speech processing technology;
- can be followed by the Voice Builder, which further discards sensitive words and expressions.
COMPRISE TEXT TRANSFORMER
COMPRISE Text Transformer is part of the COMPRISE SDK. It allows users in various application domains to mask out critical information in a text that would otherwise threaten the privacy of third parties, while preserving the sentence structure. It:
- replaces words and expressions carrying personal information with random alternatives, focusing on persons’ names, organisations, locations, dates and times;
- is applicable to all kinds of text documents in addition to spoken dialogues;
- leverages cutting-edge deep learning and natural language processing technology;
- provides formal differential privacy guarantees.
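The replacement step described in the list above can be illustrated with a minimal Python sketch. It is not the COMPRISE implementation: the function name `mask_entities`, the `ALTERNATIVES` pools and the span format are hypothetical, the entity spans are assumed to come from an upstream named-entity recogniser, and plain uniform sampling as shown here does not by itself give any differential privacy guarantee.

```python
import random

# Hypothetical replacement pools; a real system would use far larger lists.
ALTERNATIVES = {
    "PERSON": ["Alex Martin", "Sam Keller", "Robin Duval"],
    "ORG": ["Acme Corp", "Globex", "Initech"],
    "LOC": ["Lyon", "Porto", "Graz"],
    "DATE": ["3 March", "17 June", "9 October"],
    "TIME": ["9 am", "2:30 pm", "6 pm"],
}


def mask_entities(text, entities, rng=None):
    """Replace each tagged span with a random alternative of the same type.

    `entities` is a list of (start, end, label) character spans, assumed to be
    non-overlapping and produced by a separate entity recogniser.
    """
    rng = rng or random.Random()
    pieces = []
    cursor = 0
    for start, end, label in sorted(entities):
        pieces.append(text[cursor:start])  # keep the text between entities
        pool = ALTERNATIVES.get(label)
        # Spans with an unknown label are left untouched in this sketch.
        pieces.append(rng.choice(pool) if pool else text[start:end])
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)
```

Because only the tagged spans are swapped, the words around them, and therefore the sentence structure, stay as they were.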
COMPRISE WEAKLY SUPERVISED STT
COMPRISE Weakly Supervised STT is part of the COMPRISE Platform. It makes it possible to train Speech-to-Text models while reducing the need for time-consuming and expensive manual data transcription. It consists of two modules:
- an Automated Labelling module that processes untranscribed speech utterances and outputs, for every utterance, one or more text transcriptions that exploit specific information about the dialogue domain;
- a Machine Learning module that takes the transcribed sentences as inputs (and possibly additional manually transcribed sentences) and outputs trained acoustic and language models to be used by a Speech-to-Text system.
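How the output of the first module might feed the second can be sketched in Python as follows. The source does not say how the automatic transcriptions are selected or weighted, so the confidence scores, the `min_confidence` threshold and the names `Hypothesis` and `build_training_set` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    confidence: float  # assumed to lie in [0, 1]


def build_training_set(auto_labels, manual_labels, min_confidence=0.8):
    """Merge manual transcripts with confident automatic ones.

    auto_labels:   dict utterance_id -> list of Hypothesis (one or more each)
    manual_labels: dict utterance_id -> str (trusted manual transcriptions)
    Returns a dict utterance_id -> transcription used for training.
    """
    training = dict(manual_labels)  # manual transcripts always take priority
    for utt_id, hypotheses in auto_labels.items():
        if utt_id in training or not hypotheses:
            continue
        best = max(hypotheses, key=lambda h: h.confidence)
        # Keep only automatic transcriptions the labeller is confident about.
        if best.confidence >= min_confidence:
            training[utt_id] = best.text
    return training
```

Keeping only the single best hypothesis is the simplest option; a training procedure could also use all the alternatives for an utterance.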
COMPRISE WEAKLY SUPERVISED NLU
COMPRISE Weakly Supervised NLU is part of the COMPRISE Platform. It enables customers to reduce the need for time-consuming and expensive manual data labelling when training a Natural Language Understanding system. It consists of two modules:
- an Automated Sequence Labelling module that processes unlabelled text sentences and outputs a (noisy) label for each sentence or each token in a sentence;
- a Machine Learning module that learns Natural Language Understanding models from these noisy labels.
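One common way to obtain noisy sentence-level labels is to combine several simple heuristics by majority vote. The Python sketch below illustrates that general idea only. The source does not describe how the Automated Sequence Labelling module works, and the labelling functions, intent names and `noisy_label` helper are hypothetical.

```python
import re
from collections import Counter

ABSTAIN = None  # returned when a heuristic has no opinion


def _words(sentence):
    """Lower-cased word set, ignoring punctuation."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


# Hypothetical labelling functions: each one votes for an intent or abstains.
def lf_greet(sentence):
    return "greet" if _words(sentence) & {"hello", "hi"} else ABSTAIN


def lf_book(sentence):
    return "book" if _words(sentence) & {"book", "reserve"} else ABSTAIN


def lf_table(sentence):
    return "book" if "table" in _words(sentence) else ABSTAIN


LABELLING_FUNCTIONS = [lf_greet, lf_book, lf_table]


def noisy_label(sentence, labelling_functions=LABELLING_FUNCTIONS):
    """Return the most frequent non-abstaining vote, or ABSTAIN if none."""
    votes = [lf(sentence) for lf in labelling_functions]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]
```

Labels produced this way are imperfect by construction, which is why the second module has to learn from noisy rather than clean supervision.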
COMPRISE SPEECH-TO-TEXT TRANSLATION
COMPRISE Speech-to-Text Translation combines Speech-to-Text and Machine Translation so that every user can speak their own language when interacting with a dialogue system that internally uses a different language. Instead of chaining separately trained Speech-to-Text and Machine Translation models in a pipeline, the Machine Translation system is trained to handle Speech-to-Text errors and disfluencies, which reduces translation errors.
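One way to expose a Machine Translation system to such errors during training is to add noisy copies of the source sentences to the parallel corpus. The source does not state which technique COMPRISE uses, so the Python sketch below is an assumption-laden illustration; the filler list, the confusion table, the probabilities and the function names are all hypothetical.

```python
import random

FILLERS = ["uh", "um", "you know"]  # hypothetical disfluency inventory


def add_stt_noise(sentence, rng, p_drop=0.1, p_filler=0.1, p_sub=0.1,
                  confusions=None):
    """Return a noisy copy of `sentence` imitating STT errors and disfluencies.

    `confusions` maps a word to a plausible misrecognition (e.g. a homophone).
    """
    confusions = confusions or {}
    noisy = []
    for word in sentence.split():
        if rng.random() < p_filler:
            noisy.append(rng.choice(FILLERS))  # inserted disfluency
        if rng.random() < p_drop:
            continue  # simulated deletion error
        if word in confusions and rng.random() < p_sub:
            word = confusions[word]  # simulated substitution error
        noisy.append(word)
    return " ".join(noisy)


def augment_parallel_corpus(pairs, rng, copies=2, **noise_kwargs):
    """Keep the clean (source, target) pairs and add noisy-source copies."""
    augmented = list(pairs)
    for source, target in pairs:
        for _ in range(copies):
            noisy_source = add_stt_noise(source, rng, **noise_kwargs)
            augmented.append((noisy_source, target))
    return augmented
```

The target side is left clean, so a model trained on the augmented pairs learns to map imperfect transcriptions to correct translations.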