Speech Recognition in Estonian Courts

How Technology is Changing Court Reporting

A Brief History of Court Reporting and Stenography

As Adam Brooks writes in his essay "The History of Court Reporting and Stenography", technological advancements in the world of court reporting and stenography begin almost with the beginning of writing itself. In ancient Mesopotamia, the Sumerians developed a system of writing called cuneiform. Named after the Latin word for "wedge", this form of writing used pictographs pressed into clay with a wedge-shaped stylus. Cuneiform survived as a writing system in this region from 3500-3000 BC until some point after 100 BC, adopting phonograms (a symbol or combination of symbols that represents a sound) as it matured. Chinese imperial courts, meanwhile, were using shorthand to expedite the recording of confessions. The development of Western forms of shorthand began with the Romans and Cicero. The courts of the Roman Republic used abbreviated Latin letters, Greek symbols and other signs, which were later systematized into around 5,000 marks known as Tironian Notes. Monks in the Middle Ages revived the system after a long period in which shorthand had been associated with witchcraft and magic.
Fast forward to the 17th century, when a variety of shorthand systems and manuals were introduced by many innovators, including John Willis, Thomas Shelton, Jeremiah Rich, and William Mason. One of the few prominent systems to come out of the 18th century was crafted by Samuel Taylor, whose work inspired Sir Isaac Pitman to improve the art form by recording sounds as symbols in his own system, Stenographic Sound-Hand, which he published in 1837. Benn Pitman, Sir Isaac's brother, applied minor modifications to this system when he introduced it in America. Because the symbols captured sounds instead of letters, Pitman's shorthand system could easily be adapted to a wide range of languages around the world. In 1877, Miles Bartholomew created the first shorthand machine, and nearly 30 years after that, in 1906, the American court reporter and stenographer Ward Stone Ireland introduced the first commercial Stenotype machine. This ground-breaking machine's design is still present in the keyboards that stenographers and court reporters use today.
As for the use of sound recording devices in court hearings, the history goes back to the late 19th century, when Thomas Edison, one of the greatest American inventors, further developed the phonautograph patented by the French printer, bookseller and inventor Édouard-Léon Scott de Martinville, and in 1877 introduced the phonograph, a device for the mechanical recording and reproduction of sound. This great invention was tested on a wide range of occasions, including recording the speeches of judges and defence lawyers at court hearings in the US and elsewhere. These remained occasional experiments, however: the technology did not see wider use in courts because of the complexity of the recording process and the poor quality of the sound.

The Digital Transformation of Modern Courts

For hundreds of years, the people mainly responsible for preparing written records of court hearings around the globe have been stenographers or court secretaries, and throughout that time they have maintained and developed the craft of notetaking. In the 21st century, two of the most recent advancements, online document repositories and real-time reporting (i.e., Communication Access Realtime Translation, or CART), have enabled faster turnaround and easy availability of highly accurate court hearing transcripts. But this is just the tip of the iceberg in the technological renaissance we find ourselves in. Let's look a little closer at the technological advancements and changes affecting houses of law.
Well, we all have certain opinions on how things work in courts. If you follow popular sitcoms or movies from the 80s or 90s, the first image that probably comes to mind when you think of court reporting is a person sitting in the corner of a courtroom: a stenographer, writing down every single spoken word with the highest accuracy and speed. Some might point out that stenographers have machines with keys, not letters, to type words and phrases phonetically, and that they hit more than one key at a time. Both of those views are right, but only if we are talking about court reporting of the past.
Modern technologies have changed the way we live and do business, and the legal industry is no exception. Recent developments and the introduction of machine learning algorithms in speech recognition technologies have made possible a critical breakthrough for their use in the court setting. These technologies have been around for many decades, but only very recently have they reached the level of maturity and quality needed to produce transcripts reliably, facilitate court document preparation, and automate the many hours of human work required to convert human voice into text. Modern automatic speech recognition (ASR) solutions help produce usable transcripts from almost any type of audio input (including real-time recordings) and store them safely, in line with the highest confidentiality standards applied at courts.

Put simply, two main factors determine the quality of automatic speech recognition: the quality of the audio recording and the domain adaptation of the ASR models used. The former depends heavily on the equipment, which must cope with multiple acoustic challenges. The latter requires more sophisticated handling: preparing large amounts of training data, selecting appropriate software development tools and methods, and training custom ASR models, because court recordings contain broad but specific terminology, phrasing and jargon. The effort is worth making, as such solutions ultimately provide a variety of benefits: they save the time and effort of human personnel, speed up the preparation of court documents, and make it possible to analyze the content of not only recent but also historical audio recordings of court sessions, which legal scholars and practitioners widely require.

Bringing the Technological Shift to National Courts in Estonia

This works not just in theory. Tilde, a leading European language technology company, has developed a solution for automated transcription of court hearings in all 9 national and regional courts of Estonia, in collaboration with the Estonian national Centre of Registers and Information Systems (RIK). The solution has brought significant improvements to the efficiency and speed of recording and protocolling court sessions. The speech recognition models have been adapted to the rather rich domain content: more than 800 hours of transcribed audio files were used to develop the acoustic models, and over 800 million words of textual data were used to develop the custom language model. As a result, the solution, which supports both real-time and offline speech recognition, achieves word error rates (WER) between 8% and 15%, a very good result given the actual conditions at court hearings. The solution is built to run on RIK's infrastructure, which safeguards its security and restricts availability to authorized users only.
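For readers unfamiliar with the metric, WER is conventionally the number of word substitutions, deletions and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The following is a minimal Python sketch of that standard calculation, not the evaluation code used by Tilde or RIK; the function name `word_error_rate` and the example sentences are illustrative assumptions, and real evaluations usually also normalize case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration only; it splits on whitespace
    and does no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions align i reference words with nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substituted word in a nine-word reference.
reference = "the court is now in session please be seated"
hypothesis = "the court is now in session please be seeded"
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
```

In this made-up example, one wrong word out of nine gives a WER of roughly 11%, which would fall inside the 8% to 15% range reported above.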

The New Reality

The coronavirus outbreak is not only accelerating digital transformation across different industries, but also forcing changes and adjustments in the way courts and tribunals operate, as open face-to-face hearings are often not feasible. These challenges affect the daily routines of judges, legal professionals and court administrative staff, as well as those attending court hearings. This will inevitably affect the use of different technologies in courts, including speech recognition. And we strongly believe these changes are here to stay, and that there is more to come…

Written by:
Kaspars Kauliņš
International Business Development Director at Tilde
