Tour de CLARIN: Finland presents AaltoASR tool

Submitted by karolina@clarin.eu on 22 May 2017

Blog post written by Darja Fišer and Jakob Lenardič

The AaltoASR project, led by Professor Mikko Kurimo at Aalto University, focuses on developing an automatic speech recognition (ASR) system that transcribes spoken Finnish with very high accuracy. The system started in the 1980s as a relatively simple spoken-language recognizer that could at first handle around 1000 Finnish words; today it is a complex piece of software that can recognize and transcribe not only isolated words but also spontaneous speech. AaltoASR comprises procedures that transform audio signals into linguistically modelled speech units on the basis of a network of probability distributions, which makes the system easily adaptable to various domains and styles.

By focussing on complex agglutinative languages such as Finnish and Estonian and on under-resourced languages such as the Sami languages, the AaltoASR team continues to make groundbreaking progress in developing a large-vocabulary speech recognizer that can tackle complex inflectional and compounding systems, which otherwise make rule-based morphological analysis and, by extension, speech recognition difficult.

A demonstration of the AaltoASR tool is available on the website of the Aalto Department of Signal Processing and Acoustics. Furthermore, a demonstration video on YouTube showcases the tool in action as it simultaneously transcribes a broadcast video of Finnish news. The AaltoASR tool is open source, and the developer version can be found on the tool’s GitHub page, with instructions in English found here. AaltoASR is available for research use via the Language Bank. ASR systems built on top of AaltoASR tools are also being used by companies for subtitling TV broadcasts in Finland and Sweden.

The most recent papers by Prof. Kurimo and his colleagues include:

André Mansikkaniemi, Peter Smit and Mikko Kurimo. Automatic Construction of the Finnish Parliament Speech Corpus.
In Interspeech 2017. Stockholm, August 2017. To appear.

Peter Smit, Sami Virpioja and Mikko Kurimo. Improved subword modeling for WFST-based speech recognition.
In Interspeech 2017. Stockholm, August 2017. To appear.

Click here to read more about Tour de CLARIN