- FIN-CLARIN: University of Helsinki and CSC, Finland
- CLARINO: University of Tromsø, Norway
- CLARIN-LV: Institute of Mathematics and Computer Science, University of Latvia
- CLARIN-LT: Vytautas Magnus University, Lithuania
SAFMORIL brings together linguists, researchers, and developers working in computational morphology and its application in language processing. Its focus is on actual, working systems and frameworks that are based on linguistic principles and that provide linguistically motivated analysis and/or generation in terms of linguistic categories. Such systems are particularly relevant for languages with rich morphologies, as is the case for Nordic and Baltic languages (such as Finnish, Swedish, Norwegian, Latvian, and Lithuanian, as well as the Sámi languages) and, more generally, Finno-Ugric languages, Inuit languages, Canadian First Nation languages and Babylonian languages.
SAFMORIL offers online courses for developing and teaching morphologies, tokenizers and spell-checkers, a repository for storing morphologies, and an environment for creating tokenizers and spell-checkers. SAFMORIL serves linguists and computational linguists who develop and adapt morphologies, as well as digital humanities scholars, linguists, and computer scientists who process language data. Researchers are welcome to contact SAFMORIL about any matter related to morphology (computational or otherwise) via the SAFMORIL Helpdesk (safmoril[at]kielipankki[dot]fi).
The four member institutions of SAFMORIL (FIN-CLARIN, CLARINO, CLARIN-LV, and CLARIN-LT) each offer their own technologies and services for working with morphology. The Finnish member, FIN-CLARIN, focuses on creating novel morphology systems and frameworks. The two main tools that FIN-CLARIN contributes as a member of SAFMORIL are Mylly, which is used for analyzing and visualizing data sets, and HFST (Helsinki Finite-State Technology), which is compilation and runtime software that comes with some source morphologies.
FIN-CLARIN offers online tutorials for XFST-based Morphology Development (by Erik Axelson, Kimmo Koskenniemi and Mathias Creutz at the University of Helsinki) and Morphology Construction (developed by Jack Rueter at FIN-CLARIN), as well as documentation for experimental two-level rule compilation using Python HFST (by Kimmo Koskenniemi at FIN-CLARIN).
Lastly, FIN-CLARIN offers Finland Swedish Online, a free online course in Swedish as spoken in Finland. It is modelled on Icelandic Online and includes a variety of texts, videos, sound clips and exercises to help you learn Swedish. Morphology is practised implicitly through reading and listening to Swedish, with care taken to repeat the forms and patterns being practised, and it can also be practised in self-correcting exercises. Finland Swedish Online currently consists of two courses; a third course is to be launched soon, followed by a special course designed for librarians.
based tools, morphology teaching service GiellaLT ICALL, and offers tutorials for making computer tools for your language. In addition, the aforementioned HFST (Helsinki Finite-State Technology) toolkit has been applied extensively in the GiellaLT infrastructure, and is also a core part of the proofing tools provided by it.
In 2020, another morphologically rich language, Lithuanian, was included in SAFMORIL. Although the CLARIN-LT consortium already had a Helpdesk on corpus linguistics and natural language processing methods for Lithuanian, the team of Lithuanian researchers was very glad to share its knowledge of Lithuanian morphology, syntax, semantics and tools for linguistic analysis (e.g., those produced within the project SEMANTIKA-2) with an international audience. As a member of SAFMORIL, the CLARIN-LT team looks forward to exchanging experiences, opening new opportunities for cooperation, and further developing resources and tools relevant to the analysis of morphologically rich languages.