


TalkBank, which was recognized as a CLARIN Knowledge Centre in 2016, is the world’s largest open access integrated repository for spoken language data. It provides language corpora and other audio resources to support researchers in Psychology, Linguistics, Education, Computer Science, and Speech Pathology.


The South African Centre for Digital Language Resources (SADiLaR) is seeking a technical manager. The position has a dual purpose: (i) providing technical support and leadership for all projects taking place within SADiLaR, including the development, distribution, and evaluation of language


In 2015, researchers from the Jožef Stefan Institute in Ljubljana, Slovenia released the first emoji sentiment lexicon, called Emoji Sentiment Ranking 1.0, and published it as a resource in the public language resource repository CLARIN.SI. With 78,500 downloads to date, the lexicon is the most downloaded resource in the CLARIN.SI repository.

Andreas Witt is Professor of Computational Humanities and Text Technologies at the University of Mannheim and heads the department of Digital Linguistics at the Leibniz Institute for the German Language in Mannheim.

Read the most recent CLARIN Newsflash: June 2019 here


We are very pleased to announce that on 1 May 2019 Cyprus officially became a member of CLARIN!


CLARIN.SI joined CLARIN in 2015 and is a B-certified centre which offers a LINDAT/D-Space repository that currently contains around 110 language resources for Slovenian as well as for other languages, especially Croatian and Serbian.


Tour de CLARIN highlights prominent User Involvement (UI) activities of a particular CLARIN national consortium. This time the focus is on Denmark and Klaus Nielsen, the chief editor at the Grundtvig Study Centre. The interview was conducted via Skype by Jakob Lenardič. 1. What is your scholarly


The selected papers from the CLARIN Annual Conference 2018 are now published online.


The workshop objective is to foster collaboration between social sciences and humanities researchers in Central and Eastern Europe and the research communities in these fields represented in CLARIN and in the EU-funded PARTHENOS Infrastructure project.


During the SSHOC Kick-off meeting, partners in the SSHOC project were asked what we should expect from the Social Sciences and Humanities Open Cloud.


The collection Grundtvig’s Works is published by the Grundtvig Center at the University of Aarhus and, when finalized in 2030, will contain 1000 text-critical, annotated editions of N.F.S. Grundtvig’s printed works. Since the Grundtvig Center itself does not offer downloads of the underlying files, CLARIN-DK was approached as a repository provider.


Read the most recent CLARIN Newsflash: May 2019 here


CLARIN wants to reinforce its external communication and outreach capacity and has created a job opening for an External Relations Officer (60-100% of a full-time position).


On 23 and 24 May the CLARIN ParlaFormat workshop was held in Amersfoort, the Netherlands. This workshop was organized by the CLARIN Interoperability Committee.


On 9 May, Inguna Skadiņa, the national coordinator of CLARIN Latvia, organized a hands-on workshop on how to search in the Latvian Treebank.


Lemmatizers generalize over the different forms of a word used in free text and provide its lemma, which is the base or dictionary look-up form. The CST lemmatizer learns lemmatization rules not only from word endings but also from other parts of the word, so it recognizes a wide variety of derivational patterns, e.g., prefixation, infixation, and suffixation.
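To make the idea of affix-based lemmatization rules concrete, here is a minimal sketch in Python. It is not the CST lemmatizer's algorithm or API: the rules are hand-written rather than learned, and the names `RULES` and `lemmatize` are hypothetical. It only illustrates how rewrite rules can target the start, middle, or end of a word.

```python
import re

# Hypothetical, hand-written rules (a real lemmatizer would learn these from
# training data). Each rule is (pattern over the whole word, lemma template).
# The first matching rule wins, so more specific rules come first.
RULES = [
    # "Infix" change: German particle verbs, e.g. aufgemacht -> aufmachen
    (re.compile(r"(auf|an|ab)ge(.+)t"), r"\1\2en"),
    # Prefix plus suffix change: German participles, e.g. gemacht -> machen
    (re.compile(r"ge(.+)t"), r"\1en"),
    # Suffix changes: English inflection
    (re.compile(r"(.+)ies"), r"\1y"),
    (re.compile(r"(.+)ing"), r"\1"),
    (re.compile(r"(.+)s"), r"\1"),
]


def lemmatize(word, rules=RULES):
    """Return the lemma produced by the first matching rule, else the word itself."""
    for pattern, template in rules:
        match = pattern.fullmatch(word)
        if match:
            # Fill the captured stem(s) into the lemma template.
            return match.expand(template)
    return word


if __name__ == "__main__":
    for form in ["aufgemacht", "gemacht", "studies", "walking", "tree"]:
        print(form, "->", lemmatize(form))
```

Because the rules are tried in order, the particle-verb rule must precede the plain participle rule; a learned rule set would need some comparable way to prefer the most specific matching pattern.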


In this issue: Virtual Language Observatory 4.7 beta, poll, Job opening at CLARIN: System and Software Engineer, Maintenance announcements


On 21 November 2018, CLARIN-DK experts organized an interactive workshop where they presented the use of Voyant Tools to lecturers and researchers at the Department of Nordic Studies and Linguistics at the University of Copenhagen.


Blog post by Tanja Wissik who used a CLARIN Mobility Grant to visit the Jožef Stefan Institute and learn more about encoding parliamentary data in .