Thanks to federated login, the following applications and data sets are available to anyone with an academic computer account in many European countries (for a complete list, see participating identity federations). We are working to extend this to the rest of Europe. In the meantime, you can request a CLARIN account if you want to access these services from another country or from an institution that does not participate in these identity federations.
Resource or Tool | Description | Provided by |
---|---|---|
Bavarian Archive for Speech Signals | Mostly German spoken language resources | Bayerisches Archiv für Sprachsignale |
CLARIN-DK repository | Danish language resources | The Clarin center at University of Copenhagen |
CLARIN.SI repository | Includes multiple corpora | CLARIN.SI Language Technology Centre |
CLARINO repository | The Norwegian-Spanish Parallel Corpus | CLARINO |
Corpus Hedendaags Nederlands | Written corpus of contemporary Dutch | Instituut voor de Nederlandse Taal |
Corpuscle | A corpus management platform for annotated corpora (Norwegian Bokmål, Norwegian, English, Spanish, Bulgarian, German, Abkhazian, Georgian, Slovenian, Scots). Includes multiple ICAME corpora. | CLARINO |
DWDS Corpus tools | Linguistic analysis tools for German. Includes multiple corpora. | Berlin-Brandenburg Academy of Sciences and Humanities |
FAME | A search interface that provides access to the archive of radio broadcasts from Omrop Fryslân covering 1955–2000. The raw audio and speech recognition results are available for download. | Centre for Language and Speech Technology |
Glossa | Includes multiple corpora | CLARINO Text Laboratory Centre |
INESS | Platform for building, accessing, searching and visualising treebanks | CLARINO |
Korp at the Language Bank of Finland | A corpus management platform for annotated corpora. Includes multiple corpora. | Fin-CLARIN |
CoANZSE | A 195-million-word corpus of speech transcripts from Australia and New Zealand, including audio and forced alignment files | Steven Coats - University of Oulu |
LINDAT/CLARIN Repository | Includes multiple corpora | LINDAT-Clarin |
MPI/TLA archive | Various, e.g. endangered languages | MPI for Psycholinguistics |
Nederlab | A corpus management platform for large Dutch text collections | Meertens Instituut |
OpenSoNaR | Dutch reference corpus of over 500 million words | Instituut voor de Nederlandse Taal |
VU-DNC | Diachronic Dutch newspaper corpus | INL (Instituut voor Nederlandse Lexicologie) |
Tündra | Treebank search application | Eberhard Karls Universität Tübingen |
Virtual Collection Registry | Tool to manage virtual collections | Leibniz-Institut für Deutsche Sprache |
WebLicht | Webservice chaining tool | Eberhard Karls Universität Tübingen |