Concept-based resources include onomasiological lexical resources such as wordnets, framenets, thesauri and ontologies. The concepts in such resources are typically interlinked through semantic relations (e.g. hypernymy, hyponymy). There are 29 conceptual resources in the CLARIN infrastructure. Most (22) of the conceptual resources are monolingual, accounting for 14 languages (Ancient Greek, Danish, Greek, Brazilian Portuguese, Dutch, Estonian, Finnish, Italian, Maltese, Polish, Portuguese, Swedish, Slovenian), while the rest (7) cover bilingual and multilingual language combinations (e.g. Swedish-English, Polish-English). In the vast majority of cases, the conceptual resources can be downloaded directly from the national repositories or queried through easy-to-use online search environments.
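As a rough illustration of how concepts in a wordnet-style resource are interlinked, the following minimal Python sketch stores hypernymy links and derives hyponymy as their inverse. The `Synset` class, the identifier scheme and the toy entries are assumptions made up for this example; they are not taken from any resource listed on this page, and real resources use their own formats (e.g. RDF, OWL, LMF).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Synset:
    """One concept: a set of synonymous lemmas plus links to broader concepts."""
    synset_id: str  # hypothetical ID scheme, invented for this sketch
    lemmas: List[str]
    hypernyms: List[str] = field(default_factory=list)  # IDs of more general concepts


def build_hyponym_index(synsets: Dict[str, Synset]) -> Dict[str, List[str]]:
    """Derive hyponymy, the inverse of hypernymy, from the stored hypernym links."""
    index: Dict[str, List[str]] = {sid: [] for sid in synsets}
    for synset in synsets.values():
        for parent_id in synset.hypernyms:
            index.setdefault(parent_id, []).append(synset.synset_id)
    return index


def hypernym_chain(synsets: Dict[str, Synset], synset_id: str) -> List[str]:
    """Walk upwards along the first hypernym link until a root concept is reached."""
    chain: List[str] = []
    seen = {synset_id}  # guards against cycles in malformed data
    current = synsets[synset_id]
    while current.hypernyms:
        parent_id = current.hypernyms[0]
        if parent_id in seen or parent_id not in synsets:
            break
        seen.add(parent_id)
        chain.append(parent_id)
        current = synsets[parent_id]
    return chain


# Toy data, invented for illustration only.
SYNSETS = {
    s.synset_id: s
    for s in [
        Synset("animal.n", ["animal", "creature"]),
        Synset("dog.n", ["dog", "hound"], hypernyms=["animal.n"]),
        Synset("puppy.n", ["puppy"], hypernyms=["dog.n"]),
    ]
}
HYPONYMS = build_hyponym_index(SYNSETS)

if __name__ == "__main__":
    print(hypernym_chain(SYNSETS, "puppy.n"))
    print(HYPONYMS["animal.n"])
```

Storing only one direction of the relation and deriving the other keeps the toy data consistent; an actual wordnet may encode both directions explicitly.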
For comments, changes to the existing content or inclusion of new resources, send us an email at resource-families [at] clarin.eu.
Conceptual resources in the CLARIN infrastructure
Monolingual Resources
Resource | Language | Description | Availability |
---|---|---|---|
Open Ancient Greek WordNet 0.5 (Size: 7,447 synsets) | Ancient Greek | This is a wordnet that is available for download through ILC4CLARIN. | |
Ontology for the area of Nanoscience and Nanotechnology (Size: 511 entries) | Brazilian Portuguese | This is an ontology of concepts related to nanoscience and nanotechnology that is available for download from CLARIN PORTULAN. | |
DanNet, Danish Wordnet (v 2.2) (Size: 65,000 entries) | Danish | This wordnet is available for download from the CLARIN-DK repository in the .csv and .owl formats. For the relevant publication, see Pedersen et al. (2009). | |
STO semantics (The Danish SIMPLE Lexicon) - LMF format (Size: 12,609 entries) | Danish | This resource presents a unified, ontology-based semantic model – the so-called SIMPLE model – representing an extended Qualia Structure from the Danish SIMPLE Lexicon. The resource is available for download from CLARIN-DK. | |
Size: 130,000 entries | Dutch | This resource combines two resources with different semantic organisations: the Dutch Wordnet, with its synset organisation, and the Dutch Reference Lexicon, which includes definitions, usage constraints, selectional restrictions, syntactic behaviours, illustrative contexts, etc. The resource is available for online browsing through a dedicated webpage. | |
Size: 115,318 keywords, 84,150 synsets | Estonian | This is a wordnet that is available for download from META-SHARE (CELR distribution). | |
Size: 866 frames | Finnish | This is a framenet that is available for online browsing through FIN-CLARIN. For the relevant publication, see Lindén et al. (2017). | |
Size: 61,659 synsets | German | This wordnet is available for download from a dedicated webpage hosted by Uni Tübingen/CLARIN-D. For the relevant publication, see Henrich et al. (2011). | |
Polytropon EL Conceptual Lexicon (Size: 4,000 Multi Word units; 15,000 tokens) | Greek | This is a lexicon that encodes lexical-semantic relations (e.g. synonymy, antonymy), lexical relations (e.g. word families, allomorphs, syntactic variants) and morphosyntactic features (PoS, gender, declension, etc.). It is not yet available for download. | |
Annotation: lemmas, concept-based relations | Icelandic | This is a database of words, categorizations and word relations. The new version consists of a single, large RDF file that houses the Wordweb’s content and is encoded with a standardized vocabulary. The Wordweb is available for download through the CLARIN.IS repository and for online browsing through a dedicated webpage. For the relevant publication, see Jónsson and Úlfarsdóttir (2011). | |
Size: 49,514 synsets | Italian | This is a wordnet that is available for download from ILC4CLARIN. | |
Size: 49,350 synsets | Italian | This is a wordnet that is available for download from ILC4CLARIN. For the relevant publication, see Bartolini et al. (2014). | |
Size: 49,350 synsets | Italian | This is an RDF–Linguistic Open Data version of the ItalWordNet v.2. The resource is available for download through ILC4CLARIN. | |
Maltese automatically produced distributional thesaurus (Size: 36,034 entries) | Maltese | This is a thesaurus that is available for download from CLARIN PORTULAN. | |
Size: 120 terms | Polish | This conceptual resource provides a mapping between named-entity types, SUMO categories and plWordNet synsets. The resource is available for download from the CLARIN-PL repository. | |
Size: 175,635 synsets mapped to SUMO | Polish | This conceptual resource provides a mapping of plWordNet onto the SUMO ontology. The resource is available for download from the CLARIN-PL repository. | |
Size: 701,209 concepts | Portuguese | This is an ontology of Portuguese geographic concepts. It is available for download from CLARIN PORTULAN. | |
MWNPT-International WordNet of Portuguese (Size: 17,200 synsets) | Portuguese | This is a wordnet that is available on request from the resource manager. | |
Thesaurus of Modern Slovene 1.0 (Size: 105,473 entries) | Slovenian | This is a thesaurus of the modern Slovenian language that is available for download from CLARIN.SI. For the relevant publication, see Krek et al. (2017). | |
Size: 148,815 entries | Swedish | This is a digital version of Bring's thesaurus (1930) that is available for download from the SWE-CLARIN repository. | |
Size: 131,020 entries | Swedish | This is an extensive lexical resource for modern written Swedish. The resource can be downloaded from the SWE-CLARIN repository and queried online through KARP. | |
Size: 1,195 entries | Swedish | This is a Swedish conceptual resource that employs the FrameNet++ annotation. The resource can be downloaded from the SWE-CLARIN repository and queried online through KARP. | |
Size: 15,010 entries | Swedish | This is a Swedish wordnet that can be downloaded from the SWE-CLARIN repository and queried online through KARP. | |
Multilingual Resources
Resource | Language | Description | Availability |
---|---|---|---|
Size: 170,000 synsets | Finnish, English | This is a wordnet that is available for download from FIN-CLARIN as well as for online browsing. For the relevant publication, see Lindén and Carlson (2010). | |
The Sanat Version of the Finnish TransFrameNet (Size: 866 frames) | Finnish, English | This is a framenet that is available for online browsing through FIN-CLARIN. For the relevant publication, see Lindén et al. (2019). | |
Size: 72,572 lexical relations | French, English, Polish, Serbian | This is an ontology of place names that is available for download from Ortolang. | |
Size: 506,815 senses, 347,564 synsets | Polish, English | This is a lexico-semantic network (Słowosieć) that reflects the lexical system of the Polish language, with a projection onto English. The resource is available for download and browsing. For the relevant publication, see Maziarz et al. (2016). | |
Size: 282 concepts | Portuguese, English, Spanish, French | This is an ontology of concepts from the accommodation sector that is available for download from CLARIN PORTULAN. | |
Semantic lexicon of Slovene sloWNet 3.1 (Size: 43,460 synsets) | Slovenian, English | This is a wordnet that is available for download from CLARIN.SI and for online browsing through a dedicated environment. | |
Size: 6,989 entries | Swedish, English | This wordnet represents a link between SALDO senses and Core WordNet. The resource can be downloaded from the SWE-CLARIN repository and queried online through KARP. | |
Publications
[Bartolini et al. 2014] Roberto Bartolini, Valeria Quochi, Irene De Felice, Irene Russo, and Monica Monachini. 2014. From Synsets to Videos: Enriching ItalWordNet Multimodally.
[Henrich et al. 2011] Verena Henrich, Erhard Hinrichs, and Tatiana Vodolazova. 2011. Aligning GermaNet Senses with Wiktionary Sense Definitions.
[Jónsson and Úlfarsdóttir 2011] Jón Hilmar Jónsson and Þórdís Úlfarsdóttir. 2011. Íslenskt orðanet: Et skritt mot en allmennspråklig onomasiologisk ordbok.
[Krek et al. 2017] Simon Krek, Cyprian Laskowski, and Marko Robnik-Šikonja. 2017. From Translation Equivalents to Synonyms: Creation of a Slovene Thesaurus Using Word co-occurrence Network Analysis.
[Lindén and Carlson 2010] Krister Lindén and Lauri Carlson. 2010. FinnWordNet – WordNet på finska via översättning.
[Lindén et al. 2017] Krister Lindén, Heidi Haltia, Juha Luukkonen, Antti O. Laine, Henri Roivainen, and Niina Väisänen. 2017. FinnFN 1.0: The Finnish frame semantic database.
[Lindén et al. 2019] Krister Lindén, Heidi Haltia, Antti Laine, Juha Luukkonen, Jussi Piitulainen, and Niina Väisänen. 2019. FinnTransFrame: translating frames in the FinnFrameNet project.
[Maziarz et al. 2016] Marek Maziarz, Maciej Piasecki, Ewa Rudnicka, Stan Szpakowicz, and Paweł Kędzia. 2016. plWordNet 3.0 – a Comprehensive Lexical-Semantic Resource.
[Pedersen et al. 2009] Bolette S. Pedersen, Sanni Nimb, Jørg Asmussen, Nicolai Hartvig Sørensen, Lars Trap-Jensen, and Henrik Lorentzen. 2009. DanNet: the challenge of compiling a wordnet for Danish by reusing a monolingual dictionary.