Conceptual resources

Introduction

Concept-based resources include onomasiological lexical resources such as wordnets, framenets, thesauri and ontologies. The entries in such resources are typically interlinked through semantic relations (e.g. hypernymy, hyponymy). There are 29 conceptual resources in the CLARIN infrastructure. Most (22) of the conceptual resources are monolingual, accounting for 14 languages (Ancient Greek, Brazilian Portuguese, Danish, Dutch, Estonian, Finnish, German, Greek, Italian, Maltese, Polish, Portuguese, Slovenian, Swedish), while the rest (7) are bilingual or multilingual (e.g. Swedish-English, Polish-English). In the vast majority of cases, the conceptual resources can be directly downloaded from the national repositories or queried through easy-to-use online search environments.
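As a rough illustration of how hypernymy and hyponymy interlink concepts, the following minimal Python sketch stores a few (hyponym, hypernym) pairs and walks the hierarchy upwards. The concept names and the `hypernym_chain` helper are hypothetical and are not taken from any of the resources listed on this page; real wordnets use richer formats and identifiers.

```python
from collections import defaultdict

# Hypothetical toy data: (hyponym, hypernym) pairs, invented for illustration.
HYPERNYM_PAIRS = [
    ("dog", "canine"),
    ("canine", "mammal"),
    ("cat", "feline"),
    ("feline", "mammal"),
    ("mammal", "animal"),
]

# Hypernymy and hyponymy are inverse relations, so one list of pairs
# is enough to build both lookup tables.
hypernyms = defaultdict(set)
hyponyms = defaultdict(set)
for child, parent in HYPERNYM_PAIRS:
    hypernyms[child].add(parent)
    hyponyms[parent].add(child)


def hypernym_chain(concept):
    """Return all ancestors of a concept, nearest level first."""
    chain = []
    seen = {concept}
    frontier = [concept]
    while frontier:
        next_frontier = []
        for node in frontier:
            # Sort so the result is deterministic when a node has several parents.
            for parent in sorted(hypernyms.get(node, ())):
                if parent not in seen:
                    seen.add(parent)
                    chain.append(parent)
                    next_frontier.append(parent)
        frontier = next_frontier
    return chain
```

Calling `hypernym_chain("dog")` follows the links from the most specific concept to the most general one, while `hyponyms["mammal"]` gives the direct subordinate concepts.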

For comments, changes to the existing content or the inclusion of new resources, send us an email.

This website was last updated on 3 June 2021.

Conceptual resources in the CLARIN infrastructure

Monolingual resources

Resource, Language, Description, Availability

Open Ancient Greek WordNet 0.5

Size: 7,447 synsets
Licence: CC-BY-SA 4.0

Ancient Greek

This is a wordnet that is available for download through ILC4CLARIN.

Download

Ontology for the area of Nanoscience and Nanotechnology

Size: 511 entries
Linguistic information: semantic relations
Licence: MS NC-NoReD-ND

Brazilian Portuguese

This is an ontology of concepts related to nanoscience and nanotechnology available for download from CLARIN PORTULAN.

Download

DanNet, Danish Wordnet (v 2.2)

Size: 65,000 entries
Linguistic information: hypernymy, hyponymy
Licence: DanNet 1.0 Licence

Danish

This wordnet is available for download from the CLARIN-DK repository. The resource is available in the .csv and .owl formats.

For a related publication, see Pedersen et al. (2009).

Download

STO semantics (The Danish SIMPLE Lexicon) - LMF format

Size: 12,609 entries
Licence: CC-BY 4.0

Danish

This resource presents a unified, ontology-based semantic model – the so-called SIMPLE model – representing an extended Qualia Structure from the Danish SIMPLE Lexicon. The resource is available for download from CLARIN-DK.

Download

Cornetto-LMF

Size: 130,000 entries
Linguistic information: semantic relationships and combinatorial information
Licence: other

Dutch

This is a resource that combines two resources with different semantic organisations: the Dutch Wordnet with its synset organisation and the Dutch Reference Lexicon which includes definitions, usage constraints, selectional restrictions, syntactic behaviours, illustrative contexts, etc. The resource is available for online browsing through a dedicated webpage.

Browse

Estonian Wordnet 2.1

Size: 115,318 keywords, 84,150 synsets
Linguistic information: PoS-tags, synonymy, antonymy, hypernymy, hyponymy, meronymy
Licence: CC-BY-SA

Estonian

This is a wordnet that is available for download from META-SHARE (CELR distribution).

Download

Finnish FrameNet

Size: 866 frames
Licence: CC-BY 4.0

Finnish

This is a framenet that is available for online browsing through FIN-CLARIN.

For a related publication, see Lindén et al. (2017).

Browse

GermaNet

Size: 61,659 synsets
Linguistic information: MSD-tags, lexical relations (hypernymy, hyponymy)
Licence: CLARIN ACA

German

This wordnet is available for download from a dedicated webpage hosted by Uni Tübingen/CLARIN-D.

For a related publication, see Henrich et al. (2011).

Download

Polytropon EL Conceptual Lexicon

Size: 4,000 multi-word units (15,000 tokens)
Linguistic information: lemmas, MSD-tags, Lexical relations
Licence: under negotiation

Greek

This lexicon encodes lexical-semantic relations (e.g. synonymy, antonymy), lexical relations (e.g. word families, allomorphs, syntactic variants) and morphosyntactic features (PoS, gender, declension, etc.). It is not yet available for download.

 

ItalWordNet Kyoto

Size: 49,514 synsets
Licence: CC-BY-NC-SA 4.0

Italian

This is a wordnet available for download from ILC4CLARIN.

Download

ItalWordNet v.2

Size: 49,350 synsets
Linguistic information: equivalence relations between Italian synsets and closest concepts in an Inter-Lingual index
Licence: CC-BY-NC-SA 4.0

Italian

This is a wordnet available for download from ILC4CLARIN.

For a related publication, see Bartolini et al. (2014).

Download

IWN-LOD

Size: 49,350 synsets

Italian

This is an RDF–Linguistic Open Data version of the ItalWordNet v.2. The resource is available for download through ILC4CLARIN.

Download

Maltese automatically produced distributional thesaurus

Size: 36,034 entries
Linguistic information: lemmas
Licence: CC-BY-SA

Maltese

This is a thesaurus that is available for download from CLARIN PORTULAN.

Download

NE_SUMO_PLWN_mapping

Size: 120 terms
Licence: CC-BY-SA 4.0

Polish

This conceptual resource provides a mapping between named entities types, SUMO categories and plWordNet synsets. The resource is available for download from the CLARIN-PL repository.

Download

PLWordNet to Sumo mapping

Size: 175,635 synsets mapped to SUMO
Licence: CC-BY-NC-SA 3.0

Polish

This conceptual resource provides a mapping of plWordNet onto the SUMO ontology. The resource is available for download from the CLARIN-PL repository.

Download

Geo-Net-PT 02

Size: 701,209 concepts
Linguistic information: qualia structure and lexical relations (hyponyms, synonyms)
Licence: CC-BY

Portuguese

This is an ontology of Portuguese geographic concepts. It is available for download from CLARIN PORTULAN.

Download

MWNPT-International WordNet of Portuguese

Size: 17,200 synsets
Linguistic information: hyponymy and hypernymy
Licence: MS NC-NoReD-ND

Portuguese

This is a wordnet that is available on request from the resource manager.

 

Thesaurus of Modern Slovene 1.0

Size: 105,473 entries
Linguistic information: core and near synonymy
Licence: CC-BY-SA 4.0

Slovenian

This is a thesaurus of the modern Slovenian language that is available for download from CLARIN.SI.

For a related publication, see Krek et al. (2017).

Download

Bring (2015-05-08)

Size: 148,815 entries
Licence: CC-BY 4.0

Swedish

This is a digital version of Bring's thesaurus (1930) that is available for download from the SWE-CLARIN repository.

Download

Saldo 

Size: 131,020 entries
Licence: CC-BY 4.0

Swedish

This is an extensive lexical resource for modern written Swedish. The resource can be downloaded from the SWE-CLARIN repository and can be queried online through KARP.

Browse

Download

Swedish FrameNet (2017-10-16)

Size: 1,195 entries
Licence: CC-BY 4.0

Swedish

This is a Swedish conceptual resource that employs the FrameNet++ annotation. The resource can be downloaded from the SWE-CLARIN repository and can be queried online through KARP.

Browse

Download

Swesaurus (2017-10-16)

Size: 15,010 entries
Licence: CC-BY 4.0

Swedish

This is a Swedish wordnet that can be downloaded from the SWE-CLARIN repository and can be queried online through KARP.

Browse

Download

Multilingual resources

Resource, Language, Description, Availability

Finnish WordNet

Size: 170,000 synsets
Linguistic information: PoS-tags, synonymy, antonymy, hypernymy, hyponymy, meronymy
Licence: CC-BY 3.0

Finnish, English

This is a wordnet that is available for download from FIN-CLARIN as well as for online browsing.

For a related publication, see Lindén and Carlson (2010).

Browse

Download

The Sanat Version of the Finnish TransFrameNet

Size: 866 frames
Licence: CC-BY 4.0

Finnish, English

This is a framenet that is available for online browsing through FIN-CLARIN.

For a related publication, see Lindén et al. (2019).

Browse

 

Prolex

Size: 72,572 lexical relations
Linguistic information: inflected forms
Licence: Licence Publique Générale Amoindrie GNU

French, English, Polish, Serbian

This is an ontology of place names available for download from Ortolang.

Download

plWordNet 4.0

Size: 506,815 senses, 347,564 synsets
Licence: plwordnet-2

Polish, English

This is a lexico-semantic network (Słowosieć) that reflects the lexical system of the Polish language, with a projection onto English. The resource is available for download and browsing.

For a related publication, see Maziarz et al. (2016).

Browse

Download

Hontology

Size: 282 concepts
Linguistic information: terms correlation, rules (lexical patterns) and synonyms
Licence: CC-BY-NC-SA

Portuguese, English, Spanish, French

This is an ontology of concepts from the accommodation sector that is available for download from CLARIN PORTULAN.

Download

Semantic lexicon of Slovene sloWNet 3.1

Size: 43,460 synsets
Linguistic information: lexical semantic relations
Licence: CC-BY-SA 4.0

Slovenian, English

This is a wordnet available for download from CLARIN.SI and for online browsing through a dedicated environment.

Browse

Download

WordNet-SALDO (2017-10-16)

Size: 6,989 entries
Licence: CC-BY 4.0

Swedish, English

This wordnet represents a link between SALDO senses and Core WordNet. The resource can be downloaded from the SWE-CLARIN repository and can be queried online through KARP.

Browse

Download

 

Publications

[Bartolini et al. 2014] Roberto Bartolini, Valeria Quochi, Irene De Felice, Irene Russo, and Monica Monachini. 2014. From Synsets to Videos: Enriching ItalWordNet Multimodally.

[Henrich et al. 2011] Verena Henrich, Erhard Hinrichs, and Tatiana Vodolazova. 2011. Aligning GermaNet Senses with Wiktionary Sense Definitions.

[Krek et al. 2017] Simon Krek, Cyprian Laskowski, and Marko Robnik-Šikonja. 2017. From Translation Equivalents to Synonyms: Creation of a Slovene Thesaurus Using Word co-occurrence Network Analysis.

[Lindén and Carlson 2010] Krister Lindén and Lauri Carlson. 2010. FinnWordNet – WordNet på finska via översättning.

[Lindén et al. 2017] Krister Lindén, Heidi Haltia, Juha Luukkonen, Antti O. Laine, Henri Roivainen, and Niina Väisänen. 2017. FinnFN 1.0: The Finnish frame semantic database.

[Lindén et al. 2019] Krister Lindén, Heidi Haltia, Antti Laine, Juha Luukkonen, Jussi Piitulainen, and Niina Väisänen. 2019. FinnTransFrame: translating frames in the FinnFrameNet project.

[Maziarz et al. 2016] Marek Maziarz, Maciej Piasecki, Ewa Rudnicka, Stan Szpakowicz, and Paweł Kędzia. 2016. plWordNet 3.0 – a Comprehensive Lexical-Semantic Resource.

[Pedersen et al. 2009] Bolette S. Pedersen, Sanni Nimb, Jørg Asmussen, Nicolai Hartvig Sørensen, Lars Trap-Jensen, and Henrik Lorentzen. 2009. DanNet: the challenge of compiling a wordnet for Danish by reusing a monolingual dictionary.