‘Voices from Ravensbrück’ Project
Read about the ‘Voices from Ravensbrück’ project which aims to give voice to women survivors from Ravensbrück concentration camp and to make the data openly accessible to researchers, schools and communities.
In the words of the famous language philosopher Ludwig Wittgenstein: "the meaning of a word is its usage in the language" (Philosophical Investigations, Part I, section 43). In other words, the meaning of a word can be revealed by the context in which it appears. An ambiguous word such as ‘bank’ can be disambiguated given its context: the ‘bank’ bounding a body of water will tend to occur together with terms like “river”, “lake”, or “slope”, while the ‘bank’ which is a financial institution will tend to occur together with expressions like “money”, “cheque”, or “go to”.
Changes in a word's meaning will therefore often be directly associated with changes in its characteristic combinations (the set of words with which it typically occurs together, its collocates). Even political, cultural, or social changes relating to a central term can be revealed and traced through its typical combinations (see the example for ‘revolution’ below).
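The idea of tracing a word's collocates over time can be illustrated with a minimal Python sketch. It is not DiaCollo's implementation: it uses raw co-occurrence counts within a fixed token window rather than a statistical association score, and the function name, its parameters and the toy sentences are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def collocates_by_slice(docs, target, window=3, slice_size=10, kbest=10):
    """Count the words co-occurring with `target`, grouped by time slice.

    docs: iterable of (year, text) pairs (hypothetical toy format).
    Returns {slice_start_year: [(collocate, count), ...]}.
    """
    profiles = defaultdict(Counter)
    for year, text in docs:
        tokens = text.lower().split()
        # Map the year onto the start of its slice, e.g. 1905 -> 1900.
        slice_start = (year // slice_size) * slice_size
        for i, token in enumerate(tokens):
            if token != target:
                continue
            # Collect up to `window` tokens on either side of the target.
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            profiles[slice_start].update(left + right)
    # Keep only the `kbest` most frequent collocates per slice.
    return {s: c.most_common(kbest) for s, c in sorted(profiles.items())}


# Invented example sentences, dated to two different decades.
docs = [
    (1905, "the river bank was steep"),
    (1908, "a steep bank by the river"),
    (1995, "the bank cashed the cheque"),
    (1998, "money in the bank account"),
]
profiles = collocates_by_slice(docs, "bank")
for slice_start, collocates in profiles.items():
    print(slice_start, collocates)
```

In this toy data, "river" and "steep" appear as collocates of "bank" only in the earlier slice, while "cheque" and "money" appear only in the later one. Function words such as "the" also show up, which is one reason real tools rank collocates by association scores rather than raw counts.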
DiaCollo is a software tool for the discovery, comparison, and interactive visualization of the typical word combinations for a user-specified target term. Characteristic word combination profiles based on various underlying text corpora can be requested for a particular time period, as well as direct comparisons between different time periods. In addition to traditional static tabular display formats, a number of intuitive interactive online visualizations for query result data are also available.
You can modify the basic recipe above in various ways, for example by changing the queried time period (DATE) and/or the size of the intervals on the time-line (SLICE). You can also change the maximal number of displayed collocates (KBEST) or the mode of visual presentation (FORMAT). Additional corpora and further modes of application are also available. For instance, you can use DiaCollo to display the differences or the similarities between two different words on the basis of their typical collocates over a given time period, or to directly compare the typical collocates of a single word in two different time periods. Further details and examples can be found in the full CLARIN-D DiaCollo use-case (in German), as well as in DiaCollo's online help pages.
A more detailed guide with examples in German is available in PDF format.
DiaCollo is a use case of the CLARIN-D centre in the Berlin-Brandenburg Academy of Sciences and Humanities (BBAW).
Participating projects:
Related CLARIN-D tools and services
Nederlab is a user-friendly and tool-enriched open access web interface that aims at containing all digitized texts relevant for the Dutch national heritage and the history of Dutch language and culture (c. 800 - present).
The Nederlab project aims to bring together all digitized texts relevant to Dutch national heritage and the history of Dutch language and culture (c. 800 - present) in one user-friendly and tool-enriched open access web interface, allowing scholars to simultaneously search and analyze data from texts spanning the full recorded history of the Netherlands, its language and culture. The project builds on various initiatives: for corpora Nederlab collaborates with scientific libraries and institutions, for infrastructure with CLARIN (and CLARIAH), for tools with eHumanities programmes such as Catch, IMPACT and CLARIN (TICCL, frog).
Nederlab allows researchers to search and refine its content on the basis of metadata, text and several layers of annotations for this text, such as lemmata, part-of-speech tags, named entities or syntactic annotations. These enrichments are added during a preprocessing stage that also applies automatic spelling normalization. Search results can be inspected one by one, via lists or keyword-in-context concordances, but also in several aggregated forms. For example, results can simultaneously be grouped on the basis of publication date and genre and then displayed as visualisations or exported, or they can be presented as collocations. Statistics about the result set are available as well, as are frequency lists over any subcollection. Search results can be stored as virtual collections in the researcher’s personal workspace. A range of tools will be available in this workspace to analyse the collections or to compare them to each other.
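Grouping results simultaneously by publication date and genre, as described above, can be sketched in a few lines of Python. This is only an illustration of the aggregation idea: the record fields (`year`, `genre`), the function name and the sample hits are hypothetical and do not reflect Nederlab's actual data model or interface.

```python
from collections import Counter


def group_hits(hits, bin_size=50):
    """Count search hits per (period start, genre) pair."""
    counts = Counter()
    for hit in hits:
        # Map the publication year onto the start of its period.
        period = (hit["year"] // bin_size) * bin_size
        counts[(period, hit["genre"])] += 1
    return counts


# Invented search hits with hypothetical metadata fields.
hits = [
    {"year": 1637, "genre": "religious"},
    {"year": 1642, "genre": "poetry"},
    {"year": 1648, "genre": "poetry"},
    {"year": 1701, "genre": "newspaper"},
]
bin_size = 50
grouped = group_hits(hits, bin_size)
for (period, genre), n in sorted(grouped.items()):
    print(f"{period}-{period + bin_size - 1}  {genre:<10} {n}")
```

The resulting counts could then be passed to a plotting library or written out as a table, which corresponds to the visualisation and export options mentioned in the text.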
The first version of Nederlab was launched in early 2015; it will be expanded until the end of 2017.
Meertens Instituut in collaboration with Huygens ING and the Institute for Dutch Lexicology.
prof. dr. Hans Bennis
Hennie Brugman
Netherlands
Dutch
Nederlab is financed by NWO, KNAW, CLARIAH and CLARIN-NL.
The integrated language bank of the Dutch Language Institute offers online access to a number of historical dictionaries, including Old, Middle and Modern Dutch, and the Frisian language.
A corpus is a collection of texts in electronic form used for linguistic research, provided with digital tools to allow searching and analysis. Users can use these tools to find words and collocations in their original contexts, and determine their frequency in the corpus. The Czech National Corpus (CNC) is an academic project focusing on building a large electronic corpus of mainly written Czech. The Institute of the Czech National Corpus (ICNC), Faculty of Arts, Charles University in Prague oversees the development of the CNC, including its use in teaching, and advances the field of corpus linguistics.
Das Digitale Wörterbuch der deutschen Sprache (Digital Dictionary of the German Language) provides a wealth of information on the German language in its contemporary and historical forms, with more than 410,000 entries from five dictionaries, 1.8 million words in 15 corpora, and word profiles and trends based on frequencies.
The Korp online resource offers the opportunity to search a wealth of language resources (mostly) in the Finnish language, from a range of time periods. The Korp software was originally developed and is actively maintained by The Swedish Language Bank.