
Workshop on Comparable and Interoperable Corpora of Academic Texts @CLARIN2024


General Information

Date: 17 October 2024
Time: 14:00 - 18:00 CEST
Location: Barceló Sants, Barcelona, Spain (at CLARIN2024)

Key Deadlines

Submission deadline: 15 July 2024
Notification of acceptance: 9 August 2024 (closed)
If you are attending the CLARIN Annual Conference and would like to join this workshop, feel free to send an email to events[at]clarin[dot]eu.

About the Workshop

The CLARIN 2024 Post-Conference workshop on Comparable and Interoperable Corpora of Academic Texts aims to bring together experts and enthusiasts from CLARIN partners to discuss the creation, management, and application of academic text corpora.

Academic texts, such as academic papers and theses, serve as sources for sharing innovative research findings, theories, and methodologies across the academic community. Often accessible via open-source infrastructures, they are invaluable resources for comparative large-scale language research, including terminology extraction. Analysing academic texts from various disciplines also promotes cross-disciplinary insights and interdisciplinary collaboration.

The workshop will mainly focus on written text data; research on other data can be presented as well.



Programme

14:00 Welcome & opening

Presentation 1: Tomaž Erjavec

Corpora of Slovenian academic texts


Presentation 2: Vanja Štefanec, Daša Farkaš, Matea Filko and Marko Tadić

Croatian Scientific Corpus


Presentation 3: Roberts Darģis

Corpus of Latvian PhD Theses


Presentation 4: Marc Kupietz, Peter Leinen and Nils Diewald

Towards a Very Large German Academic Corpus - Step 1: Building and Making Available a Corpus of 10,000 Doctoral Dissertations

15:30 Coffee break

Presentation 5: Anje Müller Gjesdal and Marita Kristiansen

Academic corpora and specialised neology: examples from the nature and environment subject fields


Presentation 6: Sofia Nasopoulou

Transforming research publications into a Knowledge Graph

16:30 Conclusion and outlook: next steps


Topics of Interest

We welcome submissions on a wide range of topics related to the development and utilisation of comparable and interoperable corpora of academic texts, including but not limited to:
•    Design and creation of monolingual and multilingual academic text corpora
•    Linguistic annotation and metadata of academic text corpora
•    Standards and methods for interoperability and comparability of academic text corpora
•    Use cases and applications for academic text corpora across various academic disciplines
•    Ethical and legal considerations in data collection.

Submission Guidelines

Authors are invited to present their ideas and existing resources at the workshop. Extended abstracts should include a description of the activities, as well as the names and affiliations of the presenters. Please prepare your abstracts according to the following guidelines:

•    Length: Extended abstracts of 500 to 1000 words (without references)
•    Language: Submissions should be written in English
•    Submission: Via the conference management system [link].

Submission for this workshop is closed.



Selected workshop abstracts will be published in the CLARIN2024 Conference Proceedings. Full papers based on workshop presentations may be submitted for publication in the CLARIN2024 post-conference volume.

Accommodation Information

Funds to cover accommodation expenses (max. 2 nights) are available for workshop participants. For attendees of the CLARIN Annual Conference 2024, one additional night of accommodation is covered.

Contact Information

For more information about the workshop or in case of any questions, please contact Andreas Witt (witt[at]ids-mannheim[dot]de) and Laura Herzberg (herzberg[at]ids-mannheim[dot]de).

We look forward to your submissions and to welcoming you to Barcelona in October!

The Workshop Committee
Tomaž Erjavec
Laura Herzberg
Tanja Wissik
Andreas Witt

