General Information
Key Deadlines
About the Workshop
The CLARIN 2024 Post-Conference workshop on Comparable and Interoperable Corpora of Academic Texts aims to bring together experts and enthusiasts from CLARIN partners to discuss the creation, management, and application of academic text corpora.
Academic texts, such as papers and theses, serve as a means of sharing innovative research findings, theories, and methodologies across the academic community. Often accessible via open-source infrastructures, they are invaluable resources for comparative large-scale language research, including terminology extraction. Analysing academic texts from various disciplines also promotes cross-disciplinary insights and interdisciplinary collaboration.
The workshop will mainly focus on written text data; research on other data can be presented as well.
Programme
14:00 | Welcome & opening |
14:10 | Presentation 1: Tomaž Erjavec, "Corpora of Slovenian academic texts" |
14:30 | Presentation 2: Vanja Štefanec, Daša Farkaš, Matea Filko and Marko Tadić, "Croatian Scientific Corpus" |
14:50 | Presentation 3: Roberts Darģis, "Corpus of Latvian PhD Theses" |
15:10 | Presentation 4: Marc Kupietz, Peter Leinen and Nils Diewald, "Towards a Very Large German Academic Corpus - Step 1: Building and Making Available a Corpus of 10,000 Doctoral Dissertations" |
15:30 | Coffee break |
15:50 | Presentation 5: Anje Müller Gjesdal and Marita Kristiansen, "Academic corpora and specialised neology: examples from the nature and environment subject fields" |
16:10 | Presentation 6: Sofia Nasopoulou, "Transforming research publications into a Knowledge Graph" |
16:30 | Conclusion and outlook: next steps |
Topics of Interest
We welcome submissions on a wide range of topics related to the development and utilisation of comparable and interoperable corpora of academic texts, including but not limited to:
• Design and creation of monolingual and multilingual academic text corpora
• Linguistic annotation and metadata of academic text corpora
• Standards and methods for interoperability and comparability of academic text corpora
• Use cases and applications for academic text corpora across various academic disciplines
• Ethical and legal considerations in data collection.
Submission Guidelines
Authors are invited to present their ideas and existing resources at the workshop. Extended abstracts should include a description of the activities, as well as the names and affiliations of the presenters. Please prepare your abstracts according to the following guidelines:
• Length: Extended abstracts of 500 to 1000 words (without references)
• Language: Submissions should be written in English
• Submission: Via the conference management system [link].
Submission for this workshop is closed.
If you are attending the CLARIN Annual Conference and would like to join this workshop, feel free to send an email to events[at]clarin[dot]eu.
Publication
Selected workshop abstracts will be published in the CLARIN2024 Conference Proceedings. Full papers based on workshop presentations may be submitted for publication in the CLARIN2024 post-conference volume.
Accommodation Information
Funds to cover accommodation expenses (max. 2 nights) are available for workshop participants. For attendees of the CLARIN Annual Conference 2024, one additional night of accommodation is covered.
Contact Information
For more information about the workshop or in case of any questions, please contact Andreas Witt (witt[at]ids-mannheim[dot]de) and Laura Herzberg (herzberg[at]ids-mannheim[dot]de).
We look forward to your submissions and to welcoming you to Barcelona in October!
The Workshop Committee
Tomaž Erjavec
Laura Herzberg
Tanja Wissik
Andreas Witt
Barcelona
Spain