Conference Programme Outline
9:00 – 10:30 |
|
|
10:30 - 11:00 | Coffee break | |
11:00 - 13:00 |
|
|
13:00 - 14:00 | Lunch break | |
14:00 - 15:30 |
|
|
15:30 - 16:00 | Coffee break | |
16:00 - 16:15 |
|
Aula
|
16:15 - 17:00 | Keynote Jörg Tiedemann |
Aula
|
17:00 - 18:00 | Papers (Poster Format) | Dining room wing |
18:15 - 18:30 | Walk to Leuven Town Hall | |
18:30 - 19:20 | Welcome Reception |
Historic Town Hall
Grote Markt 9
|
19:30 - 22:00 | Welcome Dinner |
Domus
Tiensestraat 8
|
09:00 - 09:10 | Presentation by Programme Committee Chair | Aula |
09:10 - 09:15 | Presentation by Local National Coordinator | Aula |
09:15 - 10:00 | Pitches by CLARIN Committees | Aula |
10:00 - 10:30 | State of the Technical Infrastructure | Aula |
10:30 - 11:00 | Coffee Break | |
11:00 - 13:00 | Abstract Presentations (Infrastructure) | Aula |
11:00 - 13:00 | Teachers' Workshop: Using CLARIN in Training and Education | CR2 |
13:00 - 13:45 | Lunch | |
13:45 - 14:30 | PhD Poster Session | Dining room wing |
14:30 - 15:30 | Abstract Presentations (ParlaMint) | Aula |
15:30 - 16:00 | Coffee Break | |
16:00 - 17:20 | Abstract Presentations (Tools) | Aula |
17:30 - 19:00 |
Bazaar Poster Session
For an overview of all posters, please consult the Bazaar page
|
|
19:30 - 22:30 | Conference Dinner |
Faculty Club
Groot Begijnhof 14
|
09:00 - 10:20 | Abstract Presentations (Corpora) | Aula |
10:20 - 11:00 | Group Photo and Coffee Break | |
11:00 - 11:45 | Keynote by Laurence Devillers | Aula |
11:45 - 12:45 |
Abstract Presentations (Metadata and Annotations)
|
Aula |
12:45 - 13:00 |
Closing Remarks
|
Aula |
13:00 - 14:00 | Lunch | |
14:00 - 16:00 | SAB Meeting | Board room |
14:00 - 17:00 |
K-Centre Workshop (Part I) (Invite-only)
SSH Marketplace Workshop
EuReCo Workshop (Invite-only)
|
|
Keynotes
Lost in Meaning - Found in Translation
Jörg Tiedemann
University of Helsinki
Monday 16 October, 16:15 - 17:00
|
Ethical Issues of Generative AI
Laurence Devillers
University Paris-Sorbonne IV/LIMSI CNRS
Wednesday 18 October, 11:00 - 11:45
|
Conference Programme Details
Day One
Time | Monday 16 October 2023 | Room |
9:00 – 10:30 |
|
|
10:30 - 11:00 | Coffee break |
|
11:00 - 13:00 |
|
|
13:00 - 14:00 | Lunch break | |
14:00 - 15:30 |
|
|
15:30 - 16:00 | Coffee break | |
Start of the Conference | ||
16:00 - 16:15 |
|
Aula
|
16:15 - 17:00 |
Keynote by Jörg Tiedemann
Lost in Meaning - Found in Translation: Natural Language Understanding with Multilingual Data (slides)
The task of translation involves language understanding and generation and, in this way, naturally combines the two essential challenges in computational linguistics and language technology. In the FoTran project, we are interested in the ability of neural translation models to pick up linguistic properties and to generalise to meaningful representations when trained on large amounts of multilingual data. Our focus is on the effect of linguistic diversity on abstraction and generalisation. In order to study this, we need to create the necessary resources and infrastructure. In this talk, I will first introduce the OPUS ecosystem that fuels our research. In the second part, I will concentrate on the experiments, studies and developments that this ecosystem enables within and outside of FoTran. I also welcome discussion of further directions that can be taken with the multilingual infrastructure we are building, and I look forward to your input.
|
Aula |
17:00 - 18:00 |
Papers (Poster Format)
Linguistic Resources and Tools for Ukrainian: Grounds for Creating a K-Centre
Olha Kanishcheva and Maria Shvedova
The Making of the CLARIN Resource Family for Oral History: Lessons Learned from ‘Voices from Ravensbrück’ (poster)
Stefania Scagliola, Silvia Calamai, Henk Van Den Heuvel and Christoph Draxler
Libraries as Data Infrastructures
Martin Wynne, Andreas Witt, Peter Leinen and Sally Chambers
A Continuous Integration (CI) Workflow for Quality Assurance Checks for Corpora of Multimodal Interaction (poster)
Anne Ferger, André Frank Krause and Karola Pitsch
The LiRI Corpus Platform (poster)
Jonathan Schaber, Johannes Graën, Daniel McDonald, Igor Mustač, Nikolina Rajović, Gerold Schneider and Noah Bubenhofer
DBBErt: Part-of-Speech Tagging of Pre-Modern Greek Text
Colin Swaelens, Els Lefever and Ilse De Vos
A Multilingual Database for Icelandic L2 Flashcards
Xindan Xu, Þórunn Arnardóttir and Anton Karl Ingason
Korpusnik: A Corpus Summarizing Tool for Slovene
Iztok Kosem, Jaka Čibej, Kaja Dobrovoljc and Simon Krek
Topics in Swedish News on Climate Change: A Timeline 2016 - 2023
Maria Skeppstedt
Sharing the Finnish Dark Web Marketplace Corpus (FINDarC) (poster)
Krister Lindén, Teemu Ruokolainen, Lasse Hämäläinen and Tuomas Harviainen
Swissdox@LiRI – A Large Database of Media Articles Made Accessible to Researchers (poster)
Johannes Graën, Igor Mustač, Nikolina Rajović, Jonathan Schaber, Gerold Schneider and Noah Bubenhofer
Analyses of Information Security Standards on Data Crawled from Company Web Sites Using SweClarin Resources
Arne Jönsson, Subhomoy Bandyopadhyay, Svjetlana Pantic Dragisic and Andrea Fried
Building and Consolidating a FAIR-Compliant Ecosystem of Infrastructures
Cristina Grisot, Noah Bubenhofer, Andrea Malits, Stefanie Strebel, Johannes Graën and Stefan Buerli
Dynamically Chaining APIs: from Dracor to TEITOK
Maarten Janssen
The ACoDe Project: Creating a Dementia Corpus for Icelandic
Elena Callegari, Anton Karl Ingason and Agnes Sólmundsdóttir
Emotion and Abstractness in Austrian Parliamentary Discourse
Tanja Wissik and Klaus Hofmann
Developing Manually-Annotated Corpora for Teaching and Learning Purposes of Brazilian Portuguese, Dutch, Estonian, and Slovene (the CrowLL Project)
Tanara Zingano Kuhn, Carole Tiberius, Špela Arhar Holdt, Kristina Koppel, Iztok Kosem, Rina Zviel Girshin and Ana R. Luís |
Dining room wing |
18:15 - 18:30 | Walk to Leuven Town Hall | |
18:30 - 19:30 | Welcome Reception |
Historic Town Hall
Grote Markt 9
|
19:30 - 22:00 | Welcome Dinner |
Domus
Tiensestraat 8
|
Day Two
Time | Tuesday 17 October 2023 | Room |
09:00 - 09:10 | Presentation by Programme Committee Chair (slides) | Aula |
09:10 - 09:15 | Presentation by Local National Coordinator |
Aula
|
09:15 - 10:00 | Pitches by CLARIN Committees (slides) |
Aula
|
10:00 - 10:30 | State of the Technical Infrastructure (slides) | Aula |
10:30 - 11:00 | Coffee Break | |
11:00 - 13:00 |
Thematic Session: Infrastructure Chair: Jurgita Vaičenonienė |
Aula
|
11:00 - 11:20 |
Standards Information System for CLARIN Centres and Beyond (slides)
Piotr Banski and Eliza Margaretha Illig
|
|
11:20 - 11:40 |
The CLARIN:EL Infrastructure (slides)
Maria Gavriilidou, Stelios Piperidis, Dimitrios Galanis, Juli Bakagianni, Penny Labropoulou, Athanasia Kolovou, Dimitris Gkoumas, Miltos Deligiannis, Kanella Pouli, Iro Tsiouli, Leon Voukoutis and Katerina Gkirtzou
|
|
11:40 - 12:00 |
NB DH-LAB: A Corpus Infrastructure for Social Sciences and Humanities (slides)
Magnus Breder Birkenes, Lars G. Johnsen and Andre Kåsen
|
|
12:00 - 12:20 |
CORLI CLARIN K-Centre: Development and Perspectives (slides)
Christophe Parisse and Céline Poudat
|
|
12:20 - 12:40 |
The SSH Open Marketplace and CLARIN (slides)
Alexander König, Laure Barbot, Cristina Grisot, Michael Kurzmeier and Edward J. Gray
|
|
12:40 - 13:00 |
CLARIN-IT: Texts, Documents and New Contexts (slides)
Federico Boschetti, Angelo Mario Del Grosso, Riccardo Del Gratta, Francesca Frontini and Monica Monachini
|
|
11:00 - 13:00 |
Teachers' Workshop: Using CLARIN in Training and Education (slides)
For more information about the abstracts, please visit the workshop programme page.
11:00 - 12:00 Presentations of Accepted Abstracts
11:00 - 11:10 Welcome and Introduction
Francesca Frontini
11:10 - 11:20 Privacy by Design in Linguistic Research
Henk van den Heuvel
11:20 - 11:30 Teaching Syntax with CLARIN Corpora and Resources
Antonio Balvet
11:30 - 11:40 Learning Programming in Python for Linguistics and Language Studies
Koenraad De Smedt
11:40 - 11:50 NLP Annotation for Digital Scholars
Maarten Janssen and Silvie Cinková
11:50 - 12:00 DH-Course Registry: A Bridge Between Infrastructures, DH Masters Degrees and Industry?
Amelia Sanz, Vicky Garnett, Tom Gheldof, Adeline Joffres, Iulianna van der Lek, Edward Gray,
12:00 - 12:10 Discussion
12:10 - 13:00 Demo of the CLARIN Learning Content in the UPSKILLS project
12:10 - 12:20 Introduction to the UPSKILLS Project
Stavros Assimakopoulos
12:20 - 12:35 Introduction to Language Data: Standards and Repositories
Iulianna van der Lek
12:35 - 12:50 Automatic Speech Recognition and Forced Alignment
Louis ten Bosch
12:50 - 13:00 Discussion & Wrap-Up |
CR2 |
13:00 - 13:45 | Lunch |
|
13:30 - 14:30 | PhD Poster Session | Dining room wing |
14:30 - 15:30 |
Thematic Session: ParlaMint Chair: Maciej Piasecki |
Aula |
14:30 - 14:50 |
The ParlaMint Project: Ever-Growing Family of Comparable and Interoperable Parliamentary Corpora (slides)
Maciej Ogrodniczuk, Petya Osenova, Tomaž Erjavec, Darja Fišer, Nikola Ljubešić, Çağrı Çöltekin, Matyáš Kopp, Katja Meden and Taja Kuzman |
|
14:50 - 15:10 |
Workflow and Metadata Challenges in the ParlaMint Project: Insights from Building the ParlaMint-UA Corpus (slides)
Anna Kryvenko and Matyáš Kopp
|
|
15:10 - 15:30 |
Adding Political Orientation Metadata to ParlaMint Corpora (slides)
Tomaž Erjavec, Katja Meden and Jure Skubic
|
|
15:30 - 16:00 | Coffee Break |
|
16:00 - 17:20 |
Thematic Session: Tools Chair: Vincent Vandeghinste |
Aula
|
16:00 - 16:20 |
MATEO: Machine Translation Evaluation for Users and Developers (slides)
Bram Vanroy
|
|
16:20 - 16:40 |
Domain-Specific Languages for Epigraphy: The Case of ItAnt (slides)
Luca Rigobianco, Federico Boschetti and Valeria Quochi
|
|
16:40 - 17:00 |
Finding Dutch Multiword Expressions (slides)
Jan Odijk, Martin Kroon, Tijmen Baarda, Ben Bonfil and Sheean Spoel
|
|
17:00 - 17:20 |
Automatic Anonymisation of Human Faces in Images of Authentic Social Interaction: A Web Application (slides)
André Frank Krause, Anne Ferger and Karola Pitsch
|
|
17:30 - 19:00 | Bazaar Poster Session | Dining room wing |
19:30 - 22:30 | Conference Dinner |
Faculty Club
Groot Begijnhof 14
|
Day Three
Time | Wednesday 18 October 2023 | Room |
09:00 - 10:20 |
Thematic Session: Corpora Chair: Tomaž Erjavec |
Aula |
09:00 - 09:20 |
A Spoken Academic Belgian Dutch Corpus (slides)
Vincent Vandeghinste, Jolien Mathysen, Patrick Wambacq and Elke Peters
|
|
09:20 - 09:40 |
NGT-HoReCo and GoSt-ParC-Sign: Two New Sign Language - Spoken Language Parallel Corpora (slides)
Mirella De Sisto, Dimitar Shterionov, Lien Soetemans, Vincent Vandeghinste and Caro Brosens
|
|
09:40 - 10:00 |
Teaching Syntax with CLARIN Corpora and Resources (slides)
Antonio Balvet
|
|
10:00 - 10:20 |
A New CLARIN Resource Family for Lexical Semantic Change Research (slides)
Paola Marongiu, Fahad Khan and Barbara McGillivray
|
|
10:20 - 11:00 | Group Photo and Coffee Break |
|
11:00 - 11:45 |
Keynote by Laurence Devillers Ethical Issues of Generative AI (slides)
In this keynote, I offer studies and reflections on the ethical issues of generative artificial intelligence (AI). The special feature of generative artificial intelligence systems is that they are based on generative models that can produce multiple outputs: generation of text or images for various purposes such as translation, production of computer code, chatbots, decision support and so on. These models, pre-trained on large datasets, can be optimised to produce a new application using little additional data specific to that task. The social and economic impact of generative AI systems is likely to be major in many potential uses, for example, in the environment or in healthcare. However, these generative AI systems raise many ethical, epistemological, anthropological, psychological, economic, social, political and cultural questions. Some of these issues will continue to occur as these technologies are put to new uses, and it is not yet possible to predict all the effects they will have on individuals and society. Since the end of 2022, economic and political actors in several countries have been discussing the impact of language models built with these generative AI systems. Some of these models have an impressive number of parameters. The race for the largest model is ongoing, but it is not certain that larger models would deliver higher performance. I was a co-author of Opinion No. 7, on the ethical issues of generative artificial intelligence, of the CNPEN (National Pilot Committee for Digital Ethics). In this opinion, the CNPEN focuses on the most important ethical issues in light of current experience with generative AI systems, mainly those concerning language models.
|
Aula
|
11:45 - 12:45 |
Thematic Session: Metadata and Annotations Chair: Andreas Witt
|
Aula
|
11:45 - 12:05 |
Documenting Corpus Annotation in CMDI: State of Affairs (slides)
Jakob Lenardič
|
|
12:05 - 12:25 |
Do Chatbots Dream of Copyright? Copyright in AI-generated Language Data (slides)
Pawel Kamocki, Toby Bond, Krister Lindén and Thomas Margoni
|
|
12:25 - 12:45 |
Between Lexicon and Grammar: Towards Integrated Valencies for Bulgarian (slides)
Petya Osenova and Kiril Simov
|
|
12:45 - 13:00 |
Closing Remarks
|
Aula |
13:00 - 14:00 | Lunch |
|
14:00 - 16:00 | SAB Meeting |
Board room
|
14:00 - 17:00 |
K-Centre Workshop (Part I) (Invite-only)
Annual workshop for K-centre representatives, see the event page.
SSH Marketplace Workshop
This workshop aims to support researchers interested in creating a workflow in the SSH Open Marketplace. Following a brief presentation of what the SSH Open Marketplace is and how it works, participants will be supported by members of the Editorial Board of this discovery portal to write and document their research scenarios, based on the use of CLARIN tools, services and data - for example the CLARIN Resource Families or tools from the Language Resource Switchboard. Workflows are an ideal way to share one’s research resources, and harness the power of the SSH Open Marketplace to contextualise tools and services with publications, datasets, and training resources, thus presenting a research activity from A to Z in an easy-to-follow and reproducible way.
EuReCo Workshop (Invite-only)
The EuReCo workshop brings together representatives of National Corpora from CLARIN countries. Its aim is to explore the possibilities of launching an initiative toward a large multilingual and distributed reference corpus for European languages that would connect these existing resources. Such an initiative could potentially develop into a new CLARIN flagship project. It would enable linguists to explore corpora of different languages, especially annotated ones, by means of the CLARIN infrastructure. Eventually, this project could lead to the creation of a large comparable corpus of European languages accessible through a single access point. For more details, including the agenda, please refer to this link.
|
|
Day Four
Time | Thursday 19 October 2023 | Room |
09:00 - 13:00 | K-Centre Workshop (Part II) (Invite-only) | CR2 |