Many people involved in building or using CLARIN resources will be at the 10th Language Resources and Evaluation Conference (LREC 2016), to be held 23-28 May 2016 in Portorož, Slovenia. Below is a snapshot of some of the workshops and papers in which they are involved. The latest news from CLARIN at LREC is also available on the Twitter feed @CLARINERIC.
Workshops
ID | Workshop title | CLARIN contact |
W5 | Cross-Platform Text Mining and Natural Language Processing Interoperability | Richard Eckart de Castilho (Technische Universität Darmstadt, Germany) |
W18 | Translation evaluation – From fragmented tools and data sets to an integrated ecosystem | Jan Hajič (Charles University, Czech Republic) |
W31 | Improving Social Inclusion using NLP: Tools and resources | Ineke Schuurman (University of Leuven, Belgium) |
W9 | Resources and ProcessIng of linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric impairments | Jens Edlund (KTH - Royal Institute of Technology, Sweden) |
W42 | Legal Issues | Erik Ketzan & Andreas Witt (Institut für Deutsche Sprache, Mannheim, Germany), Stelios Piperidis (Athena Research Center/ILSP, Athens, Greece) |
W23 | 4Real - Research Results Reproducibility and Resources Citation in Science and Technology of Language | António Branco (University of Lisbon) |
W36 | Normalisation and Analysis of Social Media Texts (NormSoMe) | Andrius Utka (Vytautas Magnus University, Kaunas) |
Papers
ID | Paper title | Authors |
19 | A corpus of images and text in online news | Laura Hollink, Adriatik Bedjeti, Martin van Harmelen and Desmond Elliott |
73 | VPS-GradeUp: Graded Decisions on Usage Patterns | Vít Baisa, Silvie Cinková, Ema Krejčová and Anna Vernerová |
104 | Falling silent, lost for words ... Tracing personal involvement in interviews with Dutch war veterans | Henk van den Heuvel and Nelleke Oostdijk |
223 | Curation of Dutch Regional Dictionaries | Nicoline van der Sijs, Eric Sanders, Henk van den Heuvel and Aukje Borkent |
306 | The SemDaX corpus – sense annotations with scalable sense inventories | Bolette Pedersen, Anna Braasch, Anders Johannsen, Héctor Martínez Alonso, Sanni Nimb, Sussi Olsen, Anders Søgaard and Nicolai Hartvig Sørensen |
337 | South African National Centre for Digital Language Resources | Justus Roux |
348 | Universal Dependencies v1: A Multilingual Treebank Collection | Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Yoav Goldberg, Jan Hajič, Christopher Manning, Ryan McDonald, Slav Petrov, Sampo Pyysalo, Natalia Silveira, Reut Tsarfaty and Daniel Zeman |
361 | Corpus-based diacritic restoration for South Slavic languages | Nikola Ljubešić, Tomaž Erjavec and Darja Fišer |
362 | AfriBooms: An online treebank for Afrikaans | Liesbeth Augustinus, Peter Dirix, Daniel Van Niekerk, Ineke Schuurman, Vincent Vandeghinste, Frank Van Eynde and Gerhard Van Huyssteen |
401 | CINTIL DependencyBank PREMIUM - A corpus of grammatical dependencies for Portuguese | Rita de Carvalho, Andreia Querido, Marisa Campos, Rita Valadas Pereira, João Silva and António Branco |
419 | Using a Language Technology Infrastructure for German in order to Anonymize German Sign Language Corpus Data | Thomas Hanke |
476 | FLAT: constructing a CLARIN compatible home for language resources | Menzo Windhouwer, Marc Kemps-Snijders, Paul Trilsbeek, André Moreira, Bas Van der Veen and Guilherme Silva |
486 | Poly-GrETEL: Cross-Lingual Example-based Querying of Syntactic Constructions | Liesbeth Augustinus, Vincent Vandeghinste and Tom Vanallemeersch |
502 | The BAS speech data repository | Uwe Reichel, Florian Schiel, Thomas Kisler and Christoph Draxler |
506 | Graded and Word-Sense-Disambiguation decisions in Corpus Pattern Analysis: a pilot study | Vít Baisa, Silvie Cinková, Ema Krejčová and Anna Vernerová |
526 | CLARIAH in the Netherlands | Jan Odijk |
572 | European Union Language Resources in Sketch Engine | Vít Baisa, Jan Michelfeit and Marek Medveď |
596 | OCR post-correction evaluation of Early Dutch Books Online -- revisited | Martin Reynaert |
613 | Leveraging RDF Graphs for Crossing Multiple Bilingual Dictionaries | Marta Villegas, Maite Melero, Núria Bel and Jorge Gracia |
668 | BAS Speech Science Web Services - an Update of Current Developments | Thomas Kisler, Uwe Reichel, Florian Schiel, Christoph Draxler, Bernhard Jackl and Nina Pörner |
709 | If You Even Don't Have a Bit of Bible: Learning Delexicalized POS Taggers | Zhiwei Yu, David Mareček, Daniel Zeman and Zdeněk Žabokrtský |
766 | Fostering the Next Generation of European Language Technology: Recent Developments – Emerging Initiatives – Challenges and Opportunities | Georg Rehm, Jan Hajič, Josef van Genabith and Andrejs Vasiļjevs |
783 | Facilitating metadata interoperability in CLARIN-DK | Lene Offersgaard and Dorte Haltrup Hansen |
811 | Corpus vs. lexicon supervision in morphosyntactic tagging: the case of Slovene | Nikola Ljubešić and Tomaž Erjavec |
880 | The Public License Selector: Making Open Licensing Easier | Pawel Kamocki, Pavel Straňák and Michal Sedlák |
887 | Towards Comparability of Linguistic Graph Banks for Semantic Parsing | Stephan Oepen, Marco Kuhlmann, Yusuke Miyao, Daniel Zeman, Silvie Cinková, Dan Flickinger, Jan Hajič, Angelina Ivanova and Zdeňka Urešová |
936 | Czech Legal Text Treebank 1.0 | Vincent Kríž, Barbora Hladká and Zdeňka Urešová |
990 | CLARIN-EL Web-based Annotation Tool | Ioannis Manousos Katakis, Georgios Petasis and Vangelis Karkaletsis |
1012 | QTLeap WSD/NED corpora: Semantic annotation of parallel corpora in six languages | Arantxa Otegi, Nora Aranberri, António Branco, Jan Hajič, Martin Popel, Kiril Simov and Eneko Agirre |
1070 | NLP Infrastructure for the Lithuanian Language | Daiva Vitkutė-Adžgauskienė, Andrius Utka, Darius Amilevičius and Tomas Krilavičius |
1131 | MWEs in Treebanks: From Survey to Guidelines | Victoria Rosén, Koenraad De Smedt, Gyri Smørdal Losnegaard, Eduard Bejček, Agata Savary, Adam Przepiórkowski and Verginica Mititelu |
1137 | Corpus Query Lingua Franca (CQLF) | Piotr Banski, Elena Frick and Andreas Witt |
1141 | Improving corpus search via parsing | Natalia Klyueva and Pavel Straňák |
1150 | Providing a Catalogue of Language Resources for Commercial Users | Bente Maegaard, Lina Henriksen, Andrew Joscelyne, Vesna Lusicky, Margaretha Mazura, Sussi Olsen, Claus Povlsen and Philippe Wacker |
1154 | Corpus Analysis based on Structural Phenomena in Texts: Exploiting TEI Encoding for Linguistic Research | Susanne Haaf |
1236 | Hidden resources – strategies to acquire and exploit potential spoken language resources in national archives | Jens Edlund and Joakim Gustafson |
The full list of papers and workshops for the conference is available on the LREC 2016 proceedings website.