CLARIN Workshop: NLP Tools for Historical Documents

Monday, 9 September 2019, 00:00 - Wednesday, 11 September 2019, 00:00

Experts on tools for working with historical texts will meet to exchange ideas and experiences about tools and methods, and to develop a resource guide and a plan of action for integrating more tools into the CLARIN infrastructure. Participants will be invited from across the CLARIN community.

Workshop aim

The workshop will bring together people who are creating or working with NLP tools (especially tokenizers, normalizers, morphological analyzers, part-of-speech taggers and lemmatizers) for historical language varieties, especially European languages in the period 1500-1800. This historical period (roughly covered by the term ‘Early Modern’ in English) was selected because it is the period covered by many digitization programmes of early printed works, and a time when many languages were already recognizably similar in form to contemporary varieties, yet different enough that standard software tools often cannot be applied to them with acceptable levels of accuracy. The workshop will focus on the adaptation of NLP tools trained on or designed for modern language varieties, as well as on custom tools designed specifically for particular historical varieties. It will be an opportunity for sharing expertise, know-how, tools and resources.

Workshop outcome

During the workshop, experts on NLP tools for working with historical documents exchanged ideas and experiences about tools and methods. The outputs included a draft resource guide and a plan of action to integrate more tools into the CLARIN infrastructure. Read the full blog post

Address

BBAW
Jägerstrasse 22/23
10117
Berlin
Germany