Hands-on session 1
By: Laura Hollink (CWI, The Netherlands)
Topic: How to query and analyse the data of the European and Dutch parliaments, published as Linked Open Data in the Talk of Europe and Polimedia projects.
Abstract: First, we will walk through examples of complex, structured (SPARQL) queries. These involve, for example, the spoken words of the parliamentary debates (including translations thereof in any of the EU languages), the topic annotations, the agendas, the speakers, their role, party and the country they represent. We will also explore the use of external sources of knowledge, such as Wikipedia, to complement the official parliament data. Participants will be able to adapt the example queries to match their interests. Secondly, we will analyse the resulting data in R using summary statistics, graphs, and other visualizations, and sample the data for close reading. Finally, we will run queries and visualisations to explore the quality and completeness of the data (e.g., with respect to missing translations), and discuss how to deal with these issues in a transparent way.
(Return to CLARIN-PLUS Workshop “Working with Parliamentary Records”.)
Hands-on session 2
By: Andreas Blätte (University of Duisburg-Essen, Germany)
Topic: From basic corpus analysis to more complex workflows: Using the ‘polmineR’ package as a toolkit for analysing parliamentary speeches
Abstract: In the context of the PolMine project (http://polmine.github.io), the R package ‘polmineR’ was developed to serve as an environment for analysing corpora of parliamentary debates. It interfaces with the Corpus Workbench (CWB), which serves as a backend to manage corpora efficiently. The workshop will introduce the core functionality of polmineR for basic tasks in corpus analysis, such as inspecting concordances, getting (dispersions of) counts, preparing co-occurrence statistics, or generating term-document and term co-occurrence matrices. A particular focus of the package is to create and work with subcorpora, and to retrieve the full text of speeches. The polmineR package is intended to serve as a basis for implementing more complex workflows. In the second part of the session, I will explore how we might use the package to analyse diachronic meaning change, to perform a simple dictionary-based sentiment analysis, to generate training data for machine learning tasks, or to work with annotation data we may have from an annotation project.
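For readers unfamiliar with the basic tasks named in the abstract, the following is a minimal, language-neutral sketch in plain Python of three of them: a keyword-in-context concordance, a term-document matrix, and a simple dictionary-based sentiment score. It does not use or reproduce the polmineR API or the CWB backend; the function names, the two toy speeches, and the five-word lexicon are all hypothetical and serve only to illustrate the ideas.

```python
import re
from collections import Counter

# Hypothetical toy speeches; not taken from any real parliamentary corpus.
SPEECHES = [
    "The reform is a great success for our citizens",
    "This reform is a terrible failure and a bad idea",
]

# Hypothetical mini sentiment dictionary: word -> polarity weight.
LEXICON = {"great": 1, "success": 1, "terrible": -1, "failure": -1, "bad": -1}


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def concordance(tokens, query, window=2):
    """Return (left context, keyword, right context) for every hit of query."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == query:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


def term_document_matrix(docs):
    """Map each term to its list of counts, one count per tokenized document."""
    counts = [Counter(doc) for doc in docs]
    vocab = sorted(set().union(*docs))
    return {term: [c[term] for c in counts] for term in vocab}


def sentiment_score(tokens, lexicon):
    """Sum the polarity weights of all tokens that appear in the lexicon."""
    return sum(lexicon.get(tok, 0) for tok in tokens)


if __name__ == "__main__":
    docs = [tokenize(s) for s in SPEECHES]
    print(concordance(docs[0], "reform"))
    print(term_document_matrix(docs)["reform"])
    print([sentiment_score(d, LEXICON) for d in docs])
```

A real analysis would replace the toy data with a managed corpus and an established sentiment lexicon; the sketch only shows the shape of the computations.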