CLARIN Resource Families: Multimodal Corpora

Submitted by Linda Stokman on 9 December 2020

The CLARIN Resource Families initiative provides a user-friendly overview of the available language resources in the CLARIN infrastructure for researchers from digital humanities, social sciences and human language technologies.

This month CLARIN highlights multimodal corpora: data collections used to study how two or more modalities interact in human communication. Multimodal corpora are often collections of video and speech recordings accompanied by transcriptions and gesture annotations, but corpora of textual data supplemented with images exist as well.

CLARIN currently offers 16 multimodal corpora. These corpora are richly annotated for various verbal and non-verbal elements of communication, such as body gestures, gaze direction, and head, eye, and lip movements.