
Tour de CLARIN: Iceland

IceTaboo: Offensive Word Database with Commercial Application

The Project

The IceTaboo database is a novel resource for processing offensive words in Icelandic. Developed by a small team at the Language and Technology Lab at
'Our lab is really focused on collaborating with industry. We really want our work to benefit the public.' Agnes Sólmundsdóttir
 
Methodology

The IceTaboo database consists of a list of words in Icelandic that may be considered inappropriate, taboo or loaded in use or meaning. The list inclu
Access the IceTaboo database
 
Outcome

As part of the GreynirCorrect automatic proofreading software, the IceTaboo database is already being used to highlight inappropriate words at the Ice
  This screenshot from the correction software interface shows how it appears to users. Here, IceTaboo has flagged the word 'hjúkrunarkona', explai
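To make the flagging step concrete, here is a minimal sketch in Python of how a word list of this kind could be used to highlight words in running text. It is not the GreynirCorrect implementation or the actual IceTaboo format: the `TABOO_NOTES` dictionary, the `Flag` class and the `flag_words` function are hypothetical names, and the note attached to 'hjúkrunarkona' is a placeholder rather than the explanation shown in the software.

```python
import re
from dataclasses import dataclass


@dataclass
class Flag:
    word: str   # the word as it appears in the text
    start: int  # character offset of the word in the text
    note: str   # explanation to show to the user


# Hypothetical stand-in for the database: lowercase word form -> note.
# The real IceTaboo entries and annotations are not reproduced here.
TABOO_NOTES = {
    "hjúkrunarkona": "placeholder note for a word flagged in the article",
}


def flag_words(text, notes=TABOO_NOTES):
    """Return a Flag for every token whose lowercase form is in `notes`."""
    flags = []
    for match in re.finditer(r"\w+", text):
        token = match.group(0)
        note = notes.get(token.lower())
        if note is not None:
            flags.append(Flag(token, match.start(), note))
    return flags


for flag in flag_words("Hún er hjúkrunarkona á spítalanum."):
    print(flag.word, flag.start, flag.note)
```

This sketch only matches exact surface forms. Icelandic is heavily inflected, so a practical system would need to map inflected forms back to the listed words, which this example does not attempt.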
‘Our lab and the language technology community in Iceland emphasises licences that make all products easily reusable. In this case we used the Creativ
According to project leader Agnes Sólmundsdóttir, other Icelandic companies working with text have also shown interest in integrating the correction s
GreynirCorrect on GitHub
 
Views on CLARIN

'We deposited our database at CLARIN. It’s a really well-respected platform for language technology tools. Our lab and the language technology communi
 
Anton Karl Ingason, Associate Professor at the University of Iceland, and Director of the Language and Technology Lab
Agnes Sólmundsdóttir, Researc
 
Miðeind

Tour de CLARIN: IceNLP

IceNLP is an open source toolkit for processing and analysing Icelandic text that is available through the CLARIN-IS repository.