Standards and Formats

Basic Principles

CLARIN adheres to the following principles:

  • Open standards are preferred over proprietary standards
  • Formats and protocols should be:
    • Well-documented
    • Verifiable
    • Proven (already used in practice)
  • Where possible, text-based formats are preferred over binary formats
  • When digitising an analogue signal, using either no compression or lossless compression is recommended.
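
To illustrate the last principle: compression is lossless when decompressing gives back exactly the original bytes. The sketch below is only an illustration and is not part of any CLARIN tooling. It generates a short synthetic tone as 16-bit PCM samples (a stand-in for a digitised analogue signal), compresses it with Python's general-purpose zlib module (used here in place of a dedicated lossless audio codec such as FLAC), and checks that the round trip is byte-identical. The function names synth_pcm and is_lossless_roundtrip are hypothetical.

```python
import math
import struct
import zlib


def synth_pcm(num_samples=1000, rate=8000, freq=440.0):
    """Return 16-bit little-endian PCM bytes for a sine tone.

    Hypothetical helper: the tone stands in for a digitised analogue signal.
    """
    samples = [
        int(32767 * 0.5 * math.sin(2 * math.pi * freq * n / rate))
        for n in range(num_samples)
    ]
    return struct.pack("<%dh" % num_samples, *samples)


def is_lossless_roundtrip(raw):
    """Compress and decompress raw bytes; True if nothing was lost."""
    compressed = zlib.compress(raw, 9)   # general-purpose lossless compression
    restored = zlib.decompress(compressed)
    return restored == raw               # byte-for-byte comparison


if __name__ == "__main__":
    raw = synth_pcm()
    print("lossless round trip:", is_lossless_roundtrip(raw))
```

A lossy codec would fail this byte-for-byte check by design, which is why it is not recommended for archival copies of digitised material.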

Learning More

Relevant Formats

Several CLARIN centres have published information on the formats they recommend for depositing language research data.

Relevant Standards

The CLARIN Standards Information System (provided by IDS Mannheim) offers information on standards in general and on the standards used by particular centres. Starting in spring 2020, the system is to be modified to reflect centres' recommendations on formats for data depositions.