The road to oral archives, from Tuscany to Wien

Submitted by Linda Stokman on 17 March 2020

Blog post written by Silvia Calamai (University of Siena), who received a CLARIN Mobility Grant to visit the Phonogrammarchiv, the world’s oldest sound archive and a CLARIN Knowledge Centre, to collaborate with the archive’s experts on long-term preservation of audio-visual data and archival practices (e.g. metadata). The visit took place from 16 to 18 February 2020.

The background – why I am here 

I knew hardly anything about CLARIN when I started working on Italian oral and speech archives 13 years ago. In 2011, the Region of Tuscany supported Prof. Pier Marco Bertinetto (Scuola Normale Superiore, Pisa) and me by funding a two-year project called Grammo-foni. Le soffitte della voce, jointly conducted by the Scuola Normale Superiore and the University of Siena (Regione Toscana PAR FAS 2007-13). Although we conceived the project as linguists, we immediately realized how cross-disciplinary it was and that we needed help from several disciplines. Creating an archive that incorporates the main oral archives of the region involved different, interconnected stages of work. It was necessary to lay the foundations for an interdisciplinary dialogue between linguistics, law, anthropology, informatics, and archival science. Tuscany is a privileged area for working on oral documents, as it abounds with both public and private sound archives, collected in different fields of research by scholars as well as amateurs. The majority of these archives are analogue and therefore risk deterioration unless they are transferred to the digital domain. The project undertook the challenging task of gathering different kinds of expertise and building a digitization and cataloguing system, with the aim of creating a regional network for the management of sound archives. Nevertheless, it represented only a starting point. Such a project can exist and live only in a broader universe, one that overcomes the fragmentation of single research projects carried out more or less in isolation by independent research groups.

I had the chance to present some challenges related to the project at two different CLARIN conferences (Aix-en-Provence 2016, with Francesca Frontini; and Budapest 2017, where I also had the opportunity to get to know the Legal Issue Committee closely). Aleksei Kelli helped me a lot in understanding the conflicts that arise when dealing with oral archives and Intangible Cultural Heritage (ICH), as the demand for open access clashes with ownership and authorship rights and with ethical issues. Thus, my research on sound archives began to explore how to balance two conflicting demands: the need for openness and accessibility of ICH, in my case represented by the content of several oral archives, and respect for all the rights related to ICH, e.g. copyright, intellectual property, and privacy law. In the most recent years of my academic work, I have devoted special attention to the dissemination of oral heritage via new technologies. This requires thorough reflection not only from the technological point of view but also from the legal one, since most of the recordings that constitute our oral heritage were collected at a time when little or no attention was paid to the legal aspects of ICH.

The “Archivio Vi.vo” project and CLARIN-IT

I am now part of the speech archives task force for CLARIN-IT. In 2019 Regione Toscana decided to support the Archivio Vi.vo project, which deals with the description and cataloguing of Caterina Bueno’s oral archives and is also meant to support and advance activities regarding this topic in the CLARIN-IT consortium. The following partners are involved: Università degli Studi di Siena (Silvia Calamai), Soprintendenza Archivistica e Bibliografica della Toscana (Maria Francesca Stamuli), CNR-ILC and CLARIN-IT (Monica Monachini), and Unione dei comuni del Casentino (Pierangelo Bonazzoli). As far as copyright is concerned, the Association “Caterina Bueno”, which supports the project, has already been contacted.

Caterina Bueno (1943-2007) was an Italian ethnomusicologist and singer (Photo 1). Her work as a researcher has been highly appreciated for its cultural value, as it resulted in the collection of many folk songs from Tuscany and Central Italy that had been passed down orally from one generation to the next until the 20th century (when this centuries-old tradition began to vanish). She travelled through the Tuscan countryside and villages recording peasants, artisans, and ordinary men and women singing every kind of folk song: lullabies, ottave (rhyming stanzas sung during improvised competitions between poets), stornelli (monostrophic songs), narrative songs, social and political songs, and much more.

Photo 1. Caterina Bueno together with Francesco De Gregori and Antonio de Rose (1971). Source:

Her audio archive was split between two locations: part of it was stored at the house of Caterina’s heirs, while the rest was kept by the former culture counsellor of the Municipality of San Marcello Pistoiese, in the Montagna Pistoiese, where a multimedia library was supposed to be set up. Unfortunately, disagreements and misunderstandings between the two parties have so far left the archive fragmented and inaccessible to the community. Both owners, independently, turned to me to reassemble the whole archive in the digital domain, in accordance with the artist’s wishes. After digitization, the carriers were returned to their owners, who helped to work out an arrangement for the sound archive, which can be divided into the following categories:

  1. field research (investigations carried out in the Tuscan countryside from the late 1950s to the end of the artist’s life);
  2. live performances (recordings of concerts and events);
  3. pre-performance rehearsals (recordings of rehearsals with musicians).

Photo 2. Compact cassette and its case along with a sheet of handwritten notes ( project)

Caterina Bueno’s sound archive is composed of 476 carriers (audio reels and compact cassettes), corresponding to nearly 714 hours of recording, and was digitized during the PAR-FAS project (Grammo-foni. Le soffitte della voce, UNISI & SNS). The digitized audio recordings and their metadata will be organized in a searchable, cloud-based environment (under the CLARIN-IT domain). This will foster the use of these audio materials in a wider scientific context as well as make them accessible to the general public.

Towards the future – what I am doing now

Now that the CLARIN world has become familiar to me, it represents my starting point. Last February I came to Wien with the following research topics in mind: i) a metadata standard for the description of dialectal and ethnomusicological data pertaining to the Bueno archive, and ii) long-term preservation of audio data. I arrived at the Phonogrammarchiv on a clear, blue Monday morning. It was exciting to enter the world’s oldest sound archive.

Photo 3. Silvia and Christian entering the Phonogrammarchiv, 17.02.2020

I spent the duration of my visit working with Dr. Christian Huber, but I also had the chance to discuss different research topics with Efstratios Nikolaros (about Greek speech areas in Southern Italy), Christian Liebl and Gerda Lechleitner (about the PhA’s Historical Collections), Christiane Fennesz-Juhasz (about Roma oral archives), Bernhard Graf (about digitizing audio recordings on magnetic tape), and Johannes Spitzbart and Benjamin Fischer (about metadata standards and re-organizing metadata in an existing database). As for metadata description, we discussed the architecture and the vocabulary. Archives, collections, series, recording sessions, actors, bundles, access copy and conservation copy: we first had to agree on what these terms mean so that our collaborative work could rest on common ground. We do need to agree on the labels in order to avoid misunderstandings. Oral historians usually employ a terminology that is different from that of phoneticians, and phoneticians use a terminology that differs from that of ethnomusicologists, and so on.
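To make the vocabulary above a little more concrete, here is a minimal sketch of how those terms might nest inside one another. It is only an illustration under my own assumptions, not the schema used by the Phonogrammarchiv or by CLARIN-IT: the class names, the fields, the file names, and in particular the reading of "bundle" as a pair of files belonging to one session are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Actor:
    name: str
    role: str  # hypothetical role label, e.g. "interviewer" or "singer"


@dataclass
class Bundle:
    """Files that belong together (an assumed reading of 'bundle')."""
    conservation_copy: str  # hypothetical path of the preservation master
    access_copy: str        # hypothetical path of the lighter listening copy


@dataclass
class RecordingSession:
    title: str
    actors: List[Actor] = field(default_factory=list)
    bundles: List[Bundle] = field(default_factory=list)


@dataclass
class Series:
    title: str
    sessions: List[RecordingSession] = field(default_factory=list)


@dataclass
class Collection:
    title: str
    series: List[Series] = field(default_factory=list)


@dataclass
class Archive:
    name: str
    collections: List[Collection] = field(default_factory=list)


def count_sessions(archive: Archive) -> int:
    """Count recording sessions across all collections and series."""
    return sum(len(s.sessions) for c in archive.collections for s in c.series)


# Placeholder data: the three series titles come from the categories of the
# Bueno archive; everything else is invented for the example.
session = RecordingSession(
    title="placeholder session",
    actors=[Actor(name="Caterina Bueno", role="researcher")],
    bundles=[Bundle(conservation_copy="master.wav", access_copy="listen.mp3")],
)
archive = Archive(
    name="Caterina Bueno sound archive",
    collections=[
        Collection(
            title="example collection",
            series=[
                Series(title="field research", sessions=[session]),
                Series(title="live performances"),
                Series(title="pre-performance rehearsals"),
            ],
        )
    ],
)
print(count_sessions(archive))
```

Writing the hierarchy down in this way forces each label to have exactly one place and one meaning, which is the kind of agreement we were looking for before starting the actual cataloguing.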

Photo 4. Christian, Silvia and Johannes

We then addressed the relationship between the document and the carrier (e.g. the compact cassette). Philology aims at reconstituting the ‘original’ document, as near as possible to the author’s intention. This apparently clear definition turns into a thorny problem in the domain of oral archives for the following reasons:

  1. Oral archives are usually created by different “authors” (i.e. interviewers, interviewees, secondary participants in the communicative event, archive owners, clients who commissioned the archives, etc.);
  2. The documents often need to be re-organized during the cataloguing process, with the content privileged at the expense of the carrier;
  3. Thus the concept of ‘document’ itself is controversial in many ways.

Before leaving, I gave a talk on Italian oral archives, discussing the problems we faced from an archival and a legal point of view. After the talk, I was somewhat relieved to hear about my colleagues’ work at the Phonogrammarchiv: a problem shared is a problem halved.

All in all, the mobility grant visit was indispensable for organizing my work with archives according to CLARIN best practices. Not only did the in-person collaboration allow me to pre-emptively address many technical issues, it also paved the way for my future research. Once back, I passed on what I had learned to the post-doc students working in the Archivio Vi.vo project.

I would really like to thank all the staff at the Phonogrammarchiv for their kind and warm hospitality and I very much hope that I will have the opportunity to visit the Phonogrammarchiv again in the future!