Large-scale digitisation over recent decades has changed how people engage with and use the collections of Galleries, Libraries, Archives and Museums (GLAMs). It has moved practice from personal observation and study of individual objects towards rethinking collections as data, and it is a defining factor in two dimensions: first, scaling up the analysis of artefacts, and second, analysing hybrid collections that represent different types of heritage (movable and immovable; tangible and intangible; written; audio-visual; performing arts).
Innovation labs have emerged in many GLAM institutions, serving as a human connector between users, technology and digitised collections. The International GLAM Labs community helps to understand how current practices of digital transformation in the GLAM sectors develop, and is instrumental in sharing knowledge about collections as data, a transition in the making since the mid-2010s.
A recently published preprint by an international team of GLAM practitioners and academics explores how GLAM institutions can introduce efficient services built around the collections as data mindset. It offers a checklist that can be used both for creating and for evaluating digital collections suitable for computational use.
Datafication is also an essential condition for institutions that act as data owners to contribute to the emerging data space for cultural heritage spearheaded by Europeana.
The paper provides an overview of recent work in this area, supported by data from a survey of GLAM institutions. The survey gives an idea of some of the stumbling blocks for institutions that have already experimented with collections as data. For example, the most popular response to the question “What were the main issues that you encountered when starting to prepare Collections as data?” was “data preparation”, followed by “structure of the dataset” and “licensing”. Some respondents also mentioned aspects related to API access, publishing platforms and metadata.
The survey also confirms the need for more practical knowledge and training. For example, the most popular answers to the question “What information would you like to have/have liked to have had when starting to work towards Collections as data? What knowledge would have made it easier?” were “implementation examples”, “data preparation”, “institutional strategy and support” and “general how-to information”. Areas of less demand included “machine-readable metadata”, “storage”, “GDPR”, “computational expertise” and “data selection”.
The literature review, the survey and the institutional experiences of the authors informed the development of a checklist intended to help institutions prepare datasets for publishing and computational use. This checklist was used to assess a selection of datasets from institutions including the British Library, the National Library of Scotland, the Library of Congress, the Royal Danish Library, Meemoo (offering access to objects from Flemish museums and cultural institutions) and the Miguel de Cervantes Virtual Library.
The paper also includes two extensive and inspirational case studies from Belgian libraries. The first discusses the use of the checklist as a tool for implementing the collections as data principle at KU Leuven Libraries, and the second describes its place within the development of a data platform at KBR, the Royal Library of Belgium.
While all these assessments of the checklist and the in-depth examples come from large institutions, validating the checklist against the practice of some of the international leaders of the collections as data movement is a useful exercise that demonstrates its potential value for institutional practice.
The publication of the checklist quickly attracted substantial interest. In the first few days after publication, the preprint received a very high Attention Score on Altmetric compared to outputs of the same age and source (arXiv), placing it within the 98th percentile.
Further work will be needed to explore how the checklist can be applied by smaller institutions, which face a different set of issues related to datafication. Nevertheless, this initial experience suggests that it would be useful to integrate information on the checklist into professional development programmes within the GLAM sectors.