Doctoral dissertation

Machine learning methodology for automatic metadata assignment in cultural heritage archives

Author(s): Luis Rei (Author), Dunja Mladenić (Supervisor)

Thesis defense date: 21.06.2024

Organization: MPŠ - Mednarodna podiplomska šola Jožefa Stefana

PID: 20.500.12556/ReVIS-13709

Abstract

This thesis introduces a novel machine learning methodology for automatically assigning
metadata to digitized artifacts in cultural heritage. Cultural heritage is an example of a
domain that requires expert labeling, has few pre-existing labeled datasets, and makes
acquiring more labeled data difficult. The societal importance of cultural heritage
lies in its role in safeguarding history, enhancing comprehension of the past, and nurturing
a sense of belonging for current and future generations. Digitization, in turn, helps protect
cultural artifacts from degradation and loss, while improving the accessibility of valuable
artifacts and documents. Metadata plays a crucial role in enabling the search and exploration
of large catalogs of artifacts, especially across country and language boundaries.
We create a method for inferring metadata from textual descriptions of digitized items,
enhancing its versatility by incorporating multiple languages and tasks to maximize the
utility of existing labeled data sources using transfer learning. We augment this method
with multimodal late fusion, combining text, image, and tabular classification. We show
that the multimodal model significantly outperforms any single modality in metadata inference.
Finally, we extend this work to a particular type of cultural heritage, literature.
Literature offers a platform for exploring human emotions, thoughts, and societal issues,
allowing readers to engage with different perspectives and narratives. In literature analysis,
the text itself can be considered the artifact, in contrast to the analysis of text descriptions
of digitized artifacts. We use semi-supervised methods to enable fine-grained multi-label
emotion analysis of literature without requiring pre-existing labeled data. Emotion detection
from text provides an excellent use case because emotions play a crucial role in shaping
the narrative, and identifying them in text is challenging for humans and automated
algorithms alike. We show that our approach is a viable alternative to expensive and
time-consuming manual labeling for creating supervised datasets. Our approach is underpinned
by neural network-based representation learning, employing transfer learning in one
instance and semi-supervised techniques in another. We validate our methodology through
empirical findings across domains such as digitized silk fabrics, literature, and healthcare.
We showcase the effectiveness of transfer learning in these diverse contexts, and underscore
the benefits of multimodal approaches.
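
The abstract does not state which fusion rule is used, so the following is only a minimal sketch of one common form of late fusion: a weighted average of class probabilities produced independently by per-modality classifiers. The function names `late_fusion` and `predict`, the class labels, and the probability values are hypothetical, chosen for illustration, and not taken from the thesis.

```python
from typing import Dict, List, Optional


def late_fusion(
    probs_by_modality: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Fuse per-modality class probabilities with a weighted average.

    Each modality (e.g. text, image, tabular) is assumed to have its own
    classifier that already produced a probability per class.
    """
    if not probs_by_modality:
        raise ValueError("at least one modality is required")
    if weights is None:
        # Assumption: equal trust in every modality unless told otherwise.
        weights = {name: 1.0 for name in probs_by_modality}

    total = sum(weights[name] for name in probs_by_modality)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(next(iter(probs_by_modality.values())))
    fused = [0.0] * n_classes
    for name, probs in probs_by_modality.items():
        if len(probs) != n_classes:
            raise ValueError("all modalities must score the same classes")
        for i, p in enumerate(probs):
            fused[i] += weights[name] * p / total
    return fused


def predict(labels: List[str], scores: List[float]) -> str:
    """Return the label with the highest score."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]


# Hypothetical labels and toy probabilities, for illustration only.
LABELS = ["damask", "brocade", "velvet"]
EXAMPLE_PROBS = {
    "text": [0.5, 0.3, 0.2],
    "image": [0.2, 0.6, 0.2],
    "tabular": [0.3, 0.4, 0.3],
}

fused_scores = late_fusion(EXAMPLE_PROBS)
fused_label = predict(LABELS, fused_scores)
```

In this toy example the text classifier alone would pick "damask", while the fused scores favor "brocade". A learned combiner trained on the per-modality outputs is another common choice and may be closer to what a given system uses.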
