Computational Humanities


Computational Analysis of Music Audio Recordings:
A Cross-Version Approach (WE 6611/3-1)

Emmy Noether Programme, DFG

Cooperating partners: 

Project duration: 2024–2027 (2030)


The computational analysis of music audio recordings constitutes a highly interdisciplinary research area, involving domain knowledge from musicology and music theory as well as methods from signal processing and machine learning. From a computer science perspective, the variety and complexity of music audio data pose enormous challenges, which are specific to this domain. First, music is multi-faceted, being characterized by different semantic dimensions such as time, pitch, timbre, or style, which need to be disentangled to obtain interpretable representations. Second, music analysis comprises hierarchically related tasks such as the estimation of pitches, chords, local keys, and global keys or the detection of onsets, beats, downbeats, and structural boundaries, suggesting the use of multi-task approaches. Third, music data is complex, consisting of highly correlated sources whose components overlap in time and frequency. Furthermore, musical notions are often ambiguous and subjective, thus calling for interpretable methods and multiple annotators. Fourth, music scenarios are often data-scarce, lacking large amounts of annotated data. As a consequence, analysis methods frequently overfit to implicit biases in the training datasets, become sensitive to small perturbations, and do not generalize well to unseen data. This data scarcity poses particular challenges for deep-learning approaches, which now dominate the field. These approaches have achieved substantial improvements for many music analysis tasks but often hit a kind of “glass ceiling” above which further progress is hard to achieve and to measure. To overcome this problem, this project adopts a cross-version approach by exploiting datasets of classical music, which contain several modalities (score and audio), several performances (interpretations and arrangements), and several annotations (multiple experts) for each musical work.
Such datasets allow for transferring annotations between versions and for systematically evaluating the robustness of deep-learning methods by testing generalization along different dimensions, e.g., to other versions of a work, other works by a composer, or other composers from a historical period. As a main conceptual contribution, we apply and further develop such cross-version strategies, exploiting them to better understand the analysis methods and to improve these methods using suitable training and fusion strategies. Based on this cross-version approach, we address the specific challenges of music data, aiming for analysis methods that are of particular use for computational musicology, and progressing towards novel methodological strategies in the wider field of the digital humanities.
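The idea of testing generalization along different dimensions can be illustrated with a minimal sketch. It is not the project's actual evaluation code: the `Recording` type, the `split_by` helper, and the toy corpus with placeholder names are all assumptions made for illustration. The sketch holds out recordings by version, by work, or by composer, so that the test set differs from the training set along exactly one dimension.

```python
from typing import NamedTuple


class Recording(NamedTuple):
    """Hypothetical metadata record for one audio recording."""
    composer: str
    work: str
    version: str  # identifier of the performance / interpretation


def split_by(recordings, key, held_out):
    """Hold out every recording whose `key` attribute is in `held_out`."""
    train = [r for r in recordings if getattr(r, key) not in held_out]
    test = [r for r in recordings if getattr(r, key) in held_out]
    return train, test


# Toy corpus with placeholder names (not a real dataset).
corpus = [
    Recording("ComposerA", "Work1", "Version1"),
    Recording("ComposerA", "Work1", "Version2"),
    Recording("ComposerA", "Work2", "Version1"),
    Recording("ComposerA", "Work2", "Version2"),
    Recording("ComposerB", "Work3", "Version1"),
    Recording("ComposerB", "Work3", "Version2"),
]

# Version split: test works are seen in training, but in other versions.
version_train, version_test = split_by(corpus, "version", {"Version2"})

# Work split: test works never occur in training.
work_train, work_test = split_by(corpus, "work", {"Work2"})

# Composer split: test composers never occur in training.
composer_train, composer_test = split_by(corpus, "composer", {"ComposerB"})
```

Comparing a model's scores across the three splits would indicate how much of its performance depends on having seen the same work or composer during training.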

Computer-Assisted Analysis of Harmonic Structures 
(MU 2686/7-2, KL 864/4-2)

Cooperating partners: 

Project duration: 2019–2024


This is a follow-up project continuing the previous DFG-funded project "Computergestützte Analyse harmonischer Strukturen" [MU 2686/7-1, KL 864/4-1]. Our interdisciplinary project deals with the development of automated techniques for the analysis of harmonic structures. On a broader level, we aim at investigating to what extent musicology may benefit from using computer-based methods and, conversely, how musicological research may introduce new scientific challenges into computer science. In addition to the development of computer-based analysis techniques, our further goal is to explore novel navigation and visualization concepts that allow researchers to browse, search, and analyze large music collections with regard to harmonic structures in an intuitive and interactive way. The concepts are paradigmatically developed, verified, and discussed on the basis of concrete music corpora. In particular, in the case of the tetralogy "Der Ring des Nibelungen" by Richard Wagner, previously unknown structural relationships may be discovered, yielding new musicological insights. In this follow-up project, we significantly extend the objectives of the previous project. By considering further parameters, we aim at expanding and refining the harmonic analyses. In addition to harmonic structures, musical aspects such as motifs, instrumentation, and performance practice as well as their interrelations are the subject of our computer-assisted analyses. The two main corpora prepared in the first project phase, Beethoven's piano sonatas and Wagner's "Ring" (including the symbolically encoded scores and annotated music recordings), provide an excellent basis for these subsequent studies. The continuation of the project shall deepen a spirit of openness, mutual interest, and long-term thinking, which may serve as a positive example of interdisciplinary collaboration in the field of Digital Humanities.

Learning Tonal Representations of Music Signals Using Deep Neural Networks (WE 6611/1-1)

DFG research fellowship

Cooperating partner / host: Audio Data Analysis and Signal Processing group, Department Image, Data, Signal at University Télécom Paris. Main supervisor: Prof. Geoffroy Peeters

Project duration: 01/2021–12/2021 (finished)


With the growing impact of technology, musicological research is subject to a fundamental transformation. Digitized data and specialized algorithms enable systematic analyses of large music corpora. Recently, such corpus studies have been performed based on audio recordings, involving methods from digital signal processing and machine learning. In this context, the tonal analysis of music signals regarding chords, scales, or keys plays a significant role. Traditional analysis methods rely on signal processing techniques to extract tonal feature representations that indicate the presence of musical pitch classes over time, thus allowing for an explicit semantic interpretation. The objective of this project is to use deep neural networks for learning tonal representations that are interpretable, robust, and invariant regarding timbre, instrumentation, and acoustic conditions. The project builds on complex scenarios of classical music where time-aligned scores and multiple performances of the pieces can be used for training, validating, and testing the algorithms. From a technical perspective, this project investigates approaches for learning pitch-class, multi-pitch, and salience representations. Among other things, sequence learning techniques that can handle weakly aligned annotations and U-net architectures that are inspired by hierarchical musical structures will be explored. Applying the learned representations to complex music scenarios aims at developing robust tonal analysis methods by exploiting the potential of novel deep-learning algorithms, thus paving the way towards a new level of computational music research.
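To make the relation between multi-pitch and pitch-class representations concrete, the following minimal sketch folds one frame of pitch activations into twelve pitch classes. It only illustrates the traditional notion of a pitch-class (chroma) feature, not the learned representations investigated in the project; the function names, the 128-bin MIDI pitch axis, and the max-normalization are assumptions chosen for illustration.

```python
def fold_to_pitch_classes(multi_pitch_frame):
    """Sum the activations of MIDI pitches 0..127 into 12 pitch classes.

    Index 0 corresponds to C, 1 to C sharp, and so on up to 11 for B.
    """
    chroma = [0.0] * 12
    for midi_pitch, activation in enumerate(multi_pitch_frame):
        chroma[midi_pitch % 12] += activation
    return chroma


def normalize(chroma):
    """Scale a chroma vector so that its largest entry is 1 (if non-zero)."""
    peak = max(chroma)
    return [value / peak for value in chroma] if peak > 0 else list(chroma)


# Toy frame: a C major chord with the notes C4, E4, G4, and C5 active.
frame = [0.0] * 128
for midi_pitch in (60, 64, 67, 72):
    frame[midi_pitch] = 1.0

chroma = fold_to_pitch_classes(frame)
chroma_normalized = normalize(chroma)
```

In this toy frame, both C notes fall into pitch class 0, so octave information is discarded while the chord's pitch-class content (C, E, G) remains visible.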