Emmy Noether Group
Computational Analysis of Music Audio Recordings: A Cross-Version Approach
DFG WE 6611/3-1
Funded by the DFG's Emmy Noether Programme, this research group uses classical music datasets spanning multiple versions (score, recorded performances) of musical works to evaluate, understand, and improve deep-learning methods for music analysis and computational musicology. It thereby aims for robust and interpretable methods for music transcription, harmonic, rhythmic, and structural analysis, as well as stylometry.
Cooperating partners:
- Prof. Dr. Meinard Müller, Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), International Audio Laboratories Erlangen
- Prof. Geoffroy Peeters, Télécom Paris, Audio Data Analysis and Signal Processing group
Project duration: 2024–2027 (2030)
Abstract
The computational analysis of music audio recordings constitutes a highly interdisciplinary research area, involving domain knowledge from musicology and music theory as well as methods from signal processing and machine learning. From a computer science perspective, the variety and complexity of music audio data pose enormous challenges that are specific to this domain. First, music is multi-faceted, being characterized by different semantic dimensions such as time, pitch, timbre, or style, which need to be disentangled to obtain interpretable representations. Second, music analysis comprises hierarchically related tasks such as the estimation of pitches, chords, local keys, and global keys or the detection of onsets, beats, downbeats, and structural boundaries, suggesting the use of multi-task approaches. Third, music data is complex, consisting of highly correlated sources whose components overlap in time and frequency. Furthermore, musical notions are often ambiguous and subjective, calling for interpretable methods and multiple annotators. Fourth, music scenarios are often data-scarce, lacking large amounts of annotated data. As a consequence, analysis methods frequently overfit to implicit biases in the training datasets, become sensitive to small perturbations, and do not generalize well to unseen data. This data scarcity poses particular challenges for deep-learning approaches, which now dominate the field. These approaches have achieved substantial improvements for many music analysis tasks but often hit a kind of “glass ceiling” above which further progress is hard to achieve and to measure. To overcome this problem, this project adopts a cross-version approach by exploiting datasets of classical music, which contain several modalities (score and audio), several performances (interpretations and arrangements), and several annotations (multiple experts) for each musical work.
Such datasets allow for transferring annotations between versions and for systematically evaluating the robustness of deep-learning methods by testing generalization along different dimensions, e.g., to other versions of a work, other works by a composer, or other composers from a historical period. As a main conceptual contribution, we apply and further develop such cross-version strategies, exploiting them to better understand the analysis methods and to improve these methods using suitable training and fusion strategies. Based on this cross-version approach, we address the specific challenges of music data, aiming for analysis methods that are of particular use for computational musicology, and progressing towards novel methodological strategies in the wider field of the digital humanities.
Project-Related Publications
- Chiu, Ching-Yu, Lele Liu, Christof Weiß, and Meinard Müller. “Cross-Modal Approaches to Beat Tracking: A Case Study on Chopin Mazurkas”. Transactions of the International Society for Music Information Retrieval (TISMIR) 8, no. 1 (2025): 55–69. https://doi.org/10.5334/tismir.238.
- Ding, Yiwei, Yannik Venohr, and Christof Weiß. “An Evaluation Strategy For Local Key Estimation: Exploiting Cross-Version Consistency”. In Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR). Daejeon, South Korea, 2025.
- Henzel, Benjamin, Meinard Müller, and Christof Weiß. “Style Evolution in Western Choral Music: A Corpus-Based Strategy”. Computational Humanities Research 1 (2025).
- Liu, Lele, and Christof Weiß. “Unsupervised Domain Adaptation for Music Transcription: Exploiting Cross-Version Consistency”. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Hyderabad, India, 2025.
- Venohr, Yannik, Yiwei Ding, and Christof Weiß. “Towards Robust Music Transcription By Measuring Cross-Version Consistency In Western Classical Music”. In Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR). Daejeon, South Korea, 2025.
- Ding, Yiwei, and Christof Weiß. “Towards Robust Local Key Estimation With a Musically Inspired Neural Network”. In Proceedings of the European Signal Processing Conference (EUSIPCO). Lyon, France: IEEE, 2024.
- Liu, Lele, and Christof Weiß. “Utilizing Cross-Version Consistency for Domain Adaptation: A Case Study on Music Audio”. In International Conference on Learning Representations (ICLR), Tiny Papers. https://openreview.net/forum?id=ZNg3YQQKWT.


