ismir25_rach3midi_supp

Supplementary materials for the paper “Enabling Empirical Analysis of Piano Performance Rehearsal with the Rach3 MIDI Dataset”, accepted at ISMIR 2025.

Abstract

Piano performance analysis is a well-studied field in MIR, owing to the availability of open datasets of piano performance. However, pianists spend more time rehearsing than performing, and the process of piano rehearsal remains understudied. The study of piano rehearsals can offer insights into the strategies a pianist adopts to learn, interpret and eventually perform musical pieces. Studying the process of rehearsal requires computational methods that differ from those used for piano performance, due to challenges like mistakes, repetitions of musical segments, or forward and backward skips to sections in the piece. The scarcity of publicly available rehearsal data limits the empirical understanding of these challenges.

We release the Rach3 MIDI Dataset, an openly available collection of over 3000 MIDI files containing more than 750 hours of recordings of piano rehearsals by four pianists (3 advanced, 1 beginner), collected over a period of more than 4 years. This dataset records the progression of pianists learning new repertoire, as well as practicing familiar pieces, all in the Western Classical tradition.

This paper further introduces possible avenues for using this dataset in the computational analysis of piano practice, such as rehearsal structure analysis, rehearsal-to-score alignment and mistake identification. We also discuss the challenges and limitations of applying state-of-the-art methods for piano performance analysis to this type of data. In addition, we provide the code that was used to preprocess and analyze the recorded rehearsals.

Resources