Handling Asynchrony in Audio-Score Alignment

Johanna Devaney; Daniel P. W. Ellis

Electrical Engineering
Proceedings of the 2009 international computer music conference: ICMC 09, Montreal, QC, Canada
International Computer Music Association
San Francisco
Aligning a canonical score to an audio recording of a musical performance can provide very good information about the timing of individual notes. However, a score representation frequently treats multiple note events as simultaneous, whereas in reality different performers will start notes at slightly differing times, and these timing details may be significant in the analysis of performance and expression. Using an example of a four-part a cappella vocal piece where each voice was recorded separately, we compare note onset and offset times obtained by manual annotation to three different types of alignment: forced alignment of each part individually to its corresponding track, simultaneous alignment of the polyphonic score to the full audio, and independent alignment of single parts to the polyphonic audio. In each case, we examine the kinds of errors that occur. We discuss how standard dynamic time warping may be extended so that it retains the advantages of polyphonic alignment while allowing ostensibly simultaneous notes to have different onset and offset times.
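For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of standard dynamic time warping only. It does not implement the extension the paper discusses. The function names (dtw_path, first_frames), the one-dimensional features, and the absolute-difference cost are illustrative assumptions; this record does not state which features or cost function the authors used.

```python
def dtw_path(cost):
    """Standard DTW over a local cost matrix cost[i][j].

    i indexes score events and j indexes audio frames (an assumption
    made for this illustration). Returns (total_cost, path).
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    acc = [[inf] * m for _ in range(n)]
    acc[0][0] = cost[0][0]
    # Fill the accumulated-cost table with the usual three-way recursion.
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = inf
            if i > 0:
                best = min(best, acc[i - 1][j])
            if j > 0:
                best = min(best, acc[i][j - 1])
            if i > 0 and j > 0:
                best = min(best, acc[i - 1][j - 1])
            acc[i][j] = cost[i][j] + best
    # Backtrack from the end of both sequences to the start.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return acc[n - 1][m - 1], path


def first_frames(path):
    """Map each score event to the first audio frame aligned to it."""
    onsets = {}
    for event, frame in path:
        onsets.setdefault(event, frame)
    return onsets


# Toy data (hypothetical): three score events, five audio frames.
score = [1, 2, 3]
audio = [1, 1, 2, 3, 3]
cost = [[abs(s - a) for a in audio] for s in score]
total, path = dtw_path(cost)
onsets = first_frames(path)
```

Because a single path assigns one frame position to each score event, notes that the score writes as simultaneous receive the same onset in this baseline. That is the limitation the abstract says the extended method is meant to relax.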
Johanna Devaney, Daniel P. W. Ellis, 2009, Handling Asynchrony in Audio-Score Alignment, Columbia University Academic Commons, http://hdl.handle.net/10022/AC:P:13659.

