Cross-Correlation of Beat-Synchronous Representations for Music Similarity

Ellis, Daniel P. W.; Cotton, Courtenay Valentine; Mandel, Michael I.

Systems that predict human judgments of music similarity directly from audio have generally been based on the global statistics of spectral feature vectors, i.e., they collapse any large-scale temporal structure in the data. Building on our work in identifying alternative ("cover") versions of pieces, we investigate using direct correlation of beat-synchronous representations of music audio to find segments that are similar not only in feature statistics but also in the relative positioning of those features in tempo-normalized time. Given a large enough search database, good matches by this metric should have very high perceived similarity to query items. We evaluate our system through a listening test in which subjects rated system-generated matches as similar or not similar, and we compare the results to a more conventional timbral and rhythmic similarity baseline and to random selections.
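
As an illustration of what correlating two beat-synchronous representations can look like, the following is a minimal sketch and not the authors' implementation. It assumes each excerpt has already been reduced to a matrix with one column per beat and one row per feature dimension; the function name `xcorr_peak`, the choice of features, and the normalization by overall energy are all assumptions made for this example.

```python
import numpy as np


def xcorr_peak(query, candidate):
    """Return (peak, lag) of a normalized cross-correlation over beat lags.

    Both inputs are 2-D arrays shaped (n_features, n_beats). The layout and
    the normalization are assumptions for this sketch, not details taken
    from the abstract.
    """
    query = np.asarray(query, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    if query.shape[0] != candidate.shape[0]:
        raise ValueError("feature dimensions must match")

    # Normalize by the overall energy of both excerpts so that a perfect
    # match of equal-energy excerpts scores 1.0.
    norm = np.linalg.norm(query) * np.linalg.norm(candidate)
    if norm == 0.0:
        return 0.0, 0

    # Correlate each feature row along the beat axis and sum the rows,
    # giving one similarity value per relative beat offset.
    total = sum(
        np.correlate(q_row, c_row, mode="full")
        for q_row, c_row in zip(query, candidate)
    )
    total = total / norm

    best = int(np.argmax(total))
    # In "full" mode, zero lag sits at index (len(candidate) - 1).
    lag = best - (candidate.shape[1] - 1)
    return float(total[best]), lag
```

Because the columns are indexed by beat rather than by clock time, a high peak at some lag indicates that the two excerpts share the same sequence of features in tempo-normalized time, which is the property the abstract contrasts with global feature statistics.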


Also Published In

2008 IEEE International Conference on Acoustics, Speech, and Signal Processing: ICASSP '08: Proceedings: March 30-April 4, 2008, Caesars Palace, Las Vegas, Nevada, U.S.A.

More About This Work

Academic Units
Electrical Engineering
Published Here
June 27, 2012