Locating Singing Voice Segments within Music Signals

Berenzweig, Adam; Ellis, Daniel P. W.

A sung vocal line is the prominent feature of much popular music. It would be useful to reliably locate the portions of a musical track during which the vocals are present, both as a 'signature' of the piece and as a precursor to automatic recognition of lyrics. We approach this problem by using the acoustic classifier of a speech recognizer as a detector for speech-like sounds. Although singing (including a musical background) is a relatively poor match to an acoustic model trained on normal speech, we propose various statistics of the classifier's output in order to discriminate singing from instrumental accompaniment. A simple HMM allows us to find a best labeling sequence for this uncertain data. On a test set of forty 15-second excerpts of randomly selected music, our classifier achieved around 80% classification accuracy at the frame level. The utility of different features, and our plans for eventual lyrics recognition, are discussed.
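
As an illustration of the HMM labeling step mentioned above, the following is a minimal sketch rather than the authors' implementation. It assumes a two-state model (instrumental, vocal), a hypothetical per-frame score `p_vocal` in [0, 1] standing in for whatever statistic is derived from the classifier output, and a hypothetical self-transition probability `stay_prob`. None of these details are specified in the abstract. Viterbi decoding then picks the label sequence that best trades off the frame scores against the cost of switching states.

```python
import math


def viterbi_two_state(p_vocal, stay_prob=0.9):
    """Return the most likely 0/1 (instrumental/vocal) label for each frame.

    p_vocal   : list of per-frame vocal scores in [0, 1] (hypothetical input)
    stay_prob : probability of remaining in the same state (assumed value)
    """
    if not p_vocal:
        return []

    eps = 1e-9
    log_stay = math.log(stay_prob)
    log_switch = math.log(1.0 - stay_prob)

    def emit(p):
        # Clamp to avoid log(0), then return log-scores for states (0, 1).
        p = min(max(p, eps), 1.0 - eps)
        return (math.log(1.0 - p), math.log(p))

    # A uniform prior over the first state adds the same constant to both
    # scores, so it is omitted.
    score = list(emit(p_vocal[0]))
    back = []  # back[t][s] = best predecessor of state s at frame t + 1

    for p in p_vocal[1:]:
        e = emit(p)
        new_score, ptr = [], []
        for s in (0, 1):
            stay = score[s] + log_stay
            switch = score[1 - s] + log_switch
            if stay >= switch:
                best, prev = stay, s
            else:
                best, prev = switch, 1 - s
            new_score.append(best + e[s])
            ptr.append(prev)
        score = new_score
        back.append(ptr)

    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

With `stay_prob` close to 1, an isolated low-scoring frame inside a vocal region is absorbed into its neighbors. With `stay_prob=0.5` the transitions carry no preference and the result reduces to thresholding each frame independently at 0.5.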


Also Published In

Proceedings of the 2001 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics: October 21-24, 2001, Mohonk Mountain House, New Paltz, New York, USA

More About This Work

Academic Units
Electrical Engineering
Published Here
July 3, 2012