Melody Transcription From Music Audio: Approaches and Evaluation
Graham E. Poliner; Daniel P. W. Ellis; Andreas F. Ehmann; Emilia Gomez; Sebastian Streich; Beesuan Ong
- Electrical Engineering
- Book/Journal Title: IEEE Transactions on Audio, Speech, and Language Processing
- Although the process of analyzing an audio recording of a music performance is complex and difficult even for a human listener, there are limited forms of information that may be tractably extracted and yet still enable interesting applications. We discuss melody--roughly, the part a listener might whistle or hum--as one such reduced descriptor of music audio, and consider how to define it, and what use it might be. We go on to describe the results of full-scale evaluations of melody transcription systems conducted in 2004 and 2005, including an overview of the systems submitted, details of how the evaluations were conducted, and a discussion of the results. For our definition of melody, current systems can achieve around 70% correct transcription at the frame level, including distinguishing between the presence or absence of the melody. Melodies transcribed at this level are readily recognizable, and show promise for practical applications.