The Million Song Dataset

Bertin-Mahieux, Thierry; Ellis, Daniel P. W.; Whitman, Brian; Lamere, Paul

We introduce the Million Song Dataset, a freely available collection of audio features and metadata for a million contemporary popular music tracks. We describe its creation process, its content, and its possible uses. Attractive features of the Million Song Dataset include the range of existing resources to which it is linked, and the fact that it is the largest current research dataset in our field. As an example application, we present year prediction, a task that has, until now, been difficult to study owing to the absence of a large set of suitable data. We show positive results on year prediction, and discuss more generally the future development of the dataset.


Also Published In

ISMIR 2011: Proceedings of the 12th International Society for Music Information Retrieval Conference, October 24-28, 2011, Miami, Florida
University of Miami

More About This Work

Academic Units
Electrical Engineering
Published Here
June 25, 2012