Academic Commons


Detecting local semantic concepts in environmental sounds using Markov model based clustering

Lee, Keansub; Ellis, Daniel P. W.; Loui, Alexander C.

Detecting when an acoustic event (for instance, a cheer) occurs within a longer soundtrack is useful for applications such as search and retrieval in consumer video archives. We present a Markov-model-based clustering algorithm that identifies and segments consistent sets of temporal frames into regions associated with different ground-truth labels, while simultaneously excluding a set of uninformative frames common to all clips. Because the labels are provided only at the clip level, this refinement along the time axis is a variant of Multiple-Instance Learning (MIL). Evaluation shows that the clustering technique effectively detects local concepts from coarse-scale labels, and that its detection performance is significantly better than that of existing algorithms for classifying real-world consumer recordings.
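
As a rough illustration of the idea in the abstract (not the authors' implementation, whose details are not given here), the sketch below decodes a clip with a two-state Markov model: one state stands for frames that belong to the clip-level concept, and a shared background state absorbs uninformative frames. Everything in it is an assumption made for illustration: the scalar frame features, the Gaussian emission parameters fixed by hand rather than learned from clip-level labels, the `stay_prob` transition setting, and the function names `viterbi_segment` and `concept_regions`.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a 1-D Gaussian; stands in for a real frame-level emission model."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi_segment(frames, means, variances, stay_prob=0.9):
    """Return the most likely state index for each frame.

    Hypothetical setup: state 0 is a background state shared by all clips,
    state 1 is the concept state. Transitions favour staying in the same
    state, so the decoded path forms contiguous regions.
    """
    n_states = len(means)
    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (n_states - 1))
    log_init = -math.log(n_states)  # uniform initial distribution

    score = [
        log_init + gaussian_log_pdf(frames[0], means[s], variances[s])
        for s in range(n_states)
    ]
    backptrs = []
    for x in frames[1:]:
        new_score, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s, including the transition cost.
            candidates = [
                score[p] + (log_stay if p == s else log_switch)
                for p in range(n_states)
            ]
            best_prev = max(range(n_states), key=lambda p: candidates[p])
            new_score.append(
                candidates[best_prev] + gaussian_log_pdf(x, means[s], variances[s])
            )
            ptr.append(best_prev)
        score = new_score
        backptrs.append(ptr)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def concept_regions(path, concept_state=1):
    """Collapse a frame-level path into (start, end) regions, end exclusive."""
    regions, start = [], None
    for i, s in enumerate(path):
        if s == concept_state and start is None:
            start = i
        elif s != concept_state and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(path)))
    return regions


# Toy clip: frames near 0 are "background", frames near 5 are the "concept".
frames = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2, 0.1, -0.1]
path = viterbi_segment(frames, means=[0.0, 5.0], variances=[1.0, 1.0])
print(concept_regions(path))
```

In a Multiple-Instance Learning setting, the emission and transition parameters would be estimated from many clips that carry only clip-level labels; this sketch skips that step and shows only how a Markov model turns frame scores into time-localized regions.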


Also Published In

2010 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings: March 14-19, 2010, Sheraton Dallas Hotel, Dallas, Texas, U.S.A.

More About This Work

Academic Units
Electrical Engineering
Published Here
June 26, 2012