Detecting local semantic concepts in environmental sounds using Markov model based clustering
Detecting when an acoustic event (for instance, a cheer) occurs within a longer soundtrack is useful for applications such as search and retrieval in consumer video archives. We present a Markov-model-based clustering algorithm that identifies consistent sets of temporal frames and segments them into regions associated with different ground-truth labels, while simultaneously excluding a set of uninformative frames common to all clips. Because the labels are provided at the clip level, this refinement of the time axis is a variant of Multiple-Instance Learning (MIL). Evaluation shows that this clustering technique effectively detects local concepts from coarse-scale labels, and that its detection performance is significantly better than that of existing algorithms for classifying real-world consumer recordings.
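The idea of refining clip-level labels into frame-level regions can be illustrated with a small sketch. This is not the algorithm from the paper. It is a simplified constrained Viterbi decoder written under stated assumptions: per-frame log-likelihoods for each state are already available, state 0 is a shared background state open to every clip, and transitions are "sticky" with a fixed self-transition probability. The names decode_clip, frame_loglik, clip_labels, background, and stay_prob are hypothetical and chosen only for illustration.

```python
import math


def decode_clip(frame_loglik, clip_labels, background=0, stay_prob=0.9):
    """Assign each frame of one clip to a state using constrained Viterbi decoding.

    frame_loglik: list of per-frame lists, frame_loglik[t][s] = log-likelihood
                  of frame t under state s (assumed given; at least one frame).
    clip_labels:  state indices tagged on the whole clip (coarse labels).
    background:   index of the shared background state (assumption: state 0).
    stay_prob:    self-transition probability, strictly between 0 and 1.
    """
    # A clip may only use the states named by its labels, plus the background.
    allowed = sorted(set(clip_labels) | {background})
    n = len(allowed)
    log_stay = math.log(stay_prob) if n > 1 else 0.0
    log_switch = math.log((1.0 - stay_prob) / (n - 1)) if n > 1 else 0.0

    # Uniform start distribution over the allowed states.
    score = [frame_loglik[0][s] - math.log(n) for s in allowed]
    backptrs = []
    for frame in frame_loglik[1:]:
        new_score, ptr = [], []
        for j, s in enumerate(allowed):
            # Best predecessor i for landing in allowed state j at this frame.
            best_i = max(
                range(n),
                key=lambda i: score[i] + (log_stay if i == j else log_switch),
            )
            trans = log_stay if best_i == j else log_switch
            new_score.append(score[best_i] + trans + frame[s])
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)

    # Trace back from the best final state to recover the frame-level path.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return [allowed[i] for i in path]
```

As a usage example, take six frames and three states (0 = background, 1 and 2 = two concepts), where the last frame strongly favors state 2. If the clip is labeled only with concept 1, state 2 is not allowed and the decoder returns [0, 0, 1, 1, 1, 1]. If the clip carries both labels, it returns [0, 0, 1, 1, 1, 2]. The clip-level label therefore limits which local concepts can be placed on the time axis, while the background state absorbs frames that match no labeled concept.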
- LeeEL10-localconcepts.pdf (application/pdf, 194 KB)
Also Published In
- 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings: March 14-19, 2010, Sheraton Dallas Hotel, Dallas, Texas, U.S.A.
More About This Work
- Academic Units
- Electrical Engineering
- Published Here
- June 26, 2012