2011 Articles
Soundtrack classification by transient events
We present a method for video classification based on information in the soundtrack. Unlike previous approaches, which describe the audio via statistics of mel-frequency cepstral coefficient (MFCC) features calculated on uniformly spaced frames, our approach focuses the representation on audio transients corresponding to soundtrack events. These event-related features can reflect the "foreground" of the soundtrack and capture its short-term temporal structure better than conventional frame-based statistics. We evaluate our method on a test set of 1873 YouTube videos labeled with 25 semantic concepts. Retrieval results based on transient features alone are comparable to those of an MFCC-based system, and fusing the two representations achieves a relative improvement of 7.5% in mean average precision (MAP).
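For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how mean average precision and a relative improvement are commonly computed. It is not taken from the paper: the function names and the toy inputs are hypothetical, and the paper's exact evaluation protocol may differ.

```python
from typing import Sequence


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Average precision for one concept.

    `ranked_relevance[i]` is True when the item at rank i + 1 of the
    retrieval list is labeled with the concept. The result is the mean
    of the precision values taken at each rank holding a relevant item.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(per_concept: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-concept average precisions."""
    return sum(average_precision(r) for r in per_concept) / len(per_concept)


def relative_improvement(baseline: float, improved: float) -> float:
    """Fractional gain of `improved` over `baseline` (0.075 means 7.5%)."""
    return (improved - baseline) / baseline


if __name__ == "__main__":
    # Toy ranked lists for two hypothetical concepts (not data from the paper).
    toy_rankings = [[True, False, True], [False, True]]
    print(f"MAP on toy data: {mean_average_precision(toy_rankings):.4f}")
```

Under this definition, a relative improvement is measured against the baseline score itself, so it is a ratio of MAP values and not a difference in percentage points.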
Files
- CottonE11-transient.pdf (application/pdf, 745 KB)
Also Published In
- Title: 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings: May 22-27, 2011 Prague Congress Center, Prague, Czech Republic
- Publisher: IEEE
- DOI: https://doi.org/10.1109/ICASSP.2011.5946443
More About This Work
- Academic Units: Electrical Engineering
- Published Here: June 25, 2012