Short-term audio-visual atoms for generic video concept classification

Wei Jiang; Courtenay Valentine Cotton; Shih-Fu Chang; Daniel P. W. Ellis; Alexander C. Loui

Title:
Short-term audio-visual atoms for generic video concept classification
Author(s):
Jiang, Wei
Cotton, Courtenay Valentine
Chang, Shih-Fu
Ellis, Daniel P. W.
Loui, Alexander C.
Date:
2009
Type:
Articles
Department:
Electrical Engineering
Permanent URL:
Book/Journal Title:
MM '09: Proceedings of the 2009 ACM Multimedia Conference & co-located workshops: October 19-24, 2009, Beijing, China: AMC '09, CEA '09, EiMM '09, IMCE '09, LS-MMRM '09, MiFor '09, MSIADU '09, MTDL '09, SSCS '09, WSM '09, & WSMC '09
Publisher:
Association for Computing Machinery
Publisher Location:
New York
Abstract:
We investigate the challenging problem of joint audio-visual analysis of generic videos for semantic concept detection. We propose a novel representation, the Short-term Audio-Visual Atom (S-AVA), for improved concept detection. An S-AVA is defined as a short-term region track associated with regional visual features and background audio features. An effective algorithm, named Short-Term Region tracking with joint Point Tracking and Region Segmentation (STR-PTRS), is developed to extract S-AVAs from generic videos under challenging conditions such as uneven lighting, clutter, occlusions, and complicated motions of both objects and camera. Discriminative audio-visual codebooks are constructed on top of S-AVAs using Multiple Instance Learning, and codebook-based features are generated for semantic concept detection. We extensively evaluate our algorithm on Kodak's consumer benchmark video set, which was collected from real users. Experimental results confirm significant performance improvements: over 120% MAP gain compared to alternative approaches that use static region segmentation without temporal tracking. The joint audio-visual features also outperform visual features alone by an average of 8.5% (in terms of AP) over 21 concepts, with many concepts gaining more than 20%.
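The abstract reports its results as average precision (AP) per concept and mean average precision (MAP) across concepts. The record does not say which AP variant was used, so the sketch below assumes the common non-interpolated form. It also assumes the reported gains are relative to the baseline, which the abstract does not state. All function names are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Non-interpolated AP (assumed variant): mean precision at each positive's rank."""
    # Rank videos by detector score, highest first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    per_concept: Sequence[Tuple[Sequence[float], Sequence[int]]]
) -> float:
    """MAP: unweighted mean of AP over all concepts."""
    aps = [average_precision(scores, labels) for scores, labels in per_concept]
    return sum(aps) / len(aps)


def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement over a baseline (assumed meaning of 'gain')."""
    return (new - baseline) / baseline


if __name__ == "__main__":
    # Toy data for one concept: four videos, two of them positive.
    ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
    print(f"AP = {ap:.4f}")
    # Illustrative numbers only, not results from the paper.
    print(f"relative gain = {relative_gain(0.22, 0.10):.0%}")
```

Under the relative-gain assumption, a gain of more than 120% would mean the MAP more than doubled against the static-segmentation baseline. The numbers in the usage lines are toy values chosen for illustration.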
Subject(s):
Electrical engineering
Applied mathematics
Publisher DOI:
http://dx.doi.org/10.1145/1631272.1631277