Academic Commons


Prediction-driven computational auditory scene analysis for dense sound mixtures

Ellis, Daniel P. W.

We interpret the sound reaching our ears as the combined effect of independent, sound-producing entities in the external world; hearing would have limited usefulness if it were defeated by overlapping sounds. Computer systems that are to interpret real-world sounds -- for speech recognition or for multimedia indexing -- must similarly interpret complex mixtures. However, existing functional models of audition employ only data-driven processing incapable of making context-dependent inferences in the face of interference. We propose a prediction-driven approach to this problem, raising numerous issues including the need to represent any kind of sound, and to handle multiple competing hypotheses. Results from an implementation of this approach illustrate its ability to analyze complex, ambient sound scenes that would confound previous systems.


Also Published In

Proceedings of the Workshop on the Auditory Basis of Speech Perception: Keele University (UK) 15-19 July, 1996

More About This Work

Academic Units
Electrical Engineering
Keele University
Published Here
July 3, 2012