Prediction-driven computational auditory scene analysis for dense sound mixtures

Ellis, Daniel P. W.

We interpret the sound reaching our ears as the combined effect of independent, sound-producing entities in the external world; hearing would have limited usefulness if it were defeated by overlapping sounds. Computer systems that are to interpret real-world sounds -- for speech recognition or for multimedia indexing -- must similarly interpret complex mixtures. However, existing functional models of audition employ only data-driven processing, incapable of making context-dependent inferences in the face of interference. We propose a prediction-driven approach to this problem, raising numerous issues including the need to represent any kind of sound, and to handle multiple competing hypotheses. Results from an implementation of this approach illustrate its ability to analyze complex, ambient sound scenes that would confound previous systems.
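The interplay of prediction and competing hypotheses mentioned above can be illustrated with a minimal, purely hypothetical sketch (not the paper's implementation): each hypothesis explains the scene as a set of sound elements, predicts the next observation, and is re-scored and pruned against what is actually observed. All names and the scalar "observation" model here are illustrative assumptions.

```python
# Hypothetical sketch of a prediction-driven analysis loop: hypotheses that
# predict the incoming sound well survive; poorly predicting ones are pruned.

def predict(hypothesis):
    """Predicted observation: the sum of this hypothesis's sound elements."""
    return sum(hypothesis["elements"])

def analyze(observations, initial_hypotheses, beam=3):
    hypotheses = list(initial_hypotheses)
    for observed in observations:
        # Re-score each hypothesis by how well its prediction matches the data.
        for h in hypotheses:
            error = abs(observed - predict(h))
            h["score"] -= error  # larger prediction error -> lower score
        # Keep only the best few competing explanations of the scene.
        hypotheses.sort(key=lambda h: h["score"], reverse=True)
        hypotheses = hypotheses[:beam]
    return hypotheses[0]  # most plausible surviving hypothesis

best = analyze(
    observations=[3.0, 3.0, 3.0],
    initial_hypotheses=[
        {"elements": [1.0, 2.0], "score": 0.0},  # two sources summing to 3
        {"elements": [5.0], "score": 0.0},       # one loud source
    ],
)
# The two-source hypothesis predicts the mixture exactly and wins.
```

The key property this toy loop shares with the prediction-driven idea is that interpretation flows top-down: observations are used to confirm or refute explanations, rather than being segmented bottom-up on their own.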


Also Published In

Proceedings of the Workshop on the Auditory Basis of Speech Perception: Keele University (UK), 15-19 July 1996
Keele University

More About This Work

Academic Units
Electrical Engineering
Published Here
July 3, 2012