1999 Articles
Using knowledge to organize sound: The prediction-driven approach to computational auditory scene analysis and its application to speech/nonspeech mixtures
Computational auditory scene analysis, the modeling of the human ability to organize sound mixtures according to their sources, has evolved rapidly from simple implementations of psychoacoustically inspired rules to complex systems able to process demanding real-world sounds. Phenomena such as the continuity illusion and phonemic restoration show that the brain is able to use a wide range of knowledge-based contextual constraints when interpreting obscured or overlapping mixtures. To model such processing, we need architectures that operate by confirming hypotheses about the observations rather than relying on directly extracted descriptions. One such architecture, the 'prediction-driven' approach, is presented along with results from its initial implementation. This architecture can be extended to take advantage of the high-level knowledge implicit in today's speech recognizers by modifying a recognizer to act as one of the 'component models' providing the explanations of the signal mixture. A preliminary investigation indicates the viability of this approach while also raising a number of issues, which are discussed. These results point to the conclusion that successful scene analysis must, at every level, exploit abstract knowledge about sound sources.
Also Published In
- Title: Speech Communication
- DOI: https://doi.org/10.1016/S0167-6393(98)00083-1
More About This Work
- Academic Units: Electrical Engineering
- Published Here: February 14, 2012