Using knowledge to organize sound: The prediction-driven approach to computational auditory scene analysis and its application to speech/nonspeech mixtures
Daniel P. W. Ellis
- Electrical Engineering
- Book/Journal Title: Speech Communication
- Computational auditory scene analysis, the modeling of the human ability to organize sound mixtures according to their sources, has evolved rapidly from simple implementations of psychoacoustically inspired rules to complex systems able to process demanding real-world sounds. Phenomena such as the continuity illusion and phonemic restoration show that the brain is able to use a wide range of knowledge-based contextual constraints when interpreting obscured or overlapping mixtures. To model such processing, we need architectures that operate by confirming hypotheses about the observations rather than relying on directly extracted descriptions. One such architecture, the 'prediction-driven' approach, is presented along with results from its initial implementation. This architecture can be extended to take advantage of the high-level knowledge implicit in today's speech recognizers by modifying a recognizer to act as one of the 'component models' providing the explanations of the signal mixture. A preliminary investigation indicates the viability of this approach while also raising a number of issues, which are discussed. These results point to the conclusion that successful scene analysis must, at every level, exploit abstract knowledge about sound sources.