1997 Articles
Computational auditory scene analysis exploiting speech-recognition knowledge
The field of computational auditory scene analysis (CASA) strives to build computer models of the human ability to interpret sound mixtures as the combination of distinct sources. A major obstacle to this enterprise is defining and incorporating the kind of high-level knowledge of real-world signal structure exploited by listeners. Speech recognition, while typically ignoring the problem of nonspeech inclusions, has been very successful at deriving powerful statistical models of speech structure from training data. In this paper, we describe a scene analysis system that includes both speech and nonspeech components, addressing the problem of working backwards from speech recognizer output to estimate the speech component of a mixture. Ultimately, such hybrid approaches will require more radical adaptation of current speech recognition approaches.
Files
- waspaa97-asrcasa.pdf (application/pdf, 39.9 KB)
Also Published In
- Title: 1997 IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics: October 19-22, Mohonk Mountain House, New Paltz, New York
- Publisher: IEEE
- DOI: https://doi.org/10.1109/ASPAA.1997.625625
More About This Work
- Academic Units: Electrical Engineering
- Published Here: July 3, 2012