Tandem connectionist feature stream extraction for conventional HMM systems
Hynek Hermansky; Daniel P. W. Ellis; Sangita Sharma
Ellis, Daniel P. W.
- Electrical Engineering
- Book/Journal Title: 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings, 5-9 June, 2000, Hilton Hotel and Convention Center, Istanbul, Turkey
- Publisher Location: Piscataway, N.J.
- Hidden Markov model speech recognition systems typically use Gaussian mixture models to estimate the distributions of decorrelated acoustic feature vectors that correspond to individual subword units. By contrast, hybrid connectionist-HMM systems use discriminatively trained neural networks to estimate the probability distribution among subword units given the acoustic observations. In this work we show a large improvement in word recognition performance by combining neural-net discriminative feature processing with Gaussian-mixture distribution modeling. By training the network to generate the subword posterior probabilities, then using transformations of these estimates as the base features for a conventionally trained Gaussian-mixture based system, we achieve relative error rate reductions of 35% or more on the multicondition Aurora noisy continuous digits task.
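The feature path described in the abstract (network posteriors, then a transformation, then Gaussian-mixture modeling) can be illustrated with a minimal sketch. The abstract does not say which transformations are applied, so the log compression and PCA decorrelation below are assumptions, chosen because Gaussian mixture systems are described as working on decorrelated feature vectors. The function name `tandem_features`, its parameters, and the random stand-in posteriors are hypothetical and are not taken from the paper.

```python
import numpy as np


def tandem_features(posteriors, n_components=None, eps=1e-10):
    """Turn per-frame subword posteriors into decorrelated features.

    posteriors: array of shape (frames, subword_units), rows sum to 1.
    Assumed transformation (not stated in the abstract): log, then PCA.
    """
    # Log compression makes the highly skewed posteriors closer to Gaussian.
    log_post = np.log(np.clip(posteriors, eps, 1.0))

    # Center the data before estimating the covariance.
    centered = log_post - log_post.mean(axis=0)

    # PCA via eigendecomposition of the symmetric covariance matrix.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # eigh returns ascending eigenvalues; keep the largest-variance axes first.
    order = np.argsort(eigvals)[::-1]
    basis = eigvecs[:, order]
    if n_components is not None:
        basis = basis[:, :n_components]

    # Project onto the principal axes to obtain decorrelated features.
    return centered @ basis


# Stand-in for network outputs: softmax over random scores (hypothetical data).
rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 24))
posteriors = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

features = tandem_features(posteriors, n_components=12)
```

In a real system the resulting `features` array would replace or augment the usual acoustic features as input to a conventionally trained Gaussian-mixture HMM; that training step is outside the scope of this sketch.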