Size matters: An empirical study of neural network training for large vocabulary continuous speech recognition

Ellis, Daniel P. W.; Morgan, Nelson

We have trained and tested a number of large neural networks for the purpose of emission probability estimation in large vocabulary continuous speech recognition. In particular, the problem under test is the DARPA Broadcast News task. Our goal here was to determine the relationship between training time, word error rate, size of the training set, and size of the neural network. In all cases, the network architecture was quite simple, comprising a single large hidden layer with an input window consisting of feature vectors from 9 frames around the current time, with a single output for each of 54 phonetic categories. Thus far, simultaneous increases to the size of the training set and the neural network improve performance; in other words, more data helps, as does the training of more parameters. We continue to be surprised that such a simple system works as well as it does for complex tasks. Given a limitation in training time, however, there appears to be an optimal ratio of training patterns to parameters of around 25:1 in these circumstances. Additionally, doubling the training data and system size appears to provide diminishing returns of error rate reduction for the largest systems.
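The relationship the abstract describes can be made concrete with a small parameter-counting sketch. This is a minimal illustration, not the authors' code: the per-frame feature dimension and hidden-layer size below are assumptions chosen for the example, while the 9-frame input window, the 54 phonetic outputs, and the ~25:1 patterns-to-parameters ratio come from the abstract.

```python
# Hedged sketch: counting trainable parameters for a single-hidden-layer
# MLP of the kind the abstract describes, and applying its observed
# ~25:1 optimal ratio of training patterns to parameters.

def mlp_param_count(feat_dim, context_frames, hidden_units, n_outputs):
    """Weights plus biases for input->hidden and hidden->output layers."""
    n_inputs = feat_dim * context_frames          # stacked context window
    input_to_hidden = (n_inputs + 1) * hidden_units   # +1 for bias
    hidden_to_output = (hidden_units + 1) * n_outputs
    return input_to_hidden + hidden_to_output

def suggested_training_patterns(n_params, ratio=25):
    """Abstract's empirical optimum: about 25 training patterns per parameter."""
    return n_params * ratio

# Illustrative values: 13-dim features and 2000 hidden units are
# assumptions; 9 frames and 54 outputs are from the abstract.
params = mlp_param_count(feat_dim=13, context_frames=9,
                         hidden_units=2000, n_outputs=54)
patterns = suggested_training_patterns(params)
```

Under these assumed sizes, the sketch suggests roughly 8.6 million training patterns for a network of about 344 thousand parameters, which conveys the scale at which the ratio in the abstract operates.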

Also Published In

Title
1999 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings: ICASSP99 Phoenix: March 15-19, 1999, Civic Plaza, Hyatt Regency, Phoenix, Arizona, U.S.A.
DOI
https://doi.org/10.1109/ICASSP.1999.759875

More About This Work

Academic Units
Electrical Engineering
Publisher
IEEE
Published Here
July 3, 2012