Academic Commons

Presentations (Communicative Events)

On-the-fly Topic Adaptation for YouTube Video Transcription

Thadani, Kapil; Biadsy, Fadi; Bikel, Daniel

Automatic closed-captioning of video is a useful application of speech recognition technology but poses numerous challenges when applied to open-domain user-uploaded videos such as those on YouTube. In this work, we explore a strategy to improve transcription accuracy by decoding each video with a language model (LM) adapted specifically to the topics that the video covers. Taxonomic topic classifiers are used to determine the topic content of videos and to build a large set of topic-specific LMs from web documents. We consider strategies for selecting and interpolating LMs in both supervised and unsupervised scenarios in a two-pass lattice rescoring framework. Experiments on a YouTube video corpus show a 3.6-point absolute reduction in word error rate (WER) over generic single-pass transcriptions, as well as a statistically significant 0.8-point absolute improvement over rescoring with a very large non-adapted LM built from all the documents.
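The interpolate-then-rescore idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: it assumes unigram word-probability tables in place of full n-gram LMs, an n-best list in place of a lattice, and topic weights supplied directly rather than produced by a classifier. The function names, the `floor` value, and the toy data are all hypothetical.

```python
import math

def interpolate(lms, weights):
    """Linear interpolation: P(w) = sum_k lambda_k * P_k(w), with lambdas normalized to sum to 1."""
    total = sum(weights.values())
    vocab = set().union(*(lm.keys() for lm in lms.values()))
    return {
        w: sum(weights[t] / total * lms[t].get(w, 0.0) for t in weights)
        for w in vocab
    }

def lm_logprob(lm, words, floor=1e-6):
    """Log-probability of a word sequence under a unigram table (floor is an assumed backoff)."""
    return sum(math.log(max(lm.get(w, 0.0), floor)) for w in words)

def rescore(nbest, lm, lm_weight=1.0):
    """Second pass: pick the hypothesis maximizing first-pass score + weighted LM log-probability."""
    return max(nbest, key=lambda h: h[1] + lm_weight * lm_logprob(lm, h[0]))

# Toy topic-specific LMs (hypothetical numbers).
topic_lms = {
    "cooking":   {"add": 0.4, "flour": 0.5,  "flower": 0.1},
    "gardening": {"add": 0.4, "flour": 0.05, "flower": 0.55},
}
# Assumed topic weights for one video, e.g. from a topic classifier.
weights = {"cooking": 0.9, "gardening": 0.1}
adapted = interpolate(topic_lms, weights)

# First-pass hypotheses as (words, first-pass log score); the generic pass prefers "flower".
nbest = [(["add", "flower"], -1.0), (["add", "flour"], -1.2)]
best = rescore(nbest, adapted)
```

With these toy numbers the adapted LM gives "flour" enough extra probability to overturn the first-pass preference for "flower", which is the kind of correction topic adaptation is meant to enable.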

More About This Work

Academic Units
Computer Science
Published Here
April 26, 2013