Academic Commons

Presentations (Communicative Events)

Soundbite Detection in Broadcast News Domain

Hirschberg, Julia Bell; Maskey, Sameer R.

In this paper, we present the results of a study designed to identify soundbites in Broadcast News. We describe a Conditional Random Field (CRF)-based model for detecting these segments: speech included in a broadcast and uttered by individuals who are interviewed or who are the subject of a news story. Our goal is to identify direct quotations in spoken corpora that can be attributed to particular individuals, and to associate these soundbites with their speakers. We frame soundbite detection as a binary classification problem in which each turn is categorized as either a soundbite or not. We use lexical, acoustic/prosodic, and structural features at the turn level to train a CRF. In a 10-fold cross-validation experiment, we obtained an accuracy of 67.4% and an F-measure of 0.566, which are 20.9% and 38.6% higher, respectively, than a chance baseline.

Index Terms: soundbite detection, speaker roles, speech summarization, information extraction.
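
The evaluation described in the abstract (binary soundbite/non-soundbite labels per turn, scored by accuracy and F-measure over 10 folds) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `accuracy_and_f1` and `k_fold_indices`, the round-robin fold assignment, and the toy labels are all assumptions made for illustration, and the CRF itself is not modelled here.

```python
from typing import List, Sequence, Tuple


def accuracy_and_f1(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float]:
    """Score binary turn labels (1 = soundbite, 0 = not a soundbite).

    Returns (accuracy, F-measure), where the F-measure is the harmonic
    mean of precision and recall on the soundbite class.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    correct = sum(1 for g, p in zip(gold, pred) if g == p)

    accuracy = correct / len(gold) if gold else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


def k_fold_indices(n_items: int, k: int = 10) -> List[List[int]]:
    """Assign item indices to k folds round-robin (hypothetical scheme).

    Each fold serves once as the held-out test set; the remaining
    folds form the training set.
    """
    return [list(range(i, n_items, k)) for i in range(k)]


# Toy labels for ten turns, invented for illustration only.
gold_labels = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1]
pred_labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
acc, f1 = accuracy_and_f1(gold_labels, pred_labels)
print(f"accuracy={acc:.3f} F-measure={f1:.3f}")
```

Because a CRF labels turns in sequence, a real experiment would more plausibly split folds by broadcast rather than by individual turn; the abstract does not say how the folds were formed.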

More About This Work

Academic Units: Computer Science
Publisher: Proceedings of Interspeech
Published Here: July 5, 2013