Academic Commons

The ICSI Meeting Corpus

Janin, Adam; Baron, Don; Edwards, Jane; Ellis, Daniel P. W.; Gelbart, David; Morgan, Nelson; Peskin, Barbara; Pfau, Thilo; Shriberg, Elizabeth; Stolcke, Andreas; Wooters, Chuck

We have collected a corpus of data from natural meetings that occurred at the International Computer Science Institute (ICSI) in Berkeley, California over the last three years. The corpus contains audio recorded simultaneously from head-worn and table-top microphones, word-level transcripts of meetings, and various metadata on participants, meetings, and hardware. Such a corpus supports work in automatic speech recognition, noise robustness, dialog modeling, prosody, rich transcription, information retrieval, and more. We present details on the contents of the corpus, as well as rationales for the decisions that led to its configuration. The corpus was delivered to the Linguistic Data Consortium (LDC).

Also Published In

Title
2003 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings: April 6-10, 2003, Hong Kong Exhibition and Convention Centre, Hong Kong
Publisher
IEEE
DOI
https://doi.org/10.1109/ICASSP.2003.1198793

More About This Work

Academic Units
Electrical Engineering
Published Here
June 29, 2012