The ICSI Meeting Corpus

Adam Janin; Don Baron; Jane Edwards; Daniel P. W. Ellis; David Gelbart; Nelson Morgan; Barbara Peskin; Thilo Pfau; Elizabeth Shriberg; Andreas Stolcke; Chuck Wooters

Title:
The ICSI Meeting Corpus
Author(s):
Janin, Adam
Baron, Don
Edwards, Jane
Ellis, Daniel P. W.
Gelbart, David
Morgan, Nelson
Peskin, Barbara
Pfau, Thilo
Shriberg, Elizabeth
Stolcke, Andreas
Wooters, Chuck
Date:
Type:
Articles
Department:
Electrical Engineering
Permanent URL:
Book/Journal Title:
2003 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings: April 6-10, 2003, Hong Kong Exhibition and Convention Centre, Hong Kong
Publisher:
IEEE
Publisher Location:
Piscataway, N.J.
Abstract:
We have collected a corpus of data from natural meetings that occurred at the International Computer Science Institute (ICSI) in Berkeley, California, over the last three years. The corpus contains audio recorded simultaneously from head-worn and table-top microphones, word-level transcripts of meetings, and various metadata on participants, meetings, and hardware. Such a corpus supports work in automatic speech recognition, noise robustness, dialog modeling, prosody, rich transcription, information retrieval, and more. We present details on the contents of the corpus, as well as rationales for the decisions that led to its configuration. The corpus was delivered to the Linguistic Data Consortium (LDC).
Subject(s):
Technical communication
Artificial intelligence
Publisher DOI:
http://dx.doi.org/10.1109/ICASSP.2003.1198793