Integrating Categorization, Clustering, and Summarization for Daily News Browsing

Barzilay, Regina; Evans, David Kirk; Vasileios, Kemerlis; Sigelman, Sergey

Recently, there have been significant advances in several areas of language technology, including clustering, text categorization, and summarization. However, efforts to combine technology from these areas in a practical system for information access have been limited. In this paper, we present a system that integrates cutting-edge technology in these areas to automatically collect news articles from multiple sources, organize them and present them in both hierarchical and text summary form. Our system is publicly available and runs daily over real data. Through a sizable user evaluation, we show that users strongly prefer using the advanced features incorporated in our system, and that these features help users achieve more efficient browsing of news.

More About This Work

Academic Units: Computer Science
Publisher: Department of Computer Science, Columbia University
Series: Columbia University Computer Science Technical Reports, CUCS-023-03
Published Here: April 26, 2011