2003 Reports
Integrating Categorization, Clustering, and Summarization for Daily News Browsing
Recently, there have been significant advances in several areas of language technology, including clustering, text categorization, and summarization. However, efforts to combine technology from these areas in a practical system for information access have been limited. In this paper, we present a system that integrates cutting-edge technology in these areas to automatically collect news articles from multiple sources, organize them and present them in both hierarchical and text summary form. Our system is publicly available and runs daily over real data. Through a sizable user evaluation, we show that users strongly prefer using the advanced features incorporated in our system, and that these features help users achieve more efficient browsing of news.
Files
- cucs-023-03.pdf (application/pdf, 154 KB)
More About This Work
- Academic Units: Computer Science
- Publisher: Department of Computer Science, Columbia University
- Series: Columbia University Computer Science Technical Reports, CUCS-023-03
- Published Here: April 26, 2011