Topic Shift Detection - finding new information in threaded news

Radev, Dragomir R.

On-line sources of news typically follow a particular pattern when presenting updates on a news event over time. First, they produce a preliminary report on the event, and later they send out updates as the story evolves. There are two classes of readers accessing the later stories: those who have read the original announcement and are familiar with the story background, and those who are "joining" the thread at a later point in time. Because both classes of readers exist, news sources typically include in subsequent stories some information that was already present in earlier stories. We discuss our approach to identifying such repeated pieces of information in news threads and show how this knowledge can help in generating user-specific summaries of entire threads of articles.
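
To make the idea of repeated versus new information concrete, the following is a minimal illustrative sketch and not the method of the report, which this abstract does not specify. It assumes that a sentence in a later story counts as "repeated" when its word overlap (Jaccard similarity) with some sentence from an earlier story reaches a threshold. The function names, the tokenization, and the threshold of 0.5 are all hypothetical choices made for illustration.

```python
import re


def tokens(sentence):
    """Return the set of lowercase word tokens in a sentence (hypothetical helper)."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def split_new_and_repeated(earlier_sentences, later_sentences, threshold=0.5):
    """Partition sentences of a later story into (new, repeated).

    Assumption: a later sentence is "repeated" if its best Jaccard overlap
    with any earlier sentence is at least `threshold`. The threshold value
    is an arbitrary illustrative choice, not taken from the report.
    """
    earlier = [tokens(s) for s in earlier_sentences]
    new, repeated = [], []
    for sentence in later_sentences:
        t = tokens(sentence)
        best = max((jaccard(t, e) for e in earlier), default=0.0)
        if best >= threshold:
            repeated.append(sentence)
        else:
            new.append(sentence)
    return new, repeated


if __name__ == "__main__":
    first_report = ["A fire broke out at the downtown warehouse on Monday."]
    update = [
        "A fire broke out at the downtown warehouse on Monday night.",
        "Officials now say two workers were injured.",
    ]
    new, repeated = split_new_and_repeated(first_report, update)
    print("new:", new)
    print("repeated:", repeated)
```

Under these assumptions, a summary for a reader who has already seen the first report could be built from the "new" sentences only, while a reader joining the thread late would also need the "repeated" background sentences.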



More About This Work

Academic Units
Computer Science
Department of Computer Science, Columbia University
Columbia University Computer Science Technical Reports, CUCS-026-99
Published Here
April 25, 2011