Machine Learning and Text Segmentation in Novelty Detection

Barry Schiffman; Kathleen McKeown

Title:
Machine Learning and Text Segmentation in Novelty Detection
Author(s):
Schiffman, Barry
McKeown, Kathleen
Date:
Type:
Technical reports
Department:
Computer Science
Permanent URL:
Series:
Columbia University Computer Science Technical Reports
Part Number:
CUCS-036-04
Abstract:
This paper explores a combination of machine learning, approximate text segmentation, and a vector-space model to distinguish novel information from repeated information. In experiments on data from the Novelty Track of the Text REtrieval Conference (TREC), we show improvements over a variety of approaches, in particular higher precision scores on this data, while maintaining a reasonable level of recall.
Subject(s):
Computer science
