Technical reports:
Machine Learning and Text Segmentation in Novelty Detection
Barry Schiffman; Kathleen McKeown
- Title:
- Machine Learning and Text Segmentation in Novelty Detection
- Author(s):
- Schiffman, Barry; McKeown, Kathleen
- Date:
- 2004
- Type:
- Technical reports
- Department:
- Computer Science
- Permanent URL:
- http://hdl.handle.net/10022/AC:P:29238
- Series:
- Columbia University Computer Science Technical Reports
- Part Number:
- CUCS-036-04
- Publisher:
- Department of Computer Science, Columbia University
- Publisher Location:
- New York
- Abstract:
- This paper explores a combination of machine learning, approximate text segmentation and a vector-space model to distinguish novel information from repeated information. In experiments with the data from the Novelty Track at the Text Retrieval Conference, we show improvements over a variety of approaches, in particular in raising precision scores on this data, while maintaining a reasonable amount of recall.
- Subject(s):
- Computer science