Similarity-based Multilingual Multi-Document Summarization
David Kirk Evans; Kathleen McKeown; Judith L. Klavans
- Title:
- Similarity-based Multilingual Multi-Document Summarization
- Author(s):
- Evans, David Kirk; McKeown, Kathleen; Klavans, Judith L.
- Date:
- 2005
- Type:
- Technical reports
- Department:
- Computer Science
- Permanent URL:
- http://hdl.handle.net/10022/AC:P:29166
- Series:
- Columbia University Computer Science Technical Reports
- Part Number:
- CUCS-014-05
- Publisher:
- Department of Computer Science, Columbia University
- Publisher Location:
- New York
- Abstract:
- We present a new approach for summarizing clusters of documents on the same event, some of which are machine translations of foreign-language documents and some of which are English. Our approach to multilingual multi-document summarization uses text similarity to choose sentences from English documents based on the content of the machine-translated documents. A manual evaluation shows that 68% of the sentence replacements improve the summary, and the overall summarization approach outperforms first-sentence extraction baselines in automatic ROUGE-based evaluations.
- Subject(s):
- Computer science
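
The abstract describes choosing sentences from English documents based on their similarity to machine-translated content. The following is a minimal sketch of that general idea, not the report's actual method: the bag-of-words cosine similarity, the `threshold` value of 0.5, and the function names `tokenize`, `cosine_similarity`, and `replace_with_english` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two sentences over raw term counts.

    Assumption: the report does not state its similarity measure here;
    plain bag-of-words cosine is used only as a stand-in.
    """
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def replace_with_english(translated_summary, english_sentences, threshold=0.5):
    """Swap each machine-translated sentence for its closest English sentence.

    A translated sentence is kept unchanged when no English sentence
    reaches the (hypothetical) similarity threshold.
    """
    result = []
    for mt in translated_summary:
        best = max(
            english_sentences,
            key=lambda s: cosine_similarity(mt, s),
            default=None,
        )
        if best is not None and cosine_similarity(mt, best) >= threshold:
            result.append(best)
        else:
            result.append(mt)
    return result


if __name__ == "__main__":
    translated = ["The minister has said the earthquake kill many people in city."]
    english = [
        "The minister said the earthquake killed many people in the city.",
        "Stock markets rose on Tuesday.",
    ]
    for sentence in replace_with_english(translated, english):
        print(sentence)
```

In this sketch a disfluent machine-translated sentence is replaced only when an English sentence covers largely the same words; a stricter threshold trades fewer replacements for a lower risk of substituting a sentence with different content.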