Similarity-based Multilingual Multi-Document Summarization

David Kirk Evans; Kathleen McKeown; Judith L. Klavans

Author(s):
Evans, David Kirk
McKeown, Kathleen
Klavans, Judith L.
Date:
Type:
Technical reports
Department:
Computer Science
Permanent URL:
Series:
Columbia University Computer Science Technical Reports
Part Number:
CUCS-014-05
Publisher:
Department of Computer Science, Columbia University
Publisher Location:
New York
Abstract:
We present a new approach to summarizing clusters of documents about the same event, where some documents are machine translations of foreign-language sources and others were originally written in English. Our approach to multilingual multi-document summarization uses text similarity to select sentences from the English documents based on the content of the machine-translated documents. In a manual evaluation, 68% of the sentence replacements improve the summary, and the overall summarization approach outperforms first-sentence extraction baselines in automatic ROUGE-based evaluations.
Subject(s):
Computer science
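
Illustrative sketch:
The sentence-replacement idea in the abstract can be illustrated with a minimal Python sketch. This is not the report's implementation: the record does not say which similarity measure the authors use, so cosine similarity over word counts stands in for it here. The function names and the 0.5 threshold are also hypothetical choices made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between the word-count vectors of two sentences.

    Assumption: a stand-in metric, not necessarily the one in the report.
    """
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def replace_with_similar(mt_sentences, english_sentences, threshold=0.5):
    """Swap each machine-translated sentence for its closest English sentence.

    A sentence is replaced only when the best match reaches the
    (hypothetical) threshold; otherwise the translated sentence is kept.
    """
    result = []
    for mt in mt_sentences:
        best = max(english_sentences, key=lambda en: cosine(mt, en), default=None)
        if best is not None and cosine(mt, best) >= threshold:
            result.append(best)
        else:
            result.append(mt)
    return result


mt_summary = [
    "The minister have visit the capital on Monday",
    "Rain fall heavy in north region",
]
english_pool = [
    "The minister visited the capital on Monday.",
    "Stocks rose sharply.",
]
summary = replace_with_similar(mt_summary, english_pool)
```

In this toy input, the first translated sentence shares most of its words with an English sentence and is replaced, while the second has no close English counterpart and is kept as is.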