Academic Commons

Reports

Similarity-based Multilingual Multi-Document Summarization

Evans, David Kirk; McKeown, Kathleen; Klavans, Judith L.

We present a new approach for summarizing clusters of documents on the same event, some of which are machine translations of foreign-language documents and some of which are English. Our approach to multilingual multi-document summarization uses text similarity to choose sentences from English documents based on the content of the machine-translated documents. A manual evaluation shows that 68% of the sentence replacements improve the summary, and the overall summarization approach outperforms first-sentence extraction baselines in automatic ROUGE-based evaluations.
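
The replacement idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure the report uses, so the code below assumes cosine similarity over raw word counts as a stand-in. The function names and the 0.5 threshold are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two sentences using raw word counts.

    Assumption: a stand-in metric, not necessarily the one used in the report.
    """
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def replace_with_english(translated_sentences, english_sentences, threshold=0.5):
    """Swap each machine-translated sentence for its most similar English
    sentence when the similarity reaches the (hypothetical) threshold;
    otherwise keep the machine-translated sentence unchanged.
    """
    result = []
    for mt in translated_sentences:
        best = max(
            english_sentences,
            key=lambda s: cosine_similarity(mt, s),
            default=None,
        )
        if best is not None and cosine_similarity(mt, best) >= threshold:
            result.append(best)
        else:
            result.append(mt)
    return result


if __name__ == "__main__":
    translated = [
        "The minister have said talks will begin on Monday",
        "Rain fell in the mountains",
    ]
    english = [
        "The minister said talks will begin on Monday.",
        "Stock markets fell sharply.",
    ]
    for sentence in replace_with_english(translated, english):
        print(sentence)
```

In this toy input, the first machine-translated sentence has a close English counterpart and is replaced, while the second has none above the threshold and is kept as is.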

More About This Work

Academic Units: Computer Science
Publisher: Department of Computer Science, Columbia University
Series: Columbia University Computer Science Technical Reports, CUCS-014-05
Published Here: April 26, 2011