Academic Commons

Reports

Similarity-based Multilingual Multi-Document Summarization

Evans, David Kirk; McKeown, Kathleen; Klavans, Judith L.

We present a new approach for summarizing clusters of documents on the same event, some of which are machine translations of foreign-language documents and some of which are English. Our approach to multilingual multi-document summarization uses text similarity to choose sentences from English documents based on the content of the machine translated documents. A manual evaluation shows that 68\% of the sentence replacements improve the summary, and the overall summarization approach outperforms first-sentence extraction baselines in automatic ROUGE-based evaluations.

Subjects

Files

More About This Work

Academic Units
Computer Science
Publisher
Department of Computer Science, Columbia University
Series
Columbia University Computer Science Technical Reports, CUCS-014-05
Published Here
April 26, 2011
Academic Commons provides global access to research and scholarship produced at Columbia University, Barnard College, Teachers College, Union Theological Seminary and Jewish Theological Seminary. Academic Commons is managed by the Columbia University Libraries.