Academic Commons

Presentations (Communicative Events)

Syntactic Simplification for Improving Content Selection in Multi-Document Summarization

Siddharthan, Advaith; Nenkova, Ani; McKeown, Kathleen

In this paper, we explore the use of automatic syntactic simplification for improving content selection in multi-document summarization. In particular, we show that simplifying parentheticals by removing relative clauses and appositives improves sentence clustering, because it forces clustering to rely on central rather than background information. We argue that including parenthetical information in a summary is a reference-generation task rather than a content-selection one, and we implement a baseline reference rewriting module. We evaluate on the test sets from the 2003 and 2004 Document Understanding Conferences and report that simplifying parentheticals yields a significant improvement on the automated evaluation metric ROUGE.
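As a rough illustration of why removing parentheticals can help sentence clustering, the following is a minimal sketch and not the authors' system. It assumes a crude regular expression in place of real syntactic analysis, handles only comma-delimited relative clauses (not appositives), and uses Jaccard word overlap as a stand-in for whatever similarity measure the clustering uses. The names `strip_relative_clauses`, `tokens`, and `jaccard` are hypothetical.

```python
import re

# Assumption: a comma-delimited span that opens with a relative pronoun is
# treated as a non-restrictive relative clause. A real simplifier would rely
# on syntactic analysis rather than this pattern.
RELATIVE_CLAUSE = re.compile(r",\s+(?:who|which|whose|whom)\b[^,]*,")


def strip_relative_clauses(sentence):
    """Drop comma-delimited relative clauses and normalize whitespace."""
    simplified = RELATIVE_CLAUSE.sub("", sentence)
    return re.sub(r"\s+", " ", simplified).strip()


def tokens(sentence):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard word overlap between two sentences (stand-in similarity)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


# Two sentences reporting the same event with different background details.
s1 = "The senator, who was born in Ohio, announced a new tax plan."
s2 = "The senator, who chairs the finance committee, announced a new tax plan."

before = jaccard(s1, s2)
after = jaccard(strip_relative_clauses(s1), strip_relative_clauses(s2))
```

In this toy example the two sentences differ only in their background clauses, so the overlap computed after stripping is higher than the overlap computed on the original sentences, which is the effect the abstract attributes to simplification.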

Files

  • siddharthan_copestake_04.pdf (application/pdf, 154 KB)

More About This Work

Academic Units: Computer Science
Publisher: 20th International Conference on Computational Linguistics (COLING 2004)
Published Here: May 31, 2013