Academic Commons

Presentations (Communicative Events)

Time-Efficient Creation of an Accurate Sentence Fusion Corpus

Thadani, Kapil; Rosenthal, Sara; McKeown, Kathleen; Moore, Coleman

Sentence fusion enables summarization and question-answering systems to produce output by combining fully formed phrases from different sentences. Yet there is little data that can be used to develop and evaluate fusion techniques. In this paper, we present a methodology for collecting fusions of similar sentence pairs using Amazon’s Mechanical Turk, selecting the input pairs in a semiautomated fashion. We evaluate the results using a novel technique for automatically selecting a representative sentence from multiple responses. Our approach allows for rapid construction of a high-accuracy fusion corpus.



More About This Work

Academic Units: Computer Science
Published Here: April 29, 2013