2007 Presentations (Communicative Events)
Lexicalized Markov Grammars for Sentence Compression
We present a sentence compression system based on synchronous context-free grammars (SCFG), following the successful noisy-channel approach of Knight and Marcu (2000). We define a head-driven Markovization formulation of SCFG deletion rules, which allows us to lexicalize probabilities of constituent deletions. We also use a robust approach for tree-to-tree alignment between arbitrary document-abstract parallel corpora, which lets us train lexicalized models with much more data than previous approaches relying exclusively on scarcely available document-compression corpora. Finally, we evaluate different Markovized models, and find that our selected best model is one that exploits head-modifier bilexicalization to accurately distinguish adjuncts from complements, and that produces sentences that were judged more grammatical than those generated by previous work.
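As a rough illustration of what lexicalizing deletion probabilities can mean, the sketch below estimates, by relative frequency, how often a modifier is kept given its parent label, the head word, the modifier label, and the modifier's own head word (a bilexical context). It is not the paper's model: the toy observations, the function names, the zeroth-order independence assumption between sibling modifiers, and the absence of smoothing are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy data, invented for illustration only. Each tuple is
# (parent label, head word, modifier label, modifier head word, was_kept).
OBSERVATIONS = [
    ("VP", "put", "PP", "on", True),
    ("VP", "put", "PP", "on", True),
    ("VP", "put", "ADVP", "quickly", False),
    ("VP", "ate", "PP", "on", False),
    ("VP", "ate", "PP", "on", False),
    ("VP", "ate", "PP", "on", True),
]


def estimate_keep_probs(observations):
    """Relative-frequency estimate of P(keep | parent, head, mod label, mod word)."""
    total = Counter()
    kept = Counter()
    for parent, head, mod_label, mod_word, was_kept in observations:
        context = (parent, head, mod_label, mod_word)
        total[context] += 1
        if was_kept:
            kept[context] += 1
    return {context: kept[context] / total[context] for context in total}


def compression_prob(parent, head, modifier_decisions, keep_probs):
    """Probability of a set of keep/delete decisions for one constituent.

    modifier_decisions is a list of (mod_label, mod_word, keep) triples.
    Assumption: modifiers are treated as independent given the head
    (a zeroth-order simplification), and there is no smoothing, so an
    unseen context raises KeyError.
    """
    prob = 1.0
    for mod_label, mod_word, keep in modifier_decisions:
        p_keep = keep_probs[(parent, head, mod_label, mod_word)]
        prob *= p_keep if keep else 1.0 - p_keep
    return prob


keep_probs = estimate_keep_probs(OBSERVATIONS)
p_delete_pp_after_ate = compression_prob(
    "VP", "ate", [("PP", "on", False)], keep_probs
)
```

In this toy data the same PP headed by "on" is always kept under "put", where it behaves like a complement, but usually deleted under "ate", where it behaves like an adjunct. An unlexicalized estimate conditioned only on the labels VP and PP could not make that distinction.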
Files
- galley_mckeown_07.pdf application/pdf 165 KB
More About This Work
- Academic Units
- Computer Science
- Publisher
- Proceedings of NAACL-HLT
- Published Here
- July 14, 2013