Academic Commons

Presentations (Communicative Events)

Scalable Inference and Training of Context-Rich Syntactic Translation Models

Galley, Michel; Graehl, Jonathan; Knight, Kevin; Marcu, Daniel; DeNeefe, Steve; Wang, Wei; Thayer, Ignacio

Statistical MT has made great progress in the last few years, but current translation models are weak on re-ordering and target language fluency. Syntactic approaches seek to remedy these problems. In this paper, we take the framework for acquiring multi-level syntactic translation rules of (Galley et al., 2004) from aligned tree-string pairs, and present two main extensions of their approach: first, instead of merely computing a single derivation that minimally explains a sentence pair, we construct a large number of derivations that include contextually richer rules, and account for multiple interpretations of unaligned words. Second, we propose probability estimates and a training procedure for weighting these rules. We contrast different approaches on real examples, show that our estimates based on multiple derivations favor phrasal re-orderings that are linguistically better motivated, and establish that our larger rules provide a 3.63 BLEU point increase over minimal rules.


More About This Work

Academic Units: Computer Science
Published In: Proceedings of COLING/ACL
Published Here: June 30, 2013