Academic Commons

Presentations (Communicative Events)

Coerced Markov Models For Cross-Lingual Lexical-Tag Relations

Fung, Pascale; Wu, Dekai

We introduce the Coerced Markov Model (CMM) to model the relationship between the lexical sequence of a source language and the tag sequence of a target language, with the objective of constraining search in statistical transfer-based machine translation systems. CMMs differ from Hidden Markov Models in that state sequence assignments can take on values coerced from external sources. Given a Chinese sentence, a CMM can be used to predict the corresponding English tag sequence, thus constraining the English lexical sequence produced by a translation model. The CMM can also be used to score competing translation hypotheses in N-best models. Three fundamental problems for CMM design are discussed. Their solutions lead to the training and testing stages of the CMM.
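
The abstract does not give the CMM's parameterization or algorithms, so the following is only an illustrative sketch of one possible reading of "coerced" state assignments, not the authors' method. It runs standard Viterbi decoding over an HMM-style model in which selected positions are fixed to externally supplied tags. Everything here is an assumption made for illustration: the function name `constrained_viterbi`, the `coerced` argument, the placeholder source tokens `w1`..`w3`, the toy probabilities, and the simplification that each source word aligns with exactly one target tag.

```python
import math

NEG_INF = float("-inf")


def constrained_viterbi(obs, tags, log_init, log_trans, log_emit, coerced=None):
    """Return (best tag path, log score) for the observed source words.

    `coerced` maps a position index to the tag that position must take.
    This interface is hypothetical; the abstract does not say how
    externally supplied values enter the model.
    """
    coerced = coerced or {}

    def allowed(t):
        # A coerced position admits only its externally supplied tag.
        return [coerced[t]] if t in coerced else tags

    # best[t][tag]: best log score of any path ending in `tag` at position t.
    best = [{tag: log_init[tag] + log_emit[tag].get(obs[0], NEG_INF)
             for tag in allowed(0)}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for tag in allowed(t):
            # Pick the predecessor tag that maximizes the path score.
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][tag]) for p in best[t - 1]),
                key=lambda pair: pair[1],
            )
            best[t][tag] = score + log_emit[tag].get(obs[t], NEG_INF)
            back[t][tag] = prev

    # Follow back-pointers from the best final tag.
    last = max(best[-1], key=best[-1].get)
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


def logs(table):
    """Convert a dict of probabilities to log probabilities."""
    return {key: math.log(p) for key, p in table.items()}


# Toy model: two target-language tags and three placeholder source words.
TAGS = ["N", "V"]
LOG_INIT = logs({"N": 0.6, "V": 0.4})
LOG_TRANS = {"N": logs({"N": 0.3, "V": 0.7}), "V": logs({"N": 0.8, "V": 0.2})}
LOG_EMIT = {
    "N": logs({"w1": 0.5, "w2": 0.1, "w3": 0.4}),
    "V": logs({"w1": 0.1, "w2": 0.6, "w3": 0.3}),
}
SENTENCE = ["w1", "w2", "w3"]

free_path, free_score = constrained_viterbi(
    SENTENCE, TAGS, LOG_INIT, LOG_TRANS, LOG_EMIT)
forced_path, forced_score = constrained_viterbi(
    SENTENCE, TAGS, LOG_INIT, LOG_TRANS, LOG_EMIT, coerced={1: "N"})

print(free_path, math.exp(free_score))
print(forced_path, math.exp(forced_score))
```

In this sketch, coercing a position restricts the search to paths that agree with the external tag, so the constrained score can never exceed the unconstrained one. The same log score could in principle be used to rank competing hypotheses in an N-best list, though how the paper does this is not stated in the abstract.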

More About This Work

Academic Units
Computer Science
Publisher
Proceedings of the Sixth International Conference on Theoretical and Methodological Issues in Machine Translation
Published Here
April 26, 2013