Academic Commons

Theses Doctoral

A Family of Latent Variable Convex Relaxations for IBM Model 2

Simion, Andrei

Introduced in 1993, the IBM translation models were the first generation of Statistical Machine Translation systems. Among the IBM Models, only IBM Model 1 is a convex optimization problem, meaning that we can initialize all its probabilistic parameters to uniform values and subsequently converge to a good solution via Expectation Maximization (EM). In this thesis we discuss a mechanism to generate an infinite supply of nontrivial convex relaxations for IBM Model 2 and detail an Exponentiated Subgradient algorithm to solve them. We also detail some interesting relaxations that admit an easy EM algorithm that does not require the tuning of a learning rate. Based on the geometric mean of two variables, this last set of convex models can be seamlessly integrated into the open-source GIZA++ word-alignment library. Finally, we show other applications of the method, including a more powerful strictly convex IBM Model 1 and a convex HMM surrogate that improves on the performance of the previous convex IBM Model 2 variants.



More About This Work

Academic Units
Industrial Engineering and Operations Research
Thesis Advisors
Collins, Michael
Stein, Cliff
Degree
Ph.D., Columbia University
Published Here
May 12, 2015