Doctoral Thesis

A Family of Latent Variable Convex Relaxations for IBM Model 2

Simion, Andrei

Introduced in 1993, the IBM translation models were the first generation of statistical machine translation systems. Among them, only IBM Model 1 is a convex optimization problem, meaning that we can initialize all of its probabilistic parameters to uniform values and still converge to a good solution via Expectation Maximization (EM). In this thesis we present a mechanism for generating an infinite supply of nontrivial convex relaxations of IBM Model 2, and we detail an Exponentiated Subgradient algorithm to solve them. We also describe several interesting relaxations that admit an easy EM algorithm requiring no tuning of a learning rate. Based on the geometric mean of two variables, this last set of convex models can be integrated seamlessly into the open-source GIZA++ word-alignment library. Finally, we show further applications of the method, including a more powerful strictly convex IBM Model 1 and a convex HMM surrogate that improves on the performance of the previous convex IBM Model 2 variants.
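To make the abstract's central point concrete, the sketch below shows EM training for IBM Model 1: because the Model 1 objective is convex, uniform initialization of the translation probabilities t(f|e) suffices. This is a minimal illustration, not code from the thesis; the function name `ibm_model1_em` and the toy corpus are hypothetical.

```python
from collections import defaultdict

def ibm_model1_em(corpus, iterations=10):
    """EM training of IBM Model 1 translation probabilities t(f|e).

    corpus: list of (source_sentence, target_sentence) word-list pairs.
    Since Model 1's objective is convex, uniform initialization
    converges to a good solution, as the abstract notes.
    """
    tgt_vocab = {f for _, fs in corpus for f in fs}
    # t[(f, e)] stores t(f|e), initialized uniformly
    t = defaultdict(lambda: 1.0 / len(tgt_vocab))
    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f, e)
        total = defaultdict(float)   # expected counts c(e)
        for es, fs in corpus:
            for f in fs:
                # E-step: posterior probability that f aligns to each e
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    delta = t[(f, e)] / z
                    count[(f, e)] += delta
                    total[e] += delta
        # M-step: renormalize the expected counts
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return dict(t)
```

IBM Model 2 adds alignment (distortion) parameters whose product with t(f|e) makes the objective non-convex; the relaxations studied in the thesis recover convexity, for instance via the geometric mean of the two variables.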

Files

  • Simion_columbia_0054D_12719.pdf (application/pdf, 849 KB)

More About This Work

Academic Units
Industrial Engineering and Operations Research
Thesis Advisors
Collins, Michael
Stein, Cliff
Degree
Ph.D., Columbia University
Published Here
May 12, 2015