2015 Doctoral Thesis
A Family of Latent Variable Convex Relaxations for IBM Model 2
Introduced in 1993, the IBM translation models were the first generation of Statistical Machine Translation systems. Among the IBM Models, only IBM Model 1 is a convex optimization problem, meaning that we can initialize all of its probabilistic parameters to uniform values and still converge to a good solution via Expectation Maximization (EM). In this thesis we describe a mechanism for generating an infinite supply of nontrivial convex relaxations of IBM Model 2, and detail an Exponentiated Subgradient algorithm for solving them. We also detail several interesting relaxations that admit an easy EM algorithm requiring no tuning of a learning rate. Because this last set of convex models is based on the geometric mean of two variables, it can be integrated seamlessly into the open-source GIZA++ word-alignment library. Finally, we show further applications of the method, including a more powerful strictly convex IBM Model 1 and a convex HMM surrogate that improves on the performance of the earlier convex IBM Model 2 variants.
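The convexity argument behind the geometric-mean relaxation is compact enough to sketch. The following is a minimal LaTeX rendering of the idea the abstract alludes to; the notation (t for lexical translation probabilities, d for distortion probabilities, n sentence pairs of lengths l_k and m_k) is standard for IBM Model 2 but is an assumption here, not a quotation of the thesis's exact objective family.

```latex
% Sketch of the geometric-mean relaxation (notation assumed).
% IBM Model 2 maximizes the non-convex log-likelihood
\[
L(t,d) \;=\; \sum_{k=1}^{n} \sum_{i=1}^{m_k} \log
\sum_{j=0}^{l_k} t\bigl(f_i^{(k)} \mid e_j^{(k)}\bigr)\,
d(j \mid i, l_k, m_k),
\]
% which is non-convex because the product td is neither convex nor
% concave in (t,d) jointly. Replacing each product with its
% geometric mean,
\[
t\bigl(f_i^{(k)} \mid e_j^{(k)}\bigr)\, d(j \mid i, l_k, m_k)
\;\longrightarrow\;
\sqrt{t\bigl(f_i^{(k)} \mid e_j^{(k)}\bigr)\, d(j \mid i, l_k, m_k)},
\]
% yields a concave surrogate: the geometric mean is jointly concave
% on the nonnegative orthant, a sum of nonnegative concave functions
% is concave, and the logarithm of such a sum is again concave.
% Maximizing the relaxed objective over the probability simplices
% for t and d is therefore a convex optimization problem.
```

This sketch covers only the single geometric-mean case; the thesis describes a whole family of such relaxations and the algorithms (Exponentiated Subgradient, EM-style updates) for solving them.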
Files
- Simion_columbia_0054D_12719.pdf (application/pdf, 849 KB)
More About This Work
- Academic Units: Industrial Engineering and Operations Research
- Thesis Advisors: Collins, Michael; Stein, Cliff
- Degree: Ph.D., Columbia University
- Published Here: May 12, 2015