2020 Theses Doctoral
Unsupervised Representation Learning with Correlations
Unsupervised representation learning algorithms play important roles in machine learning and related fields. However, due to optimization intractability or a failure to account for the correlation structure of the given data, some of these algorithms still cannot fully discover the inherent features of the data under certain circumstances. This thesis extends such algorithms and addresses these issues by taking data correlations into consideration.
We study three different ways of improving unsupervised representation learning algorithms by utilizing correlation information, via the following three tasks:
1. Using estimated correlations between data points to provide informed optimization initializations for multi-way matching (Chapter 2). In this work, we define a correlation score between pairs of data points as a correlation metric, and initialize all the permutation matrices along a maximum spanning tree of the undirected graph whose edge weights are these scores.
2. Faster optimization by exploiting correlations in the observations, for variational inference (Chapter 3). We construct a positive definite matrix from the negative Hessian of the log-likelihood part of the objective, which captures the influence of observation correlations on the parameter vector, and rescale the gradient by the inverse of this matrix.
3. Utilizing additional side information about data correlation structure to explicitly learn correlations between data points, for extensions of Variational Auto-Encoders (VAEs) (Chapters 4 and 5). Suppose we are given a correlation graph G over the data points. Instead of placing an i.i.d. prior on the latent variables, as in the most common setting, we adopt correlated priors and/or correlated variational distributions constructed from the graph G.
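The initialization in item 1 rests on a standard graph trick: a maximum spanning tree of the correlation graph is a minimum spanning tree of the same graph with negated weights. The sketch below illustrates only that reduction; the correlation scores in the toy matrix are made up, and the thesis's actual score definition and permutation-initialization procedure are not reproduced here.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def max_spanning_tree_edges(score):
    """Return the edges of a maximum spanning tree of an undirected graph
    given by a symmetric matrix of pairwise correlation scores
    (higher score = stronger correlation; zero = no edge)."""
    # A maximum spanning tree is a minimum spanning tree on negated weights.
    tree = minimum_spanning_tree(-score)
    rows, cols = tree.nonzero()
    return list(zip(rows, cols))

# Toy example: 4 data points with hypothetical pairwise correlation scores.
score = np.array([
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.8, 0.3],
    [0.1, 0.8, 0.0, 0.7],
    [0.2, 0.3, 0.7, 0.0],
])
edges = max_spanning_tree_edges(score)
# The tree keeps the three strongest links: (0,1), (1,2), (2,3).
```

In the method of Chapter 2, the permutation matrices would then be initialized by composing pairwise matchings along these tree edges, so that the most reliable (highest-correlation) pairs anchor the initialization.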
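The rescaling in item 2 is a preconditioned-gradient step: solve a linear system with the positive definite matrix rather than multiply by an explicit inverse. This is a minimal sketch of that generic pattern; the damping term and the toy numbers are hypothetical, and the thesis's actual construction of the matrix from the observation correlations is not shown.

```python
import numpy as np

def rescaled_gradient(grad, neg_hessian, damping=1e-3):
    """Rescale a gradient by the inverse of a positive definite matrix
    built from the negative Hessian of the log-likelihood term.
    The damping term (a hypothetical choice) guarantees invertibility."""
    d = grad.shape[0]
    precond = neg_hessian + damping * np.eye(d)
    # Solving the linear system is cheaper and more stable than inverting.
    return np.linalg.solve(precond, grad)

# Toy example: a diagonal negative Hessian simply rescales each coordinate.
grad = np.array([1.0, -2.0])
neg_hess = np.array([[4.0, 0.0],
                     [0.0, 1.0]])
step = rescaled_gradient(grad, neg_hess, damping=0.0)
# step = [0.25, -2.0]: steep directions are damped, flat ones kept.
```

The effect is Newton-like: directions in which the log-likelihood is sharply curved take smaller steps, which is what speeds up the optimization when observations are correlated.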
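One common way to turn a correlation graph G into a correlated Gaussian prior, in the spirit of item 3, is to use a graph-Laplacian precision matrix (a Gaussian Markov random field construction). This is an illustrative assumption, not necessarily the prior used in Chapters 4 and 5; the `tau` parameter is a hypothetical regularizer.

```python
import numpy as np

def correlated_prior_cov(adj, tau=1.0):
    """Build the covariance of a correlated Gaussian prior from a graph's
    adjacency matrix: precision = tau*I + Laplacian(G). Under this prior,
    latent variables of linked data points are positively correlated."""
    deg = np.diag(adj.sum(axis=1))
    laplacian = deg - adj
    precision = tau * np.eye(adj.shape[0]) + laplacian
    return np.linalg.inv(precision)

# Toy example: a chain graph 0 - 1 - 2 over three data points.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
cov = correlated_prior_cov(adj)
# Neighbors (0,1) end up more correlated than non-neighbors (0,2).
```

With `tau -> infinity` the prior degenerates to the usual i.i.d. Gaussian, so this family smoothly interpolates between the standard VAE prior and a strongly graph-coupled one.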
Empirical results on these tasks show the success of the proposed methods in improving the performance of unsupervised representation learning algorithms. We compare our methods with several recent algorithms on various tasks, on both synthetic and real datasets. We also provide theoretical analysis for some of the proposed methods, showing their advantages in certain situations.
The proposed methods have a wide range of applications, for example image compression (via informed initializations for multi-way matching) and link prediction (via VAEs with correlations).
More About This Work
- Academic Units: Computer Science
- Thesis Advisors: Jebara, Tony
- Degree: Ph.D., Columbia University
- Published Here: February 7, 2020