2022 Doctoral Thesis
On Recovering the Best Rank-r Approximation from Few Entries
In this thesis, we investigate how well we can reconstruct the best rank-r approximation of a large matrix from a small number of its entries. We show that even if a data matrix is of full rank and cannot be approximated well by a low-rank matrix, its best low-rank approximations may still be reliably computed or estimated from a small number of its entries. This is especially relevant from a statistical viewpoint: the best low-rank approximations to a data matrix are often of more interest than the matrix itself, because they capture the more stable and oftentimes more reproducible properties of an otherwise complicated data-generating model. In particular, we investigate two agnostic approaches: the first is based on spectral truncation, and the second is an optimization procedure based on projected gradient descent; a sketch of each idea appears below.
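To make the first approach concrete, here is a minimal sketch of the spectral-truncation idea, assuming entries are observed uniformly at random and unobserved entries are filled with zeros. The function name `spectral_truncation`, the plug-in estimate of the sampling rate, and the demo dimensions are illustrative assumptions, not the estimator analyzed in the thesis.

```python
import numpy as np

def spectral_truncation(M_obs, mask, r):
    """Sketch: rescale the zero-filled observed matrix by the inverse
    sampling rate, then keep the top r singular triplets."""
    p_hat = mask.mean()                 # plug-in estimate of the sampling rate
    Y = M_obs / p_hat                   # inverse-probability rescaling: E[Y] = M
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]  # rank-r truncated SVD of Y

# Tiny demo: estimate the best rank-5 approximation from ~30% of the entries.
rng = np.random.default_rng(0)
M = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 60))
mask = rng.random(M.shape) < 0.3
M_hat = spectral_truncation(M * mask, mask, r=5)
```

Under uniform sampling the rescaled matrix is an unbiased estimate of M, so its top-r SVD is a natural surrogate for the best rank-r approximation.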
We argue that, while the first approach is intuitive and reasonably effective, the second has far superior performance in general. We show that the error depends on how close the matrix is to being of low rank. Our results generalize to spectral and entrywise error bounds and provide flexible tools for the error analysis of follow-up computations. Moreover, we derive a high-order decomposition of the error. With an explicit expression for the main error source, we obtain an improved estimate of the linear form. Both theoretical and numerical evidence is presented to demonstrate the effectiveness of the proposed approaches.
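The second approach can likewise be sketched as projected gradient descent over rank-r matrices: alternate a gradient step on the squared loss over observed entries with projection back onto the rank-r set via truncated SVD. This is a hedged sketch under the same uniform-sampling assumption; the helper names `truncate_rank` and `pgd_low_rank`, the step size, the iteration count, and the spectral initialization are illustrative choices rather than the thesis's exact scheme.

```python
import numpy as np

def truncate_rank(X, r):
    """Project X onto matrices of rank at most r via truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def pgd_low_rank(M_obs, mask, r, n_iter=200):
    """Sketch of projected gradient descent on
    f(X) = 0.5 * ||mask * (X - M_obs)||_F^2 over rank-r matrices."""
    p_hat = mask.mean()                       # plug-in sampling-rate estimate
    X = truncate_rank(M_obs / p_hat, r)       # spectral initialization
    eta = 1.0 / p_hat                         # heuristic step size
    for _ in range(n_iter):
        grad = mask * (X - M_obs)             # gradient of f at X
        X = truncate_rank(X - eta * grad, r)  # gradient step, then projection
    return X
```

Each iteration only touches the observed entries in the gradient, while the SVD projection keeps the iterate on the rank-r set, which is the sense in which the procedure is "projected" gradient descent.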
Files
- Xu_columbia_0054D_17285.pdf (application/pdf, 1.01 MB)
More About This Work
- Academic Units: Statistics
- Thesis Advisors: Yuan, Ming
- Degree: Ph.D., Columbia University
- Published Here: June 22, 2022