Theses Doctoral

Large Dimensional Data Analysis using Orthogonally Decomposable Tensors: Statistical Optimality and Computational Tractability

Auddy, Arnab

Modern data analysis requires the study of tensors, or multi-way arrays. We consider the case where the dimension d is large and the order p is fixed. For dimension reduction and for interpretability, one considers tensor decompositions, where a tensor T can be decomposed into a sum of rank one tensors. In this thesis, I will describe some recent work that illustrate why and how to use decompositions for orthogonally decomposable tensors. Our developments are motivated by statistical applications where the data dimension is large. The estimation procedures will therefore aim to be computationally tractable while providing error rates that depend optimally on the dimension.

A tensor is said to be orthogonally decomposable if it can be decomposed into rank one tensors whose component vectors are orthogonal. A number of data analysis tasks can be recast as the problem of estimating the component vectors from a noisy observation of an orthogonally decomposable tensor. In our first set of results, we study this decompositionproblem and derive perturbation bounds. For any two orthogonally decomposable tensors which are ε-perturbations of one another, we derive sharp upper bounds on the distances between their component vectors. While this is motivated by the extensive literature on bounds for perturbation of singular value decomposition, our work shows fundamental differences and requires new techniques. We show that tensor perturbation bounds have no dependence on eigengap, a quantity which is inevitable for matrices. Moreover, our perturbation bounds depend on the tensor spectral norm of the noise, and we provide examples to show that this leads to optimal error rates in several high dimensional statistical learning problems. Our results imply that matricizing a tensor is sub-optimal in terms of dimension dependence.

The tensor perturbation bounds derived so far are universal, in that they depend only on the spectral norm of the perturbation. In subsequent chapters, we show that one can extract further information from how a noise is generated, and thus improve over tensor perturbation bounds both statistically and computationally. We demonstrate this approach for two different problems: first, in estimating a rank one spiked tensor perturbed by independent heavy-tailed noise entries; and secondly, in performing inference from moment tensors in independent component analysis. We find that an estimator benefits immensely— both in terms of statistical accuracy and computational feasibility — from additional information about the structure of the noise. In one chapter, we consider independent noise elements, and in the next, the noise arises as a difference of sample and population fourth moments. In both cases, our estimation procedures are determined so as to avoid accumulating the errors from different sources. In a departure from the tensor perturbation bounds, we also find that the spectral norm of the error tensor does not lead to the sharpest estimation error rates in these cases. The error rates of estimating the component vectors are affected only by the noise projected in certain directions, and due to the orthogonality of the signal tensor, the projected errors do not accumulate, and can be controlled more easily.