Theses Doctoral

Graph Embedding and Nonlinear Dimensionality Reduction

Shaw, Blake

Traditionally, spectral methods such as principal component analysis (PCA) have been applied to many graph embedding and dimensionality reduction tasks. These methods aim to find low-dimensional representations of data that preserve its inherent structure. However, they often perform poorly when applied to data that does not lie on or near a linear manifold. In this thesis, I present a set of novel graph embedding algorithms that extend spectral methods, allowing graph representations of high-dimensional data or networks to be accurately embedded in a low-dimensional space. I first propose minimum volume embedding (MVE), which, like other leading dimensionality reduction algorithms, encodes the high-dimensional data as a nearest-neighbor graph, where the edge weights between neighbors correspond to kernel values between points, and then embeds this graph in a low-dimensional space. Next, I present structure preserving embedding (SPE), an algorithm for embedding unweighted graphs where similarity between nodes is not known. SPE finds low-dimensional embeddings that explicitly preserve graph topology, meaning that a connectivity algorithm, such as k-nearest neighbors, will recover the edges of the input graph from only the coordinates of the nodes after embedding. I further explore preserving graph structure during embedding, and find the concept applicable to dimensionality reduction, large-scale network visualization, and metric learning for link prediction. This thesis posits that simply preserving pairwise distances, as many spectral methods do, is insufficient for capturing the structure of many datasets, and that preserving both local distances and graph topology is crucial for producing accurate low-dimensional representations of networks and high-dimensional data.
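
The structure-preservation criterion described in the abstract (a connectivity algorithm such as k-nearest neighbors should recover the input graph's edges from the embedded coordinates alone) can be illustrated with a minimal sketch. This is not an implementation of MVE or SPE. It only checks the criterion for a plain PCA embedding of toy data that lies on a line in 3-D, where a linear method is expected to succeed. The function names `knn_adjacency` and `adjacency_agreement`, the toy data, and the choice of k = 2 are assumptions made for illustration.

```python
import numpy as np

def knn_adjacency(points, k):
    """Boolean matrix A where A[i, j] is True if j is one of i's k nearest neighbors."""
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros(dist.shape, dtype=bool)
    rows = np.repeat(np.arange(len(points)), k)
    adj[rows, nearest.ravel()] = True
    return adj

def adjacency_agreement(adj_input, embedding, k):
    """Fraction of adjacency entries that k-NN on the embedding reproduces."""
    adj_recovered = knn_adjacency(embedding, k)
    return float((adj_input == adj_recovered).mean())

# Toy data (assumed for illustration): unevenly spaced points on a line in 3-D.
t = np.array([0.0, 1.0, 2.5, 4.5, 7.0, 10.0])
direction = np.array([1.0, 2.0, 2.0]) / 3.0  # unit vector
points = t[:, None] * direction[None, :]

k = 2
adj_input = knn_adjacency(points, k)

# PCA to one dimension via SVD of the centered data.
centered = points - points.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
embedding = centered @ vt[:1].T

# For contrast: assign the same 1-D coordinates to the wrong nodes.
scrambled = embedding[[3, 0, 5, 1, 4, 2]]

rate_pca = adjacency_agreement(adj_input, embedding, k)
rate_scrambled = adjacency_agreement(adj_input, scrambled, k)
print("PCA embedding agreement:      ", rate_pca)
print("Scrambled embedding agreement:", rate_scrambled)
```

Because the toy data lies exactly on a linear manifold, the PCA embedding reproduces the k-nearest-neighbor graph, while the scrambled coordinates do not. The abstract's point is that this kind of agreement is not guaranteed for data away from a linear manifold, which is the case the thesis's algorithms target.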



  • Shaw_columbia_0054D_10420.pdf (application/pdf, 42.4 MB)

More About This Work

Academic Units
Computer Science
Thesis Advisors
Jebara, Tony
Degree
Ph.D., Columbia University
Published Here
November 9, 2011