Academic Commons

Theses Doctoral

Learning with Degree-Based Subgraph Estimation

Huang, Bert

Networks and their topologies are critical to nearly every aspect of modern life, with social networks governing human interactions and computer networks governing global information-flow. Network behavior is inherently structural, and thus modeling data from networks benefits from explicitly modeling structure. This thesis covers methods for and analysis of machine learning from network data while explicitly modeling one important measure of structure: degree. Central to this work is a procedure for exact maximum likelihood estimation of a distribution over graph structure, where the distribution factorizes into edge-likelihoods for each pair of nodes and degree-likelihoods for each node. This thesis provides a novel method for exact estimation of the maximum likelihood edge structure under the distribution. The algorithm solves the optimization by constructing an augmented graph containing, in addition to the original nodes, auxiliary nodes whose edges encode the degree potentials. The exact solution is then recoverable by finding the maximum weight b-matching on the augmented graph, a well-studied combinatorial optimization. To solve the combinatorial optimization, this thesis focuses in particular on a belief propagation-based approach to finding the optimal b-matching and provides a novel proof of convergence for belief propagation on the loopy graphical model representing the b-matching objective. Additionally, this thesis describes new algorithmic techniques to improve the scalability of the b-matching solver. In addition to various applications of node degree in machine learning, including classification and collaborative filtering, this thesis proposes a learning algorithm for learning the parameters of the distribution from network data consisting of node attributes and network connectivity, using strategies similar to maximum-margin structured prediction. The main methods and results in this thesis represent a deep exploration of exact degree-based estimation for machine learning from network data, and furthermore lead to various extensions and applications of the main idea described within.

Subjects

Files

  • thumnail for Huang_columbia_0054D_10425.pdf Huang_columbia_0054D_10425.pdf application/pdf 2.31 MB Download File

More About This Work

Academic Units
Computer Science
Thesis Advisors
Jebara, Tony
Degree
Ph.D., Columbia University
Published Here
November 9, 2011
Academic Commons provides global access to research and scholarship produced at Columbia University, Barnard College, Teachers College, Union Theological Seminary and Jewish Theological Seminary. Academic Commons is managed by the Columbia University Libraries.