Theses Doctoral

Latent Variable Models for Events on Social Networks

Ward, Owen Gerard

Network data, particularly social network data, is widely collected from interactions between users of online platforms, but it can also be observed directly, for example in the behaviour of animals living in a group. Such data can reveal important insights into the latent structure among the nodes of a network, such as the presence of a social hierarchy or of communities. This structure is generally inferred using a latent variable model. Commonly used network models often aggregate the dynamic events which occur, reducing complex event data (such as the times of messages on a social network website) to a binary variable. Methods which incorporate the continuous-time component of these interactions therefore offer the potential to better describe the latent structure present.

Using the timestamps of observed interactions between mice, we propose a series of network point process models with latent ranks. We design these models to incorporate important theories of animal behaviour that account for dynamic patterns observed in the interaction data, including the winner effect, bursting and pair-flip phenomena. By iteratively constructing and evaluating these models, we arrive at the final cohort Markov-Modulated Hawkes process (C-MMHP), which best characterizes all of the aforementioned patterns. The generative nature of our model provides evidence for hypothesised phenomena and allows for additional insights compared to existing aggregate methods, while its probabilistic nature allows us to estimate the uncertainty in our ranking. In particular, our model provides insights into the distribution of power within the hierarchy which forms and into the strength of the established hierarchy. We compare all models using simulated and real data. Using statistically grounded diagnostics, we demonstrate that the C-MMHP model outperforms other methods, capturing relevant latent ranking structures that lead to meaningful predictions for real data.
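As a rough illustration of the self-exciting behaviour behind the bursting pattern mentioned above, the sketch below evaluates the conditional intensity of a standard univariate Hawkes process with an exponential kernel. The abstract does not specify the model's functional form, so the kernel, the function name `hawkes_intensity`, the parameters `mu`, `alpha` and `beta`, and the event times are all assumptions made for illustration. This is not the C-MMHP model itself.

```python
import math


def hawkes_intensity(t, event_times, mu=0.5, alpha=0.8, beta=1.2):
    """Conditional intensity of a univariate Hawkes process at time t.

    Assumed form (illustrative only):
        lambda(t) = mu + sum over past events s < t of alpha * exp(-beta * (t - s))
    mu is the baseline rate, alpha the jump added by each past event,
    and beta the rate at which that excitation decays.
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - s)) for s in event_times if s < t
    )
    return mu + excitation


# Hypothetical event times: a burst of three events, then one isolated event.
events = [1.0, 1.1, 1.3, 4.0]

before_any_event = hawkes_intensity(0.5, events)   # baseline only
just_after_burst = hawkes_intensity(1.4, events)   # elevated by three recent events
long_after = hawkes_intensity(10.0, events)        # excitation has mostly decayed

print(before_any_event, just_after_burst, long_after)
```

Under these assumptions, the intensity is highest immediately after the burst and decays back towards the baseline, which is the kind of temporal pattern that a binary aggregate of the same events would discard.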

While such network models can lead to important insights, fitting them poses inherent computational challenges, particularly as the number of nodes in the network grows. This is exacerbated when considering events between each pair of nodes. As such, new computational tools are required to fit network point process models to the large social networks commonly observed. We consider one such model and derive a natural online variational inference procedure for event data on networks. Using simulations, we show that this online learning procedure can accurately recover the true network structure. We demonstrate using real data that we can accurately predict future interactions by learning the network structure in this online fashion, obtaining comparable performance to more expensive batch methods.

Files

  • Ward_columbia_0054D_17341.pdf (application/pdf, 1.28 MB)

More About This Work

Academic Units: Statistics
Thesis Advisors: Zheng, Tian
Degree: Ph.D., Columbia University
Published Here: July 13, 2022