Academic Commons

Theses Doctoral

Essays on the use of probabilistic machine learning for estimating customer preferences with limited information

Padilla, Nicolas

In this thesis, I explore in two essays how to augment thin historical purchase data with other sources of information, using Bayesian and probabilistic machine learning frameworks to better infer customers' preferences and their future behavior. In the first essay, I posit that firms can better manage recently acquired customers by using the information from acquisition to inform future demand preferences for those customers. I develop a probabilistic machine learning model based on Deep Exponential Families to relate multiple acquisition characteristics to individual-level demand parameters, and I show that the model can flexibly capture nonlinear relationships between acquisition behaviors and demand parameters. I estimate the model using data from a retail context and show that firms can better identify which new customers are the most valuable.

In the second essay, I explore how to combine the information collected through the customer journey (search queries, clicks, and purchases, both within and across journeys) to infer the customer’s preferences and likelihood of buying, in settings where purchase history is thin and preferences might change from one purchase journey to another.

I propose a non-parametric Bayesian model that combines these different sources of information and accounts for what I call context heterogeneity: journey-specific preferences that depend on the context of the specific journey. I apply the model to airline ticket purchases using data from one of the largest travel search websites and show that the model is able to accurately infer preferences and predict choice in an environment characterized by very thin historical data. I find strong context heterogeneity across journeys, reinforcing the idea that treating all journeys as stemming from the same set of preferences may lead to erroneous inferences.



More About This Work

Academic Units
Thesis Advisors: Netzer, Oded; Ascarza, Eva
Degree: Ph.D., Columbia University
Published Here: April 19, 2021