2020 Theses Doctoral
Training Decision Trees for Optimal Decision-Making
Many analytics problems in Operations Research and the Management Sciences can be framed as decision-making problems containing uncertain input parameters to be estimated from data. For example, inventory optimization problems often require forecasts of future demand, and product recommendation systems (e.g., movies, sporting goods) depend on models for predicting customer responses to the feasible recommendations. Therefore, a question central to many analytics problems is how to optimally build models from data which estimate the uncertain inputs for the decision problems of interest. We argue that most common approaches for this task either (a) focus on the wrong objectives in training the models for the decision problem, or (b) focus on the right objectives but only study how to do so with prohibitively simple machine learning models (e.g. linear and logistic regression).
In this work, we study how to train decision tree models for predicting uncertain parameters for analytical decision-making problems. Unlike other machine learning models such as linear and logistic regression, decision trees are both nonparameteric and interpretable, allowing them the capability of modeling highly complex relationships between data and predictions while also being easily visualized and interpreted. We propose tractable algorithms for decision tree training in the context of three problem domains relevant to Operations Research. First, we study how to train decision trees for delivering real-time personalized recommendations of products in settings where little prior data is available for training purposes. This problem is known in the literature as the contextual bandit problem and requires careful navigation of the so-called "exploration-exploitation trade-off" in utilizing the decision tree models. Second, we propose a new framework which we call Market Segmentation Trees (MSTs) for training decision tree models for the purposes of market segmentation and personalization. We explore several applications of MSTs relevant to personalized advertising, including recommending hotels to Expedia users as a function of their search queries and segmenting ad auctions according to the distribution of bids that they receive. Finally, we propose a general framework for training decision tree models for uncertain optimization problems which we call "SPO Trees" (SPOTs). In contrast to the typical objective of maximizing predictive accuracy, the SPOT framework trains decision trees to maximize the quality of the solutions found in the uncertain optimization problem, therefore yielding better decisions in several analytics problems of interest.
- McNellis_columbia_0054D_15980.pdf application/pdf 5.94 MB Download File
More About This Work
- Academic Units
- Industrial Engineering and Operations Research
- Thesis Advisors
- Elmachtoub, Adam N.
- Ph.D., Columbia University
- Published Here
- August 4, 2020