Doctoral Thesis

Statistical Methodologies for Decision-Making and Uncertainty Reduction in Machine Learning

Zhang, Haofeng

Stochasticity arising from data and training can introduce statistical errors into prediction and optimization models and lead to inferior decision-making. Understanding the risk associated with these models, and converting predictions into better decisions, have become increasingly prominent challenges.

This thesis studies the interaction of two fundamental topics, data-driven decision-making and machine-learning-based uncertainty reduction, developing statistically principled methodologies and providing theoretical insights.

Chapter 2 studies data-driven stochastic optimization, where model parameters of the underlying distribution must be estimated from data in addition to solving the optimization task. Several mainstream approaches have been developed for this problem, but direct statistical comparisons among them have not been well investigated in the literature. We develop a new regret-based framework built on stochastic dominance to rigorously study and compare their statistical performance.
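As a toy illustration of comparing the statistical performance of data-driven stochastic optimization approaches (this is a textbook newsvendor example, not the regret-based framework of the thesis), the sketch below estimates out-of-sample regret for two standard approaches under exponential demand: sample average approximation (the empirical quantile) and a parametric plug-in. The cost parameters `cu`, `co`, the demand mean, and the sample sizes are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
cu, co = 4.0, 1.0             # underage / overage costs (illustrative)
crit = cu / (cu + co)         # critical quantile of the newsvendor problem
mean_demand = 10.0            # true (unknown to the decision-maker) mean

def cost(q, d):
    """Newsvendor cost of ordering q against realized demand d."""
    return cu * np.maximum(d - q, 0) + co * np.maximum(q - d, 0)

# large evaluation sample to approximate out-of-sample expected cost
eval_demand = rng.exponential(mean_demand, 200_000)
oracle_q = -mean_demand * np.log(1 - crit)   # exact exponential quantile
oracle_cost = cost(oracle_q, eval_demand).mean()

def regret(q):
    return cost(q, eval_demand).mean() - oracle_cost

n, reps = 50, 200
saa_reg, plug_reg = [], []
for _ in range(reps):
    data = rng.exponential(mean_demand, n)
    saa_reg.append(regret(np.quantile(data, crit)))           # SAA: empirical quantile
    plug_reg.append(regret(-data.mean() * np.log(1 - crit)))  # parametric plug-in

saa_mean, plug_mean = float(np.mean(saa_reg)), float(np.mean(plug_reg))
```

Averaging regret over many independent training samples, as above, gives a simple Monte Carlo comparison of the two approaches' statistical performance on this instance; the thesis's stochastic-dominance framework makes such comparisons rigorous and distribution-wide.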

Chapter 3 studies uncertainty quantification and reduction techniques for neural network models. Uncertainty in neural networks arises not only from the data, but also from the training procedure, which often injects substantial noise and bias. These hinder the attainment of statistical guarantees and, moreover, impose computational challenges due to the need for repeated network retraining. Building upon recent neural tangent kernel theory, we create statistically guaranteed schemes to characterize and remove the uncertainty of over-parameterized neural networks in a principled manner with very low computational effort.

Chapter 4 studies uncertainty reduction in stochastic simulation, where standard Monte Carlo computation is widely known to exhibit a canonical square-root convergence rate in the sample size. Two recent techniques, derived from an integration of reproducing kernels and Stein's identity, have been proposed to reduce the Monte Carlo error to a supercanonical (faster-than-square-root) rate. We present a more general framework that encompasses both techniques and is especially beneficial when the sample generator is biased and noise-corrupted. We show that our general estimator, the doubly robust Stein-kernelized estimator, outperforms both existing methods in terms of mean-squared-error rates across different scenarios.
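A minimal sketch of the two phenomena mentioned above: the canonical square-root Monte Carlo rate, and how a control variate built from Stein's identity for the standard normal can reduce (here, entirely eliminate) variance. This is a textbook special case chosen for transparency, not the doubly robust Stein-kernelized estimator itself.

```python
import numpy as np

rng = np.random.default_rng(2)
f = lambda x: x**2               # target: E[f(X)] = 1 for X ~ N(0,1)

def mc_error(n, reps=1000):
    """Mean absolute error of the plain Monte Carlo estimator with n samples."""
    x = rng.normal(size=(reps, n))
    return float(np.abs(f(x).mean(axis=1) - 1.0).mean())

# canonical square-root rate: 100x more samples -> ~10x smaller error
err_small, err_large = mc_error(100), mc_error(10_000)

# Stein's identity for N(0,1): E[x*g(x) - g'(x)] = 0 for suitable g.
# Taking g(x) = x gives the mean-zero control variate x^2 - 1, which
# happens to cancel f's variance completely in this special case:
x = rng.normal(size=1000)
stein_est = float((f(x) - (x**2 - 1)).mean())   # = 1 up to rounding
```

In practice the optimal control function is unknown; the kernel-based methods the chapter builds on learn it from data, and the thesis's doubly robust construction additionally guards against a biased, noise-corrupted sample generator.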

Chapter 5 studies bandit problems, important sequential decision-making problems that aim to find optimal adaptive strategies maximizing cumulative reward. Bayesian bandit algorithms with approximate Bayesian inference are widely used to solve bandit problems in practice, but their theoretical justification is less investigated, partly due to the additional inference errors. We propose a general theoretical framework to analyze Bayesian bandits in the presence of approximate inference and establish the first regret bound for Bayesian bandit algorithms with bounded approximate inference errors.
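A hypothetical sketch of the issue: Beta-Bernoulli Thompson sampling, with approximate inference crudely modeled as additive noise on the posterior samples. The inference error inflates cumulative regret. The arm means, horizon, and noise scale are all illustrative choices, and real approximate-inference error is of course structured rather than i.i.d. noise.

```python
import numpy as np

rng = np.random.default_rng(3)
true_means = np.array([0.3, 0.6])   # two Bernoulli arms; arm 1 is optimal
T = 5000

def thompson(approx_noise=0.0):
    """Beta-Bernoulli Thompson sampling; approx_noise perturbs posterior
    samples as a crude model of approximate Bayesian inference error."""
    a, b = np.ones(2), np.ones(2)    # Beta(1,1) priors on each arm
    regret = 0.0
    for _ in range(T):
        theta = rng.beta(a, b) + approx_noise * rng.normal(size=2)
        arm = int(np.argmax(theta))               # play the sampled best arm
        reward = float(rng.random() < true_means[arm])
        a[arm] += reward                          # conjugate posterior update
        b[arm] += 1 - reward
        regret += true_means.max() - true_means[arm]
    return regret

exact_regret = thompson(0.0)
approx_regret = thompson(0.3)   # persistent inference error inflates regret
```

With exact posteriors the suboptimal arm is eventually abandoned and regret grows slowly; with a persistent sampling perturbation the arms keep being confused, which is the kind of degradation the chapter's regret bounds quantify as a function of the inference error.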

Files

  • Zhang_columbia_0054D_18854.pdf (application/pdf, 3.81 MB)

More About This Work

Academic Units
Industrial Engineering and Operations Research
Thesis Advisors
Lam, Kwai Hung Henry
Elmachtoub, Adam N.
Degree
Ph.D., Columbia University
Published Here
October 23, 2024