Theses Doctoral

Sequential Decision Making with Combinatorial Actions and High-Dimensional Contexts

Oh, Min-hwan

In interactive sequential decision-making systems, the learning agent needs to react to new information both in the short term and in the long term, and learn to generalize through repeated interactions with the environment. Unlike in offline learning environments, the new data that arrives is typically a function of previous actions taken by the agent. One of the key challenges is to efficiently use and generalize from data that may never reappear. Furthermore, in many real-world applications, the agent only receives partial feedback on the decisions it makes. This necessitates a balanced exploration-exploitation approach, where the agent needs to both efficiently collect relevant information in order to prepare for future arrivals of feedback, and produce the desired outcome in the current periods by exploiting the already collected information. In this thesis, we focus on two classes of fundamental sequential learning problems:

Contextual bandits with combinatorial actions and user choice (Chapter 2 and Chapter 3):
We investigate the dynamic assortment selection problem by combining statistical estimation of choice models and generalization using contextual information. For this problem, we design and analyze both UCB and Thomson sampling algorithms with rigorous performance guarantees and tractability.

High-dimensional contextual bandits (Chapter 4):
We investigate policies that can efficiently exploit the structure in high-dimensional data, e.g., sparsity. We design and analyze an efficient sparse contextual bandit algorithm that does not require to know the sparsity of the underlying parameter -- information that essentially all existing sparse bandit algorithms to date require.


  • thumnail for Oh_columbia_0054D_16111.pdf Oh_columbia_0054D_16111.pdf application/pdf 2.02 MB Download File

More About This Work

Academic Units
Industrial Engineering and Operations Research
Thesis Advisors
Iyengar, Garud N.
Zeevi, Assaf J.
Ph.D., Columbia University
Published Here
August 3, 2020