2025 Theses Doctoral
Reinforcement Learning via Differentiable Simulation: Applications in Operations Management
Reinforcement learning (RL) is a promising approach for solving dynamic decision-making problems that arise in operations management applications, including scheduling, routing, supply chain management, and adaptive experimentation. However, RL algorithms have so far suffered from high sample complexity and difficulty scaling to large-scale problems.
To address these issues, this dissertation proposes a model-based approach to RL via differentiable simulation. Differentiable simulation is a technique for computing gradients of a system’s dynamics with respect to any parameters or action inputs. It has been widely used in physics and robotics, and has roots in the stochastic simulation literature of the 1980s and 1990s, in methodologies such as infinitesimal perturbation analysis (IPA) and generalized likelihood ratio estimation.
In this thesis, we propose approximations that overcome non-differentiability in the action spaces and transition functions for two critical problem domains: adaptive experimentation and queuing network control. For adaptive experimentation, we discuss how the common practice of batched treatment allocation (i.e., allocating treatments to a batch of sampling units at a time) gives rise to principled statistical approximations that enable smoothed formulations of decision-making performance (e.g., regret) that can be directly optimized via stochastic gradient descent. For queuing network control, we introduce a novel smoothing technique for discrete-event systems that enables pathwise policy gradients for updating a reinforcement learning policy. Finally, we introduce a theoretical framework for studying the convergence of stochastic gradient descent for policy optimization that can be applied to a general class of gradient estimators.
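As a rough illustration of the pathwise (IPA-style) gradients mentioned in the abstract, the sketch below differentiates the mean waiting time of a single-server queue with respect to a service-time scale parameter along one sample path. It is a textbook-style example under assumed modeling choices, not the method developed in the thesis: the function name `ipa_mean_wait`, the parameter `theta`, the Lindley recursion model, and the exponential inputs in the usage lines are all assumptions made for illustration.

```python
import random


def ipa_mean_wait(theta, unit_services, interarrivals):
    """Return (mean waiting time, d(mean waiting time)/d(theta)) for one sample path.

    Assumed model (illustrative, not taken from the thesis): a single-server
    FIFO queue whose n-th service time is theta * unit_services[n], with
    waiting times following the Lindley recursion
        W[n+1] = max(0, W[n] + S[n] - A[n]).
    """
    wait, dwait = 0.0, 0.0  # the first customer arrives to an empty system
    total_wait, total_dwait = 0.0, 0.0
    for unit_service, gap in zip(unit_services, interarrivals):
        slack = wait + theta * unit_service - gap
        if slack > 0.0:
            # Server still busy: the perturbation accumulates along the busy period.
            wait, dwait = slack, dwait + unit_service
        else:
            # Server idles: the next customer waits zero and the perturbation resets.
            wait, dwait = 0.0, 0.0
        total_wait += wait
        total_dwait += dwait
    n = len(unit_services)
    return total_wait / n, total_dwait / n


# Usage: one simulated path with hypothetical unit-rate exponential inputs.
rng = random.Random(0)
unit_services = [rng.expovariate(1.0) for _ in range(10_000)]
interarrivals = [rng.expovariate(1.0) for _ in range(10_000)]
mean_wait, grad = ipa_mean_wait(0.8, unit_services, interarrivals)
```

The derivative is propagated alongside the simulation itself, so a single path yields both the performance estimate and its gradient. The `max(0, ...)` kink is handled here by simply taking the derivative on whichever branch is active, which is the kind of non-smoothness that motivates smoothing approaches in more complex discrete-event systems.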
Files
- Che_columbia_0054D_19360.pdf (application/pdf, 3.19 MB)
More About This Work
- Academic Units: Industrial Engineering and Operations Research
- Thesis Advisors: Dong, Jing; Namkoong, Hongseok
- Degree: Ph.D., Columbia University
- Published Here: August 20, 2025