Koopman Constrained Policy Optimization: A Koopman operator theoretic method for differentiable optimal control in robotics
Deep reinforcement learning has recently achieved state-of-the-art results for robotic control. Robots are now beginning to operate in unknown and highly nonlinear environments, expanding their usefulness for everyday tasks. Classical control theory is not suited to these unknown, nonlinear environments, but it retains an immense advantage over traditional deep reinforcement learning: guaranteed satisfaction of hard constraints, which is critical for the performance and safety of robots. This thesis introduces Koopman Constrained Policy Optimization (KCPO), which combines implicitly differentiable model predictive control with a deep Koopman autoencoder. KCPO brings new optimality guarantees to robot learning in unknown and nonlinear dynamical systems. KCPO is demonstrated on Simple Pendulum and Cartpole, environments with continuous state and action spaces and unknown dynamics, where it trains policies end-to-end under hard box constraints on controls. Compared to several baseline methods, KCPO exhibits superior generalization to constraints that were not part of its training.
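For readers unfamiliar with the idea, the sketch below illustrates the deep Koopman autoencoder ingredient in PyTorch. It is a minimal, assumed implementation (the layer sizes, loss terms, and equal loss weighting are illustrative choices, not the architecture used in the thesis). The key property is that a nonlinear encoder lifts the state into a latent space whose dynamics are approximately linear in the latent state and control; with such a linear latent model, a model predictive control problem with box constraints on the controls becomes a quadratic program that can be solved and differentiated through.

```python
# Minimal sketch of a deep Koopman autoencoder with linear latent dynamics.
# Illustrative only: layer widths, losses, and weighting are assumptions,
# not the thesis's actual KCPO implementation.
import torch
import torch.nn as nn


class KoopmanAutoencoder(nn.Module):
    def __init__(self, state_dim: int, control_dim: int, latent_dim: int):
        super().__init__()
        # Nonlinear encoder/decoder; linear (Koopman-style) dynamics in latent space.
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim)
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, state_dim)
        )
        # z_{t+1} = A z_t + B u_t : the linearity is what lets a downstream MPC
        # with box constraints on u be posed as a quadratic program.
        self.A = nn.Linear(latent_dim, latent_dim, bias=False)
        self.B = nn.Linear(control_dim, latent_dim, bias=False)

    def forward(self, x_t, u_t):
        z_t = self.encoder(x_t)
        z_next_pred = self.A(z_t) + self.B(u_t)
        return self.decoder(z_t), self.decoder(z_next_pred), z_t, z_next_pred


def koopman_loss(model, x_t, u_t, x_next):
    """Reconstruction + one-step state prediction + latent linearity losses."""
    x_t_recon, x_next_pred, _, z_next_pred = model(x_t, u_t)
    recon = nn.functional.mse_loss(x_t_recon, x_t)
    pred = nn.functional.mse_loss(x_next_pred, x_next)
    latent = nn.functional.mse_loss(z_next_pred, model.encoder(x_next))
    return recon + pred + latent


if __name__ == "__main__":
    # Toy usage on random transitions (Cartpole-like dimensions assumed).
    model = KoopmanAutoencoder(state_dim=4, control_dim=1, latent_dim=8)
    x_t, u_t, x_next = torch.randn(32, 4), torch.randn(32, 1), torch.randn(32, 4)
    print(koopman_loss(model, x_t, u_t, x_next).item())
```

In KCPO as described in the abstract, a model of this kind would supply the dynamics for an implicitly differentiable MPC layer, allowing gradients to flow through the constrained control problem back into the encoder and latent dynamics during end-to-end policy training.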
Keywords: Koopman autoencoder, Koopman operator theory, policy learning, representation learning, imitation learning, constrained policy optimization
Files
- Retchin_KCPO_Thesis.pdf (application/pdf, 1.18 MB)
More About This Work
- Academic Units: Computer Science
- Thesis Advisors: Song, Shuran
- Degree: M.S., Columbia University
- Published Here: May 17, 2023