2024 Theses Doctoral
Federated Learning for Reinforcement Learning and Control
Federated learning (FL), a novel distributed learning paradigm, has attracted significant attention in the past few years. Federated algorithms adopt a client/server computation model and make it possible to train large-scale machine learning models over an edge-based distributed computing architecture. In the FL paradigm, models are trained collaboratively under the coordination of a central server while the data remain stored locally on the edge devices/clients. This thesis addresses critical challenges in FL, focusing on supervised learning, reinforcement learning (RL), control systems, and personalized system identification. By developing robust, efficient algorithms, our research enhances FL's applicability across diverse, real-world environments characterized by data heterogeneity and communication constraints.
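To make the client/server computation model concrete, the following is a minimal, hypothetical FedAvg-style sketch of an FL round with heterogeneous client data and partial participation; all names (local_sgd, num_clients, the linear model) are illustrative and are not taken from the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

def local_sgd(w, X, y, lr=0.1, steps=5):
    """Run a few local gradient steps on a linear least-squares objective."""
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

# Heterogeneous clients: each draws data from a slightly different linear model.
num_clients, dim = 20, 5
clients = []
for _ in range(num_clients):
    X = rng.normal(size=(50, dim))
    w_star = np.ones(dim) + 0.3 * rng.normal(size=dim)  # client-specific optimum
    y = X @ w_star + 0.1 * rng.normal(size=50)
    clients.append((X, y))

w_global = np.zeros(dim)
for _ in range(100):
    # Partial participation: the server samples a subset of clients each round.
    chosen = rng.choice(num_clients, size=5, replace=False)
    local_models = [local_sgd(w_global.copy(), *clients[i]) for i in chosen]
    w_global = np.mean(local_models, axis=0)  # server aggregates by averaging
```

Each round, the server broadcasts the current model, the sampled clients take a few gradient steps on their private data, and the server averages the returned models; the data never leave the clients.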
In the first part, we introduce an algorithm for supervised FL to address the challenges posed by heterogeneous client data, ensuring stable convergence and effective learning, even with partial client participation. In the federated reinforcement learning (FRL) part, we develop algorithms that leverage similarities across heterogeneous environments to improve sample efficiency and accelerate policy learning. Our setup involves N agents interacting with environments that share the same state and action space but differ in their reward functions and state transition kernels. Through rigorous theoretical analysis, we show that information exchange via FL can expedite both policy evaluation and optimization in decentralized, multi-agent settings, enabling faster, more efficient, and robust learning.
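As a rough illustration of federated policy evaluation across heterogeneous environments, here is a hedged sketch of federated TD(0) with linear function approximation: N agents run local TD updates in their own Markov chains, and a server periodically averages their value-function weights. The random environments, feature map, and step sizes are invented for illustration and do not come from the thesis:

```python
import numpy as np

rng = np.random.default_rng(1)
N, S, d, gamma = 10, 8, 4, 0.9
phi = rng.normal(size=(S, d))  # shared feature map over S states

def make_env(seed):
    """Random Markov chain with agent-specific transitions and rewards."""
    g = np.random.default_rng(seed)
    P = g.dirichlet(np.ones(S), size=S)  # row-stochastic transition kernel
    r = g.normal(size=S)                 # agent-specific reward function
    return P, r

envs = [make_env(s) for s in range(N)]
w = np.zeros(d)      # global value-function weights
states = [0] * N     # each agent's current state

for _ in range(200):
    local = []
    for i, (P, r) in enumerate(envs):
        wi, s = w.copy(), states[i]
        for _ in range(10):  # local TD(0) updates between communications
            s_next = rng.choice(S, p=P[s])
            td_err = r[s] + gamma * phi[s_next] @ wi - phi[s] @ wi
            wi += 0.05 * td_err * phi[s]
            s = s_next
        states[i] = s
        local.append(wi)
    w = np.mean(local, axis=0)  # server averages the agents' weights
```

The point of the averaging step is that, when the environments are similar, each agent effectively benefits from the transitions observed by all N agents, which is the source of the sample-efficiency gains analyzed in the thesis.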
Extending FL into control systems, we propose the FedLQR algorithm, which enables agents with unknown but similar dynamics to collaboratively learn stabilizing policies, addressing the unique demands of closed-loop stability in federated control. Our method overcomes numerous technical challenges, such as heterogeneity in the agents' dynamics, multiple local updates, and stability concerns. We show that our proposed algorithm FedLQR produces a common policy that, at each iteration, is stabilizing for all agents. We provide bounds on the distance between the common policy and each agent's local optimal policy. Furthermore, we prove that when learning each agent's optimal policy, FedLQR achieves a sample complexity reduction proportional to the number of agents M in a low-heterogeneity regime, compared to the single-agent setting.
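The following scalar sketch conveys the FedLQR idea under strong simplifying assumptions: each agent i has dynamics x_{t+1} = a_i x_t + b_i u_t, takes local policy-gradient steps on its own LQR cost with the linear policy u = -k x, and the server averages the gains. The actual FedLQR algorithm handles matrix dynamics and proves that the averaged gain remains stabilizing; here we simply start from a gain that stabilizes every agent and check stability numerically:

```python
import numpy as np

rng = np.random.default_rng(2)
M, q, r = 8, 1.0, 1.0
# Heterogeneous agents: small perturbations around nominal scalar dynamics.
a = 1.2 + 0.05 * rng.normal(size=M)
b = 1.0 + 0.05 * rng.normal(size=M)

def grad(k, a_i, b_i):
    """Exact gradient of the scalar LQR cost (q + r k^2) / (1 - c^2), c = a - b k."""
    c = a_i - b_i * k
    g = 1 - c**2
    return (2*r*k*g - (q + r*k**2) * 2*c*b_i) / g**2

k = 1.0  # initial gain: |a_i - b_i * k| < 1 for all agents, hence stabilizing
for _ in range(50):
    local = []
    for i in range(M):
        ki = k
        for _ in range(5):       # multiple local policy-gradient updates
            ki -= 0.05 * grad(ki, a[i], b[i])
        local.append(ki)
    k = float(np.mean(local))    # server averages the local gains
    # In the low-heterogeneity regime the averaged gain stays stabilizing.
    assert all(abs(a[i] - b[i]*k) < 1 for i in range(M))
```

The assertion mirrors the property the thesis establishes formally: the common policy produced at each iteration remains stabilizing for every agent, which is what makes averaging safe in a closed-loop setting.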
In the last part, we explore techniques for personalized system identification in FL, allowing clients to obtain customized models suited to their individual environments. We consider the problem of learning linear system models by observing multiple trajectories from systems with differing dynamics. This framework encompasses a collaborative scenario where several systems seeking to estimate their dynamics are partitioned into clusters according to system similarity, so that systems within the same cluster can benefit from the observations made by the others. In this framework, we present an algorithm in which each system alternately estimates its cluster identity and estimates its own dynamics; these local estimates are then aggregated to update the model of each cluster. We show that, under mild assumptions, our algorithm correctly estimates the cluster identities and achieves an ε-approximate solution with a sample complexity that scales inversely with the number of systems in the cluster, thus facilitating more efficient and personalized system identification.
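Below is a hedged sketch of the alternating scheme just described, with invented problem sizes and cluster models: each system estimates its dynamics by least squares from its own trajectory, assigns itself to the cluster whose model best explains its data, and the cluster models are refreshed by averaging the member estimates:

```python
import numpy as np

rng = np.random.default_rng(3)
n, T, K, per_cluster = 3, 60, 2, 5
A_true = [np.diag([0.5, 0.6, 0.7]), np.diag([-0.4, 0.3, 0.2])]  # cluster models

# Simulate one trajectory per system: x_{t+1} = A x_t + w_t.
trajs = []
for k in range(K):
    for _ in range(per_cluster):
        X = np.zeros((T + 1, n))
        X[0] = rng.normal(size=n)
        for t in range(T):
            X[t + 1] = A_true[k] @ X[t] + 0.05 * rng.normal(size=n)
        trajs.append(X)

def lstsq_A(X):
    """Least-squares estimate of A from one trajectory."""
    sol, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)
    return sol.T

A_hat = [lstsq_A(X) for X in trajs]
centers = [A_hat[0], A_hat[-1]]  # crude initialization of the cluster models

for _ in range(10):
    # Step 1: each system picks the cluster whose model has smallest residual.
    labels = []
    for X in trajs:
        res = [np.linalg.norm(X[1:] - X[:-1] @ C.T) for C in centers]
        labels.append(int(np.argmin(res)))
    # Step 2: aggregate the local estimates within each cluster.
    for k in range(K):
        members = [A_hat[i] for i in range(len(trajs)) if labels[i] == k]
        if members:
            centers[k] = np.mean(members, axis=0)
```

Averaging within a correctly identified cluster pools the trajectories of all member systems, which is why the estimation error shrinks with the cluster size, consistent with the inverse scaling of the sample complexity stated above.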
Files
- Wang_columbia_0054D_18952.pdf application/pdf 3.51 MB
More About This Work
- Academic Units
- Electrical Engineering
- Thesis Advisors
- Anderson, James David
- Degree
- Ph.D., Columbia University
- Published Here
- January 8, 2025