2025 Theses Doctoral
Flexible Photonic Accelerated High Performance Compute System
The rapid growth of artificial intelligence (AI) applications has driven the development of larger and more complex machine learning (ML) models. This trend has motivated the need for large-scale distributed compute clusters with significant networking and memory capabilities. However, current datacenter (DC) and high-performance compute (HPC) architectures are fundamentally limited by the bandwidth and energy constraints of electrically switched networks and tightly coupled memory units, resulting in data movement bottlenecks that limit both compute throughput and energy efficiency.
To address these challenges, this work explores the integration of silicon photonic interconnects across the compute stack, from intra-node memory interfaces to inter-node compute networks. Specifically, we examine the architectural design space of integrating dense wavelength division multiplexing (DWDM)-based photonic links and switches into the compute system. To enable more efficient communication, we propose a hardware-software co-designed approach that optimizes collective communication algorithms for the underlying optically reconfigurable topology. To support the future deployment of photonic interconnects in HPC systems, we develop a quantitative methodology that evaluates their performance-power trade-offs for AI and ML applications.
We begin by introducing SiPAC, a silicon photonic accelerated compute architecture that improves distributed ML training through the co-design of a photonic physical layer and a collective communication algorithm. The physical layer exploits embedded DWDM photonic I/Os to bring high-bandwidth optical connectivity directly to the compute unit and leverages resonator-based optical wavelength selectivity to enable a multi-dimensional all-to-all interconnect topology. The collective algorithm builds on this primitive to accelerate commonly used collective operations in ML training and to improve communication efficiency. To support multi-tenant environments, we extend this design with Flex-SiPAC, which leverages a novel wavelength-reconfigurable multi-port transceiver and a spatial-wavelength selective switch for flexible bandwidth steering across the network. Its co-designed collective algorithm accelerates collective operations across diverse, concurrently running workloads.
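To illustrate why an all-to-all topology can accelerate collectives, consider a standard alpha-beta communication cost model. This is a generic textbook sketch, not SiPAC's actual algorithm, and all parameter values are illustrative:

```python
# Alpha-beta cost model for an all-reduce of n bytes across p nodes.
# alpha: per-message latency (s); beta: per-byte transfer time (s/byte).

def ring_allreduce_time(p, n, alpha, beta):
    """Ring all-reduce: 2(p-1) sequential steps, each moving n/p bytes."""
    return 2 * (p - 1) * (alpha + (n / p) * beta)

def alltoall_allreduce_time(p, n, alpha, beta):
    """Reduce-scatter + all-gather on a fully connected topology,
    assuming each node has p-1 parallel full-rate links: two rounds,
    each exchanging n/p bytes with every peer simultaneously."""
    return 2 * (alpha + (n / p) * beta)
```

Under this model the ring's step count grows linearly with node count, while the all-to-all variant stays constant, which is the kind of gap a dense optical fan-out can exploit.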
We next introduce SiPAM, a silicon photonic accelerated memory pooling architecture. SiPAM integrates DWDM-based photonic I/Os directly along the compute die's shoreline, replacing local high-bandwidth memory (HBM) and traditional electrical network interfaces with a unified, high-bandwidth optical communication domain. This architecture enables reconfigurable access to a disaggregated memory pool and decouples memory scaling from compute packaging constraints. In addition to identifying shoreline width as a critical architectural resource, we develop a quantitative optimization methodology that allocates compute resources, memory capacity, and memory bandwidth based on workload demand.
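The role of shoreline width as a resource can be seen in a back-of-the-envelope budget: escape bandwidth scales with die-edge length times I/O bandwidth density. The numbers below are hypothetical placeholders, not measurements from this work:

```python
# Illustrative shoreline bandwidth budget (all numbers hypothetical).

def shoreline_bandwidth_gbps(shoreline_mm, density_gbps_per_mm):
    """Aggregate escape bandwidth for I/O placed along a die edge."""
    return shoreline_mm * density_gbps_per_mm

# A 20 mm x 20 mm die exposing all four edges gives 80 mm of shoreline.
# Assumed densities: 1000 Gbps/mm for photonic I/O vs 200 Gbps/mm electrical.
photonic_gbps = shoreline_bandwidth_gbps(80, 1000)
electrical_gbps = shoreline_bandwidth_gbps(80, 200)
```

Because the budget is linear in both factors, raising bandwidth density directly multiplies the memory and network bandwidth reachable from a fixed die perimeter.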
Finally, we present a quantitative framework for evaluating reconfigurable network architectures in large-scale AI compute systems. This framework guides the future adoption of photonic switch and link technologies by analyzing performance trade-offs of varying reconfiguration latency, link bandwidth provisioning, and system power consumption. We develop two reconfiguration strategies and evaluate them on our proposed multi-dimensional all-to-all network topology designed to support hybrid parallelism with enhanced flexibility and energy efficiency.
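The core trade-off such a framework must capture is that reconfiguration latency is amortized only over sufficiently large transfers. A minimal sketch of that relationship (this simple model and its parameters are illustrative, not the framework's actual methodology):

```python
# Toy model: effective throughput of a reconfigurable optical link that
# pays a reconfiguration delay before each transfer (values illustrative).

def effective_throughput_gbps(transfer_gbit, link_gbps, reconfig_s):
    """Achieved Gbps once the reconfiguration delay is amortized."""
    transfer_s = transfer_gbit / link_gbps
    return transfer_gbit / (reconfig_s + transfer_s)
```

For a 400 Gbps link with 10 ms reconfiguration, an 800 Gbit transfer achieves nearly full line rate, while a 4 Gbit transfer achieves only half of it, which is why reconfiguration latency and traffic granularity must be evaluated together.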
Together, this work presents a systematic exploration of silicon photonic integration across the compute hierarchy, providing architectural and methodological insights toward building scalable, energy-efficient, and reconfigurable photonic infrastructure for emerging AI workloads.
Files
This item is currently under embargo. It will be available starting 2026-10-15.
More About This Work
- Academic Units
- Electrical Engineering
- Thesis Advisors
- Bergman, Keren
- Degree
- Ph.D., Columbia University
- Published Here
- October 29, 2025