Theses Doctoral

Optimizing Memory and Storage Performance in Cloud Datacenters

Zarkadas, Ioannis

Data-intensive applications increasingly dominate modern datacenters. This surge is driven by workloads such as AI/ML training and inference, HPC, data lakes, and large-scale cloud storage systems. However, software overhead remains a persistent challenge, with studies showing that nearly half of cloud computing cycles are wasted. As storage speeds outpace traditional system architectures, operating system overhead has emerged as a dominant bottleneck constraining overall system performance.

This thesis addresses these critical inefficiencies through a three-pronged approach. First, I revisit extensible operating systems as an effective way to optimize the storage and memory stack in cloud datacenters. Second, I uncover new sources of memory usage inefficiency in ML accelerators by constructing novel performance profiling tools. Third, I revisit and optimize traditional distributed protocols.

Each prong targets a key source of storage and memory overhead. First, I focus on developing efficient storage and memory stacks through OS extensions, improving system efficiency by bypassing OS layers and by enabling flexible policies for key OS components.

Second, I target suboptimal memory utilization in ML accelerators by developing novel performance debugging tools that analyze the low-level interaction of model code with the microarchitecture. These tools yield new insights into model inefficiencies and how to address them.

Third, I address storage efficiency by revisiting traditional storage protocols such as replication. More specifically, I enhance asynchronous replication with strong staleness guarantees and fast failover, enabling better application performance and storage utilization.

Files

  • Zarkadas_columbia_0054D_19484.pdf (application/pdf, 1.22 MB)

More About This Work

Academic Units
Computer Science
Thesis Advisors
Cidon, Asaf
Degree
Ph.D., Columbia University
Published Here
October 15, 2025