Theses Doctoral

Scaling Deductive Verification for Real-world Cloud Data Security

Li, Xupeng

The use of sensitive private data in many applications from advertising to healthcare, often in the context of machine learning models, has raised concerns regarding the privacy of data in computing. These applications increasingly run on commodity cloud providers. For example, data and computation may be contained in virtual machines (VMs) running on shared hardware in the cloud, relying on a hypervisor to preserve VM isolation to protect applications and their data in VMs.

Although hypervisors and OSes are supposed to protect applications and their private data, their large codebases contain vulnerabilities that can risk data confidentiality and integrity. Vulnerable system software running at more privileged levels that can access application data is a significant security issue. Formal verification offers a potential solution to this problem by mathematically proving that system software can provide critical security guarantees. However, given the complexity of commodity systems software and modern hardware architecture, applying existing system verification techniques to fully verify a commodity system is infeasible.

This dissertation introduces a systematic approach to scale deductive verification for real-world systems and formally verify their high-level data security properties. It involves verifying that the software implementation satisfies a formal high-level specification of its behavior, then proving that the specification guarantees the desired security properties. It introduces novel verification techniques to modularize and automate the first part of this process -- proving that the software implementation satisfies the formal specification -- by security preserving layers and specification synthesis. It also introduces novel approaches to formally model and verify the data security properties in modern commodity systems.

To ensure the feasibility and solidity of the verification, it also demonstrates how to handle unmodified system code with advanced language features, how to model modern hardware features such as relaxed memory behaviors, cache and TLBs. We have applied these techniques to verify two real systems, SeKVM, a commodity multiprocessor hypervisor retrofitted from KVM, and Arm Confidential Compute Architecture (Arm CCA), a novel new architecture to protect the data security of virtual machines. All the proofs are implemented in Coq and are machine-checked.

Files

  • thumnail for Li_columbia_0054D_19099.pdf Li_columbia_0054D_19099.pdf application/pdf 1.6 MB Download File

More About This Work

Academic Units
Computer Science
Thesis Advisors
Gu, Ronghui
Degree
Ph.D., Columbia University
Published Here
May 7, 2025