2025 Theses Doctoral
Discovery Through Bottlenecks in Multimodal Models
Modern machine learning systems generate photorealistic images, classify data with superhuman accuracy, and synthesize human-like text, yet understanding and controlling their behavior remains challenging. While interpretability techniques and conditioning mechanisms have made progress, we propose a fundamentally different approach: building systems where information flows through inherently interpretable bottlenecks, unifying discovery and control by construction.
This thesis presents three bottleneck methods for multimodal systems that exploit a key duality: the same mechanisms that reveal interpretable patterns also serve as precise control interfaces. In fine-grained classification, LLM-based evolutionary optimization discovers discriminative features in interpretable language bottlenecks, enabling targeted control over classification decisions.
In motion domains, bidirectional models learn interpretable muscle activation patterns that both explain motion generation and enable conditional editing based on muscle activity goals. In visual domains, counterfactual generation discovers fine-grained discriminative features through direct image editing, revealing subtle differences between visually similar groups while enabling modifications beyond the reach of language prompts. Together, these methods demonstrate how representational constraints can transform opaque machine learning systems into interpretable, controllable frameworks applicable to domains where the features and control objectives worth discovering are not known in advance.
Files
- Chiquier_columbia_0054D_19602.pdf (application/pdf, 3.99 MB)
More About This Work
- Academic Units: Computer Science
- Thesis Advisors: Vondrick, Carl M.
- Degree: D.E.S., Columbia University
- Published Here: November 19, 2025