2025 Theses Doctoral
Mixed-Initiative Conversational Intelligence in the Era of Large Pre-Trained Models
With the rise of large pre-trained models, the idea of intelligent conversational agents has quickly gained attention in the public eye. Such conversational agents promise impressive capabilities in a multi-turn interaction setting, whether it be knowledge-grounded question answering, reasoning for code generation, or navigating real-world tasks like restaurant booking. Despite the greatly improved capabilities of agents to perform directed instruction-following tasks, agents built around large pre-trained models have not yet exhibited the ability to navigate mixed-initiative environments. Contrary to single-initiative settings, here an agent must recognize when it is appropriate to execute different actions to redirect the flow of the conversation (e.g. via clarifying questions or argumentative strategies) to maximize the chance of conversational success. This dissertation proposes novel methods to address core issues presenting bottlenecks to the mixed-initiative intelligence of such agents.
This dissertation is structured to address three core challenges. The first section introduces the challenges of multi-turn conversational modeling, as in-domain data can be expensive or infeasible to obtain. We propose a framework for synthesizing large-scale conversational data even for novel tasks by leveraging the instruction-following capabilities of existing large language models. In the second section, we discuss the challenges of action planning. Here, we propose a seminal line of work on inference-time strategy optimization by leveraging large language model prompting for search and simulation adapted for Monte-Carlo Tree Search. We then introduce a novel perspective towards action optimization called implicit action recognition, and propose a novel model alignment algorithm called Action-based Contrastive Self-Training.
The final section of this dissertation focuses on the challenges of multimodal user modeling, as modern conversational agents rapidly look to become more ubiquitous by expanding towards all interaction modalities. This work builds on the previous sections' progress towards improved data curation and implicit action recognition. Here, we introduce a novel task for mixed-initiative spoken conversation modeling, as well as a simple yet effective approach to adapt models to different users' speaking patterns.
The methods proposed in this dissertation address tractable real-world challenges and serve as the foundation for further exploration in mixed-initiative conversation modeling.
Files
- Chen_columbia_0054D_19338.pdf (application/pdf, 1.82 MB)
More About This Work
- Academic Units: Computer Science
- Thesis Advisors: Yu, Zhou
- Degree: Ph.D., Columbia University
- Published Here: August 27, 2025