2025 Theses Doctoral
Learning to Remember, Summarize, and Answer Questions about Robot Actions
Robots and other systems using deep learning are increasingly common. It is essential to accurately keep track of what they have done and determine if they are operating as we intend. In this dissertation, I introduce several approaches to enable better monitoring and understanding of these systems.
I propose and demonstrate robot action summarization and question answering in natural language. The first step in understanding and controlling robots is knowing what they are doing. Much research has addressed training robots to follow natural language instructions; robot action summarization complements this research by enabling robots to report back succinctly and accurately on what they have done after completing a task. I demonstrate that such summaries can be generated from multimodal inputs using recurrent and transformer networks.
I then investigate question answering about robot action episodes using a dataset of questions and answers that I introduce. I show that learning to answer questions can help a model summarize: the model can learn about some objects solely during question answering and then transfer that representational knowledge to summarization.
If robots are to summarize and answer questions about their past actions, they will need to store and recall episodes of action. I introduce a technique to form compact memory representations that can be used for these tasks, as well as for guiding choices about which actions to take during an action sequence. Although such artificial episodic memory representations can help users keep track of what robots or other machine learning systems are doing, they could also pose risks. I therefore propose a set of principles to guide the safe development of artificial episodic memory.
Finally, I introduce a method for predicting a neural network's accuracy on particular inputs by training a second network to examine either the outputs of its intermediate hidden layers or its final outputs.
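As a rough illustration of this last idea, and not the dissertation's actual method, the sketch below trains a small monitor model to predict whether a primary model's prediction is correct, using the primary model's hidden activations and final output as features. Everything in it is an assumption made for illustration: the toy task, the names (`fit_logistic`, `monitor_inputs`, and so on), and the use of logistic regression in place of a second neural network.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(features, targets, lr=0.5, steps=500):
    """Gradient-descent logistic regression; returns (weights, bias)."""
    w = np.zeros(features.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(features @ w + b)
        grad = p - targets
        w -= lr * features.T @ grad / len(targets)
        b -= lr * grad.mean()
    return w, b

# Toy "primary network" (hypothetical): a fixed random hidden layer
# followed by a trained logistic output layer.
X = rng.normal(size=(2000, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)   # XOR-like toy labels
W_hidden = rng.normal(size=(2, 16))
hidden = np.tanh(X @ W_hidden)              # intermediate activations
w_out, b_out = fit_logistic(hidden, y)
primary_prob = sigmoid(hidden @ w_out + b_out)

# 1.0 where the primary model's prediction matches the label, else 0.0.
correct = ((primary_prob > 0.5) == (y > 0.5)).astype(float)

# Monitor model: sees hidden activations plus the final output and is
# trained to predict correctness. Fit on the first 1500 examples,
# then applied to the 500 held-out examples.
monitor_inputs = np.column_stack([hidden, primary_prob])
w_mon, b_mon = fit_logistic(monitor_inputs[:1500], correct[:1500])
predicted_correctness = sigmoid(monitor_inputs[1500:] @ w_mon + b_mon)
```

Each entry of `predicted_correctness` is the monitor's estimated probability that the primary model is right on that held-out input. Comparing these estimates with `correct[1500:]` would show how well the monitor tracks the primary model's actual accuracy.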
Subjects
Files
- DeChant_columbia_0054D_18946.pdf application/pdf 3.2 MB
More About This Work
- Academic Units: Computer Science
- Thesis Advisors: Bauer, Daniel
- Degree: Ph.D., Columbia University
- Published Here: December 26, 2024