Theses Doctoral

Scalable Tools for Information Extraction and Causal Modeling of Neural Data

Nejatbakhshesfahani, Mohammadamin

Systems neuroscience has entered in the past 20 years into an era that one might call "large scale systems neuroscience". From tuning curves and single neuron recordings there has been a conceptual shift towards a more holistic understanding of how the neural circuits work and as a result how their representations produce neural tunings.

With the introduction of a plethora of datasets in various scales, modalities, animals, and systems; we as a community have witnessed invaluable insights that can be gained from the collective view of a neural circuit which was not possible with small scale experimentation. The concurrency of the advances in neural recordings such as the production of wide field imaging technologies and neuropixels with the developments in statistical machine learning and specifically deep learning has brought system neuroscience one step closer to data science. With this abundance of data, the need for developing computational models has become crucial. We need to make sense of the data, and thus we need to build models that are constrained up to the acceptable amount of biological detail and probe those models in search of neural mechanisms.

This thesis consists of sections covering a wide range of ideas from computer vision, statistics, machine learning, and dynamical systems. But all of these ideas share a common purpose, which is to help automate neuroscientific experimentation process in different levels. In chapters 1, 2, and 3, I develop tools that automate the process of extracting useful information from raw neuroscience data in the model organism C. elegans. The goal of this is to avoid manual labor and pave the way for high throughput data collection aiming at better quantification of variability across the population of worms. Due to its high level of structural and functional stereotypy, and its relative simplicity, the nematode C. elegans has been an attractive model organism for systems and developmental research. With 383 neurons in males and 302 neurons in hermaphrodites, the positions and function of neurons is remarkably conserved across individuals. Furthermore, C. elegans remains the only organism for which a complete cellular, lineage, and anatomical map of the entire nervous system has been described for both sexes. Here, I describe the analysis pipeline that we developed for the recently proposed NeuroPAL technique in C. elegans. Our proposed pipeline consists of atlas building (chapter 1), registration, segmentation, neural tracking (chapter 2), and signal extraction (chapter 3). I emphasize that categorizing the analysis techniques as a pipeline consisting of the above steps is general and can be applied to virtually every single animal model and emerging imaging modality. I use the language of probabilistic generative modeling and graphical models to communicate the ideas in a rigorous form, therefore some familiarity with those concepts could help the reader navigate through the chapters of this thesis more easily.

In chapters 4 and 5 I build models that aim to automate hypothesis testing and causal interrogation of neural circuits. The notion of functional connectivity (FC) has been instrumental in our understanding of how information propagates in a neural circuit. However, an important limitation is that current techniques do not dissociate between causal connections and purely functional connections with no mechanistic correspondence. I start chapter 4 by introducing causal inference as a unifying language for the following chapters. In chapter 4 I define the notion of interventional connectivity (IC) as a way to summarize the effect of stimulation in a neural circuit providing a more mechanistic description of the information flow. I then investigate which functional connectivity metrics are best predictive of IC in simulations and real data. Following this framework, I discuss how stimulations and interventions can be used to improve fitting and generalization properties of time series models. Building on the literature of model identification and active causal discovery I develop a switching time series model and a method for finding stimulation patterns that help the model to generalize to the vicinity of the observed neural trajectories. Finally in chapter 5 I develop a new FC metric that separates the transferred information from one variable to the other into unique and synergistic sources.

In all projects, I have abstracted out concepts that are specific to the datasets at hand and developed the methods in the most general form. This makes the presented methods applicable to a broad range of datasets, potentially leading to new findings. In addition, all projects are accompanied with extensible and documented code packages, allowing theorists to repurpose the modules for novel applications and experimentalists to run analysis on their datasets efficiently and scalably.

In summary my main contribution in this thesis are the following:

1) Building the first atlases of hermaphrodite and male C. elegans and developing a generic statistical framework for constructing atlases for a broad range of datasets.
2) Developing a semi-automated analysis pipeline for neural registration, segmentation, and tracking in C. elegans.
3) Extending the framework of non-negative matrix factorization to datasets with deformable motion and developing algorithms for joint tracking and signal demixing from videos of semi-immobilized C. elegans.
4) Defining the notion of interventional connectivity (IC) as a way to summarize the effect of stimulation in a neural circuit and investigating which functional connectivity metrics are best predictive of IC in simulations and real data.
5) Developing a switching time series model and a method for finding stimulation patterns that help the model to generalize to the vicinity of the observed neural trajectories.
6) Developing a new functional connectivity metric that separates the transferred information from one variable to the other into unique and synergistic sources.
7) Implementing extensible, well documented, open source code packages for each of the above contributions.

Files

  • thumnail for Nejatbakhshesfahani_columbia_0054D_17530.pdf Nejatbakhshesfahani_columbia_0054D_17530.pdf application/pdf 2.02 MB Download File

More About This Work

Academic Units
Neurobiology and Behavior
Thesis Advisors
Paninski, Liam
Degree
Ph.D., Columbia University
Published Here
October 12, 2022