Theses Doctoral

Essays in Computational Law and Economics

Connell, Paul

Legal institutions are surrounded by, constituted in and expressed through text. Recent advancements in natural language processing (NLP) techniques therefore hold great promise in expanding the scope of questions addressable by empirical law and economics research.

This dissertation demonstrates that promise through three studies, each of which applies an economic theory framework to a question grounded, at some level, in legal text. Specifically, this work explores: (1) how slavery affected technological innovation trajectories in the American South, as revealed through patents; (2) how judicial experience and court structure affect securities litigation outcomes, discernible in part through court filings; and (3) how researchers can quantify and correct misclassification errors that arise when processing textual data such as legal opinions or merger-and-acquisition contracts.

This dissertation employs diverse NLP methodologies to extract meaningful information from legal documents across all three studies. The first paper applies Multinomial Inverse Regression (MNIR) and Large Language Models (LLMs) to analyze historical U.S. patent texts (1836-1877), combining these insights with a directed technical change model and difference-in-differences estimation.

The second paper uses a Random Forest Classification algorithm to process securities class action complaints and judicial opinions, developing a structural model of expert decision-making to evaluate judicial performance.

The third paper presents statistical frameworks for quantifying and correcting misclassification errors that occur when LLMs or humans classify features in legal texts, with validation through Monte Carlo simulations and empirical applications to legal data.

The findings of these three studies reveal that institutional structures significantly shape economic outcomes across multiple domains. The first study concludes that slavery directed Southern technological innovation away from capital-intensive production techniques, potentially explaining divergent industrial development between North and South. The second study demonstrates that judicial experience increases skill in identifying non-meritorious securities lawsuits while advanced age diminishes it, estimating that optimized term limits and specialized courts could have prevented over $14.5 billion in non-meritorious settlements since 1995. The third study provides practical methods for correcting both attenuation and directional bias from misclassification error in textual analysis.

Collectively, these studies illustrate how modern NLP tools can substantially expand the scope and precision of empirical law and economics research, yielding insights into the relationship between legal institutions and economic outcomes that data limitations previously put out of reach.

Files

  • Connell_columbia_0054D_19114.pdf (application/pdf, 2.52 MB)

More About This Work

Academic Units
Economics
Thesis Advisors
MacLeod, W. Bentley
Degree
Ph.D., Columbia University
Published Here
May 28, 2025