Theses Doctoral

Simulating Fiction: Models of Narrative and Literary Culture

Sack, Graham Alexander

Richard Feynman once remarked, “What I cannot create, I do not understand.” In Simulating Fiction: Models of Narrative and Literary Culture, I argue for a paradigm shift in literary and cultural criticism. Placing Feynman’s maxim in the context of the humanities, I contend that scholars of literature and culture should embrace a “generative” approach to knowledge production that re-centers the discipline around simulation and modeling as a complement to the field’s traditional reliance on description, interpretation, and critique.

Since its inception, literary criticism has lacked methods to model and test claims about how narrative and literary culture work at a fundamental mechanistic level. Over the past decade, the explosive popularity of big data, natural language processing, and machine learning has helped digital humanists discover many striking historical trends and correlations, but it has not solved this basic epistemological problem of explanation. Scholars are better equipped to answer questions of ‘how’ and ‘what’ but not ‘why.’ Computational modeling offers a path forward by extending, complementing, and contradicting humanistic intuition. Whereas literary theory produces knowledge by deduction and big data by induction, simulation does so via abduction—that is, modeling possible causes. Theoretical claims about how narrative and culture work are instantiated algorithmically. Artificial worlds are then grown from the bottom up and their simulated output is validated against real literary and cultural systems. The archive of narrative and cultural theory is brimming with candidate models, ranging from generative storytelling grammars to sociological models of cultural production. Instantiating such theories computationally enables literary scholars to play out the implications in far more vivid detail than is possible solely in the mind’s eye.

The most persuasive way to make the case for a new research paradigm is by positive example. Simulating Fiction therefore consists of several extended case studies focused on modeling narrative at various scales.

The first three chapters offer an in-depth investigation into the question, “Why do narratives (almost universally) develop characters unequally?” While literary critics would traditionally approach such a question qualitatively, I argue that character development begins as a quantitative phenomenon. To quote the Nobel laureate P. W. Anderson: “More is different.” If one measures the number of words spoken by each character in a Shakespearean drama, the number of times each character is named in a Victorian novel, and the number of seconds each character appears on-screen in a contemporary American film, the same distribution usually appears: what statisticians call a power law or “long tail.” In a field like literary criticism, which concentrates almost exclusively on the particularity of texts, the discovery of such a large-scale statistical regularity is remarkable. But even more compelling is the question of what causes it.
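As a rough illustration of the measurement described above, the following sketch counts words per speaker and fits a straight line to the rank-size data on log-log axes. The function names (`words_per_character`, `rank_size`, `loglog_slope`) and the toy input are hypothetical and are not taken from the dissertation; a log-log slope is only a crude diagnostic, not a rigorous test for a power law.

```python
from collections import Counter
import math


def words_per_character(lines):
    """Count words spoken by each character, given (speaker, text) pairs."""
    counts = Counter()
    for speaker, text in lines:
        counts[speaker] += len(text.split())
    return counts


def rank_size(counts):
    """Return (rank, size) pairs, largest speaking part first."""
    return [(rank, n) for rank, (_, n) in enumerate(counts.most_common(), start=1)]


def loglog_slope(pairs):
    """Least-squares slope of log(size) against log(rank).

    A roughly constant negative slope is consistent with a power law,
    but it does not establish one.
    """
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(size) for _, size in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Toy input (invented for illustration only).
toy_lines = [
    ("A", "to be or not"),
    ("B", "yes"),
    ("A", "go"),
]
toy_counts = words_per_character(toy_lines)
print(rank_size(toy_counts))
```

The same three functions would apply unchanged to name mentions in a novel or seconds of screen time, since only the per-character totals matter.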

Literary critics are generally trained to seek explanations at the level of historical period, genre, or medium. What, then, should we do when confronted by a pattern that persists despite extreme differences in all three? I contend that we are forced to look below the level of history and form to fundamental mechanisms that operate at the level of narrative structure, cognition, and probability. To lay the foundation for an explanation, I develop a series of models, each of which is capable of generating a “long tail” distribution and has a plausible interpretation in the context of narrative. These include: (1) a model of forces of “unification” and “diversification” in narrative structure that determine the shape of character development; (2) an information-theoretic model of how authors “maximize entropy” by pushing the limits of creative exploration within the constraints of memory, empathy, and attention; (3) a “building block” model in which characters are composed through the accumulation of characteristics; and (4) a “rich get richer” model in which major and minor characters are differentiated through a positive feedback loop.
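To make the “rich get richer” idea in item (4) concrete, here is a minimal sketch of a generic preferential-attachment process (in the style of Simon's model), offered as an assumption about what such a feedback loop could look like rather than as the dissertation's own model. The function name `simulate_mentions` and the parameter `p_new` are hypothetical.

```python
import random


def simulate_mentions(n_mentions, p_new=0.05, seed=0):
    """Simulate a sequence of character mentions with positive feedback.

    At each step, with probability p_new a new character is introduced;
    otherwise the mention goes to an existing character chosen with
    probability proportional to how often that character has already
    been mentioned. Returns mention counts sorted from largest to smallest.
    """
    rng = random.Random(seed)
    counts = [1]     # counts[i] = mentions of character i so far
    history = [0]    # character id of every mention so far
    for _ in range(n_mentions - 1):
        if rng.random() < p_new:
            counts.append(1)
            history.append(len(counts) - 1)
        else:
            # Picking a past mention uniformly at random selects a
            # character in proportion to its current mention count.
            who = rng.choice(history)
            counts[who] += 1
            history.append(who)
    return sorted(counts, reverse=True)


counts = simulate_mentions(5000, p_new=0.05, seed=0)
print("characters:", len(counts))
print("top five mention counts:", counts[:5])
```

Because early characters accumulate mentions that in turn attract more mentions, a few characters tend to dominate while many remain minor, which is the qualitative shape of a long-tailed distribution.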

While the first three chapters investigate narrative at the scale of character, the fourth chapter widens focus to the scale of plot. I explore the question, “Can social network models generate what we consider ‘plot’?” Network theory has become extremely popular within the digital humanities over the past decade as a means of measuring and visualizing the relationships between characters within texts. Such descriptive networks, however, are merely a trace of the underlying literary phenomenon—a forensic tool, like an X-ray, for visualizing plot after the fact. My concern in the fourth chapter is to invert and elevate the use of literary networks by simulating their dynamics to generate “proto-narratives.” As a case study, I develop the first computational model of “triangular desire,” an influential theory of character psychology proposed by philosopher and literary critic René Girard in Deceit, Desire, and the Novel (1961), along with a simulation of Structural Balance Theory, a sociological model of group formation based on patterns of friendship and enmity.
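Structural Balance Theory, as usually stated, treats a triad of signed relationships as balanced when the product of its three signs is positive (for example, two friends who share an enemy). The sketch below checks that condition and applies a simple random relaxation rule that flips one tie in an unbalanced triad. The relaxation rule, the function names, and the character labels are illustrative assumptions, not the simulation developed in the dissertation.

```python
import itertools
import random


def is_balanced(sign, a, b, c):
    """A triad is balanced when the product of its three tie signs is positive."""
    ab = sign[frozenset((a, b))]
    bc = sign[frozenset((b, c))]
    ac = sign[frozenset((a, c))]
    return ab * bc * ac > 0


def unbalanced_triads(sign, nodes):
    """List every triad of nodes that violates structural balance."""
    return [t for t in itertools.combinations(nodes, 3)
            if not is_balanced(sign, *t)]


def relax(nodes, sign, rng, max_steps=10_000):
    """Repeatedly flip one random tie in a random unbalanced triad.

    Returns the number of flips performed. Stops early once no unbalanced
    triad remains; reaching that state within max_steps is not guaranteed.
    """
    for step in range(max_steps):
        bad = unbalanced_triads(sign, nodes)
        if not bad:
            return step
        triad = rng.choice(bad)
        edge = frozenset(rng.sample(triad, 2))
        sign[edge] = -sign[edge]
    return max_steps


# Hypothetical cast with random friendship (+1) and enmity (-1) ties.
rng = random.Random(1)
cast = ["A", "B", "C", "D", "E"]
ties = {frozenset(pair): rng.choice([1, -1])
        for pair in itertools.combinations(cast, 2)}
flips = relax(cast, ties, rng)
print("flips performed:", flips)
print("unbalanced triads remaining:", len(unbalanced_triads(ties, cast)))
```

Each flip can be read as a small event (an alliance breaking, a reconciliation), so the sequence of flips gives one loose sense in which network dynamics might produce a “proto-narrative.”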

My goal is to demonstrate the potential of modeling and simulation as disciplinary practices for the humanities and to make allies and converts of an interdisciplinary audience, including literary and cultural critics, digital humanists, computational social scientists, and complexity theorists.


This item is currently under embargo. It will be available starting 2026-03-17.

More About This Work

Academic Units: English and Comparative Literature
Thesis Advisors: Dames, Nicholas J.
Degree: Ph.D., Columbia University
Published Here: March 22, 2021