Theses Doctoral

Genealogies of Machine Learning, 1950-1995

Mendon-Plasek, Aaron Louis

This study examines the history of machine learning in the second half of the twentieth century. The disunified forms of machine learning from the 1950s until the 1990s expanded what constituted “legitimate” and “efficacious” descriptions of society and physical reality by using computer learning to accommodate the variability of data and to spur creative and original insights. By the early 1950s, researchers saw “machine learning” as a solution for handling practical classification tasks involving uncertainty and variability; a strategy for producing original, creative insights in both science and society; and a strategy for making decisions in new contexts and new situations when no causal explanation or model was available.

Focusing heavily on image classification and recognition tasks, pattern recognition researchers, building on this earlier learning tradition from the mid-1950s to the late 1980s, equated the idea of “learning” in machine learning with a program’s capacity to identify what was “significant” and to redefine objectives given new data in “ill-defined” systems. Classification, for these researchers, encompassed individual pattern recognition problems, the process of scientific inquiry, and, ultimately, all subjective human experience: they viewed all these activities as specific instances of generalized statistical induction. In treating classification as generalized induction, these researchers viewed pattern recognition as a method for acting in the world when one does not understand it. Seeing subjectivity and sensitivity to “contexts” as a virtue, pattern recognition researchers distinguished themselves from the better-known artificial intelligence community by emphasizing the values and assumptions they necessarily “smuggled in” to their learning programs. Rather than a bias to be removed, the explicit contextual subjectivity of machine learning, including its sensitivity to the idiosyncrasies of its training data, justified its use from the 1960s to the 1980s.

Pattern recognition researchers shared a basic skepticism about the possibility of knowledge of universals apart from a specific context, a belief in the generative nature of individual examples to inductively revise beliefs and abductively formulate new ones, and a conviction that classifications are both arbitrary and more or less useful. They were, in a word, nominalists. These researchers sought methods to accommodate necessarily situated, limited, and perspectival views of the world. This extended to the task of classification itself, which, as one researcher formally proved, relied on value judgments that could not depend on logical or empirical grounds alone. “Inductive ambiguities” informed these researchers’ understanding of human subjectivity, and led them to explicitly link creativity and efficacious action to the range of an individual’s idiosyncrasies and subjective experiences, including one’s culture, language, education, ambitions, and, ultimately, values that informed science. Researchers justified using larger amounts of messy, error-prone data rather than smaller, curated, expensively produced data sets by the potentially greater range of useful, creative actions a program might learn. Such learning programs, researchers hoped, might usefully operate in circumstances or make decisions that even the program’s creator did not anticipate or even understand.

This dissertation shows that the history of quantification in the second half of the twentieth century and early twenty-first century, including how we know different social groups, individual people, and ourselves, cannot be properly understood without a genealogy of machine learning. The values and methods for making decisions in the absence of a causal or logical description of the system or phenomenon emerged as a practical and epistemological response to problems of knowledge in pattern recognition. These problem-framing strategies in pattern recognition interwove creativity, learning, and computation in durable ways; from the late 1980s onward, they were relabeled “machine learning.” Neither progressive, linear, nor centralized, this development was disordered and contingent on the existence of disparate communities, each with distinct problems and techniques yet equally engaged in exchanging practices, values, and methods among themselves. Developing largely outside of symbolic artificial intelligence from the 1950s to the 1980s, these diverse approaches came into AI as “machine learning” in the late 1980s and early 1990s. This reinvention of much of AI as machine learning occurred not because machine learning performed better, but because of a realignment of values within AI, one in which induction and large heterogeneous data sets seemed the better way to understand and to affect the world and the people in it.


This item is currently under embargo. It will be available starting 2027-09-21.

More About This Work

Academic Units
Thesis Advisors: Jones, Matthew L.
Degree: Ph.D., Columbia University
Published Here: September 28, 2022