Classifying Numeric Information for Generalization

Lebowitz, Michael

Learning programs that generalize from real-world examples must handle many different kinds of data. Continuous numeric data causes problems for algorithms that search for examples with identical property values, because two examples rarely share exactly the same number. Categorizing the numeric data surmounts these problems, but the categorization process introduces problems of its own. In this paper, we examine the need for categorizing numeric data and several methods for doing so. We concentrate on generalization-based memory, a memory organization in which actual examples are stored along with generalizations, and which leads to a generalization-based categorization algorithm. We also consider how to use a numeric heuristic that looks for gaps in the values. These methods have been implemented in the UNIMEM computer system. We present examples of these algorithms categorizing data about the states of the United States.
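
To make the gap-looking idea concrete, the following is a minimal sketch in Python. It is not the UNIMEM implementation, which this page does not describe in detail. The function name split_at_gaps, the gap_factor parameter, and the rule of splitting where a gap exceeds a multiple of the mean gap are all assumptions chosen for illustration, and the sample values are made up rather than taken from the report.

```python
from typing import List


def split_at_gaps(values: List[float], gap_factor: float = 2.0) -> List[List[float]]:
    """Group numeric values into categories by splitting at unusually large gaps.

    Hypothetical rule (an assumption, not taken from the report): a gap between
    two neighbouring sorted values starts a new category when it is larger than
    gap_factor times the mean gap.
    """
    ordered = sorted(values)
    if not ordered:
        return []
    if len(ordered) == 1:
        return [ordered]

    # Differences between each pair of neighbouring values.
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    mean_gap = sum(gaps) / len(gaps)

    groups = [[ordered[0]]]
    for current, gap in zip(ordered[1:], gaps):
        if gap > gap_factor * mean_gap:
            groups.append([current])    # large gap: open a new category
        else:
            groups[-1].append(current)  # small gap: stay in the same category
    return groups


# Made-up values for illustration only (not data from the report).
sample = [1.0, 1.2, 1.5, 6.0, 6.3, 12.0]
print(split_at_gaps(sample))
```

With these made-up values the two largest gaps (between 1.5 and 6.0, and between 6.3 and 12.0) exceed twice the mean gap, so the sketch produces three categories. Once each value is replaced by its category, an algorithm that matches on identical property values can treat examples in the same category as equal.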

More About This Work

Academic Units
Computer Science
Department of Computer Science, Columbia University
Columbia University Computer Science Technical Reports, CUCS-053-83
Published Here
October 25, 2011