Abstract
Learning programs that generalize from real-world examples must handle many different kinds of data. Continuous numeric data can cause problems for algorithms that search for examples with identical property values, since continuous values rarely match exactly. These problems can be surmounted by categorizing the numeric data, but categorization has problems of its own. In this paper, we examine the need for categorizing numeric data and several methods for doing so. We concentrate on generalization-based memory, a memory organization in which actual examples are stored along with generalizations, and which leads to a generalization-based categorization algorithm. We also consider a simple numeric heuristic: looking for gaps between values. These methods have been implemented in the UNIMEM computer system. We present examples of these algorithms categorizing data about the states of the United States.
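To make the gap heuristic concrete, the following is a minimal sketch and not the UNIMEM implementation. It assumes that a category boundary is placed wherever two adjacent sorted values differ by more than a fixed threshold. The function name `categorize_by_gaps`, the `min_gap` parameter, and the sample numbers are all hypothetical, chosen only for illustration.

```python
def categorize_by_gaps(values, min_gap):
    """Group numeric values, starting a new group at each large gap.

    Assumption (not from the paper): a "gap" is any difference between
    adjacent sorted values that exceeds the fixed threshold min_gap.
    """
    ordered = sorted(values)
    if not ordered:
        return []

    groups = [[ordered[0]]]
    # Walk adjacent pairs of sorted values and split where the gap is large.
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > min_gap:
            groups.append([current])    # large gap: open a new category
        else:
            groups[-1].append(current)  # small gap: same category
    return groups


# Made-up illustrative values, not data from the paper.
sample = [1, 2, 3, 10, 11, 25]
print(categorize_by_gaps(sample, min_gap=3))
```

With these made-up values and a threshold of 3, the sketch yields three categories: `[1, 2, 3]`, `[10, 11]`, and `[25]`. A fixed threshold is the simplest possible choice; it is used here only to show how gaps can turn a continuous property into a small set of discrete categories.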