Normative Systems of Discovery and Logic of Search
Author(s): Jan M. Zytkow and Herbert A. Simon
Source: Synthese, Vol. 74, No. 1, Knowledge-Seeking by Questioning, Part I (Jan., 1988), pp. 65-90
Published by: Springer
Stable URL: http://www.jstor.org/stable/20116486


JAN M. ZYTKOW AND HERBERT A. SIMON

NORMATIVE SYSTEMS OF DISCOVERY AND LOGIC OF SEARCH

ABSTRACT. New computer systems of discovery create a research program for logic and philosophy of science. These systems consist of inference rules and control knowledge that guide the discovery process. Their paths of discovery are influenced by the available data, and the discovery steps coincide with the justification of results. The discovery process can be described in terms of fundamental concepts of artificial intelligence, such as heuristic search, and can also be interpreted in terms of logic. The traditional distinction that places studies of scientific discovery outside the philosophy of science, in psychology, sociology, or history, is no longer valid in view of the existence of computer systems of discovery. It becomes both reasonable and attractive to study the schemes of discovery in the same way as the criteria of justification were studied: empirically as facts, and logically as norms.

1. NORMATIVE SYSTEMS OF DISCOVERY

In the last decade several computer programs have been constructed that simulate, in detail, various processes of discovery (Feigenbaum et al. 1971; Lenat 1977, 1983; Buchanan 1978; Langley 1978, 1981; Bradshaw et al. 1980; Michalski and Stepp 1981; Langley et al. 1983, 1983a, 1987; Zytkow and Simon 1986). They operate either on qualitative or on quantitative data. They incorporate different knowledge and different expectations about the forms of the laws they should consider. Some of these systems incorporate very specific knowledge of a certain domain in science, while others are based only on general-purpose discovery heuristics. Some programs construct new concepts, others are capable of constructing laws, and still others construct explanations in the form of descriptions of the hidden structure of things or processes.

In this paper we will draw examples from two discovery systems, BACON and GLAUBER, that address a few facets of scientific discovery: concept formation and generation of empirical regularities, either for qualitative (GLAUBER) or quantitative (BACON) bodies of data. GLAUBER forms classes of objects based on regularities in qualitative data, and states qualitative laws in terms of these classes. BACON includes heuristics for finding numerical laws, for postulating intrinsic properties (theoretical terms), and for noting common divisors. Since both systems are described in numerous publications and their full discussion goes well beyond the limits of this paper, we will condense their descriptions to the minimum required to support the ideas of this paper.

Synthese 74 (1988) 65-90. © 1988 by Kluwer Academic Publishers.

BACON: A System That Discovers Numerical Laws

BACON (Langley 1978, 1981; Bradshaw et al. 1980; Langley et al. 1983, 1983a, 1987) makes discoveries by induction on bodies of data. It deals primarily with numerical concepts (variables). Given a set of independent variables and a dependent variable, BACON varies one independent variable, while fixing the values of all other independent variables. It registers the corresponding values of the dependent variable and it looks for a function that relates the varied independent variable with the dependent variable. Once such a function is found, BACON gives the parameters in that function the status of dependent variables at a higher level of its search tree. At that higher level the system varies the next independent variable and tries to relate it to the new dependent terms. This process continues recursively until all the independent variables are incorporated into a quantitative relationship.

As an example, the system may consider two independent variables x1 and x2, while y is the dependent variable. For the fixed value of x2 = 1, it discovers the regularity y = 2x1 + 3. This regularity has the form y = ax1 + b, and thus the concepts a and b are created. For x2 = 1 their values are a = 2, b = 3. Then, for another value x2 = 2, the regularity is y = 3x1 + 5, and for still another, x2 = 3, the regularity is y = 4x1 + 7. Now the system confronts the variable x2 with a and with b, and it discovers that a = x2 + 1, while b = 2x2 + 1. Taken together, these regularities form the law y = (x2 + 1)x1 + (2x2 + 1).

BACON can also consider nominal independent variables, that is, variables that have symbols as their values. A nominal variable cannot be directly related to numerical variables by the use of numerical functions. In this case BACON introduces a new numerical variable called an intrinsic property and associates it with the same record as the nominal variable. Thus the nominal variable can be related to other numerical variables through the intrinsic property. Intrinsic properties are defined by numerical dependent variables available to the system. In this process the numerical values of the dependent variables are copied as numerical values of the independent variable. Once the numerical values of the independent variable are defined, the same values will be used whenever the same objects occur later in BACON's search.

BACON rediscovered a large number of laws of 18th and 19th century physics and chemistry (Ohm's Law, Boyle's Law, Kepler's Laws, Coulomb's Law, the conservation of momentum law, and so forth). While discovering these laws BACON introduced intrinsic terms corresponding to mass, internal resistance of a battery, and others.

GLAUBER: A Qualitative Discovery System

Input data. The data utilized by GLAUBER (Langley et al. 1983a, 1987) consist of a set of qualitative observational facts, such as "hydrochloric acid tastes sour" and "hydrochloric acid combines with sodium hydroxide to form sodium chloride". Facts inputted to GLAUBER are represented as lists beginning with a predicate followed by a number of attribute-value lists. Thus our second example above is represented by the proposition (reacts (inputs HCl NaOH) (outputs NaCl)). The predicate is "reacts" here, while the attributes are "inputs" and "outputs". A set of values is allowed for each attribute. In our example, the inputs attribute has two values, {HCl NaOH}, which represent the two substances that combine in the reaction.

Noting Patterns and Defining Classes. GLAUBER accepts such a set of facts and searches for facts that have the same predicate, the same attribute, and the same value for this attribute. When such a collection of facts is discovered, GLAUBER creates a class (or classes) of the values that differ in these facts, and a pattern which is stated in the same manner as the original facts, save that the differing values are replaced by class names. For instance, suppose that while focusing on sodium hydroxide, GLAUBER notes that this chemical combines with hydrochloric acid to form sodium chloride in one case, with nitric acid to form sodium nitrate in another case, and with sulfuric acid to form Na2SO4 (Glauber's salt) in a third case. In this situation, GLAUBER would create two classes. The first (let us call it NaOH-reactors) contains hydrochloric acid, nitric acid, and sulfuric acid, while the second (NaOH-results) contains sodium chloride, sodium nitrate, and Na2SO4. The pattern would be stated as (reacts (inputs NaOH NaOH-reactors) (outputs NaOH-results)).

Combining Related Classes. Identical or very similar classes may be created for different patterns. The system compares classes and combines those that have a high percentage of elements in common. At the same time, both sets of patterns are associated with the combined class. For example, having also generated the class of sour-tasting substances, GLAUBER notes that every member of the sour-tasting class also fits the pattern associated with the NaOH-reacting class (and vice versa). As a result, the members of sour-tasters and NaOH-reactors are combined into a new class. This class becomes associated with two patterns, one involving taste and the other summarizing the class of NaOH reactions.

Recursing Towards More General Patterns. Since patterns are stated in the same form as the initial facts, GLAUBER can apply its abstraction method recursively to the patterns it has already obtained. In this way it may arrive at a more general pattern, (reacts (inputs acid alkali) (outputs salt)). After the next round of abstracting patterns is completed, the system again compares and combines classes and patterns.
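The class-forming step can be written out directly from this description. The fact representation and the function below are our own illustrative reconstruction, not GLAUBER's code; class names are generated mechanically from the anchor value.

```python
# Facts as (predicate, {attribute: set-of-values}) pairs, mirroring
# propositions such as (reacts (inputs HCl NaOH) (outputs NaCl)).
facts = [
    ("reacts", {"inputs": {"HCl", "NaOH"}, "outputs": {"NaCl"}}),
    ("reacts", {"inputs": {"HNO3", "NaOH"}, "outputs": {"NaNO3"}}),
    ("reacts", {"inputs": {"H2SO4", "NaOH"}, "outputs": {"Na2SO4"}}),
]

def form_classes(facts, predicate, attribute, anchor):
    """Collect the values that differ across facts sharing the same
    predicate and the same anchor value for the given attribute, and
    state a pattern with class names in place of the differing values."""
    reactors, results = set(), set()
    for pred, attrs in facts:
        if pred == predicate and anchor in attrs[attribute]:
            reactors |= attrs["inputs"] - {anchor}
            results |= attrs["outputs"]
    pattern = (predicate,
               {"inputs": {anchor, anchor + "-reactors"},
                "outputs": {anchor + "-results"}})
    return reactors, results, pattern

reactors, results, pattern = form_classes(facts, "reacts", "inputs", "NaOH")
print(sorted(reactors))  # ['H2SO4', 'HCl', 'HNO3']
print(sorted(results))   # ['Na2SO4', 'NaCl', 'NaNO3']
```

The returned pattern corresponds to (reacts (inputs NaOH NaOH-reactors) (outputs NaOH-results)) from the example above.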

Discovery Systems - Preliminary Conclusions

Undoubtedly, our brief description of both systems trivializes some of their features and obscures the motivation behind various aspects of their design. An interested reader can find the details of their construction and testing in Langley (1978, 1981); Bradshaw et al. (1980); Langley et al. (1983, 1983a, 1987). Nevertheless, several important features of computer systems of discovery can be seen from our examples. First, the requirements of computer programming bring discovery strategies to the level of detail, concreteness, and constructivity of principles necessary for running the programs on the computer. Second, the programs are general both in terms of the broad range of data they accept and the discoveries they make. They are general, or perhaps universal, in the same sense that logic is: each provides a formalism that applies to data in any language in a class of languages that have a syntax admissible by the program. Third, a discovery system may be divided into rules of inference and the control mechanism that guides their application. The control mechanism lets the program sail safely through failures and successes in inferences, does bookkeeping, and changes the tasks when the appropriate time comes, thus letting the systems cumulate and generalize their discoveries. Fourth, BACON and GLAUBER apply the same inference mechanism both to the initial facts and to the intermediate results on all levels of abstracting their concepts and laws. This is possible because both facts and laws have the same syntax, and because there is no formal difference between concepts based on direct observation and the concepts created in the abstraction process.

Discovery Systems and Heuristic Search

GLAUBER, BACON and other systems of discovery can be viewed as carrying out a search through a space that includes possible laws, concepts, and other entities entertained by scientists. From the programming point of view the search is guided both by the program and by the data it operates on. From the point of view of artificial intelligence principles, the programs use heuristic search. Their heuristics comprise knowledge about the discovery process, and include both heuristics that make local moves by application of rules of inference, and heuristics that affect global control strategies.

Why is search crucial in problem solving, and why may heuristics be so helpful? If we do not know a method that allows us to solve the problem directly, we search, trying various possibilities. The less we know, the more we must grope in the dark. But even in blindfolded wandering there may be a method, one which may even guarantee that the solution will be found at last. We can at least be systematic, not trying the same place twice and not skipping others. Such a control strategy is treated as a very weak method. The less we know, the weaker the methods we use, for strong methods are then inapplicable (they would lead us astray when their presuppositions are not satisfied). The weaker the methods we use, the longer it generally takes to find a solution. For problems even of moderate size an exhaustive, systematic search through the space of possible states may take weeks of computer time; hence the need for cutting down search. This need is satisfied by providing the program with heuristics. Some heuristics propose the most likely moves from the current place in the space of states while disregarding some other moves. Other heuristics may select the most suitable search strategies.

BACON and GLAUBER apply a variety of heuristics. In particular, they use data to direct their search. If BACON fails to find a linear dependency between two variables, it tests whether one changes monotonically with the growth of the other. If this is the case, the system considers a new concept, the ratio of the two variables, in the hope that the new concept will be linearly related to other concepts available to the system. For an inversely monotonic relation, the product of the two variables is created. By continuing in this direction BACON quickly discovers, for instance, Kepler's third law, R^3/T^2 = const. GLAUBER's search for regularities is also guided by data. For instance, its FORM-LAW heuristic picks a fact and creates a schema by fixing one attribute and one of its values. Then the system searches through the rest of the data for facts that match this schema. In this way no schema is considered that was not originated by the initial data.
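The chain of term creations that leads to Kepler's third law can be traced on noiseless data. The trace below is our simplified reconstruction of the heuristic sequence (variables that grow together yield a ratio term; inversely related variables yield a product term); the actual system finds this sequence by search rather than by a fixed script, and the units are chosen so the constant comes out as 1.

```python
from statistics import pstdev

R = [1.0, 4.0, 9.0]            # orbital radii (arbitrary units)
T = [r ** 1.5 for r in R]      # periods satisfying Kepler's third law

def increases_together(xs, ys):
    """True if ys grows as xs grows (xs assumed sorted ascending)."""
    return all(b > a for a, b in zip(ys, ys[1:]))

# T grows with R but not linearly -> create the ratio term T/R.
t1 = [t / r for t, r in zip(T, R)]
assert increases_together(R, t1)

# T/R still grows with R -> create the ratio (T/R)/R = T/R^2.
t2 = [a / r for a, r in zip(t1, R)]

# T/R^2 falls as T/R grows (inverse relation) -> create the
# product (T/R) * (T/R^2) = T^2/R^3.
assert not increases_together(t1, t2)
t3 = [a * b for a, b in zip(t1, t2)]

print(t3, pstdev(t3))  # t3 is constant: T^2/R^3 = const
```

The recursion stops when a term's values are (nearly) constant, which is exactly the condition for announcing a law.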

Another technique used for cutting down search is problem decomposition (divide and conquer). If the problem can be decomposed into a collection of simpler problems, then the search for the final solution reduces to a sequence of searches for partial solutions. Partial solutions are easier to find, and they are usually combined until the final solution is reached. Both BACON and GLAUBER apply this method. Moreover, the systems keep cumulating their findings as long as they can, and the final solution each of them reaches is the maximal theory that can be abstracted from the given data with the inference capabilities of the system. Both systems may be called knowledge-cumulating systems, whose partial solutions are laws and concepts of lesser generality, which may later be abandoned for their lack of importance within the collection of final laws and concepts. Which concepts and laws are the most valuable may only become clear at the end of the search process.

Some properties of search go beyond the technical description of the program. BACON complies with the principle of simplicity of knowledge by the particular order in which it applies its heuristics. Simpler possibilities are tried first, and if they succeed, more complex competitors are not considered at all. The principle of simplicity is observed in an implicit way, detectable by meta-analysis of the system. It is analogous to similar properties of logical systems.


Heuristics are criteria for selection and priority. They may be incorporated in search programs in a variety of ways: in control processes that select the node in the search tree from which the search is to continue, or that mark certain nodes as unpromising; in processes that select the operator to be applied to a node; in restriction of the operators to a subset that promises especially rapid progress to the goal; or in processes that evaluate progress.

Discovery Systems: Search and Logic

Since heuristics and operators change one state into another, they are analogous to inference rules in logic. Deductive inference rules in logic are operators that guarantee the deductive validity, and hence the non-creativity, of transformations. Heuristics, in a broad sense, do not guarantee deductive validity or completeness, but instead they increase the efficiency of search.

TABLE I.

  Logic                                    Search
  ---------------------------------------  -----------------------------------------
  Space of syntactically correct formulas  Search space (space of possible
                                           knowledge states)
  Inference rules                          Operators, heuristic rules
  Logical consequence of a formula         Point reached in search space from
                                           a given state
  Set of consequences of a premise         Search tree from a given state
  Decidability                             Computational complexity

The analogy between formal logic and heuristic search may help in understanding what type of logic of discovery lies implicit in the computer systems of discovery. Table I, which describes the analogy, can obviously be extended to other concepts relevant to search and logic. The axioms and inference rules of a logic define a set of reachable consequences, but do not determine the order in which they will be reached. A logic can therefore be viewed as a non-deterministic algorithm. By a non-deterministic algorithm we understand here one that makes all possible inferences permitted from a given state (the formal automata-theory notion of indeterminism) rather than one that selects one of the inferences randomly and that can make different inferences at different times (the physical notion of indeterminism). A system of heuristic search is deterministic. The order in which points will be reached in the search space is determined by heuristics and operators that generate new points from the points already reached, and by a control structure that determines to which point in the space of states they will be supplied.

    point in the space of states they will be supplied. In other words, a logic, viewed as a non-deterministic algorithm,

    can be supplemented by heuristics to determine which rules of in ference will be applied and a control structure to determine where

    they will be applied (i.e., which premises already proven will be taken as the inputs to the inference rules). Thus, an automatic theorem proving system consists of a logic combined with a superimposed system of heuristics and a control structure. Of course this superim posed structure may be very simple

    -

    say, a breadth-first search. Or, it

    may, in the interest of efficiency, be very complex -

    say a best-first search using some kind of an evaluation function to determine to

    which of the already-proven theorems the next operator should be

    applied. The ordering of operators and the control structure may be desig

    ned to respond not only to issues of efficiency, but also to con siderations of decidability, completeness, consistency, validity of con

    sequences, and so on. The same criteria can be applied to the heuristics and control structure of any heuristic search system.
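The division of labor just described (inference rules as operators, a heuristic evaluation function, a control structure that picks the next state) fits in a short generic skeleton. The sketch below is an illustration on a toy string-rewriting "logic", not a real theorem prover; the evaluation function and the rewriting rules are assumptions of ours.

```python
import heapq

def best_first(start, goal, operators, score):
    """Generic best-first control structure: repeatedly expand the
    lowest-scoring state with every applicable operator."""
    frontier = [(score(start), start)]
    seen = {start}
    while frontier:
        _, state = heapq.heappop(frontier)
        if state == goal:
            return state
        for op in operators:
            for nxt in op(state):
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(frontier, (score(nxt), nxt))
    return None

# Toy "logic": states are strings; each inference rule appends a symbol.
operators = [lambda s: [s + "a"], lambda s: [s + "b"]]
goal = "abab"

# Heuristic evaluation: prefixes of the goal look promising.
def score(s):
    return len(goal) - len(s) if goal.startswith(s) else len(goal) + len(s)

print(best_first("a", goal, operators, score))  # abab
```

Swapping `score` for a constant function turns the same control structure into breadth-first search, which illustrates the point that the logic (the operators) stays fixed while the superimposed control varies.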

Heuristics and Inference Rules

We have earlier characterized some of the rules in the discovery systems as rules of inductive inference. Let us examine, in logical terms, the inferences they make.

BACON's Heuristics

The premises of FIND-LINEAR-REGULARITY are a collection of statements that tell what values of the dependent variable y go together with what values of the independent variable x1. For instance, "if x1 = 2 then y = 7", "if x1 = 3 then y = 9". The conclusion of the FIND-LINEAR-REGULARITY heuristic is a statement of the form

(1) Eab(a = 2 & b = 3 & Ax1 Ey ax1 + b = y).

This conclusion is an inductive generalization, since it is intensionally not limited to the input data. In addition, two new quantitative concepts are created, a and b, with their values of 2 and 3 respectively. Both the premises and the conclusion (1) are only a partial description of the knowledge state that BACON associates with them, for several other independent variables have their values fixed. Even if these variables are not involved in the inference explicitly, since BACON keeps this information in another place, they are needed for a logical reconstruction of BACON's reasoning, in particular for combining different inferences. Thus, the full conclusion has the form

(2) Ex2, x3, ..., xn (x2 = m2 & ... & xn = mn & Eab(a = 2 & b = 3 & Ax1 Ey ax1 + b = y))

where m2, ..., mn are numbers. On the next level of BACON's abstraction, regularities that link x2, a, and b are considered, and if the search is successful, conclusions similar to (2) are found.

While combining the statement (2) with regularities for a and b on the next higher level of BACON's abstraction, we obtain a statement

Ex3 ... xn (x3 = m3 & ... & xn = mn & Ecdef(c = 1, d = 1, e = 2, f = 1 & Ax2 Eab(a = cx2 + d & b = ex2 + f & Ax1 Ey ax1 + b = y))).

This conclusion can be transformed to a form which omits reference to the sequence of discovery steps: Ex3 ... xn (x3 = m3 & ... & xn = mn & Ax2, x1 Ey y = (x2 + 1)x1 + (2x2 + 1)). This result may be further combined, in a similar way, into a regularity that includes the variables x3, x4, ..., xn.

Hintikka in his (1988) considers AE-statements of the form AxEyPxy as initial facts, and argues that they need not be further decomposed into atomic facts. It is interesting to notice that the intermediate conclusions of BACON are of the same form as such AE-facts. Since BACON uses the same strategy of reasoning for the bottom-level "atomic" facts and for the intermediate results, Hintikka's initial facts can be used as initial facts for BACON, too.

We can add a few words of justification to Hintikka's claim. If "pluto is a dog" is an atomic fact, as philosophers generally agree, then, considering what we know about processing complexity in computer vision and pattern recognition, we can agree that AE-statements may be facts, too, if their scope is limited to the duration of observation.

GLAUBER's Heuristics

The FORM-LAW heuristic of GLAUBER produces conclusions very similar in their logical structure to BACON's FIND-LINEAR. They can be represented in first-order logic in the form of a pair of statements

Ax Ey Pxyc and Ay Ex Pxyc,

in the case of two or more classes of differing values, x and y, and one object c in common. They can be represented in the form

Ax Pxa

in the case of one class of members x, and one object a in common. These regularities can be combined, in a way similar to BACON's, on the higher levels of GLAUBER's recursion. In this way a law (Ax)(Ay)(Ez) Rxyz can be produced in two steps. An example of the last schema is: "For all acids and all alkalis there is a salt which is their product of reaction". All these inferences, as well as BACON's inferences, can be classified as inductive schemes. They start with particular facts and conclude with generalizations that are applicable to other facts and for that reason are subject to falsification.

Heuristics and Transfer of Justification

So far we have demonstrated that the discoveries of BACON and GLAUBER are the conclusions of inductive schemes, while the initial premises are the initial facts provided to both systems. Justification is transferred across these schemes from the premises to the conclusions. In principle, the same schema may be used to justify conclusions given that the premises are justified prior to the inference, or to justify premises given the conclusions. For a deductive scheme of inference the first situation is called deduction, the second, reduction.

Initial data are taken to be true by BACON and GLAUBER. Although we can conceive of systems that may reject some data (the STAHL system; Zytkow and Simon 1986; Langley et al. 1987), we do not consider such a possibility in this paper. Conclusions of inductive inferences are less certain than the premises, as additional facts can contradict them. For exactly the same reason, conclusions have larger empirical contents. This refers only to possible facts as opposed to known facts, because if the latter contradict the conclusion then an incorrect schema of inductive inference must have been used.

However, as implemented, BACON and GLAUBER do not insist on exact fits to the laws they find. BACON accepts small deviations of observed values from theoretical values, and GLAUBER permits some exceptions from its generalizations. In both cases, the precision required is controlled by parameters in the programs.
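A parameter-controlled acceptance test of this kind can be made concrete in a few lines. The function name, the relative-tolerance criterion, and the threshold value below are illustrative assumptions; BACON's actual parameters differ across versions of the system.

```python
def accepts(observed, predicted, tolerance=0.02):
    """Accept a law if every observed value lies within a relative
    tolerance of the value the law predicts (BACON-style slack)."""
    return all(abs(o - p) <= tolerance * max(abs(p), 1e-12)
               for o, p in zip(observed, predicted))

predicted = [5.0, 7.0, 9.0]       # from the law y = 2*x1 + 3
noisy     = [5.02, 6.95, 9.05]    # small measurement error: accepted
bad       = [5.0, 8.5, 9.0]       # one gross deviation: rejected

print(accepts(noisy, predicted))  # True
print(accepts(bad, predicted))    # False
```

Raising or lowering `tolerance` plays the role of the precision parameters mentioned above.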

Bootstrap Confirmation in Discovery Systems

Scientific laws, some of them at least, play the double role of a definition and a genuine empirical regularity. Can a law which is used as a definition also be empirically confirmed? The bootstrap confirmation criterion (Glymour 1980) justifies an affirmative answer to this question. Bootstrap confirmation requires that empirical situations are possible such that the values defined (computed) on the basis of the tested hypothesis can eventually contradict the hypothesis. The bootstrap confirmation criterion can be used to justify some inferences in discovery systems. In BACON, the criterion can justify the schema for the introduction of intrinsic variables.

Consider rediscovering a simple version of Ohm's law (we adapt here an example from Langley et al. 1987, chapter 4). Experiments consist in arranging a simple circuit (a battery and a resistor) and measuring the current in the circuit (I, a dependent numerical variable). Let the two independent variables under the control of the experimenter be voltage (V, a numerical variable) and the choice of the resistor wire (W, a nominal variable). In the initial sequence of experiments let voltage be fixed at v1, and let the experimenter vary the resistor wire, each time measuring the resultant current. Since W is nominal, BACON introduces a new numerical variable that we will call C (the conductance, the reciprocal of the resistance), and defines its values for all the wires by the values of the corresponding currents.

Following this, BACON applies its FIND-LINEAR-REGULARITY heuristic and discovers that I is linearly related to C with a slope (S) of 1.0 and an intercept of zero. This result is tautological in view of the way in which the values of C were determined.

But later, when the experimenter varies the next independent variable V, while using the same collection of wires, the findings of the system are subject to genuine justification. The system uses the same value of C for the same wire placed in whatever circuit. For the values v2 and v3 of V, the slopes will be s2 and s3, respectively. Then, the application of FIND-LINEAR-REGULARITY to V and S yields S = aV. We may notice that a = 1/v1. Taken together, S = aV and I = SC are equivalent to Ohm's law. If some values of I, however, were different from the actual values, Ohm's law would not be discovered. With three batteries and three resistors the example includes nine facts. Three of them are used for defining the values of C. Another is used to calculate s2, and thus to propose the proportionality between S and V. But all the other facts provide confirmation for the discovered regularities.

In our discussion of justification in discovery systems, we have seen that the very process of discovery, along the lines of the systems we have described, involves testing tentative generalizations against the available data. By the time a law has been discovered, it has also been demonstrated to be consistent with these data. Hence, a large part of the verification of any scientific law takes place during the process of discovering it. Further verification involves new data that may be obtained subsequent to the discovery. Of course, whatever the level of verification attained at the time of discovery, there is no guarantee that new data will not invalidate the law that has been found.

    Norms in Discovery Processes

    There is little doubt that discoveries can be studied as facts, although some philosophers of science are sure (Popper 1959, p. 32) that such an enterprise is hopeless. But can discoveries be studied as norms? The answer is definitely affirmative. Some scientists do better work than others. Some courses and some textbooks teach better methods than others. These intuitive evaluations can be studied and the methods reconstructed as discovery systems. For the purpose of our analysis we may distinguish three types of norms, which we will call MUST, MAY, and SHOULD norms.

    1. MUST norms specify things that have to be done in order to accomplish the goals. For instance: "In order for a statement to be considered a theorem, a proof MUST be produced".

    2. MAY norms are weaker and allow for a variety of ways of reaching the goal. A procedure that searches for the proof embodies a MAY norm if either this or another procedure MAY be used in order to accomplish the goal.

    Scientific procedures are largely of the MAY type because almost any scientific objective can be achieved in many ways. Among them there are better and worse methods, and methods so poor that they seem futile to pursue even if better ones are not known.

    3. Usually there is no basis for choosing the optimal method; methods are applied that experience has shown are "good enough". The "good-enough" methods are called satisficing (Simon 1956, p. 129). We may call them SHOULD methods, too, and they are the primary targets for discovery systems:

    The normative theory of discovery processes can be viewed as a branch of the theory of computational complexity. Given a class of computational problems, we wish to discover algorithms that are most efficient on average; or, if we cannot find the most efficient, at least to discover some that make good use of the information available in the problem situation (Simon 1973, p. 477).

    Of course, when SHOULD methods are improved, the improved versions supersede the old ones. In terms of our earlier comparison between search and logic, MUST norms correspond to logic as a non-deterministic algorithm, MAY norms correspond to various superimposed systems of heuristics and control, while SHOULD norms correspond to satisficing theorem-provers that are relatively efficient.

    In characterizing the different types of norms we related norms to goals. Goals are crucial to the validation of norms. What is the source of goals for discovery systems? One approach concentrates on scientific practice and attempts to reconstruct the goals of real science (scientists) and the relations among them. Law and model generation, and concept and instrument creation, belong to this category of activity.


    Another approach starts from goals stated in Plato's fashion. Those may be truth, simplicity, justification, and so forth. They need to be interpreted in order to become constructive. In this process a favorite logical definition of truth, justification, and the like can be used. Normative discovery systems may use both approaches. We are more inclined towards the first type of goals, but may admit the use of the second type.

    Within the comparison between logic and search, goals delimit search spaces, while additional requirements of efficiency are satisfied by arranging for selective heuristic search.

    Normative Discovery Systems: Research Paradigm

    Experience with discovery systems provides a research paradigm for the normative theory of discovery. This paradigm belongs both to artificial intelligence and to the philosophy of science.

    1. Take one or a few scientific goals at a time and construct a computer system that reaches these goals. Do not try to develop a program that will satisfy the goals under all possible circumstances. Rather, make up a reasonable substitute for the general goal. For instance, the goal of finding a law is understood by BACON as the goal of finding a law that can be decomposed into linear regularities.

    Schemes of inference and control strategies for discovery systems can be abstracted from case studies of particular discoveries or from the advice of particular scientists, including ourselves.

    2. Let the program modify its path in response to data. The program becomes more efficient this way, because it can adapt its computation to the particular situation described by the data. This can be done by instantiating an operator on the basis of the data (GLAUBER does this with its FIND-LAW operator), or by making the choice of the operator depend on the data (BACON does this when it creates a new concept). Such a program is called data-driven. A data-driven program may be responsive not only to the input data but also to the intermediate data or theories it has inferred.
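The contrast can be illustrated with a toy control loop whose next operator is chosen by inspecting the current data rather than by a fixed schedule. The operator names and the dispatch rule here are hypothetical illustrations, not the actual operators of BACON or GLAUBER.

```python
# Toy illustration (not BACON's or GLAUBER's actual code) of data-driven
# control: the next operator is selected by examining the data themselves.
def constant_like(values, tol=1e-6):
    """True if all values agree within a tolerance."""
    return max(values) - min(values) <= tol

def choose_operator(xs, ys):
    """Pick the next operator based on the shape of the data."""
    if constant_like(ys):
        return "note-constant"         # y does not vary: record a constant law
    ratios = [y / x for x, y in zip(xs, ys) if x != 0]
    if constant_like(ratios):
        return "note-proportionality"  # y/x is constant: record y = k*x
    return "define-new-term"           # otherwise create a term and recurse

print(choose_operator([1, 2, 3], [5, 5, 5]))   # note-constant
print(choose_operator([1, 2, 3], [2, 4, 6]))   # note-proportionality
```

The same data-driven dispatch can be applied recursively to newly defined terms, which is how intermediate results feed back into the control of the search.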


    3. Validate the normative system by investigating the range of goals it can reach and its efficiency in attaining those goals. Validation employs a combination of experimentation and theoretical analysis. Theoretical analysis can tell, for instance, the range of functional forms that a law can take. In complex situations, however, we are usually not sure whether a given theoretical abstraction of the actual computer discovery system is adequate to the system itself. It is usually much easier, and very enlightening, to let the system run on different bodies of initial data and to watch its performance. One of the major advantages of having a computer system is the capability of examining it experimentally. We can let the system run on a variety of data while using the same program. We can let the program "forget" the output of former computations, each time starting from a "tabula rasa" state of knowledge, which is impossible in experiments with humans. We can vary the program and compare the results obtained under different versions. All these tests may contribute to the evaluation of the system.

    Individual components of the system may also be subject to validation. A concrete rule of inference that generates particular periodic patterns is justified by a requirement for periodicity heuristics. Periodicity heuristics, in turn, along with monotonicity, symmetry, conservation heuristics, and so forth, are justified by a requirement for pattern (regularity) detection.

    Evaluation criteria may vary from one system to another, as these systems differ in their tasks. Whatever acceptance criteria we use for the purpose of evaluating a given system, we admit that these criteria are neither necessary (the same goals may be achieved in different ways, allowing for many MAY norms) nor sufficient (the realistic criteria are subject to exceptions). What we are really trying to evaluate is whether the system's processes are "conducive" to discovering regularities.

    The normative approach to discovery consists, therefore, of two complementary parts: systems of discovery and sets of criteria for judging the efficacy and efficiency of these systems.

    4. Cumulate systems of discovery into a larger system by allowing them to interact with each other. This procedure is best illustrated by a number of examples in (Langley et al. 1987, chapter 9). Another way of developing a system of discovery is by responding to its limitations and imbedding it into a larger system that overcomes some of them. In this way BACON is imbedded in the FAHRENHEIT system, which augments BACON with a search for the scope of laws. In doing this FAHRENHEIT uses BACON both to find a law and to find the scope of that law.

    5. Reconstruct the scientific system of goals. Each individual system of discovery takes care of limited goals. If the individual systems are combined together, the need for understanding the interrelations between goals becomes crucial. Clarity of goals is also crucial for the validation of normative systems. In the descriptive perspective, clarity of goals is crucial for a comprehensive understanding of science in its variety.

    Long experience in philosophy of science, psychology, and other disciplines reveals that going beyond the analysis of single goals or small systems of goals presents formidable research problems. Here the aid of computer simulation may be essential, and by the experimental study of systems of discovery we may come to understand complex systems of goals better.

    Parts of the goal/subgoal hierarchy in science may be generally applicable to science as a whole. For instance, the goal of extending the scope of a law to a new range of a variable may call for the construction of measuring devices that work in the new range. But there are domain-dependent goal/subgoal relations, too. Thus the tissue-slice method may be called within a particular biochemical context, the liquid electrode method is relevant to analytical chemistry, and so forth. The existence (maybe even prevalence) of domain-dependent goals adds to the complexity of the scientific goal structure.

    6. Do not separate discovery from justification. Discovery and justification processes are intertwined in science. An attempt at separating them in the manner of hypothetico-deductivism would handicap the performance of discovery systems.

    Of course, justification and discovery can be separated conceptually, and the systems of discovery may be studied from both perspectives. Such an analysis, when applied to BACON and GLAUBER, demonstrates that regularities are discovered at the same time they are justified.

    Since a typical discovery process proceeds in a sequence of steps, which may be very long, and since each successive step may be guided by the information accumulated by the previous steps, by the time the process is completed the law that has been discovered will already have been tested against the evidence and found consistent with it.

    Hence, there is a considerable overlap between discovery and justification processes. In fact, justification is needed only to match a law against new evidence that had not been considered as part of the discovery process (Simon 1973, p. 475).

    The relation between discovery and justification may be described in Bayesian language: each step in the discovery process, since it involves confrontation of the emerging hypothesis with data, increases the probability of the hypothesis. Its probability at the moment when the discovery process has been completed may therefore be taken as the prior probability in a subsequent Bayesian confirmation process.
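A small numerical sketch of this reading: each datum seen during discovery updates the probability of the hypothesis by Bayes' rule, and the posterior at the end of discovery serves as the prior for subsequent confirmation. The likelihood values below are invented purely for illustration.

```python
# Illustrative sketch: sequential Bayesian updating of a hypothesis H.
# Each datum observed during discovery shifts P(H); the posterior after
# discovery becomes the prior for later confirmation. Numbers are made up.
def update(prior, likelihood_h, likelihood_not_h):
    """One Bayes step: P(H|D) from P(H), P(D|H), and P(D|~H)."""
    numerator = prior * likelihood_h
    return numerator / (numerator + (1 - prior) * likelihood_not_h)

p = 0.1  # initial probability of the hypothesis
for _ in range(5):           # five data points seen during discovery
    p = update(p, 0.9, 0.3)  # each fits H well and fits ~H poorly
prior_for_confirmation = p   # the "prior" for post-discovery testing
print(round(prior_for_confirmation, 3))
```

On these made-up numbers the hypothesis already stands at high probability when discovery ends, which is the paper's point: most of the confirming work has been done before any separate "justification" phase begins.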

    Descriptive and Normative View of Discovery Systems

    There have been many attempts throughout history to codify the scientific method. There are the well-known accounts of Bacon, Newton, and Mill, as well as hundreds of accounts of scientific method in textbooks of all times. The main problem with all these studies is that they leave a gap between the advice they provide and the application of this advice in the practice of making discoveries. A computer program incorporates a strategy of discovery in a detailed, procedural way, so that, when it is executed, it actually makes discoveries. Computers make the execution efficient and exact. This distinguishes the new approach to discovery from the contemplative manner in which we can use the descriptive principles of the traditional advice.

    In addition to the normative discovery paradigm, another paradigm also stems from the systems of discovery, a paradigm that is useful for cognitive science and for the sciences of science. The latter paradigm differs from the normative one because it aims at different goals. The normative paradigm deals with efficient software that makes discoveries, while the descriptive one deals with software that models the reasoning of an epoch (history of science) or the reasoning of an individual (cognitive modeling). Both the descriptive and the normative paradigms can lead to the same system of discovery when we study highly productive scientists.


    The system created in this way may reproduce the scientific work, thereby being justified descriptively. The same system may also describe the best available way of making discoveries, therefore being justified from the normative perspective. Of course, we are very far from creating a system that would model a scientist in the multitude of tasks he performs and in the variety of means he uses. The same system may also play both roles in the early stage of development of the two research programs. The early systems of discovery neither grasp a large scope of scientific activities (and are not good norms for this reason), nor reconstruct human scientific activity in great detail. They perform both tasks equally well - or equally badly.

    Can the same method be an adequate (true) description of the method of a particular scientist or a particular epoch, and at the same time be a valid norm? Certainly the same method can be examined from both perspectives: descriptive and normative. We can confront the method with historical facts about science or about particular scientists to see how closely it conforms to those facts. On the other hand, we can confront it with scientific goals and see what progress towards the goals this method offers. The two perspectives are independent, and a method that satisfies one may not satisfy the other. However, if we reconstruct adequately the method of a good scientist, both normative and descriptive criteria will be satisfied.

    The distinction between facts and values corresponds to the distinction between disciplines that study the facts of real science (psychology, sociology, economics, history), and disciplines that study the validity of the scientific method (philosophy of science, logic, epistemology). We will return to this distinction in the second part of the paper in our discussion of the contexts of justification and discovery.

    In fact, the existing systems of discovery deal more with norms than with facts. They usually demonstrate how scientific discoveries might have been made or may be made, and rarely claim that they model the actual path of discovery.

    Conclusions

    So far we have presented an existence proof for normative systems of discovery by the construction of such systems and by the extrapolation of them into a paradigm for further research. Since the best existence proofs are by construction, do we need to justify our claim further?

    A typical objection to the existing systems is that they are merely a remote imitation of real-life discovering capabilities. Before we examine this claim, let us note that even if self-contained systems capable of making impressive discoveries belong to the remote future, philosophers of science and even scientists can benefit from the existing systems. Let us recall the analogy with the formal notion of proof. This notion is only remotely akin to real proofs in mathematics, and the existing theorem provers are second-rate imitations of the mathematicians' capability of proving. But the notion of formal proof and related logical concepts exerted a considerable influence on the philosophy of science and on the foundations of mathematics. The normative systems of discovery are ready to play a similar role. Further, even if self-contained systems are yet to be developed, the existing systems support the normative paradigm well enough to claim that normative systems of discovery are a legitimate program within the philosophy of science.

    Are our systems a remote imitation of discovering capabilities? Yes and no. If we consider the whole variety of actions that a capable scientist may perform, our systems are far behind. But if we consider limited scientific tasks in their areas of strength, they are doing no worse, and are perhaps doing better, than real-life scientists. When BACON faces a particular collection of data, it discovers Kepler's laws or Black's law (including the notion of specific heat) in a more efficient and elegant way than Kepler or Black did. GLAUBER does not yield to Johann Glauber in developing the regularity involving acids, alkalis, and salts.

    Additional features can be, and are being, added to the existing systems, and we do not see limits to the range of discoveries they will be able to make. The method used in the development of discovery systems is not a search for the philosophers' stone but rather the mundane work of gradual improvement, which adds more features and increases the capability of the systems. What systems would satisfy the opponents of the normative approach to discovery we cannot say. But refusal to accept discovery systems that actually make discoveries is merely dogmatism.

    2. EVALUATING THE EXISTING STEREOTYPES

    Finally, we wish to discuss two stereotypes in the philosophy of science that claim that the ecological niche for normative systems of discovery does not exist. These are the stereotypes of "the logic of discovery" and "the context of discovery".

    "Logic of Discovery" vs. Discovery Systems

    What is a logic of discovery? In the narrow sense it may be represented as an inference machine such that, given any data, it mechanically applies rules of inference and comes up with a theory to explain these data (Hempel 1966, p. 14). In addition, inductive schemes of inference were required to be logically justified (Popper 1959, p. 28), and to produce a unique, best theory. In the tradition of twentieth-century philosophy of science such a mechanism was claimed to be non-existent and impossible (Popper 1959, p. 29; Hempel 1966, p. 15).

    Our comparison of the concepts related to heuristic search with logical concepts demonstrates that discovery systems can be considered in logical terms and that they can be parts of the logic of discovery, if only we relax the criteria for such a logic. Three criteria need to be lowered: the non-fallibility (logical validity) of inferences, the existence of a unique, best solution, and the applicability of one system to all data.

    In the logical tradition, the label of "validity" is often limited to an infallible rule of inference or to a system of such rules. We do not intend to seek non-fallible inductive results, as it is unreasonable to seek the impossible. We treat validity as a relative notion: there are more and less valid systems. Even if it is not clear how a given system can be improved, it is easy to make it worse. Thus we see room for inductive logic even if we do not share Francis Bacon's overly optimistic belief that there is a method that leads infallibly to scientific truth, without producing errors. In a similar way we reject the task of seeking unique and best conclusions. In doing this we follow the attitude that dominates the actual practice of science. Science solves the problems of uniqueness and non-fallibility by ignoring them.

    Finally, systems of discovery will work fine on some types of data and will fail on others. Scientists and science learn only very slowly, over centuries, how to use new forms of data and how to seek new types of regularities that summarize them.

    Our analysis of existing systems of discovery demonstrates the relationship between norms that can be used in making discoveries and rules of inference. These norms do not guarantee discovery (what logic could do that?), nor do they guarantee maximum efficiency in making a discovery. It would be unreasonable to expect a logic of discovery to meet either of these criteria. What a logic of discovery should do, and what we believe has been done by several systems of discovery, is to provide heuristic rules of procedure that would constitute good advice for someone desiring to make a discovery. A logic of discovery is a description of a reasonable scientist - hence a set of norms for doing science.

    Context of Discovery vs. Context of Justification

    The context of discovery was for a long time exorcised from the philosophy of science and "generously" passed on to psychologists, when it was not dismissed entirely as a nonscientific topic (Popper 1959, p. 32). Even if talking about discovery has become increasingly popular in the philosophy of science, it is mostly talk about discoveries actually made (usually reconstructions of historical cases), not about the normative aspect of discovery processes. It was the normative aspect that was denied its place under the sun by a great number of philosophers in the tradition of logical positivism. Even if the existing systems of discovery and the paradigm they create make a strong enough argument for themselves, the vigor with which the context of discovery was rejected in the first place is an interesting phenomenon to explain. Why were the arguments against the context of discovery so widely agreed on? What breach in these arguments is caused by the existing discovery systems? In order to address these questions we will first recall the problem and the argument.

    The Distinction, the Thesis, the Argument, and the Reasons

    The simple and tempting distinction occurs repeatedly in the positivistically oriented philosophical literature. Let us recall it in a quotation from Feigl:

    It is one thing to ask how we arrive at our scientific knowledge claims... it is another thing to ask what sort of evidence and what general, objective rules and standards govern the testing, the confirmation or disconfirmation and the acceptance or rejection of knowledge claims of science (Feigl 1965, p. 472).

    and in the passage from Reichenbach:

    ... the tendency to remain in correspondence with actual thinking must be separated from the tendency to obtain valid thinking; and so we have to distinguish between the descriptive and the critical task (Reichenbach 1938, p. 7).

    Different conceptual distinctions were applied in the two quotations: the first opposes justification to discovery, while the second confronts norms with descriptions. The two pairs of concepts, used together, determine the borderline between the context of discovery and the context of justification. The distinction between the two contexts is followed by the claim that only the context of justification is of interest to philosophy. For Reichenbach, and for many others, "... epistemology is only occupied in constructing the context of justification" (Reichenbach 1938, p. 7). This dominant claim is illustrated in Table II.

TABLE II.

|                                 | Discovered                                         | Justified                                  |
| ------------------------------- | -------------------------------------------------- | ------------------------------------------ |
| How are laws (descriptions)     | (A) Psychology: context of discovery               | (B) Psychology                             |
| How should laws be (norms)      | (C) Logic of discovery: non-existent, not possible | (D) Epistemology: context of justification |

    The arguments for the emptiness of (C) are essentially the arguments against the logic of discovery that we discussed in a previous section. The area (C) was believed to be empty both actually and potentially, because the logic of discovery seemed impossible. Another important observation refers to (D). In order to keep the intended distinction between the two contexts, it is important to distinguish justification processes from the context of justification. In a traditional example, finding a proof belongs to the context of discovery while checking a proof belongs to the context of justification. But even checking a proof may involve creative work. Actual proofs in mathematics virtually never satisfy the standard of proof proclaimed by formal logic, and require creative thinking to be understood. This is not an isolated example. Most justification processes belong to the context of discovery. Take, for instance, Popper's urge for severe tests of hypotheses. Designing severe tests is nothing less than discovering new experimental arrangements, and although this belongs to the process of justification, it certainly does not belong to the context of justification. Similarly important in the justification processes is the discovery of new instruments, new methods of preparation, etc.

    In conclusion, if the area (D) is determined by the question "how should laws be justified?", then it only partially belongs to the context of justification, because it also partially belongs to the context of discovery. The two contexts are intertwined, and the scope of the justification context is very limited. It is reduced to tests that answer the question "Is such and such justification valid?", and is destitute of constructive steps. Why then, if so limited, has the context of justification dominated logic and the philosophy of science? And why was the context of discovery banned? Let us address both questions in sequence.

    Several reasons made the context of justification attractive. It was based on an attractive domain, formal logic. This promised formal, abstract, precise, analytical, difficult, and general results - all scientifically prestigious. By concentrating on the logical foundations of justification, philosophers could abstract from the processes of scientific justification and avoid criticism of their results by scientists. This gave them exclusive possession of the new field. Since the exciting developments in the foundations of mathematics yielded by formal logic could also be applied to research in the foundations of science, some philosophers believed that they might even tell scientists whether they were wrong or right in their justification procedures. Finally, the domain of expertise was comfortably limited. The philosopher was supposed to enter the stage only after all pieces of the solution were arranged for his inspection. Someone else had to organize all the pieces of the puzzle, and the philosopher was supposed to check whether the proposed solution was really a solution. To be prepared for this job, the philosopher had to carry out a prior task: finding the examination criteria and the way to apply them.

    In contrast, the context of discovery hardly offered any payoffs. Attempted reconstructions of the creative process were embarrassingly far from satisfying the scientific standards of formalism, generality of principles, and testability. No respected discipline offered a source of transfer of ideas and solutions. The idea of a logic of discovery was dismissed. While the scope of attention was conveniently narrow and well determined for work within the context of justification, the area of focus required to explain or to reconstruct any given discovery seemed overwhelmingly broad. By narrowing it down we risk the explanation being ad hoc, since we consider only the relevant facts. But how do we know what is relevant before the discovery has been made?

    But an even more principled problem was seen. Big discoveries that particularly attracted attention and influenced the image of discovery were said to go beyond the limits of the existing conceptual structures and beyond the existing ways of thinking. Thus, no matter how big the framework in which a discovery was considered, the discovery consists in going beyond that framework.

    Reasons for Rehabilitation

    The arguments against the context of discovery apparently have lost their power, while at the same time attractive features of the context of justification have become applicable to the context of discovery. In fact, discovery systems in the form of computer programs may be exciting to people who like detail, precision, and formalism - the features that made the context of justification attractive. Work in the area of discovery systems offers autonomy of research, as the areas of artificial intelligence and machine learning to which they belong develop their own methods. The existing systems of discovery are "closed systems", for each of them has clear limits on the data it will handle and the discoveries it can make. However, these systems demonstrate how little is required to simulate some discoveries, and they convince us that it is a practical matter of construction to demonstrate how much (or perhaps how little) complexity in circumstances and associations is enough to create an effective discovery system.

    The change in attitude was brought about by the development of symbolic processing and heuristic search systems on fast computers. Instead of arguing for the possibility of systems of discovery we can now present a constructive proof in the form of running systems. The existing systems are the key argument for the plausibility of the field.

    Even if they represent an early stage of development of discovery systems, enough can be extrapolated from their structure and behavior to create the vision of a research program.

    The new approach is strengthened by the empirical possibilities that it offers, and by the cumulation of programs that may someday go beyond human comprehension. Computer programs are very patient and systematic in what they do - virtues uncharacteristic of humans. Therefore computer programs can test ideas that we humans accept but are not able to follow in accurate detail.

    REFERENCES

Bradshaw, G., P. Langley, and H. A. Simon: 1980, 'BACON 4: The Discovery of Intrinsic Properties', Proceedings of the Third National Conference of the Canadian Society for Computational Studies of Intelligence, pp. 19-25.

Buchanan, B. G. and T. M. Mitchell: 1978, 'Model-Directed Learning of Production Rules', in D. A. Waterman and F. Hayes-Roth (eds.), Pattern-Directed Inference Systems, Academic Press, New York.

Feigenbaum, E. A., B. G. Buchanan, and J. Lederberg: 1971, 'On Generality and Problem Solving: A Case Study Using the DENDRAL Program', Machine Intelligence 6, Edinburgh University Press, Edinburgh.

Feigl, H.: 1965, 'Philosophy of Science', in R. M. Chisholm et al. (eds.), Philosophy, Prentice-Hall, Englewood Cliffs.

Glymour, C.: 1980, Theory and Evidence, Princeton University Press, Princeton.

Hempel, C. G.: 1966, Philosophy of Natural Science, Prentice-Hall, Englewood Cliffs.

Hintikka, J.: 1988, 'What is the Logic of Experimental Inquiry?', Synthese 74, next issue.

Langley, P. W.: 1978, 'BACON 1: A General Discovery System', Proceedings of the Second National Conference of the Canadian Society for Computational Studies of Intelligence, pp. 173-80.

Langley, P. W.: 1981, 'Data-Driven Discovery of Physical Laws', Cognitive Science 5, 31-54.

Langley, P., G. Bradshaw, and H. A. Simon: 1983, 'Rediscovering Chemistry with BACON 4', in R. S. Michalski, J. G. Carbonell, and T. M. Mitchell (eds.), Machine Learning: An Artificial Intelligence Approach, Tioga Press, Palo Alto.

Langley, P., J. M. Zytkow, G. Bradshaw, and H. A. Simon: 1983a, 'Three Facets of Scientific Discovery', Proceedings of the Eighth International Joint Conference on Artificial Intelligence.

Langley, P. W., H. A. Simon, G. Bradshaw, and J. M. Zytkow: 1987, Scientific Discovery: Computational Explorations of the Creative Processes, MIT Press, Cambridge, Mass.

Lenat, D. B.: 1977, 'Automated Theory Formation in Mathematics', Proceedings of the Fifth International Joint Conference on Artificial Intelligence, pp. 833-42.

Lenat, D. B.: 1983, 'The Role of Heuristics in Learning by Discovery: Three Case Studies', in R. S. Michalski, J. G. Carbonell, and T. M. Mitchell (eds.), Machine Learning: An Artificial Intelligence Approach, Tioga Press, Palo Alto.

Michalski, R. S. and R. E. Stepp: 1981, 'An Application of AI Techniques to Structuring Objects into an Optimal Conceptual Hierarchy', Proceedings of the Seventh International Joint Conference on Artificial Intelligence, pp. 460-65.

Popper, K.: 1959, The Logic of Scientific Discovery, Basic Books, New York.

Reichenbach, H.: 1938, Experience and Prediction, University of Chicago Press, Chicago.

Simon, H. A.: 1956, 'Rational Choice and the Structure of the Environment', Psychological Review 63, 129-38.

Simon, H. A.: 1973, 'Does Scientific Discovery Have a Logic?', Philosophy of Science 40, 471-80.

Zytkow, J. M. and H. A. Simon: 1986, 'A Theory of Historical Discovery: The Construction of Componential Models', Machine Learning 1, 107-36.

Computer Science Department
Wichita State University
Wichita, Kansas 67208
U.S.A.

and

Department of Psychology
Carnegie-Mellon University
Pittsburgh, Pennsylvania 15213
U.S.A.
