
Reasoning with uncertain information

D. Pang, BSc(Eng)
J. Bigham, BSc, DipStat, MSc, PhD
Prof. E.H. Mamdani, BE, MSc, PhD, CEng, MIEE, SenMemIEEE

Indexing terms: Computer applications, Control equipment and applications, Information and communication theory, Logic

Abstract: Handling of uncertainty of various kinds is one of the key issues in expert systems. The first generation of expert systems has addressed this issue in a variety of imaginative ways. However, all these techniques have theoretical deficiencies which have been commented on in the literature. The literature also contains other innovative ideas on how this problem may be tackled. It is clear that the problem itself is very complex and does not appear to be capable of any general solution. The paper surveys the literature on reasoning with uncertain information and, in the course of doing so, articulates some of the underlying questions.


1 Introduction

Expert systems technology has recently received much attention from industry. The highly specialised knowledge of many science and engineering domains is often possessed by just a few experts. It would be advantageous to be able to mechanise the activities of experts in solving their domain problems. One of the difficulties of implementing such systems is that, in many real-world problems, even the experts do not have complete understanding of the complex domains. Nevertheless, they are often able to form heuristics that are derived not from first principles but rather from their experience or some abstract mental models.

To exhibit expert behaviour, an expert system must be able to represent the uncertain heuristics and perform inferences with them. A knowledge representation scheme must be used that can encode uncertain knowledge, and the inference strategies that are used must have provisions for handling uncertainties and conflicts. Many inference techniques for handling uncertainties have been tried out in various expert system applications. There is a growing literature that reports their performance. In this paper we shall present a survey of some of these techniques.

2 Uncertainty

Classical logic, by its very nature, ignores the problem of uncertainties. All the propositions in such logical systems are either true (T) or false (F). The implication relation between an antecedent p and a consequent q is defined by the following truth table and is therefore equivalent to ~p ∨ q as well as to its contrapositive ~q → ~p:

Paper 5417D (C4) received 21st October 1986
The authors are with the Department of Electrical & Electronic Engineering, Queen Mary College, University of London, Mile End Road, London E1 4NS, United Kingdom

p  q  |  p → q
T  T  |  T
T  F  |  F
F  T  |  T
F  F  |  T
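The equivalence is easy to check mechanically. The following short Python sketch (ours, purely illustrative) enumerates the four truth assignments and confirms that p → q, ~p ∨ q and ~q → ~p coincide:

# Verify that p -> q, ~p or q, and the contrapositive ~q -> ~p
# agree on every truth assignment.
def implies(p, q):
    return (not p) or q

for p in (True, False):
    for q in (True, False):
        assert implies(p, q) == ((not p) or q) == implies(not q, not p)
        print(p, q, implies(p, q))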

Inference is often based on deduction using a rule of inference called modus ponens, which allows the consequent of an implication to be concluded if the antecedent is true:

p → q
p
∴ q

A derivative of modus ponens is modus tollens:

p → q
~q
∴ ~p

This follows from the fact that p → q is logically equivalent to ~q → ~p.

Inference in an expert system has to apply domain knowledge that is generally true to a set of facts pertaining to a particular situation, in order to generate beliefs about the situation. If the knowledge is certain and there is complete information on the facts, then we can base the inference strategy on classical logic. But expert systems almost always need to deal with some kind of uncertainty or other, and how the different techniques presented here address the problem will be discussed in the following Section.

2.1 Expressing uncertain knowledge

To be able to express uncertain knowledge, we need a scheme which allows a proposition to have a truth value other than true and false. One approach is to extend the truth space between definite truth and falsity and to allow truth values to be defined within this interval. Such a truth value can either be a numerical value between 0 and 1, representing a degree, or a qualitative label, such as 'fairly true', which is defined as a partition of the truth space. An alternative approach is to stick to the dichotomy, but to include the use of modal operators [1] to qualify the state of the truth of a proposition. Examples are the 'L' and 'M' operators, where 'Lp' and 'Mp' are interpreted as p is necessarily true and p is possibly true, respectively. Haack [2] and Turner [3] provide more detailed discussions on the philosophical and mathematical aspects of these approaches than is possible here.

Both methods have been used to extend the scope of logic to represent and deal with uncertain information.


However, it is important that the interpretation of the numerical values and the modal operators is clear and consistent throughout the inference system. Otherwise, the validity of the inferences cannot be guaranteed.

2.2 The difference between implication and if-then rules

In rule-based expert systems, expert knowledge is represented as if-then rules, which are usually interpreted as implications. However, there are differences between the classical implication operator and the relationship between the antecedent and the consequent expressed in a heuristic rule. For a start, heuristics use a weaker form of implication, so that, even if the antecedent is definitely true, the consequent does not have to follow. Associated with each rule is a strength of the correlation between its condition and its consequent, which may be quantified or qualified. Also, when we generalise the implication relation to accommodate uncertainty, the deductive equivalence of an implication and its contrapositive no longer holds. Rescher [4] showed that the two degrees of support can be significantly different. Take the statement 'Most women work' as an example. It does not mean that most people who do not work are men.

2.3 Using rules of inference to update belief in conclusions

The inference system must also provide rules of inference for drawing conclusions about the consequent given information about the antecedent. They must take into account situations where the evidence itself is uncertain (as in example 1 below), where the evidence does not exactly match the antecedent of the rule (as in example 2), or where the evidence may not even be available (as in example 3); in each case the uncertainty in the conclusion should be updated to reflect this:

rule: if the patient's parents have suffered from angina, then the patient is very likely to have heart disease

example 1
fact: the patient believes, but is not very certain, that his parents did suffer from angina

example 2
fact: the patient believes that his parents did have a slight problem with their heart, but it was not so serious as to suggest angina

example 3
fact: the patient has no idea whether his parents have suffered from heart disease or not

2.4 Conflict resolution

In classical logic, if the conclusions from two deductions are in conflict, the theory is said to be inconsistent, and anything can be validly concluded. Unlike deductions, inferences drawn from heuristics are generally uncertain, and, although uncertain conclusions may be conflicting, they do not necessarily lead to an inconsistency. Consider the following example:

if bird then can-fly
if animal then cannot-fly

These two rules are certainly correct, and they should be interpreted as rules that are generally but not always true. Given that Tweety is a bird as well as an animal, one does not arrive at an inconsistent situation. This is because, no matter whether one chooses to believe that Tweety can fly or cannot fly, the whole theory is still consistent. Thinking in terms of possibility, there is nothing wrong in believing that Tweety possibly can fly and at the same time that Tweety possibly cannot fly.

This raises a question that uncertain inference systems have to address that would not occur in a classical deduction system; namely, when there is conflicting evidence, how should the conflict be resolved? In the above example, because all birds are animals, it is obvious that the belief that Tweety can fly should be adopted. Such domain knowledge as 'all birds are animals' can either be represented explicitly, so that the conflict resolution strategy takes note of the relationship when it attempts to select the most specific rule, or be embedded into default rules as explicit exceptions, e.g.

if animal UNLESS bird then cannot-fly

A discussion of resolving conflict using specificity can be found in Touretzky [5]. The problem becomes more difficult when the relationship between the antecedents is not one of total subsumption, and the task of resolution is not just a matter of choosing between rules. One approach that has been taken is to make assumptions about the evidence, such as conditional independence, and the various pieces of evidence are combined to show the overall effect.
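As a minimal sketch of specificity-based selection (our illustration; the encoding of rules as antecedent sets is an assumption, not Touretzky's formulation), a rule is preferred when no other applicable rule has a strictly more specific antecedent:

# Specificity-based conflict resolution: among the applicable rules,
# discard any rule for which a strictly more specific rule also applies.
rules = [
    ({"animal"}, "cannot-fly"),
    ({"animal", "bird"}, "can-fly"),   # 'bird implies animal' made explicit
]

def most_specific(facts, rules):
    applicable = [r for r in rules if r[0] <= facts]
    return [r for r in applicable
            if not any(r[0] < other[0] for other in applicable)]

print(most_specific({"animal", "bird"}, rules))
# -> [({'animal', 'bird'}, 'can-fly')]: Tweety is believed to fly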

3 Inference techniques under uncertainty

3.1 Numerical approaches

Most expert systems employ inference techniques under uncertainty that propagate numerical values through the inference chain. The numerical values are usually interpreted as subjective probabilities or degrees of belief. This is because the numerical values corresponding to the strengths of the if-then rules in the knowledge base are obtained from the experts. The inference is inductive rather than deductive in nature, meaning that the inferences cannot be deductively proved but are only plausible. More than half a century ago, Ramsey [6] and Savage [7] recognised the importance of subjective probabilities and inductive inference in modelling human reasoning.

3.1.1 Bayes theorem: We shall start with a numerical technique based on Bayes theorem. Uncertainty in a proposition is represented as a probability between 0 and 1. The theorem states that, if E = {E1, E2, ..., En} is a set of n pieces of evidence and H = {H1, H2, ..., Hm} is a set of m mutually exclusive hypotheses that are under consideration, then

P(Hi | Ej) = P(Ej | Hi) P(Hi) / P(Ej)

with

P(Ej) = Σk P(Ej | Hk) P(Hk)

where P(Hi) is the probability of the hypothesis prior to the knowledge of the evidence, P(Hi | Ej) is the posterior probability of the hypothesis after Ej is observed, and P(Ej | Hi) is the conditional probability of the evidence Ej given the hypothesis Hi.

This formula can be used as a rule of inference in an expert system. Suppose the knowledge base contains the rule 'if Ej then Hi' and Ej is true; then Bayes theorem updates the belief in Hi from P(Hi) to P(Hi | Ej), provided that P(Ej | Hi) and P(Ej) are known.
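As a concrete illustration (ours; the priors and likelihoods are invented), the update can be computed directly over a mutually exclusive and exhaustive hypothesis set:

# Bayes update: posterior(Hi) = P(Ej | Hi) P(Hi) / P(Ej),
# with P(Ej) obtained by summing over all hypotheses.
priors = {"H1": 0.6, "H2": 0.3, "H3": 0.1}          # P(Hi), invented
likelihood = {"H1": 0.8, "H2": 0.1, "H3": 0.4}      # P(Ej | Hi), invented

p_e = sum(likelihood[h] * priors[h] for h in priors)  # P(Ej)
posterior = {h: likelihood[h] * priors[h] / p_e for h in priors}
print(posterior)   # P(H1 | Ej) is about 0.87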

A variant of the Bayes theorem has been incorporated into the inference strategy of PROSPECTOR [8] and AL/X [9]. The modified rule contains two parts:

O(Hi | Ej) = L · O(Hi)    if Ej is observed to be true
O(Hi | ~Ej) = L′ · O(Hi)  if Ej is observed to be false

where

O(Hi) = P(Hi)/(1 − P(Hi))

is the prior odds for Hi, O(Hi | Ej) is the posterior odds for Hi, and

L = P(Ej | Hi)/P(Ej | ~Hi) and L′ = P(~Ej | Hi)/P(~Ej | ~Hi)

are called likelihood ratios.

The expert who supplies the rules in the knowledge base also provides the likelihood ratios, which are used to update the odds for the hypotheses according to whether the evidence is observed to be true or false. The two equations, however, do not say what to do when the evidence is itself uncertain. The solution taken by PROSPECTOR and AL/X, which is not part of the original theorem, is to assume that the effect of the uncertainties in the evidence on the likelihood ratios is piecewise linear. Duda attempted to justify this approach and also discussed some of the problems associated with it [8].

These systems also used a generalised updating formula for aggregating multiple evidence:

O(Hi | E1 & E2 & ... & Ej) = L1 L2 ... Lj · O(Hi)

The assumption behind it is that E1, ..., Ej are conditionally independent of each other under the hypothesis Hi and under its negation. This is a very restrictive assumption for many applications. Moreover, Pednault, Zucker and Muresan [10] studied the assumption and concluded that, if the hypothesis set contains more than two elements, then the independence assumption becomes inconsistent. Snow [11] proved that, under these circumstances, only one piece of evidence can be effective in updating the probabilities of the hypotheses.
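A sketch of the odds-likelihood updating (ours; the numbers are invented) shows how the likelihood ratios of several pieces of evidence, assumed conditionally independent, simply multiply:

# PROSPECTOR-style updating: convert the prior to odds, multiply in one
# likelihood ratio per observed piece of evidence, convert back.
def to_odds(p):
    return p / (1.0 - p)

def to_prob(odds):
    return odds / (1.0 + odds)

prior = 0.1                  # P(Hi), invented
ratios = [4.0, 2.5, 0.8]     # L1, L2, L3 for the observed evidence, invented

odds = to_odds(prior)
for L in ratios:
    odds *= L                # O(Hi | E1 & ... & Ej) = L1 L2 ... Lj O(Hi)

print(to_prob(odds))         # posterior is about 0.47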

One major criticism of the use of subjective probability in this approach is that it is not possible to represent ignorance. It means that, if a piece of evidence is only partially in favour of a hypothesis, it would also have to partially support the negation of that hypothesis so as to satisfy the requirement P(H | E) + P(~H | E) = 1. This is counter-intuitive, as people often distinguish between supporting and refuting evidence. Moreover, probabilities can only be assigned to singleton hypotheses, and they must sum to one. Consider a medical diagnosis problem: if there is evidence pointing to a category of diseases, the expert will have to make an arbitrary choice to subdivide his belief among the individual diseases in the category. Lastly, the expert has to provide prior probabilities for the hypothesis set, which is often a very difficult task.

It must be added that some of these criticisms have been strongly refuted by Cheeseman [12] as the results of misconceptions about probability. For instance, he states that people have argued for separate treatment of confirming and disconfirming evidence because they have ignored prior probabilities.

3.1.2 MYCIN: In MYCIN [13], the uncertainty in a proposition is represented by a numerical value between −1 and 1 called the certainty factor (CF), with +1, −1 and 0 corresponding to complete certainty, complete disbelief and complete ignorance, respectively. The default CF value for a hypothesis is 0, and so there is no need for the expert to supply prior probabilities. CF is to be interpreted as a 'judgmental measure that reflects a level of belief'. It is composed of two measures, MB and MD, representing confirming and disconfirming evidence, with CF = MB − MD.

MYCIN was developed to model what Shortliffe and Buchanan called a logic of confirmation. Basically, they believe that there is a clear distinction between confirming and disconfirming evidence, which increases or decreases the certainty factor of the hypothesis, respectively. Associated with each if-then rule is a measure of support C[h, e], the degree of confirmation of the hypothesis h based on the observation e. A confirming rule for h will only update MB[h], and a disconfirming rule only MD[h]. Uncertain evidence is assumed to have a linear effect on the confirming and disconfirming factors. MB and MD, for each hypothesis, are aggregated separately before being combined to give CF. The rules of aggregation are given as

MB[h, e1&e2] = 0, if MD[h, e1&e2] = 1
             = MB[h, e1] + MB[h, e2](1 − MB[h, e1]), otherwise

MD[h, e1&e2] = 0, if MB[h, e1&e2] = 1
             = MD[h, e1] + MD[h, e2](1 − MD[h, e1]), otherwise

Ishizuka, Fu and Yao [14] have shown that these combining functions still depend on the independence assumption.
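A small sketch of the aggregation (ours; the evidence strengths are invented, and the special case where MB or MD reaches 1 is omitted):

# MYCIN-style aggregation: MB and MD accumulate separately,
# then CF = MB - MD.
def combine(x1, x2):
    # x[h, e1&e2] = x[h, e1] + x[h, e2] * (1 - x[h, e1])
    return x1 + x2 * (1.0 - x1)

mb = 0.0
for strength in (0.4, 0.6):   # two confirming pieces of evidence
    mb = combine(mb, strength)

md = combine(0.0, 0.2)        # one disconfirming piece of evidence

print(mb, md, mb - md)        # 0.76 0.2 0.56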

3.1.3 Plausible inference: The plausible inference approach, proposed by Friedman [15], is very similar to the MYCIN approach. Uncertainty in a proposition is represented as a credibility measure C(h) = I(h) − D(h), where I(h) and D(h) are the combinations of increments and decrements of credibility, respectively.

The major difference is that Friedman uses four measures of strength for each rule, representing the relevance factors in the four modes of inference [16]: confirmation, modus tollens, modus ponens and denial. For a rule relating A and B, the four strengths act as follows:

Confirmation: increment in C(A) when C(B) = 1
Tollens: decrement in C(A) when C(B) = −1
Ponens: increment in C(B) when C(A) = 1
Denial: decrement in C(B) when C(A) = −1


Inference, therefore, can take place in all directions over the inference network. The system employs four sets of rules of inference and aggregation functions corresponding to the four modes of inference, which are basically the same as those used in MYCIN.

3.1.4 Belief theory: Based on Dempster's work on upper and lower probabilities [17], Shafer [18] developed a belief theory to deal with subjective degrees of belief. For a set of mutually exclusive hypotheses H = {H1, H2, ..., Hn}, the theory allows part of the unit belief to be attributed to any subset of H, i.e. to any disjunction of the Hi. The distribution of the belief over the hypothesis set is called a basic probability assignment m, which has to satisfy the following conditions:

Σ m(A) = 1, summing over all subsets A of H, and

m(∅) = 0

The interpretation of the basic probability for a given set of elements is that it is the amount of belief committed exactly to that set, which cannot be subdivided among any of its subsets. For example, if we have to identify a person's nationality from the set H = {English, French, Indian, Chinese}, then m(H) = 1 would indicate total ignorance, because all one can say is that the person is from one of the four countries. The basic probability assignment m({Indian, Chinese}) = 1 would mean that one is certain that the person is an Asian but is not sure whether he is from India or China. A Bayesian would be forced to distribute his belief between the two. The other property of this theory is that, if one attributes part of one's belief to a proposition, the rest of the belief does not have to be assigned to the negation of the proposition.

Continuing the above example, if we know that the person in question has dark hair, we may be willing to believe that he is an Asian to degree 0.8, but we may not be willing to say that he is a European to degree 0.2, as a Bayesian would have to do. Instead, the 0.2 belief can be assigned to the set H, indicating ignorance. In this way, disbelief and ignorance are clearly distinguished in the representational framework.

In this theory, uncertainty in a proposition is characterised by two values: the degrees of belief and plausibility. The former is a measure of the evidence for the proposition, and the latter is defined as 1 minus the measure of evidence against the proposition. In the above example, the degree of belief that the subject is Asian is 0.8 and the plausibility is 1. The degree of belief that he is Chinese, however, is 0, because there is no direct evidence supporting it, even though the degree of plausibility is still 1.
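A short sketch (ours) computes belief and plausibility from a basic probability assignment for the nationality example:

# Bel(A) sums the masses of all sets contained in A;
# Pl(A) sums the masses of all sets intersecting A.
asian = frozenset({"Indian", "Chinese"})
everyone = frozenset({"English", "French", "Indian", "Chinese"})

bpa = {asian: 0.8, everyone: 0.2}    # 0.2 left on H expresses ignorance

def bel(A):
    return sum(m for s, m in bpa.items() if s <= A)

def pl(A):
    return sum(m for s, m in bpa.items() if s & A)

print(bel(asian), pl(asian))                                    # 0.8 1.0
print(bel(frozenset({"Chinese"})), pl(frozenset({"Chinese"})))  # 0.0 1.0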

In this approach, a piece of evidence does not induce a probability distribution over the set of hypotheses, but rather a basic probability assignment. As in the Bayesian approach, the theory does not address the problem of uncertain evidence.

Shafer also proposed a rule of combination for aggregating multiple evidence. The rule still depends on the conditional independence assumption. It also involves a normalisation process, which in effect redistributes the portion of belief left unattributed among the nonempty sets. Zadeh [19] criticised the rule on the grounds that this can produce counter-intuitive results. Yager [20] proposed an alternative rule which puts all the unattributed basic probability into ignorance. Dubois and Prade [21] defended the original rule.
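For concreteness, here is a sketch (ours) of Dempster's rule for two bodies of evidence, including the normalisation step at issue; the masses reproduce the style of counterexample Zadeh raised, where almost all the mass is conflicting:

# Dempster's rule: multiply masses pairwise, keep mass on nonempty
# intersections, discard the conflicting mass and renormalise.
def dempster(m1, m2):
    combined, conflict = {}, 0.0
    for s1, v1 in m1.items():
        for s2, v2 in m2.items():
            inter = s1 & s2
            if inter:
                combined[inter] = combined.get(inter, 0.0) + v1 * v2
            else:
                conflict += v1 * v2        # mass on the empty set
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

m1 = {frozenset({"A"}): 0.9, frozenset({"C"}): 0.1}
m2 = {frozenset({"B"}): 0.9, frozenset({"C"}): 0.1}
print(dempster(m1, m2))   # all belief ends up on {'C'} despite weak support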

The major problem in implementing the belief theory is its computational complexity. For a set of n hypotheses, there can be a maximum of 2^n basic probabilities. Barnett [22] described a linear-time algorithm obtained by imposing the restriction that evidence can only support either a singleton or its negation. Gordon and Shortliffe [23] are investigating an efficient algorithm for the case when the hypotheses form a strict hierarchy and, therefore, only some disjunctions of hypotheses are of interest. Garvey, Lowrance and Fischler [24] are also working on a derivative of the belief theory which they call evidential reasoning. They are able to simplify the computation by calculating only the necessity and plausibility measures of the hypotheses that they are interested in, instead of keeping all the basic probabilities. (Also refer to Smets [25] for his treatment of the belief function.)

3.1.5 INFERNO: This work by Quinlan [26] addresses the over-restrictive assumptions made by many expert system inference techniques. In the INFERNO system, propositions are not assumed to be independent unless this is explicitly specified. Quinlan argued that, although INFERNO may give weaker inferences when compared with others, by making no independence assumption no errors would be generated by the inference system. The interesting point about this approach is that it allows the expert to represent explicitly any knowledge about the relations between the different propositions, and so helps the inference strategy to select the right way to combine evidence.

Quinlan is also critical of the inability of most expert systems to detect inconsistency in the knowledge. In particular, he has argued that normalisation in the Dempster rule is not the right way to resolve conflict when there is strong evidence both for and against a proposition. He believes that this situation is best handled by reporting an inconsistency to the user. Moreover, the expert system should also be able to suggest ways of rectification.

In INFERNO, a proposition p is characterised by two measures: t(p), representing the lower bound on the probability of p, and f(p), representing the lower bound on the probability of ~p. It is similar to MYCIN and the plausible inference system in that evidence for and evidence against are aggregated separately. Where they differ most significantly is that, when t(p) + f(p) > 1, INFERNO reports an inconsistency instead of attempting a combination.

However, it is not clear that t(A) + f(A) > 1 necessarily signals an inconsistency. Recall the Tweety example from the preceding Section. For the sake of illustration, arbitrarily (and sensibly) assign two strengths to the rules:

bird → fly (0.8)
animal → ~fly (0.9)

Let node A denote bird, B denote animal and C denote fly; part of the inference net would then look like this:

A —0.8→ C
B —0.9→ ~C

For t(A) = 1 and t(B) = 1, INFERNO will infer t(C) ≥ 0.8 and f(C) ≥ 0.9, and so report an inconsistency.
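A minimal sketch (ours) of this propagation reproduces the report:

# INFERNO-style lower bounds: t(p) bounds P(p) from below and f(p)
# bounds P(~p) from below; a rule A ->(s) C propagates t(C) >= s * t(A).
t = {"A": 1.0, "B": 1.0, "C": 0.0}   # Tweety is certainly a bird and an animal
f = {"A": 0.0, "B": 0.0, "C": 0.0}

t["C"] = max(t["C"], 0.8 * t["A"])   # bird -> fly (0.8)
f["C"] = max(f["C"], 0.9 * t["B"])   # animal -> ~fly (0.9)

if t["C"] + f["C"] > 1.0:
    print("inconsistency: t(C) + f(C) =", t["C"] + f["C"])   # 1.7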


The difficulty arises because the relationship between birds and animals has not been articulated. There may be more complex situations where latent relationships between the antecedents will not be so easy to recognise and correct.

There is also an implicit assumption made by INFERNO when combining evidence, namely that t(A) and f(A) are monotonically increasing. So, if two pieces of evidence B and C confirm A to, say, degrees 0.6 and 0.8 respectively, t(A) will be incremented to 0.8. If t(A) is to be interpreted as P(A), then surely P(A | B&C) does not have to be greater than the maximum of P(A | B) and P(A | C). In fact, P(A | B&C) is not a function of P(A | B) and P(A | C), and any attempt to make it so involves assumptions of one sort or another.

Another characteristic of INFERNO is that propagation of the truth values can take place in all directions inside the inference network. Unlike Friedman's plausible inference system, the strength used for the modus tollens mode of inference of a rule is calculated from the strength given in that rule for the modus ponens mode of inference, and neither confirmation nor denial is used.

3.1.6 Possibility theory and fuzzy set theory: Dubois and Prade [27] proposed an approach for dealing with uncertainty based on possibility theory [28]. Uncertainty in a proposition is characterised by two numbers in the real interval [0, 1], π(p) and π(~p), the possibility that p is true and the possibility that p is false, such that max(π(p), π(~p)) = 1.

Conditional possibilities are used to represent the strengths of the rules. Combination of evidence is done by applying the minimum operation to the various possibility distributions. It also involves a normalisation process, which always assigns degree 1 to the more dominant of p and ~p. Dubois and Prade [21] have compared this theory with other numerical approaches in great detail elsewhere.
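A small sketch (ours; the possibility values are invented) of combining two sources by min and renormalising so that max(π(p), π(~p)) = 1:

# Possibility measures for p and ~p from two sources, combined by min,
# then renormalised so the more dominant of the two gets possibility 1.
def combine(a, b):
    p = min(a["p"], b["p"])
    not_p = min(a["~p"], b["~p"])
    top = max(p, not_p)
    return {"p": p / top, "~p": not_p / top}

source1 = {"p": 1.0, "~p": 0.3}
source2 = {"p": 0.7, "~p": 1.0}
print(combine(source1, source2))   # {'p': 1.0, '~p': 0.42857...}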

Fuzzy set theory has been used in other expert systems [29]. Fuzzy sets are sets in which elements may have partial membership. They can be used to represent vague concepts such as 'tall', where there is no crisp boundary on the height above which a person is considered to be tall and below which not tall. Ishizuka et al. [14] have incorporated the theory into the Dempster-Shafer approach to allow the expert to give rules that relate vague sets in the antecedents and consequents. Yager [20] also investigated similar ideas on the use of fuzzy set theory in Shafer's belief theory framework [18].
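To make partial membership concrete, a membership function for 'tall' might be sketched as follows (our illustration; the thresholds are invented):

# Degree of membership in the fuzzy set 'tall': 0 below 160 cm,
# 1 above 190 cm, and linearly graded in between.
def tall(height_cm):
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30.0

print(tall(175))   # 0.5: partially a member of 'tall'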

3.2 Symbolic approaches

Recently, some AI researchers have begun to question the use of numbers in expert systems. Experts providing the heuristics in the knowledge base are often reluctant to supply numerical strengths for them. Even if they are willing to give subjective probabilities, there are great doubts as to whether these probabilities are accurate and whether they behave as objective probabilities [30]. Moreover, numerical approaches often depend on very restrictive assumptions, such as conditional independence, that cannot be satisfied in most domains. As a result, new approaches that are symbolic in nature have been proposed.

3.2.1 Linguistic reasoning: Fox [31] proposed a method of using linguistic terms such as 'perhaps', 'couldbe', 'unknown', 'may_be', 'believed', 'definite' and 'improbable' to qualify the truth of a proposition. He built up a hierarchy of these terms to establish the relative meanings among them. The uncertain implications in the if-then rules are made explicit. An example of a rule would look like this:

if symptom is believed to be spots
then diagnosis may be measles

If evidence shows that spots are 'definite', then the rule is fired to conclude that the patient 'may' have measles, but not if the symptom is only 'couldbe' spots. Fox says that his approach is complementary to statistical methods, for terms such as 'probable' and 'improbable' in his vocabulary can indeed be determined by weighting evidence if so desired.
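One way to realise such a hierarchy (our sketch, not Fox's implementation; the ordering of terms is an assumption) is a simple ordering, with a rule firing only when the evidence term is at least as strong as the term named in the rule:

# An assumed linear ordering over a subset of Fox's terms, weakest first.
ORDER = ["unknown", "couldbe", "may_be", "perhaps", "believed", "definite"]

def at_least(term, required):
    return ORDER.index(term) >= ORDER.index(required)

# rule: if symptom is believed to be spots then diagnosis may be measles
for evidence in ("definite", "couldbe"):
    if at_least(evidence, "believed"):
        print(evidence, "-> diagnosis may be measles")   # fires for 'definite' only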

3.2.2 Nonmonotonic logic and truth maintenance systems: When faced with incomplete information, humans usually make assumptions in their reasoning and retract them in the light of new information. Beliefs based on these assumptions are, therefore, liable to be revised. McDermott and Doyle [32] proposed a nonmonotonic logic to model this kind of reasoning, in which old theorems can be invalidated by new axioms. They introduced a modal operator 'M', where 'Mp' is interpreted as 'p is consistent'. McDermott's first logic is based on predicate calculus with the addition of a nonmonotonic rule of inference, which effectively says that Mp can be inferred if ~p is not in the theory, i.e. assume Mp unless p is believed to be false.

They showed how these features can be used to represent defaults and allow inferences to be made even with incomplete information. When there is complete knowledge of the exceptions to an implication, they are explicitly stated:

bird(x) ∧ M(~penguin(x)) ∧ M(~ostrich(x)) → can-fly(x)

which can be used to infer can-fly(x) given bird(x), provided that neither penguin(x) nor ostrich(x) can be proved. If knowledge of the exceptions is not available, the defeasibility of the implication can still be expressed, e.g.

bird(x) ∧ M(can-fly(x)) → can-fly(x)

which, given bird(x), infers can-fly(x) by default unless ~can-fly(x) can be proved. In this way, the logic provides a theory for representing uncertainties caused by incomplete knowledge or incomplete information.

The truth maintenance system (TMS) implemented by Doyle [33] can be used to handle nonmonotonic rules as described here. In this system, a proposition can be in either of two states: IN, which means that there is a noncircular proof for it, or OUT, when there is no such proof. A justification for a proposition consists of a list of propositions that have to be IN and a list of propositions that have to be OUT for it to be valid. So the justification for can-fly would have (bird) as the IN-list and (penguin ostrich) as the OUT-list. By keeping the justification with the proposition, the TMS can bring it IN or OUT according to the states of the propositions it depends on.
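A toy sketch (ours) of the labelling: a proposition is IN when every member of its IN-list is IN and every member of its OUT-list is OUT:

# Toy TMS labelling; a justification is a pair (IN-list, OUT-list).
justifications = {
    "can-fly": (["bird"], ["penguin", "ostrich"]),
    "bird":    ([], []),     # a premise: an empty justification always holds
}

def is_in(p, seen=()):
    if p in seen or p not in justifications:
        return False         # no noncircular proof: OUT
    in_list, out_list = justifications[p]
    return (all(is_in(q, seen + (p,)) for q in in_list)
            and not any(is_in(q, seen + (p,)) for q in out_list))

print(is_in("can-fly"))      # True: bird is IN, penguin and ostrich are OUT
justifications["penguin"] = ([], [])
print(is_in("can-fly"))      # False once penguin comes IN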

There is a significant difference between such a TMS and nonmonotonic logic. When an assumption leads to an inconsistency, it is not retracted in McDermott's logic, e.g. the theory

{MC → D, ~D}

is considered to be inconsistent. The nonmonotonic rule of inference, in the absence of ~C in the theory, infers MC and therefore D, causing the contradiction. In a TMS, a process called dependency-directed backtracking, which is invoked when an inconsistency is detected, tries to retract assumptions so as to maintain global consistency. As a result, ~C will be brought IN. McDermott also discovered other unpleasant properties of the logic and attempted to resolve them by rebasing his logic on various modal logics [34].

A nonmonotonic theory can generate more than one fixed point (a set of mutually consistent theorems). The theory

{A, B, A ∧ MC → ~D, B ∧ MD → ~C}

has two fixed points, {MC, ~D} and {MD, ~C}. Doyle's TMS will arbitrarily select one. The following examples show what a TMS would do to restore consistency by dependency-directed backtracking when there is more than one fixed point:

(1) {MA → C, MB → ~C}:
either ~A or ~B will be brought IN

(2) {MD → A, A → C, MB → ~C}:
~B will be brought IN

The second example also demonstrates a weakness in the interpretation of the IN state in a TMS. The justification of C has only an IN-list and is, therefore, not considered an assumption, although A itself is an assumption. On the other hand, ~C is considered an assumption, and ~B is therefore selected to be brought IN. There is, in general, no way of knowing whether a belief is based on assumptions just by looking at its justification. This arbitrary way of resolving conflict is clearly undesirable.

Since Doyle developed his TMS, other truth maintenance systems have been built [35, 36]. De Kleer called his system an assumption-based truth maintenance system; it allows multiple hypotheses to be investigated simultaneously, without the need to commit to only one of a set of contradictory assumptions.

Doyle [37] argued that it is more difficult for a human expert to produce reliable probabilities than to specify exceptions to a general rule, and proposed a reasoned-assumption approach based on his TMS. Ginsberg [38] has incorporated some form of nonmonotonic reasoning into Shafer's belief theory.

3.2.3 Theory of endorsement: A radical approach to reasoning under uncertainty was proposed by Cohen [39]. It is based on the belief that experts often possess more knowledge about uncertainties than is captured by most expert systems. Instead of giving a numerical value to summarise the uncertainties in a proposition or implication, they can also provide knowledge about the sources of the uncertainties and how they interact. By representing this explicitly, Cohen developed a theory of reasoning about uncertainties.

In this approach, uncertainty in a proposition is represented by an endorsement, which is a record of its justifications together with their sources of uncertainty. Knowledge about the relationships between the different types of uncertainty is encoded as semantic combining rules for combining endorsements. To illustrate this, we shall return to the example of identifying our subject's nationality from the set {English, French, Indian, Chinese}. If we know that the person in question has blonde hair, the endorsement associated with the proposition that the person is Asian may record the negative evidence and the uncertainty that he may have dyed his hair. If we then find out that he lives in China, we may decide to remove the uncertainty about hair dyeing, given that the knowledge that hair dyeing is illegal in China is encoded as a semantic combining rule. Sullivan and Cohen [40] described an endorsement-based plan recognition program.
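A rough sketch (ours; the record layout and the rule are invented for this example) of an endorsement and a semantic combining rule:

# An endorsement records a proposition's justifications together with
# their sources of uncertainty; semantic rules add or remove sources.
endorsement = {
    "proposition": "the person is Asian",
    "evidence": ["blonde hair (negative evidence)"],
    "uncertainty_sources": ["hair may be dyed"],
}

def apply_fact(e, fact):
    # Invented semantic combining rule: hair dyeing is illegal in China,
    # so 'lives in China' removes that source of doubt.
    if fact == "lives in China" and "hair may be dyed" in e["uncertainty_sources"]:
        e["uncertainty_sources"].remove("hair may be dyed")
    e["evidence"].append(fact)
    return e

print(apply_fact(endorsement, "lives in China"))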

The approach, however, is very domain dependent. The theory does not, unfortunately, provide much guidance or any tools for the experts, who have to provide the domain-specific heuristics for combining and ranking endorsements. The approach is also computationally very expensive, owing to the large amount of symbolic manipulation.

4 Conclusion

The survey presented in this paper is by no means exhaustive. Nevertheless, it highlights the difficulties involved in reasoning with uncertain knowledge. The authors do not believe that any one of the techniques described in this paper is the best and should be used for all expert systems. In fact, the existence of any such technique is doubted.

The choice of a technique to build into the inference strategy of an expert system depends on the nature of the application. The designer often has to make tradeoffs between the validity of the system and its performance. The inference techniques that give ranking information tend to use more restrictive assumptions, while those that claim to be more generally valid usually produce weaker inferences. Another factor that affects the choice is how much knowledge can be elicited from the expert. To a certain extent, this also depends on the inference technique to be used. Cohen has demonstrated how knowledge about uncertainty can be used to resolve conflict, but this may not be available, and, in any case, this knowledge is again incomplete, and assumptions will have to be made. Expert system designers should not be too apprehensive about using some of the more restrictive approaches, but they ought to formulate the knowledge so as not to contradict the underlying assumptions. Fox [41] has proposed representing the assumptions of the various approaches as meta-level knowledge, and has used this to select the approach for reasoning according to the circumstances.

In spite of these criticisms, all the above techniques provide something useful, although they fall far short of techniques which may be applied generally. The situation at present appears to be that each domain has to be treated with sufficient care and attention, so that the appropriate techniques may be usefully applied with suitable modifications.

5 Acknowledgment

This work was funded by an Alvey Uncle type project.

6 References

1 HUGHES, G.E., and CRESSWELL, M.J.: 'An introduction to modal logic' (Methuen and Co., London, 1972)

2 HAACK, S.: 'Deviant logics: some philosophical issues' (Cambridge University Press, 1974)

3 TURNER, R.: 'Logics for artificial intelligence' (Ellis Horwood Series in Artificial Intelligence, 1984)


4 RESCHER, N.: 'Hypothetical reasoning' (North-Holland, 1964)

5 TOURETZKY, D.S.: 'Implicit ordering of defaults in inheritance systems'. Proc. Natl. Conf. Artif. Intell., 1984, pp. 322-325

6 RAMSEY, F.P.: 'Truth and probability', in BRAITHWAITE, R.B. (Ed.): 'The foundations of mathematics and other logical essays' (The Humanities Press, New York, 1960)

7 SAVAGE, L.J.: 'The foundations of statistics reconstructed', in KYBURG, H.E. (Ed.): 'Studies in subjective probability' (Wiley, 1961)

8 DUDA, R.O., HART, P.E., and NILSSON, N.J.: 'Subjective Bayesian methods for rule-based inference systems'. AFIPS Conf. Proc., 1976, pp. 1075-1082

9 REITER, J.: 'AL/X: an inference system for probabilistic reasoning'. M.Sc. Thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, 1981

10 PEDNAULT, E.P.D., ZUCKER, S.W., and MURESAN, L.V.: 'On the independence assumption underlying subjective Bayesian updating', Artif. Intell., 1981, 16, pp. 213-222

11 SNOW, P.: 'Unduly severe conditional independence assumptions'. Computer Science Department, Hawthorne College, 1986

12 CHEESEMAN, P.: 'In defence of probability'. Proc. 9th Int. Joint Conf. Artif. Intell., 1985, pp. 1002-1009

13 SHORTLIFFE, E.H., and BUCHANAN, B.G.: 'A model of inexact reasoning in medicine', Math. Biosci., 1975, 23, pp. 351-379

14 ISHIZUKA, M., FU, K.S., and YAO, J.T.P.: 'A theoretical treatment of certainty factor in production systems'. Structural Engineering Report CE-STR-81-6, Purdue University, 1981

15 FRIEDMAN, L.: 'Reasoning by plausible inference'. Proc. 5th Conf. Automated Deduction (Springer-Verlag, Berlin, 1981), pp. 126-142

16 POLYA, G.: 'Patterns of plausible inference' (Princeton University Press, Princeton, 1954)

17 DEMPSTER, A.P.: 'Upper and lower probabilities induced by a multivalued mapping', Ann. Math. Stat., 1967, 38, (2), pp. 325-339

18 SHAFER, G.: 'A mathematical theory of evidence' (Princeton University Press, Princeton, 1976)

19 ZADEH, L.A.: 'On the validity of Dempster's rule of combination of evidence'. Electronics Research Laboratory Memorandum UCB/ERL M79/24, University of California, Berkeley, 1979

20 YAGER, R.R.: 'Generalised probabilities of fuzzy events from fuzzy belief structures', Inf. Sci., 1982, 28, pp. 45-62

21 DUBOIS, D., and PRADE, H.: 'Combination and propagation of uncertainty with belief functions'. Proc. 9th Int. Joint Conf. Artif. Intell., 1985, pp. 111-113

22 BARNETT, J.A.: 'Computational methods for a mathematical theory of evidence'. Proc. 7th Int. Joint Conf. Artif. Intell., 1981, pp. 868-875

23 GORDON, J., and SHORTLIFFE, E.H.: 'A method for managing evidential reasoning in a hierarchical hypothesis space', Artif. Intell., 1985, 26, pp. 323-357

24 GARVEY, T.D., LOWRANCE, J., and FISCHLER, M.A.: 'An inference technique for integrating knowledge from disparate sources'. Proc. 7th Int. Joint Conf. Artif. Intell., 1982, pp. 319-325

25 SMETS, P.: 'Bayes theorem generalised for belief functions' (Université Libre de Bruxelles, Belgium, 1986)

26 QUINLAN, J.R.: 'INFERNO: a cautious approach to uncertain inference', Comput. J., 1983, 26, pp. 255-269

27 DUBOIS, D., and PRADE, H.: 'A simple inference technique for dealing with uncertain facts in terms of possibility'. Proc. Int. Symp. Fuzzy Sets & Syst., Cambridge, 1984

28 ZADEH, L.A.: 'Fuzzy sets as a basis for a theory of possibility', Fuzzy Sets & Syst., 1978, 1, pp. 3-28

29 GRIFFITHS, D., MAMDANI, E.H., and EFSTATHIOU, H.J.: 'Expert systems and fuzzy logic', in BEZDEK, J. (Ed.): 'CDC volume on fuzzy sets and decision analysis' (Marseille, 1985)

30 TVERSKY, A., and KAHNEMAN, D.: 'Judgment under uncertainty: heuristics and biases', Science, 1974, 185, pp. 1124-1131

31 FOX, J.: ''Linguistic' reasoning about beliefs and values'. Proc. Alvey Workshop, May 1984

32 McDERMOTT, D.V., and DOYLE, J.: 'Non-monotonic logic I', Artif. Intell., 1980, 13, pp. 27-40

33 DOYLE, J.: 'A truth maintenance system', Artif. Intell., 1979, 12, pp. 231-272

34 McDERMOTT, D.V.: 'Nonmonotonic logic II: nonmonotonic modal theories', J. Assoc. Comput. Mach., 1982, 29, pp. 33-57

35 McALLESTER, D.: 'An outlook on truth maintenance'. Artificial Intelligence Laboratory, AIM-551, MIT, Cambridge, MA, 1980

36 De KLEER, J.: 'An assumption-based TMS', Artif. Intell., 1986, 28, pp. 127-162

37 DOYLE, J.: 'Methodological simplicity in expert system construction: the case of judgments and reasoned assumptions', AI Mag., Summer 1983, pp. 39-43

38 GINSBERG, M.L.: 'Non-monotonic reasoning using Dempster's rule'. Proc. Natl. Conf. Artif. Intell., 1984, pp. 126-129

39 COHEN, P.R., and GRINBERG, M.R.: 'A framework for heuristic reasoning about uncertainty'. Proc. 8th Int. Joint Conf. Artif. Intell., 1983, pp. 355-357

40 SULLIVAN, M., and COHEN, P.R.: 'An endorsement-based plan recognition program'. Proc. 9th Int. Joint Conf. Artif. Intell., 1985, pp. 475-479

41 FOX, J.: 'Three arguments for extending the framework of probability'. Proc. Workshop on AI & Decision Making, AAAI, Los Angeles, 1985
