KENYATTA UNIVERSITY
INSTITUTE OF OPEN DISTANCE & e-LEARNING
IN COLLABORATION WITH
SCHOOL OF ENGINEERING AND TECHNOLOGY
DEPARTMENT OF COMPUTING AND INFORMATION TECHNOLOGY

SIT 410: KNOWLEDGE BASED SYSTEMS

WRITTEN BY:
EDITED BY: COURSE AUTHOR'S NAME




Copyright Kenyatta University, 2010
All Rights Reserved
Published by: KENYATTA UNIVERSITY PRESS

SIT 410 KNOWLEDGE BASED SYSTEMS

Introduction to artificial intelligence systems and their applications. Strategies for state-space search, both uninformed and heuristic. Knowledge representation: natural language, propositional and predicate logic, semantic networks, frames, rules. Knowledge-based systems; architectures and applications of expert systems. Inference strategies: goal-driven and data-driven. PROLOG.

Course Schedule

Lecture  Topic

1  Overview of Artificial Intelligence: introduction (definitions and history), branches of AI, applications of AI

2 & 3  State space search: uninformed and heuristic search strategies; depth-first, breadth-first, uniform-cost search, the A* search algorithm, greedy search, etc.

4, 5 & 6  KBS development and implementation: knowledge acquisition; knowledge representation schemes/techniques (propositional and predicate logic, rules, frames, semantic networks, etc.); examples of shells for ES implementation

7 & 8  Knowledge-based systems: definitions (knowledge, knowledge representation, inference); components of a KBS; types of KBS (expert systems, rule-based systems, KB DSS, etc.)

9  Reasoning. Inference strategies: forward- and backward-chaining systems

10  PROLOG

References
1. Turban, E., Aronson, J. E., & Liang, T.-P. Decision Support Systems and Intelligent Systems (7th ed.).
2. Gonzalez, A., & Dankel, D. II. Engineering Knowledge-Based Systems: Theory and Practice.
3. Russell, S., & Norvig, P. (2003). Artificial Intelligence: A Modern Approach (2nd ed.). New Jersey: Prentice Hall.

Lecture One
Introduction to Artificial Intelligence (A.I)

Lecture Overview
- Intelligence
- Defining A.I
- A.I Applications

Intelligence
Dictionary definition:
(1) The ability to learn or understand or to deal with new or trying situations: REASON; also, the skilled use of reason.
(2) The ability to apply knowledge to manipulate one's environment or to think abstractly as measured by objective criteria (such as tests).

Defining A.I
There is no agreed definition of the term artificial intelligence. However, various definitions have been proposed; these are considered below.
- AI is a study in which computer systems are made that think like human beings. (Haugeland, 1985; Bellman, 1978)
- AI is a study in which computer systems are made that act like people.
- AI is the art of creating computers that perform functions that require intelligence when performed by people. (Kurzweil, 1990)
- AI is the study of how to make computers do things which, at the moment, people are better at. (Rich & Knight)
- AI is a study in which computers that think rationally are made. (Charniak & McDermott, 1985)
- AI is the study of computations that make it possible to perceive, reason and act. (Winston, 1992)
- AI is the study in which systems that act rationally are made.
- AI is a study that seeks to explain and emulate intelligent behaviour in terms of computational processes. (Schalkoff, 1990)
- AI is a branch of computer science that is concerned with the automation of intelligent behaviour. (Luger & Stubblefield, 1993)

Views of AI fall into four categories: thinking humanly, thinking rationally, acting humanly, acting rationally. The textbook advocates "acting rationally".

Therefore, A.I is the part of computer science concerned with designing intelligent computer systems, that is, computer systems that exhibit the characteristics we associate with intelligence in human behaviour: understanding language, learning, reasoning and solving problems.

A.I Applications
- Autonomous control: e.g. ALVINN
- Knowledge-based systems (KBS), e.g.:
  - Medical diagnosis: MYCIN (1971), a program that could diagnose blood infections. It had 450 rules.
  - Mineral prospecting: PROSPECTOR (1979), a program that worked with geological data. It recommended exploratory drilling sites that proved to have substantial molybdenum deposits.

- Game playing: e.g. Deep Blue
- Data mining
- Problem solving: complex problems, e.g. puzzles, mathematical problems, logistics planning
- Robotics: intelligent systems which can control robots, e.g. surgeon systems
- A.I agents

Branches of A.I
- Machine vision
- Speech synthesis and recognition
- Machine learning
- Robotics
- Natural language understanding
- Problem solving
- Game playing
- Knowledge-based systems
- A.I agents

Intelligent Techniques
Intelligent techniques may be used for:
- Capturing individual and collective knowledge and extending a knowledge base, using artificial intelligence and database technologies
- Capturing tacit knowledge, using expert systems, case-based reasoning, and fuzzy logic
- Knowledge discovery, or discovering underlying, hidden patterns in data sets, using neural networks and data mining
- Generating solutions to highly complex problems, using genetic algorithms
- Automating routine tasks, using intelligent agents

Artificial intelligence (AI) is the effort to develop computer-based systems (both hardware and software) that behave as humans, with the ability to learn languages, accomplish physical tasks, use a perceptual apparatus, and emulate human expertise and decision making.

Expert systems: Require input both from human experts, for defining the knowledge base, and from knowledge engineers, who translate the knowledge into a set of rules.

Fuzzy logic systems:
- Use rule-based logic to represent imprecise or ambiguous values used in human or linguistic categorization, such as defining and comparing terms such as "hot, warm, cool, cold" for use in a temperature control system
- Provide solutions to problems requiring expertise that is difficult to represent in the form of crisp IF-THEN rules

Neural networks:
- Find patterns and relationships in massive amounts of data that would be too complicated and difficult for a human being to analyze
- "Learn" patterns by sifting through data, searching for relationships, building models, and correcting the model's own mistakes over and over again
- Use a large number of sensing and processing nodes that continuously interact with each other
- May be sensitive, and may not perform well with too little or too much data
- Are used in science, medicine, and business primarily to discriminate patterns in massive amounts of data

Genetic algorithms:
- Find optimal solutions to a problem by examining a very large number of possible solutions for that problem
- Use adaptive, evolutionary conceptual models to change and reorganize components of possible solutions to create viable solutions, test their fitness, and discard unlikely solutions
- Use processes such as fitness evaluation, crossover, and mutation to "breed" solutions
- Are useful for dynamic and complex business problems involving hundreds or thousands of variables, such as problems involving engineering design optimization, product design, and monitoring industrial systems

Hybrid AI systems:
- Integrate genetic algorithms, fuzzy logic, neural networks, and expert systems, and are being developed to take advantage of the best features of each technology

Intelligent agents:
- Are software programs that work in the background without direct human intervention
- Carry out specific, repetitive, and predictable tasks
- Use a limited built-in or learned knowledge base to accomplish tasks or make decisions on the user's behalf
- Are used in agent-based modeling applications to model or simulate the behavior of consumers, stock markets, and supply chains, and to predict the spread of epidemics
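The genetic-algorithm process described above (fitness, crossover, mutation) can be sketched in a few lines of Python. This is a toy illustration, not any particular product: the bitstring objective one_max and all parameter values below are invented for the example.

```python
import random

def one_max(bits):
    """Fitness: count of 1-bits (a stand-in for a real objective)."""
    return sum(bits)

def genetic_algorithm(fitness, n_bits=20, pop_size=30, generations=60,
                      mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: pick parents by a tournament of size 2.
        def pick():
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        next_pop = []
        while len(next_pop) < pop_size:
            p1, p2 = pick(), pick()
            # Crossover: single cut point between the two parents.
            cut = rng.randrange(1, n_bits)
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with a small probability.
            child = [(1 - g) if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

best = genetic_algorithm(one_max)
print(one_max(best))   # close to the maximum of 20 for this toy objective
```

Unlikely solutions are "discarded" here simply by losing tournaments and never becoming parents.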

INTELLIGENT AGENTS IN P&G'S SUPPLY CHAIN NETWORK
Intelligent agents are helping Procter & Gamble shorten the replenishment cycles for products such as a box of Tide.

Lecture 2

State space search

1. Introduction
All AI tasks involve searching.

General idea:
- You know the available actions that you could perform to solve your problem.
- You don't know which ones in particular should be used, and in what sequence, in order to obtain a solution.
- You can search through all the possibilities to find one particular sequence of actions that will give a solution.

The scenario:
- Initial state
- Target (goal) state
- A set of intermediate states
- A set of operations that move us from one state to another

The task: find a sequence of operations that will move us from the initial state to the target state. This is solved in terms of searching a graph. The set of all states is called the search space.

Examples:
- Expert systems: find the sequence of rules that will prove the goal (backward chaining)
- Puzzles: find a sequence of actions to solve the puzzle
- Chess: find the sequence of moves that will result in winning the game

Search techniques:
- Uninformed search: exhaustive search (brute-force methods that systematically and exhaustively search all possible paths)
  - Depth-first
  - Breadth-first
- Informed search: heuristic search (uses rules of thumb to guess which paths are likely to lead to a solution)
  - Hill climbing
  - Best-first search
  - A* algorithm

Evaluating a search technique/algorithm:
1. Completeness: is the strategy guaranteed to find a solution?
2. Time complexity: how long does it take to find a solution?
3. Space complexity: how much memory does it take to perform the search?
4. Optimality: does the strategy find the optimal solution where there are several solutions?

2. Graphs and trees
A graph consists of a set of nodes (vertices) with links (edges) between them. A link is usually represented as a pair of nodes connected by the link.
- Undirected graphs: the links do not have orientation.
- Directed graphs: the links have orientation.
Examples:

Path: a sequence of nodes such that every two consecutive nodes are connected by an edge.
Examples: in G1: A B D C A E; in G2: E A B D C.
Note: the sequence A B E A in G2 is not a path, because the edges have orientation and there is no edge B->E; the edge is E->B.

Cycle: a path whose first node equals its last, with no other nodes repeated.
Examples: in G1: A B D C A; in G2: no cycles.

Acyclic graph: a graph without cycles.
Tree: an undirected acyclic graph in which one node is chosen to be the root.

Given a graph and a node:
- Out-going edges: all edges that start at that node
- In-coming edges: all edges that end at that node
- Successors (children): the end nodes of all out-going edges
- Ancestors (parents): the nodes that are the start points of in-coming edges
In undirected graphs the edges are symmetrical, i.e. the notion of child and parent depends on how the graph is traversed.
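These definitions can be made concrete with a small Python sketch using an adjacency list. The figure for G2 is not reproduced in this text, so the edge set below is a plausible reconstruction from the examples given (treat the exact edges as an assumption).

```python
# A directed graph as an adjacency list: node -> list of successor nodes.
# Reconstructed so that E A B D C is a path and there are no cycles.
graph = {
    'A': ['B'],
    'B': ['D'],
    'C': [],
    'D': ['C'],
    'E': ['A', 'B'],
}

def successors(g, node):
    """Children: the end nodes of all out-going edges."""
    return g.get(node, [])

def predecessors(g, node):
    """Parents: the start nodes of all in-coming edges."""
    return [n for n, outs in g.items() if node in outs]

def is_path(g, nodes):
    """A path: every consecutive pair of nodes must be a directed edge."""
    return all(b in g.get(a, []) for a, b in zip(nodes, nodes[1:]))

print(is_path(graph, ['E', 'A', 'B', 'D', 'C']))  # True
print(is_path(graph, ['A', 'B', 'E']))            # False: no edge B->E
print(predecessors(graph, 'B'))                   # ['A', 'E']
```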

3. Exhaustive search

3.1. Breadth-first search
At step i, traverse all nodes at level i.

A. In trees

The order in breadth-first search: A, B, C, D, E, F, G, H, I, J.

B. In graphs: we need to keep a list of visited nodes, as there may be cycles.

Algorithm (using a queue):
1. Queue = [initial_node], FOUND = False.
2. While the queue is not empty and FOUND = False do:
   - Remove the first node N.
   - If N = target node then FOUND = True.
   - Else find all successor nodes of N and put them into the queue.
In essence this is Dijkstra's algorithm for finding the shortest path between two nodes in a graph, specialized to the case where all edges have equal cost.

3.2. Depth-first search
Keep going down one path until you get to a dead end; then back up and try the alternatives.

Algorithm (using a stack):
1. Stack = [initial_node], FOUND = False.
2. While the stack is not empty and FOUND = False do:
   - Remove the top node N.
   - If N = target node then FOUND = True.
   - Else find all successor nodes of N and put them onto the stack.

Search order for the sample tree:
A, B, D (leftmost path)
E (back up through B; explore its middle subtree)
F (back up through B; explore its right subtree)
C, G, I (back up through A; explore its right subtree, and follow the leftmost path)
J (back up through G; explore its right subtree)
H (back up through C; explore its right subtree)

Depth-first search in graphs: keep a list of visited nodes.

3.3. Comparison of depth-first and breadth-first
- In breadth-first, going level by level, the goal is eventually found without backtracking.
- In depth-first, we may reach a dead end, in which case backtracking is performed: returning to a higher node and exploring its successors.
- Length of path: breadth-first finds the shortest path first.
- Memory: depth-first uses less memory.
- Time: if the solution is on a short path, breadth-first is better; if the path is long, depth-first is better.
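Both exhaustive strategies can be sketched in Python; swapping the queue for a stack is essentially the only change. The tree below is a hypothetical one chosen to match the sample tree's traversal orders (the original figure is not reproduced here).

```python
from collections import deque

def breadth_first(graph, start, goal):
    """BFS with a queue; the visited set guards against cycles in graphs."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()          # remove the first node's path
        node = path[-1]
        if node == goal:
            return path
        for succ in graph.get(node, []):
            if succ not in visited:
                visited.add(succ)
                queue.append(path + [succ])
    return None

def depth_first(graph, start, goal):
    """DFS with a stack; backtracking happens implicitly when popping."""
    stack = [[start]]
    visited = set()
    while stack:
        path = stack.pop()              # remove the top node's path
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for succ in reversed(graph.get(node, [])):  # keep left-to-right order
            stack.append(path + [succ])
    return None

# Hypothetical tree: A has children B, C; B has D, E, F; C has G, H; G has I, J.
tree = {'A': ['B', 'C'], 'B': ['D', 'E', 'F'], 'C': ['G', 'H'],
        'G': ['I', 'J']}
print(breadth_first(tree, 'A', 'J'))  # ['A', 'C', 'G', 'J']
print(depth_first(tree, 'A', 'J'))    # ['A', 'C', 'G', 'J']
```

On this tree, depth-first visits nodes in the order A, B, D, E, F, C, G, I, J, matching the order described above.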

Lecture 3Heuristic search

4. Heuristic search
Heuristic search is used to reduce the search space. The basic idea: explore only promising states/paths. We need an evaluation function to estimate each state/path.

4.1. Hill climbing
Basic idea: always head towards a state which is better than the current one.
Example: if you are at town A and you can get to town B and town C (and your target is town D), then you should make a move IF town B or C appears nearer to town D than town A does.

Algorithm:
1. Start with current-state = initial-state.
2. Until current-state = goal-state OR there is no change in current-state do:
   a. Get the successors of the current state and use the evaluation function to assign a score to each successor.
   b. If one of the successors has a better score than the current-state, then set the new current-state to be the successor with the best score.

There is no exhaustive search, so no node list is maintained. There is no problem with loops, as we always move to a better node. Hill climbing terminates when no successor of the current state is better than the current state itself. If a solution is found, it is found very quickly and with minimum memory requirements. However, it is not guaranteed that a solution will be found: the local maxima problem. General hill climbing is only good for a limited class of problems where we have an evaluation function that fairly accurately predicts the actual distance to a solution.

4.2. Best-first search
The evaluation function scores each successor node, and the node with the best score is chosen to be expanded. The algorithm works in a breadth-first manner and keeps a data structure (called an agenda, based on priority queues) of all successors and their scores.

Algorithm:
1. Start with agenda = [initial-state].
2. While the agenda is not empty do:
   a. Pick the best node on the agenda.
   b. If it is the goal node then return with success; otherwise find its successors.
   c. Assign the successor nodes a score using the evaluation function and add the scored nodes to the agenda.

If a node that has been chosen does not lead to a solution, the next "best" node is chosen, so eventually the solution is found. The algorithm always finds a solution, though not guaranteed to be the optimal one.

Comparison with hill climbing:
- Similarity: best-first always chooses the best node.
- Difference: best-first search keeps an agenda as in breadth-first search, and in case of a dead end it will backtrack, choosing the next-best node.
Note: if the evaluation function is very expensive (i.e., it takes a long time to compute a score), the benefits of cutting down on the amount of search may be outweighed by the costs of assigning a score.
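A minimal sketch of both algorithms, assuming a successor function and a scoring function where lower scores mean "closer to the goal". The small graph and its scores are invented to mirror the H/K situation discussed with A*: hill climbing walks into a dead end and stops, while best-first backtracks via its agenda and reaches G.

```python
import heapq

def hill_climbing(successors, score, start, goal):
    """Move to the best-scoring successor; stop at the goal or a local optimum."""
    current = start
    while current != goal:
        succs = successors(current)
        if not succs:
            return None                     # dead end
        best = min(succs, key=score)
        if score(best) >= score(current):
            return None                     # no better neighbour: stuck
        current = best
    return current

def best_first(successors, score, start, goal):
    """Keep an agenda (priority queue) of scored nodes; backtrack on dead ends."""
    agenda = [(score(start), start)]
    visited = set()
    while agenda:
        _, node = heapq.heappop(agenda)     # pick the best node on the agenda
        if node == goal:
            return node
        if node in visited:
            continue
        visited.add(node)
        for succ in successors(node):
            heapq.heappush(agenda, (score(succ), succ))
    return None

# Hypothetical graph and heuristic scores (estimated distance to goal G):
h = {'A': 15, 'H': 5, 'D': 12, 'K': 3, 'F': 7, 'G': 0}
graph = {'A': ['H', 'D'], 'H': ['K', 'F'], 'D': ['G'], 'F': ['G']}
succ = lambda n: graph.get(n, [])
score = lambda n: h[n]

print(hill_climbing(succ, score, 'A', 'G'))  # None: stuck at dead end K
print(best_first(succ, score, 'A', 'G'))     # G (expands A, H, K, F, G)
```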

4.3. The A* algorithm
Best-first search doesn't take into account the cost of the path so far when choosing which node to search from next. A* attempts to find a solution which minimizes the total length or cost of the solution path. The A* algorithm uses an evaluation function that accounts for the cost from the initial state to the current state plus the estimated cost from the current state to the goal state (i.e. the score assigned to the node in consideration):

f(Node) = g(Node) + h(Node)

g(Node): the cost from the initial state to the current node
h(Node): the estimated future cost, i.e. the node score

A* always finds the best solution, provided that h(Node) does not overestimate the future costs. Thus, in the next example, hill climbing will choose node H and will be stuck in node K. Best-first will choose node H, go to K, backtrack to F and will find a path to G: A H F G, though not the optimal one. A* will choose node D, as its total score is 12: the sum of g(D) = 2 plus h(D) = 10.

Example:

Start state: A. Goal state: G.
1. For each node, write down its successors.
2. Draw the search tree that corresponds to the graph.
3. Write the sequence of visited nodes and the cost of the path in:
   a. depth-first search
   b. breadth-first search
   c. hill climbing
   d. best-first search
   e. A*

Revision Questions
1. Explain how the breadth-first and depth-first algorithms work; discuss the advantages and disadvantages of each of them.
2. Explain how the hill-climbing and best-first algorithms work.
3. Explain how the A* algorithm works, and compare it with the best-first algorithm.
4. Given a graph as in the example above, be able to perform the tasks listed in the example.
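The evaluation f(Node) = g(Node) + h(Node) can be sketched as follows. Since the example figure is not reproduced in this text, the graph, edge costs and heuristic values below are assumptions, chosen so that h never overestimates (it is admissible) and so that node D has g(D) = 2 and h(D) = 10 as in the discussion above.

```python
import heapq

def a_star(graph, h, start, goal):
    """graph[n] -> list of (successor, edge_cost); h(n) -> heuristic estimate.
    Expands nodes in order of f = g + h; with an admissible h (one that never
    overestimates), the first path returned for the goal is optimal."""
    agenda = [(h(start), 0, start, [start])]   # (f, g, node, path)
    best_g = {start: 0}
    while agenda:
        f, g, node, path = heapq.heappop(agenda)
        if node == goal:
            return path, g
        for succ, cost in graph.get(node, []):
            g2 = g + cost
            if g2 < best_g.get(succ, float('inf')):   # better route to succ
                best_g[succ] = g2
                heapq.heappush(agenda, (g2 + h(succ), g2, succ, path + [succ]))
    return None, float('inf')

# Hypothetical graph: two routes to G, the cheaper one through D.
graph = {'A': [('D', 2), ('H', 3)],
         'D': [('G', 10)],
         'H': [('F', 4)],
         'F': [('G', 8)]}
heur = {'A': 11, 'D': 10, 'H': 6, 'F': 7, 'G': 0}

path, cost = a_star(graph, lambda n: heur[n], 'A', 'G')
print(path, cost)   # ['A', 'D', 'G'] 12
```

Greedy best-first, guided by h alone, would prefer H (h = 6) over D (h = 10); A* prefers D because f(D) = 2 + 10 = 12 beats the true cost of any route through H.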

Lecture 4

Knowledge Acquisition

Introduction

First, what precisely do we mean by knowledge? We know about data processing and about information processing. The difference is a question of levels.
First, data means just uninterpreted values, e.g. 46.
Second, information means organized values, which can be regarded as having some sense or interpretation, e.g. 46 held as the age field in a personal_details record (data + sense).
Third, knowledge means information which is known to be true (data + sense + knowing).

Types of Knowledge
It is recognized that there are different kinds of knowledge:
- Declarative knowledge: facts.
- Procedural knowledge: how to do things.
- Semantic knowledge: use and meanings of words.
- Conceptual knowledge: abstract knowledge of concepts and relationships between concepts.
- Episodic knowledge: detailed knowledge of particular occurrences or experiences.
- Meta-knowledge: knowledge about knowledge, e.g. how experts actually organise and use their knowledge.
(Of course, not all of these need be involved in every knowledge-based system.)

The main difficulties about knowledge are:
- Its overall unstructured nature
- Its breadth and complexity
Even where knowledge is clearly structured (as in an encyclopaedia, for example), there may still be a practical difficulty over identifying and finding relevant knowledge.

Levels of Knowledge
AI workers (and psychologists) have recognised that there are different levels of knowledge:
- shallow knowledge
- deep knowledge
Shallow knowledge means surface-level information about appearances and behaviour in very specific situations. For example: If the petrol tank is empty then the car will not start.
Typically such knowledge could be in the form of rules of the form IF <condition> THEN <conclusion>.

We might have a lot of shallow knowledge (say about cars) and still have little understanding. Dealing with complex or unfamiliar situations, or giving explanations, may not be easy just on the basis of shallow knowledge.Deep knowledge means knowledge of the internal and causal structure of a context or situation. For example, knowledge of how a car engine works and of what happens inside it.Such knowledge is much harder to represent in a computer. It may involve concepts, relationships, abstractions and analogies.

Sources of Knowledge
Sources of knowledge are extremely varied: books, databases, people. An ES can be built with appropriate means to search in databases. It's people that are the problem!
The knowledge acquisition problem is to elicit and formalise human knowledge and expertise. Human knowledge is not well-structured. Worse, experts may use their knowledge unconsciously. Also, different experts not only may disagree, but also may have wholly different approaches and methods by which they apply their knowledge.
There is a range of methods: manual, semi-automatic, automatic.

Knowledge Acquisition

Manual Knowledge Acquisition
Two kinds of approach:
1. via interviews with experts
2. via observation of experts in action
The knowledge engineer elicits the knowledge from the expert and fits it into some chosen knowledge representation scheme (which the expert will generally not know about).

1. Interviewing is a skill, and much effort has been put into developing interviewing techniques (not just for knowledge acquisition).Interviews may be structured: the interviewer may work to a standardized scheme of questioning. This may be appropriate if the knowledge representation scheme has been previously worked out.Or the interviewing may consist of having the expert talk through his approach to certain particular problems, with prompting from the interviewer.

2. Observation just means noting circumstances which arise and actions taken. The observer may intrude by asking the expert to give his reasons for particular steps, or to think aloud while he is working. Tracking is the jargon word for following the expert's train of thought.The difficulty is that the expert is not generally a knowledge engineer and the knowledge engineer is not generally an expert. There is a gap to be bridged, since neither will know what is of significance to the other.The solution is likely to be to allow the expert to become a knowledge engineer, possibly by giving him/her computer support.Knowledge engineering is itself an expert task. It is clearly possible to envisage an expert system which may assist with it.Meanwhile, let us consider ways in which machine assistance may be brought into the knowledge engineering process.

Semi-Automatic Knowledge Acquisition
Our rule-based shell may be regarded as providing computer support for knowledge acquisition via its Build facility. It does not really elicit knowledge, though.
We look at one technique which can be automated for eliciting knowledge: Repertory Grid Analysis.

Repertory Grid Analysis
Example: Consider the problem of selecting an appropriate programming language for a particular programming task. The first stage of RGA involves the following steps:
1. The expert identifies important objects (e.g. Java, LISP, Cobol, Prolog, Perl, Fortran, C).
2. The expert identifies important attributes of these (e.g. availability, ease of use, training time, orientation).
3. The expert identifies for each attribute a criterion or measure (e.g. for availability: High/Medium/Low; for orientation: Symbolic/General/Numeric).

Once these have been established, the expert is prompted by the following indirect means to impart his expertise:1. The interviewer (or automatic system) repeatedly asks questions about which attributes distinguish some objects from others, perhaps by giving three objects, and asking for an attribute which can distinguish two of them from the third (e.g. for LISP, Prolog and Cobol, two are Symbolic and one is not).2. The interviewer builds up a table (grid) containing numerical ratings for the attributes for each object.3. The expert may then examine the results, and adjust the table if it appears not to be a correct representation of the knowledge.This is a simplified description of the process. Computer systems exist which use this approach to elicit knowledge in a quite sophisticated way (see Turban).
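The interviewer's repeated question in step 1, "which attribute distinguishes two of these three objects from the third?", can be automated with a small sketch. The grid values below are hypothetical ratings using the criteria from step 3 (only two attributes are filled in, for brevity).

```python
# Hypothetical repertory-grid ratings: object -> {attribute: value}.
grid = {
    'LISP':   {'orientation': 'Symbolic', 'availability': 'Medium'},
    'Prolog': {'orientation': 'Symbolic', 'availability': 'Low'},
    'Cobol':  {'orientation': 'General',  'availability': 'High'},
}

def distinguishing_attributes(grid, a, b, c):
    """Attributes on which exactly two of the three objects agree and the
    third differs: the question repeatedly posed during RGA elicitation."""
    result = []
    for attr in grid[a]:
        values = [grid[a][attr], grid[b][attr], grid[c][attr]]
        if len(set(values)) == 2:      # exactly one object stands out
            result.append(attr)
    return result

# For LISP, Prolog and Cobol, two are Symbolic and one is not:
print(distinguishing_attributes(grid, 'LISP', 'Prolog', 'Cobol'))
# ['orientation']
```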

Automatic Knowledge Acquisition
Broadly, this means using a computer program to convert data into knowledge. This process may also be described as learning. We may imagine other situations like the choice of programming language, where a given situation has certain characteristics which will determine a correct decision or action.
The idea is to create general rules from a set of example cases where the correct outcome is known. These cases may be:
- real existing data, or
- the record of the program's own experience, or
- generated by an expert to represent his/her knowledge.
There are automated systems which do this. A well-known one is ID3 (Turban p146). Given a set of cases, it orders the various attributes as to relevance to the outcome, and then builds a decision tree. This tree may then be used to reach a conclusion when we are given a new case with new attribute values.
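The heart of how ID3 "orders the various attributes as to relevance to the outcome" is information gain: the reduction in entropy of the outcome after splitting the cases on an attribute. The sketch below shows only this scoring step, not the full recursive tree building; the example cases are invented for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of outcome labels, in bits."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(cases, attr, target='outcome'):
    """Entropy of the outcome minus its expected entropy after splitting
    on attr; higher gain = more relevant attribute."""
    base = entropy([c[target] for c in cases])
    remainder = 0.0
    for v in {c[attr] for c in cases}:
        subset = [c[target] for c in cases if c[attr] == v]
        remainder += len(subset) / len(cases) * entropy(subset)
    return base - remainder

# Hypothetical example cases where the correct outcome is known:
cases = [
    {'task': 'symbolic', 'team_knows_it': 'yes', 'outcome': 'Prolog'},
    {'task': 'symbolic', 'team_knows_it': 'no',  'outcome': 'Prolog'},
    {'task': 'numeric',  'team_knows_it': 'yes', 'outcome': 'Fortran'},
    {'task': 'numeric',  'team_knows_it': 'no',  'outcome': 'Fortran'},
]

# 'task' alone determines the outcome here, so its gain is maximal (1 bit)
# and ID3 would place it at the root of the decision tree.
for attr in ('task', 'team_knows_it'):
    print(attr, round(information_gain(cases, attr), 3))
```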

Lecture 5 and 6

Knowledge Representation

Introduction
Knowledge: true rational belief (philosophy); or facts, data and relationships (computational view).
Representation: structure + operations; or map + operations; or game layout and rules of play; or abstract data types.
Knowledge representation: a framework for storing and manipulating knowledge; or a set of syntactic and semantic conventions that makes it possible to describe things (Bench-Capon, 1990).
The object of KR is to express knowledge in a computer-tractable form, so that it can be used to help agents perform well. A KR language is defined by two aspects:
- Syntax: describes how to make sentences, i.e. the possible configurations that can constitute sentences.
- Semantics: determines the facts in the world to which the sentences refer, i.e. what the sentences mean.
Inference: the terms inference and reasoning are generally used to cover any process by which conclusions are reached. Logical inference: deduction.

Different knowledge representation schemes/formalisms:
- Natural language
- Frames
- Semantic nets
- Rules
- Logic:
  - Propositional logic (Boolean logic)
  - Predicate logic (first-order logic)

1. Natural Language
Expressiveness of natural language:
- Very expressive; probably everything that can be expressed symbolically can be expressed in natural language (pictures, the content of art, and emotions are often hard to express).
- Probably the most expressive knowledge representation formalism we have.
- Reasoning over it is very complex and hard to model.
Problems with natural language:
- Natural language is often ambiguous.
- Syntax and semantics are not fully understood.
- There is little uniformity in the structure of sentences.

2. Semantic Networks
Originally developed in the early 1960s to represent the meaning of English words. The term dates back to Ross Quillian's Ph.D. thesis (1968), in which he first introduced it as a way of talking about the organization of human semantic memory, or memory for word concepts.
A semantic net is a graph, where the nodes represent concepts and the arcs represent binary relationships between concepts.
Types of relations:
- subclass: the link is named is_a
- member: the link is named is_instance_of
Other relations used depend on the application (e.g. has_parts, likes, etc.).
Property inheritance is the basic inference mechanism for semantic networks.

Example

This network represents the facts that mammals and reptiles are animals, that mammals have heads, that an elephant is a mammal, and that Clyde is a particular elephant.
Inferring facts not explicitly represented: Clyde has a head.
Representational adequacy: there are problems with representing quantifiers (such as "every dog in town has bitten the constable").
Advantages: easy to translate to predicate calculus.
Disadvantages: cannot handle quantifiers; nodes may have confusing roles or meanings; searching may lead to combinatorial explosion; cannot express standard logical connectives; can represent only binary or unary predicates.

Summary:
- A simple way to represent binary relations.
- Uses inheritance via the is_a and is_instance_of relations to infer implicit facts.
- It is difficult to use semantic networks in a fully consistent and meaningful manner.
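A semantic net is just a set of (node, relation, node) triples, and property inheritance is a walk up the is_a / is_instance_of links. The sketch below encodes the Clyde example; the has_part relation name is an assumption consistent with "mammals have heads".

```python
# The Clyde network as binary-relation triples.
facts = [
    ('Mammal',   'is_a',           'Animal'),
    ('Reptile',  'is_a',           'Animal'),
    ('Elephant', 'is_a',           'Mammal'),
    ('Clyde',    'is_instance_of', 'Elephant'),
    ('Mammal',   'has_part',       'Head'),
]

def classes_of(node):
    """Follow is_instance_of / is_a links upwards (property inheritance)."""
    out = []
    frontier = [node]
    while frontier:
        n = frontier.pop()
        for s, rel, o in facts:
            if s == n and rel in ('is_a', 'is_instance_of'):
                out.append(o)
                frontier.append(o)
    return out

def has_part(node, part):
    """True if the node, or any class it inherits from, has the part."""
    for n in [node] + classes_of(node):
        if (n, 'has_part', part) in facts:
            return True
    return False

print(classes_of('Clyde'))        # ['Elephant', 'Mammal', 'Animal']
print(has_part('Clyde', 'Head'))  # True: inferred, not explicitly stored
```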

3. Frames
Frames capture knowledge about typical objects or events, such as a typical bird or a typical restaurant meal. All the information relevant to a particular concept is stored in a single complex entity, called a frame. Frames support inheritance.

Example 1

Mammal
  subclass: Animal
  warm_blooded: yes

Elephant
  subclass: Mammal
  * colour: grey
  * size: large

Clyde
  instance: Elephant
  colour: pink
  owner: Fred

Nellie
  instance: Elephant
  size: small

Components of a frame entity: attribute-value pairs. Attributes (also called slots) are filled with particular values, e.g. attribute colour, value grey.

Types of attributes:
- Definitive attributes: they define the object, distinguishing the particular object from other objects in the same class. For example, having wings might be a definitive feature for birds (is it?).
- Necessary attributes: necessary for every object in the class. For example, a necessary feature of a bird is laying eggs; however, it is not definitive, since reptiles also lay eggs.

Attribute values:
- Typical attribute values: for example, a typical feature of birds is that they fly; however, there are some birds that do not fly. In the example above, "*" is used to indicate attributes that are only true of a typical member of the class, and not necessarily of every member.
- Default attribute values: default values help us fill in the blanks of information about a given object. For example, we assume by default that birds fly, unless stated otherwise.
- Overriding values: some typical features of a given class of objects are not present for certain members of that class; for example, there are birds that do not fly. In such cases we talk about overriding values.

Property inheritance
Inheritance is simple if there is a single parent class and single values for slots. There may be problems in the case of multiple values and several parent classes:
- Multiple values: Elephant has_part: trunk; Mammal has_part: head. Clyde is an elephant. What would be the value of the slot has_part?
- Several parent classes (e.g., Clyde is both an elephant and a circus animal): which parent should he inherit from first?

Slots and procedures
A frame representation can use a procedure to compute the value of a given slot if needed, e.g. the area of a square, given its size.

Example 2

Advantages: frames can cope with missing values; close matches are presented.
Disadvantages: frames have been hard to implement, especially inheritance. Representational adequacy: certain things are difficult to represent, namely negation, disjunction and quantification.
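Example 1 above can be sketched directly as Python dictionaries, with slot lookup climbing the instance/subclass chain and a local value overriding an inherited one (Clyde's pink colour). This is a minimal single-parent sketch; it deliberately sidesteps the multiple-inheritance problems noted above.

```python
# Frames as dictionaries; 'subclass'/'instance' links give inheritance.
frames = {
    'Mammal':   {'subclass': 'Animal', 'warm_blooded': 'yes'},
    'Elephant': {'subclass': 'Mammal', 'colour': 'grey', 'size': 'large'},
    'Clyde':    {'instance': 'Elephant', 'colour': 'pink', 'owner': 'Fred'},
    'Nellie':   {'instance': 'Elephant', 'size': 'small'},
}

def get_slot(name, slot):
    """Look in the frame itself first; otherwise climb the parent chain."""
    frame = frames.get(name)
    while frame is not None:
        if slot in frame:
            return frame[slot]
        parent = frame.get('instance') or frame.get('subclass')
        frame = frames.get(parent)
    return None

print(get_slot('Clyde', 'colour'))        # pink  (overriding value)
print(get_slot('Nellie', 'colour'))       # grey  (inherited typical value)
print(get_slot('Clyde', 'warm_blooded'))  # yes   (inherited two levels up)
```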

4. Rules
Rules are a formalism often used to specify recommendations, give directives or state strategy.
Format: IF <condition> THEN <action>.
Related ideas: rules and the fact base; the conflict set (the set of rules currently able to fire); conflict resolution (deciding which rules to apply).
One of the most popular approaches to knowledge representation is to use production rules, sometimes called IF-THEN rules. They can take various forms, e.g.:
- IF condition THEN action
- IF premise THEN conclusion
- IF propositions p1 and p2 are true THEN proposition p3 is true
Some of the benefits of IF-THEN rules are that they are modular, each defining a relatively small and, at least in principle, independent piece of knowledge. New rules may be added and old ones deleted, usually independently of other rules.
Advantages: easy to use; explanations are possible; they capture heuristics; they can handle uncertainty to some extent.
Disadvantages: they cannot cope with complex associated knowledge; they can grow to an unmanageable size.

Production Rules
Production rules are conditional statements specifying an action to be taken if a certain condition is true. They codify knowledge in the form of premise-action pairs.
Syntax: IF (premise) THEN (action)
Example: IF income is 'standard' AND payment history is 'good', THEN 'approve home loan'.
In knowledge-based systems, rules are based on heuristics or experiential reasoning. Rules can incorporate certain levels of uncertainty. A certainty factor is synonymous with a confidence level, which is a subjective quantification of an expert's judgment.
The premise is a Boolean expression that must evaluate to true for the rule to be applied. The action part of the rule is separated from the premise by the keyword THEN. The action clause consists of a statement or a series of statements separated by ANDs or commas, and is executed if the premise is true.
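A minimal forward-chaining engine makes the premise-action idea concrete: keep firing any rule whose premises all hold until no new facts are added. The home-loan rule is adapted from the example above; the second rule is invented to show chaining.

```python
# Rules as (set of premises, conclusion) pairs.
rules = [
    ({'income is standard', 'payment history is good'}, 'approve home loan'),
    ({'approve home loan'}, 'notify customer'),   # hypothetical follow-on rule
]

def forward_chain(initial_facts, rules):
    """Fire every applicable rule until the fact base stops growing."""
    facts = set(initial_facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # premises <= facts: all premises are present in the fact base.
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

derived = forward_chain({'income is standard', 'payment history is good'},
                        rules)
print('approve home loan' in derived)  # True
print('notify customer' in derived)    # True, via chaining
```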

A rule based system will contain global rules and facts about the knowledge domain covered. During a particular run of the system a database of local knowledge may also be established, relating to the particular case in hand. One of the most widely used tutorial examples of rule based systems is Mycin, an expert system which was designed to assist doctors with the diagnosis and treatment of bacterial infection. It uses the rule based approach and also demonstrates the way in which uncertainty (both in observations and in the reasoning process) may be handled. Mycin was designed to help the doctor to decide whether a patient has a bacterial infection, which organism is responsible, which drug may be appropriate for this infection, and which may be used on the specific patient. The global knowledge base contains facts and rules relating for example symptoms to infections, and the local database will contain particular observations about the patient being examined. A typical rule in Mycin is as follows: IF the identity of the germ is not known with certainty AND the germ is gram-positive AND the morphology of the organism is "rod" AND the germ is aerobicTHEN there is a strong probability (0.8) that the germ is of type enterobacteriacaeNote that a probability or certainty factor (C.F.) is given, reflecting the strength of the original expert's confidence in the inference made in this rule. In other words, the confidence in the conclusion assuming the premises are true. The premises are, in fact, established from observations either in the laboratory or from the patient, and may themselves have an element of uncertainty associated with them. In the above example it may only be known that the germ is aerobic with a probability of 0.5. 
The certainty factor associated with a conclusion in MYCIN is calculated from the certainty factors of the premises, the certainty factor of the rule, and any existing certainty factor for the conclusion if it has already been obtained from other rules. The way in which the knowledge base is used is determined by the inference engine. It is a basic principle of production systems that each rule should be an independent item of knowledge, essentially ignorant of other rules. The inference engine can then simply "fire" a rule at any time its premises are satisfied. If several rules could all fire at once, the inference engine must have a mechanism for "conflict resolution". This may be achieved, for example, by having some predefined order, perhaps on the basis of the strength of the conclusion, or alternatively on the basis of frequency of rule usage. Forward and backward chaining through the rules may be used. The two approaches each have their advantages and disadvantages and in fact answer different types of question. For example, in MYCIN a forward chaining system might answer the question "what do these symptoms suggest?" whereas a backward chaining system might answer the question "does this patient suffer from a pelvic abscess?" In general, rules and goals may need to be constructed differently for forward and backward chaining systems.

5. Propositions
A proposition is a statement that is either true or false.

For example, here are some propositions:

The file is being printed.
The system is ready.
The red light is on.

It is conventional to represent propositions by lower case letters. For example:

p: The file is being printed.
q: The system is ready.
r: The red light is on.

Consider a verbal specification of an alarm system: if the alarm is activated by the temperature monitoring interface, by the flow monitor, or manually, then the bell sounds in the chief supervisor's office and a warning message appears on all supervisors' screens. From the verbal specification alone it is not easy to check whether the alarm works according to the specification. However, suppose that we replace the various statements by letters:

t: the alarm is activated by the temperature monitoring interface
f: the alarm is activated by the flow monitor
m: the alarm is activated manually
b: the bell sounds in the chief supervisor's office
w: a warning message appears on all supervisors' screens

Then, using symbols that we will define shortly, the specification may be rewritten as:

(t ∨ f ∨ m) → (b ∧ w)

It is much easier to see the "structure" of this symbolic statement than that of the verbal one.

Propositions, Connectives, and Compound Propositions
A compound proposition is built from simple propositions using connectives (sometimes also called "operators"), including:

NOT (¬ or ~)
AND (∧)
OR (∨)
IF ... THEN (→)
XOR (⊕)

Examples of compound propositions:
If the system is ready and the red light is on then the file is printed.
(q ∧ r) → p

If the file is not printed then either the red light is not on or the system is not ready.
~p → (~r ∨ ~q)

Either the red light is on and the file is printed, or else the system is not ready.
(r ∧ p) ∨ ~q

Example 1

Define the following propositions:
p: Peter is driving his own car.
a: Andrew is late.
m: Max has caught the bus.

Write the following in symbols:
Either Peter is driving his own car and Andrew is late, or else Max has not caught the bus.
Solution: (p ∧ a) ∨ ~m

Example 2
Define the following propositions:
p: Peter is driving his own car.
a: Andrew is late.
m: Max has caught the bus.

Translate into simple English:
m ∧ (~p ∨ ~a)

Solution: Max has caught the bus and either Peter is not driving his own car or Andrew is not late.

Truth Tables
Often we want to discuss properties or relations common to all propositions. In such a case, rather than stating them for each individual proposition, we use variables representing an arbitrary proposition and state the properties or relations in terms of those variables. Such variables are called propositional variables. Propositional variables are themselves considered propositions, since they represent propositions and behave in the same way. A proposition in general contains a number of variables. For example, (P ∨ Q) contains the variables P and Q, each of which represents an arbitrary proposition. Thus a proposition takes different values depending on the values of its constituent variables. The relationship between the value of a proposition and the values of its constituent variables can be represented by a table that tabulates the value of the proposition for all possible values of its variables; such a table is called a truth table.

For example, the following table shows the relationship between the values of P, Q and P ∨ Q (OR):

P | Q | P ∨ Q
F | F | F
F | T | T
T | F | T
T | T | T

In the table, F represents the truth value false and T true. The table shows that P ∨ Q is false if P and Q are both false, and true in all other cases.

Meaning of the Connectives: NOT, AND, OR, IMPLIES, IF AND ONLY IF
Let us define the meaning of the five connectives by showing the relationship between the truth value (i.e. true or false) of composite propositions and those of their component propositions, using truth tables. In the tables, P and Q represent arbitrary propositions, and true and false are represented by T and F, respectively.

NOT

P | ¬P
T | F
F | T

AND

P | Q | P ∧ Q
F | F | F
F | T | F
T | F | F
T | T | T

OR

P | Q | P ∨ Q
F | F | F
F | T | T
T | F | T
T | T | T

IMPLIES

P | Q | P → Q
F | F | T
F | T | T
T | F | F
T | T | T

The NOT table shows that if P is true, then ¬P is false, and that if P is false, then ¬P is true. The AND table shows that (P ∧ Q) is true if both P and Q are true, and false in any other case. Similarly for the rest of the tables.

When P → Q is always true, we express that by P ⇒ Q. That is, P ⇒ Q is used when proposition P always implies proposition Q regardless of the values of the variables in them.

IF AND ONLY IF

P | Q | P ↔ Q
F | F | T
F | T | F
T | F | F
T | T | T

When P ↔ Q is always true, we express that by P ⇔ Q. That is, ⇔ is used when two propositions always take the same value regardless of the values of the variables in them. See Identities for examples of ⇔.

Assignment: Construct the truth table for p → (q ∧ r)

Tautologies and Contradictions
A proposition that is true for every combination of truth values is called a tautology. For example, the proposition (p ∧ q) ∨ (~p ∨ ~q) is a tautology. A proposition that is false for every possible combination of truth values is called a contradiction. For example, the proposition (p ∨ q) ∧ (~p ∧ ~q) is a contradiction.

Logical Equivalence
We often have different, but equivalent, logical expressions; that is, expressions that look different but have the same meaning. It is important to be able to determine whether two given propositions merely sound similar or whether they have exactly the same logical meaning. One way to verify logical equivalence is to use a truth table. We denote logical equivalence by the symbol ≡.

Example: Show that p ≡ (p ∧ q) ∨ (p ∧ ~q)
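Checking logical equivalence by truth table is easy to mechanise: enumerate every row and compare the two expressions. The sketch below (Python used for illustration) checks the classic equivalence p ≡ (p ∧ q) ∨ (p ∧ ¬q).

```python
from itertools import product

def equivalent(f, g, n):
    """Two n-variable propositions are logically equivalent iff they
    agree on every row of the truth table."""
    return all(f(*vals) == g(*vals)
               for vals in product([False, True], repeat=n))

# p  ≡  (p ∧ q) ∨ (p ∧ ¬q)
lhs = lambda p, q: p
rhs = lambda p, q: (p and q) or (p and not q)

print(equivalent(lhs, rhs, 2))  # -> True
```

The same function verifies De Morgan's laws or the distributive laws in one line each.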

The Laws of Propositional Logic

We frequently need to simplify logical expressions or to check whether given logical expressions are logically equivalent. One way to do these things is to use the laws of logic. For our purposes the most important laws are the distributive laws and de Morgan's laws.

Distributive Laws
p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r)
p ∨ (q ∧ r) ≡ (p ∨ q) ∧ (p ∨ r)

De Morgan's Laws
~(p ∧ q) ≡ ~p ∨ ~q
~(p ∨ q) ≡ ~p ∧ ~q

Conditional Statements
In computing we often use conditional statements of the form "If ... then ..."; in other words, if particular conditions are satisfied then certain consequences should follow. There are many ways of expressing this type of statement in English. A proposition of the form "If p then q" is called a conditional statement, and is represented by p → q. The symbolic statement is usually read as "if p then q" or perhaps as "p implies q".

Exercise: Construct the truth table for the following proposition: ((p ∧ q) → r) ∨ (p → r)

The Contrapositive of a Conditional
Given a conditional statement such as p → q, its contrapositive is ~q → ~p. The contrapositive of a conditional is just another way of saying the same thing as the conditional; in other words, a conditional statement and its contrapositive are logically equivalent. When one is true, then so is the other. If one is false, so is the other. It is not difficult to think of some examples. The conditional p → q and its contrapositive ~q → ~p are logically equivalent.

6. Predicates
A predicate is a statement with one or more variables. If we assign values to the variables then the statement becomes a proposition, and has a truth value. For example:

"7 > 20" is a proposition, whereas "x > 20" is a predicate.
"Peter owns 3 cats" is a proposition, whereas "x owns y cats" is a predicate.

We will denote predicates by capital letters.

Some Set Notation
A set is a collection of things, usually (but not necessarily) sharing some common attribute. E.g. let P = the set of all people in this room; let A = the set of all letters in the alphabet. In mathematics we have some special sets of numbers. E.g. R denotes the set of all real numbers (numbers with a decimal point); N denotes the set of all natural numbers = {1, 2, 3, 4, ...}. We use the symbol ∈ to denote membership of a set. The symbol ∈ is read "is a member of" or "is an element of" or "belongs to".

Quantifiers
A predicate has one or more variables, and if we substitute values for the variables the predicate becomes a proposition and has a truth value. Instead of substituting particular values for the variables, we may be able to make a more general statement by using a quantifier. We will use two quantifiers: the universal quantifier ∀ and the existential quantifier ∃.

Example: Consider the predicate Q(x): (x < 5) ∨ (x ≥ 5). This is true for all real numbers, so we can say: "For all real numbers x, Q(x)." We can use the symbol ∀, which means "for all" or "for every", and write: ∀x ∈ R, Q(x).

Example: Let P = the set of people in this room and define the predicate C(x): x likes chocolate. Then ∀x ∈ P, C(x) means "Everybody in this room likes chocolate." And if we define the predicate D(x): x likes going to the dentist, then ∀x ∈ P, ~D(x) means "Everybody here dislikes going to the dentist", or "Nobody in this room likes going to the dentist."

The universal quantifier is used for statements of the type "All do ...", "None do ...", "All are ..." or "None are ...". The existential quantifier is used when we are making statements of the type "some do" or "some don't". The symbol ∃ is read as "there exists ...", "there is at least one ..." or "for some ...".

Examples: If we define I(x): x speaks Italian, and P = the set of people in this room, then ∃x ∈ P, I(x) means "There is at least one person in this room who speaks Italian", and ∃x ∈ P, ~I(x) means "There is at least one person in this room who does not speak Italian." If we define S(x): (x > 2) ∧ (x < 7), then ∃x ∈ N, S(x) means "There is at least one natural number that is bigger than 2 and less than 7."

The existential quantifier is used for statements of the type "Some do ...", "Some don't ...", "Some are ..." or "Some are not ...".
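Over a finite set, the two quantifiers correspond directly to Python's all() and any(). A small sketch using the predicate S(x) above, with an assumed finite sample standing in for N:

```python
# S(x): (x > 2) and (x < 7), from the text.
S = lambda x: x > 2 and x < 7

sample = range(1, 11)  # a finite slice of the natural numbers, for illustration

# ∃x S(x): there is at least one x in the sample satisfying S
print(any(S(x) for x in sample))  # -> True (e.g. x = 3)

# ∀x S(x): every x in the sample satisfies S
print(all(S(x) for x in sample))  # -> False (e.g. x = 1 fails)
```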

Connections between ∀ and ∃
E.g. "Everybody likes ice cream" means that there is no one who does not like ice cream:
∀x likes(x, icecream) ≡ ~∃x ~likes(x, icecream)

Using De Morgan's theorem:
~∀x ∈ S, P(x) ≡ ∃x ∈ S, ~P(x)
~∃x ∈ S, P(x) ≡ ∀x ∈ S, ~P(x)

Predicates with Two Quantifiers
Now we consider propositions with two quantifiers, for example "Every student passed at least one unit" or "There is at least one song that everybody has heard." Example: Let P be a set of people and M a set of movies, and define the predicate S(x, y) to mean "person x has seen movie y". Consider the following symbolic statements:

(i) ∀x ∈ P, ∃y ∈ M, S(x, y), which may be translated: "For each person x, there is some movie y such that x has seen y", or in simpler English: "Every person has seen at least one movie."

(ii) ∃x ∈ P, ∀y ∈ M, S(x, y), which may be translated: "There is some person x such that for every movie y, x has seen y", or more simply: "Some person has seen every movie."

(iii) ∃y ∈ M, ∀x ∈ P, S(x, y), which may be translated: "There is some movie y such that for every person x, x has seen y", or in simpler English: "There is a movie that every person has seen."

(iv) ∀x ∈ P, ∀y ∈ M, S(x, y), which may be translated: "For every person x and for every movie y, x has seen y", or more simply: "Every person has seen every movie."
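The four statements can be checked mechanically over small finite sets by nesting all() and any(). The people, movies and "seen" relation below are invented purely for illustration:

```python
people = ["ann", "ben"]
movies = ["m1", "m2"]
seen = {("ann", "m1"), ("ann", "m2"), ("ben", "m1")}  # hypothetical data

S = lambda x, y: (x, y) in seen

# (i) ∀x ∃y S(x,y): every person has seen at least one movie
print(all(any(S(x, y) for y in movies) for x in people))  # -> True

# (ii) ∃x ∀y S(x,y): some person has seen every movie
print(any(all(S(x, y) for y in movies) for x in people))  # -> True (ann)

# (iii) ∃y ∀x S(x,y): there is a movie that every person has seen
print(any(all(S(x, y) for x in people) for y in movies))  # -> True (m1)

# (iv) ∀x ∀y S(x,y): every person has seen every movie
print(all(all(S(x, y) for y in movies) for x in people))  # -> False
```

Note how (ii) and (iii) differ only in quantifier order, and can have different truth values in general.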

Lecture 7
Knowledge-based Systems
A knowledge-based system is a computer system that is programmed to imitate human problem-solving by means of artificial intelligence and reference to a database of knowledge on a particular subject. Knowledge-based systems are systems based on the methods and techniques of Artificial Intelligence; their core components are the knowledge base and the inference mechanisms. KBS is a frequently used abbreviation for knowledge-based system.

Remarks:
1. KBS is often used as a synonym for expert system (ES), although the two are not the same in a strict sense. Strictly speaking, a KBS is any system that uses knowledge in performing its tasks.
2. KBS is a branch of artificial intelligence.
3. Keywords in the definition: "knowledge", "represents", "reasons", "specialist".
4. A KBS uses heuristic methods in problem solving.

Characteristics of a KBS
1. A KBS differs from conventional programs. It simulates human reasoning about a domain, rather than the domain itself. It performs reasoning over representations of human knowledge, in addition to doing numerical computations or data retrieval. It solves problems by heuristic or approximate methods which, unlike algorithmic solutions, are not guaranteed to succeed.
2. A KBS differs from other AI systems. It deals with subject matter of realistic complexity that normally requires a considerable amount of human expertise. It must exhibit high performance in terms of speed and reliability in order to be a useful tool. It must be capable of explaining and justifying solutions or recommendations, to convince the user that its reasoning is in fact correct.

Applications of KBS
Some areas where KBS has been very successful:
1. Medical diagnosis: MYCIN (for blood disorders)
2. Molecular structure analysis: DENDRAL
3. Computer configuration: XCON (R1)
4. Machine fault diagnosis
5. Fraud detection
6. Loan evaluation
... and many more.

Major Components of a KBS
A KBS usually consists of four major components:
User interface: converts user queries into an internal representation to be processed by the system, and converts the system's solutions and explanations into a language which the user can understand.
Knowledge base: contains expert knowledge about a narrow domain of application.
Inference engine: manipulates the knowledge base, i.e. deduces new knowledge from the knowledge base, to give answers to the user's queries.
Explanation generator (sometimes considered part of the inference engine): provides explanations to the user about how the system arrives at a conclusion, so that the user can be convinced.

Figure: Major Components of a KBS/ES.

When to Consider a KBS
A KBS provides a mechanism to share existing but scarce expertise:
When the expert is unavailable,
When the qualitative performance of nonexperts needs to be enhanced,
When the efficiency and consistency of the expert need to be enhanced, and
When others need to be trained to understand the expert's thought processes.

Economic Considerations
The following criteria should be met before embarking on a KBS project for solving a highly constrained class of problems:
No known algorithmic solution exists, thereby forcing consideration of the use of heuristic knowledge.
The expert's solution to the problem is satisfactory (but may suffer from procedural difficulties such as timeliness).
Decisions made by nonexperts are likely to be different from those of the expert, and to have a significant impact on the organization in terms of financial cost, resource consumption, delay (efficiency), and risk.
Who Is Involved in Developing a KBS
1. Management
2. End-users
3. Project champion
4. Domain experts
5. Knowledge engineers/crafters
6. Apprentice knowledge engineers/crafters

Knowledge Engineering and Knowledge Engineers
Knowledge engineers are those who study the problem domain, acquire knowledge from the expert, and represent that knowledge in a structured form in the knowledge base. Knowledge engineering is the subfield of AI devoted to knowledge acquisition, representation and inference for KBS. There are different views on how mature the KBS technology and design process are; some people regard the technology and the process as not mature enough to be engineered, hence the terms "crafting" and "crafter". We will not, however, make such a distinction in this unit.

Lecture 8Rule-Based Systems and Shells

It was noted early in the history of expert systems that certain parts of an ES could be re-used for other ES dealing with different domains. So attention has been given to developing frameworks, or shells, which provide as much as possible of an ES and into which the context-dependent parts can be fitted. (The idea of a general problem solver was perhaps not so unrealistic after all.) A shell may provide:
A knowledge acquisition subsystem.
An inference engine.
A user interface.
An explanation subsystem.

Shells
A shell does not provide a knowledge base (though it will provide the structure for a knowledge base). In order to build an ES using a shell, it is necessary only to construct and install a knowledge base. As we shall see, different expert systems may be designed on fundamentally different principles, containing knowledge bases with completely different structures. The most significant categories are rule-based, case-based and model-based systems. We shall consider rule-based systems first.

Knowledge as Rules
By an IF ... THEN ... rule we mean something like:

IF ID Checked
AND Satisfactory Employment
AND Salary Adequate
THEN Credit Granted

It will be convenient to think of (and write) such rules with the conclusion first:

Credit Granted
IF ID Checked
AND Satisfactory Employment
AND Salary Adequate

The part of a rule after the IF is called the body of the rule. It contains what will become subgoals. As we shall see, there are several different kinds of things which can appear in the body of a rule. Rules may contain AND, as above. They may also contain OR. For example:

ID Checked
IF Credit Card Shown
OR Driving Licence Shown
OR Passport Shown

As these two examples show, the rules form a tree structure. Trying to demonstrate the 'truth' of the conclusion of a rule leads to requirements to demonstrate the truth of the premises of the rule, which in turn lead to further subgoals. Each rule is referred to by its conclusion, so the two rules above are called Credit Granted and ID Checked.
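The tree of rules above can be sketched as a small data structure with AND/OR bodies. This is a minimal illustration (the goal names are the credit-checking rules from the text, lower-cased; the representation itself is not any particular shell's):

```python
# Each rule maps a conclusion to an ("and" | "or", [subgoals]) body.
rules = {
    "credit_granted": ("and", ["id_checked",
                               "satisfactory_employment",
                               "salary_adequate"]),
    "id_checked":     ("or",  ["credit_card_shown",
                               "driving_licence_shown",
                               "passport_shown"]),
}

def prove(goal, facts):
    """A goal holds if it is a known fact, or if its rule body holds."""
    if goal in facts:
        return True
    if goal not in rules:
        return False          # no fact and no rule: cannot be established
    op, subgoals = rules[goal]
    combine = all if op == "and" else any
    return combine(prove(g, facts) for g in subgoals)

print(prove("credit_granted",
            {"passport_shown", "satisfactory_employment", "salary_adequate"}))
# -> True
```

Proving `credit_granted` recursively generates the subgoal `id_checked`, which succeeds through the OR branch `passport_shown`.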

Backward Chaining Systems
So far we have looked at how rule-based systems can be used to draw new conclusions from existing data, adding these conclusions to a working memory. This approach is most useful when you know all the initial facts, but don't have much idea what the conclusion might be. If you DO know what the conclusion might be, or have some specific hypothesis to test, forward chaining systems may be inefficient. You COULD keep on forward chaining until no more rules apply or you have added your hypothesis to the working memory, but in the process the system is likely to do a lot of irrelevant work, adding uninteresting conclusions to working memory. For example, suppose we are interested in whether Alison is in a bad mood. We could repeatedly fire rules, updating the working memory and checking each time whether (bad-mood alison) is in the new working memory. But maybe we had a whole batch of rules for drawing conclusions about what happens when I'm lecturing, or what happens in February; we really don't care about these, so would rather only draw the conclusions that are relevant to the goal. This can be done by backward chaining from the goal state (or from some hypothesised state that we are interested in). This is essentially what Prolog does, so it should be fairly familiar to you by now. Given a goal state to try to prove (e.g., (bad-mood alison)), the system will first check whether the goal matches the initial facts given. If it does, then the goal succeeds. If it doesn't, the system will look for rules whose conclusions (previously referred to as actions) match the goal. One such rule will be chosen, and the system will then try to prove the facts in the preconditions of the rule using the same procedure, setting these as new goals to prove. Note that a backward chaining system does NOT need to update a working memory. Instead it needs to keep track of what goals it must prove in order to prove its main hypothesis.
In principle we can use the same set of rules for both forward and backward chaining. However, in practice we may choose to write the rules slightly differently if we are going to be using them for backward chaining. In backward chaining we are concerned with matching the conclusion of a rule against some goal that we are trying to prove, so the 'then' part of the rule is usually not expressed as an action to take (e.g., add/delete), but as a state which will be true if the premises are true. So, suppose we have the following rules:

1. IF (lecturing X) AND (marking-practicals X) THEN (overworked X)
2. IF (month february) THEN (lecturing alison)
3. IF (month february) THEN (marking-practicals alison)
4. IF (overworked X) THEN (bad-mood X)
5. IF (slept-badly X) THEN (bad-mood X)
6. IF (month february) THEN (weather cold)
7. IF (year 1993) THEN (economy bad)

and initial facts: (month february), (year 1993), and we're trying to prove (bad-mood alison). First we check whether the goal state is in the initial facts. As it isn't there, we try matching it against the conclusions of the rules. It matches rules 4 and 5. Let us assume that rule 4 is chosen first: it will try to prove (overworked alison). Rule 1 can be used, and the system will try to prove (lecturing alison) and (marking-practicals alison). Trying to prove the first goal, it will match rule 2 and try to prove (month february). This is in the set of initial facts. We still have to prove (marking-practicals alison). Rule 3 can be used, and we have proved the original goal (bad-mood alison). One way of implementing this basic mechanism is to use a stack of goals still to satisfy. You repeatedly pop a goal off the stack and try to prove it. If it is in the set of initial facts then it is proved. If it matches a rule which has a set of preconditions, then the goals in the preconditions are pushed onto the stack.
Of course, this doesn't tell us what to do when there are several rules which may be used to prove a goal. If we were using Prolog to implement this kind of algorithm we might rely on its backtracking mechanism - it'll try one rule, and if that results in failure it will go back and try the other. However, if we use a programming language without a built in search procedure we need to decide explicitly what to do. One good approach is to use an agenda, where each item on the agenda represents one alternative path in the search for a solution. The system should try `expanding' each item on the agenda, systematically trying all possibilities until it finds a solution (or fails to). The particular method used for selecting items off the agenda determines the search strategy - in other words, determines how you decide on which options to try, in what order, when solving your problem. We'll go into this in much more detail in the section on search.
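The backward-chaining procedure described above can be sketched as follows. To keep the illustration short, the seven rules are written as ground facts (the variable X is pre-bound to alison), and backtracking over alternative rules is handled by `any`:

```python
facts = {("month", "february"), ("year", "1993")}

# (conclusion, premises) pairs: the seven rules from the text, ground.
rules = [
    (("overworked", "alison"), [("lecturing", "alison"),
                                ("marking-practicals", "alison")]),
    (("lecturing", "alison"), [("month", "february")]),
    (("marking-practicals", "alison"), [("month", "february")]),
    (("bad-mood", "alison"), [("overworked", "alison")]),
    (("bad-mood", "alison"), [("slept-badly", "alison")]),
    (("weather", "cold"), [("month", "february")]),
    (("economy", "bad"), [("year", "1993")]),
]

def prove(goal):
    """A goal is proved if it is an initial fact, or if some rule concluding
    it has all its premises provable (trying rules in order, backtracking
    on failure)."""
    if goal in facts:
        return True
    return any(all(prove(p) for p in premises)
               for conclusion, premises in rules if conclusion == goal)

print(prove(("bad-mood", "alison")))  # -> True
```

Rule 4 is tried before rule 5 simply because it appears first in the list; a real system would apply an explicit conflict-resolution or agenda strategy here.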

Forward Chaining SystemsIn a forward chaining system the facts in the system are represented in a working memory which is continually updated. Rules in the system represent possible actions to take when specified conditions hold on items in the working memory - they are sometimes called condition-action rules. The conditions are usually patterns that must match items in the working memory, while the actions usually involve adding or deleting items from the working memory. The interpreter controls the application of the rules, given the working memory, thus controlling the system's activity. It is based on a cycle of activity sometimes known as a recognise-act cycle. The system first checks to find all the rules whose conditions hold, given the current state of working memory. It then selects one and performs the actions in the action part of the rule. (The selection of a rule to fire is based on fixed strategies, known as conflict resolution strategies.) The actions will result in a new working memory, and the cycle begins again. This cycle will be repeated until either no rules fire, or some specified goal state is satisfied. Rule-based systems vary greatly in their details and syntax, so the following examples are only illustrative. First we'll look at a very simple set of rules: 1. IF (lecturing X) AND (marking-practicals X) THEN ADD (overworked X) 2. IF (month february) THEN ADD (lecturing alison) 3. IF (month february) THEN ADD (marking-practicals alison) 4. IF (overworked X) OR (slept-badly X) THEN ADD (bad-mood X) 5. IF (bad-mood X) THEN DELETE (happy X) 6. IF (lecturing X) THEN DELETE (researching X) Here we use capital letters to indicate variables. In other representations variables may be indicated in different ways, such as by a ? or a ^ (e.g., ?person, ^person). 
Let us assume that initially we have a working memory with the following elements: (month february) (happy alison) (researching alison). Our system will first go through all the rules, checking which ones apply given the current working memory. Rules 2 and 3 both apply, so the system has to choose between them using its conflict resolution strategies. Let us say that rule 2 is chosen. So, (lecturing alison) is added to the working memory, which is now: (lecturing alison) (month february) (happy alison) (researching alison). Now the cycle begins again. This time rule 3 and rule 6 have their preconditions satisfied. Let's say rule 3 is chosen and fires, so (marking-practicals alison) is added to the working memory. On the third cycle rule 1 fires, so, with X bound to alison, (overworked alison) is added to working memory, which is now: (overworked alison) (marking-practicals alison) (lecturing alison) (month february) (happy alison) (researching alison). Now rules 4 and 6 can apply. Suppose rule 4 fires, and (bad-mood alison) is added to the working memory. In the next cycle rule 5 is chosen and fires, with (happy alison) removed from the working memory. Finally, rule 6 will fire, and (researching alison) will be removed from working memory, to leave: (bad-mood alison) (overworked alison) (marking-practicals alison) (lecturing alison) (month february). (This example is not meant to reflect my attitude to lecturing!) The order in which rules fire may be crucial, especially when rules may result in items being deleted from working memory. (Systems which allow items to be deleted are known as nonmonotonic.) Anyway, suppose we have the following further rule in the rule set:

7. IF (happy X) THEN ADD (gives-high-marks X)

If this rule fires BEFORE (happy alison) is removed from working memory, then the system will conclude that I'll give high marks. However, if rule 5 fires first then rule 7 will no longer apply.
Of course, if we fire rule 7 and then later remove its preconditions, then it would be nice if its conclusions could then be automatically removed from working memory. Special systems called truth maintenance systems have been developed to allow this. A number of conflict resolution strategies are typically used to decide which rule to fire. These include: Don't fire a rule twice on the same data. (We don't want to keep on adding (lecturing alison) to working memory). Fire rules on more recent working memory elements before older ones. This allows the system to follow through a single chain of reasoning, rather than keeping on drawing new conclusions from old data. Fire rules with more specific preconditions before ones with more general preconditions. This allows us to deal with non-standard cases. If, for example, we have a rule ``IF (bird X) THEN ADD (flies X)'' and another rule ``IF (bird X) AND (penguin X) THEN ADD (swims X)'' and a penguin called tweety, then we would fire the second rule first and start to draw conclusions from the fact that tweety swims. These strategies may help in getting reasonable behaviour from a forward chaining system, but the most important thing is how we write the rules. They should be carefully constructed, with the preconditions specifying as precisely as possible when different rules should fire. Otherwise we will have little idea or control of what will happen. Sometimes special working memory elements are used to help to control the behaviour of the system. For example, we might decide that there are certain basic stages of processing in doing some task, and certain rules should only be fired at a given stage - we could have a special working memory element (stage 1) and add (stage 1) to the preconditions of all the relevant rules, removing the working memory element when that stage was complete. 
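The recognise-act cycle can be sketched as follows, again with the rules ground on alison. The conflict-resolution strategy used here is deliberately trivial: never fire a rule twice on the same data, and take the first eligible rule in list order.

```python
wm = {("month", "february"), ("happy", "alison"), ("researching", "alison")}

# (premises, additions, deletions): the six condition-action rules, ground.
rules = [
    ([("lecturing", "alison"), ("marking-practicals", "alison")],
     [("overworked", "alison")], []),
    ([("month", "february")], [("lecturing", "alison")], []),
    ([("month", "february")], [("marking-practicals", "alison")], []),
    ([("overworked", "alison")], [("bad-mood", "alison")], []),
    ([("bad-mood", "alison")], [], [("happy", "alison")]),
    ([("lecturing", "alison")], [], [("researching", "alison")]),
]

fired = set()
while True:
    # Recognise: find all rules whose premises hold in working memory.
    eligible = [i for i, (pre, _, _) in enumerate(rules)
                if i not in fired and all(p in wm for p in pre)]
    if not eligible:
        break
    # Act: conflict resolution picks the first eligible rule.
    i = eligible[0]
    fired.add(i)
    _, adds, dels = rules[i]
    wm |= set(adds)
    wm -= set(dels)

print(("bad-mood", "alison") in wm, ("happy", "alison") in wm)  # -> True False
```

The final working memory matches the trace in the text: (happy alison) and (researching alison) have been deleted, and (bad-mood alison) has been added.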
Forward vs Backward Reasoning
Whether you use forward or backward reasoning to solve a problem depends on the properties of your rule set and initial facts. Sometimes, if you have some particular goal (to test some hypothesis), then backward chaining will be much more efficient, as you avoid drawing conclusions from irrelevant facts. However, sometimes backward chaining can be very wasteful: there may be many possible ways of trying to prove something, and you may have to try almost all of them before you find one that works. Forward chaining may be better if you have lots of things you want to prove (or if you just want to find out in general what new facts are true), when you have a small set of initial facts, and when there tend to be lots of different rules which allow you to draw the same conclusion. Backward chaining may be better if you are trying to prove a single fact, given a large set of initial facts, and where, if you used forward chaining, lots of rules would be eligible to fire in any cycle.

Case-based reasoning (CBR):
Stores cases (descriptions of past experiences) in a database for later retrieval.
Searches the database for cases with characteristics similar to a new case, to find and apply appropriate solutions.
Relies on continuous expansion and refinement by users.

HOW CASE-BASED REASONING WORKSCase-based reasoning represents knowledge as a database of past cases and their solutions. The system uses a six-step process to generate solutions to new problems encountered by the user.
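The retrieval step of this process can be sketched with a simple similarity measure: count how many features of the new case match each stored case, and reuse the solution of the most similar one. The case data and similarity measure below are invented purely for illustration.

```python
# Stored cases: (feature dictionary, solution) pairs. Hypothetical data.
cases = [
    ({"symptom": "no-start", "lights": "off"}, "charge battery"),
    ({"symptom": "no-start", "lights": "on"},  "check starter"),
    ({"symptom": "stalling", "lights": "on"},  "check fuel filter"),
]

def retrieve(new_case):
    """Return the solution of the stored case most similar to new_case,
    where similarity = number of matching feature values."""
    def similarity(stored):
        features, _solution = stored
        return sum(1 for k, v in new_case.items() if features.get(k) == v)
    return max(cases, key=similarity)[1]

print(retrieve({"symptom": "no-start", "lights": "off"}))  # -> charge battery
```

A full CBR system would then adapt the retrieved solution, test it, and store the new case with its outcome back in the database, which is how the case base expands and is refined over time.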

Lecture 9
Simple Expert Systems
Expert systems:
Capture tacit knowledge in a very specific, limited domain of human expertise.
Support highly structured decision making.
Model human knowledge as a set of rules called the knowledge base.
Work by applying a set of IF-THEN-ELSE rules extracted from human experts.
Use an inference engine to search through the knowledge base. In forward chaining, the inference engine begins with information entered by the user and searches the knowledge base for a conclusion. In backward chaining, the system begins with a hypothesis and asks the user questions to confirm or disprove the hypothesis.

The expert systems given below are very basic. These samples should give you an idea as to where to start your coursework. They can all be quickly and easily implemented using Crystal.

Simple Expert System 1: A CAR TROUBLE DIAGNOSTIC SYSTEM

Knowledge Acquisition
The first task is knowledge acquisition. The solutions for this expert system are based wholly on knowledge of automotive systems obtained from the internet and from a local Jua Kali mechanic.

The basic items identified as necessary to get a vehicle to start are a combustion chamber, some mechanism to turn the engine, air and fuel to burn, and something to ignite the air-fuel mixture. All the solutions in this illustration deal with how these elements come together to make a vehicle start. Below is an introduction to the basic systems that were considered.

Battery: The part of a vehicle that stores the power required to turn the engine and create a spark.
Battery Cables: A set of wires that carry the power from the battery to the starter and the rest of the engine. These cables usually fail due to corrosion, which interferes with the energy flow from the battery.
Starter: A mechanical device, an electric motor, that uses power from the battery to rotate the engine.
Coil: An electronic component that takes the twelve volts coming from the battery and converts it to a much larger voltage.
Coil Wire: A wire which carries the voltage from the coil to the distributor or computer-controlled ignition points, which then distribute the pulse to the correct spark plug wire.
Spark Plug Wires: A set of wires that carries the electronic pulse from the distributor or ignition points to the appropriate spark plug.
Spark Plugs: A set of electronic components constructed of insulators and conductors. These spark plugs create a short between a spark point and a conductor; this short creates a spark that ignites the air-fuel mixture.
Fuel: Also referred to as petrol.
Fuel Filter: A filtering device located between the fuel tank and the engine, used to eliminate impurities from the fuel.

Knowledge Representation
The second step is to represent the acquired knowledge. This involved coming up with rules that would later be encoded into the knowledge base of the expert system. The hard part was deciding which problems should be included in the solution space and which should be dropped.
A decision was made to limit the solution space to problems that can be fixed without any special knowledge of how a car works. This eliminated many problems including internal engine failures.

Below, a decision tree is used to represent the reasoning used in the system.

[Decision tree figure. The root node asks "Starter Turning?". On the YES branch, the questions "Got Enough Fuel?", "Filter Replaced Recently?" and "Car Moving?" lead to the actions Buy Fuel, Replace Filter, Call Mechanic and Car is Fine. On the NO branch, the questions "Lights on?", "Terminals Clean?", "Cable OK?", "Coil Clicks?" and "Coil Fuse OK?" lead to the actions Charge Battery, Clean Terminals, Fix Battery Cable, Replace Fuse, Replace Coil and Replace Starter.]
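The reasoning in the tree can be sketched as a chain of yes/no questions. The Python below is an illustrative sketch only: the exact branch ordering in the figure is only partly recoverable, so the structure, the `diagnose_car` name and the `ask` callback are assumptions based on the node and leaf labels.

```python
# Illustrative sketch of the car-trouble decision tree (branch order assumed).
def diagnose_car(ask):
    """Walk the diagnostic tree; `ask` maps a question string to True/False."""
    if ask("Starter turning?"):
        if not ask("Got enough fuel?"):
            return "Buy fuel"
        if not ask("Filter replaced recently?"):
            return "Replace filter"
        if not ask("Car moving?"):
            return "Call mechanic"
        return "Car is fine"
    # Starter not turning: trace the electrical path from the battery.
    if not ask("Lights on?"):
        return "Charge battery"
    if not ask("Terminals clean?"):
        return "Clean terminals"
    if not ask("Cable OK?"):
        return "Fix battery cable"
    if not ask("Coil clicks?"):
        if not ask("Coil fuse OK?"):
            return "Replace fuse"
        return "Replace coil"
    return "Replace starter"

# Example: a dead-battery scenario (starter silent, lights off).
answers = {"Starter turning?": False, "Lights on?": False}
print(diagnose_car(lambda q: answers.get(q, False)))  # Charge battery
```

In an interactive system the `ask` callback would prompt the user; passing it in as a function keeps the tree itself testable.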


Simple Expert System 2: A MEDICAL DIAGNOSIS SYSTEM

Knowledge Acquisition Process
For this process, two medical practitioners were interviewed: one doctor working for an organization, and one in the private sector here in Nairobi. The aim was to find out: (1) illnesses that have nearly similar signs and symptoms to malaria, and (2) the signs and symptoms for each of these illnesses. The interviews were face-to-face. Other information was obtained from the internet. The ailments covered are listed below, with their respective signs and symptoms.

Ailments: Malaria, Malaria + RTI, RTI, Typhoid, Meningitis

Signs and symptoms considered:

Fever

Chills (and sweating)

Coughing

Headache

Severe Headache

Nausea (and vomiting)

Body Malaise

Abdominal discomfort

Diarrhoea/Constipation

Loss of Appetite

Stiff neck

Photophobia (sensitive to light)

Note: RTI = Respiratory Tract Infection (specifically the common cold, also referred to as an upper respiratory infection).
Note: Cells shaded in green indicate YES for the given sign/symptom.

Rule-based Knowledge Representation

R1: IF fever THEN patient_ill

R2: IF patient_ill AND coughing THEN respiratory_tract_infection

R3: IF patient_ill AND headache AND chills_sweat AND nausea AND body_malaise THEN [malaria]

R4: IF patient_ill AND respiratory_tract_infection AND NOT malaria THEN [common_cold]

R5: IF malaria AND respiratory_tract_infection THEN [malaria_and_respiratory_tract]

R6: IF patient_ill AND NOT chills_sweat AND headache AND severe_headache THEN non_malaria

R7: IF non_malaria AND nausea AND stiff_neck AND photophobia THEN [meningitis]

R8: IF non_malaria AND body_malaise AND diarrhoea_constipation AND appetite_loss AND NOT stiff_neck AND NOT photophobia THEN [typhoid]

R9: IF patient_ill AND headache AND severe_headache AND body_malaise AND abdominal_discomfort AND stiff_neck THEN [unknown_illness_1]

R10: IF patient_ill AND headache AND NOT severe_headache AND body_malaise AND NOT nausea THEN [unknown_illness_2]

R11: IF patient_ill AND headache AND NOT severe_headache AND NOT body_malaise THEN [unknown_illness_3]

R12: IF patient_ill AND NOT headache AND NOT common_cold THEN [unknown_illness_4]

R13: IF patient_ill AND headache AND severe_headache AND NOT body_malaise AND nausea AND NOT stiff_neck THEN [unknown_illness_5]

R14: IF patient_ill AND headache AND severe_headache AND NOT body_malaise AND NOT nausea THEN [unknown_illness_6]
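A few of the positive rules above (R1, R2, R3 and R5) can be sketched as a simple forward-chaining loop. The Python below is illustrative only; the course implements such rule bases in PROLOG or an expert system shell, and rules using NOT (R4, R6 and onwards) additionally need negation as failure, which this sketch omits.

```python
# Forward-chaining sketch of rules R1, R2, R3 and R5 (illustrative only;
# NOT-rules such as R4 need negation as failure and are omitted).
RULES = [
    ({"fever"}, "patient_ill"),                                    # R1
    ({"patient_ill", "coughing"}, "respiratory_tract_infection"),  # R2
    ({"patient_ill", "headache", "chills_sweat", "nausea",
      "body_malaise"}, "malaria"),                                 # R3
    ({"malaria", "respiratory_tract_infection"},
     "malaria_and_respiratory_tract"),                             # R5
]

def diagnose(symptoms):
    """Fire every rule whose conditions all hold until no new facts appear."""
    facts = set(symptoms)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

derived = diagnose({"fever", "coughing", "headache", "chills_sweat",
                    "nausea", "body_malaise"})
print("malaria_and_respiratory_tract" in derived)  # True
```

Note how the conclusion of one rule (patient_ill from R1) becomes a condition of later rules (R2, R3): this chaining from data towards conclusions is exactly the data-driven inference strategy covered in Lecture 9.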

Simple Expert System 3: A MEDICAL DIAGNOSIS SYSTEM

Problem
The medical diagnosis system considered here is meant to diagnose diseases that may share common symptoms. The diseases picked are Malaria, Typhoid, Meningitis, Cholera, Amoebic Dysentery, Lobar Pneumonia and Hepatitis C. The sources of information were medical reference books and the internet.

When a patient goes to see a medical expert, the first thing the expert obtains is the symptoms. He/she then uses his/her knowledge to arrive at a possible conclusion, after which he/she carries out a confirmatory test. Since to err is human, this knowledge may not always be accurate. The system considered here is designed to obtain this possible conclusion for the expert, suggest a confirmatory test, recommend treatment, and warn the expert of any disease that manifests similar symptoms.

Knowledge Representation
After carrying out the research, the following information was obtained.

Signs, symptoms and tests per disease (one entry per disease column, separated by |; shorter rows indicate columns with no further entries):

MALARIA | TYPHOID FEVER | MENINGITIS | CHOLERA | AMOEBIC DYSENTERY | LOBAR PNEUMONIA | HEPATITIS C
Diarrhea | Diarrhea | Diarrhea | Diarrhea | Diarrhea | High Fever | Fever
High Fever | High Fever | Fever | Fever | Abdominal pain | Cough | Nausea
Vomiting | Vomiting | Vomiting | Vomiting | Fever | Abdominal pain | Diarrhea
Joint pains | Joint pains | Joint pains | Dehydration | Blood/mucus in stool | Chest pain | Headache
Dehydration | Headache | Convulsions | Abdominal pain | Headache | Abdominal pain
Severe Headache | Nausea | No appetite | Weakness | Stool analysis test for cysts | Joint pains | Vomiting
Nausea | No appetite | Nausea | Rice water stool | Chills | Constipation
Convulsions | Cough | Or rashes | Cold clammy skin | No appetite | Malaise
No appetite | Abdominal pain | Stiff neck | Tachycardia | Fluid-filled lungs | Yellowing skin
Abdominal pain | Or rash | Severe headache | Hypertension | Swollen glands | Tender liver
Chills | Stool/blood test for S. Typhi | Dislike of light | Weakness | Sore throat | Body pain
Or pneumonia | Chest pains | Spinal tap test | Sunken eyes | Or asthma | Anemia
Or blood smear | Fatigue | Paraparesis | Or smoker | Liver function test
Anemia | Blurred vision | Fatigue | No appetite
Delusions | Rapid respiration | Or dark urine
Dizziness

MISDIAGNOSED AS:
Plague | Flu | Dysentery | Cholera | Plague | Minor Disease
Pneumonia | Minor Disease | Minor Disease | Flu
Food poisoning | Tuberculosis

TREATMENT:
Chloroquine | Antibiotics | Intense antibiotic therapy | Relieve Pain | Replace Fluids | Relieve Pain | Relieve Pain
Quinine | Fever | Metronidazole | Fever | Fever
Antibiotics - penicillin
Erythromycin

Notes:
Minor disease: This is the catch-all for a number of minor, debilitating illnesses other than those listed. Usually this is no more than a head cold or a bad case of the flu. Symptoms vary widely and are at the discretion of the system user. Infection symptoms are usually fever, general pain, vomiting, headaches, etc.
Misdiagnosed as: usually another minor disease, such as the incorrect flu bug; sometimes pneumonia.
Treatment: pneumonia is treated using antibiotics.

From the above, a decision was made to design a system whereby the user keys in the symptoms manifested and the system compares them with those of the diseases in the knowledge base. If they are similar, a diagnosis is arrived at; otherwise the disease is unidentified.
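The comparison described above can be sketched in a few lines. This Python sketch is illustrative only: the abbreviated symptom sets come from the table above, while the `identify` function name and the 0.6 match threshold are assumed design choices, not taken from the text.

```python
# Sketch of symptom matching against the knowledge base (illustrative;
# symptom sets abbreviated from the table, threshold is an assumption).
KNOWLEDGE_BASE = {
    "malaria": {"diarrhea", "high fever", "vomiting", "joint pains",
                "severe headache", "nausea", "chills"},
    "cholera": {"diarrhea", "fever", "vomiting", "dehydration",
                "rice water stool", "cold clammy skin"},
    "lobar pneumonia": {"high fever", "cough", "chest pain",
                        "fluid-filled lungs", "chills"},
}

def identify(symptoms, threshold=0.6):
    """Return the disease whose symptom list best covers the input,
    or 'unidentified' if no disease matches well enough."""
    best, best_score = "unidentified", 0.0
    for disease, known in KNOWLEDGE_BASE.items():
        score = len(symptoms & known) / len(symptoms)  # fraction matched
        if score > best_score:
            best, best_score = disease, score
    return best if best_score >= threshold else "unidentified"

print(identify({"diarrhea", "vomiting", "rice water stool"}))  # cholera
```

A real system would also report the confirmatory test and possible misdiagnoses from the table once a disease is identified.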

Lecture 10: PROLOG

Facts, Rules and Queries

Symbols
Prolog expressions are comprised of the following truth-functional symbols, which have the same interpretation as in the predicate calculus.

English | Predicate Calculus | PROLOG
and | ^ | ,
or | v | ;
if | --> | :-
not | ~ | not

Variables and Names
Variables begin with an uppercase letter. Predicate names, function names, and the names of objects must begin with a lowercase letter. The rules for forming names are the same as for the predicate calculus, e.g. mother_of, male, female, greater_than, socrates.

Facts
A fact is a predicate expression that makes a declarative statement about the problem domain. Whenever a variable occurs in a Prolog expression, it is assumed to be universally quantified. Note that all Prolog sentences must end with a period.

likes(john, susie).                      /* John likes Susie */
likes(X, susie).                         /* Everyone likes Susie */
likes(john, Y).                          /* John likes everybody */
likes(john, Y), likes(Y, john).          /* John likes everybody and everybody likes John */
likes(john, susie); likes(john, mary).   /* John likes Susie or John likes Mary */
not(likes(john, pizza)).                 /* John does not like pizza */
likes(john, susie) :- likes(john, mary). /* John likes Susie if John likes Mary */

Rules
A rule is a predicate expression that uses logical implication (:-) to describe a relationship among facts. Thus a Prolog rule takes the form:

left_hand_side :- right_hand_side.

This sentence is interpreted as: left_hand_side IF right_hand_side. The left_hand_side is restricted to a single positive literal, which means it must consist of a positive atomic expression; it cannot be negated and it cannot contain logical connectives. This notation is known as a Horn clause. In Horn clause logic, the left hand side of the clause is the conclusion and must be a single positive literal; the right hand side contains the premises. The Horn clause calculus is equivalent to the first-order predicate calculus.

Examples of valid rules:

friends(X,Y) :- likes(X,Y), likes(Y,X).           /* X and Y are friends if they like each other */
hates(X,Y) :- not(likes(X,Y)).                    /* X hates Y if X does not like Y */
enemies(X,Y) :- not(likes(X,Y)), not(likes(Y,X)). /* X and Y are enemies if they don't like each other */

Examples of invalid rules:

left_of(X,Y) :- right_of(Y,X)           /* Missing a period */
likes(X,Y), likes(Y,X) :- friends(X,Y). /* LHS is not a single literal */
not(likes(X,Y)) :- hates(X,Y).          /* LHS cannot be negated */

Queries
The Prolog interpreter responds to queries about the facts and rules represented in its database. The database is assumed to represent what is true about a particular problem domain. In making a query you are asking Prolog whether it can prove that your query is true. If so, it answers "yes" and displays any variable bindings that it made in coming up with the answer. If it fails to prove the query true, it answers "no". Whenever you run the Prolog interpreter, it will prompt you with ?-. For example, suppose our database consists of the following facts about a fictitious family.
father_of(joe,paul).
father_of(joe,mary).
mother_of(jane,paul).
mother_of(jane,mary).
male(paul).
male(joe).
female(mary).
female(jane).

We get the following results when we make queries about this database:

| ?- father_of(joe,paul).

true ?

yes
| ?- father_of(paul,mary).

no
| ?- father_of(X,mary).

X = joe

yes
| ?-

Closed World Assumption
The Prolog interpreter assumes that the database is a closed world -- that is, if it cannot prove something is true, it assumes that it is false. This is also known as negation as failure: something is false if Prolog cannot prove it true given the facts and rules in its database. In this case, it may well be (in the real world) that Paul is the father of Mary, but since this cannot be proved given the current family database, Prolog concludes that it is false. So Prolog assumes that its database contains complete knowledge of the domain it is being asked about.

Prolog's Proof Procedure
In responding to queries, the Prolog interpreter uses a backtracking search, similar to the one we study in Chapter 3 of Luger. To see how this works, let's add the following rules to our database:

parent_of(X,Y) :- father_of(X,Y). /* Rule #1 */
parent_of(X,Y) :- mother_of(X,Y). /* Rule #2 */

Let's trace how Prolog would process a query. Suppose the facts and rules of this database are arranged in the order in which they were input. This trace assumes you know how unification works.

?- parent_of(jane,mary).

parent_of(jane,mary)  /* Prolog starts here and searches for a matching fact or rule. */
parent_of(X,Y)        /* Prolog unifies the query with rule #1 using {jane/X, mary/Y}, giving parent_of(jane,mary) :- father_of(jane,mary) */
father_of(jane,mary)  /* Prolog replaces the LHS with the RHS and searches. This fails to match father_of(joe,paul) and father_of(joe,mary), so this FAILS. Prolog BACKTRACKS to rule #2 and unifies with {jane/X, mary/Y}, so it matches parent_of(jane,mary) :- mother_of(jane,mary) */
mother_of(jane,mary)  /* Prolog replaces the LHS with the RHS and searches. */
YES.                  /* Prolog finds a match with a literal and so succeeds. */

Here is a trace of this query using Prolog's trace predicate:

| ?- trace, parent_of(jane,mary).
{The debugger will first creep -- showing everything (trace)}
1  1  Call: parent_of(jane,mary) ?
2  2  Call: father_of(jane,mary) ?
2  2  Fail: father_of(jane,mary) ?
2  2  Call: mother_of(jane,mary) ?
2  2  Exit: mother_of(jane,mary) ?
1  1  Exit: parent_of(jane,mary) ?

yes
{trace}
| ?-

Exercises
1. Add a male() rule that includes all fathers as males.
2. Add a female() rule that includes all mothers as females.
3. Add the following rules to the family database:
   son_of(X,Y)
   daughter_of(X,Y)
   sibling_of(X,Y)
   brother_of(X,Y)
   sister_of(X,Y)
4. Given the addition of the sibling_of rule, and assuming the above order for the facts and rules, show the Prolog trace for the query sibling_of(paul,mary).
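To make the proof procedure concrete, the backtracking search traced above can be sketched as a tiny backward-chaining prover. This Python sketch is illustrative only and much simpler than a real Prolog engine: it handles only ground (variable-free) queries, matches a goal against rule heads with one-way unification, and tries facts before rules in order, exactly as the parent_of trace does. All function names here are hypothetical.

```python
# Minimal backward-chaining prover mirroring the parent_of trace
# (illustrative sketch; real Prolog adds full unification, variables in
# queries, and backtracking over alternative bindings).

FACTS = {("father_of", "joe", "paul"), ("father_of", "joe", "mary"),
         ("mother_of", "jane", "paul"), ("mother_of", "jane", "mary")}

# parent_of(X,Y) :- father_of(X,Y).   /* Rule #1 */
# parent_of(X,Y) :- mother_of(X,Y).   /* Rule #2 */
RULES = [(("parent_of", "X", "Y"), [("father_of", "X", "Y")]),
         (("parent_of", "X", "Y"), [("mother_of", "X", "Y")])]

def is_var(term):
    # Prolog convention: variables begin with an uppercase letter.
    return term[0].isupper()

def match(goal, head):
    """Match a ground goal against a rule head; return bindings or None."""
    if goal[0] != head[0] or len(goal) != len(head):
        return None
    bindings = {}
    for g, h in zip(goal[1:], head[1:]):
        if is_var(h):
            if h in bindings and bindings[h] != g:
                return None
            bindings[h] = g
        elif h != g:
            return None
    return bindings

def prove(goal):
    """Try facts first, then each rule in order, backtracking on failure."""
    if goal in FACTS:
        return True
    for head, body in RULES:
        bindings = match(goal, head)
        if bindings is None:
            continue
        # Substitute bindings into the body and prove each subgoal.
        subgoals = [tuple(bindings.get(t, t) for t in term) for term in body]
        if all(prove(g) for g in subgoals):
            return True
    return False  # closed-world assumption: unprovable means false

print(prove(("parent_of", "jane", "mary")))   # True, via Rule #2
print(prove(("father_of", "paul", "mary")))   # False
```

Following the trace: rule #1 binds {jane/X, mary/Y}, the subgoal father_of(jane,mary) fails against the facts, the loop backtracks to rule #2, and mother_of(jane,mary) succeeds.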