Fighting Knowledge Acquisition Bottleneck
with Argument Based Machine Learning
Martin Mozina, Matej Guid, Jana Krivec, Aleksander Sadikov and Ivan Bratko
ECAI 2008
Faculty of Computer and Information Science
University of Ljubljana, Slovenia
Motivation for Knowledge Acquisition with Argument Based Machine Learning
Knowledge Acquisition is a major bottleneck in building knowledge bases.
domain experts find it hard to articulate their knowledge
Machine Learning is a potential solution, but has weaknesses
Machine Learning & Knowledge Acquisition
Problem: models are not comprehensible to domain experts
mostly statistical learning (not symbolic)
inducing spurious concepts (e.g. overfitting)
A combination of domain experts and machine learning would yield the best results:
learn symbolic models
exploit experts' knowledge in learning
Combining Machine Learning and Expert Knowledge
Expert provides background knowledge for ML Expert validates and revises induced theory Iterative procedure: Experts and ML improve the model in turns
[diagram: a rule base of IF ... THEN ... rules, refined in turns]
ABML
Definition of Argument Based Machine Learning
Learning with background knowledge:
INPUT: learning examples E, background knowledge BK
OUTPUT: theory T, such that T and BK explain all examples ei from E (BK, T ⊨ ei)

Argument Based Machine Learning:
INPUT: learning examples E, arguments ai given to examples ei from E
OUTPUT: theory T, such that T explains each ei using its arguments ai (T ⊨ai ei)
Argument Based Rule Learning
Classic rule learning: IF HairColor = Blond THEN CreditApproved = YES
Possible argument: Miss White received credit (CreditApproved = YES) because she has a regular job (RegularJob = YES).
Name         RegularJob  Rich  AccountStatus  HairColor  CreditApproved
Mr. Bond     No          Yes   Negative       Blond      Yes
Mr. Grey     No          No    Positive       Grey       No
Miss White   Yes         No    Positive       Blond      Yes
Miss Silver  Yes         Yes   Positive       Blond      Yes
Mrs. Brown   Yes         No    Negative       Brown      No
AB rule learning (possible rule): IF RegularJob=YES AND AccountStatus = Positive THEN CreditApproved = YES
Formal definition of Argumented Examples
Argumented example (A, C, Arguments):
A: attribute-value vector [e.g. RegularJob=YES, Rich=NO, ...]
C: class value [e.g. CreditApproved=YES]
Arguments: a set of arguments Arg1, ..., Argn for this example

Argument Argi:
Positive argument: C because Reasons
Negative argument: C despite Reasons
Reasons: a conjunction of reasons r1, ..., rm
ABCN2
ABCN2 = an extension of the CN2 rule learning algorithm (Clark & Niblett, 1991)
Extensions:
Argument based covering: a rule R AB-covers an argumented example E if:
all conditions in R are true for E,
R is consistent with at least one positive argument of E,
R is not consistent with any negative argument of E.
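The AB-covering conditions can be sketched as a small check over the credit example. This is an illustrative sketch, not ABCN2's actual implementation: examples are attribute-value dicts, and each argument's Reasons is a dict of conditions.

```python
# Sketch of AB-covering, assuming examples are attribute-value dicts and
# each argument's Reasons is a dict of attribute-value conditions.
# Names (covers, consistent, ab_covers) are illustrative, not ABCN2's API.

def covers(rule, example):
    """All conditions in the rule are true for the example."""
    return all(example.get(attr) == value for attr, value in rule.items())

def consistent(rule, reasons):
    """The rule contains every reason of the argument (a conjunction r1..rm)."""
    return all(rule.get(attr) == value for attr, value in reasons.items())

def ab_covers(rule, example, pos_args, neg_args):
    """R AB-covers E iff R covers E, R is consistent with at least one
    positive argument of E, and with no negative argument of E."""
    if not covers(rule, example):
        return False
    if pos_args and not any(consistent(rule, a) for a in pos_args):
        return False
    if any(consistent(rule, a) for a in neg_args):
        return False
    return True

# Miss White from the credit table, with the positive argument
# "CreditApproved=YES because RegularJob=YES":
miss_white = {"RegularJob": "Yes", "Rich": "No",
              "AccountStatus": "Positive", "HairColor": "Blond"}
pos_args = [{"RegularJob": "Yes"}]

rule_hair = {"HairColor": "Blond"}                            # classic rule
rule_ab = {"RegularJob": "Yes", "AccountStatus": "Positive"}  # AB rule

print(ab_covers(rule_hair, miss_white, pos_args, []))  # False: ignores the argument
print(ab_covers(rule_ab, miss_white, pos_args, []))    # True
```

The classic blond-hair rule covers Miss White but is rejected under AB-covering because it is not consistent with her argument.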
Evaluation: Extreme Value Correction (Mozina et al. 2006)
Probabilistic covering (required for Extreme Value Correction)
Interactions between expert and ABML
1. Learn a hypothesis with ABML.
2. Find the most critical example. (if none found, stop procedure)
3. Expert explains the example.
4. Argument is added to the example.
5. Return to step 1.
Interactions between expert and ABML
What if the expert's explanation is not good enough? Refinement loop:
1. Expert explains the example.
2. Argument is added to the example.
3. Discover counter examples (if none, then stop).
4. Expert improves the argument for the example.
5. Return to step 3.
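The outer loop (critical examples) and the refinement loop (counter examples) can be sketched as one driver function. This is a minimal runnable sketch: the learner and the expert are toy stand-ins (plain callables), chosen only to show the control flow; real ABCN2 components would replace them.

```python
# Minimal runnable sketch of the two interaction loops. The learner and the
# expert are toy stand-ins; only the control flow mirrors the slides.

def abml_interaction(examples, learn, most_critical, counter_example,
                     expert_argument, expert_improve, max_rounds=20):
    hypothesis = learn(examples)                           # 1. learn with ABML
    for _ in range(max_rounds):
        critical = most_critical(hypothesis, examples)     # 2. most critical example
        if critical is None:
            break                                          #    none found -> stop
        argument = expert_argument(critical)               # 3. expert explains it
        critical["arguments"].append(argument)             # 4. argument is added
        while True:                                        # refinement loop
            counter = counter_example(critical, examples)  # 3'. counter examples?
            if counter is None:
                break                                      #     none -> argument kept
            expert_improve(argument, counter)              # 4'. expert improves argument
        hypothesis = learn(examples)                       # 5. return to step 1
    return hypothesis

# Toy run: the "hypothesis" is just the number of argumented examples, and a
# critical example is any example still lacking an argument.
examples = [{"arguments": []}, {"arguments": []}]
h = abml_interaction(
    examples,
    learn=lambda ex: sum(1 for e in ex if e["arguments"]),
    most_critical=lambda hyp, ex: next((e for e in ex if not e["arguments"]), None),
    counter_example=lambda crit, ex: None,
    expert_argument=lambda e: "because X",
    expert_improve=lambda arg, counter: None,
)
print(h)  # 2: both examples received an argument
```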
Knowledge Acquisition of Chess Concepts
used in a Chess Tutoring Application
Case Study: Bad Bishop
The Concept of the Bad Bishop
Chess experts in general understand the concept of the bad bishop, but a precise formalisation of this concept is difficult.
Traditional definition (John Watson, Secrets of Modern Chess Strategy, 1999):
A bishop that is on the same colour of squares as its own pawns is bad:
its mobility is restricted by its own pawns,
it does not defend the squares in front of these pawns.
Moreover, the centralisation of these pawns is the main factor in deciding whether the bishop is bad or not.
Data set
Data set: 200 middlegame positions from real chess games
Chess experts’ evaluation of bishops:
bad: 78 bishops
not bad: 122 bishops
CRAFTY’s positional feature values served as attribute values for learning.
We randomly selected:
100 positions for learning
100 positions for testing
WGM Jana Krivec, GM Garry Kasparov, FM Matej Guid
Standard Machine Learning Methods' Performance with CRAFTY's Features Only
Machine learning methods' performance on the initial dataset:
Method                 CA    Brier score  AUC
Decision trees (C4.5)  73%   0.49         0.74
Logistic regression    70%   0.43         0.84
Rule learning (CN2)    72%   0.39         0.80
The results were obtained on the test data set. The results obtained with CRAFTY's positional features only are too inaccurate for commenting purposes:
additional information for describing bad bishops is necessary.
First Critical Example
Rules obtained by the ABML method ABCN2 failed to classify this example as "not bad".
The following question was given to the experts: “Why is the black bishop not bad?“
The experts used their domain knowledge:
“The black bishop is not bad, since its mobility is not seriously restricted by the pawns of both players.”
Introducing New Attributes into the Domain and Adding Arguments to an Example
The experts' explanation could not be described with the current domain attributes.
The argument
“BISHOP = ‘not bad’ because IMPROVED_BISHOP_MOBILITY is high”
was added to the example.
A new attribute, IMPROVED_BISHOP_MOBILITY, was included into the domain:
the number of squares accessible to the bishop, taking into account only own and opponent’s pawn structure
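Under the stated definition, the new attribute can be sketched as follows. The (file, rank) coordinate representation and the choice to treat pawn-occupied squares as blocked (and not themselves counted) are assumptions of this sketch, not details given in the slides.

```python
# Sketch of IMPROVED_BISHOP_MOBILITY: squares reachable by the bishop along
# its diagonals, with only the pawns of both players as obstacles (all other
# pieces ignored). Squares are (file, rank) pairs in 0..7.

def improved_bishop_mobility(bishop, pawn_squares):
    count = 0
    for df, dr in ((1, 1), (1, -1), (-1, 1), (-1, -1)):   # the four diagonals
        f, r = bishop[0] + df, bishop[1] + dr
        while 0 <= f <= 7 and 0 <= r <= 7 and (f, r) not in pawn_squares:
            count += 1
            f, r = f + df, r + dr                          # slide until blocked
    return count

# Bishop on c1; pawns on b2 and d2 shut it in completely:
print(improved_bishop_mobility((2, 0), {(1, 1), (3, 1)}))  # 0
print(improved_bishop_mobility((2, 0), set()))             # 7 on an empty board
```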
The method failed to explain the critical example with the given argument. A counter example was presented to the experts:
Critical example: “not bad”, IMPROVED_BISHOP_MOBILITY is high.
Counter example: “bad”, although IMPROVED_BISHOP_MOBILITY is high.
“Why is the ‘red’ bishop bad, compared to the ‘green’ one?”
Experts’ explanation: “There are many pawns on the same colour of squares as the black bishop, and some of these pawns occupy the central squares.”
Improving Arguments with Counter Examples
The attribute BAD_PAWNS was included into the domain. This attribute evaluates pawns that are on the colour of the square of the bishop (“bad” pawns in this sense).
The argument given to the critical example was extended to
“BISHOP = ‘not bad’ because IMPROVED_BISHOP_MOBILITY is high and BAD_PAWNS is low”
With this argument, the method could no longer find any counter examples.
New rule:
IF IMPROVED_BISHOP_MOBILITY ≥ 4 AND BAD_PAWNS ≤ 32 THEN BISHOP = “not bad”
class distribution [0, 39]
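The induced rule can be written as a simple predicate. The thresholds come from the rule above; reading the class distribution [0, 39] as "0 covered 'bad' and 39 covered 'not bad' examples" is an assumption about the slide's notation.

```python
# The induced rule as a predicate over the two new attributes.
# Thresholds are taken from the rule above.

def bishop_not_bad(improved_bishop_mobility, bad_pawns):
    return improved_bishop_mobility >= 4 and bad_pawns <= 32

print(bishop_not_bad(5, 20))  # True
print(bishop_not_bad(5, 42))  # False: too many weighted bad pawns
print(bishop_not_bad(3, 20))  # False: mobility too low
```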
Assessing “bad” pawns
The experts designed a look-up table with predefined values for pawns that are on the colour of the square of the bishop, assigning a weight to each such pawn according to its square.
BAD_PAWNS_AHEAD =
16 + 24 + 2 = 42
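The weighted-pawn computation can be sketched as below. The experts' actual look-up table appeared only as a figure and is not reproduced here, so PAWN_WEIGHTS is a hypothetical stand-in whose values merely reproduce the 16 + 24 + 2 = 42 example above.

```python
# Sketch of BAD_PAWNS_AHEAD as a weighted sum over pawn squares.
# PAWN_WEIGHTS is a hypothetical stand-in for the experts' look-up table.

PAWN_WEIGHTS = {"d4": 24, "e3": 16, "h2": 2}  # hypothetical square weights

def bad_pawns_ahead(bad_pawn_squares, weights=PAWN_WEIGHTS):
    """Sum the table weights of the bad pawns ahead of the bishop."""
    return sum(weights[square] for square in bad_pawn_squares)

print(bad_pawns_ahead(["e3", "d4", "h2"]))  # 16 + 24 + 2 = 42
```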
After the Final Iteration...
BAD_PAWNS: pawns on the colour of the square of the bishop, weighted according to their squares (“bad” pawns)
BAD_PAWNS_AHEAD: bad pawns ahead of the bishop
BAD_PAWNS_BLOCK_BISHOP_DIAGONAL: bad pawns that block the bishop's (front) diagonals
BLOCKED_BAD_PAWNS: bad pawns blocked by opponent's pawns or pieces
IMPROVED_BISHOP_MOBILITY: number of squares accessible to the bishop, taking into account only the pawns of both players
The whole process consisted of 8 iterations:
7 arguments were attached to automatically selected critical examples
5 new attributes were included into the domain
Classification Accuracy Through Iterations
Results on the final dataset:

Method                 CA    Brier score  AUC
Decision trees (C4.5)  89%   0.21         0.86
Logistic regression    88%   0.19         0.96
Rule learning (CN2)    88%   0.19         0.94
ABCN2                  95%   0.11         0.97
The accuracies of all methods improved with the new attributes. ABCN2, which also used the arguments, outperformed all the others:
the arguments suggested useful attributes AND led to even more accurate models.
Advantages of ABML for Knowledge Acquisition
explaining a single example is easier for experts than articulating general knowledge → more knowledge from experts
critical examples: the expert provides only relevant knowledge → the time of the experts' involvement is decreased
Advantages of ABML for Knowledge Acquisition
counter examples detect deficiencies in the expert's explanations → even more knowledge from experts
arguments constrain learning → hypotheses consistent with expert knowledge → hypotheses comprehensible to the expert → more accurate hypotheses
Conclusions
The ABML-based knowledge acquisition process provides:
1. more knowledge from experts
2. decreased time of experts' involvement
3. hypotheses comprehensible to the expert
4. more accurate hypotheses
Argument Based Machine Learning enables better knowledge acquisition.