8/13/2019 Bayes Network
Pablo Silva LZ1LUZ [email protected] | Artificial Intelligence | December 13, 2013
Knowledge and data engineering with Bayesian networks
HOMEWORK REPORT
Task 1 Domain description
My model is about AI classes. The purpose of the model is to find the probability that an AI
class is interesting under various circumstances.
Task 2 Variable definitions
VARIABLES:
Lecturer:
 o Represents: the lecturer of the AI class on a given day.
 o Value Range: Peter Lang, Peter Antal.
 o Depends on: nothing.

Winter:
 o Represents: whether the current season is winter.
 o Value Range: binary (true, false).
 o Depends on: nothing.

Place:
 o Represents: the building where the class is held.
 o Value Range: E504, Q408.
 o Depends on: nothing.

Subject:
 o Represents: the subject taught in class.
 o Value Range: Bayesian networks, another.
 o Depends on: Lecturer.

Delay:
 o Represents: whether there is a delay at the start of the class (the lecturer does not arrive on time).
 o Value Range: binary (true, false).
 o Depends on: Lecturer, Winter, Place.

Agreeableness:
 o Represents: how interesting the class was for the students.
 o Value Range: excellent, good, bad.
 o Depends on: Subject and Delay.
Task 3 Structure of the Bayesian network
Figure 1 - Structure of the BN
Whether or not the student likes the class depends on whether there is a delay at the start of the
lesson and on which subject was taught. A delay can happen if the lecturer does not arrive on time,
and this can be affected by the place and the weather. Finally, the subject depends only on what the lecturer thinks is good to teach his students.
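The dependency structure described above can be written down as a simple parent map. The sketch below (variable names follow the report; the helper name is_dag is mine) also checks that the map is a valid directed acyclic graph, which every Bayesian network structure must be:

```python
# Parent map of the Bayesian network described in Task 3.
parents = {
    "Lecturer": [],
    "Winter": [],
    "Place": [],
    "Subject": ["Lecturer"],
    "Delay": ["Lecturer", "Winter", "Place"],
    "Agreeableness": ["Subject", "Delay"],
}

def is_dag(parents):
    """Depth-first search: returns True iff the parent map has no cycle."""
    state = {}  # node -> "visiting" (on current path) or "done"
    def visit(n):
        if state.get(n) == "done":
            return True
        if state.get(n) == "visiting":
            return False  # back edge found: cycle
        state[n] = "visiting"
        ok = all(visit(p) for p in parents[n])
        state[n] = "done"
        return ok
    return all(visit(n) for n in parents)

print(is_dag(parents))  # True
```

Each node's CPT in Task 4 conditions on exactly the parents listed here.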
Task 4 Quantify the Bayesian network
P(Lecturer) Peter Antal Another
0.9 0.1
Peter is the official lecturer of the AI classes, so only on special occasions is someone else the
lecturer.
P(Winter) True False
0.25 0.75
In Hungary, it is winter about 25% of the time.
P(Place) E504 Q408
0.5 0.5
There are two classes per week: on Mondays the class is in E504 and on Wednesdays it is in Q408.
P(Subject) BNs Another
Lecturer=Peter Antal 0.7 0.3
Lecturer=Another 0.5 0.5
Peter seems to like BNs; on the occasions when someone else was teaching, about half of the classes
were about BNs.
P(Delay) True False
Lecturer=Another,Winter=False,Place=Q408 0.1 0.9
Lecturer=Another,Winter=False,Place=E504 0.3 0.7
Lecturer=Another,Winter=True,Place=Q408 0.1 0.9
Lecturer=Another,Winter=True,Place=E504 0.3 0.7
Lecturer=Peter Antal,Winter=False,Place=Q408 0.2 0.8
Lecturer=Peter Antal,Winter=False,Place=E504 0.3 0.7
Lecturer=Peter Antal,Winter=True,Place=Q408 0.3 0.7
Lecturer=Peter Antal,Winter=True,Place=E504 0.4 0.6
Delays occur more often when the classes are in building E504 because the lecturer needs to
fetch the projector from far away. Despite this, most of the classes start on time.
P(Agreeableness) Bad Good Excellent
Subject=Another,Delay=True 0.2 0.5 0.3
Subject=Another,Delay=False 0.1 0.4 0.5
Subject=BNs,Delay=True 0.2 0.4 0.4
Subject=BNs,Delay=False 0.1 0.4 0.5
In general the students do not complain about the AI classes, except when there is a delay, and
even then only a few do. The best classes are those without a delay where the subject is
BNs.
Task 5 Inference and sensitivity analysis
My inferences are as follows:
What is the probability that the class will be excellent when nothing else is given?
P(Agreeableness=Excellent) = 0.464

If the class starts late, what is the chance of the class being excellent?
P(Agreeableness=Excellent|Delay=true) = 0.368
Peter had to miss the class because of the winter, and the class was in building E504; what is the probability that it was excellent?
P(Agreeableness=Excellent|Lecturer=Another,Winter=true,Place=E504) = 0.455
Because of the harsh winter Peter could not go to class, and the substitute lecturer taught search methods because he did not like BNs. What did the students think about the class?
P(Agreeableness|Lecturer=Another,Subject=Another,Winter=true)
Bad Good Excellent
0.12 0.42 0.46
If the class was excellent, what is the chance that the lecturer was Peter Antal?
P(Lecturer=Peter Antal|Agreeableness=Excellent) = 0.898
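The forward queries above can be reproduced by brute-force enumeration over the CPTs of Task 4. The sketch below is my own independent check (function and label names are mine), not the tool actually used for the report; the small third-decimal differences come only from rounding:

```python
from itertools import product

# CPTs copied from Task 4 ("PA" = Peter Antal, "A" = another lecturer,
# "BN" = Bayesian networks, "Other" = another subject).
P_L = {"PA": 0.9, "A": 0.1}
P_W = {True: 0.25, False: 0.75}
P_P = {"E504": 0.5, "Q408": 0.5}
P_S = {"PA": {"BN": 0.7, "Other": 0.3}, "A": {"BN": 0.5, "Other": 0.5}}
P_D_TRUE = {  # P(Delay=true | Lecturer, Winter, Place)
    ("A", False, "Q408"): 0.1,  ("A", False, "E504"): 0.3,
    ("A", True, "Q408"): 0.1,   ("A", True, "E504"): 0.3,
    ("PA", False, "Q408"): 0.2, ("PA", False, "E504"): 0.3,
    ("PA", True, "Q408"): 0.3,  ("PA", True, "E504"): 0.4,
}
P_EXC = {  # P(Agreeableness=excellent | Subject, Delay)
    ("Other", True): 0.3, ("Other", False): 0.5,
    ("BN", True): 0.4,    ("BN", False): 0.5,
}

def p_excellent(evidence=lambda l, w, p, s, d: True):
    """P(Agreeableness=excellent | evidence) by full enumeration."""
    num = den = 0.0
    for l, w, p, s, d in product(["PA", "A"], [True, False],
                                 ["E504", "Q408"], ["BN", "Other"],
                                 [True, False]):
        if not evidence(l, w, p, s, d):
            continue
        pd = P_D_TRUE[(l, w, p)] if d else 1.0 - P_D_TRUE[(l, w, p)]
        pj = P_L[l] * P_W[w] * P_P[p] * P_S[l][s] * pd
        num += pj * P_EXC[(s, d)]
        den += pj
    return num / den

print(round(p_excellent(), 3))                               # 0.465
print(round(p_excellent(lambda l, w, p, s, d: d), 3))        # 0.369
print(round(p_excellent(lambda l, w, p, s, d:
                        l == "A" and w and p == "E504"), 3)) # 0.455
```

Enumeration is exponential in the number of variables, but with only six small variables it is instant and makes a good sanity check on the tool's output.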
The sensitivity analysis of P(Agreeableness=excellent|Subject,Delay) can be seen
in Figure 2:
Figure 2 - Example of Sensitivity Analysis
Computing the values using normal inference:
P(Agreeableness|Delay) Excellent
Delay=true 0.368
Delay=False 0.5
Difference is: ~0.132
P(Agreeableness|Subject) Excellent
Subject=BNs 0.473
Subject=Another 0.447
Difference is: ~0.026
As we can see, the value of Delay has a greater impact on the random variable Agreeableness than
the value of Subject does.
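The "difference" rows above are simply the spread of the query probability across the values of one evidence variable. A one-line helper makes this explicit (the helper name sensitivity is mine; the numbers are copied from the tables above):

```python
def sensitivity(cond_probs):
    """Spread of P(Agreeableness=excellent | X=x) over the values x of X."""
    return max(cond_probs.values()) - min(cond_probs.values())

by_delay = {"true": 0.368, "false": 0.5}       # P(excellent | Delay)
by_subject = {"BNs": 0.473, "Another": 0.447}  # P(excellent | Subject)

print(round(sensitivity(by_delay), 3))    # 0.132
print(round(sensitivity(by_subject), 3))  # 0.026
```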
Task 6 Comparison of the constructed and learnt models
The parameters I used for sample generation and learning:
Sample Size: 10000
Prior: CH
Max. parent count: 5
Max. permutation: 80
The learnt structure:
Figure 3 - Learning with 10000 samples
As we can see, the structure resembles the original except for the Winter random variable. The
CPTs change accordingly in the new model, which can be seen in the attached
PHAS_LZ1LUZ_HW_AI2013_learning_structure_10000s.xml.
A proper examination of whether the two models are the same is beyond the scope of this work, but let us rerun
some queries:
P(Agreeableness=Excellent) = 0.472 (prev.: 0.464)
P(Agreeableness=Excellent|Delay=true) = 0.365 (prev.: 0.368)
P(Agreeableness=Excellent|Lecturer=Another,Winter=true,Place=E504) = 0.447 (prev.: 0.455)
As we can see, even though the learnt model is not exactly identical to the designed one, the queries gave
almost the same results, which indicates the resemblance.
Task 7 Analysis of estimation biases
OVERCONFIDENCE
I have changed the Lecturer and Delay variables to make my model overconfident.
P(Lecturer) Peter Antal Another
0.915 0.0842
P(Delay) True False
Lecturer=Another,Winter=False,Place=Q408 0.0842 0.9158
Lecturer=Another,Winter=False,Place=E504 0.2584 0.7416
Lecturer=Another,Winter=True,Place=Q408 0.0842 0.9158
Lecturer=Another,Winter=True,Place=E504 0.2584 0.7416
Lecturer=Peter Antal,Winter=False,Place=Q408 0.1702 0.8298
Lecturer=Peter Antal,Winter=False,Place=E504 0.2584 0.7416
Lecturer=Peter Antal,Winter=True,Place=Q408 0.2584 0.7416
Lecturer=Peter Antal,Winter=True,Place=E504 0.3489 0.6511
Rerunning my queries, I got the following result:
P(Agreeableness=Excellent) = 0.469 (prev.: 0.464)
P(Agreeableness=Excellent|Delay=true) = 0.368 (prev.: 0.368)
P(Agreeableness=Excellent|Lecturer=Another,Winter=true,Place=E504) = 0.461 (prev.: 0.455)
P(Agreeableness|Lecturer=Another,Subject=Another,Winter=true)
Bad Good Excellent
0.117 (prev.: 0.12) 0.417 (prev.: 0.42) 0.466 (prev.: 0.46)
P(Lecturer=Peter Antal|Agreeableness=Excellent) = 0.914 (prev.: 0.898)

It is possible to see that the probability that the class is excellent when the lecturer is Peter Antal increases.
This happens because the Delay variable previously had a bias associating delays
with the lecturer Peter Antal, and with overconfidence the delay associated with Peter decreased.
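The report does not state how the overconfident and underconfident CPT entries were produced (the new values are not a simple rescaling of the old ones). One common way to sharpen or soften a probability is tempering in odds space; the sketch below is only my illustration of the idea, not the transformation actually used here:

```python
def temper(p, k):
    """Raise the odds of p to the power k: k > 1 sharpens probabilities
    toward 0 or 1 (overconfidence); 0 < k < 1 flattens them toward 0.5
    (underconfidence). k = 1 leaves p unchanged."""
    num = p ** k
    return num / (num + (1.0 - p) ** k)

print(round(temper(0.9, 2.0), 3))  # 0.988 (overconfident version of 0.9)
print(round(temper(0.9, 0.3), 3))  # 0.659 (underconfident version of 0.9)
```

Whatever transformation is used, the qualitative effect is the same as in the tables above: overconfidence pushes each CPT entry away from 0.5, underconfidence pulls it toward 0.5.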
UNDERCONFIDENCE

I have changed the Lecturer and Delay variables to make my model underconfident.
P(Lecturer) Peter Antal Another
0.5626 0.4374
P(Delay) True False
Lecturer=Another,Winter=False,Place=Q408 0.4374 0.5626
Lecturer=Another,Winter=False,Place=E504 0.4557 0.5443
Lecturer=Another,Winter=True,Place=Q408 0.4374 0.5626
Lecturer=Another,Winter=True,Place=E504 0.4557 0.5443
Lecturer=Peter Antal,Winter=False,Place=Q408 0.4438 0.5562
Lecturer=Peter Antal,Winter=False,Place=E504 0.4557 0.5443
Lecturer=Peter Antal,Winter=True,Place=Q408 0.4557 0.5443
Lecturer=Peter Antal,Winter=True,Place=E504 0.474 0.526
Rerunning my queries, I got the following result:
P(Agreeableness=Excellent) = 0.437 (prev.: 0.464)
P(Agreeableness=Excellent|Delay=true) = 0.361 (prev.: 0.368)
P(Agreeableness=Excellent|Lecturer=Another,Winter=true,Place=E504) = 0.431 (prev.: 0.455)
P(Agreeableness|Lecturer=Another,Subject=Another,Winter=true)
Bad Good Excellent
0.145 (prev.: 0.12) 0.445 (prev.: 0.42) 0.410 (prev.: 0.46)
P(Lecturer=Peter Antal|Agreeableness=Excellent) = 0.567 (prev.: 0.898)

We can see that all queries about Agreeableness gave a lower excellent value. This is because the
previous values for Lecturer and Delay were biased: the lecturer was almost always
Peter Antal, and delays in classes were rare. When Peter is the lecturer,
the chance of a class about BNs is higher and there are fewer delays, and the students seem to prefer learning about BNs in classes that start on time. With these two facts made underconfident,
the positive effect on Agreeableness was smaller.
The last query, about the chance that the lecturer was Peter given that the class was excellent, had a big
decrease. Previously, the Delay variable was biased so that when the lecturer was Peter
Antal delays were less frequent and the classes were better, and the Lecturer variable was biased
so that Peter Antal taught most of the classes; combined with fewer delays,
the classes were nicer. Therefore, the effect of Peter on the Agreeableness of the
classes is lower now.
Task 8 Effects of model uncertainty and sample size on learning
Undersampled model:
The parameters I used for sample generation and learning:
Sample Size: 1000
Prior: CH
Max. parent count: 5
Max. permutation: 80
The learnt structure:
We can see that the learnt model has significant differences compared with the original one. The arrows now indicate dependencies that did not exist before. We can run a few queries to discover how much these changes influence the knowledge in the model:
Original | Learnt (10000 samples) | Learnt (1000 samples)
P(Agreeableness=Excellent): 0.464 | 0.472 | 0.476
P(Agreeableness=Excellent|Delay=true): 0.368 | 0.365 | 0.391
P(Agreeableness=Excellent|Lecturer=Another,Winter=true,Place=E504): 0.455 | 0.447 | 0.457
Analyzing the results, we can see that the sample size has a big influence on them. In this
learnt model Agreeableness is an independent variable, and because of this its value tends to be
greater and is not influenced by the evidence.
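The gap between the two learnt models can be summarized with a mean absolute deviation over the three queries in the table above (the helper name mad is mine; the values are copied from the table):

```python
# Query results from the comparison table above, in row order.
original = [0.464, 0.368, 0.455]
learnt_10000 = [0.472, 0.365, 0.447]
learnt_1000 = [0.476, 0.391, 0.457]

def mad(a, b):
    """Mean absolute deviation between two lists of query results."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

print(round(mad(original, learnt_10000), 4))  # 0.0063
print(round(mad(original, learnt_1000), 4))   # 0.0123
```

The undersampled model's answers deviate roughly twice as much from the original, which matches the visual impression from the learnt structures.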
Underconfident model:
The parameters I used for sample generation and learning:
Sample Size: 10000
Prior: CH
Max. parent count: 5
Max. permutation: 80
The learnt structure:
Analyzing the learnt structure, we can see that this time there are no isolated nodes, but there
are serious problems with the dependencies, and the new structure does not resemble the original
one. Let us check its accuracy by comparing the queries:
Original underconfident | Learnt underconfident
P(Agreeableness=Excellent): 0.437 | 0.467
P(Agreeableness=Excellent|Delay=true): 0.361 | 0.369
P(Agreeableness=Excellent|Lecturer=Another,Winter=true,Place=E504): 0.431 | 0.467
Looking at the table, we can see that the queries gave similar answers without big discrepancies, although,
as expected, the accuracy is lower than with the original model.

Analyzing both tests, we can conclude that the model's probability distributions (CPTs) and the
sample size make a big difference for learning. In the first test, with few
samples, the entire structure was harmed; in the second, the dependencies were strongly
influenced by the probability distributions.