Bayes Network


  • 8/13/2019 Bayes Network


    Pablo Silva LZ1LUZ [email protected] | Artificial Intelligence | December 13, 2013

Knowledge and data engineering with Bayesian networks

HOMEWORK REPORT


    Task 1 Domain description

My model is about AI classes. The purpose of the model is to find the probability that an AI
class is interesting under various circumstances.

    Task 2 Variable definitions

    VARIABLES:

Lecturer:
  o Represents: the lecturer of the AI class on a given day.
  o Value range: Peter Antal, Another.
  o Depends on: nothing.

Winter:
  o Represents: whether the current season is winter.
  o Value range: binary (true, false).
  o Depends on: nothing.

Place:
  o Represents: the building where the class is held.
  o Value range: E504, Q408.
  o Depends on: nothing.

Subject:
  o Represents: the subject taught in class.
  o Value range: Bayesian networks, another.
  o Depends on: Lecturer.

Delay:
  o Represents: whether the lecturer fails to arrive on time for the class.
  o Value range: binary (true, false).
  o Depends on: Lecturer, Winter, Place.

Agreeableness:
  o Represents: how interesting the class was for the students.
  o Value range: excellent, good, bad.
  o Depends on: Subject and Delay.


    Task 3 Structure of the Bayesian network

    Figure 1 - Structure of the BN

Whether the student likes the class depends on whether there is a delay at the start of the
lesson and on which subject was taught. A delay can happen if the lecturer does not arrive on
time, and this can be affected by the place and the weather. Finally, the subject depends only on what the lecturer thinks is good to teach his students.
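The structure described above can be written down as plain parent sets. The snippet below is an illustrative Python sketch (not part of the original homework tooling) that records the edges of Figure 1 and checks that they form a DAG by computing a topological order:

```python
# Parent sets of the network from Figure 1 (variable names as in Task 2).
PARENTS = {
    "Lecturer": [],
    "Winter": [],
    "Place": [],
    "Subject": ["Lecturer"],
    "Delay": ["Lecturer", "Winter", "Place"],
    "Agreeableness": ["Subject", "Delay"],
}

def topological_order(parents):
    """Return a topological order of the nodes; raise if the graph has a cycle.

    A topological order exists exactly when the graph is acyclic, which is a
    structural requirement for any Bayesian network.
    """
    order, placed = [], set()
    while len(order) < len(parents):
        for v, ps in parents.items():
            if v not in placed and all(p in placed for p in ps):
                order.append(v)
                placed.add(v)
                break
        else:
            raise ValueError("cycle detected")
    return order

print(topological_order(PARENTS))
```

Any valid order puts the three root variables first and Agreeableness last, matching the causal reading of the figure.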


    Task 4 Quantify the Bayesian network

P(Lecturer)    Peter Antal    Another
               0.9            0.1

Peter is the official lecturer of the AI classes, so only on special occasions is someone else
the lecturer.

P(Winter)    True    False
             0.25    0.75

In Hungary, it is winter about 25% of the time.

P(Place)    E504    Q408
            0.5     0.5

There are two classes per week: on Mondays the class is in E504 and on Wednesdays it is in Q408.

P(Subject)                BNs    Another
Lecturer=Peter Antal      0.7    0.3
Lecturer=Another          0.5    0.5

Peter seems to like BNs; the other times, when he was not teaching, about half of the classes
were about BNs.

P(Delay)                                        True    False
Lecturer=Another,Winter=False,Place=Q408        0.1     0.9
Lecturer=Another,Winter=False,Place=E504        0.3     0.7
Lecturer=Another,Winter=True,Place=Q408         0.1     0.9
Lecturer=Another,Winter=True,Place=E504         0.3     0.7
Lecturer=Peter Antal,Winter=False,Place=Q408    0.2     0.8
Lecturer=Peter Antal,Winter=False,Place=E504    0.3     0.7
Lecturer=Peter Antal,Winter=True,Place=Q408     0.3     0.7
Lecturer=Peter Antal,Winter=True,Place=E504     0.4     0.6


Delays happen more often when the class is in building E504, because the lecturer has to fetch
the projector from far away. Even so, most classes start on time.

P(Agreeableness)              Bad    Good    Excellent
Subject=Another,Delay=True    0.2    0.5     0.3
Subject=Another,Delay=False   0.1    0.4     0.5
Subject=BNs,Delay=True        0.2    0.4     0.4
Subject=BNs,Delay=False       0.1    0.4     0.5

In general the students do not complain about the AI classes, only in case of a delay, and even
then only a few complain. The best classes are those with no delay where the subject is
BNs.

    Task 5 Inference and sensitivity analysis

    My inferences are as follows:

What is the probability that the class will be excellent when nothing else is given?
P(Agreeableness=Excellent) = 0.464

If the class starts late, what is the chance of the class being excellent?
P(Agreeableness=Excellent|Delay=true) = 0.368

Peter had to miss the class because of the winter, and the class was in building E504; what is the probability that it was excellent?

    P(Agreeableness=Excellent|Lecturer=Another,Winter=true,Place=E504) = 0.455

Because of the harsh winter Peter could not go to class, and the substitute lecturer taught search methods because he did not like BNs. What did the students think about the class?

P(Agreeableness|Lecturer=Another,Subject=Another,Winter=true)

Bad     Good    Excellent
0.12    0.42    0.46

If the class was excellent, what is the chance that the lecturer was Peter Antal?


    P(Lecturer=Peter Antal|Agreeableness=Excellent) = 0.898

The sensitivity analysis of P(Agreeableness=excellent|Subject,Delay) can be seen
in Figure 2:

    Figure 2 - Example of Sensitivity Analysis

    Computing the values using normal inference:

    P(Agreeableness|Delay) Excellent

    Delay=true 0.368

    Delay=False 0.5

    Difference is: ~0.132

    P(Agreeableness|Subject) Excellent

    Subject=BNs 0.473

    Subject=Another 0.447


    Difference is: ~0.026

As we can see, the value of Delay has a greater impact on the random variable Agreeableness than
the value of Subject does.
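The last Task 5 query, P(Lecturer=Peter Antal|Agreeableness=Excellent) = 0.898, can also be cross-checked with Bayes' rule alone: Winter and Place affect Agreeableness only through Delay, so they can be marginalized into a per-lecturer delay probability first. An illustrative sketch using the Task 4 tables ("Peter" abbreviates "Peter Antal"):

```python
# Bayes-rule cross-check of P(Lecturer = Peter Antal | Agreeableness = Excellent).
# Winter and Place influence Agreeableness only through Delay, so they are
# marginalized into a per-lecturer delay probability first.
P_winter = {True: 0.25, False: 0.75}
P_place  = {"E504": 0.5, "Q408": 0.5}
P_delay = {  # P(Delay = True | Lecturer, Winter, Place), from Task 4
    ("Another", False, "Q408"): 0.1, ("Another", False, "E504"): 0.3,
    ("Another", True,  "Q408"): 0.1, ("Another", True,  "E504"): 0.3,
    ("Peter",   False, "Q408"): 0.2, ("Peter",   False, "E504"): 0.3,
    ("Peter",   True,  "Q408"): 0.3, ("Peter",   True,  "E504"): 0.4,
}
P_subj = {"Peter":   {"BNs": 0.7, "Another": 0.3},   # P(Subject | Lecturer)
          "Another": {"BNs": 0.5, "Another": 0.5}}
P_exc = {("BNs", True): 0.4, ("BNs", False): 0.5,    # P(Excellent | Subject, Delay)
         ("Another", True): 0.3, ("Another", False): 0.5}

def p_delay(l):
    """P(Delay = True | Lecturer = l), marginalizing Winter and Place."""
    return sum(P_winter[w] * P_place[pl] * P_delay[(l, w, pl)]
               for w in P_winter for pl in P_place)

def p_excellent(l):
    """P(Agreeableness = Excellent | Lecturer = l)."""
    pd = p_delay(l)
    return sum(P_subj[l][s] * (pd * P_exc[(s, True)] + (1 - pd) * P_exc[(s, False)])
               for s in ("BNs", "Another"))

prior = {"Peter": 0.9, "Another": 0.1}
likelihoods = {l: p_excellent(l) for l in prior}
evidence = sum(prior[l] * likelihoods[l] for l in prior)
posterior_peter = prior["Peter"] * likelihoods["Peter"] / evidence
print(f"{posterior_peter:.3f}")  # 0.899
```

The value printed here, 0.899, matches the reported 0.898 up to rounding.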

Task 6 Comparison of the constructed and learnt models

    The parameters I used for sample generation and learning:

Sample size: 10000
Prior: CH
Max. parent count: 5
Max. permutations: 80

    The learnt structure:

    Figure 3 - Learning with 10000 samples

As we can see, the structure resembles the original except for the Winter random variable. The
CPTs changed according to the new model, which can be seen in the attached
PHAS_LZ1LUZ_HW_AI2013_learning_structure_10000s.xml.

A proper examination of whether the two models are the same is beyond the scope of this work,
but let us retry some queries:

P(Agreeableness=Excellent) = 0.472 (prev.: 0.464)
P(Agreeableness=Excellent|Delay=true) = 0.365 (prev.: 0.368)


P(Agreeableness=Excellent|Lecturer=Another,Winter=true,Place=E504) = 0.447 (prev.: 0.455)

As we can see, even though the learnt model is not exactly identical to the designed one, the
queries gave almost the same results, which indicates the resemblance.

Task 7 Analysis of estimation biases

    OVERCONFIDENCE

    I have changed the Lecturer and Delay variables to make my model overconfident.

    P(Lecturer) Peter Antal Another

    0.915 0.0842

    P(Delay) True False

    Lecturer=Another,Winter=False,Place=Q408 0.0842 0.9158

    Lecturer=Another,Winter=False,Place=E504 0.2584 0.7416

    Lecturer=Another,Winter=True,Place=Q408 0.0842 0.9158

    Lecturer=Another,Winter=True,Place=E504 0.2584 0.7416

    Lecturer=Peter Antal,Winter=False,Place=Q408 0.1702 0.8298

    Lecturer=Peter Antal,Winter=False,Place=E504 0.2584 0.7416

    Lecturer=Peter Antal,Winter=True,Place=Q408 0.2584 0.7416

    Lecturer=Peter Antal,Winter=True,Place=E504 0.3489 0.6511

    Rerunning my queries, I got the following result:

    P(Agreeableness=Excellent) = 0.469 (prev.: 0.464)

    P(Agreeableness=Excellent|Delay=true) = 0.368 (prev.: 0.368)

P(Agreeableness=Excellent|Lecturer=Another,Winter=true,Place=E504) = 0.461 (prev.: 0.455)


P(Agreeableness|Lecturer=Another,Subject=Another,Winter=true)

Bad                    Good                   Excellent
0.117 (prev.: 0.12)    0.417 (prev.: 0.42)    0.466 (prev.: 0.46)

P(Lecturer=Peter Antal|Agreeableness=Excellent) = 0.914 (prev.: 0.898)

It is possible to see that the probability that the lecturer is Peter Antal given an excellent
class increases. This happens because the Delay variable previously had a bias associating delays
with lecturer Peter Antal, and with overconfidence the delay associated with Peter decreased.

UNDERCONFIDENCE

I have changed the Lecturer and Delay variables to make my model underconfident.

    P(Lecturer) Peter Antal Another

    0.5626 0.4374

    P(Delay) True False

    Lecturer=Another,Winter=False,Place=Q408 0.4374 0.5626

    Lecturer=Another,Winter=False,Place=E504 0.4557 0.5443

    Lecturer=Another,Winter=True,Place=Q408 0.4374 0.5626

    Lecturer=Another,Winter=True,Place=E504 0.4557 0.5443

    Lecturer=Peter Antal,Winter=False,Place=Q408 0.4438 0.5562

    Lecturer=Peter Antal,Winter=False,Place=E504 0.4557 0.5443

    Lecturer=Peter Antal,Winter=True,Place=Q408 0.4557 0.5443

    Lecturer=Peter Antal,Winter=True,Place=E504 0.474 0.526

    Rerunning my queries, I got the following result:

    P(Agreeableness=Excellent) = 0.437 (prev.: 0.464)

    P(Agreeableness=Excellent|Delay=true) = 0.361 (prev.: 0.368)


P(Agreeableness=Excellent|Lecturer=Another,Winter=true,Place=E504) = 0.431 (prev.: 0.455)

P(Agreeableness|Lecturer=Another,Subject=Another,Winter=true)

Bad                    Good                   Excellent
0.145 (prev.: 0.12)    0.445 (prev.: 0.42)    0.410 (prev.: 0.46)

P(Lecturer=Peter Antal|Agreeableness=Excellent) = 0.567 (prev.: 0.898)

We can see that all the queries about Agreeableness yielded a lower Excellent value. This is
because the previous values for Lecturer and Delay carried a bias: the lecturer was almost always
Peter Antal and delays in classes happened very rarely. When Peter is the lecturer, the chance of
a class about BNs is higher and there are fewer delays, and the students seem to like it more
when they learn about BNs and when the class starts on time. With these two facts made
underconfident, the positive effect on Agreeableness was weaker.

The last query, the chance that the lecturer was Peter given that the class was excellent, showed
a big decrease. Previously the Delay variable was biased in the sense that delays happened less
frequently when the lecturer was Peter Antal, so his classes were better; the Lecturer variable
was also biased, with Peter Antal lecturing most of the classes, which combined with fewer delays
made the classes nicer. Therefore the effect of Peter on the Agreeableness of the classes is
lower now.

    Task 8 Effects of model uncertainty and sample size on learning

    Undersampled model:

    The parameters I used for sample generation and learning:

Sample size: 1000
Prior: CH
Max. parent count: 5
Max. permutations: 80

    The learnt structure:


We can see that the learnt model has significant differences compared with the original one. The arrows now indicate dependencies that did not exist before. We can make a few queries to discover how much these changes influence the knowledge in the model:

                                         Original    Learnt (10000 samples)    Learnt (1000 samples)
P(Agreeableness=Excellent)               0.464       0.472                     0.476
P(Agreeableness=Excellent|Delay=true)    0.368       0.365                     0.391
P(Agreeableness=Excellent|Lecturer=Another,Winter=true,Place=E504)
                                         0.455       0.447                     0.457

Analyzing the results we can see that the sample size has a big influence on the results. As we
can see, Agreeableness became an independent variable, and because of this its value tends to be
greater without the influence of the evidence.

    Underconfident model:

    The parameters I used for sample generation and learning:


Sample size: 10000
Prior: CH
Max. parent count: 5
Max. permutations: 80

    The learnt structure:

Analyzing the learned structure we can see that this time there are no isolated nodes, but there
are serious problems with the dependencies, and the new structure does not resemble the original
one. Let's check how accurate it is by comparing the queries:

                                         Original underconfident    Learnt underconfident
P(Agreeableness=Excellent)               0.437                      0.467
P(Agreeableness=Excellent|Delay=true)    0.361                      0.369
P(Agreeableness=Excellent|Lecturer=Another,Winter=true,Place=E504)
                                         0.431                      0.467

Looking at the table we can see that the queries gave similar answers without big discrepancies,
though of course it was expected that the accuracy would be lower than in the original model.

Analyzing both tests we can conclude that the model's probability distributions (CPTs) and the
sample size make a big difference when it comes to learning. In the first test, with few samples,
the entire structure was harmed, and in the second the dependencies were heavily influenced by
the probability distributions.
