KI091322 Artificial Intelligence
Lecture 13: Learning Probabilistic Models (Bayesian Networks)
Department of Informatics, Faculty of Information Technology
Institut Teknologi Sepuluh Nopember Surabaya
2012. Courseware developed to support Student-Centered Learning (SCL) through e-Learning: Share ITS

[AIMA] Russell, Stuart J., and Peter Norvig, Artificial Intelligence: A Modern Approach, 3rd ed., Prentice Hall, New Jersey, 2010
Course Outline
1. Introduction to Intelligent Agents; Uninformed Search (search without extra information)
2. Informed Search (heuristics)
3. Local Search & Optimization
4. Adversarial Search (search that considers the other player's state)
5. Constraint Satisfaction Problems (search under constraints)
6. Reasoning: propositional logic
7. Reasoning: predicate logic
8. Reasoning: first-order logic
9. Reasoning: inference in first-order logic (forward/backward chaining)
10. Learning Probabilistic Models (parts 1+2)
Background
• Learning probabilistic models: finding solutions under uncertain conditions
• Real-world problems rarely come with complete information
• Solutions can still be found by learning how likely events are
• Practical step: represent the real-world problem as a probabilistic model and reason over it (probabilistic inference)
• Basic approach: Bayesian techniques
Uncertainty

Let At = leaving for the airport t minutes before the flight.
Does action At get the passenger there on time?
Problems: traffic jams, flat tires, and the like.
A hedged statement of belief: "A25 will get the passenger there on time IF there is no accident on the road, no traffic jam, no rain, no flat tire, etc."
(A1440 would also arrive on time, but requires spending the previous night at the airport hotel …)
Decisions under Uncertainty

Suppose we believe:
P(A25 gets me there on time | …) = 0.04
P(A90 gets me there on time | …) = 0.70
P(A120 gets me there on time | …) = 0.95
P(A1440 gets me there on time | …) = 0.9999

Which action to choose depends on preferences for missing the flight vs. time spent waiting at the airport, etc.
– Utility theory is used to represent and infer preferences
– Decision theory = probability theory + utility theory
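The decision rule above can be sketched numerically. The on-time probabilities are the ones listed on the slide; the utility numbers (penalty for missing the flight, cost per minute of waiting) are illustrative assumptions, not from the source.

```python
# Choosing a departure time by maximum expected utility.
# P_ON_TIME comes from the slide; the utilities below are assumed.
P_ON_TIME = {25: 0.04, 90: 0.70, 120: 0.95, 1440: 0.9999}
U_MISS = -1000.0        # assumed utility of missing the flight
U_WAIT_PER_MIN = -0.5   # assumed utility per minute spent at the airport

def expected_utility(t):
    """EU(A_t) = P(on time) * waiting cost + P(miss) * miss penalty."""
    p = P_ON_TIME[t]
    return p * (U_WAIT_PER_MIN * t) + (1 - p) * U_MISS

best = max(P_ON_TIME, key=expected_utility)
```

With these assumed utilities the best choice is A120; a much larger miss penalty would push the optimum toward A1440.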
Probability

• A coin flip has two outcomes: front (head) and back (tail)

P(COIN = tail) = 1/2, i.e. P(tail) = 1/2
Probability

• Probability of having cancer

P(has cancer) = 0.02
⇒ P(¬ has cancer) = 0.98
Joint Probability

• Two random events (variables): cancer, test result
  – Based on field data (e.g. 1000 patients)

P(has cancer, test positive):

Has cancer? | Test positive? | P(C, TP)
yes         | yes            | 0.018
yes         | no             | 0.002
no          | yes            | 0.196
no          | no             | 0.784

Of the 1000 patients, 18 have cancer and test positive; 784 have no cancer and test negative.
Conditional Probability

• Diagnostic question: given a positive test, how likely is cancer?

P(has cancer | test positive) = ?
P(C | TP) = P(C, TP) / P(TP) = 0.018 / (0.018 + 0.196) = 0.018 / 0.214 ≈ 0.084

Has cancer? | Test positive? | P(C, TP)
yes         | yes            | 0.018
yes         | no             | 0.002
no          | yes            | 0.196
no          | no             | 0.784
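The diagnostic computation above can be reproduced directly from the joint table. A minimal sketch; the tuple-keyed dictionary is just one convenient encoding:

```python
# Joint distribution P(cancer, test_positive) from the slide's table.
joint = {
    (True, True): 0.018,    # has cancer, test positive
    (True, False): 0.002,   # has cancer, test negative
    (False, True): 0.196,   # no cancer, test positive
    (False, False): 0.784,  # no cancer, test negative
}

# Marginalize: P(TP) = sum over both cancer values.
p_tp = sum(p for (c, tp), p in joint.items() if tp)

# Condition: P(C | TP) = P(C, TP) / P(TP).
p_c_given_tp = joint[(True, True)] / p_tp
```

Despite a positive test, the probability of cancer is only about 8.4%, because the disease is rare (base rate 2%).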
Bayes Network (BN) vs. Independence

MODEL (BN): P(cancer) and P(test positive | cancer)
PREDICTION (BN): compute P(test positive)
DIAGNOSTIC REASONING (BN): compute P(cancer | test positive)

Bayes network:  Cancer → Test positive
versus
Independence:   Cancer    Test positive   (no arc)

Under independence the joint factors: P(C, TP) = P(C) P(TP).
Graphical Notation: Coin

X1   X2   …   Xn   (no arcs)

• N independent coin flips
• No dependence between the individual flips: absolute independence

Example: Coin Flips

Each Xi has the same CPT:
h | 0.5
t | 0.5

Only distributions whose variables are absolutely independent can be represented by a Bayes' net with no arcs.
Graphical Notation: Traffic

• Variables:
  – R: it rains
  – T: there is traffic
• Model 1: independence (no arc)
• Model 2 (better): rain causes traffic (R → T)
• Model 2 is better because it matches the real world more closely

Example: Traffic

R → T

P(R):
+r | 1/4
¬r | 3/4

P(T | R):
+r: +t 3/4, ¬t 1/4
¬r: +t 1/2, ¬t 1/2
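The two tables above fully determine the joint distribution, since P(r, t) = P(r) P(t | r). A minimal sketch (the "+"/"-" string encoding is just a convention):

```python
# CPTs for the rain -> traffic network, taken from the slide.
P_R = {"+r": 0.25, "-r": 0.75}
P_T_GIVEN_R = {("+t", "+r"): 0.75, ("-t", "+r"): 0.25,
               ("+t", "-r"): 0.50, ("-t", "-r"): 0.50}

# Full joint: P(r, t) = P(r) * P(t | r).
joint = {(r, t): P_R[r] * P_T_GIVEN_R[(t, r)]
         for r in P_R for t in ("+t", "-t")}
```

For example P(+r, +t) = 1/4 × 3/4 = 3/16, and the four entries sum to 1.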
Independence

• Two variables X and Y are independent if: ∀x, y: P(x, y) = P(x) P(y)
  – In other words: ∀x, y: P(x | y) = P(x)
  – Written: X ⊥ Y
Example: Independence

• N coin flips (fair, independent):

Each Xi: h 0.5, t 0.5

P(X1, …, Xn) = P(X1) ⋯ P(Xn)
Another Example: Independence?

T = temperature, W = weather

Joint table 1:

T    | W    | P
warm | sun  | 0.4
warm | rain | 0.1
cold | sun  | 0.2
cold | rain | 0.3

Joint table 2:

T    | W    | P
warm | sun  | 0.3
warm | rain | 0.2
cold | sun  | 0.3
cold | rain | 0.2

Marginals (same for both tables):
P(T): warm 0.5, cold 0.5
P(W): sun 0.6, rain 0.4

Only table 2 equals the product of the marginals, so only its variables are independent.
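Whether a joint table factors into its marginals can be checked mechanically. A sketch using the two tables above (`is_independent` is a hypothetical helper name):

```python
def is_independent(joint, tol=1e-9):
    """True iff P(t, w) = P(t) * P(w) for every entry, within tol."""
    p_t, p_w = {}, {}
    for (t, w), p in joint.items():   # marginalize out each variable
        p_t[t] = p_t.get(t, 0.0) + p
        p_w[w] = p_w.get(w, 0.0) + p
    return all(abs(p - p_t[t] * p_w[w]) < tol
               for (t, w), p in joint.items())

table1 = {("warm", "sun"): 0.4, ("warm", "rain"): 0.1,
          ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}
table2 = {("warm", "sun"): 0.3, ("warm", "rain"): 0.2,
          ("cold", "sun"): 0.3, ("cold", "rain"): 0.2}
```

Both tables have identical marginals (warm/cold 0.5 each; sun 0.6, rain 0.4), yet only the second is their product, which is why independence cannot be read off the marginals alone.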
Example Bayes Network: Toothache

Cavity → Toothache,  Cavity → Catch

P(cavity):             1 parameter
P(toothache | cavity): 2 parameters
P(catch | cavity):     2 parameters

Total: 5 parameters, versus 2³ − 1 = 7 parameters for the full joint.

P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2
Conditional Independence

• P(Toothache, Cavity, Catch) has 2³ − 1 = 7 independent entries
• If I have a toothache, a dental probe might be more likely to catch
• But: if I have a cavity, the probability that the probe catches doesn't depend on whether I have a toothache:
  – P(+catch | +toothache, +cavity) = P(+catch | +cavity)
• The same independence holds if I don't have a cavity:
  – P(+catch | +toothache, ¬cavity) = P(+catch | ¬cavity)
• Catch is conditionally independent of Toothache given Cavity:
  – P(Catch | Toothache, Cavity) = P(Catch | Cavity)
• Equivalent statements:
  – P(Toothache | Catch, Cavity) = P(Toothache | Cavity)
  – P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)
  – One can be derived from the others easily
• Written: Catch ⊥ Toothache | Cavity
Example Bayes Network: Grass & Water

(Figure: a sprinkler/rain/wet-grass network; not reproduced here.)

• Given the evidence that the grass is wet, what is the probability that the sprinkler was on, and what is the probability that it rained?
• Both posteriors are computed by Bayes' rule, dividing by the same normalizing constant P(grass wet).
Example Bayes Network: Car

(Figure: a car-diagnosis network; not reproduced here.)
Example: Alarm Network

• Variables:
  – B: Burglary
  – E: Earthquake!
  – A: Alarm goes off
  – J: John calls
  – M: Mary calls

Structure: B → A, E → A, A → J, A → M

Independence questions:
– P(J | M) = P(J)?                  No
– P(A | J, M) = P(A | J)?           No
– P(A | J, M) = P(A)?               No
– P(B | A, J, M) = P(B | A)?        Yes
– P(B | A, J, M) = P(B)?            No
– P(E | B, A, J, M) = P(E | A)?     No
– P(E | B, A, J, M) = P(E | A, B)?  Yes
Example: Alarm Network (CPTs)

B  | P(B)
+b | 0.001
¬b | 0.999

E  | P(E)
+e | 0.002
¬e | 0.998

B  | E  | A  | P(A | B, E)
+b | +e | +a | 0.95
+b | +e | ¬a | 0.05
+b | ¬e | +a | 0.94
+b | ¬e | ¬a | 0.06
¬b | +e | +a | 0.29
¬b | +e | ¬a | 0.71
¬b | ¬e | +a | 0.001
¬b | ¬e | ¬a | 0.999

A  | J  | P(J | A)
+a | +j | 0.9
+a | ¬j | 0.1
¬a | +j | 0.05
¬a | ¬j | 0.95

A  | M  | P(M | A)
+a | +m | 0.7
+a | ¬m | 0.3
¬a | +m | 0.01
¬a | ¬m | 0.99
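These CPTs define the full joint over (B, E, A, J, M) via the chain rule for Bayes nets. A minimal sketch (the dictionary encoding is an arbitrary choice; only the "+" rows are stored, the rest follow by subtraction from 1):

```python
# Alarm-network CPTs from the tables above.
P_B = {"+b": 0.001, "-b": 0.999}
P_E = {"+e": 0.002, "-e": 0.998}
P_A = {("+b", "+e"): 0.95, ("+b", "-e"): 0.94,
       ("-b", "+e"): 0.29, ("-b", "-e"): 0.001}  # P(+a | b, e)
P_J = {"+a": 0.9, "-a": 0.05}                    # P(+j | a)
P_M = {"+a": 0.7, "-a": 0.01}                    # P(+m | a)

def joint(b, e, a, j, m):
    """P(b, e, a, j, m) = P(b) P(e) P(a | b, e) P(j | a) P(m | a)."""
    pa = P_A[(b, e)] if a == "+a" else 1 - P_A[(b, e)]
    pj = P_J[a] if j == "+j" else 1 - P_J[a]
    pm = P_M[a] if m == "+m" else 1 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm
```

For example P(+b, ¬e, +a, +j, +m) = 0.001 × 0.998 × 0.94 × 0.9 × 0.7 ≈ 5.9 × 10⁻⁴, and summing over all 32 assignments gives 1.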
Bayes Net Semantics

• A set of nodes, one per variable X
• A directed, acyclic graph
• A conditional distribution for each node
  – A collection of distributions over X, one for each combination of its parents' values: P(X | A1, …, An)
  – CPT: conditional probability table
  – Description of a noisy "causal" process

A Bayes net = Topology (graph) + Local Conditional Probabilities

Probabilities in BNs

• Bayes nets implicitly encode joint distributions
  – As a product of local conditional distributions
  – To see what probability a BN gives to a full assignment, multiply all the relevant conditionals together:
    P(x1, …, xn) = ∏i P(xi | parents(Xi))
  – Example (alarm network): P(+b, ¬e, +a, +j, +m) = P(+b) P(¬e) P(+a | +b, ¬e) P(+j | +a) P(+m | +a)
• This lets us reconstruct any entry of the full joint
• Not every BN can represent every joint distribution
  – The topology enforces certain conditional independencies
Bayes’ Nets

• A Bayes’ net is an efficient encoding of a probabilistic model of a domain
• Questions we can ask:
  – Inference: given a fixed BN, what is P(X | e)?
  – Representation: given a BN graph, what kinds of distributions can it encode?
  – Modeling: what BN is most appropriate for a given domain?
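The inference question P(X | e) can be sketched by brute-force enumeration over the full joint, here for P(Burglary | +j, +m) on the alarm network whose CPTs appear earlier. Exhaustive enumeration is only feasible for small networks; it is shown for clarity, not efficiency.

```python
# Alarm-network CPTs (from the earlier slide); only the "+" rows are stored.
P_B = {"+b": 0.001, "-b": 0.999}
P_E = {"+e": 0.002, "-e": 0.998}
P_A = {("+b", "+e"): 0.95, ("+b", "-e"): 0.94,
       ("-b", "+e"): 0.29, ("-b", "-e"): 0.001}
P_J = {"+a": 0.9, "-a": 0.05}
P_M = {"+a": 0.7, "-a": 0.01}

def joint(b, e, a, j, m):
    pa = P_A[(b, e)] if a == "+a" else 1 - P_A[(b, e)]
    pj = P_J[a] if j == "+j" else 1 - P_J[a]
    pm = P_M[a] if m == "+m" else 1 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm

def p_b_given_jm():
    """P(B | +j, +m): sum out E and A, then normalize over B."""
    unnorm = {b: sum(joint(b, e, a, "+j", "+m")
                     for e in ("+e", "-e") for a in ("+a", "-a"))
              for b in ("+b", "-b")}
    z = sum(unnorm.values())   # the normalizing constant P(+j, +m)
    return {b: v / z for b, v in unnorm.items()}
```

Even though both John and Mary call, P(+b | +j, +m) ≈ 0.28: the burglary prior is so small that a false alarm remains the more likely explanation.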
Causal Chains

• This configuration is a “causal chain”: X → Y → Z
  (example: X = low pressure, Y = rain, Z = traffic)
– Is X independent of Z given Y? Yes!
– Evidence along the chain “blocks” the influence
Common Cause

• Another basic configuration: two effects of the same cause: X ← Y → Z
  (example: Y = alarm, X = John calls, Z = Mary calls)
– Are X and Z independent? No (in general)
– Are X and Z independent given Y? Yes!
– Observing the cause blocks the influence between its effects.
Common Effect

• Last configuration: two causes of one effect (v-structure): X → Y ← Z
  (example: X = raining, Z = ballgame, Y = traffic)
– Are X and Z independent?
  • Yes: the ballgame and the rain both cause traffic, but they are not correlated
  • Still need to prove they must be independent (try it!)
– Are X and Z independent given Y?
  • No: seeing traffic puts the rain and the ballgame in competition as explanations
– This is backwards from the other cases
  • Observing an effect activates influence between its possible causes.
The General Case
• Any complex example can be analyzed using these three canonical cases
• General question: in a given BN, are two variables independent (given evidence)?
• Solution: analyze the graph
Reachability

• Recipe: shade the evidence nodes
• Attempt 1: remove the shaded nodes. If two nodes are still connected by an undirected path, they are not conditionally independent
• Almost works, but not quite
  – Where does it break?
  – Answer: the v-structure at T doesn’t count as a link in a path unless it is “active”

(Figure: an example graph with nodes R, T, B, D, L; not reproduced.)
Reachability (D-Separation)

• Question: are X and Y conditionally independent given evidence variables {Z}?
  – Yes, if X and Y are “separated” by Z
  – Look for active paths from X to Y
  – No active paths = independence!
• A path is active if each of its triples is active:
  – Causal chain A → B → C (either direction) where B is unobserved
  – Common cause A ← B → C where B is unobserved
  – Common effect (aka v-structure) A → B ← C where B or one of its descendants is observed
• All it takes to block a path is a single inactive segment

(Figure: the active vs. inactive triples; not reproduced.)
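The active-triple rules can be turned into a brute-force d-separation test: enumerate the undirected paths between two nodes and check every consecutive triple. A sketch on the alarm network (the `PARENTS` encoding and helper names are my own; for large graphs one would use the Bayes-ball algorithm instead of path enumeration):

```python
# Structure of the alarm network: node -> list of its parents.
PARENTS = {"B": [], "E": [], "A": ["B", "E"], "J": ["A"], "M": ["A"]}

def children(node):
    return [n for n, ps in PARENTS.items() if node in ps]

def descendants(node):
    out, stack = set(), [node]
    while stack:
        for c in children(stack.pop()):
            if c not in out:
                out.add(c)
                stack.append(c)
    return out

def undirected_paths(x, y, path=None):
    """All simple paths from x to y, ignoring edge direction."""
    path = path or [x]
    if x == y:
        yield path
        return
    for n in PARENTS[x] + children(x):
        if n not in path:
            yield from undirected_paths(n, y, path + [n])

def triple_active(a, b, c, evidence):
    into_b = b in children(a)    # is the edge a -> b?
    out_of_b = c in children(b)  # is the edge b -> c?
    if into_b and not out_of_b:  # a -> b <- c: v-structure
        return b in evidence or bool(descendants(b) & evidence)
    return b not in evidence     # chain or common cause

def d_separated(x, y, evidence):
    """True iff every undirected path from x to y contains an inactive triple."""
    return not any(
        all(triple_active(p[i], p[i + 1], p[i + 2], evidence)
            for i in range(len(p) - 2))
        for p in undirected_paths(x, y))
```

This reproduces the slide's canonical cases: J and M become independent once A is observed (common cause), while B and E are independent a priori but become dependent once A, or a descendant of A such as M, is observed (v-structure).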
Example

(Figure: a small network over R, T, B, T′ with independence queries; the answer shown is Yes. Not reproduced.)

Example

(Figure: a network over R, T, B, D, L, T′ with independence queries; the answers shown are Yes, Yes, Yes. Not reproduced.)
Example

• Variables:
  – R: Raining
  – T: Traffic
  – D: Roof drips
  – S: I’m sad
• Questions: (shown in the figure, not reproduced; the answer to one of the queries is Yes)
Causality?

• When Bayes’ nets reflect the true causal patterns:
  – Often simpler (nodes have fewer parents)
  – Often easier to think about
  – Often easier to elicit from experts
• BNs need not actually be causal
  – Sometimes no causal net exists over the domain
  – End up with arrows that reflect correlation, not causation
• What do the arrows really mean?
  – Topology may happen to encode causal structure
  – Topology only guaranteed to encode conditional independence