Web-Mining Agents
Augmenting Probabilistic Graphical Models with Ontology Information: Object Classification
Prof. Dr. Ralf Möller, Dr. Özgür L. Özçep
Universität zu Lübeck, Institut für Informationssysteme
Tanya Braun (Exercises)
Based on:
Large-Scale Object Recognition using Label Relation Graphs
Jia Deng¹,², Nan Ding², Yangqing Jia², Andrea Frome², Kevin Murphy², Samy Bengio², Yuan Li², Hartmut Neven², Hartwig Adam²
¹University of Michigan, ²Google
Object Classification
• Assign semantic labels to objects
• Pipeline: Feature Extractor → Features → Classifier → Probabilities
[Figure: example image with the candidate labels Corgi, Puppy, Dog, Cat, each marked ✔ or ✖, and the predicted probabilities 0.9, 0.8, 0.9, 0.1]
Object Classification
• Multiclass classifier: Softmax (multinomial)
  – Assumes mutually exclusive labels.
  – [Figure: Corgi, Puppy, Dog, Cat with probabilities 0.2, 0.4, 0.3, 0.1]
• Independent binary classifiers: Logistic Regression
  – No assumptions about relations.
  – [Figure: Corgi, Puppy, Dog, Cat with probabilities 0.4, 0.8, 0.6, 0.2]
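The difference between the two classifiers can be illustrated with a minimal sketch. The raw scores below are hypothetical and not taken from the slides; the point is only that softmax outputs compete and sum to 1, while independent logistic outputs do not constrain each other.

```python
import math

def softmax(scores):
    """Multinomial classifier: probabilities compete and sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def independent_sigmoids(scores):
    """One logistic regression per label: each probability stands alone."""
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]

labels = ["Corgi", "Puppy", "Dog", "Cat"]
scores = [2.0, 1.5, 2.0, -2.0]  # hypothetical raw scores for one image

p_softmax = softmax(scores)
p_logistic = independent_sigmoids(scores)
for name, ps, pl in zip(labels, p_softmax, p_logistic):
    print(f"{name:6s} softmax={ps:.2f} logistic={pl:.2f}")
```

With softmax, "Corgi" and "Dog" must share probability mass although both can be true; with independent sigmoids, nothing stops "Dog" and "Cat" from both receiving high probability.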
Object labels have rich relations
• Hierarchical: e.g. Dog and Corgi
• Exclusion: e.g. Dog and Cat
• Overlap: e.g. Corgi and Puppy
• Softmax: all labels are mutually exclusive ☹
• Logistic Regression: all labels overlap ☹
Goal: A new classification model
• Respects real-world label relations
[Figure: Corgi, Puppy, Dog, Cat with probabilities 0.9, 0.8, 0.9, 0.1, shown next to the graph of their relations]
Visual Model + Knowledge Graph
[Figure: the visual model outputs 0.9, 0.8, 0.9, 0.1 for Corgi, Puppy, Dog, Cat; joint inference combines these outputs with the knowledge graph]
Assumption in this work: Knowledge graph is given and fixed.
Agenda
• Encoding prior knowledge (HEX graph)
• Classification model
• Efficient Exact Inference
Hierarchy and Exclusion (HEX) Graph
• Hierarchical edges (directed)
• Exclusion edges (undirected)
[Figure: the labels Corgi, Puppy, Dog, Cat connected by hierarchical and exclusion edges]
Examples of HEX graphs
• Mutually exclusive: Car, Bird, Dog, Cat
• All overlapping: Round, Red, Shiny, Thick
• Combination: Person, Male, Female, Child, Boy, Girl
State Space: Legal label configurations
Each edge defines a constraint.
• Hierarchy: (dog, corgi) can't be (0,1)
• Exclusion: (dog, cat) can't be (1,1)
Dog  Cat  Corgi  Puppy
0    0    0      0
0    0    0      1
0    0    1      0
0    0    1      1
1    0    0      0
…
1    1    0      0
1    1    0      1
…
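The constraint check can be sketched by enumerating all 2^4 configurations and keeping the legal ones. The edge lists are assumptions: the slides state only the Dog/Corgi hierarchy and the Dog/Cat exclusion explicitly, and the Dog/Puppy hierarchy edge is added here for illustration.

```python
from itertools import product

labels = ["Dog", "Cat", "Corgi", "Puppy"]
# Assumed edges for the running example; Dog -> Puppy is an illustration only.
hierarchy = [("Dog", "Corgi"), ("Dog", "Puppy")]  # (parent, child)
exclusion = [("Dog", "Cat")]

def is_legal(state):
    """state maps label -> 0/1; every edge constraint must hold."""
    for parent, child in hierarchy:
        if (state[parent], state[child]) == (0, 1):  # child without parent
            return False
    for a, b in exclusion:
        if (state[a], state[b]) == (1, 1):  # both exclusive labels set
            return False
    return True

legal_states = []
for bits in product([0, 1], repeat=len(labels)):
    if is_legal(dict(zip(labels, bits))):
        legal_states.append(bits)

for bits in legal_states:
    print(dict(zip(labels, bits)))
print(len(legal_states), "of", 2 ** len(labels), "configurations are legal")
```

Note that Cat and Corgi can never both be 1 even without a direct edge between them: Corgi = 1 forces Dog = 1, which excludes Cat.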
Agenda
• Encoding prior knowledge (HEX graph)
• Classification model
• Efficient Exact Inference
HEX Classification Model
• Pairwise Conditional Random Field (CRF)
• Input scores $x \in \mathbb{R}^n$, binary label vector $y \in \{0,1\}^n$
$$\Pr(y \mid x) = \frac{1}{Z(x)} \prod_i \phi_i(x_i, y_i) \prod_{i,j} \psi_{i,j}(y_i, y_j)$$
• Unary: same as logistic regression
$$\phi_i(x_i, y_i) = \begin{cases} \mathrm{sigmoid}(x_i) & \text{if } y_i = 1 \\ 1 - \mathrm{sigmoid}(x_i) & \text{if } y_i = 0 \end{cases}$$
• Pairwise: set illegal configuration to zero
$$\psi_{i,j}(y_i, y_j) = \begin{cases} 0 & \text{if } (y_i, y_j) \text{ violates constraints} \\ 1 & \text{otherwise} \end{cases}$$
→ All illegal configurations have probability zero.
• Partition function: sum over all (legal) configurations
$$Z(x) = \sum_{y \in \{0,1\}^n} \prod_i \phi_i(x_i, y_i) \prod_{i,j} \psi_{i,j}(y_i, y_j)$$
• Probability of a single label: marginalize all other labels.
$$\Pr(y_i = 1 \mid x) = \frac{1}{Z(x)} \sum_{y : y_i = 1} \prod_i \phi_i(x_i, y_i) \prod_{i,j} \psi_{i,j}(y_i, y_j)$$
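For a handful of labels, the model definition can be evaluated by brute force. The following is a minimal sketch, not the efficient inference procedure of the paper; the function name `hex_marginals`, the index-based edge lists, and the scores are assumptions made for illustration.

```python
import math
from itertools import product

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def hex_marginals(scores, hierarchy, exclusion):
    """Brute-force Pr(y_i = 1 | x) for a HEX-style pairwise CRF.

    scores: raw scores x_i, one per label
    hierarchy: (parent, child) index pairs
    exclusion: (i, j) index pairs
    """
    n = len(scores)
    z = 0.0               # partition function Z(x)
    unnorm = [0.0] * n    # sum of weights over states with y_i = 1
    for y in product([0, 1], repeat=n):
        # Pairwise potentials: weight 0 if any constraint is violated.
        if any(y[p] == 0 and y[c] == 1 for p, c in hierarchy):
            continue
        if any(y[i] == 1 and y[j] == 1 for i, j in exclusion):
            continue
        # Unary potentials: sigmoid(x_i) if y_i = 1, else 1 - sigmoid(x_i).
        weight = 1.0
        for x_i, y_i in zip(scores, y):
            s = sigmoid(x_i)
            weight *= s if y_i == 1 else 1.0 - s
        z += weight
        for i in range(n):
            if y[i] == 1:
                unnorm[i] += weight
    return [u / z for u in unnorm]

# Index order: 0 = Dog, 1 = Cat, 2 = Corgi, 3 = Puppy (assumed example graph).
scores = [0.5, -1.0, 2.0, 1.0]  # hypothetical scores
marginals = hex_marginals(scores, hierarchy=[(0, 2), (0, 3)], exclusion=[(0, 1)])
print([round(m, 3) for m in marginals])
```

Because illegal states carry zero weight, the marginal of a parent label is never smaller than that of its child, and the marginals of two exclusive labels never sum to more than 1. The loop visits all 2^n states, so this is only practical for small n.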
Special Case of HEX Model
• Softmax: all labels mutually exclusive (e.g. Car, Bird, Dog, Cat)
$$\Pr(y_i = 1 \mid x) = \frac{\exp(x_i)}{1 + \sum_j \exp(x_j)}$$
• Logistic Regressions: all labels overlapping (e.g. Round, Red, Shiny, Thick)
$$\Pr(y_i = 1 \mid x) = \frac{1}{1 + \exp(-x_i)}$$
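Both special cases follow from the model definition. As a sketch of the mutually exclusive case: the only legal states are the all-zero vector $\mathbf{0}$ and the one-hot vectors $e_i$. The symbols $\tilde{p}$ (unnormalized weight), $e_i$, and $\sigma$ (sigmoid) are notation introduced here and do not appear on the slides.

```latex
\begin{aligned}
\tilde{p}(\mathbf{0}) &= \prod_j \bigl(1 - \sigma(x_j)\bigr), \\
\tilde{p}(e_i) &= \sigma(x_i) \prod_{j \neq i} \bigl(1 - \sigma(x_j)\bigr)
  = \frac{\sigma(x_i)}{1 - \sigma(x_i)}\, \tilde{p}(\mathbf{0})
  = \exp(x_i)\, \tilde{p}(\mathbf{0}), \\
\Pr(y_i = 1 \mid x) &= \frac{\tilde{p}(e_i)}{\tilde{p}(\mathbf{0}) + \sum_j \tilde{p}(e_j)}
  = \frac{\exp(x_i)}{1 + \sum_j \exp(x_j)}.
\end{aligned}
```

In the all-overlapping case there are no pairwise constraints, so the product of unary potentials factorizes, $Z(x) = 1$, and each marginal is just $\sigma(x_i) = 1 / (1 + \exp(-x_i))$.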
Learning
• Maximize marginal probability of observed labels
• Example: the image is labeled "Dog", so Dog = 1 is observed and Corgi, Puppy, Cat are unobserved (?)
• Loss = −log Pr(Dog = 1)
• The loss is back-propagated into the DNN that produces the scores
DNN = Deep Neural Network
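The training signal can be sketched for a single example. The code below computes Loss = −log Pr(Dog = 1) by brute force and estimates its gradient with respect to the scores by finite differences, which stands in for back propagation here. The graph, the scores, and all function names are assumptions for illustration.

```python
import math
from itertools import product

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def legal_weights(scores, hierarchy, exclusion):
    """Yield (y, unnormalized weight) for every legal configuration."""
    for y in product([0, 1], repeat=len(scores)):
        if any(y[p] == 0 and y[c] == 1 for p, c in hierarchy):
            continue
        if any(y[i] == 1 and y[j] == 1 for i, j in exclusion):
            continue
        w = 1.0
        for x_i, y_i in zip(scores, y):
            w *= sigmoid(x_i) if y_i else 1.0 - sigmoid(x_i)
        yield y, w

def loss(scores, observed, hierarchy, exclusion):
    """Negative log marginal probability that label `observed` equals 1."""
    z = 0.0
    z_obs = 0.0
    for y, w in legal_weights(scores, hierarchy, exclusion):
        z += w
        if y[observed] == 1:
            z_obs += w
    return -math.log(z_obs / z)

def numeric_grad(scores, observed, hierarchy, exclusion, eps=1e-6):
    """Central finite differences; a stand-in for back propagation."""
    grad = []
    for k in range(len(scores)):
        up = list(scores)
        down = list(scores)
        up[k] += eps
        down[k] -= eps
        diff = (loss(up, observed, hierarchy, exclusion)
                - loss(down, observed, hierarchy, exclusion))
        grad.append(diff / (2 * eps))
    return grad

# 0 = Dog, 1 = Cat, 2 = Corgi, 3 = Puppy; only "Dog" is observed.
H, E = [(0, 2), (0, 3)], [(0, 1)]
x = [0.5, -1.0, 2.0, 1.0]  # hypothetical network outputs
current_loss = loss(x, 0, H, E)
grad = numeric_grad(x, 0, H, E)
print("loss:", current_loss)
print("grad:", grad)
```

The signs of the gradient reflect the graph: raising the Dog score lowers the loss, raising the score of the exclusive label Cat increases it, and raising the Corgi score also helps because Corgi implies Dog.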
Agenda
• Encoding prior knowledge (HEX graph)
• Classification model
• Efficient Exact Inference
Naïve Exact Inference is Intractable
• Inference:
  – Computing partition function
  – Perform marginalization
• HEX-CRF can be densely connected (large treewidth)
Observation 1: Exclusions are good
• Lots of exclusions → Small state space → Efficient inference
• Realistic graphs have lots of exclusions.
• Rigorous analysis in paper.
• Example (Car, Bird, Dog, Cat, all mutually exclusive): number of legal states is O(n), not O(2^n).
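The O(n) claim for a fully exclusive graph can be checked by counting. This is a small sketch with a hypothetical helper `count_legal_states`: when every pair of labels is exclusive, at most one label can be 1, which leaves n + 1 legal states.

```python
from itertools import combinations, product

def count_legal_states(n, exclusion):
    """Count binary vectors of length n with no excluded pair both set to 1."""
    count = 0
    for y in product([0, 1], repeat=n):
        if not any(y[i] and y[j] for i, j in exclusion):
            count += 1
    return count

for n in range(2, 9):
    all_pairs = list(combinations(range(n), 2))  # every pair is exclusive
    print(n, count_legal_states(n, all_pairs), 2 ** n)
```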
Observation 2: Equivalent graphs
[Figure: equivalent HEX graphs over the labels Dog, Cat, Corgi, Puppy, Pembroke Welsh Corgi, Cardigan Welsh Corgi]
• Sparse equivalent
  – Small treewidth ☺
  – Dynamic programming
• Dense equivalent
  – Prune states ☺
  – Can brute force
HEX Graph Inference
[Figure: a HEX graph over the nodes A, B, C, D, E, F, G at successive processing steps, and the junction tree built from it]
2. Build Junction Tree (offline)
5. Message Passing on legal states (online)
Digression: Polytrees
• A network is singly connected (a polytree) if it contains no undirected loops.
Theorem: Inference in a singly connected network can be done in linear time*.
Main idea: in variable elimination, need only maintain distributions over single nodes.
* in network size including table sizes.
© Jack Breese (Microsoft) & Daphne Koller (Stanford)
The problem with loops
[Figure: network Cloudy → Rain, Cloudy → Sprinkler, Rain → Grass-wet, Sprinkler → Grass-wet]
P(c) = 0.5
P(r | c) = 0.99, P(r | ¬c) = 0.01
P(s | c) = 0.01, P(s | ¬c) = 0.99
Grass-wet is a deterministic OR: the grass is dry only if no rain and no sprinklers.
$$P(\neg g) = P(\neg r, \neg s) \approx 0$$
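The problem with loops contd.
$$
\begin{aligned}
P(\neg g) &= \underbrace{P(\neg g \mid r, s)}_{0}\, P(r, s) + \underbrace{P(\neg g \mid r, \neg s)}_{0}\, P(r, \neg s) \\
&\quad + \underbrace{P(\neg g \mid \neg r, s)}_{0}\, P(\neg r, s) + \underbrace{P(\neg g \mid \neg r, \neg s)}_{1}\, P(\neg r, \neg s) \\
&= P(\neg r, \neg s) \approx 0
\end{aligned}
$$
Problem: treating Rain and Sprinkler as independent, as in a polytree, instead gives
$$P(\neg r)\, P(\neg s) \approx 0.5 \cdot 0.5 = 0.25$$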
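The gap between the two results can be reproduced numerically. This sketch assumes the conditional probability tables read P(r | c) = 0.99, P(r | ¬c) = 0.01, P(s | c) = 0.01, P(s | ¬c) = 0.99, which is an interpretation of the slide's table, and the variable names are chosen here.

```python
# Network: Cloudy -> Rain, Cloudy -> Sprinkler, Grass-wet = Rain OR Sprinkler.
p_c = 0.5
p_r_given = {True: 0.99, False: 0.01}  # P(r | c), P(r | not c)
p_s_given = {True: 0.01, False: 0.99}  # P(s | c), P(s | not c)

def prob(p_true, value):
    """Probability of a binary event taking `value`, given P(event is true)."""
    return p_true if value else 1.0 - p_true

# Exact: sum over Cloudy, so Rain and Sprinkler stay coupled through it.
exact_dry = sum(
    prob(p_c, c) * prob(p_r_given[c], False) * prob(p_s_given[c], False)
    for c in (True, False)
)

# Naive: multiply the single-node marginals as if the network had no loop.
p_r = sum(prob(p_c, c) * p_r_given[c] for c in (True, False))
p_s = sum(prob(p_c, c) * p_s_given[c] for c in (True, False))
naive_dry = (1.0 - p_r) * (1.0 - p_s)

print("exact P(dry):", exact_dry)
print("naive P(dry):", naive_dry)
```

Rain and Sprinkler are strongly anti-correlated through Cloudy, so keeping only their single-node marginals overestimates the probability of dry grass by a factor of about 25.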
Variable elimination
[Figure: chain A → B → C]
$$P(c) = \sum_b P(c \mid b) \underbrace{\sum_a P(b \mid a)\, P(a)}_{P(b)}$$
• P(A) × P(B | A) = P(B, A); sum out A to get P(B)
• P(B) × P(C | B) = P(C, B); sum out B to get P(C)
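The two elimination steps on the chain A → B → C can be sketched with plain lists. All probability values below are made up for illustration; only the order of operations follows the slide.

```python
# Hypothetical CPTs for a binary chain A -> B -> C (values are made up).
p_a = [0.6, 0.4]                        # P(A)
p_b_given_a = [[0.7, 0.3], [0.2, 0.8]]  # p_b_given_a[a][b] = P(B=b | A=a)
p_c_given_b = [[0.9, 0.1], [0.4, 0.6]]  # p_c_given_b[b][c] = P(C=c | B=b)

# Step 1: multiply P(A) and P(B | A), then sum out A to get P(B).
p_b = [sum(p_a[a] * p_b_given_a[a][b] for a in range(2)) for b in range(2)]

# Step 2: multiply P(B) and P(C | B), then sum out B to get P(C).
p_c = [sum(p_b[b] * p_c_given_b[b][c] for b in range(2)) for c in range(2)]

# Reference: sum the full joint P(A, B, C) directly.
p_c_brute = [
    sum(p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
        for a in range(2) for b in range(2))
    for c in range(2)
]

print("P(B):", p_b)
print("P(C):", p_c)
```

Each step only ever holds a factor over two variables, whereas the brute-force reference touches every entry of the full joint.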
Inference as variable elimination
• A factor over X is a function from val(X) to numbers in [0,1]:
  – A CPT is a factor
  – A joint distribution is also a factor
• BN inference:
  – factors are multiplied to give new ones
  – variables in factors summed out
• A variable can be summed out as soon as all factors mentioning it have been multiplied.
Variable Elimination with loops
[Figure: network over Age (A), Gender (G), Exposure to Toxics (E), Smoking (S), Cancer (C), Serum Calcium, Lung Tumor (L)]
• P(A) × P(G) × P(S | A,G) = P(A,G,S); sum out G to get P(A,S)
• P(A,S) × P(E | A) = P(A,E,S); sum out A to get P(E,S)
• P(E,S) × P(C | E,S) = P(E,S,C); sum out E, S to get P(C)
• P(C) × P(L | C) = P(C,L); sum out C to get P(L)
Complexity is exponential in the size of the factors
Join trees*
[Figure: the network over Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium, Lung Tumor, and a join tree with the cliques {A, G, S}, {A, E, S}, {E, S, C}, {C, L}, {C, S-C}]
• Clique {A, G, S}: P(A) × P(G) × P(S | A,G), which yields P(A,S)
A join tree is a partially precompiled factorization
* aka Junction Tree, Lauritzen-Spiegelhalter, or Hugin algorithm, …
Junction Tree Algorithm
• Converts Bayes Net into an undirected tree
  – Joint probability remains unchanged
  – Exact marginals can be computed
• Why?
  – Uniform treatment of Bayes Net and MRF
  – Efficient inference is possible for undirected trees