20
Efficient Probabilistic Logic Reasoning with Graph Neural Networks Le Song Associate Professor, CSE, College of Computing Associate Director, Center for Machine Learning Georgia Institute of Technology

Efficient Probabilistic Logic Reasoning with Graph Neural

  • Upload
    others

  • View
    15

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Efficient Probabilistic Logic Reasoning with Graph Neural

Efficient Probabilistic Logic Reasoning with Graph Neural Networks

Le Song

Associate Professor, CSE, College of ComputingAssociate Director, Center for Machine Learning

Georgia Institute of Technology

Page 2: Efficient Probabilistic Logic Reasoning with Graph Neural

Learning with small dataβ€’ Combine learning and logic reasoning

β€’ Learning: generalize with noisy inputβ€’ Logic/symbolic reasoning: predict without data

Data requirement

Noise Tolerance Deep

learning

Symbolic

Desired

Example of logic inference: HorseShape(c) ∧ StripePattern(c) β‡’ Zebra(c)

+

β‡’Zebra

Horse

2

Page 3: Efficient Probabilistic Logic Reasoning with Graph Neural

Graph neural networks (GNN/Structure2vec)β€’ Obtain embedding via iterative updates:

1. Initialize πœ‡πœ‡π‘–π‘–(0) = β„Ž π‘Šπ‘Š1𝑋𝑋𝑖𝑖 ,βˆ€ 𝑖𝑖

2. Iterate 𝑇𝑇 times

πœ‡πœ‡π‘–π‘–(𝑑𝑑) ← β„Ž π‘Šπ‘Š1πœ‡πœ‡π‘–π‘–

(π‘‘π‘‘βˆ’1) + π‘Šπ‘Š2 οΏ½π‘—π‘—βˆˆπ’©π’© 𝑖𝑖

πœ‡πœ‡π‘—π‘—(π‘‘π‘‘βˆ’1)

neural network

𝑋𝑋6

𝑋𝑋1

𝑋𝑋2 𝑋𝑋3

𝑋𝑋4

𝑋𝑋5

πœ‡πœ‡1(0)πœ‡πœ‡1(1)

𝑄𝑄 𝑄𝑄 , 𝑄𝑄

Supervised Learning

GenerativeModels

ReinforcementLearning

πœ‡πœ‡π‘–π‘–(𝑇𝑇) πœ‡πœ‡π‘–π‘–

(𝑇𝑇) πœ‡πœ‡π‘˜π‘˜π‘‡π‘‡

π‘˜π‘˜βˆˆπ‘†π‘†πœ‡πœ‡π‘—π‘—(𝑇𝑇)

πœ‡πœ‡2(1)

πœ‡πœ‡6(1)

πœ‡πœ‡3(1)

πœ‡πœ‡5(1)

πœ‡πœ‡4(1)

[Dai, et al. ICML 16]3

Page 4: Efficient Probabilistic Logic Reasoning with Graph Neural

Logical expressiveness of GNNβ€’ Logical classifiers

β€’ classifies each node according to whether it satisfies a formulaβ€’ the formula is expressed in first order predicate logic (FO)

4

graded modal logic βŠ‚ FOC2 βŠ‚ FO

WL-test=

GNN

=

GNN(∞ depth)=

(finite depth)

≀

R-GNN

Readout(global information)

[Barcelo, et al. ICLR’20]

FOC2

β€’ 2 variableβ€’ With Counting quantifier, βˆƒβ‰₯𝑁𝑁

𝑓𝑓 π‘₯π‘₯ ≔ Red π‘₯π‘₯ ∧ βˆƒβ‰₯3𝑦𝑦 [Β¬Edge π‘₯π‘₯,𝑦𝑦 ∧ 𝐡𝐡𝐡𝐡𝐡𝐡𝐡𝐡 𝑦𝑦 ]

graded modal logic βŠ‚ FOC2

β€’ Guarded by Edge π‘₯π‘₯,𝑦𝑦 .β€’ Allow βˆƒβ‰₯𝑁𝑁𝑦𝑦 Edge π‘₯π‘₯,𝑦𝑦 ∧ Predicate 𝑦𝑦 , β€’ but do not allow βˆƒβ‰₯𝑁𝑁𝑦𝑦 Predicate 𝑦𝑦𝑓𝑓 π‘₯π‘₯ ≔ Red π‘₯π‘₯ ∧ βˆƒβ‰₯3𝑦𝑦 [Edge π‘₯π‘₯,𝑦𝑦 ∧ 𝐡𝐡𝐡𝐡𝐡𝐡𝐡𝐡 𝑦𝑦 ]

Page 5: Efficient Probabilistic Logic Reasoning with Graph Neural

More compact representation and lower error

Harvard clean energy dataset, 2.3 million organic molecules, predict power conversion efficiency (0 -12 %)

0.1M 1M 10M 100M 1000M

0.0850.0950.1200.150

0.280

Dimension

MAE

Embedded MF

Embedded BP

Weisfeiler-LehmanLevel 6

HashedWL Level 6

Structure2Vecreduces model

size by 10,000x !

Le Song 5[Dai, et al. ICML’16]

Page 6: Efficient Probabilistic Logic Reasoning with Graph Neural

Fraudulent account detection

Alipay: online payment platform

new accounts in a month: millions of nodes and edges.

Fake account can increase system level risk

Account – Device Network

Bad

Good

device

[Liu, et al. CCS’17, CIKM’17, AAAI’19]Recall

Precision

detection comparisonrule structure2vecnode2vec

6

Page 7: Efficient Probabilistic Logic Reasoning with Graph Neural

Knowledge base reasoningβ€’ Many domains need knowledge bases …

β€’ Contain incorrect, incomplete, duplicated recordsβ€’ Need link prediction, attribution classification,

record de-duplication.

β€’ Many existing rules/logic

β€’ Graph neural networksβ€’ not explicitly use existing knowledge

7

Page 8: Efficient Probabilistic Logic Reasoning with Graph Neural

UW-CSE dataset detailsβ€’ 22 relations

β€’ Teach, publish …

β€’ Task goalβ€’ Predict who is whose advisorβ€’ Zero observed facts for query predicates

β€’ 94 crowd-sourced FOL formulasadvisedBy(s, p) β‡’ professor(p)

advisedBy(s, p) β‡’Β¬yearsInProgram(s, Year_1)

professor(x) β‡’Β¬student(y)

publication(p, x) v publication(p,y) v student(x) v Β¬student(y) β‡’ professor(y)

student(x) v Β¬advisedBy(x,y) β‡’ tempAdvisedBy(x,y)… S1 S3

P2 Paper1publish

Course1

8

Page 9: Efficient Probabilistic Logic Reasoning with Graph Neural

π“—π“—π“žπ“ž

Bipartite graph representation for knowledge baseβ€’ Entity, 𝓒𝓒 = 𝐴𝐴,𝐡𝐡,𝐢𝐢,𝐷𝐷…‒ Predicate (attribute | relation), π‘Ÿπ‘Ÿ β‹… :𝓒𝓒 Γ— 𝓒𝓒 Γ— β‹― ↦ 0,1

β€’ Eg. Smoke(x), Friend(x,x’), Like(x,x’)

F(A,B)=1

F(A,C)=1

S(B)=1

S(A)=1

F(A,D)=1

A B C D

F(B,C) F(B,D) F(C,D) S(C) S(D)

π“–π“–π“šπ“š

predicatebinary variable latent variable

9

Page 10: Efficient Probabilistic Logic Reasoning with Graph Neural

π“žπ“ž 𝓗𝓗

Markov logic networksβ€’ Use logic formula 𝑓𝑓 β‹… :𝓒𝓒 Γ— 𝓒𝓒 Γ— β‹―Γ— 𝓒𝓒 ↦ 0,1 for potential functions

β€’ Eg. formula 𝑓𝑓(x,x’): Friend(x,x’) ∧ Smoke(x) β‡’Smoke(x’)

F(A,B)=1

F(A,C)=1

S(B)=1

S(A)=1

F(A,D)=1 F(B,C) F(B,D) F(C,D) S(C) S(D)

𝑓𝑓(A,B) 𝑓𝑓(A,C) 𝑓𝑓(A,D) 𝑓𝑓(B,C) 𝑓𝑓(B,D) 𝑓𝑓(C,D)

𝑀𝑀𝑀𝑀𝑀𝑀

𝑃𝑃 π“žπ“ž,𝓗𝓗 =1𝑍𝑍

exp �𝑓𝑓

π‘€π‘€π‘“π‘“οΏ½π‘Žπ‘Žπ‘“π‘“

πœ™πœ™π‘“π‘“(π‘Žπ‘Žπ‘“π‘“)

𝑀𝑀𝑓𝑓: formula weight, eg. πœ™πœ™π‘“π‘“ π‘₯π‘₯, π‘₯π‘₯π‘₯ : Β¬Friend (x,x’)∨¬Smoke (x) ∨ Smoke (x’)10

Page 11: Efficient Probabilistic Logic Reasoning with Graph Neural

π“žπ“ž 𝓗𝓗

Challenges in inferenceβ€’ Large grounded network, 𝑂𝑂 𝑛𝑛2 in the number 𝑛𝑛 of entities!

β€’ Enumerate configuration over 𝑂𝑂(𝑛𝑛2) binary variables, with 𝑂𝑂 2𝑛𝑛2 possibilities.

F(A,B)=1

F(A,C)=1

S(B)=1

S(A)=1

F(A,D)=1 F(B,C) F(B,D) F(C,D) S(C) S(D)

𝑓𝑓(A,B) 𝑓𝑓(A,C) 𝑓𝑓(A,D) 𝑓𝑓(B,C) 𝑓𝑓(B,D) 𝑓𝑓(C,D)

𝑀𝑀𝑀𝑀𝑀𝑀

𝑃𝑃 F B, C π“žπ“ž 𝑃𝑃 S C π“žπ“ž

𝑍𝑍 = οΏ½π‘Žπ‘Žπ‘Žπ‘Žπ‘Žπ‘Ž π‘π‘π‘π‘π‘π‘π‘π‘π‘–π‘–π‘π‘π‘Žπ‘Žπ‘π‘ π‘π‘π‘π‘π‘π‘π‘π‘π‘–π‘–π‘›π‘›π‘Žπ‘Žπ‘‘π‘‘π‘–π‘–π‘π‘π‘›π‘› 𝑝𝑝𝑓𝑓 π‘£π‘£π‘Žπ‘Žπ‘£π‘£π‘–π‘–π‘Žπ‘Žπ‘π‘π‘Žπ‘Žπ‘π‘ π‘£π‘£π‘Žπ‘Žπ‘Žπ‘Žπ‘£π‘£π‘π‘π‘π‘

exp �𝑓𝑓

π‘€π‘€π‘“π‘“οΏ½π‘Žπ‘Žπ‘“π‘“

πœ™πœ™π‘“π‘“(π‘Žπ‘Žπ‘“π‘“)

11

Page 12: Efficient Probabilistic Logic Reasoning with Graph Neural

Variational EM

PosteriorLikelihood

GCNMLN

KnowledgeGraph

πœ™πœ™π‘“π‘“(β‹…)

formula potential predicate posterior

12

Variational EM

𝑃𝑃 π“žπ“ž,𝓗𝓗 =1𝑍𝑍 exp οΏ½

𝑓𝑓

π‘€π‘€π‘“π‘“οΏ½π‘Žπ‘Žπ‘“π‘“

πœ™πœ™π‘“π‘“(π‘Žπ‘Žπ‘“π‘“) 𝑄𝑄 𝓗𝓗 π“žπ“ž

[Zhang ICLR’20]

Page 13: Efficient Probabilistic Logic Reasoning with Graph Neural

ExpressGNN: use GNN in variational inferenceβ€’ GNN on original KB (π“–π“–π“šπ“š) to get embedding πœ‡πœ‡π΄π΄, πœ‡πœ‡π΅π΅,πœ‡πœ‡πΆπΆ , πœ‡πœ‡π·π· for entities

πœ‡πœ‡π΄π΄, πœ‡πœ‡π΅π΅,πœ‡πœ‡πΆπΆ , πœ‡πœ‡π·π· = 𝐺𝐺𝑀𝑀𝑀𝑀(π“–π“–π“šπ“š;πœƒπœƒ)

π“žπ“ž

F(A,B)=1

F(A,C)=1

S(B)=1

S(A)=1

F(A,D)=1

A B C D

πœ‡πœ‡π΄π΄ πœ‡πœ‡π΅π΅ πœ‡πœ‡πΆπΆ πœ‡πœ‡π·π·

π“–π“–π“šπ“š

13[Zhang ICLR’20]

πœ‡πœ‡π‘–π‘–(𝑑𝑑) ← β„Ž π‘Šπ‘Š1πœ‡πœ‡π‘–π‘–

(π‘‘π‘‘βˆ’1) + π‘Šπ‘Š2 οΏ½π‘—π‘—βˆˆπ’©π’© 𝑖𝑖

πœ‡πœ‡π‘—π‘—(π‘‘π‘‘βˆ’1)

graph neural networks

Page 14: Efficient Probabilistic Logic Reasoning with Graph Neural

ExpressGNN: use GNN in variational inferenceβ€’ GNN on original KB (π“–π“–π“šπ“š) to get embedding πœ‡πœ‡π΄π΄, πœ‡πœ‡π΅π΅,πœ‡πœ‡πΆπΆ , πœ‡πœ‡π·π· for entities

πœ‡πœ‡π΄π΄, πœ‡πœ‡π΅π΅,πœ‡πœ‡πΆπΆ , πœ‡πœ‡π·π· = 𝐺𝐺𝑀𝑀𝑀𝑀(π“–π“–π“šπ“š;πœƒπœƒ)

β€’ 𝑄𝑄 F B, C = 1 π“žπ“ž : = 11+exp πœ‡πœ‡π΅π΅

⊀Θ𝐹𝐹 πœ‡πœ‡πΆπΆ+πœ”πœ”π΅π΅βŠ€πœ”πœ”πΉπΉ

, 𝑄𝑄 S C = 1 π“žπ“ž : = 11+exp πœƒπœƒπ‘†π‘†

⊀ πœ‡πœ‡πΆπΆ,πœ”πœ”πΆπΆ

π“žπ“ž

F(A,B)=1

F(A,C)=1

S(B)=1

S(A)=1

F(A,D)=1

𝓗𝓗F(B,C) F(B,D) F(C,D) S(C) S(D)

A B C D

πœ‡πœ‡π΄π΄ πœ‡πœ‡π΅π΅ πœ‡πœ‡πΆπΆ πœ‡πœ‡π·π·

π“–π“–π“šπ“šβ€²

14[Zhang ICLR’20]

πœ‡πœ‡π‘–π‘–(𝑑𝑑) ← β„Ž π‘Šπ‘Š1πœ‡πœ‡π‘–π‘–

(π‘‘π‘‘βˆ’1) + π‘Šπ‘Š2 οΏ½π‘—π‘—βˆˆπ’©π’© 𝑖𝑖

πœ‡πœ‡π‘—π‘—(π‘‘π‘‘βˆ’1)

graph neural networks

Page 15: Efficient Probabilistic Logic Reasoning with Graph Neural

F(A,B)=1

F(A,C)=1

S(B)=1

S(A)=1

F(A,D)=1

A B C D

F(B,C) F(B,D) F(C,D) S(C) S(D)

F(A,B)=1

F(A,C)=1

S(B)=1

S(A)=1

F(A,D)=1 F(B,C) F(B,D) F(C,D) S(C) S(D)

𝑓𝑓(A,B) 𝑓𝑓(A,C) 𝑓𝑓(A,D) 𝑓𝑓(B,C) 𝑓𝑓(B,D) 𝑓𝑓(C,D)

Is GNN embedding expressive enough?

if and only if

and

𝐹𝐹 π‘₯π‘₯, π‘¦π‘¦π“–π“–π“šπ“šβ€²

𝐹𝐹 π‘₯π‘₯β€²,𝑦𝑦π‘₯

π‘₯π‘₯π“–π“–π“šπ“š π‘₯π‘₯π‘₯ 𝑦𝑦

π“–π“–π“šπ“š 𝑦𝑦π‘₯

not true

𝐹𝐹 π‘₯π‘₯,𝑦𝑦𝑀𝑀𝑀𝑀𝑁𝑁

𝐹𝐹 π‘₯π‘₯β€²,𝑦𝑦π‘₯

𝑀𝑀𝑀𝑀𝑀𝑀

π“–π“–π“šπ“šβ€²

15[Zhang ICLR’20]

𝑄𝑄 F B, C = 1 π“žπ“ž: = 1

1+exp πœ‡πœ‡π΅π΅βŠ€Ξ˜πΉπΉ πœ‡πœ‡πΆπΆ+πœ”πœ”π΅π΅

βŠ€πœ”πœ”πΉπΉ

Page 16: Efficient Probabilistic Logic Reasoning with Graph Neural

Variational EM

PosteriorLikelihood

GCNMLN

KnowledgeGraph

πœ™πœ™π‘“π‘“(β‹…)

formula potential predicate posterior

16

Variational EM

𝑃𝑃 π“žπ“ž,𝓗𝓗 =1𝑍𝑍 exp οΏ½

𝑓𝑓

π‘€π‘€π‘“π‘“οΏ½π‘Žπ‘Žπ‘“π‘“

πœ™πœ™π‘“π‘“(π‘Žπ‘Žπ‘“π‘“) 𝑄𝑄 𝓗𝓗 π“žπ“ž

[Zhang et al. ICLR’20]

Page 17: Efficient Probabilistic Logic Reasoning with Graph Neural

Cora dataset details (CS paper citations)β€’ 10 relation types

β€’ Author, Title, Venue, HasWordTitle, …

β€’ Task goal (entity resolution)β€’ De-duplicate citations, authors, and venuesβ€’ Zero observed facts for query predicates

β€’ 46 crowd-sourced FOL formulas

A1 A2

W1 Paper1HasWordTitle

SameAuthor?

Author(bc1,a1) v Author(bc2,a2) v SameAuthor(a1,a2) β‡’ SameBib(bc1,bc2)

HasWordAuthor(a1, w) v HasWordAuthor(a2, w) β‡’ SameAuthor(a1, a2)

Title(bc1,t1) v Title(bc2,t2) v SameBib(bc1,bc2) β‡’ SameTitle(t1,t2)

SameVenue(v1,v2) v SameVenue(v2,v3) β‡’ SameVenue(v1,v3)

Title(bc1, t1) v Title(bc2, t2) v HasWordTitle(t1, +w) v HasWordTitle(t2, +w) β‡’ SameBib(bc1, bc2)

…17

Page 18: Efficient Probabilistic Logic Reasoning with Graph Neural

Inference accuracy and timeβ€’ Area under precision-recall curve (AUC-PR)

β€’ Inference wall clock time

ExpressGNN

18

Page 19: Efficient Probabilistic Logic Reasoning with Graph Neural

Large Facebook15K-237 datasetβ€’ # entity: 15K, #ground predicate: 50M, #ground formula: 679B

β€’ Increase training data for target predicates

19

Page 20: Efficient Probabilistic Logic Reasoning with Graph Neural

Learning with small dataβ€’ Combining learning and logic reasoning

β€’ Learning: generalize with noisy inputβ€’ Logic/symbolic reasoning: predict without data

Data requirement

Noise Tolerance Deep

learning

Symbolic

Desired

Symbolic reasoning: HorseShape(c) ∧ StripePattern(c) β‡’ Zebra(c)

Convolutional NNLife is greatRecurrent NN

Learning based approaches

20