36
KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts Dana Movshovitz-Attias and William W Cohen ACL, July 28, 2015 Computer Science Department

KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

Page 1: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

KB-LDA: Jointly Learning a Knowledge Base of Hierarchy,

Relations, and Facts Dana Movshovitz-Attias and William W Cohen

ACL, July 28, 2015

Computer Science Department

Page 2: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Who is Barack Obama’s wife?

Page 3: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Barack Obama Michelle Obama married

wife of husband of

Who is Barack Obama’s wife?

Knowledge Base!

Page 4: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

What is the initial size of an ArrayList?

What is the Python equivalent of a HashMap?

What is the run time of quick sort?

Page 5: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Knowledge Base (KB) Construction Ontology-Guided Construction

•  NELL •  FreeBase •  Yago •  WordNet

Open IE

•  ReVerb •  TextRunner

organism! location!

university!

entity!

city!person! animal!

(Beijing, is the capital of, China)!(Penticton, has very, warm summers)!(Goods, can be defined in, a variety of ways)!

Page 6: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Reasoning with Ontologies

organism! location!

university!

entity!

city!person! animal!

✔ Ontologies give information context

✔ Easy to extract domain-specific information

Page 7: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Reasoning with Ontologies

organism! location!

university!

entity!

city!person! animal!

✔ Ontologies give information context

✔ Easy to extract domain-specific information

✘  Ontologies are •  expensive •  require prior knowledge •  incomplete/outdated

✘  Ontology structure and facts are often drawn from different corpus statistics

software!

restaurant!

Page 8: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Goal: Data-Driven Knowledge Base

•  Schema and facts are drawn from corpus and jointly optimized

•  Unsupervised: learn latent corpus structure together with best-matching facts

person! location!

university!

entity!

Barack Obama

Michelle Obama

Sasha Malia father!

parent!mother! Harvard studied at!Columbia

CMU

city!

Honolulu Chicago

born in!

married!

Page 9: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Automatically Learning a KB with Structure and Facts

I successfully done with provision on both side client and server!and when I call this code for the first time 1st if(line no. 2) and 2nd if(line no. 8) gets executed and if I call this code 2nd time then 1st else(line no.4) get executed as client is already provisioned but here it execute 2nd if(line no. 8) instead of executing 2nd else(line no. 10) and throws an exception at line no. 9!

serverprovision.Apply(); "There is already an object named 'schema_info' in the database.”!And if i try to synchronize then it throws an exception at line syncOrchestrator.Synchronize(); "The current operation could not be completed because the database is not provisioned for sync or you not!

.4) get executed as client is already provisioned but here it execute 2nd if(line no. 8) instead of executing 2nd else(line no. 10) and throws an exception at line no. 9 serverprovision.Apply(); "There is already an object named 'schema_info' in the database.”!And if i try to synchronize then it throws an exception at line!

Corpus

Collection of Documents from StackOverflow

Relation Extraction

KB-LDA

Ontology Relations

Documents

KB Evaluation

People! Places!

Universities!

Everything!

Barack Obama!Michelle Obama!married!

Sasha! Malia!

father!parent!mother! Harvard!studied at!

Columbia!CMU!

Cities!

Honolulu!Chicago!

born in!

Page 10: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

KB-LDA

Ontology Relations

Documents

Automatically Learning a KB with Structure and Facts

I successfully done with provision on both side client and server!and when I call this code for the first time 1st if(line no. 2) and 2nd if(line no. 8) gets executed and if I call this code 2nd time then 1st else(line no.4) get executed as client is already provisioned but here it execute 2nd if(line no. 8) instead of executing 2nd else(line no. 10) and throws an exception at line no. 9!

serverprovision.Apply(); "There is already an object named 'schema_info' in the database.”!And if i try to synchronize then it throws an exception at line syncOrchestrator.Synchronize(); "The current operation could not be completed because the database is not provisioned for sync or you not!

.4) get executed as client is already provisioned but here it execute 2nd if(line no. 8) instead of executing 2nd else(line no. 10) and throws an exception at line no. 9 serverprovision.Apply(); "There is already an object named 'schema_info' in the database.”!And if i try to synchronize then it throws an exception at line!

Corpus

Collection of Documents from StackOverflow

Relation Extraction!

KB Evaluation

People! Places!

Universities!

Everything!

Barack Obama!Michelle Obama!married!

Sasha! Malia!

father!parent!mother! Harvard!studied at!

Columbia!CMU!

Cities!

Honolulu!Chicago!

born in!

Page 11: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Pattern-based Relation Extraction

“places such as Hawaii” “places including cities, towns and villages”

Y such as X!X is a Y!

Y including X!

Languages such as Java!IDEs including Eclipse and NetBeans!

Integer and other data types!

Ontology

Hearst patterns Hypernym-hyponyms from StackOverflow

Concept(places)

Instance(Hawaii) Sub-Concept(cities)

Page 12: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Pattern-based Relation Extraction

Subject-Verb-Object patterns

“Barack Obama was born in Hawaii” “Barack Obama was born in a hospital”

Class implements interface!Constructor throws exception!

Output fills buffer!

SVOs from StackOverflow

Instance(Barack Obama) Instance(Hawaii)

Concept(hospital)

born in

Relations

Page 13: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Open Structure of Extracted Relations

places

Hawaii cities Barack Obama

Hospital born in

born in

•  No distinction between concepts and instances •  Parsing errors: “class not found exception”

•  Structure includes cycles: haskell

programming language magic Noise!

view

behavior ui element Ambiguity!

php script

Page 14: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

KB Evaluation

People! Places!

Universities!

Everything!

Barack Obama!Michelle Obama!married!

Sasha! Malia!

father!parent!mother! Harvard!studied at!

Columbia!CMU!

Cities!

Honolulu!Chicago!

born in!

Automatically Learning a KB with Structure and Facts

I successfully done with provision on both side client and server!and when I call this code for the first time 1st if(line no. 2) and 2nd if(line no. 8) gets executed and if I call this code 2nd time then 1st else(line no.4) get executed as client is already provisioned but here it execute 2nd if(line no. 8) instead of executing 2nd else(line no. 10) and throws an exception at line no. 9!

serverprovision.Apply(); "There is already an object named 'schema_info' in the database.”!And if i try to synchronize then it throws an exception at line syncOrchestrator.Synchronize(); "The current operation could not be completed because the database is not provisioned for sync or you not!

.4) get executed as client is already provisioned but here it execute 2nd if(line no. 8) instead of executing 2nd else(line no. 10) and throws an exception at line no. 9 serverprovision.Apply(); "There is already an object named 'schema_info' in the database.”!And if i try to synchronize then it throws an exception at line!

Corpus

Collection of Documents from StackOverflow

Relation Extraction

KB-LDA!

Ontology Relations

Documents

Page 15: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

KB-LDA Model

Sj

Oj

Vj

zSj

zOj

zVj

↵R

⇡R

NR

Relations�I

�tI

NTI

�R

�tR

NTR

↵O⇡O

zCi

zIi

Ci

Ii

NO

Ontology

↵D⇡D

zE1l

zE12

E1l

E2l

Nd,I

Nd,R

ND

Documents

Figure 1: Plate Diagram of KB-LDA.

p(⇡O,�,�, �, < C, I >,< zC , zI >,< S,O, V >,< zS , zO, zV >,⇡R|↵O,↵R, �C , �I , �R) =

TCY

tC=1

Dir(�tC |�C)⇥

TIY

tI=1

Dir(�tI |�I)⇥

TRY

tR=1

Dir(�tR |�R)⇥

Dir(⇡O|↵O)NOY

i=1

⇡<zCi

,zIi>

O �CizCi

�IizIi

Dir(⇡R|↵R)NRY

j=1

⇡<zSj

,zOj,zVj

>

R �SjzSj

�OjzOj

�VjzVj

p̂(zi =< zCi , zIi > | < Ci, Ii >, z¬i, < C, I >¬i,⇡O,↵O)

/✓nO¬i<zCi

,zIi>+↵O

◆⇥

(n¬izCi

,Ci+ �C)(n¬i

zIi ,Ii+ �I)

(P

C n¬izCi

,C > +|EC |�C)(P

I n¬izIi ,I

> +|EI |�I)

1

Page 16: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

KB-LDA Model Sj

Oj

Vj

zSj

zOj

zVj

↵R

⇡R

NR

Relations�I

�tI

NTI

�R

�tR

NTR

↵O⇡O

zCi

zIi

Ci

Ii

NO

Ontology

↵D⇡D

zE1l

zE12

E1l

E2l

Nd,I

Nd,R

ND

Documents

Figure 1: Plate Diagram of KB-LDA.

p(⇡O,�,�, �, < C, I >,< zC , zI >,< S,O, V >,< zS , zO, zV >,⇡R|↵O,↵R, �C , �I , �R) =

TCY

tC=1

Dir(�tC |�C)⇥

TIY

tI=1

Dir(�tI |�I)⇥

TRY

tR=1

Dir(�tR |�R)⇥

Dir(⇡O|↵O)NOY

i=1

⇡<zCi

,zIi>

O �CizCi

�IizIi

Dir(⇡R|↵R)NRY

j=1

⇡<zSj

,zOj,zVj

>

R �SjzSj

�OjzOj

�VjzVj

p̂(zi =< zCi , zIi > | < Ci, Ii >, z¬i, < C, I >¬i,⇡O,↵O)

/✓nO¬i<zCi

,zIi>+↵O

◆⇥

(n¬izCi

,Ci+ �C)(n¬i

zIi ,Ii+ �I)

(P

C n¬izCi

,C > +|EC |�C)(P

I n¬izIi ,I

> +|EI |�I)

1

person location

university

entity

Barack Obama Michelle Obama married

Sasha Malia

father parent

mother Harvard studied at

Columbia CMU

city

Honolulu Chicago

born in

Page 17: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

KB-LDA Model Sj

Oj

Vj

zSj

zOj

zVj

↵R

⇡R

NR

Relations�I

�tI

NTI

�R

�tR

NTR

↵O⇡O

zCi

zIi

Ci

Ii

NO

Ontology

↵D⇡D

zE1l

zE12

E1l

E2l

Nd,I

Nd,R

ND

Documents

Figure 1: Plate Diagram of KB-LDA.

p(⇡O,�,�, �, < C, I >,< zC , zI >,< S,O, V >,< zS , zO, zV >,⇡R|↵O,↵R, �C , �I , �R) =

TCY

tC=1

Dir(�tC |�C)⇥

TIY

tI=1

Dir(�tI |�I)⇥

TRY

tR=1

Dir(�tR |�R)⇥

Dir(⇡O|↵O)NOY

i=1

⇡<zCi

,zIi>

O �CizCi

�IizIi

Dir(⇡R|↵R)NRY

j=1

⇡<zSj

,zOj,zVj

>

R �SjzSj

�OjzOj

�VjzVj

p̂(zi =< zCi , zIi > | < Ci, Ii >, z¬i, < C, I >¬i,⇡O,↵O)

/✓nO¬i<zCi

,zIi>+↵O

◆⇥

(n¬izCi

,Ci+ �C)(n¬i

zIi ,Ii+ �I)

(P

C n¬izCi

,C > +|EC |�C)(P

I n¬izIi ,I

> +|EI |�I)

1

person location

university

entity

Barack Obama Michelle Obama married

Sasha Malia

father parent

mother Harvard studied at

Columbia CMU

city

Honolulu Chicago

born in

Noun Topics

Michelle Obama Barack Obama

Malia Sasha

Page 18: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

KB-LDA Model Sj

Oj

Vj

zSj

zOj

zVj

↵R

⇡R

NR

Relations�I

�tI

NTI

�R

�tR

NTR

↵O⇡O

zCi

zIi

Ci

Ii

NO

Ontology

↵D⇡D

zE1l

zE12

E1l

E2l

Nd,I

Nd,R

ND

Documents

Figure 1: Plate Diagram of KB-LDA.

p(⇡O,�,�, �, < C, I >,< zC , zI >,< S,O, V >,< zS , zO, zV >,⇡R|↵O,↵R, �C , �I , �R) =

TCY

tC=1

Dir(�tC |�C)⇥

TIY

tI=1

Dir(�tI |�I)⇥

TRY

tR=1

Dir(�tR |�R)⇥

Dir(⇡O|↵O)NOY

i=1

⇡<zCi

,zIi>

O �CizCi

�IizIi

Dir(⇡R|↵R)NRY

j=1

⇡<zSj

,zOj,zVj

>

R �SjzSj

�OjzOj

�VjzVj

p̂(zi =< zCi , zIi > | < Ci, Ii >, z¬i, < C, I >¬i,⇡O,↵O)

/✓nO¬i<zCi

,zIi>+↵O

◆⇥

(n¬izCi

,Ci+ �C)(n¬i

zIi ,Ii+ �I)

(P

C n¬izCi

,C > +|EC |�C)(P

I n¬izIi ,I

> +|EI |�I)

1

person location

university

entity

Barack Obama Michelle Obama married

Sasha Malia

father parent

mother Harvard studied at

Columbia CMU

city

Honolulu Chicago

born in

Verb Topics

married to is husband of

is wife of …

Page 19: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

KB-LDA Model Sj

Oj

Vj

zSj

zOj

zVj

↵R

⇡R

NR

Relations�I

�tI

NTI

�R

�tR

NTR

↵O⇡O

zCi

zIi

Ci

Ii

NO

Ontology

↵D⇡D

zE1l

zE12

E1l

E2l

Nd,I

Nd,R

ND

Documents

Figure 1: Plate Diagram of KB-LDA.

p(⇡O,�,�, �, < C, I >,< zC , zI >,< S,O, V >,< zS , zO, zV >,⇡R|↵O,↵R, �C , �I , �R) =

TCY

tC=1

Dir(�tC |�C)⇥

TIY

tI=1

Dir(�tI |�I)⇥

TRY

tR=1

Dir(�tR |�R)⇥

Dir(⇡O|↵O)NOY

i=1

⇡<zCi

,zIi>

O �CizCi

�IizIi

Dir(⇡R|↵R)NRY

j=1

⇡<zSj

,zOj,zVj

>

R �SjzSj

�OjzOj

�VjzVj

p̂(zi =< zCi , zIi > | < Ci, Ii >, z¬i, < C, I >¬i,⇡O,↵O)

/✓nO¬i<zCi

,zIi>+↵O

◆⇥

(n¬izCi

,Ci+ �C)(n¬i

zIi ,Ii+ �I)

(P

C n¬izCi

,C > +|EC |�C)(P

I n¬izIi ,I

> +|EI |�I)

1

person location

university

entity

Barack Obama Michelle Obama married

Sasha Malia

father parent

mother Harvard studied at

Columbia CMU

city

Honolulu Chicago

born in

Ontology!

πo

Page 20: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

KB-LDA Model Sj

Oj

Vj

zSj

zOj

zVj

↵R

⇡R

NR

Relations�I

�tI

NTI

�R

�tR

NTR

↵O⇡O

zCi

zIi

Ci

Ii

NO

Ontology

↵D⇡D

zE1l

zE12

E1l

E2l

Nd,I

Nd,R

ND

Documents

Figure 1: Plate Diagram of KB-LDA.

p(⇡O,�,�, �, < C, I >,< zC , zI >,< S,O, V >,< zS , zO, zV >,⇡R|↵O,↵R, �C , �I , �R) =

TCY

tC=1

Dir(�tC |�C)⇥

TIY

tI=1

Dir(�tI |�I)⇥

TRY

tR=1

Dir(�tR |�R)⇥

Dir(⇡O|↵O)NOY

i=1

⇡<zCi

,zIi>

O �CizCi

�IizIi

Dir(⇡R|↵R)NRY

j=1

⇡<zSj

,zOj,zVj

>

R �SjzSj

�OjzOj

�VjzVj

p̂(zi =< zCi , zIi > | < Ci, Ii >, z¬i, < C, I >¬i,⇡O,↵O)

/✓nO¬i<zCi

,zIi>+↵O

◆⇥

(n¬izCi

,Ci+ �C)(n¬i

zIi ,Ii+ �I)

(P

C n¬izCi

,C > +|EC |�C)(P

I n¬izIi ,I

> +|EI |�I)

1

person location

university

entity

Barack Obama Michelle Obama married

Sasha Malia

father parent

mother Harvard studied at

Columbia CMU

city

Honolulu Chicago

born in

Ontology!

Extracted hypernym-hyponym relations:

websites ! google platforms ! stackoverflow

πo

websites platforms

applications

stackoverflow google

facobook

Noun Topic 1

Noun Topic 2

Page 21: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

KB-LDA Model Sj

Oj

Vj

zSj

zOj

zVj

↵R

⇡R

NR

Relations�I

�tI

NTI

�R

�tR

NTR

↵O⇡O

zCi

zIi

Ci

Ii

NO

Ontology

↵D⇡D

zE1l

zE12

E1l

E2l

Nd,I

Nd,R

ND

Documents

Figure 1: Plate Diagram of KB-LDA.

p(⇡O,�,�, �, < C, I >,< zC , zI >,< S,O, V >,< zS , zO, zV >,⇡R|↵O,↵R, �C , �I , �R) =

TCY

tC=1

Dir(�tC |�C)⇥

TIY

tI=1

Dir(�tI |�I)⇥

TRY

tR=1

Dir(�tR |�R)⇥

Dir(⇡O|↵O)NOY

i=1

⇡<zCi

,zIi>

O �CizCi

�IizIi

Dir(⇡R|↵R)NRY

j=1

⇡<zSj

,zOj,zVj

>

R �SjzSj

�OjzOj

�VjzVj

p̂(zi =< zCi , zIi > | < Ci, Ii >, z¬i, < C, I >¬i,⇡O,↵O)

/✓nO¬i<zCi

,zIi>+↵O

◆⇥

(n¬izCi

,Ci+ �C)(n¬i

zIi ,Ii+ �I)

(P

C n¬izCi

,C > +|EC |�C)(P

I n¬izIi ,I

> +|EI |�I)

1

person location

university

entity

Barack Obama Michelle Obama married

Sasha Malia

father parent

mother Harvard studied at

Columbia CMU

city

Honolulu Chicago

born in

Relations!

πR

Extracted SVO relation: Michelle, wife of, Barack

S V O

Page 22: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

KB-LDA Model Sj

Oj

Vj

zSj

zOj

zVj

↵R

⇡R

NR

Relations�I

�tI

NTI

�R

�tR

NTR

↵O⇡O

zCi

zIi

Ci

Ii

NO

Ontology

↵D⇡D

zE1l

zE12

E1l

E2l

Nd,I

Nd,R

ND

Documents

Figure 1: Plate Diagram of KB-LDA.

p(⇡O,�,�, �, < C, I >,< zC , zI >,< S,O, V >,< zS , zO, zV >,⇡R|↵O,↵R, �C , �I , �R) =

TCY

tC=1

Dir(�tC |�C)⇥

TIY

tI=1

Dir(�tI |�I)⇥

TRY

tR=1

Dir(�tR |�R)⇥

Dir(⇡O|↵O)NOY

i=1

⇡<zCi

,zIi>

O �CizCi

�IizIi

Dir(⇡R|↵R)NRY

j=1

⇡<zSj

,zOj,zVj

>

R �SjzSj

�OjzOj

�VjzVj

p̂(zi =< zCi , zIi > | < Ci, Ii >, z¬i, < C, I >¬i,⇡O,↵O)

/✓nO¬i<zCi

,zIi>+↵O

◆⇥

(n¬izCi

,Ci+ �C)(n¬i

zIi ,Ii+ �I)

(P

C n¬izCi

,C > +|EC |�C)(P

I n¬izIi ,I

> +|EI |�I)

1

person location

university

entity

Barack Obama Michelle Obama married

Sasha Malia

father parent

mother Harvard studied at

Columbia CMU

city

Honolulu Chicago

born in

Documents!Downloads on websites sometimes have an MD5 checksum, allowing people to confirm the integrity of the file. I have heard this is to allow not only corrupted files to be instantly identified before they cause a problem but also for for any malicious changes to be easily detected.

Page 23: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

KB-LDA Model Sj

Oj

Vj

zSj

zOj

zVj

↵R

⇡R

NR

Relations�I

�tI

NTI

�R

�tR

NTR

↵O⇡O

zCi

zIi

Ci

Ii

NO

Ontology

↵D⇡D

zE1l

zE12

E1l

E2l

Nd,I

Nd,R

ND

Documents

Figure 1: Plate Diagram of KB-LDA.

p(⇡O,�,�, �, < C, I >,< zC , zI >,< S,O, V >,< zS , zO, zV >,⇡R|↵O,↵R, �C , �I , �R) =

TCY

tC=1

Dir(�tC |�C)⇥

TIY

tI=1

Dir(�tI |�I)⇥

TRY

tR=1

Dir(�tR |�R)⇥

Dir(⇡O|↵O)NOY

i=1

⇡<zCi

,zIi>

O �CizCi

�IizIi

Dir(⇡R|↵R)NRY

j=1

⇡<zSj

,zOj,zVj

>

R �SjzSj

�OjzOj

�VjzVj

p̂(zi =< zCi , zIi > | < Ci, Ii >, z¬i, < C, I >¬i,⇡O,↵O)

/✓nO¬i<zCi

,zIi>+↵O

◆⇥

(n¬izCi

,Ci+ �C)(n¬i

zIi ,Ii+ �I)

(P

C n¬izCi

,C > +|EC |�C)(P

I n¬izIi ,I

> +|EI |�I)

1

person location

university

entity

Barack Obama Michelle Obama married

Sasha Malia

father parent

mother Harvard studied at

Columbia CMU

city

Honolulu Chicago

born in

Documents!Downloads on websites sometimes have an MD5 checksum, allowing people to confirm the integrity of the file. I have heard this is to allow not only corrupted files to be instantly identified before they cause a problem but also for for any malicious changes to be easily detected.

More in paper: Data-driven topic naming

Page 24: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Training

Corpus!

5.5M documents

from StackOverflow

Topics!

50

Page 25: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Automatically Learning a KB with Structure and Facts

I successfully done with provision on both side client and server!and when I call this code for the first time 1st if(line no. 2) and 2nd if(line no. 8) gets executed and if I call this code 2nd time then 1st else(line no.4) get executed as client is already provisioned but here it execute 2nd if(line no. 8) instead of executing 2nd else(line no. 10) and throws an exception at line no. 9!

serverprovision.Apply(); "There is already an object named 'schema_info' in the database.”!And if i try to synchronize then it throws an exception at line syncOrchestrator.Synchronize(); "The current operation could not be completed because the database is not provisioned for sync or you not!

.4) get executed as client is already provisioned but here it execute 2nd if(line no. 8) instead of executing 2nd else(line no. 10) and throws an exception at line no. 9 serverprovision.Apply(); "There is already an object named 'schema_info' in the database.”!And if i try to synchronize then it throws an exception at line!

Corpus

Collection of Documents from StackOverflow

Relation Extraction

KB-LDA

Ontology Relations

Documents

KB Evaluation!

People! Places!

Universities!

Everything!

Barack Obama!Michelle Obama!married!

Sasha! Malia!

father!parent!mother! Harvard!studied at!

Columbia!CMU!

Cities!

Honolulu!Chicago!

born in!

Page 26: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Top Tokens of Learned Noun Topics

database!table query

database sql

column data

tables mysql index

columns

user information!name

images id

number text

password address strings

files string

web browsers!ie

firefox chrome safari

explorer ie7 ie6 ie8 ie9

internet explorer

programming language!java

python javascript

lists ruby c + perl

haskell jquery scala

Page 27: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

M-Turk Evaluation of Noun Topics

java python

javascript firefox ruby perl

“Which words are not related to programming

languages?”

1. Word Intrusion 0.78

2. Group Precision 0.93

0.70

Page 28: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Relations Users!

!

user people

customer client player

interact with

clicks selects submits

hits moves

UI element!!

button form link item file

KB-LDA! Random!Avg(Experts) 0.7 0.13 1 Worker 0.9 0.69 2 Workers 0.63 0.22 3 Workers 0.15 0.05

“Is this a reasonable Software relationship?”!

Page 29: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

data information

types resources

objects

name images

id password address

memory pointer

time bytes buffer

object class array

element variable model

sites tools

applications

stackoverflow google

facebook firebug eclipse tomcat

table query

database sql

data

column row

index record value

integer bit

number value

list item tree

node

Page 30: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

data information

types resources

objects

name images

id password address

memory pointer

time bytes buffer

object class array

element variable model

sites tools

applications

stackoverflow google

facebook firebug eclipse tomcat

table query

database sql

data

column row

index record value

integer bit

number value

list item tree

node

Page 31: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

KB-LDA Topics versus Human-Provided Tags

serverprovision.Apply(); "There is already an object named 'schema_info' in the database.”!And if i try to synchronize then it throws an exception at line syncOrchestrator.Synchronize(); "The current operation could not be completed because the database is not provisioned for sync or you not!

python mysql

Tags

For each document: 1.  Find most probable topic (e.g., databases) 2.  Aggregate tags for topic

What tags are frequently associated with topics?

databases p=0.7 python mysql

Page 32: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Top Tags Associated with Topics

regex string!python

php ruby

string character characters

text line

KB-LDA Topics!

css !html

jquery html5

javascript

element div css!

elements http

sql !mysql

database performance

php

table query

database !sql !

column

databases!

regular expressions!

html elements!

Human Tags!

Ontology Relations

Documents

Page 33: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Domain-Specific Extraction from Open IE

ReVerb 15m triples

Triples with software entities

5k triples

Subject-verb-object are in KB-LDA Dictionary

SVOs extracted directly from StackOverflow

37k triples

�  (safari, supports, svg) �  (computer, is running, xp)

�  (view, looks, south) �  (people, can read, italian)

Page 34: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

Domain-Specific Extraction from Open IE

ReVerb 15m triples

Triples with software entities

5k triples

Subject-verb-object are in KB-LDA Dictionary

SVOs extracted directly from StackOverflow

37k triples

Domain-specific triples?

p(SVO|KB-LDA) p(SVO|ReVerb)

Page 35: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

0.0 0.2 0.4 0.6 0.8 1.0Recall

0.0

0.2

0.4

0.6

0.8

1.0P

reci

sion

KB-LDA versus ReVerb Ranking

KB-LDA, Best F1=0.73, AUC=0.67ReVerb, Best F1=0.72, AUC=0.57

Page 36: KB-LDA: Jointly Learning a Knowledge Base of Hierarchy ...dmovshov/docs/dma_acl_2015_kblda_pres… · KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts

"  Corpus-driven KB construction: Jointly optimizes schema and facts

"  Unsupervised: Useful for exploration of new domains

"  KB-LDA can be used to extract domain-specific facts from an Open IE knowledge base

KB-LDA: Conclusion

www.cs.cmu.edu/~dmovshov Thank you!