64
Enterprise Knowledge Graphs for Large Scale Analytics Nidhi Rajshree, IBM Watson, USA, Nitish Aggarwal, IBM Watson, USA, Sumit Bhatia, IBM Research, India Anshu Jain, IBM Research, Almaden, USA

Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

  • Upload
    others

  • View
    8

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

EnterpriseKnowledgeGraphsforLargeScaleAnalytics

NidhiRajshree,IBMWatson,USA,Nitish Aggarwal,IBMWatson,USA,Sumit Bhatia, IBMResearch,IndiaAnshu Jain,IBMResearch,Almaden,USA

Page 2: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

The material presented in this tutorial represents the personal opinion of the presenters and not of IBM and affiliated organization.

Page 3: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Outlineofthetutorial

Part 1: Knowledge Graph Construction• Introduction• DBpedia: Knowledge extraction• Approaches to extend knowledge graph• Knowledge extraction from scratch

Part 2: Knowledge Graph Analytics• Finding entities of interest• Entity exploration• Upcoming challenges

Page 4: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

WhatisKnowledgeGraph

“The KnowledgeGraph isa knowledgebase usedby Google toenhanceits searchengine'ssearchresultswith semantic-searchinformationgatheredfromawidevarietyofsources.”

Page 5: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

WhatisKnowledgeGraph

“The KnowledgeGraph isa knowledgebase usedby Google toenhanceits searchengine'ssearchresultswith semantic-searchinformationgatheredfromawidevarietyofsources.”

“AKnowledgegraph(i)mainlydescribesrealworldentitiesandinterrelations,organizedinagraph(ii)definespossibleclassesandrelationsofentitiesinaschema”(iii)allowspotentiallyinterrelatingarbitraryentitieswitheachother… [Paulheim H.]

“WedefinesaKnowledgeGraphasanRDFgraphconsistsofasetofRDFtripleswhereeachRDFtriple(s,p,o)isanorderedsetoffollowingRDFterm….”[Pujara J.alal.]

Page 6: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

WhatisKnowledgeGraph

Nosingleformaldefinition…

• Definesrealworldentities

• Providesrelationshipsbetweenthem

Page 7: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

WhatisKnowledgeGraph

Nosingleformaldefinition…

• Definesrealworldentities

• Providesrelationshipsbetweenthem

• Containsrulesdefinesthroughontologies

• Enablereasoningtoinfernewknowledge

Page 8: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

WhyKnowledgeGraph

Building an intelligent system that can interact with human, requires knowledge about real world entities.

Page 9: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

WhyKnowledgeGraph

Building an intelligent system that can interact with human, requires knowledge about real world entities.

• Enhance search results.

• Enhance ad sense.

• Help in language understanding.

• Enables knowledge discovery.

Page 10: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Isthereexistingknowledgegraphreadytouseformyapplication?

Page 11: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

GoogleKnowledgeGraphFacebook

EntityGraph

MicrosoftSatori

LinkedInKnowledgeGraph

AmazonProductGraph

Page 12: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered
Page 13: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

DBpedia:Knowledgeextraction

Page 14: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

DBpedia:Knowledgeextraction

TheCityofNewYork,oftencalledNewYorkCity orsimplyNewYork,isthemostpopulouscityintheUnitedStates.

Page 15: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

DBpedia:Knowledgeextraction

TheCityofNewYork,oftencalledNewYorkCity orsimplyNewYork,isthemostpopulouscityintheUnitedStates.

<NewYorkCity>,<CityIn><UnitedStates>.

<CityName>,<locatedIn><CountryName>.

Page 16: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

DBpedia:Knowledgeextraction

Page 17: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

DBpedia:Knowledgeextraction

TheCityofNewYork,oftencalledNewYorkCity orsimplyNewYork,isthemostpopulouscityintheUnitedStates.

Page 18: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

DBpedia:Knowledgeextraction

Page 19: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

DBpedia:Knowledgeextraction

<headentity>,<rel>< tailentity>

Page 20: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

DBpedia:Knowledgeextraction

<headentity>,<rel>< tailentity>

WikipediaInfobox

Page 21: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered
Page 22: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

DBpedia:Knowledgeextraction

Page 23: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

DBpedia:Knowledgeextraction

Page 24: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

DBpedia:Knowledgeextraction

Page 25: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

DBpedia:Knowledgeextraction

Parsers

Ontology(Classes,properties)

dbr:IBM dbp:foundedBydbr:Charles_Ranlett_Flint

dbr:IBM dbp:foundedBydbr:Charles_Ranlett_Flint

dbr:IBM dbp:foundedBydbr:Charles_Ranlett_Flint

……………

Page 26: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

(Research)problemsinknowledgegraphs

• Incomplete knowledge– Missing entities– Missing relations– Limited entity and relation types

Page 27: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

(Research)problemsinknowledgegraphs

• Incomplete knowledge– Missing entities– Missing relations– Limited entity and relation types

• Incorrect knowledge– Wrong entity label recognition– Wrong entity and relation type– Wrong facts

Page 28: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

(Research)problemsinknowledgegraphs

• Incomplete knowledge– Missing entities– Missing relations– Limited entity and relation types

• Incorrect knowledge– Wrong entity label recognition– Wrong entity and relation type– Wrong facts

• Inconsistency in knowledge– Different labels for same entity– Merging entities with same labels

Page 29: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Approachestoextendknowledgegraphs

• Extracting knowledge from Wikipedia tables– Large amount of raw data in form of tables– Tables have some implicit structure/patterns

Page 30: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Approachestoextendknowledgegraphs

• Extracting knowledge from Wikipedia tables– Large amount of raw data in form of tables– Tables have some implicit structure/patterns

Wiki:AFC_Ajax containingrelationsbetweenplayers,theirshirtnumber,andcountry

Page 31: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Approachestoextendknowledgegraphs

• <Wiki:AFC_Ajax,dbp:rel,Wiki:Andre_Onana>• 80%entitiesinthetablehaverelationdbp:rel withtheWikipediatitleentity

Wiki_AFC_Ajax• Other20%entitiesarelikelytohavethesamerelationshipdbp:rel withWiki_AFC_Ajax

[MunozE.atal.]UsingLinkedDatatoMineRDFfromWikipedia'sTables,WSDM2014

Page 32: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Approachestoextendknowledgegraphs

[MunozE.atal.]UsingLinkedDatatoMineRDFfromWikipedia'sTables,WSDM2014

• Features– Articlefeatures:no.oftables,length– Tablefeatures:no.ofrows,no.ofcolumns– Columnfeatures:no.ofentitiesincolumn,potentialrelations– Cellfeatures:no.ofentitiesinacell,lengthofcell– Manyothers

• Combinesusingclassificationmethod

Prec. Rec. F1

Rule-based 64.23 70.46 67.20

SVM 72.43 75.77 74.06

Logistic 79.62 79.01 79.31

Page 33: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Approachestoextendknowledgegraphs

[MunozE.atal.]UsingLinkedDatatoMineRDFfromWikipedia'sTables,WSDM2014

• Features– Articlefeatures:no.oftables,length– Tablefeatures:no.ofrows,no.ofcolumns– Columnfeatures:no.ofentitiesincolumn,potentialrelations– Cellfeatures:no.ofentitiesinacell,lengthofcell– Manyothers

• Combinesusingclassificationmethod

Prec. Rec. F1

Rule-based 64.23 70.46 67.20

SVM 72.43 75.77 74.06

Logistic 79.62 79.01 79.31

• Rules/heuristicsbasedmethodsmakesmistakes,andhardtocreateoneruleforeveryone.

• Eventhoughcombiningdifferentfeaturesachieves80%accuracy,itintroduces20%noise.

Page 34: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Tabledataislimited,weneedtogobeyond

Page 35: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Approachestoextendknowledgegraphs

• Missingentity/literalforarelation– “ChristopherA.WeltyisanAmericancomputerscientist,whoworksat

GoogleResearchinNY”• <dbr:Chris_Welty><employedBy><?>

– "TomCruiseandBradPittappearinInterviewwiththeVampire"• <dbr:Brad_Pitt><?><dbr:Tom_Cruise>

Page 36: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Approachestoextendknowledgegraphs

• Missingentity/literalforarelation– “ChristopherA.WeltyisanAmericancomputerscientist,whoworksat

GoogleResearchinNY”• <dbr:Chris_Welty><employedBy><?>

– "TomCruiseandBradPittappearinInterviewwiththeVampire"• <dbr:Brad_Pitt><?><dbr:Tom_Cruise>

• KnowledgeBaseCompletion– Similartolinkpredictioninsocialnetworkbutabitmorechallenging– Needtoidentifyrelationtypeinadditiontobinaryoutput.

Page 37: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Approachestoextendknowledgegraphs

• KnowledgeBaseCompletion– TransE:learntheentityandrelationembeddings byassumingthattranslation

ofentityembeddings correspondtotheirrelationembeddings.[Bordes etat.2013]

– S+R≈T,where<S,R,T>

– TransH:Learndifferententityembeddingfordifferentrelationships[Wangatel.2014]

– TransR:Learnentityandrelationembeddings indifferentspace,followingbytranslationperforminrelationspace.[LinY.atel.2015]

– Manymoremethods [NickelM.atal,2015]

Page 38: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Knowledgebasecompletionapproachesfocusonfindingmissingentities/relations

Page 39: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Needtoaddnewentitiesfromexternalsources

Page 40: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Needtoaddnewentitiesfromexternalsources

• Entityrecognitioninexternaltextresource• ManyNamedEntityRecognitionsystems

• LinkextractedentitytoKGorcreateanewnodeifitdoesnothaveacorrespondingentity

• TAC-KBP(EntityDiscoveryandLinkingtask)[JiH.atel.2016]

Page 41: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

BuildingknowledgegraphsuchasDBpedia requireslotofmanualefforts

Page 42: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

BuildingknowledgegraphsuchasDBpedia requireslotofmanualefforts

• Manyapplicationsrequiredomain/dataspecificcustomknowledgegraphs.

• CreatingschemawithclassstructureandconstraintsforeachKGisdifficult.

Page 43: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Howtocreateaknowledgegraphfromunstructuredtext?

Page 44: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

JonathonWatsonworksatIBM.Hehasmorethan50patents,andwonbestinventorawardforhisinvention“NeuralChipbyJonWatsonetal.

Page 45: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

JonathonWatsonworksatIBM.Hehasmorethan50patents,andwonbestinventorawardforhisinvention“NeuralChipbyJonWatsonetal.

Entityextraction

Relationextraction

Noisereduction KG

JonathonWatsonIBMJonWatson

employedBy(JonathonWatson,IBM)JonWatson

JonathonWatson,JonWatson

Page 46: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Relationextraction

• Supervised methods

Predefined schema (employedBy, bornOn, BirthPlace …)

Training data

JonathonWatsonworksatIBM.

JonathonWatsonjoinedIBM.

employedBy

employedBy

Page 47: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Relationextraction

• Supervised methods

Predefined schema (employedBy, bornOn, BirthPlace …)

Training data Test data

JonathonWatsonworksatIBM.

JonathonWatsonjoinedIBM.

employedBy

employedBy JonathonWatsonismanageratIBM.

?

Page 48: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Relationextraction

• Supervised methods

Predefined schema (employedBy, bornOn, BirthPlace …)

Training data Test data

JonathonWatsonworksatIBM.

JonathonWatsonjoinedIBM.

employedBy

employedBy JonathonWatsonismanageratIBM.

employedBy

Page 49: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Relationextraction

• Supervised methods

Pros: High accuracy and less noise

Cons: Hard and expensive to build labeled data

Page 50: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Relationextraction

• Supervised methods• Distantly supervised methods

employedBy (JonWatson,IBM)

affiliated(MichaelDecker,,SMU)

JonWatsonworksatIBM.

JonWatsonbecomesVPatIBM.……….

MichaelDeckerjoinsDataSciencegroupatSMU.

MichaelDeckerwonanationalfundingawardat

SMU.……….

Page 51: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Relationextraction

• Supervised methods• Distantly supervised methods

employedBy (JonWatson,IBM)

affiliated(MichaelDecker,,SMU)

JonWatsonworksatIBM.

JonWatsonbecomesVPatIBM.……….

MichaelDeckerjoinsDataSciencegroupatSMU.

MichaelDeckerwonanationalfundingawardat

SMU.……….

Trainingsentences

Page 52: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Relationextraction

• Supervised methods• Distantly supervised methods

Pros: Overcome the effort of labeling data

Cons: Dependency of existing knowledge graph and corresponding . text

Page 53: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Relationextraction

• Supervised methods• Distantly supervised methods• Unsupervised methods (OpenIE, Universal Schema)

JonathonWatsonworksatIBM.

JonathonWatsonjoinedIBM.

join

(ROOT(S(NP(JonWatson))(VP(VBZworks)(PP(INat)(NPIBM)))

(ROOT(S(NP(JonWatson))(VP(VBDjoined)(NPIBM))

work

Page 54: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Relationextraction

• Supervised methods• Distantly supervised methods• Unsupervised methods (OpenIE, Universal Schema)

Pros: eliminates the effort of labeling data

Cons: Noisy, large number of relations

Page 55: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Relationextraction

• Supervised methods• Distantly supervised methods• Unsupervised methods (OpenIE, Universal Schema)

Relation1 Relation2 Relation3

WorksemployerCompany

employedBy….

livesIncurrentCityCountry

….

VicePresidentexecutive

Boardmember

….

Page 56: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Relationextraction(UniversalSchema)

• Clustering using vector similarity• Matrix completion and fill the empty values [YaoL.atel.,

2012]

employeBy affiliated Leaderof

Jon x x

Michael x

Steve x x

Joyce x x x

Page 57: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Entitytypesidentification(UniversalSchema)

• Clustering using vector similarity• Matrix completion and fill the empty values [YaoL.atel.,

2012]

director musician actor

Jon x x

Michael x x

Steve x

Joyce x x

Page 58: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Relationextractionindomain

• Supervised methods – Need domain experts to label the data• Distantly supervised methods – Hard to find corresponding

text• Unsupervised methods (OpenIE, Universal Schema) – Noisy

A 59-year-old African American man with a past medical history of hypertension, benign prostatichypertrophy, type II diabetes mellitus for the past 15 years, and chronic back pain presents to the hospitalwith gross hematuria. The patient states that he noticed blood in his urine last night. The patient also reportsmild, intermittent flank pain. The patient states that his diabetes and blood pressure are well controlled withmedications, and that he has managed his chronic back pain with 2 aspirin per day for the past 4 years. Vitalsigns are Temp- 98.6°F, BP- 124/82 mm/Hg, pulse- 88/min, and RR- 14/min. Blood work is notable for HbA1Cof 6.5%. A pyelogram reveals a ring sign. His current fasting glucose is 140mmol/L.<br /><br />What is themost likely etiology of hematuria in this patient?

Symptom

Page 59: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Knowledgegraphsindomain

• Domain specific entity extraction is more challenging

• Limited relation types

• Less explicit mention of entity and relation types in text

• Creating simple schema requires domain experts

Page 60: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Knowledgegraph- Simple

JonathonWatsonworksatIBM.

MichaelDeckerjoinedIBM.

MichaelDeckerattendsSMU

JonathonWatson

IBM

MichaelDecker

SMU

Page 61: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Knowledgegraph- Simple+Schema

JonathonWatsonworksatIBM.

MichaelDeckerjoinsIBM.

MichaelDeckerattendedSMU

JonathonWatson

IBM

MichaelDecker

SMU

affiliated

affiliatedaffiliated

Page 62: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Knowledgegraph- Simple+Schema+Ontology

JonathonWatsonworksatIBM.

MichaelDeckerjoinedIBM.

MichaelDeckerattendsSMU

JonathonWatson

IBM

MichaelDecker

SMU

affiliated

affiliatedaffiliated

Domain,range,constraint

Page 63: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

Summary

• Simple knowledge graph works for many applications

• Identify the requirement before finding the solution.

• Many knowledge graphs are publically available

Page 64: Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered

https://www.youtube.com/watch?v=kao05ArIiok&feature=youtu.be