Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Aprendizado de Máquina e Inferência em Grafos
de ConhecimentoDaniel N. R. da Silva
Artur ZivianiFabio Porto
dramos,ziviani,[email protected]ório Nacional de Computação Científica (LNCC)
Simpósio Brasileiro de Bancos de Dados (SBBD), Outubro de 2019
Machine Learning and Inference in Knowledge Graphs
Big Data ManagementComputational Reproducibility
Knowledge BasesMachine LearningNetwork Science
Scientific Workflowshttp://dexl.lncc.br/
Outline
• Introduction and Motivation• Applications• Data Models and Systems• Relational Machine Learning• Knowledge Graph Embeddings• Events, Softwares, and Datasets• Open Research
3
Introduction and Motivation
The third (current) rise of AI
Medical Image Analysis
Online Advertisement
Web Search
Drug Discovery and
Toxicology
Custom Relationship Management
Medical Informatics
Board Games Image Captioning
Natural Language
Processing
Self-Driving Vehicles
Recommendation Systems Robotics
Deng, Li. Artificial Intelligence in the Rising Wave of Deep Learning: The Historical Path and Future Outlook [Perspectives]. IEEE Signal Processing Magazine. 2018.
5
Knowledge Representation and Reasoning (KRR)How to symbolically encode human knowledge and reasoning in such a way that this encoded knowledge can be processed by a computer via encoded reasoning to obtain intelligent behavior.
6Chein, M. and Mugnier, M-L. Graph-based Knowledge Representation: Computational Foundations of Conceptual Graphs. Springer. 2008.
SurrogateSet of
Ontological Commitments
Fragmentary Theory of IntelligentReasoning
Medium for Efficient
Computation
Medium of Human
Expression
7
Ontology
1980s
1960s
SemanticNetworks
1998
The SemanticWeb
Porphyry'sCommentaries
300s AD
Linked OpenData
2006 KnowledgeGraph
2012
8https://www.gartner.com/smarterwithgartner/5-trends-appear-on-the-gartner-hype-cycle-for-emerging-technologies-2019/
9
Graph DBMS
https://db-engines.com/en/ranking_categories
Linked Open Data
10https://lod-cloud.net/
Knowledge Graph
• Network representation for knowledge;• Heterogeneous data integration;• Constrained by an ontology or data schema;• AI applications.
11
Knowledge Graph• Terminological Component
• C: Concepts• RC:Concept Relations• TC: Relationships between concepts• TC->A: Attributive relationships• A: Attributes• V: Attribute Values
• Assertional Component:• E: Entities• RE: Entity Relations• TE: Relationships between entities• TE->C: Instantiation relationships• TE->V: Attributive relationships
12
13
Related Terms
• Knowledge base: Set of sentences (assertions) expressed in a language called a knowledge representation language (semantics and syntax);
• (Graph) Database: Storage and data organization;
• Ontology: Formal explicit description of concepts in a domain of discourse. It provides sharable and reusable knowledge.
• Knowledge Based System:Keeps a knowledge base and performs reasoning.
14Noy, N. and McGuinness, D. Ontology development 101: A guide to creating your first ontology. 2001.Ehrlinger, L. and Wöß, W. Towards a definition of knowledge graphs. SEMANTiCS. 2016.
Related Fields
15
Artificial Intelligence
Data Management
Machine Learning
Natural Language Processing
Data Integration
Knowledge Representation and Reasoning
Data Modeling Information Retrieval
16
NELL
General-purpose KGs Bio & Medical KGs
Common-sense KGs & NLP
Product Graph & E-commerce KGs
Big Techs
• Apple Siri Knowlege Graph• Amazon Product Graph• Facebook Graph• Google Knowledge Graph• IBM Watson• Microsoft Satori
17
Knowledge Graphs
18
# Entities # Triples # Concepts # RelationsDBpedia 4 298 433 411 885 960 736 2 819Freebase 49 947 799 3 124 791 156 53 092 70 902OpenCyc 41 029 2 412 520 116 822 18 028Wikidata 18 697 897 748 530 833 302 280 1 874YAGO 5 130 031 1 001 461 792 569 751 106
Färber, M. and Rettinger, A. Which Knowledge Graph Is Best for Me? ArXiv. 2019.
Knowledge Graphs
19
0
0.2
0.4
0.6
0.8
1Accuracy
Trustworthiness
Consistency
Relevancy
Completeness
TimelinessEase ofunderstanding
Interoperability
Accessibility
Licensing
Interlinking
DBpedia Freebase OpenCyc Wikidata YAGO
Färber, M. and Rettinger, A. Which Knowledge Graph Is Best for Me? ArXiv. 2019.Färber, M. et al. Linked data quality of dbpedia, freebase, opencyc, wikidata, and yago. Semantic Web. 2018.
DBpedia
• Crowd-sourced community effort to extract structured content from Wikmedia projects;
• Served as Linked Data;• Ontology, including persons, places, creative
works, organizations, species, and diseases;• 125 languages;• Links to YAGO and Wikipedia.
Lehmann, J et al. DBpedia – A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia.SWJ. 2012.https://dbpedia.org/
Structure in Wikipedia• Title• Abstract• Infoboxes• Geo-coodinates• Categories• Images• Links:
• other language versions• other Wikipedia pages• to the web;• redirections;• disambiguations.
21
YAGO
• YAGO (Yet Another Great Ontology).• Derived from Wikipedia, WordNet, and
Geonames.• 95% accuracy on sample facts.• Temporal Dimension:
• Before, after, during, and overlaps• Location Dimension:
• northOf, eastOf, southOf, westOf, nearby, and locatedIn.
• Multilingual facts.
Rebele, T. et al. YAGO - A Multilingual Knowledge Base from Wikipedia, Wordnet, and Geonames. ISWC. 2016.https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/
23https://gate.d5.mpi-inf.mpg.de/webyago3spotlxComp/SvgBrowser/
24
StructuredData
Ex.: Relational Databases.
Semi-structuredData
Ex.: XML and JSON documents.
Non-StructuredData
Ex.: Text, audio, image, and video files.
Knowledge BaseConstruction
Ex.: Entity and relation extraction.
Knowledge Graph
Refinement
Applications
Completion and Correction
Completeness
Freshness
Correctness
Example
(Marie Curie, date-of-birth, 07/11/1867)(Marie Curie, born-in, Poland)(Marie Curie, nationality, French)(Marie Curie, profession, Physicist)(Marie Curie, profession, Chemist)(Marie Curie, research-field, Radioactivity) 25
Marie Curie was born on November 7, 1867, in Warsaw, Poland. She was a french naturalized, physicist and chemist who conducted pioneering research in the fields of radioactivity.
Text fragment
Extracted triples
26
MarieCurie
Polish
Physicist
Radioactivity
Chemist
“07/11/1867”
French
born-in
profession
research-field
nationality
date-of-birth
profession
Challenges
27
Coverage/CompletenessHave we got the information we need?
CorrectnessIs information accurate?
FreshnessIs information up to date?
Gao, Yuqing. Building a Large-scale, Accurate and Fresh Knowledge Graph. KDD Tutorial 2018.
Applications
Applications
• Conversational Agents;• Data integration;• Fact checking and fake news detection;• Question Answering;• Recommendation Systems;• Search Engines.
29
Search Engines
30
Recommendation Systems
31
User
Cast Away
BackTo TheFuture
SavingPrivateRyan
BladeRunner
TomHanks
RobertZemeckis
HarrisonFord
StevenSpielberg
Drama
SciFi
Action
ForrestGump
JurassicPark
Movies theuser has watched.
Recommendedmovies.
starred-indirector-ofgenre
Wang, H et al. Exploring High-Order User Preference on the Knowledge Graph for Recommender Systems. ACM TOIS. 2019.
Conversational Agents
32
User
ChatBot
NaturalLanguageProcessing
QueryGeneration
ContextManagement
DomainKnowledge
NaturalLanguageGeneration
Jonh Doe: @bot How much money do I have in my account?
Account Balance
Jonh Doe
JDAccount $ 0,99
Bot: @johndoe You have $0,99 inyour bank account.
isA
has
has
owns
Question Answering
33
Which films directed by Steven Spielberg won the Oscar Award?
Oscar Award
StevenSpielberg
Which
won directed film
nsubjdep nsubj
nmod:by
det
Zheng, W et al. Question Answering Over Knowledge Graphs: Question Understanding Via Template Decomposition. PVLDB. 2019.https://nlp.stanford.edu/software/lex-parser.shtml
StevenSpielberg
Film
Oscar Award
? ? ... ?directed
isA
won
Fact Checking and Fake News Detection
34
Barack Obama
Columbia University
Association ofAmerican Universities
Canada
Stephen Harper
Calagary
Naheed Neshi
Islam
Ciampaglia, G. L. Computational Fact Checking from Knowledge Networks. PLOS ONE. 2015.
Barack Obama secretlypractices Islam.
Personalized Medicine
35iASiS: Big Data to Support Precision Medicine and Public Health Policy. 2017.
36Vidal, M-E et al. Semantic Data Integration of Big Biomedical Data for Supporting Personalised Medicine. Springer. 2019.
Literome
• Extract genomic knowledge from PubMed articles;
• Knowledge pertinent to genomic medicine:• Directed genic interactions such as pathways;• Genotype–phenotype associations.
37Poon, H. Literome: PubMed-scale genomic knowledge base in the cloud. Bioinformatics. 2014.
Project HANOVER
• Machine reading for accelerating “Curations as a service” (CaaS) in precision medicine:
• Molecular Tumor Board:• Cancer is a thousand diseases driven by disparate
genetic mutations.• Real-world evidence:
• FDA-approved drug takes over a decade and costs more than $2 billion.
• Clinical trial matching:• 20% of clinical trials fail due to insufficient patients.
38https://hanover.azurewebsites.net/
ResearchSpace
• Semantic Web / Linked Data application;• Museums, libraries, archives: knowledge or
memory institutions• CIDOC-CRM:
• Conceptual Reference Model
39Oldman, D. and Tanase, D. Reshaping the Knowledge Graph by Connecting Researchers, Data and Practices in ResearchSpace. ISWC. 2018.https://www.researchspace.org.
40Oldman, D. and Tanase, D. Reshaping the Knowledge Graph by Connecting Researchers, Data and Practices in ResearchSpace. ISWC. 2018.https://www.researchspace.org.
Drug Discovery
Kanza, S. and Frey, J. G. A new wave of innovation in Semantic web tools for drug discovery. Expert Opnion on Drug Discovery. 2019.
Drug Discovery Data
Drug Discovery RDF Drug Discovery Ontologies
Drug Discovery Knowledge Bases
Semantic Search for Drug Discovery
Drug DiscoveryKnowledge Graphs
Intelligent Applications for DD
New Drug Discovery
Hum
an C
urat
ion
41
Data Models and Systems
Data Models
• Labeled Property Graph:• Node-oriented;
• Resource Description Framework (RDF):• Triple-oriented.
43
Labeled Property Graph
• Graph Databases:• Ex.: Amazon Neptune, Neo4J, JanusGraph, and
TigerGraph.• Query languages:
• Cypher, GCore, GS, Gremlim, and PGQL
44
https://aws.amazon.com/neptune/https://neo4j.com/https://janusgraph.org/https://www.tigergraph.com/
Labeled Property Graph
45
n1: PersonfirstName = “João”lastName = “Silva
n2: PersonfirstName = “Maria”lastName = “Santos”
n5: Posttext = “I love italian food.”lang = “en”
n3: Taglabel = “italian”
n4: Taglabel = “food”
e2: createsdate = 2019-09-04
e1: likesdate = 2019-09-05
e4: has
e3: has
Nodes- One or more labels- Set of propertiesEdges- Edge type- Set of properties
RDF
46
Nodes: Resources, literals or blank nodes Subjects (resource or blank node) Objects (resource, literal or blank node)
Edges: Predicates (resource)
Literal:- Can be interpreted as datatypes - Encoded as strings- Represent data values
Subject ObjectPredicate
Triple
RDF
• IRI/URI (Internationalized/Uniform Resource Identifier)
• Serialization: JSON-LD, N-Triples, RDF/XML, and Turtle
• Query Languages: SPARQL.
47
@prefix dbr: <http://dbpedia.org/resource>.@prefix dbo: <http://dbpedia.org/ontology>.@prefix foaf: <http://xmlns.com/foaf/0.1/> .
dbr:Albert_Einstein rdf:type foaf:Person; dbo:birthDate "1879-03-14"^^xsd:date.
dbr:Albert_Einstein
foaf:Person 1879-03-14
rdf:type dbo:birthDate
xsd:date
Systems for Knowledge Bases
• Grakn• Ontotext GraphDB• Amazon Neptune
48
Grakn
• Hyper-relational deductive database:• Entity and relationship types reasoning;• Rule-based reasoning.
• Core:• Built on top of JanusGraph, Apache TinkerPop,
Apache Hadoop, Apache Cassandra, Apache Spark• Data Model based on Entity-Relationship
Model and Hypergraphs;• Graql:
• OLTP, OLAP
49https://grakn.ai/
50https://dev.grakn.ai/docs/concept-api/overview
Concept
Type Rule Thing
EntityType
AttributeType
RelationType Entity
Attribute
Relation
51
01. define02. name sub attribute, datatype string;03.03. person sub entity, has name, plays person_;04. friend sub relation, relates person_, relates person_;05. is_friend sub rule,06. when {07. (person_: $p1, person_: $p2) isa friend;08. (person_: $p2, person_: $p3) isa friend;09. }, then {10. (person_: $p1, person_: $p2) isa friend;11. };12.13. insert14. $p1 isa person, has name "A";15. $p2 isa person, has name "B";16. $p3 isa person, has name "C";17.18. match19. $p1 isa person, has name "A";20. $p2 isa person, has name "B";21. $p4 isa person, has name "C";22. insert23. $r (person_: $p1, person_: $p2) isa friend;24. $r (person_: $p2, person_: $p3) isa friend;25. commit26.27. match28. $p1 isa person; $p2 isa person; 29. $r (person_: $p1, person_: $p2) isa friend;30. get31. $r;
“A”
“B”
“C”
@has
friend
person
name
owner value
person_ person_
Inferred
Ontotext GraphDB
• Triplestore:• RDFS, SPARQL, and RDF4J.
• Query Optimizer;• Reasoner (TREE Engineer);• Storage: Entity Pool
52http://graphdb.ontotext.com
53http://graphdb.ontotext.com/documentation/free/architecture-components.html
Amazon Neptune
• Graph service and database;• Amazon Web Services;• Labeled graph property:
• Apache Tinkerpop Gremlim• RDF
• SPARQL• Quad (subject, object, predicate, graph);
54https://aws.amazon.com/neptune/
55https://www.slideshare.net/AmazonWebServices/new-launch-amazon-neptune-overview-and-customer-use-cases-dat319-reinvent-2017
Knowledge Graph Tasks
Knowledge Graph Tasks
• Automated Construction• Refinement:
• Completion• Correction
• Analytics
57
Automated Construction
Knowledge Base/Graph Construction• Entity and Relation Extraction:
• Named Entity Recognition• Entity Alignment
• Systems:• DeepDive• Fonduer• Snorkel
59
Named Entity Recognition
60
Person State City
Pedro II of Brazil was the second and last monarch of the
Empire of Brazil. He was born in Rio de Janeiro to
Emperor Pedro I of Brazil and Empress Maria
Leopoldina,who were married at that time.
Entity Linking
61Q156774 Q939 Q84239 Q8678
Q217230
Pedro II of Brazil was the second and last monarch of the
Empire of Brazil. He was born in Rio de Janeiro to
Emperor Pedro I of Brazil and Empress Maria
Leopoldina,who were married at that time.
Entity Linking
62Abrams, R. Google Thinks I’m Dead (I know otherwise.) 2017. https://www.nytimes.com/2017/12/16/business/google-thinks-im-dead.html?_r=0
Relation Extraction
63
Pedro II of Brazil was the second and last monarch of
the Empire of Brazil. He was born in Rio de Janeiro to
Emperor Pedro I of Brazil and Empress Maria
Leopoldina, who were married at that time.
Q156774Q8678
Q939
Q84239
Q217230spouse
son
son born-in
emperor-of
DeepDive
CandidateMapping and
Feature ExtractionData Input
Supervision Learning andInference
Output
SpouseSpouse1 Spouse2
Tom RitaSarah Matthew
ErrorAnalysis
Ce Zhang, C. et al. DeepDive: Declarative Knowledge Base Construction. Communications of the ACM 2017.http://deepdive.stanford.edu/kbc
DeepDiveFeature Extractors: Populates the database using a set of SQL queries and UDFs. By default, one sentence per row using NLP.Candidate Mapping: SQL queries to produce possible mentions, entities and relations.
SupervisionEvidence Relation: Handlabeling and Distant Supervision.
Learning and InferenceFactor Graph similar to Markov Logic.Gibbs sampling.
Ce Zhang, C. et al. DeepDive: Declarative Knowledge Base Construction. Communications of the ACM 2017.http://deepdive.stanford.edu/kbc
Fonduer
KBCInitialization
CandidateGeneration
MultimodalFeaturization &
MultimodalLSTM
Supervision & Classification
Schema Matchers andThrottlers
LabelingFunctions
Phase 1 Phase 2 Phase 3Data Input
Output
DateOfBirthPerson Date
Madonna 16/08/1958Gisele B. 20/07/1980
User Input
ErrorAnalysis
66Ratner, A. J. et al. Snorkel: rapid training data creation with weak supervision. VLDB Journal. 2019.https://www.snorkel.org/
FonduerCandidate Generation (Apply UDFs: Matchers and Throtlers)Matchers: Returns mentions to entities.Throttlers: Decrease the # of relationship candidates.
Multimodal FeaturizationAssociate textual, structural, visual and tabular features to relationship candidates.
Supervision and ClassificationLabeling Function: Yields a label (-1, 0 or 1) for each candidate.Data Programming: Estimate the true label for each candidate.Multimodal BiLSTM: Estimate the true label for each candiate considering features.
67Ratner, A. J. et al. Snorkel: rapid training data creation with weak supervision. VLDB Journal. 2019.https://www.snorkel.org/
FonduerCandidate Generation (Apply UDFs: Matchers and Throtlers)Matchers: Returns mentions to entities.Throttlers: Decrease the # of relationship candidates.
- DateMatcher()- DictionaryMatch()- LambdaFunctionMatcher()- LambdaFunctionFigureMatcher()- LocationMatcher()- MiscMatcher()- NumberMatcher()- OrganizationMatcher()- PersonMatcher()- RegexMatchEach()- RegexMatchSpan() https://spacy.io/
- For each si in schema R(s1, ..., sn): - Define a set of matchers - Define the mention space- Cartesian product to yield candidates- Define throttlers: - Operates on candidates
Wu, S. et al. Fonduer: Knowledge Base Construction from Richly Formatted Data. SIGMOD. 2018.https://fonduer.readthedocs.io/en/latest/
Data Programming
69
Candidates Labeling Functions
GroundTruth Prob
Spouse1 Spouse2 L1 L2 L3 Yc1 Tom Hanks Rita Wilson 1 1 -1 ? .75c2 Matthew Broderick Sarah J. Parker 0 1 1 ? .80c3 Brad Pitt Jennifer Aniston 0 1 0 ? .42
Estimate the ground truth based on Labeling Functions agreements and disagreements.
Ratner, A. J. et al. Data programming: Creating large training sets, quickly. NIPS. 2016.Ratner, A. J. et al. Snorkel: rapid training data creation with weak supervision. VLDB Journal. 2019.https://www.snorkel.org/
Data Programming
70
L1
L2
L3
Y2
3
11
23
1
3
2
Accuracy: 1{Li == Y}Labeling Propensity: 1{Li != 0}
Pairwise Correlation: 1{Li == Lj}
w1 w2 w3 w4 w5 w6 w7 w8 w9
1 2 3 1 2 3 1 2 3
Indicator Function
Ratner, A. J. et al. Data programming: Creating large training sets, quickly. NIPS. 2016.Ratner, A. J. et al. Snorkel: rapid training data creation with weak supervision. VLDB Journal. 2019.https://www.snorkel.org/
Snorkel
• Data Labeling:• Labeling functions to heuristically or noisily label
some subset of the training examples;• Data augmentation:
• Transformation functions to heuristically generate new, modified training examples by transforming existing ones;
• Slicing:• Slicing functions to heuristically identify subsets of
the data the model should particularly care about.
71Ratner, A. J. et al. Snorkel: rapid training data creation with weak supervision. VLDB Journal. 2019.https://www.snorkel.org/
Multimodal BiLSTM
72Wu, S. et al. Fonduer: Knowledge Base Construction from Richly Formatted Data. SIGMOD. 2018.https://fonduer.readthedocs.io/en/latest/
Relational Machine Learning
Knowledge Graph Completion
74
Task Query Example Result Example
Triple Classification (Einstein, died-in, USA)? (Yes, 90%)
Link
Pre
dict
ion
Tail Prediction (Elvis Presley, starred-in, ?) (1, Blue Hawaii, 3.23), (2, Change of Habit, 3.12), ...
Head Prediction (?, starred-in, Casablanca) (1, Humphrey Bogart, 2.21), (2, Ingrid Bergman, 2.01), ...
Relation Prediction (Einstein, ?, Germany) (1, born-in, 5.01), (2, died-in, 1.23),...
Attribute Prediction (Obama, nationality, ?) (1, american, 2.21), (2, kenian, 1.02), ...
Entity Classification (Michael Jackson, isA, ?) (1, singer, 6.20), (2, composer, 5.22),...
(ranking, answer, score)
Relational Machine Learning • Probabilistic Graphical Models (PGMs)• Graphical Feature Models (GFMs)• Latent Feature Models (LFMs):
• Knowledge Graph Embedding.
75Maximilian, N. A review of relational machine learning for knowledge graphs. IEEE. 2016.
φθ: E X RE X E → Y Model
Parameters ScorePossible Triples
( , ) ( , )
( , )( , ) ( , )
( , )
Supervised Learning
76
Square Square
Triangle
Triangle Circle
Circle
Supervised Learning
Predictive Model
Square
Triangle
Circle
Predictive Model
77
João Maria
LucasParent
Spouse
Child
(João, Spouse, João)(João, Spouse, Maria)(João, Spouse, Lucas)(João, Parent, João)(João, Parent, Maria)(João, Parent, Lucas)(João, Child, João)(João, Child, Maria)(João, Child, Lucas)
(Maria, Spouse, João)(Maria, Spouse, Maria)(Maria, Spouse, Lucas)(Maria, Parent, João)(Maria, Parent, Maria)(Maria, Parent, Lucas)(Maria, Child, João)(Maria, Child, Maria)(Maria, Child, Lucas)
(Lucas, Spouse, João)(Lucas, Spouse, Maria)(Lucas, Spouse, Lucas)(Lucas, Parent, João)(Lucas, Parent, Maria)(Lucas, Parent, Lucas)(Lucas, Child, João)(Lucas, Child, Maria)(Lucas, Child, Lucas)
(Lucas, child, João)?
Possible Triples: Not observed + Observed Triples
78
( ,0)Spouse
( ,1)Child
( ,1)( ,0)( ,1)( ,0)
Child
Spouse
Spouse
Parent
Supervised Learning
Predictive Model
Predictive Model
Parent 1
Tensor Factorization Formulation
79
Based on observed relationships, does an unseen relationship exist?
Entities
Entit
ies
Relations
Relational Machine Learning
• Probabilistic Graphical Models (PGMs):The existence of a triple depends on the existence of other triples.
• Graphical Feature Models:The existence of a triple is conditionally independent of the existence of other triples given model parameters and observed features in the graph.
• Latent Feature Models:The existence of a triple is conditionally independent of the existence of other triples given model parameters and latent features.
80Maximilian, N. A review of relational machine learning for knowledge graphs. IEEE. 2016.
PGMs
81
E1 E2
Knowledge Graph:2 entities, 2 possible relations,8 possible triples
Dependency Graph:8 nodes (1 for each possible triple)27 possible pair-interdepedencies
- Yi is a RV denoting the ith triple existence.- PGM to model the distribution over possible worlds: Prob(Y1,Y2,Y3,Y4,Y5,Y6,Y7, Y8)
Y1
Y5
Y2 Y3
Y4
Y6 Y7
Y8
Markov Logic Networks
• Syntax: Weighted first-order formulas;• Semantics: Templates for Markov Nets;• Inference: Logical and Probabilistic;• Learning: Statistical and Inductive Logical
Programming.
82
Markov Logic Networks
• Set of pairs (Fi, wi):• Fi: First-order logic formula;• wi: A real number (weight).• ni: Number of satisfied groundings of Fi in y
(possible world).• PGM:
• One node for each grounding atom.• Edges between nodes appearing at the same
grounding formula.
83Richardson, M. and Domingos, P. Markov Logic Networks. Machine Learning. 2006.
Markov Logic Networks
84
Knowledge BaseRules∀x Smokes(x) → Cancer(x)∀x∀y Friends(x,y) → (Smokes(x) ↔ Smokes(y))
FactsAna(A)João(J)
Friends(A, J) Friends(J, A)
Cancer(A) Cancer(J)
Smokes(A) Smokes(J)
Friends(J, J)Friends(A, A)
Rule (feature) weights Number of times the rule is satisfied in the world y.
Richardson, M. and Domingos, P. Markov Logic Networks. Machine Learning. 2006.
Graphical Feature Models
• Graph observed features• Similarity Indices• Rule mining• Inductive Logic Programming
85
Path Ranking Algorithm
• Random walks of bounded length;• Feature Extraction
• Build feature vectors for triples;• Vectors are based on relation paths Pj(R1,...,Rn).
• Training:• Train an off-the-shelf machine learning model.
86Lao, N., Mitchell, T. and Cohen, W. W. Random walk inference and learning in a large scale knowledge base. EMNLP. 2011.
Relation Paths
87
A B C
R1
Probabilities of reaching C from A by following given relation paths:[ 1/4, 0, 0, 0, 0, 0]
R1-1 R2
-1
R2
R3-1
P1(R1, R2)P2(R2
-1, R1-1)
P3(R3, R2-1)
P4(R3-1, R1)
P5(R2, R3-1)
P6(R1-1, R3)
(A, R1, B), (B, R2,C) (A, R3, C)
R3
Maximilian, N. A review of relational machine learning for knowledge graphs. IEEE. 2016.
88
Enumeration of 2-length paths:
P1(Sp,Pa) P2(Pa-1,Sp-1) P3(Pa,Ch) P4(Ch-1,Pa-1) P5(Ch, Sp ) P6(Sp-1,Ch-1)
0 0 0 0 0 1/4
Feature vector associated with (Luísa, Parent, Bruna)
Ch-1
João Ana
Maria Lucas
Luísa
BrunaRafael
PaCh
SiCh
Sp-1
Si-1
SpPa-1Pa-1 PaCh-1
Sp
Sp-1
Ch: Child, Pa: Parent, Sp: Spouse, Si: Sibling
Maximilian, N. A review of relational machine learning for knowledge graphs. IEEE. 2016.
89
Enumeration of 2-length paths:
P1(Sp,Pa) P2(Pa-1,Sp-1) P3(Pa,Ch) P4(Ch-1,Pa-1) P5(Ch, Sp ) P6(Sp-1,Ch-1)
0 0 0 0 0 1/4
Feature vector associated with (Luísa, Parent, Bruna)
Maximilian, N. A review of relational machine learning for knowledge graphs. IEEE. 2016.
Ch-1
João Ana
Maria Lucas
Luísa
BrunaRafael
PaCh
SiCh
Sp-1
Si-1
SpPa-1Pa-1 PaCh-1
Sp
Sp-1
Ch: Child, Pa: Parent, Sp: Spouse, Si: Sibling
Path Ranking Algorithm
• Logistic regression• θ: model parameter vector• σ(x) = 1 / (1 + exp(-x)) • f(s,r,o) : (s,r,o) feature vector• y(s,r,o) : 1 iff (s,r,o) is in the knowledge graph
90
φθ(s,r,o) := σ(<θ, f(s,r,o)>)
91
n1
n2 n6
n5
n3 n7
n8
Feature Engineering- Engineered features to measure local neighborhoods;- Graph Kernels;- Summary statistics.
n4
Node Classification: Machine LearningModel or
Knowledge Graph Embeddings
Word Embeddings in NLP
93
king
man
womanqueen
walking
swimming
swamwalked
Spain
Italy
Rome
Madrid
Brasil
Brasilia
Bengio, Y. et al. Representation Learning: A Review and New Perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2013.
Embeddings
94
WordI
likeapple
Word EmbeddingI [0.19,...,0.87]
like [0.09,...,0,46]apple [0.97,...,0.22]
Semantic Triple(Steve Jobs, founder, Apple)(Bill gates,founder, Microsoft)
(Paris, capital-of, France)
Entitty/Relation EmbeddingApple [0.92,...,0.12]
Bill Gates [0.08,...,0.46]capital-of [0.67,...,0.37]founder [0.12,...,0.41]France [0.65,...,0.87]
Microsoft [0.78,...,0.21]Paris [0.98,...,0.46]
Steve Jobs [0.12,...,0.69]
Graph Representation Learning
95Espato, A. Innovations in Graph Representation Learning. 2019.https://ai.googleblog.com/2019/06/innovations-in-graph-representation.html.
Knowledge Graph Embedding (KGE)Embed components of a Knowledge Graph including entities and relations into continuos vector spaces, so as to simplify the manipulation while preserving the inhrent structure of the KG.
96Wang, Q. et al. Knowledge Graph Embedding: A Survey of Approaches and Applications. IEEE Transactions on Knowledge and Data Engineering. 2017.
KGE Models
97
plays-in
owns
acted-in
born-in
died-in
plays-in
owns
acted-in
born-in
died-in
EntityEmbedding
RelationEmbedding
KGE Models
• Translational Distance Models• Score functions based on distance.
• Semantic Matching Models• Score functions based on similarity.
• Multiplicative, additive and neural approaches.
98
Model requirements
• Expressiveness:Model KG properties and regularities as much as possible (symmetry, antisymmetry, inversion, composition, etc.)
• Space / time complexity:O(NE), O(NR), O(NT) in time and memory.
99
Anatomy of a KGE Model
• Knowledge Graph (KG);• Negative triples generation strategy;• Scoring function for a triple;• Loss function;• Optimization algorithm.
100
TransE
101Bordes, A. et al. Translating Embeddings for Modeling Multi-relational Data. NIPS. 2013.
Brasilia
Brasil
capital-of
Hierarchical structure
1-to-1 relation as translation (NLP)
vs
vo
vr
translation
TransE
• Constraints on entity embedding:• Prevent learning trivial representations.
• Limitations on dealing with 1-N, N-1, N-M relations.
102
v”Maryl Streep” + v”starred-in” = v”Death becomes her”v”Maryl Streep” + v”starred-in” = v”Doubt”
Movies are very different: cast, genre, director, etc.
Bordes, A. et al. Translating Embeddings for Modeling Multi-relational Data. NIPS. 2013.
• Project to relation-specific hyperplanes.
TransH
103
vs
vo⊥
vrvs⊥
vo wr
||wr||2 = 1
Wang, Z. et al. Knowledge Graph Embedding by Translating on Hyperplanes. AAAI. 2014.
Translational Distance Models
104
- K2GE;- ManifoldE;- SE;- STransE;- TransD;- TransE;- TransF;
- TransG;- TransH;- TransM;- TransR;- TranSparse;- UM.
Wang, Q. et al. Knowledge Graph Embedding: A Survey of Approaches and Applications. IEEE Transactions on Knowledge and Data Engineering. 2017.Nguyen, D. Q. An overview of embedding models of entities and relationships for knowledge base completion. ArxiV. 2019.
RESCAL
105
w12w11 w21 w22
vs2vs1 vo1 vo1
score
Too many parameters!!!
RESCAL
• DistMult: • Wr as diagonal matrix.• Can not deal with asymmetric relations.
• Complex:• Introduce complex value embedding;• Score based on the real part of embeddings.
Nickel, M. et al. A three-way model for collective learning on multi-relational data. ICML. 2011. Yang, B. et al. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. ICLR. 2015.Trouillion, T. Knowledge Graph Completion via Complex Tensor Factorization. 2017.
Analogy
107
Man is to king what woman is to queen.king - man + woman = queen
Ethayarajh, K. et al. Towards Understanding Linear Word Analogies. ACL. 2019.Liu, H et al. Analogical Inference for Multi-relational Embeddings. PMLR. 2017.
Man
King
Woman
Queenfemale
female
royal royalHelps topredict
Analogy
108Liu, H et al. Analogical Inference for Multi-relational Embeddings. PMLR. 2017.
Commutative constraints.
Normality constraints.
SimplE
• Canonical Polyadic Decomposition.• Two vectors for each entity:
• One as subject and another as object.
109Kazemi, S. M. and Poole, D. SimplE Embedding for Link Prediction in Knowledge Graphs. NIPS. 2018.
Element-wise productTrilinear product
Shallow and Deep Models
110
Lookup Table
ei
ej
Embedding Dimension
# En
titie
s
Plausibility of (ei, rk, ej) existence
Shallow Models - Representation expressiviness depends on the embedding dimension;- Transductiveness.
Deep models- Overfitting;- Time and space.
a1 a2
a3 a4
b1 b2
b3 b4
ConvE
111Dettmers, T. et al. Convolutional 2D Knowledge Graph Embeddings. AAAI. 2018.
1D Convolution
2D Convolution
a1 a2 a3 a4
b1 b2 b3 b4a1 a2 a3 a4 b1 b2 b3 b4
f1 f2 f3f1*b2 + f2*b3 + f3*b4
a1 a2 a3 a4
b1 b2 b3 b4
f1 f2
f3 f4
no paddingstride 1
f1*0 + f2*0 + f3*0 + f4*a1
padding 1stride 1
f1*a4 + f2*0 + f3*b2 + f4*0
ConvE
112
vs vr
X
X
XX X
.3
.9
.1
.5
EmbeddingDropout .25
Feature MapDropout .25
HiddenDropout .25
Concatenate ConvolveFully Connected
projectionLogisticSigmoid
Matrix multiplication
with entity matrix
Embeddings ”Image” Feature maps Projection Logits Predictions
Dettmers, T. et al. Convolutional 2D Knowledge Graph Embeddings. AAAI. 2018.
Convolution Neural Network
113
Convolution Neural Network
114
Convolution Neural Network
115
Convolution Neural Network
116
Graph Convolution
117
Generalize the operation of convolutionfrom grid data to graph data.
Main idea: Learn a representation for a node taking into account its neighbors representations.
Wang, Q. et al. A Comprehensive Survey on Graph Neural Networks. ArXiv. 2019.
Differentiable message-passing framework
Kipf, T. N and Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. ICLR. 2017.Schlichtkrull, M. et al. Modeling Relational Data with Graph Convolutional Networks. ESWC. 2018.
hidden state at l+1 layer hidden states at l layer
incoming messages for node u
element-wise activation function
- hu(l+1) is a real vector of dimension dl+1;
- Messages are often chosen to be identical to the set of incoming edges;- gm may be Whv.
Relational Graph Convolutional Networks
119Schlichtkrull, M. et al. Modeling Relational Data with Graph Convolutional Networks. ESWC. 2018.
r-neighborhood of entity e
self-relation specific matrix relation specific matrix
relation specific constant
120
h1 h2
h3 h4
h5h6
h
In
In
Out
h
h2
h3
h6
Self h
X
X
X
X
+ ReLU
Out h1 h4+X
In h5X
RGCN (Encoder) Decoder
Input 1stLayer
2ndLayer
nthLayer
...Score Edge
Loss
Last layer representations taken as latent features.
Relational Graph Convolutional Networks• Rapid growth in number of parameters:
• Overfitting (rare relations);• Models of very large size.
• Weights regularization:• Basis decomposition: linear combination of basis
transformations.• Block-diagonal decomposition: direct sum over a
set of low-dimensional matrices.
121
Graph Neural Networks
• Convolutional Graph Neural Networks• Graph Autoencoders• Recurrent Graph Neural Networks• Spatial-temporal Graph Neural Networks
122Wang, Q. et al. A Comprehensive Survey on Graph Neural Networks. ArXiv. 2019.
Attributive Relations
• Literals;• KGs often include:
• Numerical attributes (e.g., ages, dates, financial, and geoinformation);
• Textual attributes: (e.g., names, descriptions, and titles);
• Images (e.g., profile photos, flags, and posters).• Useful for entities with few relationships;• Regard attributes as entities.
123Gesese, G. A. and Russa Biswas, H. S. A comprehensive survey of knowledge graph embeddings with literals: Techniques and applications. ArXiv. 2019.
MKBE
• Multimodal Knowledge Base Embeddings;• Compositional encoding component:
• Different neural encoders for the variety of observed data.
• Embedding Multimodal Data:• Structured knowledge, numerical, text, and images.
124Pezeshkpour, P. et al. Embedding multimodal relational data for knowledge base completion. EMNLP. 2019.
125
Number...
Feed forward layer
Text
EmbeddingVector
Stacked bidirectional GRUs
... ... ... ...
CharacterLevel
SentenceLevel
CNN over wordembeddings
Image
Pretrained VGGNeton ImageNet
126
s
r
o
DenseLayer
DenseLayer
DenseLayer
ScoreFunction
GRU
CNN
VGGNet
Entity
Relation
Entity
Long Text
Short Text
Image
Embeddings
Score for thetriple (s,r,o)
Ontology
• Explicit and implicitly employ ontological knowledge on KGE learning;
• Restrictions on embedding space to guarantee expressiveness and consistency.
127
• Instance-view, ontology-view, and cross view;• Cross-view association model:
• Cross-view grouping (CG);• Cross-view transformation (CT).
• Intra-view model:• Default;• Hierarchy-aware.
• Specific Cost Functions.
JOIE
128Pezeshkpour, P. et al. Universal Representation Learning of Knowledge Bases byJointly Embedding Instances and Ontological Concepts. KDD. 2019.
B
A
X Y
subclass
isA
rInstance-view
cross-view
ontology-view
129
PersonEntities
CityEntities
Pelé
Adele
Person
Rio
Fortaleza
City
130
PersonEntities
CityEntities
Rio
Fortaleza
Adele
Pelé
CityEntities
PersonEntities
Pelé
Adele
Person
Rio
Fortaleza
City
fct
fct
131
Place
City
InstitutionState
isA isA
locatedIn
University
isA isA
works-in
Person
ScientistisA
Musician
ArtistisA isA
”Place”hierarchy
”Person”hierarchy
Model Training and Evaluation
• Triple Classification Protocol:• Test the model's ability to discriminate between true
and false triples.• Triple (s,r,o) is classified as positive if its score
exceeds a relation-specific decision threshold (learned on validation data).
• Entity Ranking Protocol:• Assess model performance in terms of ranking
answers to certain questions.
132Wang, Y. et al. On Evaluating Embedding Models for Knowledge Base Completion. RepL4NLP. 2019.
Evaluation Metrics
• Regression, classification and ranking metrics.
133
(Mean Reciprocal Rank)
134
s p o score rank
Neymar born-in Mogi 0.80 1 2
Neymar born-in Gama 0.70 2
Neymar born-in Kaká 0.20 3
Neymar born-in Neymar 0.10 4
Gama born-in Mogi 0.05 3
Kaká born-in Mogi 0.85 1
Mogi born-in Mogi 0.04 4
Kaká born-in Gama 0.80 2 1
Kaká born-in Kaká 0.10 3
Kaká born-in Neymar 0.20 4
Kaká born-in Mogi 0.85 1
Gama born-in Gama 0.04 4
Mogi born-in Gama 0.05 3
Neymar born-in Gama 0.70 2
Test triples:(Neymar, born-in, Mogi)(Kaká, born-in, Gama)
Entities:Kaká, Neymar, Mogi, and Gama
hits@1 = (1 + 0 + 0 + 1)/4 = 0.5hits@2 = (1 + 1 + 1 + 1)/4 = 1
MRR = (1 + 1/2 + 1/2 + 1)/4 = 0.75
Test Triple => Corrupted Triples:(s, r, o) => (s, r, o') and (s', r, o).
Negative Sampling
135
- Uniformally sample from entities/relation set.
- tph (Average number of tail entities per head)- hpt (Average number of head entities per tail)- Bernoulli with parameters: - tph / (tph + hpt) for replacing the head and hpt / (tph + hpt) for replacing the tail
- Corrupt a position (i.e. head or tail) using only entities that have appeared in that position with the same relation.
Perturb a triple:unusual
Pointwise
Square Error Loss
Hinge Loss
Logistic Loss
Pairwise
Hinge Loss
Logistic Loss
136
Closed WorldAssumption
Score given for triple t Label (0 or 1) for triple t
1-N Scoring
137
p y(s,r)
Probabilities Labels
Against allentities
Probability(s,r,o') beingtrue.
(s,r, o') label.
Event, Softwares, and Datasets
Conferences
139
AAAI
CIKM
EMNLP
ICML
KDD
NIPS
SIGMODPODS VLDB WSDM
Data Management:
WWW
Artificial Intelligence:
IJCAIICLRECMLPKDD
Natural Language Processing:
ACL
ESWC
• Automated Knowledge Base Construction;• Knowledge Graph Conference;• International Workshop on Challenges and
Experiences from Data Integration to Knowledge Graphs;
• Workshop on Knowledge Graph Technology and Applications;
• Workshop on Deep Learning for Knowledge Graphs.
140
KGE libraries and systems
• AmpliGraph (Tensorflow, Benchmark Datasets):• https://ampligraph.org/;
• DeepGraphLibrary (MXNet/Gluon and PyTorch):• https://www.dgl.ai/
• OpenKE (Tensorflow, Pretrained embeddings):• http://openke.thunlp.org
• PyKEEN (PyTorch):• http://pykeen.readthedocs.io
• PyTorch-BigGraph (PyTorch):• https://torchbiggraph.readthedocs.io/en/latest
141
AmpliGraph DeepGraphLibrary OpenKE PyKEEN PyTorch BigGraph
ConvEComplexDistMultERMLPHolERESCALR-GCNStructured Embedding
TransDTransETransHTransRTranSparseUnstructured Model
Model is implemented by the system.
PyTorch-BigGraph
• Built on PyTorch;• Models:
• ComplEx, DistMult, RESCAL, and TransE.• Features:
• Graph Partitioning;• Multi-threaded computation;• Distributed execution accross multiple machines;• Batched negative sampling.
143Lerer, A. et al. PyTorch-BigGraph: A Large-scale Graph Embedding System. 2019.https://torchbiggraph.readthedocs.io/en/latest/
PyTorch-BigGraph
144Lerer, A. et al. PyTorch-BigGraph: A Large-scale Graph Embedding System. 2019.https://torchbiggraph.readthedocs.io/en/latest/
DeepGraphLibrary
• Built on MXNet/Gluon and PyTorch.• Models:
• Graph Neural Networks: GCN, GAN, R-GCN, LGNN, and SSE.
• Generative Models: DGMG and JTNN.• Capsule, Transformer, and Universal Transformer
architecture.
145https://www.dgl.ai/
Bechmark Datasets• DB15K:
• https://github.com/nle-ml/mmkb
• FB15K:• https://github.com/nle-ml/mmkb
• FB15K-237:• https://github.com/TimDettmers/ConvE
• WN18:• https://everest.hds.utc.fr/doku.php?id=en:transe
• WN18RR• https://github.com/TimDettmers/ConvE
• YAGO3-10:• https://github.com/TimDettmers/ConvE
• YAGO15K:• https://github.com/nle-ml/mmkb
146
Open Research
Knowledge Graph Embeddings
• KGE and the lack of symbolic structures (e.g., rules and restrictions);
• Compare/Combine KGE to/with other approaches (e.g., Probabilistic Soft Logic + rule learning);
• KGE interpretability and explainability;• KGE consistency (e.g., embed formal
knowledge);
148
Knowledge Graph Embeddings
• KGE sparsity and uncertainty;• Temporal and spatial dynamics;• Other structures:
• Paths, motifs, graphlets, etc.• Representation:
• Hypergraphs (n-ary relations);• Meta-properties.
• Inductive vs. transductive learning.
149
Knowledge base/graph construction• Heterogeneous and multimodal information;• Multi-language knowledge bases;• Construction in specific domains;• Entity disambiguation and managing identity;
150
Knowledge base/graph construction• Managing operations at scale;• Virtual knowledge graph.• Continuously learning and self-correcting
systems.• Knowledge graph alignment.
151
152
StructuredData
Ex.: Relational Databases.
Semi-structuredData
Ex.: XML and JSON documents.
Non-StructuredData
Ex.: Text, audio, image, and video files.
Knowledge BaseConstruction
Ex.: Entity and relation extraction.
Knowledge Graph
Refinement
Applications
Completion and Correction
Completeness
Freshness
Correctness
dramos,ziviani,[email protected] autores agradecem ao CNPq, à FAPERJ e ao
CENPES/Petrobras pelo financiamento.
http://dexl.lncc.br/