Upload
mathias-niepert
View
227
Download
2
Embed Size (px)
Citation preview
Learning Convolutional Neural Networks for GraphsMathias Niepert Mohamed Ahmed Konstantin Kutzkov
NEC Laboratories Europe
GA-653449
2 Learning Convolutional Neural Networks for Graphs
Representation Learning for Graphs
Telecom Safety Transportation Industry Smart cities
(Edge) deployment
Deep Learning System
Intermediate Representation: Graphs
?
3 Learning Convolutional Neural Networks for Graphs
Problem Definition
▌ Input: Finite collection of graphs
● Nodes of any two graphs are not necessarily in correspondence● Nodes and edges may have attributes (discrete and continuous)
▌Problem: Learn a representation for classification/regression
▌ Example: Graph classification problem
…
[ ] = ?class
4 Learning Convolutional Neural Networks for Graphs
State of the Art: Graph Kernels
▌Define kernel based on substructures● Shortest paths● Random walks● Subtrees● ...
▌ Kernel is similarity function on pairs of graphs● Count the number of common substructures
▌Use graph kernels with SVMs
5 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
6 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
node sequence selection (w=6 nodes)
7 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
node sequence selection (w=6 nodes)
8 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
node sequence selection (w=6 nodes)
9 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
node sequence selection (w=6 nodes)
10 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
node sequence selection (w=6 nodes)
11 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
node sequence selection (w=6 nodes)
12 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
node sequence selection (w=6 nodes)
neighborhood assembly (at least k=4 nodes)
13 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
node sequence selection (w=6 nodes)
neighborhood assembly (at least k=4 nodes)
14 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
node sequence selection (w=6 nodes)
neighborhood assembly (at least k=4 nodes)
15 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
node sequence selection (w=6 nodes)
neighborhood assembly (at least k=4 nodes)
16 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
node sequence selection (w=6 nodes)
neighborhood assembly (at least k=4 nodes)
17 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
node sequence selection (w=6 nodes)
neighborhood assembly (at least k=4 nodes)
18 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
node sequence selection (w=6 nodes)
neighborhood normalization (exactly k=4 nodes)
1
2
3
4
1
2
3
41
23
41 2
3
41
2
3
4
neighborhood assembly (at least k=4 nodes)
12 3
4
19 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
neighborhood normalization
● normalized neighborhoods serve as receptive fields● node and edge attributes correspond to channels
Convolutional architecture
1234
1 2 3 4 a 1 … a m
1 2 3 4 a 1 … a n
1234
1 2 3 4 a 1 … a m
1 2 3 4 a 1 … a n
1234
1 2 3 4 a 1 … a m
1 2 3 4 a 1 … a n
…
1
2
3
4
1
2
3
41
23
41 2
3
41
2
3
4
12 3
4
20 Learning Convolutional Neural Networks for Graphs
Node Sequence Selection
▌We use centrality measures to generate the node sequences▌Nodes with similar structural roles are aligned across graphs
node sequence selection
A: Betweenness centrality B: Closeness centrality C: Eigenvector centrality D: Degree centrality
21 Learning Convolutional Neural Networks for Graphs
▌ Simple breadth-first expansion until at least k nodes added, or no additional nodes to add
Neighborhood Assembly
neighborhood assembly
22 Learning Convolutional Neural Networks for Graphs
▌Nodes of any two graphs should have similar position in the adjacency matrices iff their structural roles are similar
▌Result: For several distance measure pairs it is possible to efficiently compare labelingmethods without supervision
▌ Example: ||A – A’||1 and edit distanceon graphs
Graph Normalization Problem
1
2
3
4
normalization
distance measures in matrix and graph space
adjacency matrices under labeling
labeling method(centrality etc.)
23 Learning Convolutional Neural Networks for Graphs
Graph Normalization
Distance to root node
1
2
2
3
3
3
3
24 Learning Convolutional Neural Networks for Graphs
Graph Normalization
Distance to root node
Centrality measures, etc.
1
2
3
5
5
6
4
1
2
2
3
3
3
3
25 Learning Convolutional Neural Networks for Graphs
Graph Normalization
Distance to root node
Centrality measures, etc.
Canonicalization (break ties)
1
2
3
6
5
7
4
1
2
3
5
5
6
4
1
2
2
3
3
3
3
26 Learning Convolutional Neural Networks for Graphs
Computational Complexity
▌ At most linear in number of input graphs▌ At most quadratic in number of nodes for each graph
(depends on maximal node degree and centrality measure)
27 Learning Convolutional Neural Networks for Graphs
Convolutional Architecture
1 2 3 4 a 1 … a n
1 2 3 4 a 1 … a n
1 2 3 4 a 1 … a n
1 2 3 4 a 1 … a n
1 2 3 4 a 1 … a n …
v 1 …
v M
field size: 4, stride: 4, filters: M
28 Learning Convolutional Neural Networks for Graphs
Convolutional Architecture
a 1 … a n
a 1 … a n
a 1 … a n
a 1 … a n
a 1 … a n …
v 1 …
v M
field size: 4, stride: 4, filters: M
1 2 3 4 1 2 3 4 1 2 3 41 2 3 4 1 2 3 4
29 Learning Convolutional Neural Networks for Graphs
Convolutional Architecture
a 1 … a n
a 1 … a n
a 1 … a n
a 1 … a n
a 1 … a n …
v 1 …
v M
field size: 4, stride: 4, filters: M
v 1 …
v M…
1 2 3 4 1 2 3 4 1 2 3 41 2 3 4 1 2 3 4
30 Learning Convolutional Neural Networks for Graphs
Convolutional Architecture
a 1 … a n
a 1 … a n
a 1 … a n
a 1 … a n
a 1 … a n …
field size: 3, stride: 1, N filters
v 1 …
v N
field size: 4, stride: 4, filters: M
v 1 …
v M
v 1 …
v M …
1 2 3 4 1 2 3 4 1 2 3 41 2 3 4 1 2 3 4
31 Learning Convolutional Neural Networks for Graphs
Convolutional Architecture
a 1 … a n
a 1 … a n
a 1 … a n
a 1 … a n
a 1 … a n …
field size: 3, stride: 1, N filters
v 1 …
v N
field size: 4, stride: 4, filters: Mv 1
… v M
v 1 …
v M…
1 2 3 4 1 2 3 4 1 2 3 41 2 3 4 1 2 3 4
32 Learning Convolutional Neural Networks for Graphs
Convolutional Architecture
a 1 … a n
a 1 … a n
a 1 … a n
a 1 … a n
a 1 … a n …
field size: 3, stride: 1, N filters
field size: 4, stride: 4, filters: Mv 1
… v M
v 1 …
v N
v 1 …
v N
v 1 …
v M…
…
1 2 3 4 1 2 3 4 1 2 3 41 2 3 4 1 2 3 4
33 Learning Convolutional Neural Networks for Graphs
Experiments - Graph Classification
▌ Finite collection of graphs and their class labels
● Nodes of any two graphs are not necessarily in correspondence● Nodes and edges may have attributes (discrete and continuous)
▌ Learn a function from graphs to class labels
…
[ ] = ?class
class = 1 class = 0 class = 1
34 Learning Convolutional Neural Networks for Graphs
Experiments - Convolutional Architecture
1 2 … 10 a 1 … a n
1 2 … 10 a 1 … a n
1 2 … 10 a 1 … a n
1 2 … 10 a 1 … a n
1 2 … 10 a 1 … a n …
field size: k, stride: k, filters: 16
field size: 10, stride: 1, filters: 8
flatten, dense 128 units
softmax
1 … 16
1 … 1
6…1
… 8
1 … 8…
35 Learning Convolutional Neural Networks for Graphs
Classification Datasets
▌ MUTAG: Nitro compounds where classes indicate mutagenic effect on a bacterium (Salmonella Typhimurium)
▌ PTC: Chemical compounds where classes indicate carcinogenicity for male and female rats
▌ NCI: Chemical compounds where classes indicate activity against non-small cell lung cancer and ovarian cancer cell lines
▌ D&D: Protein structures where classes indicate whether structure is an enzyme or not
▌ ...
36 Learning Convolutional Neural Networks for Graphs
▌Q: How efficient and effective compared to graph kernels?▌ Apply Patchy to typical graph classification benchmark data
Experiments - Graph Classification
37 Learning Convolutional Neural Networks for Graphs
Experiments - Visualization
▌Q: What do learned edge filters look like?▌ Restricted Boltzmann machine applied to graphs▌ Receptive field size of hidden layer: 9
small instances of graphs weights of hidden nodes
graphs sampled from RBM
38 Learning Convolutional Neural Networks for Graphs
Discussion
▌ Pros:● Graph kernel design not required● Outperforms graph kernels on several datasets (speed and accuracy)● Incorporates node and edge features (discrete and continuous)● Supports visualizations (graph motifs, etc.)
▌Cons:● Prone to overfitting on smaller data sets (graph kernel benchmarks)● Shift from designing graph kernels to tuning hyperparameters● Graph normalization not part of learning
code to be released: patchy.neclab.eu