
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering

M. Defferrard¹, X. Bresson², P. Vandergheynst¹

¹EPFL, Lausanne, Switzerland

²Nanyang Technological University, Singapore

Itay Boneh, Asher Kabakovitch
Tel-Aviv University, Deep Learning Seminar

2017

Spectral Filtering: Non-parametric

From the convolution theorem:

$x \ast_{\mathcal{G}} g = U\left(U^\top g \odot U^\top x\right) = U\left(\hat{g} \odot U^\top x\right)$

Or algebraically:

$x \ast_{\mathcal{G}} g = U \operatorname{diag}(\hat{g}_1, \dots, \hat{g}_n)\, U^\top x$

- Not localized
- O(n) parameters to train
- O(n²) multiplications (no FFT; sketched below)
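For contrast, here is a minimal sketch of this non-parametric filtering computed literally from the formula above (the function name and arguments are my own, not the paper's code). It needs a full eigendecomposition of the Laplacian and two dense multiplications with the Fourier basis U, which is exactly the cost the paper sets out to remove:

```python
import numpy as np

def spectral_filter(L, x, g_hat):
    """x *_G g = U diag(g_hat) U^T x, with one learned coefficient per eigenvalue."""
    lam, U = np.linalg.eigh(L)         # full EVD of the graph Laplacian
    return U @ (g_hat * (U.T @ x))     # two dense O(n^2) basis multiplications
```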


Spectral Filtering: Polynomial Parametrization

g is a continuous 1-D function, parametrized by θ:

$x \ast_{\mathcal{G}} g = U\left(g_\theta(\lambda) \odot U^\top x\right) = U \operatorname{diag}\big(g_\theta(\lambda_1), \dots, g_\theta(\lambda_n)\big)\, U^\top x = U g_\theta(\Lambda) U^\top x = g_\theta(L)\, x$

The last step uses the fact that a function of a diagonalizable matrix acts on its eigenvalues: $T\varphi = \varphi\Lambda \;\Rightarrow\; f(T) = \varphi\, f(\Lambda)\, \varphi^{-1}$


Polynomial parametrization:

$g_\theta(\Lambda) = \sum_{k=0}^{K-1} \theta_k \Lambda^k$

- K-localized: each application of L reaches one hop further (derivation below)
- K parameters to train, independent of the graph's size
- Still O(n²) multiplications (multiplications with the basis U)
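A short derivation of the K-localization claim, which the slide states without proof: $L$ has the sparsity pattern of the adjacency matrix, so $(L^k)_{i,j} = 0$ whenever $d_{\mathcal{G}}(i,j) > k$. Hence $\big(g_\theta(L)\, x\big)_i = \sum_{k=0}^{K-1} \theta_k\, (L^k x)_i$ depends only on the values $x_j$ of vertices within $K-1$ hops of $i$.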

Spectral Filtering: Polynomial Parametrization

[Figure: impulse response of the filter on a 2-D Euclidean domain vs. on a graph]

Spectral Filtering: Recursive Polynomial Parametrization

Chebyshev polynomials: $T_k(\lambda) = 2\lambda\, T_{k-1}(\lambda) - T_{k-2}(\lambda)$, with $T_0(\lambda) = 1$, $T_1(\lambda) = \lambda$

Parametrization:

$g_\theta(\tilde{L}) = \sum_{k=0}^{K-1} \theta_k\, T_k(\tilde{L})$

$\tilde{L} = 2L\lambda_n^{-1} - I$ (rescales the spectrum to $[-1, 1]$, where the Chebyshev polynomials form an orthogonal basis)

Spectral Filtering: Recursive Polynomial Parametrization

Filtering:

$y = g_\theta(\tilde{L})\, x = \sum_{k=0}^{K-1} \theta_k\, T_k(\tilde{L})\, x = \sum_{k=0}^{K-1} \theta_k\, \bar{x}_k$

Recurrence: $\bar{x}_k = T_k(\tilde{L})\, x = 2\tilde{L}\, \bar{x}_{k-1} - \bar{x}_{k-2}$, with $\bar{x}_0 = x$, $\bar{x}_1 = \tilde{L} x$

- K-localized: each application of L reaches one hop further
- K parameters to train, independent of the graph's size
- O(Kn) multiplications (more precisely, $O(K|\mathcal{E}|)$; see the sketch below)
- No EVD of L needed
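A minimal sketch of this recurrence (my own helper, not the authors' reference code), assuming a sparse rescaled Laplacian with spectrum in [−1, 1]. Each step costs one sparse matrix-vector product, which is where the $O(K|\mathcal{E}|)$ count comes from:

```python
import numpy as np
import scipy.sparse as sp

def chebyshev_filter(L_tilde, x, theta):
    """y = sum_k theta[k] * T_k(L_tilde) x, via the Chebyshev recurrence (K >= 2)."""
    x_prev, x_curr = x, L_tilde @ x        # T_0(L)x = x,  T_1(L)x = Lx
    y = theta[0] * x_prev + theta[1] * x_curr
    for theta_k in theta[2:]:
        # T_k(L)x = 2 L T_{k-1}(L)x - T_{k-2}(L)x
        x_prev, x_curr = x_curr, 2 * (L_tilde @ x_curr) - x_prev
        y = y + theta_k * x_curr
    return y

# Toy usage on a 3-node path graph (assumed example, K = 3):
W = sp.csr_matrix(np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]]))
L = sp.diags(np.asarray(W.sum(axis=1)).ravel()) - W   # combinatorial Laplacian
L_tilde = 2 * L / 3.0 - sp.eye(3)                     # lambda_max = 3 for this graph
y = chebyshev_filter(L_tilde, np.array([1., 0., 0.]), [0.5, 0.3, 0.2])
```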


Graph CNN

Graph CNN: Learning Filters

$y_{s,j} = \sum_{i=1}^{F_{\mathrm{in}}} g_{\theta_{i,j}}(L)\, x_{s,i}$

$\frac{\partial L}{\partial \theta_{i,j}} = \sum_{s=1}^{S} \left[\bar{x}_{s,i,0}, \dots, \bar{x}_{s,i,K-1}\right]^\top \frac{\partial L}{\partial y_{s,j}}$

$\frac{\partial L}{\partial x_{s,i}} = \sum_{j=1}^{F_{\mathrm{out}}} g_{\theta_{i,j}}(L)\, \frac{\partial L}{\partial y_{s,j}}$

where:

- $s = 1, \dots, S$: sample index
- $i = 1, \dots, F_{\mathrm{in}}$: input feature map index
- $j = 1, \dots, F_{\mathrm{out}}$: output feature map index
- $\theta_{i,j}$: the $F_{\mathrm{in}} \times F_{\mathrm{out}}$ Chebyshev coefficient vectors, each of order K
- $L$ (in the derivatives): the loss over a mini-batch of S samples
- $\bar{x}_{s,i,k} = T_k(\tilde{L})\, x_{s,i}$: the Chebyshev basis signals from the recurrence above


Cost: $O(K\, |\mathcal{E}|\, F_{\mathrm{in}} F_{\mathrm{out}} S)$ operations

- Easily parallelized (a single-sample sketch follows)
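A single-sample sketch of the forward pass of such a layer (names and shapes are my own choices, not the paper's code). The K Chebyshev basis signals are computed once and shared across all $F_{\mathrm{in}} \times F_{\mathrm{out}}$ filters:

```python
import numpy as np

def graph_conv_layer(L_tilde, X, theta):
    """X: (n, F_in) node features; theta: (F_in, F_out, K) Chebyshev coefficients.
    Returns Y: (n, F_out) with Y[:, j] = sum_i g_{theta[i, j]}(L_tilde) X[:, i]."""
    F_in, F_out, K = theta.shape
    Xbar = [X, L_tilde @ X]                       # T_0(L)X, T_1(L)X
    for _ in range(2, K):
        Xbar.append(2 * (L_tilde @ Xbar[-1]) - Xbar[-2])
    Xbar = np.stack(Xbar[:K])                     # (K, n, F_in)
    return np.einsum('kni,ijk->nj', Xbar, theta)  # mix basis signals per (i, j) pair
```

Batching over S samples just adds a leading sample axis; the heavy work remains the K sparse products, matching the operation count above.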

Graph Pooling: Coarsening

- Multilevel clustering algorithm
- Reduces the size of the graph by a specified factor (here 2)
- Does all this efficiently

Graclus multilevel clustering algorithm:

- Maximizes the local normalized cut.
- Greedily picks an unmarked vertex i and matches it with an unmarked neighbour j that maximizes the local normalized cut $W_{i,j}(1/d_i + 1/d_j)$ (see the sketch after this list).
- Extremely fast.
- Divides the number of nodes by approximately 2.
- Might generate singleton (unmatched) nodes; this is solved by adding fake nodes.
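A toy sketch of one such greedy matching pass; this is a simplification under my own assumptions (dense weights, arbitrary visit order), not the actual Graclus implementation:

```python
import numpy as np

def greedy_coarsening_match(W):
    """W: dense (n, n) symmetric weight matrix. Returns matched pairs and
    singletons from one greedy local-normalized-cut matching pass."""
    n = W.shape[0]
    d = W.sum(axis=1)                      # weighted vertex degrees
    marked = np.zeros(n, dtype=bool)
    clusters = []
    for i in range(n):                     # visit order is arbitrary in this sketch
        if marked[i]:
            continue
        marked[i] = True
        nbrs = [j for j in np.flatnonzero(W[i]) if not marked[j]]
        if nbrs:                           # match with the best unmarked neighbour
            j = max(nbrs, key=lambda j: W[i, j] * (1.0 / d[i] + 1.0 / d[j]))
            marked[j] = True
            clusters.append((i, j))
        else:
            clusters.append((i,))          # singleton: later paired with a fake node
    return clusters
```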


Graph Pooling: Fast Pooling

Example: pooling by 4

- Graclus generates singletons: $n_0 = 8 \to n_1 = 5 \to n_2 = 3$
- By adding fake nodes we get: $n_2 = 3 \to n_1 = 6 \to n_0 = 12$
- $z = [\max(x_0, x_1), \max(x_4, x_5, x_6), \max(x_8, x_9, x_{10})]$
- Balanced binary trees ⇒ efficient on GPUs (see the toy example below)
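A toy numeric illustration of why this works (the values are made up): once nodes are reordered as leaves of the balanced binary tree and fake nodes are filled with −∞, graph pooling by 4 becomes plain 1-D max pooling:

```python
import numpy as np

FAKE = -np.inf                          # fake nodes never win the max
x = np.array([5., 3., FAKE, FAKE,       # x0, x1 + two fake nodes
              1., 9., 2., FAKE,         # x4, x5, x6 + one fake node
              4., 7., 6., FAKE])        # x8, x9, x10 + one fake node
z = x.reshape(-1, 4).max(axis=1)        # regular 1-D max pooling, stride 4
print(z)                                # [5. 9. 7.]
```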

Experiments: MNIST

$W_{i,j} = e^{-\|x_i - x_j\|_2^2 / \sigma^2}$

- 28×28 pixels + 192 fake nodes ⇒ $n = |\mathcal{V}| = 976$
- 8-NN graph ⇒ $|\mathcal{E}| = 3198$ (construction sketched after this list)
- Based on LeNet-5 ⇒ K = 5


- Isotropic filters (no notion of orientation)
- Optimizations and initializations were left uninvestigated
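A sketch of the 8-NN graph construction on the pixel grid; parameter choices such as the bandwidth heuristic are my assumptions, not the paper's exact settings:

```python
import numpy as np
from scipy.spatial.distance import cdist

def knn_graph(coords, k=8, sigma2=None):
    d2 = cdist(coords, coords, 'sqeuclidean')   # pairwise squared distances
    if sigma2 is None:
        sigma2 = d2.mean()                      # bandwidth heuristic (assumption)
    W = np.exp(-d2 / sigma2)
    far = np.argsort(d2, axis=1)[:, k + 1:]     # everything beyond the k nearest
    np.put_along_axis(W, far, 0.0, axis=1)
    np.fill_diagonal(W, 0.0)                    # no self-loops
    return np.maximum(W, W.T)                   # symmetrize

rows, cols = np.meshgrid(np.arange(28), np.arange(28), indexing='ij')
coords = np.stack([rows.ravel(), cols.ravel()], axis=1)  # the 784 pixel positions
W = knn_graph(coords, k=8)
```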

Experiments: 20NEWS

- 18,846 text documents associated with 20 classes
- 10K most common words out of the 94K unique words ⇒ $n = |\mathcal{V}| = 10\mathrm{K}$
- 16-NN graph with $W_{i,j} = e^{-\|z_i - z_j\|_2^2 / \sigma^2}$ ($z_i$: word2vec embeddings) ⇒ $|\mathcal{E}| = 132{,}834$
- $x$: bag-of-words representation of each document

Experiments: 20NEWS

Comparison with other methods (K = 5):

- Slightly worse than a multinomial naive Bayes classifier.
- Beats fully-connected networks with far fewer parameters.

Total training time divided by the number of gradient steps:

- Scales as O(n), as opposed to the O(n²) of non-localized spectral filtering.


Experiments: Graph Quality

⇒ The structure of the data is important

⇒ A well-constructed graph is important

⇒ Proper approximations (e.g., LSHForest) can be used for larger databases

Conclusions and Future Work

- Introduced a model with linear computational complexity.
- The quality of the input graph is of paramount importance.
- Local stationarity and compositionality are verified for text documents, as long as the graph is well constructed.
