Using hyperbolic large-margin classifiers for biological link prediction
SemDeep-5 @ IJCAI 2019
Asan Agibetov, Georg Dorffner, Matthias Samwald
Institute for Artificial Intelligence and Decision Support - Medical University of Vienna, Austria
Aug 12, 2019
Representing biological knowledge¹
¹ “Shared hypothesis testing”, Agibetov et al., J. Biomed. Sem., 2018
Biological link prediction
Link prediction as distance-based inference in embedding space
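A minimal sketch of this inference step, assuming already-trained embeddings in a hypothetical dict `emb` mapping node ids to numpy vectors; a plain Euclidean norm stands in here for whichever metric the embedding space uses:

```python
import numpy as np

def rank_candidate_links(emb, candidates):
    """Score candidate node pairs by embedding distance;
    the closest pairs are predicted as the most likely links."""
    scored = [(u, v, np.linalg.norm(emb[u] - emb[v])) for u, v in candidates]
    return sorted(scored, key=lambda triple: triple[2])
```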
Representation learning in hyperbolic space
Nickel and Kiela. NIPS 2017
Hierarchical relationship from hyperbolic embeddings
Nickel and Kiela. ICML 2018
Clusters of proteins and age groups from hyperbolic coordinates
Lobato et al. Bioinformatics 2018
Hyperbolic embeddings
Same as in the Euclidean case, we try to learn a link estimator Q(u, v) ↦ [0, 1] (u, v node pairs) with MLE
▶ $\Pr(G) = \prod_{(u,v)\in E_{\text{train}}} Q(u,v) \prod_{(u,v)\notin E_{\text{train}}} \big(1 - Q(u,v)\big)$
▶ If Q is a perfect estimator, then Pr(x) = 1 iff x = G (i.e., the graph can be fully reconstructed)
Embeddings are the parameters Θ of the link estimator Q; they are trained with a cross-entropy loss L and negative sampling (sketched after this list)
▶ $\mathcal{L}(\Theta) = \sum_{(u,v)} \log \frac{e^{-d(u,v)}}{\sum_{v' \in \text{neg}(u)} e^{-d(u,v')}}$
▶ But we perform all computations in hyperbolic space
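A minimal sketch of this loss for one positive pair, assuming Poincaré-ball embeddings stored in a hypothetical dict `emb` of numpy vectors; following Nickel and Kiela, the softmax denominator also includes the positive pair:

```python
import numpy as np

def poincare_dist(u, v, eps=1e-9):
    """Distance in the Poincare ball:
    arcosh(1 + 2*|u-v|^2 / ((1-|u|^2)*(1-|v|^2)))."""
    sq_dist = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq_dist / max(denom, eps))

def pair_loss(emb, u, v, negatives):
    """Negative log-softmax for the positive pair (u, v) vs. sampled negatives."""
    d_pos = poincare_dist(emb[u], emb[v])
    d_neg = np.array([poincare_dist(emb[u], emb[w]) for w in negatives])
    # -log( e^{-d(u,v)} / (e^{-d(u,v)} + sum_{v'} e^{-d(u,v')}) )
    return d_pos + np.log(np.exp(-d_pos) + np.sum(np.exp(-d_neg)))
```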
Backpropagation to learn embeddings
Nickel and Kiela. NIPS 2017
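A minimal sketch of the update rule from Nickel and Kiela (NIPS 2017): the Euclidean gradient is rescaled by the inverse of the Poincaré-ball metric, and the point is retracted back into the open unit ball; variable names are mine:

```python
import numpy as np

def rsgd_step(theta, euclidean_grad, lr=0.01, eps=1e-5):
    """One Riemannian SGD step for a single embedding in the Poincare ball."""
    # Riemannian gradient = Euclidean gradient rescaled by the inverse metric
    scale = ((1 - np.sum(theta ** 2)) / 2) ** 2
    theta = theta - lr * scale * euclidean_grad
    # Retraction: project back inside the open unit ball if we stepped out
    norm = np.linalg.norm(theta)
    if norm >= 1:
        theta = (1 - eps) * theta / norm
    return theta
```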
Link prediction for multi-relational biological knowledge graphs
Flattening knowledge graphs²
Turn the KG into an unlabelled directed graph, such that no pair of nodes is connected by more than one arc (directed edge)
Compensate for the reduced information with binary classifiers fine-tuned for each relation type (see the sketch below)
² Agibetov, Samwald. SemDeep-4 @ ISWC 2018
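A minimal sketch of this flattening step; `flatten_kg` is a hypothetical helper, not the paper's code:

```python
def flatten_kg(triples):
    """Collapse (head, relation, tail) triples into an unlabelled digraph:
    relation labels are dropped, and set semantics keep at most one arc
    per ordered node pair."""
    return {(h, t) for h, _, t in triples}
```

The per-relation pairs would be kept separately (e.g., a dict {relation: [(h, t), ...]}) as training data for the relation-specific binary classifiers that compensate for the dropped labels.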
Hyperbolic Large-Margin classifier (SVM)
Agibetov, Samwald. SemDeep-4@ISWC 2018
Cho et al. arXiv 2018
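A minimal sketch of the decision rule behind a hyperbolic large-margin classifier, following Cho et al.'s hyperboloid (Lorentz) formulation: points are lifted from the Poincaré ball to the hyperboloid, and the predicted class is the sign of the Minkowski inner product with a weight vector w; how w is fitted (margin maximization) is omitted here:

```python
import numpy as np

def poincare_to_lorentz(y):
    """Lift a Poincare-ball point to the hyperboloid (Lorentz) model."""
    sq = np.sum(y ** 2)
    return np.concatenate(([1 + sq], 2 * y)) / (1 - sq)

def minkowski(w, x):
    """Minkowski inner product: -w0*x0 + sum_i wi*xi."""
    return -w[0] * x[0] + np.dot(w[1:], x[1:])

def predict(w, y):
    """Hyperbolic linear decision rule: sign of <w, x> on the hyperboloid."""
    return np.sign(minkowski(w, poincare_to_lorentz(y)))
```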
Performance evaluation
Lessons learned
Benefit of learning hyperbolic embeddings
▶ fewer dimensions to capture latent semantic and hierarchical information
▶ scalability and interpretability (easier to visualize 2 or 3 dimensions)
From our preliminary results
▶ hyperbolic embeddings learn hierarchical relationships in UMLS better than Euclidean embeddings (at lower dimensions)
▶ for complex and big graphs (BIO-KG), train hyperbolic embeddings for longer periods (> 500 epochs)
Open issues and future directions
▶ even with recent advances in Riemannian SGD optimization³, learning hyperbolic embeddings is still much slower than in the Euclidean case
▶ next steps should focus on end-to-end hyperbolic embedding training (the hyperbolic large-margin classifier loss directly incorporated during the training process)
▶ code available at https://github.com/plumdeq/hsvm
▶ contact: asan.agibetov@meduniwien.ac.at
³ “Gradient descent in hyperbolic space”, Wilson and Leimeister, 2018
Why non-Euclidean space - (low-dim) manifolds
▶ Computing in a lower-dimensional space means manipulating fewer degrees of freedom
▶ Non-linear degrees of freedom often make more intuitive sense
▶ cities on Earth are better localized by giving their longitude and latitude (2 dimensions)
▶ instead of giving their position x, y, z in Euclidean 3D space (see the sketch below)
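To make the degrees-of-freedom point concrete, a hypothetical helper showing that the two intrinsic coordinates fully determine the three extrinsic ones:

```python
import numpy as np

def latlon_to_xyz(lat_deg, lon_deg, radius_km=6371.0):
    """Latitude and longitude (2 degrees of freedom) determine the
    three Euclidean coordinates of a point on the sphere."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    return radius_km * np.array([np.cos(lat) * np.cos(lon),
                                 np.cos(lat) * np.sin(lon),
                                 np.sin(lat)])
```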
Learning graph embeddings
▶ Learn a link estimator Q(u, v) ↦ [0, 1] (u, v node pairs) and approximate the graph structure (connectivity) with MLE (maximum likelihood estimation)⁴
▶ $\Pr(G) = \prod_{(u,v)\in E_{\text{train}}} Q(u,v) \prod_{(u,v)\notin E_{\text{train}}} \big(1 - Q(u,v)\big)$
▶ If Q is a perfect estimator, then Pr(x) = 1 iff x = G (i.e., the graph can be fully reconstructed)
▶ Q can be trained to estimate links at different orders, i.e., to approximate Aⁿ (sketched below)
⁴ “Graph likelihood”, Haija, … Perozzi, …, CIKM 2017, NeurIPS 2018
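A minimal sketch of what "approximate Aⁿ" means, assuming a dense adjacency matrix A; `higher_order_targets` is a hypothetical name:

```python
import numpy as np

def higher_order_targets(A, n):
    """Binary targets for n-th order links: entry (u, v) is 1 iff there is
    a walk of length n from u to v, i.e. (A^n)[u, v] > 0."""
    return (np.linalg.matrix_power(A, n) > 0).astype(int)
```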
Similar principle as in word2vec⁵
⁵ “word2vec”, Mikolov et al., NIPS 2013
What’s so special about Riemannian geometry - curvature
Model of hyperbolic geometry
Properties of hyperbolic geometry
Computing lengths in hyperbolic geometry
Approximation of graph distance