30
www.Objectivity.com Ibrahim Sallam Director of Development

Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

www.Objectivity.com

Ibrahim Sallam Director of Development

Page 2: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

•  Graphs, what are they and why? •  Graph Data Management. Why do we need it? •  Problems in Distributed Graph •  How we solved the problems

Page 3: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate
Page 4: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

h"p://opte.org  

Simple  Graph  

Node  <-­‐>  Node  

Page 5: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

•  “[finding] Japanese restaurants in New York City liked by people from Japan” – a challenging Facebook query.

•  "The value of any piece of information is only known when you can connect it with something else that arrives at a future point in time,” – CIA’s CTO Ira "Gus" Hunt

Page 6: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

Person   Building  ?  

Work   Live   RR   Visit   Eat   Shop  

Page 7: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

CRM,  Sales  &  Marke.ng   Network  

Mgmt,  Telecom  

Intelligence  (Government&  Business)  

Finance  

Healthcare  Research:  Genomics  

Social  Networks  

PLM  (Product  Lifecycle  Mgmt)  

IG

Page 8: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

•  Data structure for representing complicated relationships between entities.

•  Wide applications in… – Bioinformatics. – Social networks. – Network management… etc.

Page 9: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate
Page 10: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

•  Everyone specializes – Doctors, Lawyers, Bankers, Developers

•  Why was data so normalized for so long ! •  NoSQL is all about the data specialist •  Specializing in… – Distribution / deployment – Physical data storage – Logical data model – Query mechanism

Page 11: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

Ingest  

Update  

Navigate  

Massive graph data require efficient and intelligent tools to analyze and understand it.

Page 12: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

•  Big/Fast data demands write performance •  Most NoSQL solutions allow you to scale

writes by… – Partitioning the data – Understanding your consistency requirements – Allowing you to defer conflicts

Ingest  

Page 13: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

App-­‐2  (Ingest  V2)  App-­‐2  

(E23{  V2V3})  

InfiniteGraph  

ObjecSvity/DB  Persistence  Layer  

App-­‐1  (Ingest  V1)  

App-­‐3  (Ingest  V3)  

V1   V2   V3  

App-­‐1  (E1  2{  V1V2})  

App-­‐3  

E12   E23  

Ingest  

Page 14: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

IG  Core/API  

C1  

C2  

C3  

E12  

E23  

Target  Con

tainers  

Pipeline  Containers  

E(1-­‐>2)  

E(3-­‐>1)  

E(2-­‐>3)  

E(2-­‐>1)  

E(2-­‐>3)  

E(3-­‐>1)  

E(1-­‐>2)  

E(3-­‐>2)  

E(1-­‐>2)  

E(2-­‐>3)  

E(3-­‐>1)  

E(2-­‐>1)  

E(2-­‐>3)  

E(3-­‐>1)  

E(3-­‐>2)  

E(1-­‐>2)  

Pipeline  

Agent  

Ingest  

Page 15: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

•  Excellent for efficient use of page cache •  Able to maintain full database consistency •  Achieves highest ingest rate in distributed

environments •  Almost always has highest “perceived” rate •  Trading Off :

•  Eventual consistency in graph (connections) •  Updates are still atomic, isolated and durable but phased •  External agent performs graph building

Ingest  

Page 16: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

1  client  

2  clients  

4  clients  

8  clients  

0  

50000  

100000  

150000  

200000  

250000  

300000  

350000  

400000  

450000  

500000  

1  

2  

4  

Nod

es  and

 Edges  per  secon

d  

#  of  Processes  

1  client  

2  clients  

4  clients  

8  clients  

Ingest  

Page 17: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

Distributed  API  

ApplicaSon(s)  

ParSSon  1   ParSSon  3  ParSSon  2   ParSSon  ...n  

Processor   Processor   Processor   Processor  

Partitioning and Read Replicas… easy right !

Page 18: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

Distributed  API  

ApplicaSon(s)  

ParSSon  1   ParSSon  3  ParSSon  2   ParSSon  ...n  

Processor   Processor   Processor   Processor  

Navigate  

Page 19: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

•  Detect local hops and perform in memory traversal

•  Send the partial path to the distributed processing to continue the navigation.

•  Intelligently cache remote data when accessed frequently

•  Route tasks to other hosts when it is optimal

Navigate  

Page 20: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

Processor  

Distributed  API  

ParSSon  1   ParSSon  2  

Processor  

ApplicaSon  

P(A,B,C,D)  

Navigate  

Page 21: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

•  Schema vs Schema-less – Database religion –  InfiniteGraph supports schema, but does not

restrict connection types between vertices – We take advantage of the schema when pruning. – …Working on a hybrid support.

Page 22: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

PaSent   PrescripSon  

Drug  

Ingredient  

Outcome  

Complaint  

Visit  

Allergy  

Physician  

Navigate  

Page 23: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

•  GraphViews are extremely powerful •  Allow Big Data to appear small ! •  Connection inference can lead to exponential

gains in query performance •  Views are reusable between queries •  Built into the native kernel

Navigate  

Page 24: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

•  Objectivity/DB is a proven foundation – Building distributed databases since 1993 – A complete database management system

•  Concurrency, transactions, cache, schema, query, indexing

•  It’s a Graph Specialist ! –  Simple but powerful API tailored for data navigation. – Easy to configure distribution model

Page 25: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

•  Physically co-locate “closely related” data •  Driven through a declarative placement model •  Dramatically speeds “local” reads

Facility  Data  Page(s)  PaSent  Data  Page(s)  

Mr  CiSzen  

Visit   Visit  

Dr  Jones  

San  Jose  

Facility  

Dr  Smith  

Primary    Physician  

Has  Has   With  

At  

Located   Located  

Facility  Data  Page(s)  

Dr  Blake  

Sunny-­‐vale  

Dr  Quinn  

Located   Located  

With  

At  

Page 26: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

Zone  2  Zone  1  

HostA  

IG  Core/API  

Distributed  Object  and  RelaSonship  Persistence  Layer  

Customizable  Placement  

HostB   HostC   HostX  

AddVertex()  

Page 27: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

Distributed  Data  Processing  Plagorm   Document  

Graph  Database  

RDBMS  

ParSSoned  Distributed  DB  (ohen  Document  /  KV)  

Users  

App

licaS

ons  

External  /  Legacy  Data   Tr

ansformaS

on  \  M

DM  

Business  

Page 28: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

•  Distributed update.

Update  

… we are working on it.

Page 29: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate

Have we solved all the distributed graph problems?

Not quite, but…

We Learned to Stop Worrying.

Page 30: Ibrahim Sallam Director of Developmentgotocon.com/dl/goto-chicago-2013/slides/Ibrahim... · Ibrahim Sallam Director of Development • Graphs, what are they and why? ... Update Navigate