
Trinity: A Distributed Graph Engine on a Memory Cloud


Page 1: Trinity: A Distributed Graph Engine on a Memory Cloud

Trinity: A Distributed Graph Engine on a Memory Cloud

Speaker: LIN Qian

http://www.comp.nus.edu.sg/~linqian/

Page 2: Trinity: A Distributed Graph Engine on a Memory Cloud

Graph applications

Online query processing: low latency

Offline graph analytics: high throughput

Page 3: Trinity: A Distributed Graph Engine on a Memory Cloud

Online queries

Random data access

e.g., BFS, sub-graph matching, …

Page 4: Trinity: A Distributed Graph Engine on a Memory Cloud

Offline computations

Performed iteratively

Page 5: Trinity: A Distributed Graph Engine on a Memory Cloud

Insight: Keeping the graph in memory

at least the topology

Page 6: Trinity: A Distributed Graph Engine on a Memory Cloud

Trinity

Online query + Offline analytics

Page 7: Trinity: A Distributed Graph Engine on a Memory Cloud

Random data access problem in large graph computation

Globally addressable distributed memory

Random access abstraction

Page 8: Trinity: A Distributed Graph Engine on a Memory Cloud

Belief

High-speed networks are more available

DRAM is cheaper

In-memory solutions become practical

Page 9: Trinity: A Distributed Graph Engine on a Memory Cloud

“Trinity itself is not a system that comes with comprehensive built-in graph computation modules.”

Page 10: Trinity: A Distributed Graph Engine on a Memory Cloud

Trinity cluster

Page 11: Trinity: A Distributed Graph Engine on a Memory Cloud

Stack of Trinity system modules

User-defined: graph schema, communication protocols, computation paradigms

Page 12: Trinity: A Distributed Graph Engine on a Memory Cloud

Memory cloud

Partition memory space into trunks

Hashing

Page 13: Trinity: A Distributed Graph Engine on a Memory Cloud

Memory trunks

2^p > m (2^p trunks, m machines)

1. Trunk-level parallelism

2. Efficient hashing

Page 14: Trinity: A Distributed Graph Engine on a Memory Cloud

Hashing

Key-value store

p-bit value i ∈ [0, 2^p – 1]

Inner trunk hash table
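The two-level lookup on this slide can be sketched as follows. This is a minimal single-process illustration, not Trinity's implementation: the names `P`, `trunk_index`, and `MemoryCloud` are hypothetical, BLAKE2b is an arbitrary stand-in for whatever hash function the system uses, and a Python `dict` stands in for each inner trunk hash table.

```python
import hashlib

P = 4                    # assumed value: 2**P = 16 memory trunks
NUM_TRUNKS = 2 ** P


def trunk_index(key: int) -> int:
    """Hash a 64-bit key to a p-bit value i in [0, 2**P - 1]."""
    digest = hashlib.blake2b(key.to_bytes(8, "little"), digest_size=8).digest()
    # Keep only the low P bits of the hash.
    return int.from_bytes(digest, "little") & (NUM_TRUNKS - 1)


class MemoryCloud:
    """Toy key-value store: first pick a trunk, then look up inside it."""

    def __init__(self):
        # One inner hash table per trunk (a dict is a stand-in here).
        self.trunks = [dict() for _ in range(NUM_TRUNKS)]

    def put(self, key: int, value: bytes) -> None:
        self.trunks[trunk_index(key)][key] = value

    def get(self, key: int):
        return self.trunks[trunk_index(key)].get(key)
```

Because the trunk is chosen from the key alone, any machine can compute where a key lives without consulting a central directory.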

Page 15: Trinity: A Distributed Graph Engine on a Memory Cloud

Data partitioning and addressing

Benefits: scalability, fault tolerance
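One way to see why trunk-based addressing helps with both benefits is an addressing table that maps each trunk to a machine. The sketch below is an assumption-laden illustration: `build_addressing_table` and `reassign_failed` are hypothetical helpers, and the round-robin placement is chosen only for simplicity.

```python
def build_addressing_table(num_trunks, machines):
    """Assign trunk i to a machine (round-robin, an assumed policy)."""
    return [machines[i % len(machines)] for i in range(num_trunks)]


def reassign_failed(table, failed, survivors):
    """Return a new table with the failed machine's trunks spread over survivors."""
    new_table = list(table)
    k = 0
    for i, machine in enumerate(table):
        if machine == failed:
            new_table[i] = survivors[k % len(survivors)]
            k += 1
    return new_table


table = build_addressing_table(8, ["A", "B", "C"])
recovered = reassign_failed(table, "B", ["A", "C"])
```

Only the slots that pointed to the failed machine change, so trunks on healthy machines stay where they are. Having more trunks than machines (2^p > m) is what lets the load be spread in small units.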

Page 16: Trinity: A Distributed Graph Engine on a Memory Cloud

Modeling graph

Cell: value + schema

Represent a node in a cell
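The idea of a cell as a raw value interpreted through a schema can be illustrated with a byte blob and a fixed layout. The layout below (a 64-bit node ID, a 32-bit neighbor count, then 64-bit neighbor IDs) and the helper names are assumptions made for this sketch, not Trinity's actual cell format.

```python
import struct

HEADER = "<qI"                         # node id (int64), neighbor count (uint32)
HEADER_SIZE = struct.calcsize(HEADER)  # 12 bytes, no padding with "<"


def encode_node(node_id, neighbors):
    """Pack a node and its adjacency list into one cell value (bytes)."""
    return struct.pack(f"<qI{len(neighbors)}q", node_id, len(neighbors), *neighbors)


def decode_node(blob):
    """Apply the schema to a cell value to recover the node."""
    node_id, count = struct.unpack_from(HEADER, blob, 0)
    neighbors = list(struct.unpack_from(f"<{count}q", blob, HEADER_SIZE))
    return node_id, neighbors
```

The store only ever sees opaque bytes; the schema is what turns them back into a node with edges.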

Page 17: Trinity: A Distributed Graph Engine on a Memory Cloud

TSL

Object-oriented cell manipulation

Data integration

Network communication

Page 18: Trinity: A Distributed Graph Engine on a Memory Cloud

Online queries

Traversal based

New paradigm
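A traversal-based query such as bounded-hop BFS can be sketched as repeated random lookups of adjacency lists by node ID. Here `get_neighbors` is a hypothetical callback standing in for a cell fetch from the memory cloud, and the example graph is made up for illustration.

```python
from collections import deque


def bfs_within(get_neighbors, start, max_hops):
    """Return all nodes reachable from `start` in at most `max_hops` hops."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for nb in get_neighbors(node):  # one random access per visited node
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, depth + 1))
    return seen


graph = {1: [2, 3], 2: [4], 3: [], 4: [5], 5: []}
reachable = bfs_within(lambda n: graph.get(n, []), 1, 2)
```

Each expansion touches an unpredictable node, which is why low-latency random access matters for this class of query.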

Page 19: Trinity: A Distributed Graph Engine on a Memory Cloud

Vertex centric offline analytics

Restrictive vertex centric model
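A vertex-centric computation in which each vertex exchanges messages only with its neighbors can be sketched as a loop of synchronous supersteps. The function `run_supersteps`, the toy graph, and the minimum-label example are assumptions for illustration; the sketch runs in one process and ignores partitioning and networking.

```python
def run_supersteps(adj, init, compute, max_steps):
    """Run synchronous supersteps until values stop changing.

    adj:     vertex -> list of neighbor vertices (the only message targets)
    init:    vertex -> initial value
    compute: (current value, list of received messages) -> new value
    """
    values = dict(init)
    for _ in range(max_steps):
        # Message phase: every vertex sends its value to its neighbors.
        inbox = {v: [] for v in adj}
        for v, neighbors in adj.items():
            for nb in neighbors:
                inbox[nb].append(values[v])
        # Compute phase: every vertex updates from its inbox.
        new_values = {v: compute(values[v], inbox[v]) for v in adj}
        if new_values == values:
            break
        values = new_values
    return values


adj = {1: [2], 2: [1, 3], 3: [2], 4: []}
labels = run_supersteps(adj, {v: v for v in adj},
                        lambda cur, msgs: min([cur] + msgs), max_steps=10)
```

Propagating the minimum label this way assigns every vertex the smallest ID in its connected component. Because the set of message targets is fixed by the topology, a system can predict and plan communication ahead of time.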

Page 20: Trinity: A Distributed Graph Engine on a Memory Cloud

Message passing optimization

Create a bipartite partition of the local graph

Buffer hub vertices

Page 21: Trinity: A Distributed Graph Engine on a Memory Cloud

A new paradigm for offline analytics

1. Aggregate answers from local computations

2. Employ probabilistic inference

Page 22: Trinity: A Distributed Graph Engine on a Memory Cloud

Circular memory management

• Aims to avoid memory gaps between a large number of key-value pairs

Page 23: Trinity: A Distributed Graph Engine on a Memory Cloud

Fault tolerance

Heartbeat-based failure detection

BSP: checkpointing

Async.: “periodical interruption”
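Heartbeat-based failure detection can be sketched as a monitor that records the last time each machine reported in and flags machines that have been silent longer than a timeout. The class name, the timeout value, and the explicit `now` parameter (used instead of a real clock so the example is deterministic) are all assumptions of this sketch.

```python
class HeartbeatMonitor:
    """Track the last heartbeat per machine and report suspected failures."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}

    def beat(self, machine, now):
        """Record a heartbeat from `machine` at time `now` (seconds)."""
        self.last_seen[machine] = now

    def failed(self, now):
        """Machines whose last heartbeat is older than the timeout."""
        return sorted(m for m, t in self.last_seen.items()
                      if now - t > self.timeout)


monitor = HeartbeatMonitor(timeout=3.0)
monitor.beat("A", 0.0)
monitor.beat("B", 0.0)
monitor.beat("A", 2.0)
suspects = monitor.failed(4.0)
```

Once a machine is suspected, recovery (for example, restarting from the last checkpoint in the BSP case) is a separate step that this sketch leaves out.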

Page 24: Trinity: A Distributed Graph Engine on a Memory Cloud

Performance

Page 25: Trinity: A Distributed Graph Engine on a Memory Cloud

Performance (cont.)