Peer-to-Peer Distributed Shared Memory?
Gabriel Antoniu, Luc Bougé, Mathieu Jan
IRISA / INRIA & ENS Cachan/Bretagne France
Dagstuhl seminar, October 2003
2
Why Are DSM Systems Interesting?
Allow sharing of mutable data in a distributed environment
- Transparent access (local/remote)
- Transparent data localization (remote)
- (Lots of) consistency models and protocols
(Figure: data migration and replication between Node 0 and Node 1)
3
What Do DSM Systems Usually Assume?
Protocols rely on implicit hypotheses:
- Static configuration (fixed number of nodes)
- Every node knows every node
- No node failures, no dynamic node departures/arrivals
- Homogeneous architecture (processors, OS)
- Designed for small-scale environments, typically clusters of workstations
4
Challenge: Data Sharing on the Grid (1)
Distributed numerical simulations (code coupling), e.g. satellite design coupling several codes:
- Solid mechanics
- Thermodynamics
- Optics
- Dynamics
5
Challenge: Data Sharing on the Grid (2)
Challenge for DSM systems: get larger! They need to integrate new hypotheses:
- Scalability
- Fault tolerance
- Dynamicity
- Heterogeneity
Just the opposite of traditional DSM assumptions!
6
Large-Scale Data Sharing:Peer-to-Peer (P2P) Systems
(Figure: in the client/server model, all clients reach a central server across the Internet, creating a congestion zone around the server despite proxy caches; in the peer-to-peer model, every node acts as both client and server, avoiding any central congestion zone)
7
Data Sharing at a Large Scale: Peer-to-Peer Systems
Features:
- Excellent scalability: millions of nodes
- High volatility tolerance
But:
- Mostly read-only data sharing; few exceptions (OceanStore, Ivy, etc.)
Question: what consistency models and protocols are suitable for a large-scale, dynamic environment?
8
DSM Systems vs. P2P Systems
                      DSM                     P2P
Scale                 10^1-10^2               10^5-10^6
Dynamicity            None                    High
Resource homogeneity  Homogeneous (clusters)  Heterogeneous (Internet)
Control and trust     High                    Low
Topology              Flat                    Flat
Data type             Mutable                 Immutable
Typical applications  Scientific computation  File sharing and storage
9
Data Sharing: the Gap!
(Figure: a scale axis from 1 to 10^6 nodes; DSM systems (small-scale, static, homogeneous) cover 10^1-10^2; P2P systems (large-scale, dynamic, heterogeneous) cover 10^5-10^6; the 10^3-10^4 range in between is an open question)
10
Idea: Hybrid Approach
DSM systems bring consistency and transparent access; P2P systems bring scalability and high dynamicity.

                      DSM                     Data Sharing Service                         P2P
Scale                 10^1-10^2               10^3-10^4                                    10^5-10^6
Dynamicity            None                    Medium                                       High
Resource homogeneity  Homogeneous (clusters)  Rather heterogeneous (clusters of clusters)  Heterogeneous (Internet)
Control and trust     High                    Medium                                       Low
Topology              Flat                    Hierarchical                                 Flat
Data type             Mutable                 Mutable                                      Immutable
Typical applications  Scientific computation  Scientific computation and data storage      File sharing and storage
11
Why Such a Service?
Data sharing service for ASP environments:
- Persistent data
- Transparent localization
- Consistency
- Automatic redistribution
(Figure: a client submits requests Op1(C, A, B) and Op2(C, A, B) to an agent, which schedules them on servers S1-S4; the shared data A, B, C live in the data sharing service; the answer C is returned, e.g. from server S2)
12
A Data Sharing Service for the Grid
(Figure sequence: clusters federated over the Internet, illustrating the service's properties)
- Persistence
- Transparent data location (data transfer without explicit localization)
- Scalability (federating multiple clusters)
- Volatility tolerance
16
JXTA: a Framework for P2P Services
Open-source platform for programming P2P applications: http://www.jxta.org
Peer:
- Uniquely identified (peer ID)
- Address independent of physical location
- Multiple network access points (TCP, HTTP, etc.)
Peer group
(Figure: peers with unique peer IDs organized into a peer group, communicating over TCP/IP and HTTP, possibly across firewalls)
17
JuxMem: an Architecture Proposal
(Figure: logical architecture mapped onto the physical architecture; the juxmem group federates the cluster A, B, and C groups, each corresponding to a physical cluster, while a data group spans providers, possibly across clusters)
18
JuxMem API
Alloc (size, attribs)
Map (id, attribs)
Put (id, value)
Get (id)
Lock (id)
Unlock (id)
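The API above can be exercised against a minimal in-memory mock. The class `MiniJuxMem` below, its storage layout, and the lowercased method names are illustrative assumptions for the sketch, not JuxMem's actual implementation:

```python
import itertools
import threading

class MiniJuxMem:
    """Toy in-memory stand-in for the JuxMem API (illustrative only)."""
    def __init__(self):
        self._next_id = itertools.count(1)
        self._blocks = {}   # data ID -> {"value", "lock", "attribs"}

    def alloc(self, size, attribs=None):
        """Allocate a shared block; return its data ID."""
        block_id = "data-%d" % next(self._next_id)
        self._blocks[block_id] = {"value": bytes(size),
                                  "lock": threading.Lock(),
                                  "attribs": dict(attribs or {})}
        return block_id

    def map(self, block_id, attribs=None):
        """Attach to an existing block by ID (transparent localization)."""
        return self._blocks[block_id]

    def put(self, block_id, value):
        self._blocks[block_id]["value"] = value

    def get(self, block_id):
        return self._blocks[block_id]["value"]

    def lock(self, block_id):
        self._blocks[block_id]["lock"].acquire()

    def unlock(self, block_id):
        self._blocks[block_id]["lock"].release()

# Typical access pattern: lock, update, unlock (data-race-free by construction).
svc = MiniJuxMem()
data_id = svc.alloc(8, attribs={"replication_degree": 3})
svc.lock(data_id)
svc.put(data_id, b"42")
svc.unlock(data_id)
assert svc.get(data_id) == b"42"
```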
19
Managing Shared Data Blocks
Allocating a memory block = creating a data group:
- Data blocks are replicated on providers
- Each data block is identified by the ID of its peer group (the data group)
- Clients access data transparently via this data ID
Consistency:
- Current model: sequential consistency (SC)
- Assumes data-race-free (DRF) programs
- Replicas are updated simultaneously (logical multicast)
- Clients are not notified of updates
Synchronization:
- One lock per data block
- Other mechanisms: in progress
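The simultaneous update of replicas can be pictured as a logical multicast to every provider holding a copy. The `Provider` and `DataGroup` classes and the synchronous loop below are simplifying assumptions; the real service propagates updates through JXTA communication channels:

```python
class Provider:
    """A peer holding one replica of a data block."""
    def __init__(self, name):
        self.name = name
        self.replica = None

    def apply_update(self, value):
        self.replica = value

class DataGroup:
    """The set of providers replicating one data block."""
    def __init__(self, providers):
        self.providers = list(providers)

    def multicast_update(self, value):
        # Logical multicast: every replica applies the same write,
        # keeping all copies identical under sequential consistency.
        for p in self.providers:
            p.apply_update(value)

group = DataGroup([Provider("p1"), Provider("p2"), Provider("p3")])
group.multicast_update(b"\x01")
assert all(p.replica == b"\x01" for p in group.providers)
```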
20
Handling Peer Volatility
Provider volatility:
- One manager per peer group
- Dynamic monitoring of available peers (cluster group)
- Automatic re-replication of data blocks (data group)
Manager volatility:
- Periodic exchange of heartbeats
- Dynamic replication of managers when needed
(Figure: juxmem group, cluster A, B, C groups, data group)
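A minimal sketch of the monitor-and-re-replicate logic: a manager counts missed heartbeats per provider, evicts peers presumed dead, and tops replicas back up to the target degree. The `ReplicaManager` class, the spare list, and the missed-beat threshold are illustrative assumptions, not JuxMem's actual protocol:

```python
class ReplicaManager:
    """Monitors providers via heartbeats; restores the replication degree."""
    def __init__(self, providers, spares, degree, max_missed=3):
        self.providers = set(providers)   # peers currently holding a replica
        self.spares = list(spares)        # candidates for new replicas
        self.degree = degree
        self.max_missed = max_missed
        self.missed = {p: 0 for p in providers}

    def beat(self, provider):
        """A heartbeat arrived: reset the provider's miss counter."""
        self.missed[provider] = 0

    def tick(self):
        """One monitoring period: age counters, evict, re-replicate."""
        for p in list(self.providers):
            self.missed[p] += 1
            if self.missed[p] > self.max_missed:
                self.providers.discard(p)      # provider presumed failed
        while len(self.providers) < self.degree and self.spares:
            fresh = self.spares.pop()          # create a replacement replica
            self.providers.add(fresh)
            self.missed[fresh] = 0

mgr = ReplicaManager(["p1", "p2", "p3"], spares=["p4", "p5"], degree=3)
for _ in range(4):                 # p3 stops sending heartbeats
    mgr.beat("p1"); mgr.beat("p2")
    mgr.tick()
assert "p3" not in mgr.providers and len(mgr.providers) == 3
```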
21
Implementation and Preliminary Evaluation
Implementation: JXTA service, 5,000 lines of Java code
Experimental setup:
- Pentium II nodes: 450 MHz, 256 MB RAM
- Fast Ethernet, 100 Mb/s
- Linux 2.4
- Number of nodes: 20
Experiment: study of provider volatility
22
Study: Provider Volatility (1)
(Figure: juxmem group, cluster group, data group)
- Data size: one byte
- Replication degree: 3
- Data manager not killed
- 1 client: 100 lock-put-unlock iterations
- 16 providers
23
Study: Provider Volatility (2)
(Figure: juxmem group, cluster group, data group; same setup as the previous slide: one-byte data, replication degree 3, data manager not killed, 1 client performing 100 lock-put-unlock iterations, 16 providers)
24
Study: Provider Volatility (3)
Internal locking during replication:
- Guarantees consistency during replica creation
- The client is blocked meanwhile
(Figure: juxmem group, cluster group, data group)
25
Study: Provider Volatility (4)
JXTA/Java: expensive underlying JXTA-level dynamic channel management
(Figure: relative overhead (%) vs. provider volatility period, from 160 s down to 30 s; the overhead grows as volatility increases)
- Reconfiguration time: 11 seconds
- Targeted volatility is weaker (period >> 80 seconds)
26
Summary
A hierarchical architecture for a grid data sharing service:
- Hybrid approach: DSM and P2P systems
- Transparent access to data blocks
- Persistent storage
- Mutable data
- SC memory model for DRF accesses
- Active support for peer volatility
27
Ongoing Work
Studies:
- Replication strategies for fault tolerance
- Consistency protocols in a dynamic environment
- Co-scheduling of computation and data distribution
- Managing data-data affinity
- Integrating high-speed networks: Myrinet, SCI
Goal: build a Grid Data Service
- GDS project: http://www.irisa.fr/GDS
- Extensive evaluation on realistic codes
- Actual execution: 100 nodes
- Simulation: 1,000-10,000 nodes
28
Questions?
29
Managing Memory Resources
(Figure: providers publish provider advertisements (e.g. size: 8 MB) within their cluster group; cluster advertisements are published to the juxmem group, aggregating the memory provided per cluster)
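The advertisement flow can be sketched as a two-level directory, assuming managers simply match a requested size against published advertisements. All class and peer names below are illustrative, not JuxMem's actual code:

```python
class ClusterManager:
    """Collects provider advertisements for one cluster group."""
    def __init__(self, name):
        self.name = name
        self.ads = {}                 # provider name -> advertised bytes

    def publish(self, provider, size):
        self.ads[provider] = size

    def best_fit(self, size):
        """Smallest advertised block satisfying the request, or None."""
        fits = [(s, p) for p, s in self.ads.items() if s >= size]
        return min(fits)[1] if fits else None

class JuxmemGroup:
    """Top level: cluster advertisements aggregate each cluster's memory."""
    def __init__(self, clusters):
        self.clusters = clusters

    def allocate(self, size):
        # Try each cluster in turn until one can serve the request.
        for cluster in self.clusters:
            provider = cluster.best_fit(size)
            if provider is not None:
                return cluster.name, provider
        return None

a = ClusterManager("A"); a.publish("pA1", 8 * 2**20)
b = ClusterManager("B"); b.publish("pB1", 4 * 2**20)
grid = JuxmemGroup([a, b])
assert grid.allocate(8 * 2**20) == ("A", "pA1")   # the "8 MB?" request
assert grid.allocate(16 * 2**20) is None          # nobody advertises enough
```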
30
Allocation: How Does It Work?
(Figure: allocation of an 8 MB block, shown as numbered steps 1, 2, 3a, 3b, 4, 5, 6 through the group hierarchy)