Service and Resource Discovery Supports over P2P Overlays

Emanuele Carlini, Massimo Coppola, Patrizio Dazzi, Domenico Laforenza, Laura Ricci

International Conference on Ultra Modern Telecommunications, ICUMT
Saint Petersburg, October 12-14th, 2009

Università degli Studi di Pisa, Dipartimento di Informatica


INTRODUCTION

• Grid environments exploit a huge amount of geographically scattered computing resources

• Main features of large computational grids

– Dynamic environment

– Huge amount of heterogeneous resources

– Complex middleware for accessing the resources

• XtreemOS: a research project funded by the European Commission

– main goal: definition of an open-source, Grid-enabled operating system

– scalable and transparent management of large computational platforms

– federation of several virtual organizations

– users exploit the distributed system through a standard operating system interface


SRDS: SERVICE AND RESOURCE DISCOVERY

• SRDS: a basic service of XtreemOS providing a highly distributed directory service

• SRDS main features

– enables resource look-up and exploitation in a multi-VO environment

– hides the effect of scale when exploiting individual systems

– may be exploited by different clients

• other modules of XtreemOS

• applications

– supports different kinds of queries

• key-based

• multi-attribute

• range queries over dynamic attributes


SRDS ARCHITECTURE

• SRDS exploits a set of P2P overlays where each overlay includes nodes from different virtual organizations

• The choice of the P2P model enables

– scalability

– low overhead

– fault tolerance

– management of information in a dynamic environment

• SRDS services are exploited by different clients, each one with different requirements.

– to cope with the diversity of these requirements, several P2P overlays characterized by different features have been defined (Distributed Hash Tables, structured overlays,...)


SRDS: THE ARCHITECTURE

Facade: an easy-to-extend set of multiple interface protocols

Query Provider (QP): a set of modules for client query translation

Information Management Layer (IML): a common interface to DHT-like overlays

ADS (Application Directory Service) = Facade + QP + IML

RSS (Resource Selection Service): a P2P overlay allowing scalable resource location in large overlays

Scalaris, Overlay Weaver: DHTs with different characteristics


SRDS MAIN MODULES: ADS AND RSS

RSS (Resource Selection Service) supports resource discovery through queries on constant-valued attributes, for example:

CPU = IA32, MEM ∈ [4GB, ∞), BANDWIDTH ∈ [512Kb/s, ∞), DISK ∈ [128GB, ∞), OS ∈ {Linux 2.6.19-1.2895, ..., Linux 2.6.20-1.2944}

ADS (Application Directory Service) supports complex queries over dynamic attributes

Example:

• the RSS selects a set of resources whose static attributes match the query constraints

• the descriptors of these resources are stored in the ADS

• the dynamic state of the resources (for instance, current free memory) is monitored through the ADS

RSS acts like a machete, while ADS acts like a scalpel
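The coarse-then-fine division of work between RSS and ADS can be sketched as follows. This is only an illustration under stated assumptions, not the XtreemOS API: the `Resource` record, the `rss_select` and `ads_filter` functions, and the sample attribute values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    static: dict    # constant-valued attributes, known at initialization time
    dynamic: dict   # attributes that change over time, e.g. free memory

def rss_select(resources, cpu, min_mem_gb, min_disk_gb):
    """Coarse step (RSS role): filter on static attributes only."""
    return [r for r in resources
            if r.static["cpu"] == cpu
            and r.static["mem_gb"] >= min_mem_gb
            and r.static["disk_gb"] >= min_disk_gb]

def ads_filter(candidates, min_free_mem_gb):
    """Fine step (ADS role): refine on the monitored dynamic state."""
    return [r for r in candidates
            if r.dynamic["free_mem_gb"] >= min_free_mem_gb]

# Hypothetical pool of resources.
pool = [
    Resource("n1", {"cpu": "IA32", "mem_gb": 8, "disk_gb": 256}, {"free_mem_gb": 1.0}),
    Resource("n2", {"cpu": "IA32", "mem_gb": 4, "disk_gb": 128}, {"free_mem_gb": 3.5}),
    Resource("n3", {"cpu": "IA64", "mem_gb": 16, "disk_gb": 512}, {"free_mem_gb": 9.0}),
]

candidates = rss_select(pool, cpu="IA32", min_mem_gb=4, min_disk_gb=128)
selected = ads_filter(candidates, min_free_mem_gb=2.0)
```

In this toy run the static filter keeps `n1` and `n2`, and the dynamic check then keeps only `n2`.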


RSS: RESOURCE SELECTION SERVICE

• Supports resource discovery through multi-attribute range queries over a set of static attributes, i.e. constant-valued attributes known at initialization time.

• RSS main features

– each node represents its own attributes in the overlay

– no delegation of the resource information to other nodes, like in DHT-based approaches

– speeds up resource location


ADS: THE QUERY PROVIDER (QP)

• Query Provider Layer: provides a set of modules devoted to query translation

• Implements a set of algorithms for the interpretation of the queries of different SRDS clients

• For instance, a job directory service is required to monitor the state of the jobs of an application/VO

– when a new job is created, the client submits an AddJob to the SRDS

– the AddJob operation is interpreted by a QP module, which translates it into a sequence of operations on the underlying DHT

• Check that a proper job directory service exists; if it does not, request its creation

• Insertion of the job ID into the DHT

• Insertion of further information about the job under proper keys to support inverse queries

– The QP makes all these steps transparent for the user
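The AddJob translation described above can be sketched as follows. This is a sketch under assumptions: `ToyDHT` is an in-memory stand-in for a real DHT, and the `add_job` function and the key layout (`jobdir:`, `jobs:`, `owner:`) are hypothetical, since the slides do not specify them.

```python
class ToyDHT:
    """In-memory stand-in for a DHT that offers only put/get on keys."""
    def __init__(self):
        self.store = {}
        self.log = []                  # records the primitive operations issued

    def put(self, key, value):
        self.log.append(("put", key))
        self.store.setdefault(key, []).append(value)

    def get(self, key):
        self.log.append(("get", key))
        return self.store.get(key, [])

def add_job(dht, app_id, job_id, owner):
    """Hypothetical QP module: expands one AddJob into several DHT operations."""
    directory_key = f"jobdir:{app_id}"
    # 1. Check that the job directory exists; create it if it does not.
    if not dht.get(directory_key):
        dht.put(directory_key, "created")
    # 2. Insert the job ID into the DHT.
    dht.put(f"jobs:{app_id}", job_id)
    # 3. Insert further information under another key to support inverse queries.
    dht.put(f"owner:{owner}", job_id)

dht = ToyDHT()
add_job(dht, app_id="app1", job_id="job-42", owner="alice")
add_job(dht, app_id="app1", job_id="job-43", owner="alice")
```

The client issues a single `add_job` call, while `dht.log` shows the get/put sequence that the QP module generated on its behalf.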


ADS: THE INFORMATION MANAGEMENT LAYER (IML)

• A namespace defines the context in which a key is used, for instance a different namespace for each job directory

The Information Management Layer of the ADS (Application Directory Service)

– provides an implementation of namespaces over the DHT

– receives from a QP module an abstract operation:

OP_QP = { op, keyM, valueM, NSpace, ClientType, ClientID }

– generates an operation for the underlying DHT in the proper namespace

OP_DHT = { op, keyD, valueD, auxinfo }

where

valueD: generally equals valueM

keyD: may differ from keyM because of the namespace implementation

auxinfo: data expiration timeouts, user-defined secrets, ...
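The mapping from the abstract operation to the DHT operation can be sketched as follows. The field names mirror the two tuples above, but the `to_dht_operation` function, the key-prefixing scheme, and the `ttl_seconds` entry in `auxinfo` are assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class OpQP:
    op: str
    key_m: str
    value_m: object
    nspace: str
    client_type: str
    client_id: str

@dataclass
class OpDHT:
    op: str
    key_d: str
    value_d: object
    auxinfo: dict = field(default_factory=dict)

def to_dht_operation(qp_op, ttl_seconds=60):
    """Map an abstract QP operation onto a DHT operation in its namespace."""
    # Assumed namespace implementation: prefix the key with the namespace name,
    # which is why keyD may differ from keyM.
    key_d = f"{qp_op.nspace}/{qp_op.key_m}"
    # valueD generally equals valueM; auxinfo carries e.g. an expiration timeout.
    return OpDHT(qp_op.op, key_d, qp_op.value_m, {"ttl_seconds": ttl_seconds})

dht_op = to_dht_operation(
    OpQP("put", "job-42", "RUNNING", "jobdir-app1", "JobManager", "client-7")
)
```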


EXPLOITING NAMESPACES: AN EXAMPLE

• A network coordinates (NC) embedding system embeds latencies, such as round-trip times among nodes, into a geometric space

• Each node is assigned network coordinates in the geometric space

• An unmeasured round-trip time is estimated by computing the distance between the two nodes in the geometric space

To support

• direct queries: given the IP of a node, return its network coordinates

• inverse queries: given the X/Y coordinates of a node, find the IPs of its 'nearest' neighbours

the ADS

• exploits three different namespaces: IP, X, Y

• each namespace may be mapped on a different DHT or on the same DHT and may have different characteristics
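A toy version of the three namespaces and of the two query kinds is sketched below. The dictionaries stand in for DHT-backed namespaces; the sample coordinates, the Euclidean metric, and the square search window used for the inverse query are assumptions, not details given in the slides.

```python
import math

# Three namespaces, modelled here as plain dictionaries (hypothetical data).
ip_ns = {"10.0.0.1": (0.0, 0.0), "10.0.0.2": (3.0, 4.0), "10.0.0.3": (30.0, 40.0)}
x_ns = {}   # X coordinate -> IPs
y_ns = {}   # Y coordinate -> IPs
for ip, (x, y) in ip_ns.items():
    x_ns.setdefault(x, []).append(ip)
    y_ns.setdefault(y, []).append(ip)

def direct_query(ip):
    """Given the IP of a node, return its network coordinates."""
    return ip_ns[ip]

def estimated_rtt(ip_a, ip_b):
    """Estimate an unmeasured round-trip time as a distance in the space."""
    (xa, ya), (xb, yb) = ip_ns[ip_a], ip_ns[ip_b]
    return math.hypot(xa - xb, ya - yb)

def inverse_query(x, y, radius):
    """IPs whose coordinates lie in the square of half-side `radius` around (x, y)."""
    near_x = {ip for cx, ips in x_ns.items() if abs(cx - x) <= radius for ip in ips}
    near_y = {ip for cy, ips in y_ns.items() if abs(cy - y) <= radius for ip in ips}
    return sorted(near_x & near_y)

rtt = estimated_rtt("10.0.0.1", "10.0.0.2")
neighbours = inverse_query(0.0, 0.0, radius=5.0)
```

The inverse query intersects one range lookup in the X namespace with one in the Y namespace, which is why separate namespaces per coordinate are convenient.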


NAMESPACE IMPLEMENTATION

Different choices for the implementation of the namespaces:

• a different DHT for each namespace

• a set of namespaces on the same DHT


NAMESPACE IMPLEMENTATION

Single Ring Approach:

• the DHT key is prefixed by an identifier of the namespace

• main drawback: DHT features such as the replication strategy or the fault repair strategy cannot be tuned per namespace

Multiple Ring Approach

• On demand ring creation

• Parameters and policies of the DHT ring are customized at ring set-up time

• Some rings may always remain active

– they include essential key spaces, for instance resource directories

• Smaller rings may have a shorter lifespan

– application rings, for instance the job directory of a given application
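The contrast between the two approaches can be sketched as follows. The `Ring` and `RingManager` classes, the `replicas` and `permanent` parameters, and the `namespace:key` prefix format are hypothetical names chosen for illustration; real rings would be Scalaris or Overlay Weaver instances.

```python
class Ring:
    """Toy DHT ring whose policies are fixed when the ring is set up."""
    def __init__(self, name, replicas, permanent):
        self.name = name
        self.replicas = replicas
        self.permanent = permanent
        self.store = {}

class RingManager:
    """Multiple-ring approach: one ring per namespace, created on demand."""
    def __init__(self):
        self.rings = {}

    def get_ring(self, namespace, replicas=1, permanent=False):
        # Parameters are applied only at ring set-up time.
        if namespace not in self.rings:
            self.rings[namespace] = Ring(namespace, replicas, permanent)
        return self.rings[namespace]

    def release(self, namespace):
        # Short-lived application rings are torn down; permanent rings stay.
        ring = self.rings.get(namespace)
        if ring is not None and not ring.permanent:
            del self.rings[namespace]

def single_ring_key(namespace, key):
    """Single-ring approach: one DHT, key prefixed by the namespace identifier."""
    return f"{namespace}:{key}"

manager = RingManager()
manager.get_ring("resources", replicas=3, permanent=True)
manager.get_ring("jobs-app1")
manager.release("resources")
manager.release("jobs-app1")
```

After the two `release` calls only the permanent `resources` ring survives, with the replication factor chosen at set-up.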


NAMESPACE IMPLEMENTATION

• The current version of the ADS exploits two different rings, based on two different DHTs: Scalaris and Overlay Weaver

• Scalaris

– a transaction-based DHT

– provides consistent replication of data

• Overlay Weaver

– implements different DHTs: Chord, Pastry, CAN, ...

– defines a routing layer common to all the DHTs

[Figure: The Overlay Weaver architecture]


COMPLEX QUERIES ON DHT

• A DHT supports only basic key-value queries

• More complex queries may be submitted by the SRDS clients, such as multidimensional range queries on dynamic attributes

Examples

• exact match queries: Arch = 'x86' and CPU-Speed = '3 GHz' and RAM = '256 MB'

• partial match queries: CPU-Speed = '3 GHz' and RAM = '256 MB' (and Arch = *)

• range queries: 1 GHz < CPU-Speed < 3 GHz and 512 MB < RAM < 1 GB

• similarity queries (or nearest neighbour queries)

– require the definition of a metric in the attribute space

– the user submits an exact match query, which defines a point P in the attribute space; P may not correspond to any resource

– output: the k resources nearest to P, according to the defined metric
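The four query kinds listed above can be illustrated on a small in-memory list of resource descriptors. This is a centralized sketch, not a DHT implementation: the sample resources, the function names, and the unnormalized Euclidean metric are assumptions.

```python
import math

resources = [
    {"id": "r1", "arch": "x86", "cpu_ghz": 3.0, "ram_mb": 256},
    {"id": "r2", "arch": "ppc", "cpu_ghz": 3.0, "ram_mb": 256},
    {"id": "r3", "arch": "x86", "cpu_ghz": 2.0, "ram_mb": 768},
]

def match(constraints):
    """Exact or partial match: attributes left out of `constraints` are wildcards."""
    return [r["id"] for r in resources
            if all(r[a] == v for a, v in constraints.items())]

def range_query(bounds):
    """`bounds` maps an attribute to an open interval (low, high)."""
    return [r["id"] for r in resources
            if all(lo < r[a] < hi for a, (lo, hi) in bounds.items())]

def nearest(point, k):
    """The k resources nearest to `point` under a Euclidean metric (assumed)."""
    def dist(r):
        return math.sqrt(sum((r[a] - v) ** 2 for a, v in point.items()))
    return [r["id"] for r in sorted(resources, key=dist)[:k]]

exact = match({"arch": "x86", "cpu_ghz": 3.0, "ram_mb": 256})
partial = match({"cpu_ghz": 3.0, "ram_mb": 256})
ranged = range_query({"cpu_ghz": (1.0, 3.0), "ram_mb": (512, 1024)})
similar = nearest({"cpu_ghz": 2.5, "ram_mb": 700}, k=1)
```

Note that the point passed to `nearest` does not correspond to any stored resource, as the slide anticipates. A practical metric would normalize each attribute so that RAM in megabytes does not dominate CPU speed in gigahertz.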


RANGE QUERY SUPPORT

• SRDS supports multi-attribute range queries

– an approach based on the MAAN proposal

– exploits the Chord DHT

Resource publication

– Each resource is described by k pairs $(a_i, v_i)$

– A locality-preserving hashing function maps the value of each attribute onto the DHT

$$H(v_i) = \frac{(v_i - v_i^{\min}) \times (2^m - 1)}{v_i^{\max} - v_i^{\min}}$$

where $2^m$ is the size of the key space

– The descriptor of each resource is published onto k DHT nodes
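The hashing function can be tried out with a few lines of code. The sketch below truncates the result to an integer key, which is an assumption (the formula itself is real-valued); the function name, the attribute domain, and the value of m are illustrative.

```python
def locality_hash(value, v_min, v_max, m):
    """Map a value in [v_min, v_max] onto the key space [0, 2**m - 1].

    Larger attribute values never map to smaller keys, so a range of
    values corresponds to a contiguous arc of the DHT ring.
    """
    if not v_min <= value <= v_max:
        raise ValueError("value outside the attribute domain")
    return int((value - v_min) * (2 ** m - 1) / (v_max - v_min))

# Hypothetical RAM attribute with domain [0, 1024] MB and an 8-bit key space.
m = 8
keys = [locality_hash(v, 0, 1024, m) for v in (0, 256, 512, 1024)]
```

Because order is preserved, the interval [256, 512] MB maps onto the arc between `keys[1]` and `keys[2]`. For large m, floating-point division loses precision, so an implementation over a 160-bit key space would need integer or rational arithmetic.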


RANGE QUERY SUPPORT

• Consider a multi-attribute range query $a_1 \in [v_1^l, v_1^u], \ldots, a_k \in [v_k^l, v_k^u]$

• The hashing function maps the range of each attribute onto a DHT range

• Selectivity of an attribute

$$S_i = \frac{2^m}{H(v_i^u) - H(v_i^l)}$$

• The dominant attribute $a_i \in [v_i^l, v_i^u]$, i.e. the one with the highest selectivity, is chosen.

• The query is sent to $H(v_i^l)$ and is propagated along a DHT arc A until it reaches $H(v_i^u)$

• Each node on A checks whether the resources it stores satisfy all the query constraints

• The results are collected along A and sent by the node responsible for $H(v_i^u)$ to the querying node
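The choice of the dominant attribute can be sketched as follows. The function and parameter names, the integer truncation of the hash, and the guard against a zero-width arc are assumptions; the attribute domains and the query are made-up values.

```python
def locality_hash(value, v_min, v_max, m):
    """Locality-preserving hash onto [0, 2**m - 1], truncated to an integer."""
    return int((value - v_min) * (2 ** m - 1) / (v_max - v_min))

def dominant_attribute(query, domains, m):
    """Return (attribute, arc_start, arc_end) for the most selective attribute.

    query:   attribute -> (lower, upper) bounds of the requested range
    domains: attribute -> (min, max) of the whole attribute domain
    """
    best = None
    for attr, (low, up) in query.items():
        v_min, v_max = domains[attr]
        start = locality_hash(low, v_min, v_max, m)
        end = locality_hash(up, v_min, v_max, m)
        width = max(end - start, 1)        # guard against a zero-width arc
        selectivity = 2 ** m / width
        if best is None or selectivity > best[0]:
            best = (selectivity, attr, start, end)
    return best[1:]

domains = {"cpu_ghz": (0, 4), "ram_mb": (0, 4096)}
query = {"cpu_ghz": (1, 3), "ram_mb": (512, 1024)}
attr, arc_start, arc_end = dominant_attribute(query, domains, m=8)
```

Here the RAM range covers a shorter arc of the ring than the CPU range, so it has the higher selectivity and the query would be propagated only along that shorter arc.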


PUBLICATION OPTIMIZATION

• SRDS optimizes the resource publication process defined by MAAN

• Publication optimization: exploits a soft-state cache to store the routing results obtained during the publication process

• Routing on the DHT is avoided if the routing path to a node is stored in the cache
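A soft-state routing cache of this kind can be sketched as follows. The `RoutingCache` class, the `publish` helper, the 30-second time-to-live, and the `route` callback standing in for DHT routing are all hypothetical; the slides do not describe the actual data structure.

```python
import time

class RoutingCache:
    """Soft-state cache: entries silently expire unless refreshed."""
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}                  # key -> (node address, expiry time)

    def lookup(self, key):
        entry = self.entries.get(key)
        if entry is None or entry[1] <= self.clock():
            self.entries.pop(key, None)    # drop stale state
            return None
        return entry[0]

    def remember(self, key, node):
        self.entries[key] = (node, self.clock() + self.ttl)

def publish(key, cache, route):
    """Return the node responsible for `key`, routing only on a cache miss."""
    node = cache.lookup(key)
    if node is None:
        node = route(key)                  # full DHT routing in a real overlay
        cache.remember(key, node)
    return node

# Usage with a fake clock so that expiry is deterministic.
now = [0.0]
routed = []

def route(key):
    routed.append(key)
    return f"node-for-{key}"

cache = RoutingCache(ttl_seconds=30, clock=lambda: now[0])
first = publish(127, cache, route)     # miss: routed on the DHT
second = publish(127, cache, route)    # hit: routing avoided
now[0] = 31.0
third = publish(127, cache, route)     # entry expired: routed again
```

Letting entries expire instead of invalidating them explicitly keeps the cache correct under churn without extra messages, at the cost of occasionally routing to a node that has already left.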


PUBLICATION OPTIMIZATION

• A second optimization is defined to avoid the publication of 'unpopular' attributes

• Popularity of an attribute A = number of times A is chosen as dominant in a query

– depends on the query distribution

• Descriptors associated with low popularity attributes are updated with lower frequency

• Popularity is

– dynamically refined in a distributed fashion by the nodes receiving the queries

– estimated at target nodes receiving the query and sent back to publishing nodes by put-reply messages
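One way to turn popularity into an update frequency is sketched below. The slides define popularity as a count; normalizing it to a fraction, the inverse-proportional interval policy, and the 30 s and 300 s bounds are assumptions introduced only for illustration.

```python
from collections import Counter

class PopularityTracker:
    """Counts how often each attribute is chosen as dominant in a query."""
    def __init__(self):
        self.dominant_count = Counter()

    def record_query(self, dominant_attr):
        self.dominant_count[dominant_attr] += 1

    def popularity(self, attr):
        """Fraction of observed queries in which `attr` was dominant."""
        total = sum(self.dominant_count.values())
        return self.dominant_count[attr] / total if total else 0.0

def publication_interval(popularity, base_seconds=30, max_seconds=300):
    """Assumed policy: stretch the update interval as popularity drops."""
    if popularity <= 0:
        return max_seconds
    return min(max_seconds, base_seconds / popularity)

tracker = PopularityTracker()
for dominant in ["ram_mb", "ram_mb", "cpu_ghz", "ram_mb"]:
    tracker.record_query(dominant)

intervals = {a: publication_interval(tracker.popularity(a))
             for a in ("ram_mb", "cpu_ghz", "disk_gb")}
```

In a distributed setting each target node would keep such a counter for the keys it serves and piggyback the estimate on its put-reply messages, so that publishers can adapt without any global view.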


SRDS EVALUATION

• testing environment: Grid 5000 platform, with nodes belonging to different Grid 5000 clusters

• all nodes publish information every 30 s

• a large fraction of nodes run queries every 100 ms


JOB DIRECTORY SERVICE EVALUATION

• 20-120 nodes belonging to two clusters of the Grid 5K platform

– each node performs publications over the DHT at a fixed rate of one every 30 seconds

– time interval between different requests: 200 milliseconds

• The latency of different operations is measured

– AddJob: requires a set of put/get operations

– RequestJob: a single DHT get


CONCLUSIONS

• SRDS: a service and resource discovery support developed for the XtreemOS distributed operating system

• Provides scalable and customisable information query support over large platforms

• Future work:

– testing SRDS on a large computing platform

– dynamic definition of namespaces on different DHTs

– definition of hierarchical name spaces

– investigation of further strategies for range queries (multi attribute range and neighbours query)