
Super-Peer-Based Routing and Clustering Strategies

for RDF-Based Peer-To-Peer Networks

Alexander Löser

Technische Universität Berlin, Germany

Wolfgang Nejdl, Martin Wolpers, Wolf Siberski, Christoph Schmitz, Mario Schlosser, Ingo Brunkhorst

Learning Lab Lower Saxony, Hannover/Karlsruhe, Germany


PADLR: Personalized Access to Digital Learning Resources

Heterogeneous Applications, Repositories, Platforms


Edutella: Introduction

Main Goal: Achieve interoperability between heterogeneous metadata-driven (e-learning) systems

Provides metadata only, not the resources; resources are fetched via HTTP

Foundations: Semantic Web, Peer-to-Peer, Federated Databases

Open source project (http://edutella.jxta.org)

Uses other OSS: JXTA Platform, Jena, JUnit, Ant, Xerces, Jetty, ICU4J, XIndice, ...


Schema-Based Peer-to-Peer Networks

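Database Systems: user-definable schemas, structured schemas, query language

P2P Systems: no central control, node autonomy, self-organization

Schema-based P2P Systems: the intersection of both

[Figure: systems classified by schema support (fixed schema / keywords / key vs. schema-based) and by distribution (local, distributed, peer-to-peer); system list not complete]

Fixed schema / keywords / key, peer-to-peer: CAN, CHORD, DIRECTCONNECT, GNUTELLA, KAZAA, P-GRID, NAPSTER

Schema-based, distributed: AMOSII, OBJECTGLOBE, TSIMMIS, TUKWILA

Schema-based, peer-to-peer: CHATTY WEB, EDUTELLA, PIAZZA

Schema-based, local: ANY RDBMS, CONCEPTBASE, ONTOBROKER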


Problem and Approach

Broadcasting all queries to all information sources obviously doesn't scale

Problem: How to distribute queries in a scalable fashion?

Optimal solution: distribute a query only to peers which have results for it

Approach: use a super-peer network; introduce query routing indices


Super-Peer Networks

Observation: Peers vary significantly in availability, bandwidth, processing power, etc.

Create network backbone from highly available and powerful peers to distribute load better.


Super-Peer Topology

Super-peers are arranged in a HyperCuP (hypercube) topology

A broadcast to n super-peers needs n-1 messages and log2(n) hops

High connectivity, resilient against node failures
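
The message and hop counts can be checked with a small simulation. The following is a minimal sketch, assuming a complete hypercube of n = 2^d super-peers with integer IDs whose neighbours differ in exactly one bit. The function name `hypercube_broadcast` is hypothetical, and the forwarding rule (forward only along dimensions higher than the one the message arrived on) is one common way to avoid duplicates, not necessarily the exact HyperCuP protocol.

```python
from collections import deque


def hypercube_broadcast(dimensions, origin=0):
    """Simulate a duplicate-free broadcast in a complete hypercube.

    Returns a tuple (messages_sent, max_hops, nodes_reached).
    """
    reached = {origin}
    messages = 0
    max_hops = 0
    # Queue entries: (node, dimension the message arrived on, hops so far).
    # The origin behaves as if the message had arrived on dimension -1.
    queue = deque([(origin, -1, 0)])
    while queue:
        node, arrived_on, hops = queue.popleft()
        max_hops = max(max_hops, hops)
        # Forward only along dimensions higher than the arrival dimension.
        for dim in range(arrived_on + 1, dimensions):
            neighbour = node ^ (1 << dim)  # flip one bit to get the neighbour
            if neighbour in reached:
                raise RuntimeError("a node was reached twice")
            reached.add(neighbour)
            messages += 1
            queue.append((neighbour, dim, hops + 1))
    return messages, max_hops, len(reached)


if __name__ == "__main__":
    for d in range(1, 5):
        messages, hops, reached = hypercube_broadcast(d)
        print(f"n={2 ** d}: {messages} messages, {hops} hops, {reached} nodes reached")
```

For n = 8 super-peers (d = 3) the simulation reports 7 messages and 3 hops, matching n-1 and log2(n).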


Routing Indices

On joining the network, each peer provides self-description

Based on this information, super-peers maintain indexes of schemas/schema elements used at each peer

Two kinds of indices: super-peer/peer indices and super-peer/super-peer indices

Index granularity: schema; property; property + value range; property + individual values
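
A super-peer/peer index kept at several granularities could look roughly like this. It is a minimal sketch, not the Edutella implementation: the class `RoutingIndex`, its methods, and the shape of the self-description (a list of `(property, value)` pairs with prefixed names such as `dc:subject`) are assumptions made for illustration. The value-range granularity is left out for brevity.

```python
from collections import defaultdict


class RoutingIndex:
    """Hypothetical super-peer/peer index kept at three granularities."""

    def __init__(self):
        self.by_schema = defaultdict(set)    # "dc" -> peers
        self.by_property = defaultdict(set)  # "dc:subject" -> peers
        self.by_value = defaultdict(set)     # ("dc:subject", "ccs:sw-eng") -> peers

    def register(self, peer, description):
        """Index a joining peer's self-description."""
        for prop, value in description:
            schema = prop.split(":", 1)[0]   # "dc:subject" -> "dc"
            self.by_schema[schema].add(peer)
            self.by_property[prop].add(peer)
            self.by_value[(prop, value)].add(peer)

    def unregister(self, peer):
        """Remove a leaving peer from every granularity."""
        for table in (self.by_schema, self.by_property, self.by_value):
            for peers in table.values():
                peers.discard(peer)

    def lookup(self, prop, value=None):
        """Peers indexed for a property, optionally narrowed to one value."""
        if value is not None:
            return set(self.by_value.get((prop, value), set()))
        return set(self.by_property.get(prop, set()))


if __name__ == "__main__":
    index = RoutingIndex()
    index.register("P1", [("dc:language", "de"),
                          ("lom:context", "undergrad"),
                          ("dc:subject", "ccs:sw-eng")])
    print(sorted(index.lookup("dc:subject", "ccs:sw-eng")))
    print(sorted(index.by_schema["lom"]))
```

Coarser granularities keep the index small but route more queries to peers that cannot answer them; finer granularities do the opposite.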

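Index Sample

[Figure: super-peers SP1 to SP4 connected in a hypercube, with peers P0 to P4 attached. P1 annotates its resources with dc:language "de", lom:context "undergrad" and dc:subject ccs:sw-eng; other peers hold resources with dc:subject ccs:ethernet and dc:subject ccs:clientserver.]

Super-peer/super-peer index (labelled SP2 in the figure):

...

dc -> SP1, SP3, SP4

lom -> SP1, SP4

Super-peer/peer index (labelled SP1 in the figure):

lom:context -> P1

dc:subject ccs:sw-eng -> P1

dc:language "de" -> P1

...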

Query Routing Sample

Find any resource with dc:subject = ccs:sw-eng and lom:context = "undergrad"

[Figure: the same super-peer network (SP1 to SP4, peers P0 to P4), illustrating how the query is routed]
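
Routing this sample query with the two index levels can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the index contents are copied from the Index Sample slide, a missing value (`None`) stands for an entry indexed at property granularity, and a target is selected only if it matches every constraint of the query.

```python
def schema_of(prop):
    """Return the schema prefix of a property, e.g. "dc:subject" -> "dc"."""
    return prop.split(":", 1)[0]


def route_to_super_peers(query, sp_sp_index):
    """Super-peers whose schema-level entries cover every schema in the query."""
    candidates = None
    for prop, _value in query:
        targets = sp_sp_index.get(schema_of(prop), set())
        candidates = set(targets) if candidates is None else candidates & targets
    return candidates or set()


def route_to_peers(query, sp_p_index):
    """Peers that match every (property, value) constraint of the query."""
    candidates = None
    for prop, value in query:
        # A peer matches if it is indexed under the exact value
        # or under the bare property (property granularity).
        targets = (sp_p_index.get((prop, value), set())
                   | sp_p_index.get((prop, None), set()))
        candidates = set(targets) if candidates is None else candidates & targets
    return candidates or set()


# Index contents taken from the Index Sample slide.
SP_SP_INDEX = {"dc": {"SP1", "SP3", "SP4"}, "lom": {"SP1", "SP4"}}
SP1_PEER_INDEX = {
    ("lom:context", None): {"P1"},
    ("dc:subject", "ccs:sw-eng"): {"P1"},
    ("dc:language", "de"): {"P1"},
}
QUERY = [("dc:subject", "ccs:sw-eng"), ("lom:context", "undergrad")]

if __name__ == "__main__":
    print(sorted(route_to_super_peers(QUERY, SP_SP_INDEX)))
    print(sorted(route_to_peers(QUERY, SP1_PEER_INDEX)))
```

With these sample indices the query is forwarded only to SP1 and SP4 in the backbone, and SP1 passes it on to P1 alone, so no broadcast is needed.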


Clustering

If peers are randomly assigned to super-peers, we often still have to broadcast queries within the super-peer network

Two approaches:

Static: super-peer administrators define constraints which peers have to fulfill to be accepted

Dynamic: based on query statistics, peers are continually reassigned to optimize query distribution

Work in progress
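
The static approach can be illustrated with a simple admission check. This is a sketch under assumptions, since the slides do not say how constraints are expressed: here a constraint is a set of schemas a peer must use, and the helper names and super-peer names are hypothetical.

```python
def make_schema_constraint(required_schemas):
    """Build an admission check: the peer must use all required schemas."""
    required = set(required_schemas)

    def accepts(description):
        # description: list of (property, value) pairs, e.g. ("dc:subject", "...")
        used = {prop.split(":", 1)[0] for prop, _value in description}
        return required <= used

    return accepts


def assign_peer(description, super_peers):
    """Return the names of all super-peers whose constraint accepts the peer."""
    return [name for name, accepts in super_peers.items() if accepts(description)]


if __name__ == "__main__":
    # Hypothetical super-peers with administrator-defined constraints.
    super_peers = {
        "SP-dc": make_schema_constraint(["dc"]),
        "SP-dc-lom": make_schema_constraint(["dc", "lom"]),
    }
    peer = [("dc:subject", "ccs:sw-eng"), ("dc:language", "de")]
    print(assign_peer(peer, super_peers))
```

Grouping peers with similar schemas under the same super-peer means a query has to visit fewer super-peers.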


Schema Mapping

Peers may use different schemas to annotate their resources

Use federated database techniques for mapping

Super-peers act as mediators

Mapping rules have to be specified manually
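
A mediating super-peer applying manually specified rules could rewrite a query roughly as shown here. This is a minimal sketch: the rule format (a plain property-to-property table), the function name `rewrite_query`, and the target schema `ex:` are hypothetical and chosen only for illustration; real mappings may also need to translate values.

```python
# Hypothetical, manually specified mapping rules: source property -> target property.
MAPPING_RULES = {
    "dc:subject": "ex:topic",
    "dc:language": "ex:lang",
}


def rewrite_query(query, rules):
    """Translate property names in a query using the mapping rules.

    Returns (rewritten, unmapped): constraints that could be translated and
    constraints for which no rule exists.
    """
    rewritten, unmapped = [], []
    for prop, value in query:
        if prop in rules:
            rewritten.append((rules[prop], value))
        else:
            unmapped.append((prop, value))
    return rewritten, unmapped


if __name__ == "__main__":
    query = [("dc:subject", "ccs:sw-eng"), ("lom:context", "undergrad")]
    rewritten, unmapped = rewrite_query(query, MAPPING_RULES)
    print("rewritten:", rewritten)
    print("no rule for:", unmapped)
```

Reporting unmapped constraints separately lets the mediator decide whether to drop them, forward them unchanged, or reject the query for that peer.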


Super-peer/Peer implementation

[Figure: component architecture of a peer and a super-peer]

Service advertisements: Bind Service Adv, Query Service Adv, Routing Service Adv, Topology Service Adv

Services: Binding Service, Query Service, Routing Service, Topology Service

Processors: Query Processor, Binding Processor, SP/SP Routing Processor, SP/P Routing Processor

Routing indices: SP/P RIs, SP/SP RIs

Other components named in the figure: JXTA Endpoint, Execution Pool, Peer Service Registry


Further Work

Quantitative evaluation (by simulation)

Exploration of clustering approaches

Integration of other mediation techniques
