Thomas Dreibholz
Institute for Experimental Mathematics
University of Duisburg-Essen, Germany
dreibh@iem.uni-due.de

Reliable Server Pooling
A Novel IETF Architecture for Availability-Sensitive Services


Table of Contents

- What is Reliable Server Pooling?
  - Prototype Demonstration
  - Terminology and Protocols
  - Motivation and Application Scenarios
- Failure Detection
  - Dynamic Pools
  - “Unclean” Shutdowns
  - Session Monitoring
- Failover Mechanism
  - Applying Client-Based State Sharing
- Conclusion and Outlook

Thomas Dreibholz's Reliable Server Pooling Page: http://tdrwww.iem.uni-due.de/dreibholz/rserpool/


What is “Reliable Server Pooling”?

Prototype Demonstration


Reliable Server Pooling (RSerPool)

Terminology:
- Pool Element (PE): Server
- Pool: Set of PEs
- PE ID: ID of a PE in a pool
- Pool Handle: Unique pool ID
- Handlespace: Set of pools
- Pool Registrar (PR)
- Pool User (PU): Client

Support for Existing Applications:
- Proxy Pool User (PPU)
- Proxy Pool Element (PPE)

Protocols:
- ASAP (Aggregate Server Access Protocol)
- ENRP (Endpoint Handlespace Redundancy Protocol)
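The terminology maps naturally onto a small data structure. The following is a minimal Python sketch for illustration only, not the RSerPool API: the class names `PoolElement` and `Handlespace`, the methods `register`, `deregister` and `lookup`, the pool handle `"ComputePool"` and the addresses are all hypothetical. It also assumes that a pool exists only while at least one PE is registered in it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PoolElement:
    pe_id: int      # PE ID: identifies the PE within its pool
    address: str    # transport address (illustrative placeholder)


@dataclass
class Handlespace:
    # Maps each pool handle (unique pool ID) to its pool: PEs keyed by PE ID.
    pools: dict[str, dict[int, PoolElement]] = field(default_factory=dict)

    def register(self, pool_handle: str, pe: PoolElement) -> None:
        """Add a PE to a pool; the pool is created on first registration."""
        self.pools.setdefault(pool_handle, {})[pe.pe_id] = pe

    def deregister(self, pool_handle: str, pe_id: int) -> None:
        """Remove a PE; an empty pool is dropped (assumption of this sketch)."""
        pool = self.pools.get(pool_handle, {})
        pool.pop(pe_id, None)
        if not pool:
            self.pools.pop(pool_handle, None)

    def lookup(self, pool_handle: str) -> list[PoolElement]:
        """Resolve a pool handle to the PEs currently in that pool."""
        return list(self.pools.get(pool_handle, {}).values())


hs = Handlespace()
hs.register("ComputePool", PoolElement(1, "10.0.0.1:4000"))
hs.register("ComputePool", PoolElement(2, "10.0.0.2:4000"))
hs.deregister("ComputePool", 1)
remaining = [pe.pe_id for pe in hs.lookup("ComputePool")]
```

In this sketch the registrar role (PR) is the owner of the `Handlespace` object, PEs call `register` and `deregister`, and a PU calls `lookup` with a pool handle.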


Session Failover using Client-Based State Sharing

Necessary to handle failover: a new PE must be able to recover the session state of the old PE

Simple solution for many applications: usage of “state cookies” [LCN2002]

Now part of the ASAP protocol!
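The state-cookie idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ASAP message format: the class `PoolElementSession`, the JSON encoding and the `processed` counter are hypothetical. The PE hands a snapshot of its session state to the PU, the PU stores it without interpreting it, and after a failure the PU passes the latest snapshot to a newly selected PE.

```python
import json
from typing import Optional


class PoolElementSession:
    """Server-side session with a toy state: a counter of processed items."""

    def __init__(self, cookie: Optional[bytes] = None):
        # Resume from a cookie if the PU supplies one, otherwise start fresh.
        self.state = json.loads(cookie) if cookie else {"processed": 0}

    def process(self, items: int) -> None:
        self.state["processed"] += items

    def make_cookie(self) -> bytes:
        # Snapshot of the session state; the PU treats it as opaque bytes.
        return json.dumps(self.state).encode()


# The Pool User keeps only the most recent cookie.
old_pe = PoolElementSession()
old_pe.process(3)
last_cookie = old_pe.make_cookie()   # sent PE -> PU
old_pe.process(2)                    # work done after the last cookie ...
del old_pe                           # ... is lost when the PE fails

new_pe = PoolElementSession(cookie=last_cookie)   # PU -> new PE on failover
resumed_at = new_pe.state["processed"]
```

The new PE resumes at the state of the last cookie, so only the work done since that cookie has to be re-processed.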


Server Selection Rules (Pool Policies)

What is a Pool Policy?
- A rule for the selection of the PEs
- Defined in our IETF Working Group draft (draft-ietf-rserpool-policies-07.txt)

Application of Policies:
- Registrar: Creates PE list upon request by PU
- Pool User: Selection of a PE from the list
- Both according to the pool policies (pool-specific!)

Non-Adaptive Policies:
- Stateless: Random (RAND)
- Stateful: Round Robin (RR) (default policy, must be supported)

Adaptive Policy:
- Least Used (LU)
  - Load definition is application-specific!
  - Round robin among multiple least-loaded PEs
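The three policies named on this slide can be sketched as follows. This is a simplified Python illustration, not the normative definition from the policies draft: the function and class names are hypothetical, PEs are represented by plain strings, and load is a number supplied by the application.

```python
import random


def select_random(pes, rng=random):
    """RAND: stateless, every PE in the list is equally likely."""
    return rng.choice(pes)


class RoundRobin:
    """RR: stateful, remembers how many selections were made so far."""

    def __init__(self):
        self._next = 0

    def select(self, pes):
        pe = pes[self._next % len(pes)]
        self._next += 1
        return pe


class LeastUsed:
    """LU: pick a PE with the lowest load; round robin among ties."""

    def __init__(self):
        self._rr = RoundRobin()

    def select(self, pes, load):
        lowest = min(load[pe] for pe in pes)
        candidates = [pe for pe in pes if load[pe] == lowest]
        return self._rr.select(candidates)


pes = ["pe1", "pe2", "pe3"]

rr = RoundRobin()
rr_order = [rr.select(pes) for _ in range(4)]

lu = LeastUsed()
load = {"pe1": 0.5, "pe2": 0.2, "pe3": 0.2}   # application-specific load values
lu_order = [lu.select(pes, load) for _ in range(3)]
```

With the example values, round robin cycles through all three PEs, while least used alternates between the two PEs that share the lowest load.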


The Application Model

Server:
- PE Capacity
- Shared among sessions (multi-tasking principle)

Client:
- Requests are generated
  - Request Size (effort)
  - Request Interval (frequency)
- Waiting queue for requests
- Sequential processing

System Utilization:
- PU:PE Ratio
- Provisioning for certain Target Utilization, e.g. 80%

\[
\text{systemUtilization} = \text{puToPERatio} \cdot \frac{\text{RequestSize}}{\text{RequestInterval} \cdot \text{AvgCapacity}}
\]
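Reading the utilization relation as systemUtilization = puToPERatio * RequestSize / (RequestInterval * AvgCapacity), provisioning for a target utilization means solving it for one of the parameters. The Python sketch below does this for the request interval. The function names and all numbers are hypothetical, and the units (work units for request size, work units per second for capacity) are an assumption.

```python
def system_utilization(pu_to_pe_ratio, request_size, request_interval, avg_capacity):
    """Fraction of the pool's capacity that the PUs demand on average."""
    return pu_to_pe_ratio * request_size / (request_interval * avg_capacity)


def request_interval_for(target_utilization, pu_to_pe_ratio, request_size, avg_capacity):
    """Solve the same relation for the request interval."""
    return pu_to_pe_ratio * request_size / (target_utilization * avg_capacity)


# Hypothetical setup: 10 PUs per PE, requests of 1e6 work units,
# PEs processing 1e6 work units per second, target utilization 80%.
interval = request_interval_for(0.8, 10, 1e6, 1e6)    # seconds between requests per PU
utilization = system_utilization(10, 1e6, interval, 1e6)
```

In this setup each PU would issue one request every 12.5 seconds to load the pool to 80%.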


Performance Metrics

Provider's Perspective: “Does my server capacity gain revenue?”
- Average Utilization of server resources [%]

User's Perspective: “How much time is needed to process my requests?”
- Avg. Handling Speed [% of average server capacity]
- Depends on: Queuing, Startup, Server, Failover
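One way to read the handling speed metric is as request size divided by the total time the user waits, expressed relative to the average server capacity. The sketch below follows that reading. It is an assumption, not a definition taken from this slide: it treats the four "depends on" items as additive time components (with "Server" read as processing time), and the function name and numbers are hypothetical.

```python
def handling_speed_percent(request_size, queuing, startup, processing, failover,
                           avg_capacity):
    """Request size over total handling time, as % of average PE capacity."""
    handling_time = queuing + startup + processing + failover
    return 100.0 * (request_size / handling_time) / avg_capacity


# Hypothetical request: 1e6 work units on PEs with 1e6 work units/s capacity.
speed = handling_speed_percent(
    request_size=1e6,
    queuing=0.5,      # seconds waiting in the PU's request queue
    startup=0.25,     # seconds to select a PE and set up the session
    processing=1.0,   # seconds of actual processing on the PE
    failover=0.25,    # seconds lost to failure detection and re-processing
    avg_capacity=1e6,
)
```

Here the request takes 2 seconds in total instead of the 1 second of pure processing, so the handling speed is 50% of the average server capacity.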


Dynamic Pools: A Proof of Concept

Ideal case: a “clean” shutdown
- PEs abort their sessions before shutting down
- Not critical ...
- ... except for extremely low MTBF
- Round Robin: no stable rounds -> random behaviour

[Figure: Handling Speed]


“Unclean” Shutdowns

Re-processing effort increases (due to lost work)

Session monitoring is crucial: fast failure detection -> quick failover

[Figures: Handling Speed; Utilization]


Session Monitoring

Session monitoring is crucial

Various possible mechanisms:
- Keep-Alives
- Part of application protocol, e.g. transaction timeouts

Endpoint Keep-Alive Monitoring:
- Here: small impact
- When is it useful?
  - Short and frequent requests
  - Minimizes startup time (see paper for details)

[Figure: Handling Speed]
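Timeout-based failure detection, the common core of the mechanisms listed on this slide, can be sketched as follows. This is a generic Python illustration, not the ASAP keep-alive procedure: the class `KeepAliveMonitor`, its methods and the 5-second timeout are hypothetical, and timestamps are passed in explicitly so that the example is deterministic.

```python
class KeepAliveMonitor:
    """Declares an endpoint failed when no keep-alive answer arrives in time."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds without an answer before failure
        self.last_seen = {}      # endpoint -> time of the last answer

    def heard_from(self, endpoint, now):
        """Record a keep-alive answer (or any other traffic) from an endpoint."""
        self.last_seen[endpoint] = now

    def failed(self, now):
        """Endpoints whose last answer is older than the timeout."""
        return sorted(ep for ep, seen in self.last_seen.items()
                      if now - seen > self.timeout)


monitor = KeepAliveMonitor(timeout=5.0)
monitor.heard_from("pe1", now=0.0)
monitor.heard_from("pe2", now=0.0)
monitor.heard_from("pe1", now=4.0)   # pe1 keeps answering, pe2 stays silent
failed_at_6 = monitor.failed(now=6.0)
```

A shorter timeout detects a failed PE sooner and therefore triggers failover earlier, at the cost of more keep-alive traffic.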


Using Client-Based State Sharing

More cookies -> less re-processing, better handling speed

But what about overhead?

[Figures: Handling Speed; Utilization]


Configuring a Useful Cookie Interval

Cookie size: a few bytes up to ~64K (limit)

Idea: For known MTBF (in request times):
- Set cookie interval to achieve a certain goodput (e.g. 98%)
- Choice of goodput depends on the application's requirements
- => Accepting a certain amount of re-processing work

Results: For realistic MTBF:
- High goodput already at moderate cookie rate
- Overhead rises significantly for too-high goodput -> inefficient!

[Figure: Cookies per Request]
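The trade-off between cookie interval and goodput can be illustrated with a deliberately simple model. The model is an assumption of this sketch and not the formula from the paper: out of every MTBF request times of processing, a failure loses on average half a cookie interval of work, so goodput = 1 - cookieInterval / (2 * MTBF). The function names and the MTBF of 10 request times are hypothetical.

```python
def goodput(cookie_interval, mtbf):
    """Share of processed work that is not lost, under the simplified model."""
    return 1.0 - cookie_interval / (2.0 * mtbf)


def cookie_interval_for(target_goodput, mtbf):
    """Largest cookie interval (in request times) that reaches the target."""
    return 2.0 * mtbf * (1.0 - target_goodput)


mtbf = 10.0                                   # hypothetical MTBF, in request times
interval = cookie_interval_for(0.98, mtbf)    # cookie interval, in request times
cookies_per_request = 1.0 / interval
check = goodput(interval, mtbf)
```

With these numbers a goodput of 98% needs a cookie roughly every 0.4 request times, about 2.5 cookies per request. Under this model, each halving of the accepted loss doubles the cookie rate, which matches the slide's observation that overhead rises sharply for very high goodput targets.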


Conclusion and Outlook

Conclusion:
- RSerPool is the IETF's upcoming standard for service availability
- 3 basic server selection policies
- Failure detection mechanisms:
  - Session monitoring
  - Endpoint keep-alives
- Failover mechanism: Client-based state sharing

Future Work:
- From simulation to reality:
  - Tests with our prototype implementation in the PlanetLab
  - First results already available [KiVS2007]
- Security analysis and robustness against DoS attacks


Thank You for Your Attention!
Any Questions?

Visit Our Project Homepage: http://tdrwww.iem.uni-due.de/dreibholz/rserpool/

Thomas Dreibholz, dreibh@iem.uni-due.de

To be continued ...


The RSerPool Protocol Stack

Aggregate Server Access Protocol (ASAP):
- PR <-> PE: Registration, Deregistration and Monitoring by Home-PR (PR-H)
- PR <-> PU: Server Selection, Failure Reports

Endpoint Handlespace Redundancy Protocol (ENRP):
- PR <-> PR: Handlespace Synchronisation

ASAP is IETF's first Session Layer standard!


Motivation

Motivation of RSerPool:
- Unified, application-independent solution for service availability
- Not available before => Foundation of the IETF RSerPool Working Group

Application Scenarios for RSerPool:
- Main motivation: Telephone Signalling (SS7) over IP
- Under discussion by the IETF:
  - Load Balancing
  - Voice over IP (VoIP) with SIP
  - IP Flow Information Export (IPFIX)
- ... and many more!

Requirements for RSerPool:
- “Lightweight” (low resource requirements, e.g. embedded devices!)
- Real-Time (quick failover)
- Scalability (e.g. to large (corporate) networks)
- Extensibility (e.g. by new server selection rules)
- Simple (automatic configuration: “just turn on, and it works!”)