Scalable Management for Networks and Services


Rolf Stadler

Laboratory for Communication Networks, KTH Royal Institute of Technology

Stockholm

HP Laboratories, Palo Alto, March 31, 2003

The Shift of a Management Paradigm

Manager-Agent based management
• Centralized control
• Management protocols: SNMP, CMIP
• Program runs on the management station

• Decentralized control
• Program runs on network nodes

[Figure: on one side, a management station running the management program (P) as manager (M) toward agents (A) on the nodes; on the other side, a management station whose program P is downloaded to and executed on the nodes, which return results.]

Architecture for Pattern-based Management

[Figure: a management station with the management program and a code server; routers with an execution environment hosting agents (A) and programs (P); navigation between the nodes.]

Weaver—A Testbed for pattern-based Management

[Figure: testbed consisting of a management station, Routers A to D, WANs A to D, and a Fast Ethernet switch.]

Simple Navigation Patterns

• type 1: node-to-node
– typical application: node control/monitoring (get/set of variables)

• type 2: visit all nodes along a path/flow
– typical application: flow/path control, e.g. traceroute, bottleneck detection, signalling, VPN operation

• type 3: distribute agent to all nodes in subnet (parallel control)
– typical application: subnet control, e.g. topology detection

• type 4: visit all nodes in subnet (sequential control)
– typical application: subnet control, message broadcast, e.g. congestion location detection

[Figure: for each type, a small diagram of the navigation pattern with numbered steps.]

Echo Pattern

[Animation over successive slides: in the expansion phase, the explorers reach the nodes at distance droot = 1, 2, 3, 4 and 5 from the root; in the contraction phase, the echoes return through droot = 4, 3, 2 and 1 back to the root.]

The Echo Pattern

• Two phases of traversal
– expansion phase: explorers flood network with requests for local operations
– contraction phase: echoes return and aggregate results

• Properties
– Generates balanced traffic load
– Traffic load depends on network topology, not on speed of traversal
– Time complexity increases linearly with network diameter

Examples of Echo-based Management

• Get information on topology

– compute the current number of leaf nodes, the connectivity distribution

– discover current topology within 10 hops of node x

• Get information on network state

– identify 10 most congested links

– compute distribution of link utilization, queue lengths

– identify sub topologies with highly loaded links

– find a resource R closest to node x

Pattern-based Management—An Engineering Approach to Decentralized Management

• A management program consists of
– A navigation pattern (distributed graph traversal algorithm)
– An operation on nodes
– An aggregation function

• Relevance of this approach
– Provides a basis to analyze management operations for performance, scalability, robustness
– Supports concept of re-usable patterns, hides complexity
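The three-part decomposition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Weaver API: the class `ManagementProgram`, the stand-in pattern `bfs_tree`, the node names and the queue lengths are all hypothetical, and the traversal runs sequentially in one process rather than being distributed over nodes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Graph = Dict[str, List[str]]  # adjacency lists; node names are hypothetical


@dataclass
class ManagementProgram:
    """Illustrative decomposition: pattern + local operation + aggregation."""
    pattern: Callable[[Graph, str], Dict[str, List[str]]]  # builds a traversal tree
    operation: Callable[[str], int]                         # runs locally on a node
    aggregate: Callable[[int, int], int]                    # merges two partial results

    def run(self, graph: Graph, root: str) -> int:
        tree = self.pattern(graph, root)

        def collect(node: str) -> int:
            # Local result first, then fold in the results of the subtrees.
            result = self.operation(node)
            for child in tree[node]:
                result = self.aggregate(result, collect(child))
            return result

        return collect(root)


def bfs_tree(graph: Graph, root: str) -> Dict[str, List[str]]:
    """Stand-in navigation pattern: a breadth-first spanning tree."""
    tree: Dict[str, List[str]] = {root: []}
    frontier = [root]
    while frontier:
        next_frontier = []
        for node in frontier:
            for nb in graph[node]:
                if nb not in tree:
                    tree[nb] = []
                    tree[node].append(nb)
                    next_frontier.append(nb)
        frontier = next_frontier
    return tree


network = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
queue_len = {"A": 3, "B": 7, "C": 1, "D": 4}  # made-up local measurements

# The same pattern is reused with two different operation/aggregator pairs.
count_nodes = ManagementProgram(bfs_tree, lambda n: 1, lambda a, b: a + b)
max_queue = ManagementProgram(bfs_tree, lambda n: queue_len[n], max)

print(count_nodes.run(network, "A"))
print(max_queue.run(network, "A"))
```

Swapping only the operation and the aggregation function while keeping the pattern fixed is the kind of reuse the slide refers to.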

Composing Management Programs

[Figure: a management program is composed from three building blocks. Navigation patterns, including the echo patterns: Segall, Chang, Skip, Wait, Scope, Multi. Aggregators, including the echo aggregators: Res. Disc., Leaf Count, Load Hist., Conn. Hist. Local operations for node access: CLI, HTTP, XML, SNMP.]

Properties of Patterns

[Figure: the composition diagram of the previous slide, with the echo patterns grouped into Simple Echo, Robust Echo and Others.]

• A pattern can be used for many management operations.
• A pattern can be chosen according to performance objectives.
• A pattern hides the complexity of a distributed operation.
• Network failures can be handled within patterns.
• Code mobility can be controlled.

visited_i : boolean init false;
G_i : set of integers init neighbors();
parent_i : integer init -1;

Echo(inmsg: bytes, from: integer) {
    G_i := G_i - from;
    if visited_i = false {
        parent_i := from;
        visited_i := true;
        OnInitiate(inmsg, outmsg);
        if G_i != empty dispatch(G_i, outmsg, i);    // explorers to the remaining neighbors
    } else
        OnAggregate(inmsg);
    if G_i = empty {
        OnComplete(outmsg);
        if parent_i >= 0 dispatch(parent_i, outmsg, i);    // echo to the parent
        else OnTerminate(inmsg);
    }
}
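To make the two phases concrete, here is a minimal single-process simulation of the echo pseudocode above. It is a sketch under assumptions: messages are delivered from one FIFO queue, explorers are sent to all neighbors that have not yet been heard from, and the names `run_echo`, `local_op` and `aggregate` are hypothetical stand-ins for the OnInitiate/OnAggregate hooks.

```python
from collections import deque


def run_echo(graph, root, local_op, aggregate):
    """Simulate one echo operation; returns (result at root, messages per link)."""
    pending = {v: set(nbs) for v, nbs in graph.items()}  # G_i: neighbors still awaited
    visited = set()
    parent = {}
    partial = {}
    link_load = {}
    queue = deque([(None, root, "explorer", None)])  # invocation at the root
    result = None

    def dispatch(src, dst, kind, value):
        link = frozenset((src, dst))
        link_load[link] = link_load.get(link, 0) + 1
        queue.append((src, dst, kind, value))

    while queue:
        src, node, kind, value = queue.popleft()
        pending[node].discard(src)
        if node not in visited:
            # Expansion: first explorer to arrive defines the parent.
            visited.add(node)
            parent[node] = src
            partial[node] = local_op(node)
            for nb in sorted(pending[node]):
                dispatch(node, nb, "explorer", None)
        elif kind == "echo":
            # Contraction: merge the result returned by a child.
            partial[node] = aggregate(partial[node], value)
        if not pending[node]:
            # Heard from every neighbor: echo to the parent, or finish at the root.
            if parent[node] is not None:
                dispatch(node, parent[node], "echo", partial[node])
            else:
                result = partial[node]
    return result, link_load


# Hypothetical 2 x 3 grid network; nodes are (row, column) pairs.
rows, cols = 2, 3
grid = {(r, c): [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if 0 <= r + dr < rows and 0 <= c + dc < cols]
        for r in range(rows) for c in range(cols)}

count, load = run_echo(grid, (0, 0), lambda n: 1, lambda a, b: a + b)
print(count, sorted(load.values()))
```

In this simulation every link carries one message in each direction, so the per-link load is determined by the topology alone.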

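The Interface between Pattern and Aggregator

Handlers called by the pattern: OnBegin, OnInitiate, OnAggregate, OnComplete, OnTerminate

Example fragments of an aggregator for the average load:

    … av_load_i := av_load; …

    … av_load := load(); n := 1; …

    … av_load := (av_load*n + av_load_j)/(n+1); n := n+1; …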
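A possible Python rendering of such an aggregator is sketched below, assuming the three fragments above belong to OnInitiate (local measurement), OnAggregate (merging a child's value) and OnComplete (handing the result to the parent). The class and method names are hypothetical. Note one deliberate variation: this sketch passes (average, count) pairs so that each subtree is weighted by its node count, whereas the fragment on the slide adds each incoming value as a single sample.

```python
class AverageLoadAggregator:
    """Per-node aggregator state for an average-load operation (illustrative)."""

    def __init__(self, load_fn):
        self.load_fn = load_fn  # hypothetical probe returning the local load
        self.av_load = 0.0
        self.n = 0

    def on_initiate(self):
        # Local operation: start with this node's own measurement.
        self.av_load = self.load_fn()
        self.n = 1

    def on_aggregate(self, child_av, child_n):
        # Weighted merge of a child's subtree average into the running average.
        total = self.n + child_n
        self.av_load = (self.av_load * self.n + child_av * child_n) / total
        self.n = total

    def on_complete(self):
        # Value carried by the echo that is sent to the parent.
        return self.av_load, self.n


# Made-up loads: a parent (0.2), one leaf child (0.4), and one child whose
# subtree of two nodes already averaged 0.6.
agg = AverageLoadAggregator(lambda: 0.2)
agg.on_initiate()
agg.on_aggregate(0.4, 1)
agg.on_aggregate(0.6, 2)
av, nodes = agg.on_complete()
print(round(av, 3), nodes)
```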

SIMPSON: A SIMple Pattern Simulator fOr Large Networks

[Figure: Traffic vs. Time for a 221-node grid network. x-axis: Time (secs), 0 to 6; y-axis: Traffic (bytes), 0 to 1.2e+06; data series "trace1.txt".]

Analyzing Management Operations

[Figure: a network graph G=(V,E) with nodes A to F, and the execution graphs G'=(V',E') of two operations on it: centralized management (star pattern) and distributed management (echo pattern).]

Traffic Complexity of Management Operations

Amount of traffic placed on the network during execution.

$$C_{traffic} = \sum_{v' \in V'} \; \sum_{0 < k \le childcount(v')} hopcount(v', child_k(v')) \cdot (I_q + I_r)$$

$$C_{traffic}^{echo} = (I_q + I_r) \left( \frac{(E[degree(G)] - 2)\,|V|}{2} + 1 + \dots \right)$$

$$C_{traffic}^{star} = (I_q + I_r) \sum_{v' \in V'} hopcount(v'_{root}, v')$$
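As a worked illustration of the general sum and of the star case, the sketch below evaluates the traffic of an execution graph given as a parent-to-children map. It assumes that I_q and I_r are the sizes in bytes of a request and its reply, which the slide does not state, and the four-node line network and the helper names are hypothetical.

```python
from collections import deque


def hop_counts(graph, source):
    """Breadth-first hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def traffic(children, hopcount, i_q, i_r):
    """Sum hopcount(parent, child) * (I_q + I_r) over all parent-child pairs."""
    return sum(hopcount(parent, child) * (i_q + i_r)
               for parent, kids in children.items() for child in kids)


# Hypothetical line network A - B - C - D with 1 KB requests and replies.
line = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
hops = lambda p, c: hop_counts(line, p)[c]

# Star pattern: the root A contacts every other node directly.
star_exec = {"A": ["B", "C", "D"], "B": [], "C": [], "D": []}
# For comparison, an execution graph in which each node hands over to its neighbor.
chain_exec = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}

star_traffic = traffic(star_exec, hops, 1024, 1024)
chain_traffic = traffic(chain_exec, hops, 1024, 1024)
print(star_traffic, chain_traffic)
```

In the star case the sum reduces to (I_q + I_r) times the total hop distance from the root, which is why centralized polling of distant nodes is expensive.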

Time Complexity of Management Operations

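Time needed from invocation until completion of an operation.

$$C_{time} = C_{time}(v'_{root})$$

$$C_{time}(v') = \begin{cases} t_c + t_r & \text{if } childcount(v') = 0 \\ t_c + t_r + M(v') & \text{otherwise} \end{cases}$$

$$M(v') = \max_{1 \le k \le childcount(v')} \left[ k\,t_q + 2\,hopcount(v', child_k(v'))\,t_l + C_{time}(child_k(v')) \right]$$

$$C_{time}^{star} = O(|V|), \qquad C_{time}^{echo} = O(d)$$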
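The recursion can be evaluated directly on an execution graph. The sketch below reads it as: a leaf costs t_c + t_r, and an inner node additionally waits for its slowest child, where the k-th child costs k·t_q plus twice the hop count times t_l plus the child's own completion time. The function names and the parameter values (in arbitrary time units) are assumptions chosen only for illustration.

```python
def c_time(node, children, hopcount, t_c, t_r, t_q, t_l):
    """Completion time of the sub-operation rooted at node (recursive definition)."""
    kids = children[node]
    if not kids:
        return t_c + t_r
    slowest = max(
        k * t_q + 2 * hopcount(node, child) * t_l
        + c_time(child, children, hopcount, t_c, t_r, t_q, t_l)
        for k, child in enumerate(kids, start=1)
    )
    return t_c + t_r + slowest


def star(n):
    """Execution graph of a star pattern: the root contacts n leaves directly."""
    graph = {"root": [f"n{j}" for j in range(n)]}
    graph.update({f"n{j}": [] for j in range(n)})
    return graph


one_hop = lambda parent, child: 1  # every child assumed one hop away
params = dict(t_c=10, t_r=1, t_q=2, t_l=4)  # arbitrary units

star3 = c_time("root", star(3), one_hop, **params)
star6 = c_time("root", star(6), one_hop, **params)
chain = c_time("A", {"A": ["B"], "B": ["C"], "C": []}, one_hop, **params)
print(star3, star6, chain)
```

With the star execution graph the k·t_q term grows with the number of children, which is where the linear dependence on |V| comes from; along a chain the cost grows with the depth instead.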

Performing Echo-based Operations on the Entire Internet

• The purpose is to illustrate the scalability of echo-based operations.

• What we needed:

– Complexity analysis of pattern

– Estimation of Internet topological properties

• diameter

• connectivity distribution

• number of nodes

Estimated Performance of Echo-based Operation on the Internet

Assumptions:
• Process-level transmission time: 5 ms
• Network delay per hop: 4 ms
• Message size: 1 KB
• Local operation: 500 ms per execution
• Diameter of Internet: 34 hops

                         Echo Pattern           Star Pattern
Aggregated Traffic       2.25 x 10^11 bytes     1.31 x 10^12 bytes
Max Traffic on a Link    4,096 bytes            1.8 x 10^9 bytes
Completion Time          17.48 seconds          5.09 days

Weaver Active Node

[Figure: architecture of a Weaver active node. Components: Active Node Manager, Source Repository, Binaries Repository, Preprocessor, C++ Compiler, Transport Access Point, Execution Environment, Device Manager, Node State, Local Program States, Active Node Engine, Router, Management Station. Labeled flows: Source, State; Source code, Active node management; Source Code; Management commands; Management Operation Results; SNMP sets; SNMP gets/traps; Events; results.]

Suboperations in Weaver

[Figure: timeline of the suboperations between Node A and Node B, from start to end: Execution (T1), Serialization (T2), Dispatch (T3), communication delay TC1, Receiving (T4), Loading (T5) or Instantiation (T6), De-serialization (T7), Execution (T1), Serialization (T2), Dispatch (T3), communication delay TC2, Receiving (T4), Resolving (T8), De-serialization (T7), Execution (T1).]

Measuring Execution Times on Weaver

Suboperation                 Duration in ms        Performed by Module
Execution (T1)               1.57 (σ = 0.48)       Execution Environment
Serialization (T2)           3.46 (σ = 0.71)       Execution Environment
Dispatch (T3)                1.67 (σ = 0.49)       Transport Access Point
Receiving (T4)               0.62 (σ = 0.30)       Transport Access Point
Loading (T5)                 23.42 (σ = 0.70)      Execution Environment
Instantiation (T6)           0.77 (σ = 0.015)      Execution Environment
De-serialization (T7)        2.04 (σ = 0.49)       Execution Environment
Resolving (T8)               0.15 (σ = 0.001)      Execution Environment
Communications Delay (TC)    4.04 (σ = 0.10)       ---
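The measured means can be combined into a rough per-hop figure. The sketch below is only one plausible composition and is not taken from the slides: it assumes that moving a program one hop costs execution, serialization and dispatch on the sender, the communication delay, and receiving, loading or instantiation, and de-serialization on the receiver, with Loading (T5) applying when the code is not yet present on the node and Instantiation (T6) otherwise. The function name `hop_time` is hypothetical.

```python
# Mean durations in ms, copied from the table above.
T = {"T1": 1.57, "T2": 3.46, "T3": 1.67, "T4": 0.62,
     "T5": 23.42, "T6": 0.77, "T7": 2.04, "T8": 0.15, "TC": 4.04}


def hop_time(code_loaded: bool) -> float:
    """Estimated time in ms for one hop, under the assumptions stated above."""
    setup = T["T6"] if code_loaded else T["T5"]  # instantiate vs. load the code
    sender = T["T1"] + T["T2"] + T["T3"]
    receiver = T["T4"] + setup + T["T7"]
    return sender + T["TC"] + receiver


print(round(hop_time(code_loaded=True), 2))
print(round(hop_time(code_loaded=False), 2))
```

Under these assumptions the loading step dominates the first visit to a node, while later visits are bounded mostly by serialization and the communication delay.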

Estimating Execution Times of Echo-based Operations on Weaver

Designing Robust Patterns

Plain Echo

Echo(inmsg: bytes, from: integer) {
    G_i := G_i - from;
    if visited_i = false {
        parent_i := from;
        visited_i := true;
        OnInitiate(inmsg, outmsg);
        if G_i != empty dispatch(G_i, outmsg, i);
    } else
        OnAggregate(inmsg);
    if G_i = empty {
        OnComplete(outmsg);
        if parent_i >= 0 dispatch(parent_i, outmsg, i);
        else OnTerminate(inmsg);
    }
}

Skip Echo

SkipEcho(inmsg: bytes, from: integer) {
    if visited_i = false {
        parent_i := from;
        visited_i := true;
        OnInitiate(inmsg, outmsg, i);
        G_i := up_neighbors() - from;
        if G_i != empty dispatch(G_i, outmsg, i);
    } else {
        G_i := G_i - from;
        OnAggregate(inmsg);
    }
    if complete_i != true and G_i = empty {
        OnComplete(outmsg);
        complete_i := true;
        if parent_i >= 0 dispatch(parent_i, outmsg, i);
        else OnTerminate(inmsg);
    }
}

alarm(type: {failure, recovery}, affected: integer) {
    if visited_i = true {
        if type = failure {
            G_i := G_i - affected;    // skip the failed neighbor
            if complete_i != true and G_i = empty {
                complete_i := true;
                OnComplete(outmsg);
                if parent_i >= 0 dispatch(parent_i, outmsg, i);
                else OnTerminate(inmsg);
            }
        }
    }
}

Wait Echo

WaitEcho(inmsg: bytes, from: integer) {
    if visited_i = false {
        parent_i := from;
        visited_i := true;
        OnInitiate(inmsg, outmsg, i);
        G_i := up_neighbors() - from;
        if G_i != empty dispatch(G_i, outmsg, i);
    } else {
        G_i := G_i - from;
        OnAggregate(inmsg);
    }
    if complete_i != true and G_i = empty {
        OnComplete(outmsg);
        complete_i := true;
        if parent_i >= 0 dispatch(parent_i, outmsg, i);
        else OnTerminate(inmsg);
    }
}

alarm(type: {failure, recovery}, affected: integer) {
    if visited_i = true {
        if type = failure {
            G_i := G_i - affected;
            B_i := B_i + affected;    // remember the failed neighbor
            if complete_i != true and G_i = empty {
                complete_i := true;
                OnComplete(outmsg);
                if parent_i >= 0 dispatch(parent_i, outmsg, i);
                else OnTerminate(inmsg);
            }
        } else {
            if affected is in B_i {    // neighbor recovered: wait for it again
                B_i := B_i - affected;
                G_i := G_i + affected;
            }
        }
    }
}

Network Coverage vs. Execution Time for Skip Echo

[Figure: Coverage vs. Time for skip echo. y-axis: Coverage (nodes), 100 to 220; x-axis: Time (mins), 52.5 to 55.5. Curves for MTTF = 3.683, 7.367, 11.05, 14.733, 29.467 and 73.67 hrs. Further legend entries: MTTR = 1 min, MTTR = 11 min, "MTTR 0" and "MTTR inf".]

Current and Planned Work

• Self-organizing, adaptable Networks and Systems: Patterns for routing and dynamic construction of network control structures. (Constantin Adam)

• WQL: A table-based Network Query Language on Weaver. (Koon-Seng Lim)

• Policy-based Management: Patterns for distribution and dynamic re-computation of policies. (Alberto Gonzalez)

Literature on this Work

• K.S. Lim, R. Stadler: “Weaver—Realizing a scalable management paradigm on commodity routers,” Eighth IFIP/IEEE International Symposium on Integrated Network Management (IM 2003), Colorado Springs, Colorado, USA, March 24-28, 2003.

• K.S. Lim and R. Stadler: "Developing pattern-based management programs," IFIP/IEEE International Conference on Management of Multimedia Networks and Services (MMNS 2001), Chicago, IL, October 29 - November 1, 2001.

• K.S. Lim and R. Stadler: "A navigation pattern for scalable Internet management," IFIP/IEEE International Symposium on Integrated Network Management (IM 2001), Seattle, Washington, May 14-18, 2001.

• R. Kawamura and R. Stadler: "A middleware architecture for active distributed management of IP networks," IEEE/IFIP Network Operations and Management Symposium (NOMS 2000), Honolulu, Hawaii, April 10-14, 2000.
