Scalable Management for Networks and Services Rolf Stadler Laboratory for Communication Networks KTH Royal Institute of Technology Stockholm HP Laboratories, Palo Alto, March 31, 2003

Page 1: Scalable Management  for Networks and Services

Scalable Management for Networks and Services

Rolf Stadler

Laboratory for Communication Networks, KTH Royal Institute of Technology

Stockholm

HP Laboratories, Palo Alto, March 31, 2003

Page 2: Scalable Management  for Networks and Services

The Shift of a Management Paradigm

Manager-Agent based management:
• Centralized Control
• Management protocols: SNMP, CMIP
• Program runs on Management Station

After the shift:
• Decentralized Control
• Program runs on network nodes

[Figure labels: Management station, node, Manager (M), Agent (A), Management Program (P), download & execute, results]

Page 3: Scalable Management  for Networks and Services

Architecture for Pattern-based Management

[Figure labels: Management station, Management Program, Code Server, Router, Execution Environment, navigation]

Page 4: Scalable Management  for Networks and Services
Page 5: Scalable Management  for Networks and Services

Weaver—A Testbed for pattern-based Management

[Figure labels: Management Station; Fast Ethernet Switch; Router A, Router B, Router C, Router D; WAN A, WAN B, WAN C, WAN D]

Page 6: Scalable Management  for Networks and Services
Page 7: Scalable Management  for Networks and Services
Page 8: Scalable Management  for Networks and Services
Page 9: Scalable Management  for Networks and Services

Simple Navigation Patterns

• type 1: node-to-node
  – typical application: node control/monitor (get/set of variables)
• type 2: visit all nodes along a path/flow
  – typical application: flow/path control, e.g.: traceroute, bottleneck detection, signalling, VPN operation
• type 3: distribute agent to all nodes in subnet (parallel control)
  – typical application: subnet control, e.g.: topology detection
• type 4: visit all nodes in subnet (sequential control)
  – typical application: subnet control, message broadcast, e.g.: congestion location detection

[Figure column "navigation pattern": one numbered traversal diagram per type]

Page 10: Scalable Management  for Networks and Services

Echo Pattern (expansion and contraction)

[Figure sequence, pages 10 to 21: the expansion phase proceeds through droot=1, 2, 3, 4 (pages 11 to 14) and reaches droot=5 (page 15); the contraction phase then runs back through droot=4, 3, 2, 1 (pages 16 to 19) and finishes on pages 20 and 21.]

Page 22: Scalable Management  for Networks and Services

The Echo Pattern

• Two phases of traversal
  – expansion phase: explorers flood network with requests for local operations
  – contraction phase: echoes return and aggregate results

• Properties
  – Generates balanced traffic load
  – Traffic load depends on network topology, not on speed of traversal
  – Time complexity increases linearly with network diameter

Page 23: Scalable Management  for Networks and Services

Examples of Echo-based Management

• Get information on topology

– compute the current number of leaf nodes, the connectivity distribution

– discover current topology within 10 hops of node x

• Get information on network state

– identify 10 most congested links

– compute distribution of link utilization, queue lengths

– identify sub topologies with highly loaded links

– find a resource R closest to node x
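
A query such as "identify 10 most congested links" can be phrased as an aggregation function that merges partial results on the way back to the root. The following is a minimal Python sketch, assuming each node reports (utilization, link) pairs; the function name `merge_top_k` and the sample data are illustrative and not taken from Weaver.

```python
import heapq

def merge_top_k(partials, k=10):
    """Merge partial top-k lists of (utilization, link_id) pairs into one top-k list."""
    merged = [item for partial in partials for item in partial]
    return heapq.nlargest(k, merged)

# Hypothetical partial results from two children plus the local node.
child_a = [(0.91, "a-b"), (0.40, "a-c")]
child_b = [(0.75, "d-e")]
local = [(0.88, "x-a"), (0.10, "x-d")]

top2 = merge_top_k([child_a, child_b, local], k=2)
print(top2)  # [(0.91, 'a-b'), (0.88, 'x-a')]
```

Because merging partial top-k lists in stages gives the same answer as merging them all at once, each node only needs to forward k entries to its parent.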

Page 24: Scalable Management  for Networks and Services

Pattern-based Management—An Engineering Approach to Decentralized Management

• A management program consists of
  – A navigation pattern (distributed graph traversal algorithm)
  – An operation on nodes
  – An aggregation function

• Relevance of this approach
  – Provides a basis to analyze management operations for performance, scalability, robustness
  – Supports concept of re-usable patterns, hides complexity

Page 25: Scalable Management  for Networks and Services

Composing Management Programs

[Figure labels: building blocks of a Management Program]
• Navigation Patterns
  – Echo Patterns: Segall, Chang, Skip, Wait, Scope, Multi
• Aggregators
  – Echo Aggregators: Res. Disc., Leaf Count, Load. Hist., Conn. Hist.
• Local Operations / Node Access: CLI, HTTP, XML, SNMP

Page 26: Scalable Management  for Networks and Services

Properties of Patterns

[Figure labels: Management Program; Navigation Patterns; Echo Patterns (Simple Echo, Robust Echo, Others): Segall, Chang, Skip, Wait, Scope, Multi; Aggregators; Echo Aggregators: Res. Disc., Leaf Count, Load. Hist., Conn. Hist.; Node Access: CLI, HTTP, XML, SNMP]

• A pattern can be used for many management operations.
• A pattern can be chosen according to performance objectives.
• A pattern hides the complexity of a distributed operation.
• Network failures can be handled within patterns.
• Code mobility can be controlled.

Page 27: Scalable Management  for Networks and Services

The Interface between Pattern and Aggregator

Pattern (echo), state and message handler at node i:

    visited_i : boolean init false;
    G_i : set of integers init neighbors();
    parent_i : integer init -1;

    Echo(inmsg: bytes, from: integer) {
      G_i := G_i - from;
      if visited_i = false {
        parent_i := from;
        visited_i := true;
        OnInitiate(inmsg, outmsg);
        if G_i != empty
          dispatch(G_i, outmsg, i);       // explorers go to the remaining neighbors
      } else
        OnAggregate(inmsg);
      if G_i = empty {
        OnComplete(outmsg);
        if parent_i >= 0
          dispatch(parent_i, outmsg, i);  // echo goes to the parent
        else
          OnTerminate(inmsg);
      }
    }

Aggregator handlers called by the pattern: OnBegin, OnInitiate, OnAggregate, OnComplete, OnTerminate

Aggregator code fragments (average load):

    … av_load_i := av_load; …
    … av_load := load(); n := 1; …
    … av_load := (av_load*n + av_load_j)/(n+1); n := n+1; …
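
To make the split between pattern and aggregator concrete, here is a small Python simulation of the echo pattern driving an average-load aggregator. It is a sketch under stated assumptions: messages are delivered in FIFO order through a single queue, the names `AvgLoad` and `run_echo` are hypothetical, and the aggregator carries a (sum, count) pair rather than the running average shown in the fragments above, so that the merged average is exact.

```python
from collections import deque

class AvgLoad:
    """Aggregator (assumed design): keeps (sum, count) of the loads seen so far."""
    def __init__(self, load):
        self.total, self.n = load, 1           # role of OnInitiate: local operation
    def on_aggregate(self, msg):
        if msg is not None:                    # explorers carry no partial result
            self.total += msg[0]
            self.n += msg[1]
    def on_complete(self):
        return (self.total, self.n)            # partial result echoed to the parent

def run_echo(graph, loads, root):
    """Simulate the echo pattern; return (average load, number of messages sent)."""
    state = {}                                 # node -> (parent, pending set G, aggregator)
    queue = deque([(root, None, None)])        # (destination, sender, payload)
    sent, result = 0, None
    while queue:
        node, sender, payload = queue.popleft()
        if node not in state:                  # first visit: expansion phase
            pending = set(graph[node]) - {sender}
            state[node] = (sender, pending, AvgLoad(loads[node]))
            for nb in pending:                 # explorers to the remaining neighbors
                queue.append((nb, node, None))
                sent += 1
        else:                                  # later visit: explorer or echo arrives
            state[node][1].discard(sender)
            state[node][2].on_aggregate(payload)
        parent, pending, agg = state[node]
        if not pending:                        # contraction phase: all neighbors heard
            out = agg.on_complete()
            if parent is not None:
                queue.append((parent, node, out))
                sent += 1
            else:
                result = out[0] / out[1]       # role of OnTerminate at the root
    return result, sent

# Hypothetical 4-node network with 4 links and integer loads.
graph = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B"], "D": ["B"]}
loads = {"A": 10, "B": 20, "C": 30, "D": 40}
print(run_echo(graph, loads, "A"))  # (25.0, 8)
```

In this run every link carries exactly two messages (8 messages for 4 links), which illustrates why the traffic load of the echo pattern depends on the topology rather than on the order in which nodes are visited.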

Page 28: Scalable Management  for Networks and Services

SIMPSON: A SIMple Pattern Simulator fOr Large Networks

[Figure: Traffic vs Time for 221 node grid network. x-axis: Time (secs), 0 to 6; y-axis: Traffic (bytes), 0 to 1.2e+06; data series "trace1.txt".]

Page 29: Scalable Management  for Networks and Services

Analyzing Management Operations

Network Graph G=(V,E) and Execution Graphs G'=(V',E')

• Centralized Management: Star Pattern
• Distributed Management: Echo Pattern

[Figure: a network graph with nodes A to F, and the execution graphs of the star pattern and the echo pattern on that network]

Page 30: Scalable Management  for Networks and Services

Traffic Complexity of Management Operations

Amount of traffic placed on the network during execution.

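$$C_{traffic} = \sum_{v' \in V'} \; \sum_{0 < k \le childcount(v')} hopcount(v', child_k(v')) \, (I_q + I_r)$$

[Equation for $C_{traffic}^{echo}$ garbled in extraction; surviving terms: $(I_q + I_r)$, $E\,degree(G)$, $|V|$, a fraction with denominator 2, and $+1$.]

$$C_{traffic}^{star} = (I_q + I_r) \sum_{v' \in V'} hopcount(v'_{root}, v')$$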
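
As a worked illustration of the traffic measure, the sketch below evaluates it for a star pattern on a small network. It assumes the general form is a sum over execution-graph edges of hopcount times (Iq + Ir), and it treats Iq and Ir as request and reply sizes in bytes, which the slide does not define; the helper names and the sample network are hypothetical.

```python
from collections import deque

def hop_counts(graph, root):
    """Breadth-first search: hop distance from root to every reachable node."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def star_exec_edges(graph, root):
    """Star pattern: the root is the parent of every other node in the execution graph."""
    return [(root, v, d) for v, d in hop_counts(graph, root).items() if v != root]

def traffic(exec_edges, i_q, i_r):
    """Sum of hopcount * (Iq + Ir) over all (parent, child, hopcount) edges."""
    return sum(hops for _, _, hops in exec_edges) * (i_q + i_r)

# Hypothetical line network A - B - C - D with 1000-byte requests and replies.
line = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(traffic(star_exec_edges(line, "A"), 1000, 1000))  # 12000
```

The hop counts from A are 1, 2 and 3, so the star pattern places 6 * 2000 = 12000 bytes on the links of this network.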

Page 31: Scalable Management  for Networks and Services

Time Complexity of Management Operations

Time needed from invocation until completion of an operation.

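$$C_{time} = C_{time}(v'_{root})$$

$$C_{time}(v') = \begin{cases} t_c + t_r & \text{if } childcount(v') = 0 \\ t_c + t_r + M(v') & \text{otherwise} \end{cases}$$

$$M(v') = \max_{1 \le k \le childcount(v')} \left( k\,t_q + 2\,hopcount(v', child_k(v'))\,t_l + C_{time}(child_k(v')) \right)$$

$$C_{time}^{star} = O(|V|)$$

$$C_{time}^{echo} = O(d)$$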
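
The recursive definition of the completion time can be evaluated directly on an execution graph. The sketch below assumes M(v') is read as the maximum over children k of k*t_q + 2*hopcount(v', child_k)*t_l + C_time(child_k); the parameter values and the tree are made up for illustration, and the function name `c_time` is hypothetical.

```python
def c_time(tree, v, t_c, t_r, t_q, t_l):
    """Completion time of the sub-operation rooted at v.

    tree maps a node to an ordered list of (child, hopcount) pairs;
    nodes without an entry are leaves of the execution graph.
    """
    children = tree.get(v, [])
    if not children:
        return t_c + t_r
    slowest = max(
        k * t_q + 2 * hops * t_l + c_time(tree, child, t_c, t_r, t_q, t_l)
        for k, (child, hops) in enumerate(children, start=1)
    )
    return t_c + t_r + slowest

# Hypothetical execution graph: a binary tree of depth 2, one hop per edge.
tree = {
    "r": [("a", 1), ("b", 1)],
    "a": [("a1", 1), ("a2", 1)],
    "b": [("b1", 1), ("b2", 1)],
}
print(c_time(tree, "r", t_c=5, t_r=1, t_q=2, t_l=4))  # 42
```

Each leaf costs 6 time units, each inner node 6 + 18 = 24, and the root 6 + 36 = 42, so the total grows with the depth of the execution graph rather than with the number of nodes.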

Page 32: Scalable Management  for Networks and Services

Performing Echo-based Operations on the Entire Internet

• Purpose: to illustrate the scalability of echo-based operations.

• What we needed:

– Complexity analysis of pattern

– Estimation of Internet topological properties

• diameter

• connectivity distribution

• number of nodes

Page 33: Scalable Management  for Networks and Services

Estimated Performance of Echo-based Operation on the Internet

Assumptions:
• Process-level transmission time: 5 ms
• Network delay per hop: 4 ms
• Message size: 1 KB
• Local operation: 500 ms per execution
• Diameter of Internet: 34 hops

                        Echo Pattern          Star Pattern
Aggregated Traffic      2.25 x 10^11 bytes    1.31 x 10^12 bytes
Max Traffic on a Link   4,096 bytes           1.8 x 10^9 bytes
Completion Time         17.48 seconds         5.09 days

Page 34: Scalable Management  for Networks and Services

Weaver Active Node

[Figure labels. Components: Management Station, Active Node Manager, Source Repository, Binaries Repository, Preprocessor, C++ Compiler, Active Node Engine, Transport Access Point, Execution Environment, Device Manager, Node State, Local Program States, Management Operation Results, Router. Flows: Source Code; Source, State; Management commands; Source code, Active node management; SNMP sets; SNMP gets/traps; Events; results.]

Page 35: Scalable Management  for Networks and Services

Suboperations in Weaver

[Figure: time sequence of suboperations across Node A and Node B, with communication delays TC1 and TC2]

• start
• Execution (T1)
• Serialization (T2)
• Dispatch (T3)
• Receiving (T4)
• Loading (T5) or Instantiation (T6)
• De-serialization (T7)
• Execution (T1)
• Serialization (T2)
• Dispatch (T3)
• Receiving (T4), Resolving (T8)
• De-serialization (T7), Execution (T1)
• end

Page 36: Scalable Management  for Networks and Services

Measuring Execution Times on Weaver

Suboperation                 Duration in ms       Performed by Module
Execution (T1)               1.57 (σ = 0.48)      Execution Environment
Serialization (T2)           3.46 (σ = 0.71)      Execution Environment
Dispatch (T3)                1.67 (σ = 0.49)      Transport Access Point
Receiving (T4)               0.62 (σ = 0.30)      Transport Access Point
Loading (T5)                 23.42 (σ = 0.70)     Execution Environment
Instantiation (T6)           0.77 (σ = 0.015)     Execution Environment
De-serialization (T7)        2.04 (σ = 0.49)      Execution Environment
Resolving (T8)               0.15 (σ = 0.001)     Execution Environment
Communications Delay (TC)    4.04 (σ = 0.10)      ---
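
As a rough illustration of how these measurements combine, the sketch below adds up the mean durations for one exchange from Node A to Node B and back, following the order of suboperations on the previous slide. It assumes the suboperations run strictly in sequence and that mean values simply add; the function name `round_trip` is hypothetical and the result is an estimate, not a measured figure.

```python
# Mean durations in ms, copied from the measurement table.
T = {"T1": 1.57, "T2": 3.46, "T3": 1.67, "T4": 0.62, "T5": 23.42,
     "T6": 0.77, "T7": 2.04, "T8": 0.15, "TC": 4.04}

def round_trip(use_loading):
    """Estimated time in ms for one A -> B -> A exchange (assumed strictly sequential)."""
    setup = T["T5"] if use_loading else T["T6"]   # Loading (T5) or Instantiation (T6)
    outbound = T["T1"] + T["T2"] + T["T3"] + T["TC"] + T["T4"] + setup + T["T7"]
    back = (T["T1"] + T["T2"] + T["T3"] + T["TC"] + T["T4"]
            + T["T8"] + T["T7"] + T["T1"])
    return outbound + back

print(round(round_trip(use_loading=True), 2))   # 51.94
print(round(round_trip(use_loading=False), 2))  # 29.29
```

Under these assumptions Loading (T5) accounts for almost half of the exchange when it is needed, which is why it matters whether a node must load the program or can instantiate it.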

Page 37: Scalable Management  for Networks and Services

Estimating Execution Times of Echo-based Operations on Weaver

Page 38: Scalable Management  for Networks and Services

Designing Robust Patterns

Plain Echo

    Echo(inmsg: bytes, from: integer) {
      G_i := G_i - from;
      if visited_i = false {
        parent_i := from;
        visited_i := true;
        OnInitiate(inmsg, outmsg);
        if G_i != empty
          dispatch(G_i, outmsg, i);       // explorers go to the remaining neighbors
      } else
        OnAggregate(inmsg);
      if G_i = empty {
        OnComplete(outmsg);
        if parent_i >= 0
          dispatch(parent_i, outmsg, i);  // echo goes to the parent
        else
          OnTerminate(inmsg);
      }
    }

Skip Echo

    SkipEcho(inmsg: bytes, from: integer) {
      if visited_i = false {
        parent_i := from;
        visited_i := true;
        OnInitiate(inmsg, outmsg, i);
        G_i := up_neighbors() - from;
        if G_i != empty
          dispatch(G_i, outmsg, i);
      } else {
        G_i := G_i - from;
        OnAggregate(inmsg);
      }
      if complete_i != true and G_i = empty {
        OnComplete(outmsg);
        complete_i := true;
        if parent_i >= 0
          dispatch(parent_i, outmsg, i);
        else
          OnTerminate(inmsg);
      }
    }

    alarm(type: {failure, recovery}, affected: integer) {
      if visited_i = true {
        if type = failure {
          G_i := G_i - affected;          // stop waiting for the failed neighbor
          if complete_i != true and G_i = empty {
            complete_i := true;
            OnComplete(outmsg);
            if parent_i >= 0
              dispatch(parent_i, outmsg, i);
            else
              OnTerminate(inmsg);
          }
        }
      }
    }

Wait Echo

    WaitEcho(inmsg: bytes, from: integer) {
      if visited_i = false {
        parent_i := from;
        visited_i := true;
        OnInitiate(inmsg, outmsg, i);
        G_i := up_neighbors() - from;
        if G_i != empty
          dispatch(G_i, outmsg, i);
      } else {
        G_i := G_i - from;
        OnAggregate(inmsg);
      }
      if complete_i != true and G_i = empty {
        OnComplete(outmsg);
        complete_i := true;
        if parent_i >= 0
          dispatch(parent_i, outmsg, i);
        else
          OnTerminate(inmsg);
      }
    }

    alarm(type: {failure, recovery}, affected: integer) {
      if visited_i = true {
        if type = failure {
          G_i := G_i - affected;
          B_i := B_i + affected;          // remember the failed neighbor
          if complete_i != true and G_i = empty {
            complete_i := true;
            OnComplete(outmsg);
            if parent_i >= 0
              dispatch(parent_i, outmsg, i);
            else
              OnTerminate(inmsg);
          }
        } else {
          if affected is in B_i {
            B_i := B_i - affected;        // recovered neighbor is awaited again
            G_i := G_i + affected;
          }
        }
      }
    }
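
The effect of the failure alarm in Skip Echo can be tried out on a single node. The following Python sketch is a simplification under stated assumptions: the class name `SkipEchoNode` and the `send` callback are hypothetical, the aggregator callbacks are left out, and the neighbor set passed in stands in for up_neighbors().

```python
class SkipEchoNode:
    """State of one node running a skip-echo style pattern (failed neighbors are skipped)."""

    def __init__(self, node_id, neighbors, send):
        self.id = node_id
        self.neighbors = set(neighbors)   # stands in for up_neighbors()
        self.send = send                  # callback: send(destination, source)
        self.visited = False
        self.complete = False
        self.parent = None
        self.pending = set()              # the set G_i of neighbors still awaited

    def receive(self, sender):
        if not self.visited:              # first message: expansion
            self.visited, self.parent = True, sender
            self.pending = self.neighbors - {sender}
            for nb in sorted(self.pending):
                self.send(nb, self.id)    # explorers
        else:
            self.pending.discard(sender)
        self._maybe_complete()

    def alarm(self, kind, affected):
        if self.visited and kind == "failure":
            self.pending.discard(affected)   # stop waiting for the failed neighbor
            self._maybe_complete()

    def _maybe_complete(self):
        if not self.complete and not self.pending:
            self.complete = True
            if self.parent is not None:
                self.send(self.parent, self.id)   # echo to the parent

sent = []
node = SkipEchoNode(1, [0, 2, 3], lambda dst, src: sent.append((dst, src)))
node.receive(0)              # explorer from parent 0
node.receive(2)              # reply from neighbor 2
node.alarm("failure", 3)     # neighbor 3 fails before answering
print(sent)  # [(2, 1), (3, 1), (0, 1)]
```

Without the alarm the node would wait for neighbor 3 indefinitely; with it, the echo to the parent is sent as soon as the failure is reported, and the `complete` flag prevents a second echo.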

Page 39: Scalable Management  for Networks and Services

Network Coverage vs. Execution Time for Skip Echo

[Figure: Coverage (nodes), 100 to 220, vs. Time (mins), 52.5 to 55.5, for skipecho. Legend: MTTF = 3.683, 7.367, 11.05, 14.733, 29.467 and 73.67 hrs; MTTR = 1 min, MTTR = 11 min, MTTR 0, MTTR inf.]

Page 40: Scalable Management  for Networks and Services

Current and Planned Work

• Self-organizing, adaptable Networks and Systems: Patterns for routing and dynamic construction of network control structures. (Constantin Adam)

• WQL: A table-based Network Query Language on Weaver. (Koon-Seng Lim)

• Policy-based Management: Patterns for distribution and dynamic re-computation of policies. (Alberto Gonzalez)

Page 41: Scalable Management  for Networks and Services

Literature on this Work

• K.S. Lim, R. Stadler: “Weaver—Realizing a scalable management paradigm on commodity routers,” Eighth IFIP/IEEE International Symposium on Integrated Network Management (IM 2003), Colorado Springs, Colorado, USA, March 24-28, 2003.

• K.S. Lim and R. Stadler: "Developing pattern-based management programs," IFIP/IEEE International Conference on Management of Multimedia Networks and Services (MMNS 2001), Chicago, IL, October 29 - November 1, 2001.

• K.S. Lim and R. Stadler: "A navigation pattern for scalable Internet management," IFIP/IEEE International Symposium on Integrated Network Management (IM 2001), Seattle, Washington, May 14-18, 2001.

• R. Kawamura and R. Stadler: "A middleware architecture for active distributed management of IP networks," IEEE/IFIP Network Operations and Management Symposium (NOMS 2000), Honolulu, Hawaii, April 10-14, 2000.