Supermon: High Speed, Scalable Cluster Monitoring
Matt Sottile (matt@lanl.gov)
Los Alamos National Laboratory, Advanced Computing Laboratory
Scalable Systems Software Meeting, 21-22 February 2002
February 21, 2002, M. Sottile / LANL-ACL



Outline

1. Introduction
2. Architecture
3. Performance
4. Applications and Future Work
5. Conclusion

Introduction

Monitoring: The act of observing a system via a set of sensors.

• Hardware vs Software monitoring: Tradeoffs in perturbation, system complexity, and cost.
• Reactive vs Periodic monitoring: Monitoring in special circumstances vs. continuous sampling at fixed intervals.

The emphasis in supermon is balancing high sampling rates with minimal perturbation of applications.

Perturbation (or, not monitoring the monitor)

Given a variable X that we sample:

$X_{obs} = X_{actual} + X_{error}$

$X_{error}$ is introduced by the monitoring software. Let $\epsilon \ll X_{actual}$ be the maximum tolerable error. If I is a measure of the intrusiveness of the sensor, and $X_{error}$ is a function of I, then we want:

$|X_{error}(I)| < \epsilon$

Since I is related to the sampling rate, the goal with supermon is to allow the maximal sampling rate s.t. the error is still less than $\epsilon$.

This drove most (all?) of the architecture design decisions.
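The goal stated above (the highest sampling rate whose error stays below ε) can be sketched in a few lines. This is only an illustration under stated assumptions: the linear error model, its `cost_per_sample` constant, and the candidate rates are hypothetical and do not come from the slides.

```python
def max_rate_within_tolerance(candidate_rates_hz, error_at_rate, epsilon):
    """Return the highest candidate rate whose |error| stays below epsilon, or None."""
    acceptable = [r for r in candidate_rates_hz if abs(error_at_rate(r)) < epsilon]
    return max(acceptable) if acceptable else None


def linear_error(rate_hz, cost_per_sample=0.002):
    """Hypothetical perturbation model: the error grows linearly with the rate."""
    return cost_per_sample * rate_hz


# Candidate rates and epsilon are made-up values for illustration only.
best = max_rate_within_tolerance([10, 50, 100, 500, 1000], linear_error, epsilon=0.5)
print(best)
```

With a real sensor, `error_at_rate` would have to be measured rather than modeled; the selection step stays the same.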

Supermon architecture

Supermon is broken down into four distinct components:

1. A loadable kernel module providing data
2. The "mon" single node data server
3. The "supermon" data concentrator
4. Clients

Symbolic expressions form the basis of the protocol binding each of the four layers.

[Figure 1: Architecture illustration. Nodes 1 through n each run a mon server on top of /proc; a supermon gathers the mon streams and serves a client.]

Symbolic expressions and supermon

S-expressions were introduced in the 1950s with LISP.

sexpr ::= ( element )
element ::= atom etail | sexpr etail
etail ::= element | sexpr

Complex data is easy to encode in s-expressions, as is meta-data.

Example: A tree

(nodeA (nodeB (nodeC) ()) (nodeD))
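To show how little a client needs in order to read this format, here is a minimal s-expression reader that turns the tree example into nested Python lists. It is a sketch, not the parser supermon ships; the function names are made up for this example, and the node names follow one reading of the slide text.

```python
def tokenize(text):
    """Split an s-expression string into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one s-expression from the front of the token list (consumed in place)."""
    if not tokens:
        raise ValueError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise ValueError("missing ')'")
        tokens.pop(0)  # drop the closing ")"
        return items
    if token == ")":
        raise ValueError("unexpected ')'")
    return token  # an atom


tree = parse(tokenize("(nodeA (nodeB (nodeC) ()) (nodeD))"))
print(tree)
```

Lists map directly onto parenthesized groups, so the empty child `()` comes back as an empty list.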

Loadable Kernel Modules and /proc

The loadable kernel module under Linux provides two additional entries in /proc for supermon data to be retrieved by clients:

• /proc/sys/supermon/#: This contains a description of the machine.
• /proc/sys/supermon/S: This contains the data reflecting the state of the machine at the time it is read.

/proc/sys/supermon/#

(cpuinfo (nr 4) (user nice system))
(avenrun (nr 1) (avenrun0 avenrun1 avenrun2))
(paging (nr 1) (pgpgin pgpgout pswpin pswpout))
(switch (nr 1) (switch))
(time (nr 1) (timestamp jiffies))
(netinfo (nr 6) (name rxbytes rxpackets rxerrs rxdrop rxfifo rxframe
                 rxcompressed rxmulticast txbytes txpackets txerrs txdrop
                 txfifo txcolls txcarrier txcompressed))

* nr indicates the "arity" of the variables for the given category.

/proc/sys/supermon/S

(cpuinfo (user 44042292 88026439 64765636 87093318)
         (nice 17193985 11936486 17778150 19162835)
         (system 0 0 0 0))
(avenrun (avenrun0 102) (avenrun1 373) (avenrun2 354))
(paging (pgpgin 174688564) (pgpgout 192768264) (pswpin 15) (pswpout 408))
(switch (switch 768964692))
(time (timestamp 0xebdf04f8b7) (jiffies 0x3b6b37ef))
(netinfo (name lo eth0 eth1 eth2 myri0 myri1)
         (rxbytes 1425399699 58153310270 5313147923421234062 0)
         (rxpackets 50841748924076 0235733640 5055710)
         (rxerrs 0 0 0 0 0 0)
         (rxdrop 0 0 0 0 0 0)
         (rxfifo 0 0 0 0 0 0)
         ...)
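A client could turn one category of the S output into a lookup table as sketched below. The helper names are hypothetical; the sample strings are the paging and time categories from the listing above. Values are read with `int(v, 0)` so that both decimal and `0x` hexadecimal fields are accepted, which is an assumption about how a client would want to treat them.

```python
def parse_sexpr(text):
    """Parse a single s-expression string into nested lists of atoms."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        if tokens[pos] == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        return tokens[pos], pos + 1

    result, _ = read(0)
    return result


def category_to_dict(expr):
    """Turn [category, [var, v1, v2, ...], ...] into (category, {var: [v1, v2, ...]})."""
    name, *variables = expr
    return name, {var[0]: [int(v, 0) for v in var[1:]] for var in variables}


paging = "(paging (pgpgin 174688564) (pgpgout 192768264) (pswpin 15) (pswpout 408))"
name, values = category_to_dict(parse_sexpr(paging))
print(name, values["pswpout"])

clock = "(time (timestamp 0xebdf04f8b7) (jiffies 0x3b6b37ef))"
_, time_values = category_to_dict(parse_sexpr(clock))
print(time_values["jiffies"])
```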

mon: Providing single-node data via TCP

• Mon is a simple server that runs on a node and serves data from /proc to clients via TCP sockets.
• The data format between mon and clients is a slightly modified version of the /proc format containing structure necessary for scalability and composition.
• Mon maintains per-client filters (using bitmasks) to allow clients to request subsets of the available data.

(cpuinfo (node 0x12344556) (mask 0x1) (nm (1 (0x12344556))) ...)
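The per-client bitmask filter can be pictured as follows. The bit assignment is an assumption made for this sketch (one bit per category, in the order of the # listing); the slides only show `(mask 0x1)` next to cpuinfo and do not give the full mapping. The sample values are taken from the S listing.

```python
# Hypothetical bit assignment, one bit per category. The real mapping
# used by mon is not given in the slides.
CATEGORY_BITS = {
    "cpuinfo": 0x01,
    "avenrun": 0x02,
    "paging": 0x04,
    "switch": 0x08,
    "time": 0x10,
    "netinfo": 0x20,
}


def select_categories(mask):
    """Return the category names whose bit is set in the client's mask."""
    return [name for name, bit in CATEGORY_BITS.items() if mask & bit]


def apply_filter(sample, mask):
    """Keep only the categories of a sample that the client asked for."""
    wanted = set(select_categories(mask))
    return {name: data for name, data in sample.items() if name in wanted}


sample = {
    "avenrun": {"avenrun0": 102, "avenrun1": 373, "avenrun2": 354},
    "switch": {"switch": 768964692},
}
print(select_categories(0x01))
print(apply_filter(sample, 0x02))
```

A mask keeps the filter state per client down to a single integer, which is cheap to store and to test on every sample.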

supermon: The data concentrator

• Supermon connects to multiple mon servers and combines their data streams into a single stream for clients.
• Asynchronous socket code improves performance by avoiding the bad effects of slow or dead clients.
• "Speaks" the same protocol to both clients and mon servers, allowing supermon servers to speak to other supermon servers.
• Currently a major focal point of research for improving maximum sampling rates: supermon topologies, data filtering and reduction, etc...
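The idea of combining several streams without letting a slow or dead peer stall the rest can be sketched with Python's asyncio. This is not how supermon itself is written; `fetch` is a stand-in for a TCP read, and the node names, delays, and payloads are invented for the example.

```python
import asyncio


async def fetch(name, delay, payload):
    """Stand-in for reading one sample from a mon server over TCP."""
    await asyncio.sleep(delay)
    return f"({name} {payload})"


async def gather_samples(sources, timeout):
    """Query all sources concurrently and skip any that miss the deadline."""
    tasks = [asyncio.create_task(fetch(*src)) for src in sources]
    done, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:
        task.cancel()  # give up on slow or dead sources
    # Keep the original source order among the tasks that finished.
    parts = [t.result() for t in tasks if t in done]
    return "(" + " ".join(parts) + ")"


# Illustrative sources: (name, simulated delay in seconds, payload).
sources = [
    ("node1", 0.01, "(load 1)"),
    ("node2", 5.0, "(load 2)"),  # simulated slow or dead node
    ("node3", 0.01, "(load 3)"),
]
combined = asyncio.run(gather_samples(sources, timeout=0.5))
print(combined)
```

Because the combined result is itself an s-expression, the same reader works on the output of one node or of a concentrator, which is what lets concentrators be stacked.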

Clients

• Clients are simple: as long as they can understand s-expressions, they can use the data for any purpose.
• Clients determine the sampling rate: if a client does not send a request, then mon, supermon, and /proc are idle and sleep.
• Meta-data (# command) is available at any layer of supermon, so a client can use many data sources with no modification.

Later we give example client applications...

Performance of supermon

We measure the performance of supermon as the number of samples we can achieve in one second: how many Hz can we sample at?

We look at the performance of supermon at multiple levels.

1. /proc to a client.
2. /proc to a mon to a client.
3. /proc to a mon to a supermon to a client.

For information on the testbed used for measuring supermon performance, please refer to "Life with Ed", HPDC'02.

[Figure 2: A comparison of the layers: supermon, mon, /proc. Plot titled "Comparing /proc, mon, and supermon performance"; axes: samples and time (seconds); series: /proc, mon, supermon.]

Flashback: The old way to sample single node data

The following plot simply shows that our new technique is much faster than the old method used to gather data from a single node.

* The old way to sample is equivalent to reading once directly from the /proc entry provided by the supermon kernel module.

[Figure 3: Sampling with the old (rstat) vs new (/proc) methods. Plot titled "Comparing rstat and proc performance"; axes: samples and time (seconds); series: rstat, /proc.]

Supermon performance: Some numbers...

The peak sampling rates we observe at each layer:

• /proc: 3500 Hz
• mon: 1400 Hz
• supermon: 750 Hz

We improved over older methods also:

• /proc vs RPC.RStatd: 3500 Hz / 275 Hz yields more than a 12x improvement

This is good: we can sample through three layers (/proc to supermon) faster than the original rstat could at the lowest layer!

Note that each measurement involved samples containing ALL possible data, which is a larger data set than rstat provided.
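The ratios behind these numbers are easy to check. The short calculation below uses only the rates quoted on this slide; the variable names are chosen for the example.

```python
# Peak sampling rates quoted on the slide, in Hz.
peak_hz = {"/proc": 3500, "mon": 1400, "supermon": 750}
rstat_hz = 275

# Speedup of each layer relative to rstat, and time per sample in milliseconds.
speedup = {layer: hz / rstat_hz for layer, hz in peak_hz.items()}
ms_per_sample = {layer: 1000.0 / hz for layer, hz in peak_hz.items()}

for layer in peak_hz:
    print(f"{layer}: {speedup[layer]:.1f}x rstat, {ms_per_sample[layer]:.2f} ms per sample")
```

Even the full three-layer path comes out at roughly 2.7 times the rstat rate, which is the point the slide makes.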

Performance continued...

But who wants to monitor just one node?!

Nodes   Topology         Sampling Rate
5                        400 Hz
10                       225 Hz
20                       125 Hz
100     Flat             66 Hz
100     10-node fanout   57 Hz
100     50-node fanout   35 Hz

Table 1: Scaling results for Supermon.
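One way to read Table 1 is to multiply the node count by the sampling rate, giving the number of per-node samples delivered each second. This derived figure is not in the slides; it assumes that every sample at N nodes carries data from all N nodes. The topology for the 5, 10, and 20 node rows is not stated in the table and is left unspecified here.

```python
# (nodes, topology, sampling rate in Hz), copied from Table 1.
rows = [
    (5, "not stated", 400),
    (10, "not stated", 225),
    (20, "not stated", 125),
    (100, "flat", 66),
    (100, "10-node fanout", 57),
    (100, "50-node fanout", 35),
]

# Derived metric (an assumption, see above): per-node samples per second.
aggregate = [(nodes, topology, nodes * rate) for nodes, topology, rate in rows]

for nodes, topology, total in aggregate:
    print(f"{nodes:>3} nodes ({topology}): {total} node-samples/s")
```

Under that assumption the per-client rate falls as nodes are added, while the total volume of data collected per second rises.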

Applications of supermon

• Failure prediction and intelligent application reaction
• Algorithm visualization
• Performance analysis
• Rapid identification of failed components
• Improved system state presentation via /proc
• Scheduling tools (bproc in progress)

Future work

1. Debug, debug, debug...
2. Integrate hardware sensors (temperature, fan speeds, voltages)
3. Write clients for users who don't want to write their own
4. Integrate application level monitoring data (such as TAU)
5. Data analysis techniques: a non-trivial problem, particularly in "realtime".

Conclusion

• Supermon is fast.
• Even through multiple levels of filtering and network transport, Supermon is faster than existing monitoring tools.
• S-expressions are significantly better than custom protocols if we want general tools.
• Supermon has already revealed features in cluster computing that are very interesting (MPI behavior, actual benefit(?) of hierarchy, limits of Linux).

For more information...

Contact info: supermon@lanl.gov
Web page: http://www.acl.lanl.gov/supermon/

For performance and architectural details, a copy of the paper "Supermon: Cluster Monitoring as if Performance Mattered" (submitted to ICS'02) can be provided upon request. Also, the current version of the code is only available by request.