ENABLING AND (IN MY HUMBLE OPINION) ESSENTIAL TECHNOLOGIES FOR DEPENDABILITY BENCHMARKING

William H. Sanders

University of Illinois at Urbana-Champaign

[email protected]

39th IFIP Working Group Meeting, March 1, 2001

http://www.crhc.uiuc.edu/PERFORM

Project funded in part by the NSF Next Generation Software Program, the NSF Information Technology Research Program, and the Motorola Center for High-Availability System Validation




Dependability Benchmarking (from Henrique Madeira)

Dependability benchmark (a working definition)

A test (or set of tests) to assess measures related to the behavior of a computer system in the presence of faults (e.g., failure modes, error detection coverage, error latency, diagnosis efficiency, recovery time, recovery losses), supporting the evaluation of dependability attributes (reliability, availability, safety).

[Figure: block diagram. Labels: Workload; Faultload; System under test; Measurements; Other parameters; Models; Dependability Attributes; Direct Dependability Benchmark Measures.]


Another View and Justification for a Combined Approach (adapted from slide by J. Arlat)

[Figure: flow diagram. Inputs: Specification of Proper Service; Specification of Faults/Errors to Inject; Specification of Activity Demanded of System; System Under Test.]

• Fault Injection of Prototype: Conditional Fault-Tolerance Measures. Examples: Coverage, Time to Recovery

• Modeling: Unconditional Dependability Measures. Examples: Availability, Reliability, Performability


What Technologies are Applicable?

Possible Benchmarking Technologies

• Modeling

– Simulation (Fault Injection on Simulated System): Continuous State or Discrete Event (state); Sequential or Parallel

– Analysis/Numerical: Deterministic or Non-Deterministic; Probabilistic or Non-Probabilistic; State-space-based or Non-State-space-based (Combinatorial)

• Measurement

– Passive (no fault injection)

– Active (Fault Injection on Prototype): Without Contact or With Contact; Hardware-Implemented or Software-Implemented; Stand-alone Systems or Networks/Distributed Systems


Comments (Claims)

• If reliability, availability, and safety are to be evaluated, combined measurement (including fault injection) and modeling techniques must be employed.

• Many measurement- and modeling-based techniques exist, but work still remains to make them applicable to large-scale, distributed systems.

• Effective techniques themselves are not sufficient: common and agreed-upon ways to apply these technologies (which I call test cases) must be defined.

– System-neutral representations of workloads and faultloads are needed

– Methods must be developed to inject these workloads and faultloads that can be applied across multiple systems

– System-neutral model generation methods must be developed to translate conditional into unconditional measures

• Development of dependability benchmarks is extremely difficult!
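
The translation from conditional to unconditional measures mentioned above can be illustrated with a very small model. The sketch below is an assumption for illustration only, not a method from the talk: it takes a coverage value and a recovery rate (the kind of conditional measures fault injection produces), combines them with a hypothetical fault rate and manual repair rate, and solves a three-state Markov model (up, recovering, down) for steady-state availability. The function name, the units, and the modeling choice that the system is unavailable during recovery are all hypothetical.

```python
def steady_state_availability(fault_rate, coverage, recovery_rate, repair_rate):
    """Availability of a three-state Markov model (up, recovering, down).

    All rates are per hour (hypothetical units). Faults arrive at `fault_rate`
    while the system is up; a fraction `coverage` of them is handled by
    automatic recovery at `recovery_rate`, and the rest need manual repair at
    `repair_rate`. The system is assumed unavailable in both non-up states.
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must lie in [0, 1]")
    # Balance equations give each non-up probability relative to the up state.
    recovering = fault_rate * coverage / recovery_rate
    down = fault_rate * (1.0 - coverage) / repair_rate
    # Normalize so the three state probabilities sum to one.
    return 1.0 / (1.0 + recovering + down)


if __name__ == "__main__":
    # Hypothetical numbers: coverage and recovery rate would come from fault
    # injection; fault rate and repair rate from field data or assumptions.
    a = steady_state_availability(fault_rate=1e-3, coverage=0.95,
                                  recovery_rate=3600.0, repair_rate=0.5)
    print(f"availability = {a:.6f}")
```

Even in this toy model the unconditional measure depends on parameters that no single experiment on a prototype provides, which is one way to read the claim that measurement and modeling must be combined.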


Current Projects on Fault Injection and Modeling (www.crhc.uiuc.edu/PERFORM)

Loki - Fault injection based on a measure-driven partial global view of system state, time, and previously injected faults

Distributed system fault injector that permits fault injections and measure collections based on a partial global view of system state, yielding statistically sound estimates of distributed system dependability

Möbius - Multi-faceted performance/dependability validation framework

Infrastructure for building domain-independent and domain-specific performance/dependability analysis tools that support multiple model specification, composition, connection, and solution methods


Loki: Experimental Validation of Distributed Systems

[Figure: nodes N1 to N4 on LAN1 and LAN2, each pairing a Loki Runtime with a System Under Test, with communication via probes.]

• Targeted injection of faults in particular system states, defined by the state of multiple system components

• Statistically significant interpretation of experiment results, yielding coverage, recovery times, and performance-related measures


Loki Concepts, Architecture and Data Flow

[Figure: architecture and data flow. Each node on LAN1 and LAN2 runs a Loki Runtime alongside the System Under Test. Runtime components: Probe, Inject Fault, State Machine, Notifications, Recorder, State Machine Transport, Fault Parser. Data flow: Local Timelines, then Offline Clock Synchronization (uses timestamp data collected before and after the experiment), then Single Global Timeline, then Measure Analysis, then Measures.]
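
The offline clock synchronization step can be pictured as fitting, for each node, a mapping from its local clock to a reference clock using the timestamp data gathered before and after the experiment. The sketch below is a simplification under stated assumptions and is not Loki's algorithm: it assumes a linear clock model (constant offset and drift), produces a point estimate only, and uses hypothetical names (`fit_clock`, `to_global`) and hypothetical sample data.

```python
from statistics import mean


def fit_clock(before, after):
    """Fit reference = offset + drift * local from two synchronization phases.

    `before` and `after` are lists of (local_time, reference_time) pairs
    collected before and after the experiment (hypothetical data layout).
    """
    l0 = mean(local for local, _ in before)
    r0 = mean(ref for _, ref in before)
    l1 = mean(local for local, _ in after)
    r1 = mean(ref for _, ref in after)
    if l1 == l0:
        raise ValueError("the two phases must be separated in time")
    drift = (r1 - r0) / (l1 - l0)   # slope between the two phase averages
    offset = r0 - drift * l0        # intercept of the fitted line
    return offset, drift


def to_global(local_time, offset, drift):
    """Place a locally timestamped event on the reference (global) timeline."""
    return offset + drift * local_time


if __name__ == "__main__":
    # Hypothetical samples: the local clock starts 10 s behind and runs slow.
    offset, drift = fit_clock(before=[(0.0, 10.0)], after=[(100.0, 110.01)])
    print(to_global(50.0, offset, drift))
```

A point estimate like this hides the uncertainty in each timestamp; a tool that must decide whether an injection happened in the intended global state would more plausibly carry lower and upper bounds for every event.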


Loki State Machine Specification

[Figure: State Machine for Each Node. States: INIT, ELECT, FOLLOW, LEAD, CRASH, EXIT. Transition labels: INIT_DONE, FOLLOWER, LEADER, CRASH, ERROR, EXIT.]

Notify Lists for each of the State Machines

STATE    sylvester             persian                 heathcliff
INIT     heathcliff            heathcliff              sylvester
ELECT    persian, heathcliff   sylvester, heathcliff   persian, sylvester
FOLLOW   -                     -                       -
LEAD     -                     -                       -
CRASH    -                     -                       -
EXIT     persian, heathcliff   sylvester, heathcliff   persian, sylvester


Loki Fault Specification

Node name    Fault name   Boolean expression                                        Freq.
sylvester    sfault1      ((sylvester:INIT)&(heathcliff:INIT))                      once
sylvester    sfault2      ((sylvester:ELECT)&(heathcliff:ELECT)&(persian:ELECT))    once
heathcliff   hfault1      ((sylvester:INIT)&(heathcliff:INIT)&(persian:INIT))       once
heathcliff   hfault2      ((sylvester:ELECT)&(heathcliff:ELECT))                    once
persian      pfault1      ((sylvester:ELECT)&(heathcliff:ELECT))                    once
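
To make the fault specification concrete, the following sketch evaluates expressions of the form shown above against a node's partial view of global state. It is an illustration under assumptions, not Loki's parser: it handles only conjunctions of (node:STATE) terms joined by &, which is all the examples above use, and it interprets "once" as "inject at most once per experiment". The names `FaultSpec`, `parse_conjunction`, and `should_inject` are hypothetical.

```python
import re
from dataclasses import dataclass

TERM = re.compile(r"\((\w+):(\w+)\)")


def parse_conjunction(expr):
    """Return [(node, state), ...] for an expression such as ((a:X)&(b:Y))."""
    terms = TERM.findall(expr)
    # Anything left after removing terms, '&', and grouping parentheses is
    # syntax this sketch does not understand.
    leftover = re.sub(r"[&()\s]", "", TERM.sub("", expr))
    if not terms or leftover:
        raise ValueError(f"unsupported expression: {expr!r}")
    return terms


@dataclass
class FaultSpec:
    node: str          # node on which the fault is injected
    name: str          # fault name, e.g. "sfault1"
    expr: str          # Boolean expression over (node:STATE) terms
    once: bool = True  # assumed meaning of the "once" frequency
    fired: bool = False

    def should_inject(self, view):
        """`view` maps node name to its last known state (partial global view)."""
        if self.once and self.fired:
            return False
        holds = all(view.get(n) == s for n, s in parse_conjunction(self.expr))
        if holds:
            self.fired = True
        return holds


if __name__ == "__main__":
    sfault1 = FaultSpec("sylvester", "sfault1",
                        "((sylvester:INIT)&(heathcliff:INIT))")
    view = {"sylvester": "INIT"}           # heathcliff's state not yet known
    print(sfault1.should_inject(view))     # expression does not hold yet
    view["heathcliff"] = "INIT"            # notification arrives
    print(sfault1.should_inject(view))     # expression holds, fault fires
    print(sfault1.should_inject(view))     # already fired once
```

The view is only as fresh as the last notification received, so a decision made locally may turn out to be wrong when checked against the global timeline after the experiment.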


Output from Global Event Timeline Calculation

persian 0 l INIT_LEAF 48186103765 0.000000
persian 0 h INIT_LEAF 48186103842 0.000077
sylvester 0 l INIT_LEAF 48186246754 0.142989
sylvester 0 h INIT_LEAF 48186246782 0.143017
heathcliff 0 l INIT_LEAF 48186795431 0.691666
heathcliff 0 h INIT_LEAF 48186795431 0.691666
heathcliff 0 l ELECT 48421412646 235.308881
heathcliff 0 h ELECT 48421412646 235.308881
persian 0 l ELECT 48656277005 470.173240
persian 0 h ELECT 48656277500 470.173735
sylvester 0 l ELECT 48656278777 470.175012
sylvester 0 h ELECT 48656279316 470.175551
sylvester 1 l sfault2 48658583337 472.479572
sylvester 1 h sfault2 48658583878 472.480113
heathcliff 0 l FOLLOW 48659144990 473.041225
heathcliff 0 h FOLLOW 48659144990 473.041225
heathcliff 0 l EXIT 48659155725 473.051960
heathcliff 0 h EXIT 48659155725 473.051960
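
One use of such a timeline is to check, after the experiment, whether a fault was injected while its Boolean expression actually held globally. The sketch below does this for an excerpt of the output above. The column interpretation is an assumption inferred from the data, not documented in the slides: node, event kind (0 for a state change, 1 for a fault injection), bound (l for lower, h for upper), event name, raw timestamp, and seconds since the first event. All function and class names are hypothetical.

```python
from dataclasses import dataclass

# Excerpt of the global timeline (ELECT, sfault2, and FOLLOW records).
TIMELINE = """\
heathcliff 0 l ELECT 48421412646 235.308881
heathcliff 0 h ELECT 48421412646 235.308881
persian 0 l ELECT 48656277005 470.173240
persian 0 h ELECT 48656277500 470.173735
sylvester 0 l ELECT 48656278777 470.175012
sylvester 0 h ELECT 48656279316 470.175551
sylvester 1 l sfault2 48658583337 472.479572
sylvester 1 h sfault2 48658583878 472.480113
heathcliff 0 l FOLLOW 48659144990 473.041225
heathcliff 0 h FOLLOW 48659144990 473.041225
"""


@dataclass
class Event:
    node: str
    is_fault: bool
    name: str
    t_low: float   # lower bound on global time, in seconds
    t_high: float  # upper bound on global time, in seconds


def parse_timeline(text):
    """Pair each 'l' record with the 'h' record that follows it."""
    rows = [line.split() for line in text.strip().splitlines()]
    events = []
    for lo, hi in zip(rows[0::2], rows[1::2]):
        if (lo[0], lo[1], lo[3]) != (hi[0], hi[1], hi[3]) or (lo[2], hi[2]) != ("l", "h"):
            raise ValueError("records are not in l/h pairs")
        events.append(Event(lo[0], lo[1] == "1", lo[3], float(lo[5]), float(hi[5])))
    return events


def state_during(events, node, t_low, t_high):
    """State that `node` certainly occupied throughout [t_low, t_high], else None."""
    changes = sorted((e for e in events if e.node == node and not e.is_fault),
                     key=lambda e: e.t_low)
    for i, entry in enumerate(changes):
        nxt = changes[i + 1] if i + 1 < len(changes) else None
        entered = entry.t_high <= t_low            # entered before the interval
        stayed = nxt is None or nxt.t_low >= t_high  # left after it, if at all
        if entered and stayed:
            return entry.name
    return None


def injection_was_correct(events, fault_name, required):
    """`required` maps node to the state the fault expression demands."""
    fault = next(e for e in events if e.is_fault and e.name == fault_name)
    return all(state_during(events, n, fault.t_low, fault.t_high) == s
               for n, s in required.items())


if __name__ == "__main__":
    events = parse_timeline(TIMELINE)
    required = {"sylvester": "ELECT", "heathcliff": "ELECT", "persian": "ELECT"}
    print(injection_was_correct(events, "sfault2", required))
```

The check is deliberately conservative: an injection counts as correct only if every required state is guaranteed by the bounds, so overlapping uncertainty intervals lead to rejection rather than a guess.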


Results

• Correct Fault Injection Probability as a Function of the Time Spent in the ELECT State (Standard Linux Kernel - 10 ms Timeslice)

• Correct Fault Injection Probability as a Function of the Time Spent in the ELECT State (Linux Kernel - 1 ms Timeslice)

[Plots: for each kernel, frequency of injections and probability of proper injection versus time in ELECT state (ms); the x-axis ranges shown are 10 to 30 ms and 0 to 20 ms.]


Loki Graphical Interface


Möbius Project Research Goal

• Development of tools to predict the performance, dependability, and performability of distributed computing/communication systems

– Such systems are complex combinations of:

    • Computing hardware

    • Networks

    • Operating systems

    • Software

(Note: the goal is not to prove logical system properties, although this may be possible within the framework)

• We believe such tools can be realized by:

– Developing a framework/tool that supports multiple modeling formalisms, at multiple levels of detail and abstraction, and multiple model solution methods

– Developing new model representation and solution methods (within the framework, and implemented in the tool) that scale well with increasing system complexity


Integrated Modeling Frameworks are Needed!

• No single formalism is best for representing all parts of a distributed computing/communication system

– Computer hardware, networks, protocols, and applications each call for a different representation

– Even within a “class” of application, different industry segments use very different ways of representing a particular design

• No single solution method is adequate to solve all models

– Discrete-event simulation is efficient in many cases, but is extremely slow in others, e.g., when significant but rare events (such as faults and buffer overflows) matter, or when system complexity is extreme

• Research in new modeling methods and tools is significantly hampered by the close link between model specification and model solution methods, and the closed nature of existing tools
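
The remark about rare events can be quantified with a standard sample-size argument, given here as a sketch rather than anything taken from the talk. If each independent replication observes the rare event with probability p, the normal-approximation confidence interval for the estimate has relative half-width z * sqrt((1 - p) / (n * p)), so the number of replications n grows roughly as 1/p. The function name and example values are hypothetical.

```python
import math


def replications_needed(p, rel_half_width, z=1.96):
    """Replications needed to estimate probability `p` by crude simulation.

    Uses the normal approximation: the confidence interval half-width is
    z * sqrt(p * (1 - p) / n), and we require it to be at most
    rel_half_width * p. z = 1.96 corresponds to roughly 95% confidence.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if rel_half_width <= 0.0:
        raise ValueError("rel_half_width must be positive")
    return math.ceil(z * z * (1.0 - p) / (p * rel_half_width ** 2))


if __name__ == "__main__":
    # Hypothetical event probabilities, each estimated to within 10%.
    for p in (1e-2, 1e-4, 1e-6):
        print(p, replications_needed(p, rel_half_width=0.10))
```

Each hundredfold drop in event probability costs about a hundred times more replications, which is why analytic/numerical solution or specialized rare-event simulation techniques become attractive for dependability measures.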


Modelers Need Heterogeneous Models

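[Figure: a Computer System made up of Hardware, Network, Application, and OS. Aspects to be modeled: Fault Description, Components, Protocol, Traffic, Control/Data Flow, Resource Contention. Candidate formalisms: VHDL, Fault Tree, LOTOS, Estelle, Queuing Model, Block Diagram Language, Stochastic Petri Nets, SANs. A "?" marks the open question of how these combine.]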


Möbius Framework

• Model: An abstract representation of some system

• Formalism: A modeling language

• Framework: A “language” in which modeling languages may be expressed

[Figure: several Formalisms, each with its Model, connect through the Möbius Framework to several Solvers.]


The Möbius Framework ...

• Expresses most existing modeling languages (except some simulation languages)

• Retains the ability for efficient solution

• Facilitates heterogeneous modeling

• Is a vehicle for researching new model composition, connection, reward specification, and solution methods


Möbius Framework Components

[Figure: framework components. Model types: Atomic Model, Composed Model, Connected Model, Solvable Model, Solved Model. Other elements: State Variables, Actions, Properties, Reward Variables, Model Composition, Model Connection, Möbius Execution Policy, Flexible Execution Policy, Well-Specified Checker, Solver, Results.]


Abstract Functional Interface Facilitates Interaction of Models and Solution Engines

• The abstract functional interface (AFI) allows models to affect each other and be acted on by solvers without understanding model semantics

• A project manager maintains consistency when constructing new models, performance/dependability variables, and studies from existing models

[Figure: Model Specification, Composition, Reward Definition, and Connection Formalisms on one side; Simulation, State-Space Generation, and Analytic/Numerical Solvers on the other. The two sides meet through Abstract Functional Interface Interactions (AFI) and through Model and Solver Editor Interactions with the Project Manager.]
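
The idea of a solver acting on a model without understanding its semantics can be sketched with an abstract base class. This is an illustration only: the interface below (`state`, `enabled_actions`, `fire`) is hypothetical and is not the Möbius AFI, and the example model and simulator are deliberately minimal. The simulator assumes exponentially distributed action delays and sees nothing but the three interface methods.

```python
import random
from abc import ABC, abstractmethod


class Model(ABC):
    """Hypothetical functional interface: solvers use only these methods."""

    @abstractmethod
    def state(self):
        """Return the current values of the state variables."""

    @abstractmethod
    def enabled_actions(self):
        """Return a list of (action_name, exponential_rate) pairs."""

    @abstractmethod
    def fire(self, action_name):
        """Apply the state change associated with one action."""


class RepairableUnit(Model):
    """One concrete formalism-specific model: a unit that fails and is repaired."""

    def __init__(self, fail_rate, repair_rate):
        self.fail_rate, self.repair_rate = fail_rate, repair_rate
        self.up = True

    def state(self):
        return {"up": self.up}

    def enabled_actions(self):
        return [("fail", self.fail_rate)] if self.up else [("repair", self.repair_rate)]

    def fire(self, action_name):
        self.up = (action_name == "repair")


def simulate(model, horizon, reward, rng):
    """Time-averaged reward over [0, horizon] for any object implementing Model."""
    t, accumulated = 0.0, 0.0
    while t < horizon:
        actions = model.enabled_actions()
        total_rate = sum(rate for _, rate in actions)
        if total_rate <= 0.0:  # absorbing state: reward accrues until the end
            accumulated += reward(model.state()) * (horizon - t)
            break
        dwell = min(rng.expovariate(total_rate), horizon - t)
        accumulated += reward(model.state()) * dwell
        t += dwell
        if t < horizon:
            names = [name for name, _ in actions]
            rates = [rate for _, rate in actions]
            # The next action is chosen with probability proportional to its rate.
            model.fire(rng.choices(names, weights=rates)[0])
    return accumulated / horizon


if __name__ == "__main__":
    unit = RepairableUnit(fail_rate=1.0, repair_rate=9.0)
    availability = simulate(unit, horizon=5000.0,
                            reward=lambda s: 1.0 if s["up"] else 0.0,
                            rng=random.Random(1))
    print(f"estimated availability = {availability:.3f}")
```

Because `simulate` never inspects the model's internals, a model written in a different formalism, or a composition of several, could be passed in unchanged as long as it implements the same three methods.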


Möbius Tool Architecture

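[Figure: tool architecture. Editors: Model Editor, Composer, PV Editor, Study Editor, Model Connector (with Model Editor Interaction). Their outputs: Submodel Object Code, Composed Model Object Code, PV Object Code, Study Editor Object Code, Model Conn. Object Code. These are combined with the Formalism Libraries and Solver Libraries by the Linker to produce the Model Solution.]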


Graphical User Interfaces


For More Information

www.crhc.uiuc.edu/PERFORM

Made possible by the dedicated work of:

Amy Christenson, Ramesh Chandra, Graham Clark, Tod Courtney, Michel Cukier, Salam Derisavi, David Daly, Dan Deavours, Jay Doyle, Kaustaubh Joshi, G. P. Kavanaugh, Ryan Lefever, John Sowder, Aaron Stillman, Patrick Webster, and Alex Williamson