Transcript
Page 1: Reliability and State Machines in an Advanced Network Testbed

Reliability and State Machines in an Advanced

Network Testbed

Mac Newbold

School of ComputingUniversity of UtahMS Thesis Defense

April 5, 2004Advisor: Prof. Jay Lepreau

Page 2: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 2

Distributed Systems• Distributed Systems are complex

– Many components– Distributed across multiple systems

• Component failures are relatively common– But should not cause system breakdown

• “A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.” – Leslie Lamport, quoted in CACM, June 1992

Page 3: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 3

Our Context: Emulab• Emulab is an advanced network testbed• Complex time- and space-shared system• System dynamically reconfigures nodes and

network links to create “experiments”• Key architectural feature: Central Database

– System uses DB for storage, communication

• Complex system with many different scripts and programs on clients and servers

Page 4: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 4

Emulab Background

• First prototype in April 2000 (10 nodes)• In production since Oct. 2000 (40 nodes)• Early versions weren’t perfect

– Reliability problems– Experiments of limited size– Inefficient use of resources

• Problem is becoming harder– 200 nodes, 400 remote, 2000 virtual nodes

Page 5: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 5

Four Key Challenges

Emulab requirements:• Reliability• Scalability• Performance and Efficiency• Generality and Portability

Page 6: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 6

1. Reliability

• Complex systems are hard to make reliable

• Many sources of unreliability:– Hardware – commodity PCs as nodes– Software – misconfiguration, bugs, etc.– Humans – can interrupt at any time

• More complexity and more parts mean higher chance that something is broken at any given time

Page 7: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 7

2. Scalability

• Almost everything a testbed provides is harder to provide at a larger scale

• Larger scale requires more resources• If throughput doesn’t increase, things

slow down at larger scale• Increased load adversely affects

reliability• Practical scalability limited by

reliability, performance and efficiency

Page 8: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 8

3. Performance and Efficiency

• One direct requirement on performance:– Emulab is used in an interactive style

• System tasks must complete in “a few minutes”

• Indirect requirements:– Scalability requirement places high

demands on many system components– Maximize efficient resource utilization

• As many users/experiments as possible in the shortest time with the fewest resources

Page 9: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 9

4. Generality and Portability

• Workload Generality– Wide variety of research, teaching,

development, and testing activities– Good support for minimally and non-

instrumented client OS’s and devices

• Model generality– Evolving system– New types of network devices

• Portable and non-intrusive client software

Page 10: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 10

Summary of Challenges

• Three challenges are closely related– Fourth is a constraint more than a

challenge

• Reliability is key• Failure rates directly impact scalability,

performance, and efficiency• Generality and portability requirements

constrain any solution used to address the challenges

Page 11: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 11

Thesis Statement

Enhancing Emulab with a flexible framework

based on state machines provides better monitoring and control, and improves thereliability, scalability, performance, efficiency, and generality of the system.

Page 12: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 12

Outline

• Introduction, Background, and Challenges

• Thesis Statement• State Machines in Emulab

– Interactions, Control Models, stated– Node Boot State Machines

• Results• Related and Future Work• Conclusion

Page 13: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 13

State Machines

• Also called Finite State Machines (FSMs), Finite State Automata (FSAs), or Statecharts in UML

• Well-known model• Simple• Explicit model• Rich and flexible• Easy to understand and visualize

Page 14: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 14

Example State Machine• Three main parts:• States• Transitions

– Directional

• Events– Associated with

transitions – labels

• Stored in database– diagrams generated

automatically from DB

Page 15: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 15

State Machines in Emulab

• Each state machine has a “type”– Currently three: node boot, node allocation,

and experiment status

• Multiple machines allowed within a type– Only in one state in one machine of a type

• States can have “timeouts” with actions– Timer starts when state is entered

• State “triggers” – Entry Actions

Page 16: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 16

Direct Interaction

• Within a type, can take a transition from a state in one machine (or “mode”) to a state in another machine of that type– Known as “mode transition” in Emulab– Similar to hierarchical state machines

• Highlights similarities/symmetries– Most machines are variations of another

• Improved code reuse

Page 17: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 17

Direct Interaction Example

Page 18: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 18

Models of State Machine Control

• Centralized monitoring & control: stated – State changes submitted, checked for

correctness, applicable actions performed– Daemon tracks timeouts– Used for Node Boot state machines

• Distributed management of state machine– No central service enforcing correctness– No dependency on central service– Timeouts harder to implement

Page 19: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 19

The stated State Daemon

• Listens for events continuously• State transitions cause database

updates• Invalid transitions cause notifications• Timeouts, timeout actions, triggers

configurable in DB for each state• Caching – only writer of node boot states• Modular design – dispatch events to

proper action handlers

Page 20: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 20

Node Boot State Machines

• Nodes in Emulab self-configure• Monitored via state machines – stated

• “Normal” node boot machine• Variations – “Minimal” • Reloading node disks

Page 21: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 21

Node Self-Configuration• Nodes send state events during booting to

allow progress to be monitored• “Global knowledge” inside state daemon

– Better decisions about recovery steps

• Finer granularity gives more information for recovery, allows for shorter timeouts

• Each OS image is associated with a “mode” (state machine) that describes its behavior

Page 22: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 22

“Normal” Node Boot Machine

• Start in SHUTDOWN• DHCP, start OS booting• When Emulab-specific

configuration begins, enter TBSETUP

• ISUP when finished• In case of failure, can

retry from SHUTDOWN

Page 23: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 23

Variations of Node Boot

• Example: MINIMAL• For OS images with

little or no special Emulab support

• ISUP generated by stated if necessary– Immediate or ping

• SilentReboot allowed in this mode

Page 24: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 24

Reloading Node Disks

• Mode transition into RELOAD / SHUTDOWN

• RELOADDONE transitions into mode for newly-installed OS image

Page 25: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 25

Reloading and Mode Transitions

Page 26: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 26

Experiment Status State Machine

• Uses distributed model – Stored in database, but

not strictly enforced

• Documents life-cycle• Restricts user

interruption– Reduces a source of errors

• Can queue, activate, modify, restart, swap, or terminate an expt.

Page 27: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 27

Node Allocation State Machine

• Distributed control model• Diagram documents the

way the states are used by the program, but not currently enforced

• Either reloads nodes with a custom image, or reboots them as members of the experiment

Page 28: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 28

Results: Context

• Emulab in production 1 year before state machines were added

• In production 3 years since first stated– 650 users, 150 projects, 75 institutions

• 19 papers, top venues

– Over 155,000 nodes allocated in nearly 10,000 experiment instances

– 13 classes at 10 universities

• Emulab SW on 6 more testbeds, 4 planned

Page 29: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 29

Results

• Anecdotal: (others in thesis) – Reliability/Performance: Preventing race

conditions– Generality: Graceful handling of custom

OS images– Generality: New node types

• Experiment:– Reliability/Scalability: Improved timeout

mechanisms

Page 30: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 30

Reliability/Performance: Preventing Race Conditions

• Expt. ends, nodes move to holding expt., get reloaded, then freed while they boot

• Problem: Getting allocated while booting • Node appears unresponsive, gets forcefully

power cycled, corrupts FS on disk• Solution: don’t free immediately• Add trigger on next ISUP for a node that

finishes booting, that frees it when booted

Page 31: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 31

Generality: Graceful Handling

of Custom OS Images• Users create custom OS images• Emulab client software is optional• Problem: Nodes don’t send state events• Solution: “Minimal” state machine• SHUTDOWN: maybe on server, optional• BOOTING: server side, trigger checks ISUP• ISUP: either node sends, or generated

when pingable, or generated immediately

Page 32: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 32

Generality: New Node Types

• Emulab is always growing and changing• State machine model and our framework

are flexible to provide graceful evolution• We’ve added 5 new node types

– IXPs, wide-area, PlanetLab, vnodes, sim-nodes

• Mostly used existing machines• 2 new machines, slight variations• 1 change to stated to add a new trigger

Page 33: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 33

Reliability/Scalability: Improved Timeout

Mechanisms• Before: reboot node, wait for it to boot– Static, 7 minute timeout

• Pragmatic – minimizes false positives/negatives• Avg. 4 min., but max. error-free boot is 15 min.• 11 minute delay is too long

• Improved: state machine monitoring– Fine-grained, context-sensitive timeouts– Faster error detection– Better monitoring and control

Page 34: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 34

Reliability/Scalability:Improved Timeout

Mechanisms• Experiment: Measure expt. swap-in time, with and without the improvements– Synthetic but plausible scenario

• One node, loads an RPM (8 min. install)• Node reboots, timeout during RPM install

– Reboots again, timeout again, mark node dead

– Try twice per swap-in, 3 swap-in attempts– Total failure in 45 min., 3 nodes “dead”

Page 35: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 35

Reliability/Scalability:Improved Timeout

Mechanisms• With state machines:– Timeouts: SHUTDOWN 2 min, BOOTING 3

min, TBSETUP 10 min

• Node reboots, enters BOOTING• 1 minute: Enters TBSETUP• 9 minutes: Enters ISUP, expt. ready• Succeeds, with no dead nodes or

retries• Cut time from 45 min. to 9 min. (80%)

Page 36: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 36

Limitations and Issues

• stated is critical infrastructure– Another single point of failure– More system complexity, new bugs,

complicated debugging– Potential for scaling problems (none seen

yet)

• Simple heuristics for error detection– Send mail for invalid transitions

Page 37: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 37

Summary of Results• Explicit model requires careful thought

– Improves design and implementation

• Visualization makes it easier to understand• Faster and more accurate error detection• Better reliability helps scalability/efficiency

– Bigger expts. possible, less overhead per expt.

• Flexibility for evolution, workload generality

Page 38: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 38

Related Work

• “Standard” Finite State Automata – basics– Timed Automata – have global clock

• Message Sequence Charts (MSCs)– “Scenarios” – hierarchy, like

modes/machines

• UML Statecharts– States have entry actions – “triggers”– Hierarchical states – similar to modes– Can model Emulab’s timeouts

Page 39: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 39

Future Work

• Further developing distributed control– Add monitoring, timeouts, triggers

• Better heuristics for error detection– Only flag clustered or related errors

• Implement more ideas from other systems– UML’s exit actions, guarded transitions, etc.

• Move code into database – i.e. triggers– Easier to modify, framework code vs. machine

Page 40: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 40

Conclusion

Enhancing Emulab with a flexible framework

based on state machines provides better monitoring and control, and improves thereliability, scalability, performance, efficiency, and generality of the system.

Page 41: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 41

Bonus Slides

Page 42: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 42

Demonstrating Improvement• Currently: programs have their own retry

and timeout mechanisms for node reboots– No knowledge of progress, just completion– Can cause failures by forcing a reboot, which

can damage file systems on node disk• “New Way”: stated handles timeout and

retry during rebooting– Implemented, not installed– Knows if progress is being made– Programs simply wait for ISUP or failure event

Page 43: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 43

Demonstrating Improvement (cont’d)

• These failures directly hurt reliability– Node failure can cause experiment setup

to fail

• Significant impact on scaling, performance, efficiency

• Maximum experiment size is limited by node failure rate– Failures make things take longer– A slower system means less efficient use

of resources

Page 44: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 44

Demonstrating Improvement (cont’d)

• Compare current vs. new: failure rate, time to completion, etc.

• Test data; one of:– Historical experiments– Artificially high load– Fault injection, e.g. reboots

• Why new way should help:– Better knowledge for intelligent recovery

• Know when to wait longer and when to retry

– Shorter timeouts allow for early error detection

Page 45: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 45

Future Work:Modeling Indirect Interaction

• Occur between machines of different types– Due to external relationships between the

entities tracked by each type of machine• Examples:

– Same entity may be tracked in two different types of machine• Nodes are in Boot and Allocation machines

– Other relationship between entities• Nodes may be “owned” or allocated to an

experiment – links Expt. Status and Node machines

Page 46: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 46

Question and Answer

Page 47: Reliability and State Machines in an Advanced Network Testbed

April 5th, 2004 Mac Newbold - MS Thesis Defense 47

Conclusion

Enhancing Emulab with a flexible framework

based on state machines provides better monitoring and control, and improves thereliability, scalability, performance, efficiency, and generality of the system.


Recommended