Testing Safety Critical Systems: Theory and Experiences

[email protected]

http://www.slideshare.net/Jaap_van_Ekris/

2017-03-10, VU Amsterdam



My Job

Your life’s goal will be to stay out of the newspapers

Gerard Duin (KEMA)

My Projects

Agenda

• The Goal

• The requirements

• The challenge

• Go with the process flow

– Development Process

– System design

– Testing Techniques

• Trends

• Reality


Specifications…

• Specifications are extremely detailed

• Sometimes up to 20 binders

• After years, you still find contradictions

Extreme bughunters?

Goals of testing safety critical systems

• Verify contractually agreed functionality

• Verify correct functional safety-behaviour

• Verify safety-behaviour during degraded and failure conditions

THE REQUIREMENTS

What is so different about safety critical systems?

Some people live on the edge…

How would you feel if you were getting ready to launch and knew you were

sitting on top of two million parts

-- all built by the lowest bidder on a government contract.

John Glenn

Actually, we all do…

We might have become overprotective…

The public is mostly unaware of risk…

Until it is too late…

• February 1st 1953

• Spring tide and heavy winds broke dykes

• Killed 1,836 people and 30,000 animals

The battle against flood risk…

• Cost €2,500,000,000

• The largest moving structure on the planet

• Defends:

– 500 km² of land

– 80,000 people

• Partially controlled by software

Nothing is flawless, by design…

No matter how good the design is:

• Some scenarios will be missed

• Some scenarios are too expensive to prevent:

– Accept the risk

– Communicate it to stakeholders

When is software good enough?

• Dutch Law on storm surge barriers

• Equalizes risk of dying due to unnatural causes across the Netherlands

Risks have to be balanced…

Availability of the service vs. safety of the service

Oosterschelde Storm Surge Barrier

• Chance of:

– Failure to close: 10^-7 per usage

– Unexpected closure: 10^-4 per year

To put things in perspective…

• Having a drunk pilot: 10^-2 per flight

• Hurting yourself when using a chainsaw: 10^-3 per use

• Dating a supermodel: 10^-5 per lifetime

• Drowning in a bathtub: 10^-7 per lifetime

• Being hit by falling airplane parts: 10^-8 per lifetime

• Being killed by lightning: 10^-9 per lifetime

• Winning the lottery: 10^-10 per lifetime

• Your house being hit by a meteor: 10^-15 per lifetime

• Winning the lottery twice: 10^-20 per lifetime

Small chances do happen…

Risk balance does change over time...

9/11…

• Identified a fundamental (new) risk to ATC systems

• Changed the ATC system dramatically

• Doubled our safety-critical scenarios

Are software risks acceptable?

Software plays a significant role...

The industry statistics are against us…

• Capers Jones: at least 2 high-severity errors per 10 KLOC

• Industry consensus is that software will never be more reliable than:

– 10^-5 per usage

– 10^-9 per operating hour

THE CHALLENGE

Why is testing safety critical systems so hard?

The value of testing

Program testing can be used to show the presence of bugs, but never to show

their absence!

Edsger W. Dijkstra

Is just testing enough?

• A 64-bit input isn’t that uncommon

• 2^64 is roughly the global rice production of 1,000 years, measured in individual grains

• Fully testing all binary inputs of a simple 64-bit stimulus-response system once takes two centuries

Just testing isn’t enough!
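The arithmetic behind this claim can be sketched as follows; the rate of one test per clock cycle on a 3 GHz machine is an optimistic, illustrative assumption:

```python
# Back-of-the-envelope estimate: exhaustively testing a 64-bit input space,
# assuming (optimistically) one test per cycle on a 3 GHz machine.
inputs = 2 ** 64
rate = 3_000_000_000                       # tests per second (assumed)
seconds = inputs / rate
years = seconds / (60 * 60 * 24 * 365.25)
print(round(years))                        # roughly two centuries
```

Any realistic test harness is orders of magnitude slower than one test per cycle, so the real figure is far worse.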

THE SOFTWARE DEVELOPMENT PROCESS

Quality and reliability start at conception, not at testing…

IEC 61508: Safety Integrity Level and acceptable risk

DO-178B (Avionics) does the same

Level  Impact         Target reliability (indication)

A      Catastrophic   10^-9 per flight hour

B      Dangerous      10^-7 per flight hour

C      Major problem  10^-5 per flight hour

D      Small          10^-3 per flight hour

E      No effect      –

IEC 61508: A process for safety critical functions

SYSTEM DESIGN

What do safety critical systems look like and what are their most important drivers?

Design Principles

• Risk analysis drives design (decisions)

• Safety first (production later)

• Fail-to-safe

• There shall be no single source of (catastrophic) failure
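The "no single source of failure" principle rests on simple probability arithmetic over independent redundant channels. A minimal sketch, where the per-channel failure probability of 10^-3 is an illustrative assumption, not the barrier's real figure:

```python
# Effect of redundancy on failure probability, assuming independent channels.
p_channel = 1e-3           # one channel fails on demand (assumed)

p_single = p_channel       # single channel: any failure is a system failure
p_dual = p_channel ** 2    # 1-out-of-2 channels: both must fail together

print(p_single)            # 0.001
print(p_dual)              # roughly 1e-06
```

The multiplication only holds if the channels fail independently; common-cause failures (shared power, shared design errors) break that assumption, which is why such designs also rely on diversity.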

Simplicity is prerequisite for reliability

Edsger W. Dijkstra

A simple design of a storm surge barrier

Relay

(€10.00 apiece)

Water detector

(€17.50)

Design documentation

(Sponsored by Heineken)

Risk analysis

Relay failure
Chance: small
Cause: aging
Effect: catastrophic

Water detector fails
Chance: huge
Causes: rust, driftwood, seagulls (eating, shitting)
Effect: catastrophic

Measurement errors
Chance: colossal
Causes: waves, wind
Effect: false positive

Broken cable
Chance: medium
Causes: digging, seagulls
Effect: catastrophic

System Architecture

Risk analysis

Typical risks identified

• Components making the wrong decisions

• Power failure

• Hardware failure of PLCs/servers

• Network failure

• Ship hitting water sensors

• Human maintenance error


Risk ≠ system crash

• Understandability of the GUI

• Wrongful functional behaviour

• Data accuracy

• Lack of response speed

• Tolerance towards illogical inputs

• Resistance to hackers

Usability of a GUI is key to safety

Systems do misbehave...

Systems can be late…

Some systems defy control

• Second-order systems aren’t controllable

• They are found in the field a lot:

– Heating systems

– Overpressure systems in tunnels

Systems aren’t your only problem

StuurX: Component architecture design

Stuurx::Functionality, initial global design

[State diagram: Init → Wacht (wait), with transitions on the water level crossing 3 meters; once the level rises above 3 meters, Start_D sends the “Start” signal to the diesels, W_O_D waits for “Diesels ready”, and Sluit_? issues “Close Barrier”]

Stuurx::Functionality, final global design

Stuurx::Functionality, Wait_For_Diesels, detailed design
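The global design above can be read as a small state machine. A minimal sketch; the state names (Init, Wacht, Start_D, W_O_D, Sluit) follow the slides, while the event names and the fail-safe default are illustrative assumptions:

```python
# Sketch of the StuurX closing sequence as a state-transition table.
# State names follow the design; event names are assumptions.
TRANSITIONS = {
    ("Init", "initialized"):     "Wacht",    # wait for high water
    ("Wacht", "water_above_3m"): "Start_D",  # send "Start" signal to diesels
    ("Start_D", "start_sent"):   "W_O_D",    # wait on "Diesels ready"
    ("W_O_D", "diesels_ready"):  "Sluit",    # issue "Close Barrier"
}

def step(state, event):
    # Unknown (state, event) pairs leave the state unchanged: a fail-safe
    # default, since an unexpected event must never advance the sequence.
    return TRANSITIONS.get((state, event), state)

state = "Init"
for event in ["initialized", "water_above_3m", "start_sent", "diesels_ready"]:
    state = step(state, event)
print(state)  # Sluit
```

Making every unlisted transition a no-op is one way to express the fail-to-safe design principle from earlier slides.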

VERIFICATION

What is getting tested, and how?

Design completion...

An example of safety critical components

IEC 61508 SIL4: Required verification activities

Design Validation and Verification

• Peer reviews by:

– System architect

– 2nd designer

– Programmers

– Test manager of system testing

• Fault Tree Analysis / Failure Mode and Effect Analysis

• Performance modeling

• Static verification / dynamic simulation (by Twente University)
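Fault Tree Analysis ultimately reduces to combining basic-event probabilities through AND and OR gates. A minimal sketch of that arithmetic; every probability below is a made-up illustration, not one of the project's real figures:

```python
# Minimal fault-tree arithmetic. An AND-gate needs all inputs to fail;
# an OR-gate fails if any input fails. All probabilities are assumptions.
def and_gate(*ps):
    out = 1.0
    for p in ps:
        out *= p          # all inputs must fail simultaneously
    return out

def or_gate(*ps):
    ok = 1.0
    for p in ps:
        ok *= (1.0 - p)   # probability that this input does NOT fail
    return 1.0 - ok

p_sensor = 1e-3   # water detector fails (assumed)
p_cable = 1e-4    # broken cable (assumed)
p_relay = 1e-5    # relay fails (assumed)

p_chain = or_gate(p_sensor, p_cable, p_relay)  # one detection chain
p_top = and_gate(p_chain, p_chain)             # two independent chains

print(f"{p_top:.2e}")  # 1.23e-06
```

The same numbers show why the OR-gate dominates a single chain: the weakest component sets its failure rate, and only redundancy at the chain level pushes the top event down.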

Programming (in C/C++)

• Coding standard:

– Based on “Safer C”, by Les Hatton

– May only use safe subset of the compiler

– Verified by Lint and 5 other tools

• Code is peer reviewed by 2nd developer

• Certified and calibrated compiler

Unit tests

• Focus on conformance to specifications

• Required coverage: 100% with respect to:

– Code paths

– Input equivalence classes

• Boundary value analysis

• Probabilistic testing

• Execution:

– Fully automated scripts, running 24x7

– Creates 100 MB/hour of logs and measurement data

• Upon bug detection:

– 3 strikes is out: after 3 implementation errors, the component is built by another developer

– 2 strikes is out: needing a 2nd rebuild implies a redesign by another designer
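Boundary value analysis on, say, the 3-meter closing threshold could look like the sketch below; the function under test and the chosen values are illustrative assumptions, not the project's actual code:

```python
# Boundary value analysis sketch for an assumed closing rule:
# the barrier must close once the water level exceeds 3.0 meters.
def must_close(level_m):
    return level_m > 3.0

# Values just below, on, and just above the boundary, plus one value
# from each equivalence class ("clearly safe", "clearly dangerous").
cases = {
    0.0: False,    # clearly safe
    2.999: False,  # just below the boundary
    3.0: False,    # on the boundary: the assumed spec says "above 3 meters"
    3.001: True,   # just above the boundary
    9.9: True,     # clearly dangerous
}
for level, expected in cases.items():
    assert must_close(level) == expected, level
print("all boundary cases pass")
```

The on-the-boundary case is the one that flushes out off-by-one readings of the specification (> versus >=), which is exactly where contradictions in 20 binders of specifications tend to hide.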

Representative testing is difficult

Integration testing

• Focus on:

– Functional behaviour of the chain of components

– Failure scenarios based on the risk analysis

• Required coverage:

– 100% coverage of input classes

• Probabilistic testing

• Execution:

– Fully automated scripts, running 24x7, at 10x speed

– Creates 250 MB/hour of logs and measurement data

• Upon detection:

– Each bug gets a root-cause analysis

Redundancy is a nasty beast

• You do get functional behaviour of your entire system

• It is nearly impossible to see if all components are working correctly

• Is EVERYTHING working OK, or is it just the safety net?


System testing

• Focus on:

– Functional behaviour

– Failure scenarios based on the risk analysis

• Required coverage:

– 100% complete environment (simulation)

– 100% coverage of input classes

• Execution:

– Fully automated scripts, running 24x7, at 10x speed

– Creates 250 MB/hour of logs and measurement data

• Upon detection:

– Each bug gets a root-cause analysis

Endurance testing

• Look for the “one in a million times” problem

• Challenge:

– Software is deterministic

– Its execution is not (timing, transmission errors, system load)

• Have an automated script run it over and over again
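Such an endurance harness can be sketched as a loop that replays one scenario under varied timing; run_scenario and the jitter model are stand-ins assumed for illustration:

```python
# Endurance-testing sketch: the software is deterministic, but its execution
# is not, so we perturb inter-stimulus timing to flush out rare interactions.
import random

def run_scenario(delay_s):
    # Stand-in for one automated run against the real system;
    # returns True on success. This placeholder always succeeds.
    return True

random.seed(42)                             # reproducible runs
runs = 10_000
failures = 0
for _ in range(runs):
    jitter = random.uniform(0.0, 0.1)       # vary timing between stimuli
    if not run_scenario(jitter):
        failures += 1

# The observed failure rate is what gets tracked across platform
# versions to measure reliability growth.
print(f"{failures}/{runs} failed")  # 0/10000 failed
```

The point of the loop is not the pass/fail of a single run but the failure rate over many runs, which is what the reliability-growth chart on the next slide plots.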

Results of Endurance Tests

[Chart: “Reliability Growth of Function M, Project S”, plotting the chance of failure (logarithmic scale, 10^0 down to 10^-5) against platform versions 4.35, 4.36 and 4.37]

Acceptance testing

• Acceptance testing

1. Functional acceptance

2. Failure behaviour, all top 50 (FMECA) risks tested

3. A year of operational verification

• Execution:

– Tests performed on a working storm surge barrier

– Creates 250Mb/hour of logs and measurement data

• Upon detection

– Each bug gets a root-cause analysis

A risk limit to testing

• Some things are too dangerous to test

• Some tests introduce more risks than they try to mitigate

• There should always be a safe way out of a test procedure

Testing safety-critical functions is dangerous...

GUI Acceptance testing

• Looking for:

– Quality in use for interactive systems

– Understandability of the GUI

• Structural investigation of the performance of the man-machine interactions

• Looking for “abuse” by the users

• Looking at real-life handling of emergency operations

Avalanche testing

• To test the capabilities of alarming and control

• Usually starts with one simple trigger

• Generally followed by millions of alarms

• Generally brings your network and systems to the breaking point

Crash and recovery procedure testing

• Validation of system behaviour after a massive crash and restart

• Usually identifies many issues about emergency procedures

• Sometimes identifies issues around power supply

• Usually identifies some (combination of) systems incapable of unattended recovery...

Software will never be flawless

Production has its challenges…

• Are equipment and processes optimally arranged?

• Are the humans up to their task?

• Does everything perform as expected?

REALITY

What are the real-life challenges of a test manager of safety-critical systems?

Difference between theory and reality

Just following the rulebook doesn’t suffice!

Working together…

Requires true commitment to results…

• Romans put the architect under the arches when removing the scaffolding

• Boeing and Airbus put all lead-engineers on the first test-flight

• Dijkstra put his “rekenmeisjes” (his human computers) on the opposite dock when launching ships

It is about keeping your back straight…

• Thomas Andrews, Jr.

• Naval architect in charge of RMS Titanic

• He recognized that regulations were insufficient for a ship the size of the Titanic

• Decisions “forced upon him” by the client:

– Limit the range of the double hulls

– Limit the number of lifeboats

• He was on the maiden voyage to spot improvements

• He knowingly went down with the ship, saving as many as he could

It requires a specific breed of people

The fates of developers and testers are linked to safety-critical systems for eternity

It sometimes requires drastic measures

Conclusion

• Stop reading newspapers

• Safety-critical testing is a lot of work, making sure nothing happens

• Technically it isn’t that much different; we’re just more rigorous and use a specific breed of people....

Questions?

• Questions/remarks: [email protected]

• View again: http://www.slideshare.net/Jaap_van_Ekris/