Dependable Technologies For Critical Systems © 2011 Critical Software Railway Certification and RAM Calculations CSW Workshop on Dependability and Certification, Coimbra, Portugal, September 28th-29th, 2011


Page 1: Railway

Dependable Technologies For Critical Systems

© 2011 Critical Software

Railway Certification and RAM Calculations

CSW Workshop on Dependability and Certification,

Coimbra, Portugal, September 28th-29th, 2011

Page 2: Railway

© 2011 Critical Software S.A.

Contents

PART 1 – Railway Certification

PART 2 – Safety

PART 3 – RAM

Page 3: Railway


PART 1 – Railway Certification

Page 4: Railway


Railway Certification Topics

IEC 61508 - Functional Safety

Safety Integrity Levels

CENELEC Standards – EN50126/8/9

EN50126 Lifecycle

Safety Cases Organisation

Organisation Independency

Page 5: Railway


IEC 61508 Functional Safety

Page 6: Railway


IEC 61508 Safety Integrity Level – Tolerable Hazard Rate

SIL   Tolerable Hazard Rate (THR, per hour)
4     10^-9 <= THR < 10^-8
3     10^-8 <= THR < 10^-7
2     10^-7 <= THR < 10^-6
1     10^-6 <= THR < 10^-5
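The banding in the table can be sketched as a small lookup. This is only an illustration: the function name `sil_for_thr` and the choice to return `None` outside the tabulated range are assumptions, not part of the slides or of the standard.

```python
from typing import Optional

# (SIL, inclusive lower bound, exclusive upper bound), copied from the table.
SIL_BANDS = [
    (4, 1e-9, 1e-8),
    (3, 1e-8, 1e-7),
    (2, 1e-7, 1e-6),
    (1, 1e-6, 1e-5),
]


def sil_for_thr(thr: float) -> Optional[int]:
    """Return the SIL whose band contains the tolerable hazard rate.

    Returns None when the rate falls outside the four tabulated bands
    (a hypothetical convention chosen for this sketch).
    """
    for sil, low, high in SIL_BANDS:
        if low <= thr < high:
            return sil
    return None
```

For example, `sil_for_thr(5e-9)` falls in the first band and yields SIL 4, while a rate of `1e-4` lies outside the table and yields `None`.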

Page 7: Railway


CENELEC Standards EN50126/8/9

EN 50126: Railway applications - Specification and demonstration of reliability, availability, maintainability and safety (RAMS)

EN 50128: Railway applications - Communications, signalling and processing systems - Software for railway control and protection systems

EN 50129: Railway applications - Communication, signalling and processing systems - Safety related electronic systems for signalling

Page 8: Railway


EN50126 Lifecycle

Phase 1: Concept

Phase 2: System Definition

Phase 3: Risk Analysis

Phase 4: System Requirements

Phase 5: Apportionment of System Requirements

Phase 6: Design and Implementation

Phase 7: Manufacture

Phase 8: Installation

Phase 9: System Validation

Phase 10: System acceptance

Phase 11: Operation and Maintenance

Phase 12: Performance Monitoring

Phase 13: Modification and Retrofit

Phase 14: De-commissioning and Disposal

[Diagram label: New Lifecycle]

Page 9: Railway


EN50126 Lifecycle GPSC and GASC

Phase 1: Concept

Phase 2: System Definition

Phase 3: Risk Analysis

Phase 4: System Requirements

Phase 5: Apportionment of System Requirements

Phase 6: Design and Implementation

Phase 7: Manufacture

Generic Product Safety Case

Generic Application Safety Case

Page 10: Railway


EN50126 Lifecycle SASC

Phase 8: Installation

Phase 9: System Validation

Phase 10: System acceptance

Phase 11: Operation and Maintenance

Phase 12: Performance Monitoring

Phase 13: Modification and Retrofit

Phase 14: De-commissioning and Disposal

[Diagram label: New Lifecycle]

Specific Application Safety Case
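As a rough sketch of how the two lifecycle slides group the phases, the mapping below assigns Phases 1 to 7 to the generic safety cases (GPSC, GASC) and Phases 8 to 14 to the specific one (SASC). The split is read from the slide layout only; the names `LIFECYCLE_PHASES` and `safety_cases_for_phase` are hypothetical.

```python
from typing import List

# Phase numbers and names as listed on the EN50126 lifecycle slides.
LIFECYCLE_PHASES = {
    1: "Concept",
    2: "System Definition",
    3: "Risk Analysis",
    4: "System Requirements",
    5: "Apportionment of System Requirements",
    6: "Design and Implementation",
    7: "Manufacture",
    8: "Installation",
    9: "System Validation",
    10: "System acceptance",
    11: "Operation and Maintenance",
    12: "Performance Monitoring",
    13: "Modification and Retrofit",
    14: "De-commissioning and Disposal",
}


def safety_cases_for_phase(phase: int) -> List[str]:
    """Return the safety cases the slides associate with a lifecycle phase."""
    if phase not in LIFECYCLE_PHASES:
        raise ValueError(f"unknown lifecycle phase: {phase}")
    # Assumption taken from the slide grouping: 1-7 generic, 8-14 specific.
    return ["GPSC", "GASC"] if phase <= 7 else ["SASC"]
```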

Page 11: Railway


Safety Cases Organisation GPSC, GASC and SASC

[Diagram: three levels, each with its own System Requirements Specification, Safety Requirements Specification and Safety Assessment Report]

Generic Product: Generic Product Safety Case

Generic Application: Generic Application Safety Case

Specific Application (Application Design, Physical Implementation): Specific Application Safety Case

Page 12: Railway


Safety Cases

The HW boards composing a module and the “base SW” that runs on the boards of the module represent a generic product.

The “base SW” is the part of the SW that does not change from customer to customer, and therefore normally includes the OS, the drivers, and the base SW functionalities.

GENERIC PRODUCT

Generic Product Safety Case

Page 13: Railway


Safety Cases

The generic application typically changes from customer to customer. It is defined by a combination of HW modules (minimum and maximum number of each module, types of module interconnection, etc.) and by an application SW that specializes the behaviour of each module for a customer.

The generic application is normally implemented by application data.

GENERIC APPLICATION

Generic Application Safety Case

Page 14: Railway


Safety Cases

The specific application specializes a generic application for a specific usage (typically a single train among all the trains of a customer fleet). This means that the generic application is configurable and that the specific application represents a specific configuration of it.

An object of the specific application configuration level can be a voltage level of an input, a specific behavior of the logic for a particular train, etc.

SPECIFIC APPLICATION

Specific Application Safety Case

Application Design

Physical Implementation
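The relation between a configurable generic application and one specific configuration of it can be illustrated with a minimal sketch. Everything named here (`GenericApplication`, `SpecificApplication`, `violations`, the module names and the `input_voltage_v` parameter) is hypothetical and only mirrors the wording of the slides: module count limits on the generic side, concrete counts and parameter values on the specific side.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class GenericApplication:
    # Module type -> (minimum, maximum) number of modules allowed.
    module_limits: Dict[str, Tuple[int, int]]


@dataclass
class SpecificApplication:
    # Concrete module counts chosen for one usage (e.g. one train).
    module_counts: Dict[str, int]
    # Configuration values such as an input voltage level (illustrative).
    parameters: Dict[str, float] = field(default_factory=dict)


def violations(generic: GenericApplication,
               specific: SpecificApplication) -> List[str]:
    """List the ways a specific configuration leaves the generic envelope."""
    problems = []
    for module, (low, high) in generic.module_limits.items():
        count = specific.module_counts.get(module, 0)
        if not low <= count <= high:
            problems.append(f"{module}: {count} not in [{low}, {high}]")
    for module in specific.module_counts:
        if module not in generic.module_limits:
            problems.append(f"{module}: not part of the generic application")
    return problems
```

An empty list means the specific application stays within the limits declared by the generic application; the safety argument itself is of course not captured by such a check.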

Page 15: Railway


Organisation Independency

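[Diagram: two organisation charts, each showing the Project Manager, the Development (Dev.) Team, the Verification (Ver.) Team, the Validation (Val.) Team and the Assessor (Assr); one chart is labelled SIL 3 & SIL 4]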

Page 16: Railway


PART 2 – Safety

Page 17: Railway


Safety Topics

Preliminary Hazard Analysis / Risk Analysis

Hazard Analysis

Hazard Log

Safety Case

Relation with other Safety Cases

Page 18: Railway


Preliminary Hazard Analysis

Hazard Identification

Hazard Causes Identification

Hazard Consequences

Hazard Initial Risk Evaluation

Hazard Mitigation Recommendations

Hazard Final Risk Evaluation

Page 19: Railway


Preliminary Hazard Analysis

System Context, Application Domain, Past Experience -> System Hazards

Page 20: Railway


Preliminary Hazard Analysis

Example of hazard consequences in the railway domain:

Collision

Derailment

Casualties

Injuries

Page 21: Railway


Risk Analysis

Risk Analysis Process: System Analysis -> Hazard Identification -> Hazard Consequence -> Initial Risk Evaluation -> Mitigation Actions -> Final Risk Evaluation

Risk Quantification Process: Check Hazard Frequency, Verify Hazard Severity -> Make Qualitative Risk Evaluation -> Risk Value

Page 22: Railway


Risk Analysis

Risk Evaluation Matrix (risk level by frequency and severity)

Frequency \ Severity   Insignificant   Marginal      Critical      Catastrophic
Frequent               Undesirable     Intolerable   Intolerable   Intolerable
Probable               Tolerable       Undesirable   Intolerable   Intolerable
Occasional             Tolerable       Undesirable   Undesirable   Intolerable
Remote                 Negligible      Tolerable     Undesirable   Undesirable
Improbable             Negligible      Negligible    Tolerable     Tolerable
Incredible             Negligible      Negligible    Negligible    Negligible
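The matrix lends itself to a direct lookup. The sketch below encodes the rows exactly as listed on the slide; the names `RISK_MATRIX`, `SEVERITIES` and `risk_level` are assumptions made for the example.

```python
# Column order of the matrix, from least to most severe.
SEVERITIES = ["Insignificant", "Marginal", "Critical", "Catastrophic"]

# One row per frequency category, values in SEVERITIES column order.
RISK_MATRIX = {
    "Frequent":   ["Undesirable", "Intolerable", "Intolerable", "Intolerable"],
    "Probable":   ["Tolerable", "Undesirable", "Intolerable", "Intolerable"],
    "Occasional": ["Tolerable", "Undesirable", "Undesirable", "Intolerable"],
    "Remote":     ["Negligible", "Tolerable", "Undesirable", "Undesirable"],
    "Improbable": ["Negligible", "Negligible", "Tolerable", "Tolerable"],
    "Incredible": ["Negligible", "Negligible", "Negligible", "Negligible"],
}


def risk_level(frequency: str, severity: str) -> str:
    """Look up the qualitative risk level for a frequency/severity pair."""
    return RISK_MATRIX[frequency][SEVERITIES.index(severity)]
```

For instance, a hazard that is Remote and Critical evaluates to Undesirable, which is the kind of initial evaluation a mitigation would then be expected to lower.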

Page 23: Railway


Hazard Analysis

Hazard Analysis Process

Input

· System and Interface Requirements;
· Architecture Specification;
· Preliminary Hazards Analysis;
· Top level system activities.

Preparation

· Analyse all Inputs;
· Define Risk Analysis Methodology;
· Define Safety Criteria;
· Define Hazard Analysis Properties.

Execution

· Identify all foreseen hazards;
· Identify causes and consequences of each hazard;
· Evaluate initial risk (frequency and severity) of each hazard;
· Define mitigations (both preventive and protective) for each hazard;
· Define external customer recommendations;
· Evaluate final risk when mitigations are applied.

Output

· Hazard Log;
· System Safety Requirements;
· Safety Exported Constraints;
· Functional and Physical SIL Allocation.

[Diagram: Architecture Specification, System and Interface Requirements, Preliminary Analysis Identified Hazards and Top Level System Activities feed the Hazard Analysis]

Page 24: Railway


Hazard Log

Property                          Description
ID-xxx                            A running number starting from 1
System Activity                   Activity to support the analysis of this hazard
Architecture Item                 Sub-system where the hazard was identified
Function Name                     Function where the hazard was identified
Component Name                    Component where the hazard was identified
System State                      System state where the hazard was identified
Hazard Description                Hazard description
Hazard Cause                      Cause of the hazard
Hazard Effect                     Effects of the hazard in the system
Direct Consequence                Description of the direct consequence of this hazard in the environment
Frequency                         Frequency of the hazard occurrence
Severity                          Severity of the hazard
Risk Evaluation Level             Risk evaluation level
Preventive Countermeasure         Preventive countermeasure
Protective Countermeasure         Protective countermeasure
Mitigated Consequence             Description of the consequence of this hazard in the environment after applying the mitigation action
Customer Recommendations          Recommendations for the customer; they need to be transmitted to the customer
Final Frequency                   Final frequency of the hazard occurrence
Final Severity                    Final severity of the hazard
Final Risk Evaluation Level       Final risk evaluation level
Application Conditions            Application conditions code for the correct usage of the system in terms of safety
Safety Requirement related Code   FDT3_RS_SR_xxx: code of the safety requirement related to the hazard
Hazard Status                     Status of the hazard
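A Hazard Log entry with these properties could be represented as a simple record. The sketch below keeps only a subset of the properties, and the class name `HazardLogEntry`, the field names, the `risk_reduced` helper and the ordering in `RISK_ORDER` are all assumptions for illustration; the slides do not prescribe a data format or the possible status values.

```python
from dataclasses import dataclass

# Risk levels from lowest to highest, as used in the risk evaluation matrix.
RISK_ORDER = ["Negligible", "Tolerable", "Undesirable", "Intolerable"]


@dataclass
class HazardLogEntry:
    hazard_id: str                   # "ID-xxx", a running number
    hazard_description: str
    hazard_cause: str
    hazard_effect: str
    frequency: str                   # initial frequency category
    severity: str                    # initial severity category
    risk_level: str                  # initial risk evaluation level
    preventive_countermeasure: str
    protective_countermeasure: str
    final_frequency: str
    final_severity: str
    final_risk_level: str
    hazard_status: str               # free text here; values are not given in the slides

    def risk_reduced(self) -> bool:
        """True when the final risk level is strictly lower than the initial one."""
        return (RISK_ORDER.index(self.final_risk_level)
                < RISK_ORDER.index(self.risk_level))
```

Keeping the initial and final evaluations side by side in one record makes it easy to check that every mitigation actually lowered the risk level.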

Page 25: Railway


Safety Case

Generic Product Safety Case

Part 1: Definition of the System

Part 2: Quality Management Report

Part 3: Safety Management Report

Part 4: Technical Safety Report

Part 5: Related Safety Cases

Part 6: Conclusions

Part 4: Technical Safety Report

Section 1: Introduction

Section 2: Assurance of Correct Operation

Section 3: Effects of Faults

Section 4: Operation with External Influences

Section 5: Safety-related Application Conditions

Section 6: Safety Qualification Tests

Page 26: Railway


Relation with other Safety Cases

[Diagram: the Product 1 GPSC is related to the Component 1 GPSC, the Component 2 GPSC and the Component 3 GPSC]

Page 27: Railway


PART 3 – RAM

Page 28: Railway


RAM Topics

Dependability Concepts

RAM Process

RAM Activities

Qualitative Analysis

FMEA, FTA

Quantitative Analysis

Software Reliability

Page 29: Railway


Dependability Concepts

Reliability

Probability that an item can perform a required function under given conditions for a given time interval (t1, t2).

Availability

Ability of a product to be in a state to perform a required function under given conditions at a given instant of time or over a given time interval, assuming that the required external resources are provided.

Maintainability

Probability that a given active maintenance action, for an item under given conditions of use can be carried out within a stated time interval when the maintenance is performed under stated conditions and using stated procedures and resources.

EN50126

Page 30: Railway


Dependability Concepts


MTTF: Mean Time To Failure

MTBF: Mean Time Between Failures

MDT: Mean Down Time

MTW: Mean Time Waiting

MTTR: Mean Time To Repair

MDT = MTW + MTTR

MTBF = MTTF + MDT
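The two relations above can be checked with a few lines of arithmetic. The availability formula in the sketch, MTTF / (MTTF + MDT), is the usual steady-state expression and is an addition for illustration; it is not stated on the slide. Function names and the example figures are hypothetical, and all times are assumed to share one unit (for example hours).

```python
def mean_down_time(mtw: float, mttr: float) -> float:
    """MDT = MTW + MTTR."""
    return mtw + mttr


def mean_time_between_failures(mttf: float, mdt: float) -> float:
    """MTBF = MTTF + MDT."""
    return mttf + mdt


def steady_state_availability(mttf: float, mdt: float) -> float:
    """Fraction of time the item is up: MTTF / (MTTF + MDT).

    Assumption added for this sketch, not taken from the slides.
    """
    return mttf / mean_time_between_failures(mttf, mdt)


# Illustrative figures only: 2 h waiting, 6 h repair, 992 h to failure.
mdt = mean_down_time(mtw=2.0, mttr=6.0)
mtbf = mean_time_between_failures(mttf=992.0, mdt=mdt)
availability = steady_state_availability(mttf=992.0, mdt=mdt)
```

With these figures the down time is 8 h per failure cycle and the cycle length is 1000 h, so the item is available 99.2% of the time.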

Page 31: Railway


RAM Process

Page 32: Railway


RAM Activities

Preliminary RAM Analysis

Reliability Apportionment & Prediction

Detailed RAM Analysis

Qualitative Analysis

Quantitative Analysis

FMEA/FMECA

FTA

RBD

FA

CCA

HSIA, …

Page 33: Railway


Back to Basics

Failure Mode:

Cause (Fault)

Local Effect (Error)

End Effect (Failure)

Fault -> Error -> Failure

Fault -> Detection -> Negation -> Restore

All errors must be properly handled (detected and mitigated)

FMEA/FMECA, FTA, CCA, HSIA are all different techniques for assisting the identification of all system failures, effects and combinations/propagations.

Qualitative Analysis

Page 34: Railway


FMECA Table Failure Modes, Effects, and Criticality Analysis

Field Name

FMEA ID

Trace from

Function

Generic failure mode

Failure Mode

Failure Cause

Local Effects

Propagates to

End Effects

Impact Type

Severity

Probability of Occurrence

Method of Detection

Compensating Provisions

Mitigated Severity

Mitigated Probability

Notes

Page 35: Railway


FTA Fault Tree Analysis

Page 36: Railway


Quantitative Analysis Reliability

Page 37: Railway


Quantitative Analysis Availability

Stand-by:

...

Page 38: Railway


Quantitative Analysis

In Practice

Page 39: Railway


Software Reliability

RAM calculations typically consider only HW failure rates;

SW failures are systematic;

Software does not wear out or break;

Software failures result from errors in the software;

This does not necessarily imply that a software function containing implementation errors will fail every time it is called!

The error may not reveal itself every time the function is called.

There are no absolute answers for the classification of software reliability.

Page 40: Railway


Software Reliability

To prove the absence of faults in reasonably complex software is a tremendous task, if not impossible.

To avoid software errors, EN 50128 provides a set of development guidelines and V&V procedures that, for the highest integrity levels, are very demanding.

Compliance with these procedures gives a very high level of confidence in the correctness of the SW implementation.

Page 41: Railway


SW Failure Probability

Classification alternatives:

0 / 1

No value. Only Qualitative Analysis.

Evaluate the probability of failure excluding software causes from the calculation, and present this value together with a detailed analysis of the impact of SW failures

Mapped to SIL Level

Value evaluation supported by:

Sound engineering and statistical judgment, analyses, and evidence.

Service records

Page 42: Railway


Stress testing IEC 60605-4

Page 43: Railway


Example

r      2        20
T      50000    500000
m      25000    25000
90%    ~9400    ~19300
95%    ~7950    ~17900
99%    ~6920    ~15700
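The table can be read as follows, under assumptions that the slide does not state explicitly: r is the number of failures observed, T the accumulated test time, m = T / r the point estimate of the MTBF, and the percentage rows are one-sided lower confidence limits on the MTBF for a constant failure rate. A common way to obtain such limits is 2T divided by a chi-square quantile, with 2r + 2 degrees of freedom for a time-terminated test and 2r for a failure-terminated one. The sketch below implements that textbook bound in plain Python; it is not claimed to be the exact procedure behind the slide, and the function names are hypothetical.

```python
import math


def chi2_cdf_even(x: float, dof: int) -> float:
    """Chi-square CDF for an even number of degrees of freedom (closed form)."""
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, dof // 2):
        term *= half / i          # (x/2)^i / i!
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_quantile_even(p: float, dof: int) -> float:
    """Invert the CDF by bisection (adequate for a sketch)."""
    low, high = 0.0, 1.0
    while chi2_cdf_even(high, dof) < p:
        high *= 2.0
    for _ in range(200):
        mid = (low + high) / 2.0
        if chi2_cdf_even(mid, dof) < p:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def mtbf_lower_bound(total_time: float, failures: int, confidence: float,
                     time_terminated: bool = True) -> float:
    """One-sided lower confidence limit on the MTBF: 2T / chi2(confidence, dof)."""
    dof = 2 * failures + 2 if time_terminated else 2 * failures
    return 2.0 * total_time / chi2_quantile_even(confidence, dof)
```

With r = 2, T = 50000 and 2r + 2 degrees of freedom, the 90% bound comes out close to the ~9400 shown in the first column; with r = 20, T = 500000 and 2r degrees of freedom, it comes out close to the ~19300 in the second column. In both cases the point estimate is the same 25000, so the example shows how a longer test with more observed failures tightens the bound.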

Page 44: Railway

Contacts

Jorge Almeida: [email protected]

José Faria: [email protected]

Coimbra, Lisboa, Porto: www.criticalsoftware.com

San Jose: www.criticalsoftware.com

Southampton: www.critical-software.co.uk

São José dos Campos: www.criticalsoftware.com.br

Maputo: http://www.criticalsoftware.co.mz