Dependable Technologies For Critical Systems
© 2011 Critical Software
Railway Certification and RAM Calculations
CSW Workshop on Dependability and Certification,
Coimbra, Portugal, September 28th-29th, 2011
Contents
PART 1 – Railway Certification
PART 2 – Safety
PART 3 – RAM
PART 1 – Railway Certification
Railway Certification Topics
IEC 61508 - Functional Safety
Safety Integrity Levels
CENELEC Standards – EN50126/8/9
EN50126 Lifecycle
Safety Cases Organisation
Organisation Independency
IEC 61508 Functional Safety
IEC 61508 Safety Integrity Level – Tolerable Hazard Rate
SIL   Tolerable Hazard Rate (THR)
4     10^-9 <= THR < 10^-8
3     10^-8 <= THR < 10^-7
2     10^-7 <= THR < 10^-6
1     10^-6 <= THR < 10^-5
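The band lookup in the table above can be sketched in a few lines of Python. This is only an illustration: the function name `sil_for_thr` and the choice to return `None` for values outside the table are assumptions, not part of the presentation.

```python
from typing import Optional

# (SIL, lower bound inclusive, upper bound exclusive), copied from the table above.
SIL_BANDS = [
    (4, 1e-9, 1e-8),
    (3, 1e-8, 1e-7),
    (2, 1e-7, 1e-6),
    (1, 1e-6, 1e-5),
]


def sil_for_thr(thr: float) -> Optional[int]:
    """Return the SIL whose band contains thr, or None if thr is outside the table."""
    for sil, low, high in SIL_BANDS:
        if low <= thr < high:
            return sil
    return None  # assumption: values outside the table are left unclassified
```

For example, `sil_for_thr(5e-9)` falls in the SIL 4 band, while a rate of `1e-5` or more is not covered by the table.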
CENELEC Standards EN50126/8/9
EN 50126: Railway applications - Specification and demonstration of reliability, availability, maintainability and safety (RAMS)
EN 50128: Railway applications - Communications, signalling and processing systems - Software for railway control and protection systems
EN 50129: Railway applications - Communication, signalling and processing systems - Safety related electronic systems for signalling
EN50126 Lifecycle
Phase 1: Concept
Phase 2: System Definition
Phase 3: Risk Analysis
Phase 4: System Requirements
Phase 5: Apportionment of System Requirements
Phase 6: Design and Implementation
Phase 7: Manufacture
Phase 8: Installation
Phase 9: System Validation
Phase 10: System acceptance
Phase 11: Operation and Maintenance
Phase 12: Performance Monitoring
Phase 13: Modification and Retrofit
Phase 14: De-commissioning and Disposal
New Lifecycle
EN50126 Lifecycle GPSC and GASC
Phase 1: Concept
Phase 2: System Definition
Phase 3: Risk Analysis
Phase 4: System Requirements
Phase 5: Apportionment of System Requirements
Phase 6: Design and Implementation
Phase 7: Manufacture
Generic Product Safety Case
Generic Application Safety Case
EN50126 Lifecycle SASC
Phase 8: Installation
Phase 9: System Validation
Phase 10: System acceptance
Phase 11: Operation and Maintenance
Phase 12: Performance Monitoring
Phase 13: Modification and Retrofit
Phase 14: De-commissioning and Disposal
New Lifecycle
Specific Application Safety Case
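The grouping shown on these two slides (phases 1 to 7 under the Generic Product and Generic Application Safety Cases, phases 8 to 14 under the Specific Application Safety Case) can be written down as a small lookup. This is a sketch of the slides' grouping only; the function name and the returned abbreviations as strings are illustrative assumptions.

```python
from typing import List

# Phase numbers and names as listed on the EN50126 Lifecycle slides.
PHASES = {
    1: "Concept",
    2: "System Definition",
    3: "Risk Analysis",
    4: "System Requirements",
    5: "Apportionment of System Requirements",
    6: "Design and Implementation",
    7: "Manufacture",
    8: "Installation",
    9: "System Validation",
    10: "System acceptance",
    11: "Operation and Maintenance",
    12: "Performance Monitoring",
    13: "Modification and Retrofit",
    14: "De-commissioning and Disposal",
}


def safety_cases_for_phase(phase: int) -> List[str]:
    """Return the safety case(s) the slides associate with a lifecycle phase."""
    if phase not in PHASES:
        raise ValueError(f"unknown lifecycle phase: {phase}")
    # Grouping taken from the slides: phases 1-7 vs. phases 8-14.
    return ["GPSC", "GASC"] if phase <= 7 else ["SASC"]
```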
Safety Cases Organisation GPSC, GASC and SASC
Generic Product: Generic Product Safety Case
· System Requirements Specification
· Safety Requirements Specification
· Safety Assessment Report
Generic Application: Generic Application Safety Case
· System Requirements Specification
· Safety Requirements Specification
· Safety Assessment Report
Specific Application: Specific Application Safety Case (Application Design, Physical Implementation)
· System Requirements Specification
· Safety Requirements Specification
· Safety Assessment Report
Safety Cases
The HW boards composing a module and the “base SW”
that runs on the boards of the module represent a
generic product.
The “base SW” is the part of the SW that does not change
from customer to customer; it normally includes the OS,
the drivers, and the base SW functions.
GENERIC PRODUCT
Generic Product Safety Case
Safety Cases
The generic application typically changes from customer to customer. It is defined by a combination of HW modules (minimum and maximum number of each module, types of module interconnection, etc.) and by an application SW that specializes the behaviour of each module for a customer.
The generic application is normally implemented by application data.
GENERIC APPLICATION
Generic Application Safety Case
Safety Cases
The specific application specializes a generic application
for a specific usage (typically a single train among all
the trains of a customer fleet). This means that the
generic application is configurable and that the specific
application represents a specific configuration of it.
A configuration item at the specific application level
can be, for example, the voltage level of an input or a
specific behaviour of the logic for a particular train.
SPECIFIC APPLICATION
Specific Application Safety Case
Application Design
Physical Implementation
Organisation Independency
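[Figure: two organisation charts showing the independence between the Project Manager, the Development (Dev.), Verification (Ver.) and Validation (Val.) teams, and the Assessor (Assr); one chart is labelled SIL 3 & SIL 4.]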
PART 2 – Safety
Safety Topics
Preliminary Hazard Analysis / Risk Analysis
Hazard Analysis
Hazard Log
Safety Case
Relation with other Safety Cases
Preliminary Hazard Analysis
Hazard Identification
Hazard Causes Identification
Hazard Consequences
Hazard Initial Risk Evaluation
Hazard Mitigation Recommendations
Hazard Final Risk Evaluation
Preliminary Hazard Analysis
System Context, Application Domain, Past Experience -> System Hazards
Preliminary Hazard Analysis
Examples of hazard consequences in the railway domain:
Collision
Derailment
Casualties
Injuries
Risk Analysis
Risk Analysis Process
System Analysis -> Hazard Identification -> Hazard Consequence -> Initial Risk Evaluation -> Mitigation Actions -> Final Risk Evaluation
Risk Quantification Process
Check Hazard Frequency, Verify Hazard Severity -> Make Qualitative Risk Evaluation -> Risk Value
Risk Analysis
Risk Evaluation Matrix (risk level by frequency and severity)
Frequency     Insignificant   Marginal      Critical      Catastrophic
Frequent      Undesirable     Intolerable   Intolerable   Intolerable
Probable      Tolerable       Undesirable   Intolerable   Intolerable
Occasional    Tolerable       Undesirable   Undesirable   Intolerable
Remote        Negligible      Tolerable     Undesirable   Undesirable
Improbable    Negligible      Negligible    Tolerable     Tolerable
Incredible    Negligible      Negligible    Negligible    Negligible
Columns: Severity
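The qualitative risk evaluation is a straight table lookup, which can be sketched as follows. The matrix values are copied from the slide; the function name `risk_level` and the use of plain strings for the categories are illustrative assumptions.

```python
# Severity columns, left to right, as on the slide.
SEVERITIES = ["Insignificant", "Marginal", "Critical", "Catastrophic"]

# One row per frequency category, copied from the Risk Evaluation Matrix.
RISK_MATRIX = {
    "Frequent":   ["Undesirable", "Intolerable", "Intolerable", "Intolerable"],
    "Probable":   ["Tolerable", "Undesirable", "Intolerable", "Intolerable"],
    "Occasional": ["Tolerable", "Undesirable", "Undesirable", "Intolerable"],
    "Remote":     ["Negligible", "Tolerable", "Undesirable", "Undesirable"],
    "Improbable": ["Negligible", "Negligible", "Tolerable", "Tolerable"],
    "Incredible": ["Negligible", "Negligible", "Negligible", "Negligible"],
}


def risk_level(frequency: str, severity: str) -> str:
    """Look up the risk level for a (frequency, severity) pair."""
    if frequency not in RISK_MATRIX:
        raise ValueError(f"unknown frequency: {frequency}")
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity}")
    return RISK_MATRIX[frequency][SEVERITIES.index(severity)]
```

The same lookup can be used twice per hazard: once for the initial risk and once for the final risk after mitigations have changed the frequency or severity.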
Hazard Analysis
Hazard Analysis Process
Input
· System and Interface Requirements;
· Architecture Specification;
· Preliminary Hazards Analysis;
· Top level system activities.
Preparation
· Analyse all Inputs;
· Define Risk Analysis Methodology;
· Define Safety Criteria;
· Define Hazard Analysis Properties.
Execution
· Identify all foreseen hazards;
· Identify causes and consequences of each hazard;
· Evaluate initial risk (frequency and severity) of each hazard;
· Define mitigations (both preventive and protective) for each hazard;
· Define external customer recommendations;
· Evaluate final risk when mitigations are applied.
Output
· Hazard Log;
· System Safety Requirements;
· Safety Exported Constraints;
· Functional and Physical SIL Allocation.
Hazard Log
Property | Description
ID-xxx | A running number starting from 1
System Activity | Activity to support the analysis of this hazard.
Architecture Item | Sub-system where the hazard was identified
Function Name | Function where the hazard was identified
Component Name | Component where the hazard was identified
System State | System state where the hazard was identified
Hazard Description | Hazard Description
Hazard Cause | Cause of Hazard
Hazard Effect | Effects of the hazard in the system
Direct Consequence | Description of the direct consequence of this hazard in the environment.
Frequency | Frequency of the hazard occurrence
Severity | Severity of the hazard
Risk Evaluation level | Risk evaluation Level
Preventive Countermeasure | Preventive Countermeasure
Protective Countermeasure | Protective Countermeasure
Mitigated Consequence | Description of the consequence of this hazard in the environment after applying the mitigation action.
Customer Recommendations | Recommendations for the customer. Need to be transmitted to customer
Final Frequency | Final Frequency of the hazard occurrence
Final Severity | Final Severity of the hazard
Final Risk Evaluation Level | Final Risk Evaluation Level
Application Conditions | Application conditions code for the correct usage of the system in terms of safety.
Safety Requirement related | Code FDT3_RS_SR_xxx : Code of Safety Requirement related to Hazard.
Hazard Status | Status of the hazard.
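As an illustration, one row of such a hazard log could be held in a record like the one below. The field names follow the properties listed above; the three-digit `ID-xxx` format, the `"Open"` default status, and every value in the example entry are hypothetical and only serve to show how the fields fit together.

```python
from dataclasses import dataclass


@dataclass
class HazardLogEntry:
    """One hazard log row; fields mirror the properties in the table above."""
    hazard_id: str
    system_activity: str
    architecture_item: str
    function_name: str
    component_name: str
    system_state: str
    hazard_description: str
    hazard_cause: str
    hazard_effect: str
    direct_consequence: str
    frequency: str
    severity: str
    risk_evaluation_level: str
    preventive_countermeasure: str = ""
    protective_countermeasure: str = ""
    mitigated_consequence: str = ""
    customer_recommendations: str = ""
    final_frequency: str = ""
    final_severity: str = ""
    final_risk_evaluation_level: str = ""
    application_conditions: str = ""
    safety_requirement_related: str = ""
    hazard_status: str = "Open"  # assumed status vocabulary, not from the slides


def next_hazard_id(existing_count: int) -> str:
    """Running number starting from 1; zero-padding to 3 digits is an assumption."""
    return f"ID-{existing_count + 1:03d}"


# Hypothetical example entry (all values invented for illustration).
example = HazardLogEntry(
    hazard_id=next_hazard_id(0),
    system_activity="Door control",
    architecture_item="Door subsystem",
    function_name="Door release",
    component_name="Door controller",
    system_state="Train moving",
    hazard_description="Doors released while the train is moving",
    hazard_cause="Erroneous speed input",
    hazard_effect="Door release command issued",
    direct_consequence="Passenger may fall from the train",
    frequency="Occasional",
    severity="Critical",
    risk_evaluation_level="Undesirable",
    preventive_countermeasure="Redundant speed inputs",
    final_frequency="Improbable",
    final_severity="Critical",
    final_risk_evaluation_level="Tolerable",
)
```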
Safety Case
Generic Product Safety Case
Part 1: Definition of the System
Part 2: Quality Management Report
Part 3: Safety Management Report
Part 4: Technical Safety Report
· Section 1: Introduction
· Section 2: Assurance of Correct Operation
· Section 3: Effects of Faults
· Section 4: Operation with External Influences
· Section 5: Safety-related Application Conditions
· Section 6: Safety Qualification Tests
Part 5: Related Safety Cases
Part 6: Conclusions
Relation with other Safety Cases
Product 1 GPSC
· Component 1 GPSC
· Component 2 GPSC
· Component 3 GPSC
PART 3 – RAM
RAM Topics
Dependability Concepts
RAM Process
RAM Activities
Qualitative Analysis
FMEA, FTA
Quantitative Analysis
Software Reliability
Dependability Concepts
Reliability
Probability that an item can perform a required function under given conditions for a given time interval (t1, t2).
Availability
Ability of a product to be in a state to perform a required function under given conditions at a given instant of time or over a given time interval, assuming that the required external resources are provided.
Maintainability
Probability that a given active maintenance action, for an item under given conditions of use can be carried out within a stated time interval when the maintenance is performed under stated conditions and using stated procedures and resources.
EN50126
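To make the reliability definition concrete, the sketch below evaluates it under one common modelling assumption that the slides do not state: a constant failure rate (exponential model). Under that assumption, the probability that an item working at t1 is still working at t2 is exp(-lambda * (t2 - t1)). The function name and the sample numbers are illustrative.

```python
import math


def reliability(failure_rate: float, t1: float, t2: float) -> float:
    """Probability of surviving the interval (t1, t2), given the item works at t1.

    Assumes a constant failure rate (exponential model), which is a modelling
    assumption and not something the definition itself requires.
    """
    if failure_rate < 0:
        raise ValueError("failure_rate must be non-negative")
    if t2 < t1:
        raise ValueError("t2 must not be earlier than t1")
    return math.exp(-failure_rate * (t2 - t1))


# Hypothetical numbers: 1e-4 failures per hour over a 1000-hour mission.
mission_reliability = reliability(1e-4, 0.0, 1000.0)
```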
Dependability Concepts
MTTF: Mean Time To Failure
MTBF: Mean Time Between Failures
MDT: Mean Down Time
MTW: Mean Time Waiting
MTTR: Mean Time To Repair
MDT = MTW + MTTR
MTBF = MTTF + MDT
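The two relations above are simple sums, shown here with a small worked example. The steady-state availability formula MTTF / (MTTF + MDT) is a commonly used expression that does not appear on the slide, so treat it as an added assumption; the input figures are hypothetical.

```python
def mean_down_time(mtw: float, mttr: float) -> float:
    """MDT = MTW + MTTR (from the slide)."""
    return mtw + mttr


def mean_time_between_failures(mttf: float, mdt: float) -> float:
    """MTBF = MTTF + MDT (from the slide)."""
    return mttf + mdt


def steady_state_availability(mttf: float, mdt: float) -> float:
    """Fraction of time the item is up; common formula, not taken from the slide."""
    return mttf / (mttf + mdt)


# Hypothetical figures, all in hours.
mdt = mean_down_time(mtw=4.0, mttr=2.0)                  # 6.0
mtbf = mean_time_between_failures(mttf=10000.0, mdt=mdt)  # 10006.0
availability = steady_state_availability(10000.0, mdt)
```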
RAM Process
RAM Activities
Preliminary RAM Analysis
Reliability Apportionment & Prediction
Detailed RAM Analysis
Qualitative Analysis
Quantitative Analysis
FMEA/FMECA
FTA
RBD
FA
CCA
HSIA, …
Back to Basics
Failure Mode:
Cause (Fault)
Local Effect (Error)
End Effect (Failure)
Fault -> Error -> Failure
Fault -> Detection -> Negation -> Restore
All errors must be properly handled (detected and mitigated)
FMEA/FMECA, FTA, CCA, HSIA are all different techniques for assisting the identification of all system failures, effects and combinations/propagations.
Qualitative Analysis
FMECA Table Failure Modes, Effects, and Criticality Analysis
Field Name
FMEA ID
Trace from
Function
Generic failure mode
Failure Mode
Failure Cause
Local Effects
Propagates to
End Effects
Impact Type
Severity
Probability of Occurrence
Method of Detection
Compensating Provisions
Mitigated Severity
Mitigated Probability
Notes
FTA Fault Tree Analysis
Quantitative Analysis Reliability
Quantitative Analysis Availability
Stand-by:
...
Quantitative Analysis
In Practice
Software Reliability
RAM calculations typically consider only HW failure rates;
SW failures are systematic;
Software does not wear out or break;
Software failures result from errors in the software;
This does not necessarily imply that a software function
containing implementation errors will fail every time it is called!
The error may not reveal itself every time the function is called.
There are no absolute answers for the classification of
software reliability.
Software Reliability
To prove the absence of faults in reasonably complex
software is a tremendous task, if not impossible.
To avoid software errors, EN 50128 provides a set of
development guidelines and V&V procedures that, for the
highest integrity levels, are very demanding.
Compliance with these procedures gives a very high
level of confidence in the correctness of the SW
implementation.
SW Failure Probability
Classification alternatives:
· 0 / 1
· No value. Only Qualitative Analysis. Evaluate the probability of failure excluding software causes from the calculation and present this value together with a detailed analysis of the impact of SW failures.
· Mapped to SIL Level
· Value evaluation supported by:
  · Sound engineering and statistical judgment, analyses, and evidence;
  · Service records.
Stress testing IEC 60605-4
Example
r 2 20
T 50000 500000
m 25000 25000
90% ~9400 ~19300
95% ~7950 ~17900
99% ~6920 ~15700
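The slide does not say how the figures were obtained. One plausible reading, offered here only as an assumption, is that r is the number of observed failures, T the accumulated test time, m = T / r the point estimate of the mean time between failures, and the percentage rows one-sided lower confidence limits on m for exponentially distributed failures, computed with the usual chi-square bound 2T / chi2(confidence, dof). The sketch below uses the standard library only; all function names are illustrative.

```python
import math


def chi2_cdf_even(x: float, dof: int) -> float:
    """Chi-square CDF for an even number of degrees of freedom (closed form)."""
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0  # i = 0 term of the Poisson-type sum
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_quantile_even(p: float, dof: int) -> float:
    """Invert chi2_cdf_even by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, dof) < p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def mtbf_lower_bound(total_time: float, failures: int, confidence: float,
                     time_terminated: bool = True) -> float:
    """One-sided lower confidence limit on MTBF (exponential failures assumed).

    dof = 2r + 2 for a time-terminated test, 2r for a failure-terminated test.
    """
    dof = 2 * failures + 2 if time_terminated else 2 * failures
    return 2.0 * total_time / chi2_quantile_even(confidence, dof)


bound_r2 = mtbf_lower_bound(50000, 2, 0.90)                           # roughly 9400
bound_r20 = mtbf_lower_bound(500000, 20, 0.90, time_terminated=False)  # roughly 19300
```

Under these assumptions the 90% row is reproduced for both columns, with the r = 2 column matching the time-terminated variant and the r = 20 column the failure-terminated one. Both cases have the same point estimate m, yet the test with more failures and more accumulated time supports a much higher lower bound.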
Contacts
Jorge Almeida
José Faria
Coimbra, Lisboa, Porto: www.criticalsoftware.com
San Jose: www.criticalsoftware.com
Southampton: www.critical-software.co.uk
São José dos Campos: www.criticalsoftware.com.br
Maputo: http://www.criticalsoftware.co.mz