CRANFIELD UNIVERSITY
Alexander Tomczynski
APPLICATION OF RESILIENCE ENGINEERING CONCEPTS TO
THE MANAGEMENT OF AIRWORTHINESS
DEFENCE ACADEMY - COLLEGE OF MANAGEMENT AND
TECHNOLOGY
Military Aerospace and Airworthiness
MSc
Academic Year: 2013 - 2014
Supervisor: Dr Simon Place
March 2014
This thesis is submitted in partial fulfilment of the requirements for
the degree of Master of Science
© Crown Copyright 2014. All rights reserved. No part of this
publication may be reproduced without the written permission of the
copyright owner.
ABSTRACT
Complex safety critical systems in high hazard industries continue to have
accidents despite improvements in reliability, understanding of human factors
and the behaviour of organisations. Resilience engineering offers a new
paradigm in safety science and proposes that safety is defined as success
under varying performance conditions. The theory is examined and its
applicability to airworthiness is discussed. A related technique, the Functional
Resonance Analysis Method (FRAM), treats system performance as a control
problem. This methodology is employed to create an airworthiness
management tool for the Royal Air Force Tornado aircraft fleet. Data were
gathered from occurrence reports, practical experience and semi-structured
interviews with a variety of personnel within the airworthiness system. The
tool comprises a spreadsheet model with an accompanying interactive
visualisation. It is used to analyse two air safety occurrences and to attempt
a resilience-based risk assessment of an airworthiness issue. It was concluded
that resilience engineering presents
a promising basis for better management of airworthiness. The initial version of
the tool was found to work well but extensive development work is required to
produce a desktop IT airworthiness resilience dashboard tool.
Keywords:
SYSTEM SAFETY, SAFETY CRITICAL SYSTEMS, ACCIDENT
INVESTIGATION
ACKNOWLEDGEMENTS
I would like to thank my wife Natalie for her encouragement and support.
Also worthy of thanks are Professor Erik Hollnagel and the rest of the “FRAMily”
who have gathered both online and at the 2013 meeting in Munich. The shared
knowledge and experience has been most instructive.
This project would not have been possible without the enthusiastic participation
of a large number of people at Royal Air Force Station Marham - service
personnel and employees of BAE Systems and Rolls-Royce.
The guidance provided by my supervisor Dr Simon Place has been invaluable
in the completion of this project and I thank him for it.
In remembrance of No. CXX Squadron, Crew 3
“Endurance”
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF EQUATIONS
LIST OF ABBREVIATIONS
1 INTRODUCTION
1.1 Introduction
1.2 Background – Theories of Safety
1.3 Background – The Practical Requirement
1.4 What is ‘Airworthiness Management’?
1.5 The Research Aim
1.6 Objectives
1.7 Methodology Overview
1.8 Descriptions and Definitions
1.9 Thesis Structure
2 LITERATURE REVIEW
2.1 Airworthiness in the Context of Safety
2.1.1 Accident Investigations
2.1.2 Initial and Type Airworthiness
2.1.3 Safety Management
2.1.4 Continuing Airworthiness
2.2 A History of Safety Theory
2.2.1 Technological Age – Governing Philosophy
2.2.2 Technological Age – Tools
2.2.3 Limits of Probabilistic Risk Assessment
2.2.4 Human Factors
2.2.5 Organisational
2.3 Complexity
2.3.1 Complexity Theory
2.3.2 Systems Thinking and Systems Engineering
2.3.3 Control Theory
2.3.4 Non-Linear Dynamics
2.4 Resilience Engineering
2.4.1 Resilience Engineering as a Successor to Safety Management
2.4.2 Under Specification of Performance Conditions
2.4.3 Performance Variability
2.4.4 Examples of Resilience Engineering in Practice
2.4.5 Criticism of Resilience Engineering
2.4.6 Resilience Engineering and Airworthiness
2.4.7 Lean Resilience
2.5 Functional Resonance Analysis Method
2.6 Quantifying Resilience
2.7 Concluding Remarks
3 METHODOLOGY
3.1 Introduction
3.2 Working Arrangements
3.3 Research Interviews
3.4 Model Development
3.5 Air Safety Information Management System Data
3.5.1 Data Extraction
3.5.2 Assignment of Related Functions to Incidents
4 BUILDING THE TORNADO AIRWORTHINESS SYSTEM MODEL USING THE FUNCTIONAL RESONANCE ANALYSIS METHOD
4.1 Basic Principles
4.2 Taxonomy
4.3 FRAM Step 0 – Recognise the Purpose of the FRAM Analysis
4.4 FRAM Step 1a – Identify and Describe the Initial Function List
4.5 FRAM Step 1b – Verify Functions with Experts
4.6 Step 2 – Identification of Output Variability
4.7 Step 2a – Identify the Type of Function
4.8 Step 2b – Identify Internal Sources of Output Variability
4.9 Step 2c – Identify External Sources of Output Variability
4.10 Step 2d – Most Likely Dimension of Output Variability
4.11 Step 3 – Aggregation of Variability
4.12 Step 4 – Consequences of the Analysis
4.12.1 Step 4a – Damping Factors
4.12.2 Step 4b Performance Indicators
4.13 Summary of TASM Layout
5 TORNADO AIRWORTHINESS SYSTEM MODEL VISUALISATION TOOL
5.1 Need for the Tool
5.2 Microsoft Visio
5.3 Building the Tool
5.3.1 General Functional Areas
5.3.2 Functions
5.3.3 External Dependencies
5.3.4 Functional Activities
5.4 Exploiting the Tool
5.5 Summary
6 USING THE TORNADO AIRWORTHINESS SYSTEM MODEL FOR INCIDENT ANALYSIS
6.1 Case for Using FRAM for Incident Modelling
6.2 Incident One – Thrust Reverser Incidents
6.2.1 Description of Incidents
6.2.2 Summary of the Investigations
6.2.3 Instantiation of the FRAM Model
6.2.4 The Sources of Variability
6.2.5 Insights from TASM
6.3 Incident 2 – Missing Rigging Pin
6.3.1 Description of Incident
6.3.2 Summary of Investigation
6.3.3 Instantiation of the TASM
6.3.4 Insights from TASM
7 USING THE TORNADO AIRWORTHINESS SYSTEM MODEL FOR RISK ANALYSIS
7.1 Case for Using TASM for Risk Analysis
7.2 Current Theoretical Basis for Airworthiness Risk Management
7.3 Proposal of FRAM Based Airworthiness Risk Theory
7.4 Proposal for a FRAM Based Risk Assessment Process
7.5 Risk Example – Operation of Components in Excess of Cleared Life
7.5.1 Generating a FRAM Model Risk Assessment
7.5.2 Insights into Risk
7.6 Proposal for a FRAM Based Risk Management
7.7 Chapter Summary
8 DISCUSSION
8.1 Applicability of the Resilience Engineering Paradigm to Airworthiness
8.2 The Tornado Airworthiness System Model – Initial Version
8.3 Incident Investigation
8.3.1 Data Collection
8.3.2 Aids to Investigation
8.4 Risk Assessment
8.4.1 Hazard Management vs Functional Resonance Management
8.5 Utility of the TASM for Type Airworthiness Activities
8.6 Utility of the TASM for Continuing Airworthiness Activities
8.7 Utility of TASM for Duty Holder Activity
8.8 Potential Use for System Improvement
8.9 Potential for Further Development of the TASM
8.9.1 Increased Model Fidelity
8.9.2 Application of Bayesian and/or Fuzzy Logic
8.9.3 Expansion into Operational Safety Management
8.10 Chapter Summary
9 CONCLUSIONS
9.1 Summary
9.2 Recommendations
9.2.1 Manage Airworthiness as a Control Problem
9.2.2 Use the TASM to Control the Airworthiness System
9.2.3 Review Airworthiness Risk from a Resilience Perspective
9.2.4 Use FRAM as a Means to Improve System Resilience and Efficiency
9.3 Potential for Further Research and Development
9.4 Concluding Remarks
REFERENCES
Appendix A – TORNADO AIRWORTHINESS FRAM MODEL
Appendix B – TORNADO AIRWORTHINESS MODEL VISUALISATION
Appendix C – PARTICIPANTS BRIEFING SHEET
LIST OF FIGURES
Figure 1-1 - Nimrod MR2 XV230
Figure 1-2 - RAF Tornado GR4 Aircraft
Figure 2-1 Accident Analysis and Risk Assessment Methods
Figure 2-2 Three Tracks on the Evolution of Safety Theory
Figure 2-3 The ‘Cynefin’ Framework – Complexity and Risk Management
Figure 2-4 General Form of a Model of Socio-technical Control
Figure 2-5 The Four Cornerstones of Resilience
Figure 2-6 Conceptual Framework for Resilience Engineering
Figure 2-7 Framework for managing the impact organisation, technology and human factors have on safety management systems
Figure 2-8 FRAM Function
Figure 4-1 FRAM Model Visualisation Demonstrating Taxonomy
Figure 4-2 TASM Step 12 – Screen Capture Showing Applicable Spreadsheet Areas
Figure 4-3 Visualising Functional Output Variability
Figure 4-4 Instances of Functional Output Variability Recorded in Occurrence Reports 2012/13
Figure 4-5 Instances of Reported Functional Output Variability by Function Type
Figure 4-6 Total Instances of Functional Output Variability Recorded in Occurrence Reports 2012/13
Figure 4-7 TASM Step 2 – Screen Capture Showing Applicable Spreadsheet Areas
Figure 4-8 Tracing Output Downstream Dependencies (Screen Capture)
Figure 4-9 Rough Score Matrix
Figure 4-10 Rough Downstream Function Variability Score
Figure 4-11 TASM Step 3 – Screen Capture Showing Applicable Spreadsheet Areas
Figure 4-12 TASM Step 4 – Screen Capture Showing Applicable Spreadsheet Areas
Figure 4-13 Example FRAM for 2 Functions, A and B
Figure 5-1 Visualisation Functional Groupings
Figure 5-2 A Function and Its Aspects
Figure 5-3 Screen Capture of Visualisation Tool with Functions Added
Figure 5-4 Screen Capture of Visualisation Tool with External Dependencies Added
Figure 5-5 5-6 Screen Capture of Visualisation Tool with all Functional Activities Shown
Figure 5-7 Activities and Dependencies Linked to Aspects of the ‘Train Maintenance Personnel’ Function
Figure 5-8 Selecting Layers within Visio – Screen Capture
Figure 5-9 DII Visio Viewer – Screen Capture
Figure 5-10 Visualisation Tool Key
Figure 6-1 Tornado GR4 with Thrust Reversers Deployed
Figure 6-2 Thrust Reverser Incidents Visualisation
Figure 6-3 Propulsion & Electrical System
Figure 6-4 Electrical System Potential Functionally Resonant Activities
Figure 6-5 Instantiation of Thrust Reverse Occurrence Reports
Figure 6-6 Location of Where Lost Pin was Installed
Figure 6-7 General Installation Location of Lost Pin
Figure 6-8 Pin Location in Tool Kit
Figure 6-9 Visualisation Tool Output for Rigging Tool Occurrence
Figure 6-10 Instantiation of Rigging Pin Occurrence
Figure 7-1 Tornado Process for Emergent Airworthiness Issues
Figure 7-2 Current Theoretical Basis for Tornado Airworthiness Risk Management
Figure 7-3 Proposed Functional Resonance Risk Management Theory - Visualisation of a Generic Hazardous Process
Figure 7-4 FRAM Model Risk Assessment Process
Figure 7-5 Operation of Components Beyond Cleared Life - First Stage Risk Visualisation, Excluding Background Functions
Figure 7-6 Visualisation of Hazard Generation Process
Figure 7-7 Visualisation of Potential Accident Processes
Figure 7-8 Proposed Risk Management Process
Figure 8-1 Fractal Property of the FRAM - Function Decomposed into Lower Level Functions
Figure 8-2 TASM Development Pathway
LIST OF TABLES
Table 2-1 Herrera's Ages of Safety Theory
Table 2-2 Benefits and Criticisms of Probabilistic Risk Assessment
Table 2-3 Examples of Resilience Engineering in Practice
Table 3-1 D-ASOR Classifications included in Data
Table 4-1 Example FRAM frame for Fault Diagnosis
Table 4-2 Listing of TASM Functions
Table 4-3 Summary of Internal Variability
Table 4-4 Summary of External Variability
Table 4-5 Example TASM Recording of Step 2a-c for Function 67 - Engine Fleet Monitoring
Table 4-6 Elaborate Description of Output Variability
Table 4-7 Characterising Output Variability – Flight Servicing
Table 4-8 Classifications for Frequency of Output Variability
Table 4-9 Classification of Amplitude of Performance Variability
Table 4-10 Aggregation of Variability for Flight Servicing
Table 4-11 Example of Step 4 - Flight Servicing
Table 6-1 Thrust Reverser Air Safety Occurrence Reports 2012/13
Table 6-2 Thrust Reverse Occurrences with Detailed Investigation
Table 6-3 Thrust Reverser FRAM Instantiation
Table 6-4 FRAM Model of Electrical System
Table 6-5 Electrical System Precondition Variability
Table 6-6 Functional Variability Noted From Investigation
Table 7-1 Configuration Management Aspects
Table 7-2 Summary of Second Stage of Risk Assessment
Table 7-3 Stage 2 - Scheduled Maintenance Function
Table 7-4 Stage 2 - Force and A4 Operations Function (Part 1)
Table 7-5 Stage 2 - Force and A4 Operations Function (Part 2)
Table 7-6 Stage 3 Replacement of Life Limited Parts Function
Table 7-7 Example Accident Generating Function FRAM Frame Layout
Table 7-8 Avionic Flight Systems Output – Baseline FRAM Model
Table 8-1 Utility of TASM for TAA Activities
Table 8-2 Potential CAMO Use of TASM
Table 8-3 Aviation Duty Holder Use of TASM
LIST OF EQUATIONS
Equation 1 – Linear System
Equation 2 - Additive Property
Equation 3 – Homogeneous Property
Equation 4 – Non Linear System; lack of Additive Property
Equation 5 – Non Linear System; lack of Homogeneous Property
Equation 6 - Rough Downstream Function Variability Score
LIST OF ABBREVIATIONS
A4        NATO designator for Logistics/Engineering
AC        Aircraft
AcciMap   Accident Map
ADF       Acceptable Deferred Fault
AEB       Accident Evolution and Barrier Function
AESO      Air Engineering Standing Orders
ALARP     As Low As Reasonably Practicable
ARC       Airworthiness Review Certificate
ATTAC     Aircraft Tornado Transformation Availability Contract
ASIMS     Air Safety Information Management System
ATHEANA   A Technique for Human Error ANAlysis
AWFL      AirWorthiness Flight Limitations
CAM       Continuing Airworthiness Manager
CAMO      Continuing Airworthiness Management Organisation
CAMSS     Continued Airworthiness Management Support Services
CMU       Combined Maintenance and Upgrade Unit
CREAM     Cognitive Reliability Error Analysis Method
CSNI      Committee on the Safety of Nuclear Installations
DAOS      Design Approved Organisation Scheme
DASOR     Defence Air Safety Occurrence Report
DE&S      Defence Equipment and Support
DII       Defence Information Infrastructure
DMS       Dedicated Maintenance System
DO        Design Organisation
DQAFF     Defence Quality Assurance Field Force
EA        Engineering Authority
EngO      Engineer Officer
ETTO      Efficiency Thoroughness Trade Off
FAST      Fast Air Support Team
FMEA      Failure Mode Effects Analysis
FMECA     Failure Mode and Criticality Analysis
FOC       Force Operations Centre
FRAM      Functional Resonance Analysis Method
GSE       Ground Support Equipment
HAS       Hardened Aircraft Shelter
HAZOPS    Hazard and Operability Study
HAZID     Hazard Identification
HCR       Human Cognitive Reliability
HEAT      Human Error Assessment Technique
HERA      Human Error in Air Traffic Management Technique
HFACS     Human Factors Analysis and Classification System
HPES      Human Performance Enhancement System
HRO       High Reliability Organisation
ITEA      Independent Technical Evaluation and Advice
JEngO     Junior Engineering Officer
JSP       Joint Service Publication
LFT       Latest Finish Time
LITS      Logistics Information Technology System
LOAA      Letter Of Airworthiness Authority
MAA       Military Airworthiness Authority
MAOS      Maintenance Approved Organisation Scheme
MAP-01    Manual of Airworthiness Processes - 01
MERMOS    Méthode d’Evaluation de la Réalisation des Missions Opérateur pour la Sûreté
MOD       Ministry of Defence
MMD       Man Made Disaster
MORT      Maintenance Oversight and Risk Tree
MSG       Maintenance Steering Group
MRP       MAA Regulatory Publications
MTO       Man-Technology-Organisation
MWO       Maintenance Work Order
NAT       Normal Accident Theory
NATO      North Atlantic Treaty Organisation
NETMA     NATO Eurofighter and Tornado Management Agency
OOPS      Out Of Phase Servicing
OSI       Occurrence Safety Investigation
ORG       Occurrence Review Group
PST       Propulsion Support Team
PT        Project Team
QA        Quality Assurance
QMS       Quality Management System
R2        2nd Line Repair
RA        Regulatory Article
RAF       Royal Air Force
RCA       Root Cause Analysis
ROCET     RB199 Operational Contract for Engine Transformation
RTS       Release To Service
RTSA      Release To Service Authority
SEngO     Senior Engineering Officer
SI(T)     Special Instruction (Technical)
SQEP      Suitably Qualified and Experienced Person
STAMP     Systems Theoretic Accident Model
STANEVAL  STANdards EVALuation
STEP      Sequential Timed Event Plotting
TAA       Type Airworthiness Authority
TAP       Technical Assistance Process
TASM      Tornado Airworthiness System Model
TGRF      Tornado Ground Attack & Reconnaissance Force
THERP     Technique for Human Reliability Analysis
TME       Testing and Measuring Equipment
TRACEr    Technique for The Retrospective Analysis of Cognitive Error
TSEMP     Tornado Safety and Environmental Management Plan
1 INTRODUCTION
“I can see him now, fighting with the controls, trying his best…
…We want some justice and the MOD to sit up and take notice, what they have
done could have been avoided; we live in hope that they will not let this happen
in the future.”
Mrs Adele Squires, Wife of Flight Lieutenant Al Squires, Captain of the Nimrod
aircraft XV230
1.1 Introduction
Air accidents have shown that aircraft are sometimes not as safe or as
airworthy as was previously imagined. With huge resources applied to ensuring
airworthiness, why do accidents still occur? It is often said that such accidents
could have been prevented had the lessons of the past been heeded. Yet despite many
investigations and recommendations, accidents still occur. Why is this so? Are
existing tools for safety analysis inadequately applied or inadequate in and of
themselves? How can those charged with responsibility for complex
hazardous systems in industry, transportation or the military work better to
prevent accidents and yet still achieve their operational objectives?
Design engineers are duty bound to demonstrate that their system may initially
be operated without an unacceptable level of harm. Thereafter, the system must
be maintained and continually monitored for the increase of risk beyond
acceptable levels. For organisations that have very few if any accidents, this is
a major challenge; how can the risk of something that has not happened be
measured and managed? In the aviation domain, airworthiness is a property
that requires continual management and assessment. Whilst the property is
attributable to the materiel itself, aircraft systems experience almost constant
contact with humans and thus airworthiness is inherently bound up with the
humans who manage, operate and maintain aircraft. Aircraft and their
supporting organisations are complex ‘socio-technical systems’. Resilience
engineering is a new concept that provides insight into this relationship and
offers useful models and tools for better management of the safety of such
systems. If complex socio-technical systems managing airworthiness are better
understood, then perhaps future accidents will be prevented.
1.2 Background – Theories of Safety
There have been accidents involving complex systems ever since the industrial
revolution. The management of safety has consequently been a concern since
these times but a theoretical basis for safety did not emerge until the 1930s.
Herrera (2012) divides the development of safety theory into 4 overlapping ages:
technology, human factors, organisational safety
and complexity. The age of technology dealt with the design of machines and
why they fail whereas human factors has traditionally been concerned with why
humans fail to do what is expected of them. Organisational safety has been
concerned with the safe management of potentially hazardous enterprises and
how these fail – Reason’s (1997) famous ‘Swiss Cheese’ being the pre-eminent
model in the field. However detailed the taxonomies of failure in the first 3 ages
were, the associated models of accident causation have been linear. An
emerging 4th age of safety theory is that of complexity. In complexity theories,
accident causation models are non-linear and are sometimes said to be
intractable. The term Resilience Engineering has come to encompass the use
of these models; the practice seeks to develop socio-technological systems that
are resilient against those variations in system performance which may cause
accidents. Airworthiness has been mostly associated with technological safety
theory, with reliability and safety assessment methods such as fault tree
analysis dominating the thinking of designers and regulators. Although design
for human factors has been an issue since the 1940s the field has generally
been concerned with operator performance; human factors in maintenance has
only more recently come into the spotlight (Reason and Hobbs, 2003). The
ability of aircraft operating authorities and regulators to maintain continuing
airworthiness has been the subject of analysis from an organisational safety
standpoint due to accidents such as Alaska Air 261 (Woltjer, 2007). More
recently accidents such as Air France 447 (Stoop, 2013) have shown that
unexpected results can emerge from increasingly complex systems and that a
lack of resilience can be fatal.
1.3 Background – The Practical Requirement
This research will specifically address the management of airworthiness within
the United Kingdom’s military. On the 2nd September 2006 the UK military
suffered its single largest loss of life since the 1982 Falklands War, when a
Royal Air Force (RAF) Nimrod MR2 aircraft was destroyed near Kandahar,
Afghanistan. This was not the consequence of a hostile act or the outcome of
operator error. It was an accident caused by a failure to establish the correct
level of initial airworthiness through the design of modifications and thereafter a
failure to maintain continuing airworthiness in the condition of fuel and hot air
systems. The independent inquiry into the incident identified that the deeper
causes were organizational and managerial (Haddon-Cave, 2009).
Figure 1-1 - Nimrod MR2 XV230 (McKenzie, 2012)
As a consequence of the recommendations made by the Nimrod Review,
military airworthiness management has been comprehensively overhauled as
part of a reorganisation of ‘Air Safety’ within the Ministry of Defence (MOD). The
previously “byzantine” (Haddon-Cave, 2009) regulation of air safety has been
simplified through the establishment of the Military Aviation Authority (MAA).
Key to the new system has been the establishment of a chain of ‘duty holders’
who are named senior military officers with legal responsibility for the safety of
aircraft operated by their organisation. Duty Holders rely on Type Airworthiness
Authorities (TAA) and Continuing Airworthiness Managers (CAM) to ensure
that the airworthiness of their aircraft is adequately established and maintained.
In practice this is achieved through a variety of processes aimed at managing
the risk of a technical failure. There is an engineering programme to maintain
the integrity of the system’s initial airworthiness whilst developing the system’s
capability and also a maintenance programme specified by the Engineering
Authority (EA) (reporting to the TAA) and implemented by the Continuing
Airworthiness Management Organisation (CAMO). In common with most socio-
technological systems, these processes do not operate exactly as designed or
documented. Particular concerns centre around the human factors within the
maintenance programme and whether or not appropriate engineering
‘standards and practices’ can be ensured in the face of pressures to produce
operational output within an increasingly lean front line organisation. These are
typical examples of the Efficiency-Thoroughness Trade-Off (ETTO) principle
highlighted within the Resilience Engineering literature (Hollnagel et al, 2007).
Practical experience of the messy realities of military aircraft operations and
back-office airworthiness assessment was the genesis of the research aim. This
research uses the RAF’s Tornado Ground Attack Reconnaissance Force
(TGRF) as a case study.
Figure 1-2 - RAF Tornado GR4 Aircraft (Crown Copyright, 2009)
1.4 What is ‘Airworthiness Management’?
Large organisations exist to maintain, modify, provide resources, operate and
monitor aircraft fleets in order to keep them airworthy. The way in which this
multitude of functions is carried out has a variety of effects on the aircraft
system and the property of airworthiness. Those responsible for airworthiness
can only manage it indirectly by managing the functioning of the organisation.
This is achieved by means of tasking maintenance or setting policy, defining an
organisational structure (including contracting out elements), providing
resources and conducting quality assurance. So whilst making engineering
assessments and specifying what physical actions are to be carried out on an
aircraft system is critical, the management of airworthiness is a wider
endeavour.
1.5 The Research Aim
The aim of this thesis is:
To apply resilience engineering concepts by producing a system
model of an airworthiness management organisation in order to
provide a tool to improve management of airworthiness.
1.6 Objectives
In order to achieve the aim the following research objectives were established:
• Review the theoretical background to safety management and the implications for airworthiness management.
• Review the concepts of Resilience Engineering with an emphasis on applying it to airworthiness management.
• Establish a theoretical framework for a model of an airworthiness management system.
• Gather and use primary research data to establish and validate a model of the airworthiness management system for the RAF Tornado Force.
• Using the model, develop a tool to enhance the airworthiness management system of the RAF Tornado Force.
1.7 Methodology Overview
A literature review of resilience engineering was carried out, which branched out
into source disciplines of systems thinking and engineering; control theory; non-
linear dynamics and complexity theory. A search for work in this area
addressing airworthiness or technical safety in other domains was conducted.
For the Tornado case study, the safety, airworthiness and assurance plans of
the various elements of the organisation were examined. Resilience
engineering provides a number of modelling techniques that could be applied to
the case study; these were assessed and down selected to the Functional
Resonance Analysis Method (FRAM). The system was assessed by semi-
structured interviews with key personnel as well as using a large amount of
information and experience gained from working within the system. The FRAM
Model was built within a spreadsheet and a separate model visualisation tool
was created using Microsoft Visio. This allowed for the identification of various
potential leading indicators for system safety. In order to validate the FRAM
model, specific case studies were required. Two incident reports and an
emergent airworthiness risk were selected for analysis.
1.8 Descriptions and Definitions
For simplicity the standard terminology as described within MAA02 – Military
Aviation Authority Master Glossary (MAA, 2012) is adopted for this thesis.
There are a number of minor differences in emphasis between terms used here
and in civil aviation or other domains; these are discussed where relevant.
1.9 Thesis Structure
This thesis is structured around the research objectives:
• Chapter 2 describes the theoretical foundations for resilience engineering in the context of the other theories of safety and safety engineering practice in other domains. Potentially useful models are analysed.
• Chapter 3 details the methodology for carrying out the primary research.
• Chapter 4 describes the process for building the case study FRAM Model – the Tornado Airworthiness System Model.
• Chapter 5 describes the development of the FRAM visualisation tool.
• Chapter 6 discusses how the FRAM Model may be used for incident analysis with reference to two examples.
• Chapter 7 gives a process for, and an example of, using the FRAM Model as a risk assessment tool.
• Chapter 8 provides a general discussion of the case study exercise, focussing on the applicability of Resilience Engineering to aspects of airworthiness practice.
• Chapter 9 provides some conclusions.
2 LITERATURE REVIEW
The literature review will examine arguments for broadening the scope of
airworthiness to address the complexities of managing modern aircraft,
maintenance and support organisations. Existing notions of cause, failure and
hazards are challenged as the theoretical background to resilience engineering
is described. Models and methods for understanding and managing the safety
and airworthiness of complex systems are examined using the paradigm of
resilience engineering.
2.1 Airworthiness in the Context of Safety
There are a number of definitions for the term airworthiness; all these have at
their core the need for the aircraft to be able to be operated in safety or as the
MAA has it; ‘without significant hazard’. Hazard is further defined as ‘an
intermediate state where the potential for harm exists’ (MAA, 2012b). The
hazard is said to lie between a cause (such as a technical or human failure) and
an accident. So whilst airworthiness is clearly a target for aerospace design
organisations to meet through satisfaction of certification standards, it is also an
element of system safety that requires management throughout the lifecycle of
the system. It is analogous to ‘technical safety’ in other domains, which is
often separated from ‘operational’ or ‘occupational’ safety.
2.1.1 Accident Investigations
The need to investigate loss of life or near misses is both a pragmatic and moral
choice. The conclusions drawn from such investigations are extremely
important at a human level but also critical to restoring system safety. It is
therefore vital for accident investigators to use mental and procedural models
that reflect the complexity of modern technologies. One of the largest accident
investigation agencies, the National Transportation Safety Board (NTSB)
determines a ‘probable cause’ in all its reports (Johnson and Holloway, 2004)
but ICAO recommends that ‘causes’ (plural) are determined (ICAO, 2001).
This indicates a governing accident chain theory in the former organisation but
perhaps a slightly more sophisticated model in the latter. Various writers (De
Landre et al., 2006; Coury et al., 2008) have proposed models or frameworks
in which multiple causes can be described in accident investigation. Much has
been written about the intersection between legal frameworks and accident
investigation methodologies. Dekker (2003) for example has described the
detrimental effect of the adversarial nature of justice. The rest of this chapter will
describe how assigning ‘root’ or probable cause to accidents is potentially
unhelpful in the context of complex systems. It follows therefore that notions of
blame or individual responsibility are often problematic to apply.
2.1.2 Initial and Type Airworthiness
Much of the airworthiness of a system is ‘designed-in’ before manufacture. This
involves specifications, systems configuration and assumptions on support and
maintenance philosophy. A structured systems engineering approach to safety
as described in ARP 4761 (SAE, 1996) is used to convince regulators that a
type certificate can be issued. The evolution of safety requirements and
regulation over a system’s lifecycle causes difficulty (Kelly and McDermid,
1999). Military aircraft in particular are often retained in service for many
decades. Whilst the technology may remain relatively constant, experience
shows that it is usual for operational usage to evolve over the course of the
lifecycle. For this reason it is important to regularly adjust, validate and reassess
airworthiness assessments if the type airworthiness of a design is to be
maintained.
2.1.3 Safety Management
For many complex systems, the development of safety cases is a mandatory
requirement (MoD, 2007) and in particular for military airworthiness this is
governed by MAA Regulatory Article 1205 (MAA, 2013). The concept of a
safety case is the presentation or collation of a body of evidence to assure
interested parties that the system is safe. This body of evidence is collected and
organised according to mental or procedural models. The theoretical basis for
these models is the same set of theories of safety described below. Safety
management systems are similarly structured according to the prevailing
theoretical approach to safety. An evolution in modelling requires an evolved
approach to safety management.
2.1.4 Continuing Airworthiness
Continuing airworthiness relates to the maintenance of a particular, safe system
state for each of the individual aircraft being managed (MAA, 2012b). Given that
it is never possible to comprehensively inspect/audit each aircraft before every
flight, there must be assumptions made as to the effect of organisational and
human interactions with the aircraft so as to maintain the system in a safe state.
Understanding maintenance system performance is critical to assuring
continued airworthiness. This is achieved through a Continuing Airworthiness
Management Organisation (CAMO) which provides assurance that its specified
tasks are being undertaken successfully. This is primarily achieved through a
quality assurance system, which ensures that rigorous processes are
established (Casey, 2013).
2.2 A History of Safety Theory
Chapter One sketched out a chronological view of ‘Ages’ of safety theory. New
theories tend to gain traction as a result of the investigation of major accidents.
Herrera (2012) describes how safety theory has evolved across technological,
human factors, organisational and complexity ‘ages’, identifying key accidents
and ideas on a time line, which is summarised in Table 2-1:
Table 2-1 Herrera's Ages of Safety Theory
Time | Accidents | "Age" of Safety Theory (entries in column order: Technology, Human Factors, Organisational, Complexity)
1930s | | Domino Model
1940-50s | | Failure Mode Effects Analysis (FMEA); Human Factors Design; Task Analysis
1960s | Aberfan Colliery Disaster | Fault Tree Analysis (FTA) - Minuteman Missiles & Boeing aircraft; Energy Barrier Model; Technique for Human Error Rate Prediction
1970s | Flixborough & Seveso Chemical Plants; Tenerife Aircraft Collision; Three Mile Island Nuclear Plant | Probabilistic Risk Assessment (WASH-1400 Reactor Safety Study); Hazard & Operability Analysis; Energy Damage and Countermeasure Strategies; Man Made Disaster; Information Perspective
1980s | Bhopal Chemical Plant; Challenger Space Shuttle; Chernobyl Nuclear Plant; Kings Cross Railway; Piper Alpha Oil & Gas; Dryden Aviation | Crew Resource Management; Safety Culture; Swiss Cheese Model; Normal Accident Theory
1990s | Warsaw Air Crash; Iraq Friendly Fire; Cali Air Crash; Ariane 5 - Space; Norne Air Crash; Longford Oil & Gas | Mandatory Safety Cases (UK); Normal Deviations; Man, Technology and Organisation Concept; Drift into Failure; Risk Influence Model; High Reliability Organisations
2000s | Überlingen Air Crash; Columbia Space Shuttle; Helios Airways; Texas City Refinery; Nimrod Air Crash; Air France 447; Deepwater Horizon | Human Factors Analysis & Classification System; Failure of Leadership, Culture & Priorities; Aviation Safety Management Systems; Resilience Engineering; Theory of Practical Drift

Leonhardt et al (2009) present a breakdown of safety methodologies within a
Resilience Engineering White Paper. This document describes Technical,
Human Factors, Organisational and Systemic accident analysis and risk
assessment methods. Systemic models/methods are those that have recently
emerged to provide a means of analysing safety from a ‘complexity’ standpoint.
These are shown chronologically in Figure 2-1 with an expansion of each
abbreviation available within the glossary.
Figure 2-1 Accident Analysis and Risk Assessment Methods (Leonhardt et al,
2009)
Saleh et al (2010) present a slightly different narrative in the development of
safety theory. Whilst they note most of the same key ideas and developments,
they identify three tracks in safety theory leading towards the modern ‘system
and control theoretic’ approach. These are illustrated below:
Figure 2-2 Three Tracks on the Evolution of Safety Theory (Saleh et al., 2010)
The tracks are not exhaustive and there is some cross coupling between ideas.
Herrera’s (2012) technological age can be likened to the middle track, the
defence in depth track is comparable to the organisational age whilst the top
track has many human factors elements but takes much from the current ‘age of
complexity’. The current state of the art is given as a systems engineering-
control theory approach. Saleh et al. (2010) acknowledge that the literature in the
field is particularly fractured. This is perhaps because the various theories
emanate from disparate fields such as psychology, reliability, operations studies
and management.
2.2.1 Technological Age – Governing Philosophy
The predominant theme in the technological age of safety theory is that of a
‘chain of causation’; first visualised as a set of toppling dominos by Heinrich
(1950). Each domino represented a factor in the accident: Management
controls; failure of a man; unsafe acts or mechanical conditions; the accident;
injury. Once the first domino was toppled, removal of any of the others would
prevent the final injury domino toppling. Related to this is the concept of an
accident or event chain, where causative elements or events link together to
form a chain, which if it had been broken would have prevented the accident. It
is unclear where this idea originated; it is perhaps a reflection that a linear view
of the world still represents the defining popular narrative for any major
accident. Leveson (2011) links this to an erroneous assumption that there is
always a cause for any given accident.
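The linear logic of the domino model can be made concrete with a minimal sketch. It assumes, purely for illustration, that each factor listed above is either present or removed and that a domino falls only if every domino before it has fallen; the function name and data structure are hypothetical and are not drawn from Heinrich.

```python
# Illustrative only: the five factors are those listed in the text above.
DOMINOES = [
    "management controls",
    "failure of a man",
    "unsafe acts or mechanical conditions",
    "the accident",
    "injury",
]


def injury_occurs(removed=frozenset()):
    """Return True if the toppling sequence reaches the final domino.

    The first domino is assumed to have toppled already. Each later
    domino falls only if it is still present in the chain.
    """
    for factor in DOMINOES[1:]:
        if factor in removed:
            return False  # chain broken: nothing downstream can fall
    return True
```

In this representation removing any single factor prevents the injury, which is exactly the property that makes the model attractive and also what later authors criticise: there is no way to express two factors interacting or influencing each other in parallel.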
2.2.2 Technological Age – Tools
The notion of a linear event chain gave rise to methods of analysing system
safety or the related property of reliability. The Fault Tree Analysis (FTA)
methodologies were developed from reliability studies of the American
Minuteman missile system and quickly developed into a methodology for
analysing safety by defining the probability of an unsafe condition developing
(Herrera, 2012). Closely associated are event trees which define hierarchies of
events post a single initiating event (such as an unsafe condition). These
analyses use stochastic methods to forecast top level probabilities for accidents
caused by single or multiple failures lower down in the system. There is always
a mathematical audit trail from the top level system safety target, for example
hull loss probability in commercial aviation, down to individual system or
component reliability data or predictions. Importantly, modern system safety
assessments contain more qualitative information based on expert
understanding of systems; carried out through Functional Hazard Assessments
(FHAs) (Dalton, 1996). When analysing accidents using event chain type
models such as FTA, there is a question of how far back it is appropriate to go
in order to find an initiating event. Leveson (2011) argues that selection of
initiating events is often arbitrary in accident analysis. It has been accepted in a
large number of major accident reports that management commitment to safety
or ‘safety culture’ is a key factor in risk of accident (Dekker, 2005), yet there is
no clear way in which these vital considerations can be fitted into an event chain
model. Reason (1997) espouses a version of the event chain in the famous
‘Swiss cheese’ model of organisational accidents. Reason’s cheese has
become the de-facto mental model for understanding safety and accidents
within the military aviation community, as articles in the RAF’s in-house safety
magazine Air Clues demonstrate (Anon, 2011; Gale et al., 2013).
Whilst Haddon-Cave’s (2009) investigation into Nimrod addresses issues of
culture and complexity, his view of causation is essentially linear. Leveson
(2011) outlines why linear accident models of the technological age such as the
Swiss Cheese are no longer considered acceptable:
• Direct Causality – there is a reliance on the notion that there is always a linear relationship in which event A causes event B.
• Subjectivity in Selecting Events – The backward chain of events is often shown to stop for a number of arbitrary reasons, which could include familiarity with a particular event in the sequence (“We’ve seen this before”), deviation from a standard (a component operates outside its specification) or a lack of information (such as inability to understand a human performance issue).
• Subjectivity in Selecting Chaining Conditions – It is often not clear which factors caused each other.
• Discounting System Factors – Event chain models generally deal with
proximate causes and do not deal with issues such as culture or
organisational pressures which can pervade through a socio-technical
system.
A useful example of how this approach to accident analysis can prove
disastrous is given by Leveson (2011). She notes how an incident in which a
DC-10 lost a cargo door (without loss of life) was attributed to the failure of a
baggage handler to close the door properly rather than to a design flaw; as a
result, two years later a similar incident resulted in the complete loss of a
DC-10 near Paris in 1974.
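The quantitative side of fault tree analysis discussed in this section can be illustrated with a short sketch. It assumes statistically independent basic events combined through AND and OR gates; the event names and probabilities are hypothetical and chosen only to show the arithmetic, not taken from any real safety assessment.

```python
from math import prod


def and_gate(*probs):
    """Output event occurs only if all independent input events occur."""
    return prod(probs)


def or_gate(*probs):
    """Output event occurs if at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


# Hypothetical basic-event probabilities (per flight hour).
p_pump_a = 1e-3
p_pump_b = 1e-3
p_valve = 1e-5

# Top event: loss of supply, caused by both pumps failing OR the valve failing.
loss_of_both_pumps = and_gate(p_pump_a, p_pump_b)
top_event = or_gate(loss_of_both_pumps, p_valve)
```

The calculation gives a traceable audit trail from component data to the top-level figure, but everything outside the chosen basic events, such as the organisational factors discussed above, has no place in the tree.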
2.2.3 Limits of Probabilistic Risk Assessment
Both civil and military airworthiness certification standards require certain safety
targets to be met. These targets are expressed in terms of probabilities,
principally probability of hull loss and death of passengers or crew; for military
aircraft this is specified in Regulatory Article 1230 – Design Safety Targets
(MAA, 2012a). There are various other targets regarding risk of harm to third
parties or other unsafe conditions – these are operating risks. Operating risks
are also commonly assigned qualitative risk levels; in the case of military
aviation this process is specified in Regulatory Article 1210 – Management of
Operating Risk to Life (MAA, 2012a). This regulation advises Platform
Operators and Project Teams to make use of Fault Tree Analysis to enable
calculation of these risks. For some UK military platforms this has resulted in
the introduction of ‘Loss Models’ to guide the assessment of new or emergent
risks. In the case of Tornado, the Loss Model (Sugden, 2011) is not a tool that
can be used in isolation for predictive risk assessment; rather it uses incident
statistics to provide a current picture of loss rates across the fleet (Woodbridge,
2012). The regulation and recommended practice (SAE, 2010; Lloyd and Tye,
1982) for both civil and military airworthiness and safety targets is for the use of
fault tree and dependency diagram models. These methods of probabilistic risk
assessment (PRA) are linear, which usefully provides for aggregation of total
risk. There are however a variety of issues to consider in their use. Apostolakis
(2004) provides a summary of some of the benefits and criticisms of PRA.
However in the case of airworthiness certification risk assessments the process
is generally based on a qualitative assessment of Functional Hazard Analysis
(FHA). FHA allows expert subjective analysis to provide an element of linkage
between various hazards. Equally Common Cause Analysis (CCA)
methodologies go some way to accounting for system-wide failure mechanisms.
The literature on resilience engineering disputes Apostolakis’ (2004) claim that
PRA deals effectively with true complexity.
Table 2-2 Benefits and Criticisms of Probabilistic Risk Assessment (Apostolakis, 2004)
Benefits | Criticisms
Multiple failures considered. | Human actions during accident scenarios cannot be modelled.
Increases likelihood of spotting complex failure interactions. | Difficulty of quantifying software failures.
Facilitates communication. | Cannot model safety culture.
Integrated Approach. | Difficulty estimating design and manufacturing errors.
Identifies unknown areas for research. |
Focuses risk management activity on key areas. |
PRA models are essentially a product of the ‘technical era’ of safety science;
they assume linear behaviour and that the systems being analysed are
tractable and thus decomposable into independent subsystems. This remains the
de-facto approach to managing most complex socio-technical systems and
forms the basis of the safety case approach prevalent within many regulatory
environments. The fundamental assumptions that justify their use are
questionable when applied to complex socio-technical systems. The principal
concern is that the human element cannot be satisfactorily modelled using
Boolean logic. In systems where there are frequent interactions with humans,
whether operators, maintainers or design or support engineers, this presents the
possibility that common cause failures will be built into the system and that the
relationships will be non-linear.
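The sensitivity of a PRA result to the independence assumption can be shown with a small sketch. It uses the simple beta-factor approximation for common cause failure, in which a fraction beta of each channel's failure probability is attributed to a shared cause (for example, the same maintenance error made on both channels). The beta-factor model is an assumption introduced here for illustration, and the numbers are hypothetical.

```python
def both_fail_independent(p):
    """Probability that two redundant channels both fail, assuming independence."""
    return p * p


def both_fail_beta_factor(p, beta):
    """Beta-factor approximation for two redundant channels.

    A fraction `beta` of each channel's failure probability `p` is due to a
    common cause that defeats both channels at once; the remainder is
    treated as independent.
    """
    independent_part = ((1.0 - beta) * p) ** 2
    common_part = beta * p
    return independent_part + common_part


# Hypothetical values: channel failure probability 1e-3, 10% common cause.
p_channel = 1e-3
independent = both_fail_independent(p_channel)
with_common_cause = both_fail_beta_factor(p_channel, beta=0.1)
underestimate_ratio = with_common_cause / independent
```

With these illustrative figures the independence assumption understates the probability of losing both channels by roughly two orders of magnitude, which is why the treatment of human and organisational influences matters so much to the credibility of the aggregated risk.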
2.2.4 Human Factors
Herrera (2012) outlines how 20th century disasters such as Three Mile Island
and Flixborough showed that the event chain models were becoming
inadequate – the focus began to shift to human failing, with the human identified
as the number one unreliable component in the event chain. Herrera (2012)
highlights two trends in the age of human factors; studies concerned with
eliminating human error by design for human performance and studies into how
humans cope with disturbances.
2.2.5 Organisational
‘Man Made Disaster’ theory was the initiating scholarly theory behind
organisational accident theory (Saleh et al., 2010). This theory noted that within
a certain class of events known as ‘man made disasters’ there were multiple
event chains that reached a long way back into the past and that management and
organisation were key factors in causing accidents. Saleh et al. (2010) also note
‘Normal Accident Theory’ and ‘High Reliability Organisations’ as key precepts of
the organizational accident. Normal accident theory notes that there are tight
couplings between interacting causal factors in complex system accidents and
that they cannot be predicted. This has been condemned as a somewhat
fatalistic view. Herrera (2012) sees High Reliability Organisation Theory as a
counter to Normal Accident Theory. This characterises successful organisations
as those operating complex systems with a very small number of accidents.
Saleh et al. (2010) note that the research highlights a number of common
characteristics of such organisations such as:
• Preoccupation with failure and organizational learning.
• Commitment to and consensus on production and safety as concomitant organizational goals.
• Organizational slack and redundancy.
These facets of successfully safe or high reliability organisations correspond to
aspects of ‘safety culture’ as described by Reason (1997) and others.
2.3 Complexity
Aircraft are complicated machines; they have many components interacting in a
multitude of combinations. Dekker (2011) holds that analytic reduction, as
practised within traditional linear safety analysis, is unable to describe how
system elements and processes behave when exposed to multiple
simultaneous influences. He also describes the key distinction between a
complicated system such as an aircraft, which could conceivably be
disassembled then reassembled by a single person, and a complex system. A
complex system is one where the boundaries are ‘fussy’ (require highly detailed
definition) and the structure is intractable; an aircraft operated subject to human
factors, culture, regulatory and organisational factors is therefore complex.
Cilliers (2005) defines complex systems as those having the following
properties:
• Large numbers of simple elements.
• Dynamic, propagating and non-linear interactions; these define behaviour which is emergent and cannot be understood by inspection of components nor predicted by deterministic methods.
• Open, exchanging energy and information with the environment.
• Memory is distributed within the system, influencing behaviour.
• Adaptive behaviour; without the intervention of external agents.
This study assumes that the complete aircraft system, incorporating its
operation and support is complex rather than simply complicated. It could also
be argued that the addition of extensive software within aircraft renders the
system complex. The safety management system and airworthiness
management in particular must deal with complexity.
For those charged with managing the safety of complex systems, understanding
models for accidents and studying post mortem analyses of accidents does not
present a comprehensive approach to prevention. It is generally accepted that
events, hazards and risks often combine in unexpected ways. Is it therefore
adequate to manage safety risk as a game of ‘whack-a-mole’; eliminating or
mitigating risks as and when they become apparent (Zarboutis and Wright,
2006)?
It may be argued that a proactive reporting culture does much to allow
elimination or mitigation of risks before they materialise. Heinrich’s (1950)
‘iceberg’ model drives much of this effort to uncover previously unknown risk and
there is an indisputable logic which says that knowing about a risk is a first step
to eliminating or managing it. The continued history of complex accidents tells
us that this approach may never be completely effective in preventing
unexpected failure (Hollnagel, 2007). Leveson (2011) explains that the concept
of a High Reliability Organisation confuses notions of safety and reliability. Just
because individual components of a socio-technical system can be proven to be
individually reliable it does not follow that safety will necessarily emerge as a
system property. Systems may be reliable yet unsafe, such as the NASA Mars
lander which crashed because the designer failed to anticipate the interaction
between the software and mechanical systems. Equally it is possible for a
system to be unreliable yet safe where systems fail-safe.
2.3.1 Complexity Theory
Accident investigation or analysis of complex system failure requires a mental
model to be applied to the accident scenario (Hollnagel, 2011). Similarly
accident prevention through risk management uses modelling to understand
potential accidents. Hitchens (2003) describes how complexity is relative to the
observer’s frame of reference. Modelling complex systems requires judgement
as to the extent of elaboration or its converse; encapsulation. He proposes that
systems derive their degree of complexity from their variety, connectedness and
disorder. Socio-technical systems are increasing in complexity as a result of the
increased use of networks. Manson (2001) provides a useful review of
complexity theory, most of the branches of which have an antecedent in general
systems theory. Three main branches of complexity theory are identified.
‘Algorithmic complexity’ holds that complexity is defined by the difficulty in
describing system characteristics. ‘Deterministic complexity’ deals with chaos or
catastrophe theories which posit that stable complex systems may become
suddenly unstable. ‘Aggregate complexity’ deals with how elements interact to
produce complexity. A key property of complex systems is that of emergence,
which describes how system-wide characteristics cannot be computed by the
aggregation of system component behaviour. Zarboutis and Wright (2006) highlight
that patterns can emerge from complex socio-technical systems which erode the
resilience of those systems. Grøtan et al (2011) give a good account of the
theoretical foundations of complexity and how they can be applied to risk
assessment; the ‘Cynefin’ Framework provides a summary.
Figure 2-3 The ‘Cynefin’ Framework – Complexity and Risk Management (Grøtan
et al., 2011)
Generally the literature shows that whilst linear thinking has reached its limits
within system safety science, complexity theory has yet to be completely
applied to the problem. Zarboutis (2006) identifies how complexity theories can
be used to replace HAZOPS type safety analyses. The key inputs should be:
How can system entities co-adapt?
What will the probable effect be on the whole?
How can such patterns be eliminated?
The output of such an analysis should therefore be some means of avoiding the
emergent harmful properties. Dekker (2011) advises that complexity theories
can be applied to accident investigation if the search for a single cause is
dropped and multiple narratives are allowed to overlap and on occasion
contradict each other. The nature of complexity defies analysis; Cilliers (2005)
writes on the ‘incompressibility’ of complex systems, in that the only reliable
model of a complex system is that which has the same level of detail as the
system itself. Clearly this is impractical, yet as any model will involve
simplification, disregarded elements may have non-linear effects and the
magnitude of the potential outcomes may be non-trivial. However Cilliers (2005)
also states that whilst modelling and computing complex systems will never be
sufficient, it is still necessary.
2.3.2 Systems Thinking and Systems Engineering
The concept of a system is well established, with roots in philosophy and
thermodynamics that led to the theory and practice of systems
engineering. Hitchens (2003) provides one definition:
A system is an open set of complementary, interacting parts with properties,
capabilities and behaviours emerging both from the parts and their interactions.
The concept of emergence is an important one; accidents are emergent system
states of disorder. Systems engineering involves the generation of models to
represent a system (Oliver et al., 1997). Leveson (2011) first describes how
safety ought to fit into systems engineering’s primary activities – Needs
Analysis, Feasibility studies, Trade studies, System architecture development
and Interface analysis. This is the basis for system safety assessments
employed in generating evidence for airworthiness certification as per ARP
4761 (Dalton, 1996). Saleh (2010) distinguishes between failure modes
attributable to component failure and those failures attributable to emergent or
interactive failures; his thesis is that a systems theoretic approach addresses
this second set of failures. However he raises concerns that formal systems
theoretic approaches such as co-ordinatability and consistency in hierarchical
and multilevel systems are yet to be fully applied to safety analysis. Leveson’s
(2011) Systems-Theoretic Accident Model and Processes (STAMP) uses
control theory and processes as the key to prevention of accidents. It
decomposes the system across the complete lifecycle, from concept to
disposal, into a series of control loops. The key to prevention of accidents is
said to be keeping the entire system in a state of equilibrium, which is achieved
by applying constraints to implement control. The model is said to more
effectively deal with software than traditional notions of failure. STAMP utilises
descriptions of control loops at technological subsystem level, human controller
level and socio-technical organisation level, shown in Figure 2-4. STAMP uses
a taxonomy of control loop failure modes as an audit check list. Salmon et al
(2012) compare STAMP to other models, concluding that STAMP provides a
more comprehensive system description but it is difficult to incorporate human
failures into the model, which itself needs a highly developed understanding of
the whole system. This highlights the difficulty in applying theoretically strong
models of complexity to particular scenarios.
Figure 2-4 General Form of a Model of Socio-technical Control (Leveson,
2011)
2.3.3 Control Theory
STAMP (Leveson, 2011) suggests that safety can be treated as a control
engineering problem and Saleh (2010) identifies this idea as an important
corollary to the development of a systems thinking approach to safety.
Kontogiannis and Malakis (2012a) describe how the concept of a model with
control loops is fundamental to systems safety incorporating human and
organisational factors. Hollnagel and Woods (2005) produced an Extended
COntrol Model (ECOM) which describes generically how organisational
processes transfer downwards to directly interact with and control the technological
system and hence alter its state. The Viable System Model (VSM) uses
cybernetics principles to describe how safety goals are transferred downwards
through an organisation and how output is controlled by various measures such
as audit (Espejo, 1989). Kontogiannis (2012a) combines these two models and
applies them to studying the accident involving the crash of flight AEW-241 in
December 1997. As with many control and systems models in the safety literature,
Kontogiannis (2012a) highlights the difficulty of applying the models for the
purposes of accident prevention. Kontogiannis (2012b) also tries to apply these
principles in a case study involving emergency helicopter operations.
2.3.4 Non-Linear Dynamics
Control of complex socio-technical systems needs to address the problem of
non-linear behaviour. Bendat (1998) describes how physical and engineering
systems can be divided into linear and non-linear systems. A system $H$ is linear
if, for any inputs $x_1$ and $x_2$ and for any constants $c_1$ and $c_2$,
Equation 1 – Linear System (Bendat, 1998)
$$H[c_1 x_1 + c_2 x_2] = c_1 H[x_1] + c_2 H[x_2]$$
This leads to 2 properties:
Equation 2 - Additive Property (Bendat, 1998)
$$H[x_1 + x_2] = H[x_1] + H[x_2]$$
Equation 3 – Homogeneous Property (Bendat, 1998)
$$H[c x] = c H[x]$$
A non-linear system is therefore one where,
Equation 4 – Non Linear System; lack of Additive Property (Bendat, 1998)
$$H[x_1 + x_2] \neq H[x_1] + H[x_2]$$
Equation 5 – Non Linear System; lack of Homogeneous Property (Bendat, 1998)
$$H[c x] \neq c H[x]$$
This means that a linear system given an input with a Gaussian probability
density function (i.e. a normal distribution) will produce an output that
also has a Gaussian probability density function. Bendat (1998) also makes the point that any
physical system will display non-linear properties if the input conditions are
suitably wide. As this is true for numerous examples in flight dynamics it is also
true for various instances in safety and reliability, where oversimplifying
assumptions are made regarding the condition of equipment and its interaction
with maintenance and operating organisations. Human behaviour often defies
mathematical modelling due to its complexity and non-linear properties. As
previously described, it is common for safety analyses and models to assume
linear behaviour. In fact complex socio-technical systems generally exhibit a
lack of additive and homogeneous properties; where different inputs combine to
produce unexpected and ‘out-of-control’ outputs resulting in accidents. This
explains some of the difficulties encountered in producing a workable approach
to human and organisational reliability, as outlined by Rasmussen (1997).
Non-linear effects explain the concept of emergence; that is, the behaviour of
linear systems is predictable and tractable, yet non-linear systems produce
unexpected results. Grøtan (2011) outlines how this leads to the concept of
‘Black Swan’ events that are unexpected with a huge impact – such as a
catastrophic accident with a complex system. These are understandable in
retrospect but could not have been predicted. Leveson (2011) describes how
such accidents are the result of non-linear interactions between components of
the system, whether human, organisational or technological. The key to
developing an improved method of managing safety and estimating risk will be
to understand and predict these non-linear interactions.
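The additive and homogeneous properties given in Equations 2 to 5 can be checked numerically. The following is a minimal sketch and is not taken from Bendat (1998): the two example systems (a constant gain and a saturating gain) and all function names are illustrative assumptions.

```python
"""Numerical check of the additive and homogeneous properties."""


def linear_gain(x):
    """Hypothetical linear system: the output is twice the input."""
    return 2.0 * x


def saturating_gain(x, limit=1.0):
    """Hypothetical non-linear system: gain of two, clipped at +/- limit."""
    return max(-limit, min(limit, 2.0 * x))


def is_additive(system, x1, x2, tol=1e-9):
    """True if system(x1 + x2) equals system(x1) + system(x2) for this pair."""
    return abs(system(x1 + x2) - (system(x1) + system(x2))) < tol


def is_homogeneous(system, x, c, tol=1e-9):
    """True if system(c * x) equals c * system(x) for this input and constant."""
    return abs(system(c * x) - c * system(x)) < tol


if __name__ == "__main__":
    for name, system in [("linear", linear_gain), ("saturating", saturating_gain)]:
        print(name, is_additive(system, 0.4, 0.3), is_homogeneous(system, 0.4, 3.0))
```

In this sketch the saturating system passes both checks for small inputs (for example 0.1 and 0.2) and fails them once the inputs drive it into saturation. This is consistent with the observation that a physical system displays non-linear properties if the input conditions are suitably wide.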
2.4 Resilience Engineering
The theory of resilience engineering is emerging as a response to the problems
posed to safety management and engineering by complexity theory and the age
of the organisational accident as described by Reason (1997). The central
theme is to move from a focus on failure, where notions of component reliability
are applied to complex systems, humans and organisations; to looking at how
systems can succeed under varying conditions. The literature on the subject is
somewhat fragmented, although a series of books has been published, which
bring together the key ideas. One of the aviation organisations embracing
resilience engineering is EUROCONTROL, a multinational air traffic
management service provider; Leonhardt et al (2009) published a white
paper on the application of resilience engineering within the organisation. This
illustrates that there is a blurred line between ‘traditional resilience’ study as
applied to infrastructure, and resilience engineering which has emerged from
the study of safety. Hollnagel et al (2011) give a simple definition of resilience:
“Resilience is the intrinsic ability of a system to adjust its functioning prior to,
during, or following changes and disturbances, so that it can sustain required
operations under both expected and unexpected conditions.”
Woods and Hollnagel (2007) set the scene for resilience engineering. They
outline fundamentals which include a shift away from the traditional safety focus
on ‘what went wrong’ (hindsight) and what could go wrong (risk assessment) to
a focus on ‘what can go right’ for risk assessment and ‘what did go right’ for
accident analysis – also neatly summarised by Schafer (2012). Resilience
engineering also rejects the notion of human failure, error taxonomies and
reliability analysis of complex systems in favour of a theory that failures
represent either the breakdown in strategies for coping with complexity, or an
unfavourable combination of functional variability within a system (technological,
human or organisational). In resilience engineering, safety is redefined as the
ability to succeed under varying conditions. By observing how systems work
under everyday pressures, it should be possible to understand the level of
resilience in a system and how it might be engineered to increase this quality.
For the purposes of both accident investigation and risk assessment it is
necessary to move away from linear combinations of events to an
understanding of how a system might lose its dynamic stability and veer into an
accident trajectory (Hollnagel et al., 2007). In summary, there are four key
precepts to Resilience Engineering:
1. Performance conditions are always underspecified. Individuals
and organisations must therefore adjust what they do to match current
demands and resources. Because resources and time are finite, such
adjustments will inevitably be approximate.
2. Some adverse events can be attributed to a breakdown or
malfunctioning of components and normal system functions, but others
cannot. The latter can best be understood as the result of unexpected
combinations of performance variability.
3. Safety management cannot be based exclusively on hindsight,
nor rely on error tabulation and the calculation of failure probabilities.
Safety management must be proactive as well as reactive.
4. Safety cannot be isolated from the core (business) process, or
vice versa. Safety is the prerequisite for productivity, and productivity is
the prerequisite for safety. Safety must therefore be achieved by
improvements rather than by constraints.
These precepts define a theoretical approach drawn from various ideas about
organisational accidents and safety culture. The key development is the focus
on the functions within the system and the emphasis on improving their
combined performance, rather than a focus on the potential sources of hazards
and barriers for accident prevention. This positive standpoint is a key attraction
to the approach; the drive for operational performance improvement and safety
can be in synergy rather than in conflict. Hollnagel (2011) gives four
cornerstones to the practice of resilience engineering. The first is knowing what
to do to respond to everyday disturbances – the actual. The second is knowing
how to monitor potential threats from the environment and from the functioning
of the system itself – the critical. The third is knowing what to
expect in terms of threats and opportunities in order to address the potential.
Finally, the fourth ‘cornerstone’ is that of the ability to address the factual
through learning.
A slightly different conceptual framework for Resilience Engineering is
presented by Madni (2009); offering more concrete requirements for
operationalising the practice:
[Figure: four linked abilities. Responding (Actual): knowing what to do. Monitoring (Critical): knowing what to look for. Anticipating (Potential): knowing what to expect. Learning (Factual): knowing what has happened.]
Figure 2-5 The Four Cornerstones of Resilience (Hollnagel, 2007)
Figure 2-6 Conceptual Framework for Resilience Engineering (Madni, 2009)
2.4.1 Resilience Engineering as a Successor to Safety Management
Leonhardt et al (2009) put the resilience engineering approach to safety
management simply:
The more likely it is that something goes right, the less likely it is that it goes
wrong.
Cambon (2006) provides a resilience framework for assessing safety
management systems, proposing a number of metrics based on Tripod
theory that essentially measure the performance conditions under which the
SMS operates. The balance of these performance conditions is said to
determine the stability of the SMS. ‘Engineering’ implies design and
Beauchamp (2006) notes how this can be achieved through organisational
learning to provide organisational resilience; a model for guidance is provided.
Zarboutis (2006) describes how, analogous to Rasmussen’s (1997) approach to
organisational drift, resilience engineering can identify symptoms of an erosion
in resilience. Johansson (2008) provides a ‘quick and dirty’ approach to
evaluating resilience in systems; this is a helpful overview but it does not prescribe
specific improvement or change activities. Stoker (2008) outlines a
comprehensive approach to the assessment of operational resilience,
effectively specifying a goal based hierarchy for elements contributing to
resilience; producing a check list approach. Whilst this is undoubtedly a
valuable activity, it is questionable whether it will be able to deal with the
emergence of safety issues.
2.4.2 Under Specification of Performance Conditions
Under-specification of performance conditions, that is, of the factors that affect the
execution of a particular function, is a key concept in the literature (Hollnagel,
2007). In most organisations performance conditions are subject to control
through rules, with the idea that this will improve safety. Hale (2013) reviews the
literature on this, noting that there are two approaches: a classical top-down
approach that punishes transgression, and a bottom-up approach that
sees expert ability to adapt to changing circumstances as paramount.
Nathanael (2006) notes that it is impossible to make what happens in practice
match that which is espoused by officialdom; the key to generating resilience is
dialogue between the hierarchical levels.
2.4.3 Performance Variability
Resilience engineering regards performance variability as inherently useful; it
allows operations to continue in underspecified conditions. It also provides the
potential for coupling between functions where upstream performance variability
combines with downstream performance variability to grow in amplitude. This
phenomenon can be harnessed for system success or else it provides an origin
for safety risk (Hollnagel, 2012).
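The way in which upstream and downstream variability can combine may be illustrated with a toy simulation. This is a sketch under stated assumptions only: FRAM itself is largely qualitative, and the linear coupling factor, the Gaussian variability and every parameter value below are invented for illustration rather than drawn from Hollnagel (2012).

```python
import random
import statistics


def downstream_spread(coupling, upstream_sd=1.0, own_sd=1.0, trials=20000, seed=1):
    """Estimate the spread of a downstream output in a toy two-function chain.

    coupling > 1 means the downstream function amplifies the variability it
    receives; coupling < 1 means it damps that variability. All parameters
    are illustrative assumptions, not values from the FRAM literature.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(trials):
        upstream = rng.gauss(0.0, upstream_sd)  # variability of the upstream output
        own = rng.gauss(0.0, own_sd)            # the downstream function's own variability
        outputs.append(coupling * upstream + own)
    return statistics.pstdev(outputs)


if __name__ == "__main__":
    for coupling in (0.5, 1.0, 1.5):
        print(coupling, round(downstream_spread(coupling), 2))
```

Under these assumptions the spread of the downstream output grows with the coupling factor, which gives a simple picture of variability being damped or amplified as it passes between functions.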
2.4.4 Examples of Resilience Engineering in Practice
Resilience engineering is more theoretical than its name suggests, and the
practicality of implementing its precepts remains a matter of debate.
However, its principles can be found in evidence where it was not
specifically applied. Table 2-3 provides a brief summary of some examples.
Table 2-3 Examples of Resilience Engineering in Practice
| Industry | Tools | Insights |
| --- | --- | --- |
| Process Industry | Survey of workforce using Principal Component Analysis | Shirali et al. (2013) attempt quantitative measurement of resilience at an organisational level. Only possible to measure the potential for resilience rather than resilience itself. The following variables are given as indicators: top management commitment; just culture; learning culture; awareness and opacity; preparedness; flexibility. |
| Process Industry | Bayesian Networks Resilience Dashboard (not currently achievable) | Pasman et al. (2013) define a holistic control methodology for plant safety using leading indicators derived from process measurements within the plant. Also use of process simulation tools to develop scenarios. Traditional HAZOP/FMEA analyses do not capture all potential accident scenarios. Key points: technical resilience can be measured/simulated, organisational factors less so; importance of leading indicators to enable response to variations; difficulty in dealing with drift in safety metrics; safety gains made through interdepartmental cooperation vs common cause failures; advocate extensive use of bow-ties. |
| Aviation | Interviews, audit and expert analysis | An investigation into both the sources of resilience and sources of brittleness. Comparison of two comparable small air carriers. Identification through extensive interviews. Resilience and brittleness categorised and risk assessed (Saurin and Carim Junior, 2012). |
| Air Traffic Management | FRAM | Analysis of a mid-air collision fatal accident. Provides notes on buffering capacity, flexibility, margins, tolerance and cross scale interactions. There was no root cause – aircraft and ATM was operating normally. The system was inadequate (de Carvalho, 2011). |
| Aviation | Bayesian Belief Networks (BBN) | Examines the use of and qualification of experts to provide probability estimates for BBN. Hidden common causes in BBN – principally safety culture. Difficulty in estimating frequencies or probabilities of rare events. BBN assume the ‘Causal Markov Condition’ therefore common cause failures are difficult to deal with – maybe applying BBN to FRAM would solve this issue (Brooker, 2011). |
| Aviation | FRAM | Alaska Airlines flight 261 accident analysed to understand FRAM’s performance against 5 key resilience characteristics: buffering capacity, flexibility, margin, tolerance, and cross-scale interactions (Woltjer, 2007). |
| Railways | FRAM | Interdisciplinary safety analysis of complex socio-technological systems based on the Functional Resonance Accident Model: an application to railway traffic supervision (Belmonte et al., 2011). |
| Nuclear | FRAM | Specific case study surrounding a task to move Nuclear Fuel – a specific task analysis rather than a generic system approach (Lundberg, 2008). |
2.4.5 Criticism of Resilience Engineering
Oxstrand and Sylvander (2010) argue that Resilience engineering is little more
than a rebranding of safety culture; they do not see how the practice can be
applied to the nuclear industry which already uses both PRA and human
reliability analyses in the licensing of nuclear plants. In this industry, it is argued,
safety culture forms part of every operation. The nuclear industry defines safety
culture as:
“Safety Culture is that assembly of characteristics and attitudes in organisations
and individuals which establishes that, as an overriding priority, nuclear plant
safety issues receive the attention warranted by their significance.”
International Atomic Energy Agency (Edwards et al., 2013)
Clearly safety culture is fundamental to engineering resilience into a socio-
technical system. The theory of safety culture does not in and of itself propose a
different conceptual framework for the origin of unsafe system performance.
Also some safety culture literature describes a requirement for safety to become
the overriding priority for an organisation (Edwards et al., 2013). Clearly this is
at odds with notions of efficiency-thoroughness trade-offs and the requirement
to increase the proportion of activities that ‘go right’ as a means for reducing the
number that ‘go wrong’. Whilst Resilience Engineering draws on much of the
theory around safety culture, it goes a lot further in proposing ways in which
organisations can be designed, analysed and modified in order to deliver
resilience. Le Coze (2013) describes a number of criticisms of Resilience
engineering, the foremost amongst these being scepticism over the need to
introduce a new vocabulary to safety science. He also notes that the social
concept of power is missing from the resilience literature, although it could be
argued that the exercise of social power could be modelled as a function or a
resource. He also notes that many dispute the notion that
resilience engineering presents anything new, arguing that it simply
connects a number of existing ideas, foremost of which is the High Reliability
Organisation concept. He does note that the proof of the concept will be in its
application to real systems – testing the worth of the ‘engineering’ aspect of the
theory. McDonald (2008) asserts that Resilience Engineering is attractive
because other models are weak. He notes that the theory needs to be further
unified and demonstrated in practical examples.
2.4.6 Resilience Engineering and Airworthiness
Current MAA (2011a) policy is based on the idea that airworthiness is made up
of four pillars: the safety management system, compliance with recognised
standards, competence (of people and organisations) and independent
assessment. All of these activities and qualities are likely to contribute to the
resilience of an airworthiness system. Wilson (2008) provides a system model
for resilience of an airworthiness system and presents a number of key ideas:
The requirement for ‘organisational mindfulness’ – a safety culture keen
to seek out areas of risk.
Balancing ALARP principles with ‘And Still Stay In Business’ which could
be thought of as an efficiency thoroughness trade off; as per Hollnagel
(2011).
Understand how the organisational boundaries contribute to safety;
dealing with outsourcing, partnering and regulation.
Translate strategies into management frameworks for managing
organisational risk – these can be represented by ‘framework diagrams’
that show the factors that impact on safety management systems.
This work was succeeded by a thesis by Wilson (2012) which produced a
framework called RISK2VALUE, an integrated management
framework and decision support tool kit that addresses both safety and value
management at an organisational level. A generic diagram shown at Figure 2-7
is provided to support decisions – the use of which is illustrated by means of an
extensive diagram mapping various relationships. The strength of this approach
is that it either provides a generic approach to an audit of airworthiness or would
guide the construction of a new system. Equally it provides an assessment of
socio-technical factors surrounding accidents. A criticism that could be levelled
at the tool is that the linkages between the elements are not explicitly defined
and it is therefore unclear how changes would influence the path that the
organisation took through the diagram.
Figure 2-7 Framework for managing the impact organisation, technology and human factors have on safety management systems (Wilson, 2008)
2.4.7 Lean Resilience
Leonhardt et al (2009) note that modern business systems are largely premised on
‘just-in-time’ processes. This methodology increases efficiency and
consequently coupling between upstream and downstream functions. Individual
system boundaries are more difficult to define as, for example, maintenance
units become increasingly tightly dependent on supply chains. Carney (2010)
urged caution in the introduction of lean principles and envisaged a hybrid
between lean maintenance and a more traditional model. Resilience
engineering in other domains has shown that it is in fact possible to harness the
approach to introduce production improvement alongside safety (Hounsgaard,
2013). Lean methodology is profoundly linear in its thinking (Carney, 2010); this
methodology is easily deployable in a highly tractable system such as a
production line. In less tractable systems such as maintenance it is likely that
Resilience Engineering techniques will produce better results.
2.5 Functional Resonance Analysis Method
The resilience engineering literature lacks specific methodologies or tools for
practical implementation of resilience engineering principles. The notable
exception is Hollnagel’s (2012) Functional Resonance Analysis Method
(FRAM). This is a technique for building models of complex socio-technological
systems. It differs from STAMP, in that it is a method for generating a model
rather than a model. FRAM maps the system as a series of functions, defined
by their various ‘aspects’ and linked ‘activities’.
[Figure: a FRAM function with six labelled aspects: Input (I), Output (O), Preconditions (P), Resources (R), Time (T) and Control (C)]
Figure 2-8 FRAM Function
By analysing the output variability from each function and the extent to which
this variability is damped up-stream, it is possible to begin to understand how to
analyse system performance from a resilience engineering point of view. The
FRAM forms the basis of the case study in later chapters and is described in
detail in Chapter 4.
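The idea of a function defined by its aspects can be sketched as a simple data structure. The following is an illustrative assumption rather than the spreadsheet model described in later chapters: the class, the helper and the three example functions are hypothetical, and only the six aspect names are taken from Figure 2-8.

```python
from dataclasses import dataclass, field


@dataclass
class FramFunction:
    """One FRAM function described by its six aspects (hypothetical sketch)."""
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    preconditions: list = field(default_factory=list)
    resources: list = field(default_factory=list)
    time: list = field(default_factory=list)
    control: list = field(default_factory=list)


# Aspects through which a function can receive another function's output.
INCOMING_ASPECTS = ("inputs", "preconditions", "resources", "time", "control")


def find_couplings(functions):
    """Return (upstream, label, downstream, aspect) for every output of one
    function that appears under an incoming aspect of another function."""
    couplings = []
    for upstream in functions:
        for label in upstream.outputs:
            for downstream in functions:
                if downstream is upstream:
                    continue
                for aspect in INCOMING_ASPECTS:
                    if label in getattr(downstream, aspect):
                        couplings.append((upstream.name, label, downstream.name, aspect))
    return couplings


# Invented example functions, for illustration only.
plan = FramFunction("Plan maintenance", outputs=["work order"])
issue = FramFunction("Issue tools", outputs=["tools"])
carry_out = FramFunction("Carry out maintenance", inputs=["work order"],
                         resources=["tools"], outputs=["serviceable aircraft"])

if __name__ == "__main__":
    for coupling in find_couplings([plan, carry_out, issue]):
        print(coupling)
```

Matching on shared labels is one simple way to expose potential couplings; variability of an upstream output can then be traced to the aspect of the downstream function that receives it.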
2.6 Quantifying Resilience
Most approaches to quantifying resilience rely on surveys and audit approaches
such as those described by Shirali (2013) or by Saurin (2012). However whilst
an overall system assessment is of value, system managers are interested in
particular risks and being able to quantify them and manage them towards
ALARP levels, as required by legislation. Within process industries a high
degree of automation can be achieved with intensive data collection and
monitoring. These aspects mean that it is comparatively easy to run simulations
and model different systems. Risks can therefore be assessed in a more
quantifiable manner (Pasman, 2013). A reliability approach to safety is easily
quantifiable through linear decomposition to produce probabilistic risk
assessment. By contrast it is much more difficult to provide quantitative
assessment using a resilience engineering approach. Luxhøj (2003) and
Williams (1996) present Bayesian Belief Networks as a potential solution to low
probability – high consequence risks. Slater (2013) has presented an approach
to nesting BBN within a FRAM model and hence providing a way of quantifying
risk analysis developed through FRAM. He presents this technique as an
alternative to HAZOPS for use in process and transport industry. Brooker
(2011) analyses BBN in the aviation domain, focussing specifically on the ability
of experts to provide accurate assessments of probability in the case of low
probability events. He notes the ‘Causal Markov Condition’, which is an
assumption in BBN that there is no common cause failure mode across the
network; issues such as ‘safety culture’ are therefore difficult to address. Other
potential techniques for quantification are the use of fuzzy logic or fuzzy set
theory with the use of Monte Carlo simulation (Shirali, 2013). An approach to
quantifying resilience in the context of civil infrastructure is presented by Vugrin
(2009), providing a menu of control engineering methodologies that may be
suitable. The issue of data collection in more human centric systems remains a
barrier to expansion of this method. Quantification is the key if Resilience
Engineering is going to gain ground against more traditional risk assessment
techniques.
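The difficulty that a hidden common cause poses for a Bayesian Belief Network can be shown with a short worked calculation. This is a sketch under invented assumptions: two failure events are taken to depend on a single unmodelled factor (labelled here as poor safety culture), and all probabilities and function names are hypothetical, not drawn from Brooker (2011).

```python
def marginal(p_culture, p_fail_given_poor, p_fail_given_good):
    """P(one failure), summing over the hidden common cause."""
    return p_culture * p_fail_given_poor + (1 - p_culture) * p_fail_given_good


def joint_with_common_cause(p_culture, p_fail_given_poor, p_fail_given_good):
    """P(both failures) when both events depend on the same hidden cause."""
    return (p_culture * p_fail_given_poor ** 2
            + (1 - p_culture) * p_fail_given_good ** 2)


def joint_assuming_independence(p_culture, p_fail_given_poor, p_fail_given_good):
    """P(both failures) if the common cause is left out of the network and
    the two events are treated as independent."""
    return marginal(p_culture, p_fail_given_poor, p_fail_given_good) ** 2


if __name__ == "__main__":
    # Invented values: 10% chance of poor culture; each event fails with
    # probability 0.2 under poor culture and 0.01 otherwise.
    args = (0.1, 0.2, 0.01)
    print(joint_with_common_cause(*args), joint_assuming_independence(*args))
```

With these invented values the joint failure probability is 0.00409 when the common cause is accounted for, against 0.000841 when the two events are treated as independent. Omitting the common cause therefore understates the risk by a factor of nearly five.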
2.7 Concluding Remarks
The various ages of safety theory were all products of the technology of their
time. Now in an age characterised by networked technology it is clearly time to
fully address notions of complexity for the purpose of providing safe systems.
This is certainly the case for the new generation of civil and military aircraft.
Resilience Engineering appears to offer a different approach to previous
theories and models. In particular the notion that accidents emerge from
unforeseen combinations of varying functional performance is a powerful one. It
offers the prospect that analysis from this perspective might provide risk insights
that may otherwise be missed. It also rings true from experience within an
airworthiness environment. Notions of ‘accident trajectories’ and holes in
processes or defences do not resonate in the same way. There is an
opportunity to combine efforts in process improvement and efficiency with
safety strategies. Resilience engineering offers the theoretical framework and
FRAM provides a potential method. This will be explored in subsequent
sections. It remains the case however that there is some way to go to
operationalize Resilience Engineering; Madni (2009) lists the key issues:
Help organizational decision makers in making trade-offs between
severe production pressures, required safety levels and acceptable risk.
Measure organizational resilience.
Identify ways to engineer the resilience of organizations.
The following chapters outline a case study in which this approach is tested.
3 METHODOLOGY
3.1 Introduction
In order to meet the research aim it was necessary to choose a technique with
which to model an airworthiness management system. The literature review
revealed that the Functional Resonance Analysis Method (FRAM) was the best
way to practically apply resilience engineering principles. The FRAM therefore
formed the basis of the practical element of the research. A single case study
organisation was used, with an aspiration of delivering an operationally useful
tool to the organisation at the end of the project. The case study was conducted
in two stages:
Stage 1 – Construct a FRAM Model of the Airworthiness Management
System and concurrently develop a visualisation tool.
Stage 2 – Test the model using scenarios drawn from occurrence
reporting and potential in-service airworthiness risks.
The model was developed iteratively, using expert opinion and data from a
variety of sources.
3.2 Working Arrangements
A key difficulty reported by other FRAM practitioners has been understanding
‘work as done’ rather than ‘work as imagined’. This was mitigated by conducting
the research from within the case study organisation on a part time basis, whilst
working within the Force Operations Centre. Moreover, this was preceded by 9
years’ work in other roles in military airworthiness, including quality assurance,
process improvement and error investigation roles. This provided insight into
‘work as done’ practice. Whilst there was a risk of bias, this was mitigated to
some extent through exposing parts of the model to other workers within the
organisation for verification.
3.3 Research Interviews
Semi-structured interviews were conducted with 19 different workers across all
of the functions. The interviews were flexibly arranged at the interviewees’ work
location (generally offices but control rooms and tool stores were also visited). A
pre-briefing was provided in the form of a two sided A4 document, shown at
Appendix C. The average interview duration was around 30 minutes, giving a
rough total of around nine and a half hours of interview time over the course of
the project. The general interview structure was as follows:
Check understanding and clarify scope of the study.
Confirm that participant was currently engaged in the function as part of
their daily activity.
Check accuracy of each of the function aspects.
Open questioning to highlight particular areas of variability in the
‘aspects’ of the function.
Open questions to ascertain whether any aspects had been missed.
Open questions to ascertain whether the participant’s work covered any
further relevant functions.
The following research interviews were conducted:
Deputy Continuing Airworthiness Manager
Engineering Authority – various team members.
Military Airworthiness Review Certificate team member.
Continuing Airworthiness Management Organisation Quality Manager.
Experienced Aircraft Technician at Inspector Level.
Tornado Forward Fleet Manager.
Front Line Squadron Senior and Junior Engineering Officers.
Front Line Squadron Rectification controller, Line Controller, Weapons,
Mechanical and Avionic Trade Managers, with additional contributions
from various mechanics, technicians, supervisors and inspectors.
Tool Stores Controller.
Rolls Royce Technical Support Manager.
BAES Technical Support Manager.
BAES Reliability Engineering Manager.
Depth Workshops Supervisor.
Ground Support Equipment Trade Manager.
Station Air Safety Officer.
3.4 Model Development
Most if not all risk assessment or incident investigation methods require
practitioners to be trained in the application of the technique and are generally
most effectively applied in teams (e.g. HAZOPS, Safety Panels, etc.). Time and
resources precluded this approach for the case study; however insight from a
number of other practitioners’ case studies was gained through attendance at
the annual FRAM Workshop in Munich. Whilst Hollnagel’s (2012) guidelines for
FRAM model development were followed, the final Tornado Airworthiness
System Model used a number of innovative approaches. The main innovation
was the use of a Microsoft Visio drawing to provide an interactive ‘visualisation’
tool. This approach allowed the creation of a much larger model than has been
recorded to date in the literature. The visualisation tool was developed
concurrently with the spreadsheet model, which allowed for greater accuracy by
cross-checking between the two methods of describing the model. The final
model contains a total of 69 individual functions with 985 individual aspects
described. Where inconsistencies in the model became apparent or there was a
gap in knowledge, a variety of experts were used to provide additional
information through conversation or correspondence. In particular, various key
meetings were attended which provided insights that assisted with model
development:
Force Operations Centre Daily Summary.
Joint Qualifications and Trials Meeting.
Level B Capability Programme Reviews.
Various Upgrade Readiness Reviews.
Fleet Planning Meetings.
Scheduled Maintenance Reviews.
Depth HQ Value Stream Analysis – Continuous Improvement Event.
Mission Essential Equipment Continuous Improvement Event.
Air Safety Occurrence Investigators Workshop.
It is not possible to attribute individual model elements to particular sources; the
iterative nature of FRAM development precludes this in an experimental project
of this nature. It is recognised that this is a weakness in the process; however,
this is mitigated by the volume of cross-checking required to ensure model
consistency. The visualisation tool provided the final check of model
consistency in that all aspects had to be connected to another function or to an
external resource – loose ends were not allowed.
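The ‘no loose ends’ rule lends itself to a mechanical check. The following Python sketch is illustrative only and is not the Visio or Excel implementation used in this study; the dictionary layout, the toy function entries and the `find_loose_ends` helper are all hypothetical. It assumes that every aspect entry must either match the output of some function in the model or be listed as an external resource.

```python
# Illustrative sketch only: the data layout and names below are assumptions,
# not the structure of the TASM spreadsheet or visualisation tool.

ASPECTS = ("input", "precondition", "resource", "control", "time")


def find_loose_ends(functions, external):
    """Return (function, aspect, entry) triples that link to nothing.

    functions: dict mapping function name -> dict with an "output" list and
               optional lists for each of the five incoming aspects.
    external:  set of entries accepted as external resources or factors.
    """
    # Every output produced anywhere in the model.
    produced = {out for f in functions.values() for out in f.get("output", [])}
    # Every entry consumed by some aspect of some function.
    consumed = {
        entry
        for f in functions.values()
        for aspect in ASPECTS
        for entry in f.get(aspect, [])
    }

    loose = []
    for name, f in functions.items():
        # Incoming aspects must be fed by an output or an external resource.
        for aspect in ASPECTS:
            for entry in f.get(aspect, []):
                if entry not in produced and entry not in external:
                    loose.append((name, aspect, entry))
        # Outputs must be used by some function or leave the system.
        for out in f.get("output", []):
            if out not in consumed and out not in external:
                loose.append((name, "output", out))
    return loose


# Toy model with one deliberate loose end ("Spare parts" has no source).
model = {
    "Report Fault": {"input": ["Aircrew debrief"], "output": ["Reported fault"]},
    "Fault Diagnosis": {
        "input": ["Reported fault"],
        "resource": ["Spare parts"],
        "output": ["Diagnosed fault"],
    },
}
external = {"Aircrew debrief", "Diagnosed fault"}
print(find_loose_ends(model, external))
```

In this toy model the only entry reported is the ‘Spare parts’ resource, which is neither produced by a function nor declared as external.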
3.5 Air Safety Information Management System Data
Data from the Air Safety Information Management System (ASIMS) was used to
provide information on the variability of output from various functions within the
model.
3.5.1 Data Extraction
Data was extracted from ASIMS using the ‘Search Reports’ facility (MAA,
2011a), which allows the user to apply various filters to the database. This allowed
only Tornado DASORs to be considered, within an initial selected date range of
1 Jan 2006 to 1 Nov 2013. This range allowed consideration of a time period in
which organisational structures have been relatively stable e.g. since the end-
to-end logistics transformation process (2001-2006), where a large number of
previously in-house functions were outsourced to industry. This date range was
then reduced to the most recent 12 month period 1 November 2012 to 1
November 2013, once the work required to analyse each entry became
apparent. ASIMS uses a standard taxonomy to describe both Occurrence
Cause Groups (OCG) and event descriptors. The MAA describes OCG as the
“final link in the chain which caused the occurrence… the one and only final
cause” and event descriptors are other ‘events in the chain’ (MAA, 2011a). This
clearly represents a linear accident model rather than the complex model
represented by FRAM. That said, both OCG and event descriptors do provide a
useful indication as to instances where undesirable functional variability
occurred. A second issue was that the FRAM Model was limited to
airworthiness management rather than ‘flight safety’ or ‘air safety’ in totality.
Operator actions that affected the airworthiness of the aircraft were included in
the ASIMS download because such incidents rely on the performance
of additional functions to maintain continuing airworthiness following harmful
variability in the ‘operate aircraft’ function. For example, an operating pilot might
inadvertently cause a flap over-speed; this then relies on fault reporting, and
corrective maintenance functions (amongst many others) to perform within
acceptable limits in order to restore airworthiness. Table 3-1 shows the cause
and event descriptors that were included in the ASIMS ‘search report’ filter and
consequently the downloaded data. Reports that were captured in the ASIMS
filters but that were found to have no airworthiness aspect were deleted. This
left a total of 426 reports for analysis.
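The filtering logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a description of the ASIMS ‘Search Reports’ facility: the record fields (`aircraft`, `date`, `cause_group`, `sub_category`), the `in_scope` helper and the sample records are all invented, and only an excerpt of the descriptor filter is reproduced.

```python
from datetime import date

# Excerpt of a descriptor filter in the spirit of Table 3-1.
# None stands for "All" sub-categories; an empty set stands for "Nil".
INCLUDED = {
    "Hostile Action": set(),
    "Technical Fault": None,
    "Human Factors (Maintenance)": None,
    "Human Factors (Other)": {"Access not Closed", "Equipment not Secured"},
}
START, END = date(2012, 11, 1), date(2013, 11, 1)


def in_scope(report):
    """Return True if a report passes the fleet, date and descriptor filters."""
    if report["aircraft"] != "Tornado":
        return False
    if not START <= report["date"] <= END:
        return False
    # Unknown cause groups are treated as excluded.
    allowed = INCLUDED.get(report["cause_group"], set())
    return allowed is None or report["sub_category"] in allowed


# Invented records for illustration; these are not ASIMS data.
reports = [
    {"aircraft": "Tornado", "date": date(2013, 3, 4),
     "cause_group": "Technical Fault", "sub_category": "Any"},
    {"aircraft": "Tornado", "date": date(2011, 6, 1),
     "cause_group": "Technical Fault", "sub_category": "Any"},
    {"aircraft": "Tornado", "date": date(2013, 5, 9),
     "cause_group": "Hostile Action", "sub_category": "Any"},
    {"aircraft": "Tornado", "date": date(2013, 7, 2),
     "cause_group": "Human Factors (Other)", "sub_category": "Access not Closed"},
]
selected = [r for r in reports if in_scope(r)]
print(len(selected))
```

The manual step of deleting reports with no airworthiness aspect is a matter of judgement and is deliberately not represented here.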
Table 3-1 D-ASOR Classifications included in Data
Cause and Event Descriptor | Sub-Categories Included in Download
Hostile Action | Nil
Human Factors (ATC/ABM) | Nil
Human Factors (Aircraft Operation) | Flap / Slat / Airbrake Overspeed, Fuel Management, Gear Overspeed, Inadvertent Operation, Incorrect In-flight Shutdown, Incorrect Switch / Control Selection / Position, Overcontrol, Overstress, Overtemp, Overtorque, Undercontrol, Access Not Closed, Equipment Not Secured, Incorrect Use of Emergency Equipment, Loose Article, Collision with Aircraft/Vehicle, Collision with Ground Object, Deep Landing, Downwash, Flap / Slat Overspeed, Gear Overspeed, Heavy Landing, Tail Strike, Blanks / Pins Not Removed, Missed on Walk Round, Wrong aircraft, Blanks / Pins Not Fitted, Chock Jump, Collision with Aircraft/Vehicle.
Human Factors (Maintenance) | All
Human Factors (Ground Services) | All
Human Factors (Other) | Material Dropped into Open System, Material left in Aircraft or Engine, Access not Closed, Equipment not Secured, Incorrect Use of Emergency Equipment.
Not Positively Determined | All
Organisational Fault | All
Technical Fault | All
Unsatisfactory Equipment | All
3.5.2 Assignment of Related Functions to Incidents
Once ASIMS data had been exported into an MS Excel format, each report was
assigned to up to 3 functions to indicate that the occurrence was a result of
output variability from each of these functions. The three functions were
assigned in a rough order of proximity to the reported occurrence. For example,
an instance of nose wheel steering failure would show the mechanical
system function as the first and ‘closest’ function to the occurrence because the
variation in this function’s output was what was being reported. However, it may
have been variability in the output of the electrical system function that caused
downstream variability in the mechanical system; in this case the electrical
system function would be recorded second. Only three functions were assigned
to enable expedient data processing; the output was designed to be a rough
indicator of reported functional variability so this simplification was deemed
acceptable. Assignment was a matter of judgement formed by reading the ‘Brief
Title’, ‘Description’, ‘Investigation and Rectification work’, ‘Other Equipment
Involved’, ‘Cause Narrative’ and ‘Cause Observations’ fields. As the taxonomy
and language used by report authors did not correspond directly to the FRAM
model, this had to be carried out manually, which precluded a full analysis of
each report due to the time required. In addition to assigning three FRAM
functions to each report, a further set of fields was added to the data to show
the number of couplings by type of function e.g. Human-Human, Technological-
Organisational, Human-Organisational, etc. It was recognised that these
couplings might not be ‘direct function to function’ couplings and that there may
be intermediary functions identified in the FRAM model. Results from this
process are presented as they are used in chapter four.
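The coupling fields described in this section can be illustrated with a short Python sketch. It is not the Excel workbook used in the study. It assumes, purely for illustration, that a coupling is counted between each pair of consecutive functions in the proximity ordering; the `coupling_counts` helper and the small type lookup are hypothetical, and coupling labels are written in alphabetical order so that, for example, Technological-Organisational and Organisational-Technological are counted together.

```python
from collections import Counter

# Hypothetical excerpt of a function-type lookup.
FUNCTION_TYPE = {
    "Mechanical Systems": "Technological",
    "Armament & Electrical Systems": "Technological",
    "Refuel/Defuel": "Human",
    "Supply Chain": "Organisational",
}


def coupling_counts(assigned):
    """Count couplings by function type for one occurrence report.

    assigned: up to three function names, ordered from the function closest
              to the occurrence to the one furthest upstream.
    Assumption: a coupling is taken between each pair of consecutive
    functions in that ordering.
    """
    if len(assigned) > 3:
        raise ValueError("at most three functions are assigned per report")
    counts = Counter()
    for near, far in zip(assigned, assigned[1:]):
        # Sort the two types so the label does not depend on direction.
        pair = sorted((FUNCTION_TYPE[near], FUNCTION_TYPE[far]))
        counts["-".join(pair)] += 1
    return counts


report = ["Mechanical Systems", "Armament & Electrical Systems", "Supply Chain"]
print(coupling_counts(report))
```

A report with a single assigned function yields no couplings, which matches the idea that the coupling fields describe relationships between functions and not the functions themselves.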
4 BUILDING THE TORNADO AIRWORTHINESS SYSTEM
MODEL USING THE FUNCTIONAL RESONANCE
ANALYSIS METHOD
Chapter two described the theoretical background to resilience engineering and
identified the Functional Resonance Analysis Method (FRAM) as the most
practical way to apply the principles. As described in Chapter one, the RAF’s
Tornado GR4 fast jet aircraft fleet was used as a case study. This
chapter describes how the Tornado Airworthiness System Model (TASM) was
constructed using the FRAM. The TASM was created within a Microsoft Excel
spreadsheet; Chapter 5 describes the accompanying Visualisation Tool, which
was developed concurrently with the spreadsheet model. A copy of the final
spreadsheet model is at Appendix A.
4.1 Basic Principles
A full description of the FRAM is given by Hollnagel (2012) and also on the
website www.functionalresonance.com (Hollnagel, 2014). Drawing on the
theoretical basis of resilience engineering already described, the basic
principles of FRAM are given as:
The Equivalence of Success and Failure. Things go right and wrong in
fundamentally the same way. Although outcomes may be different, the
underlying processes are not necessarily different.
Approximate Adjustments. The conditions under which work or activity is
conducted never entirely match those which are prescribed. Systems
normally adjust performance approximately to match existing conditions.
This approximation results in performance variability.
Emergence. Variability is not normally enough to cause an accident.
Variability may combine in unexpected ways leading to disproportionately
large, non-linear outcomes.
Functional Resonance. Occasionally functions reinforce each other
and cause unusually high output variability. This coupling effect is called
functional resonance, which may spread through the system. The
phenomenon is dynamic and is not attributable to a simple combination of
causal links.
4.2 Taxonomy
Modelling complex socio-technical systems requires clarity of definition for the
elements of the system. The taxonomy proposed by Slater (2013) is used in this
investigation:
Function – the means by which an outcome is achieved. Can be carried
out by mechanical or electrical technology or by humans or
organisations.
Aspect – those features that describe the operation of a function. These
are Input, Preconditions, Resources, Control, Time and Output.
Activity – The output from the whole system under consideration
requires linkages between the various functions via their aspects. These
linkages are activities.
Process – A process is a sequence of activities.
System – The collection of functions and their dependencies define the
system under consideration.
Input – That which the function processes or transforms or that which
starts the function.
Preconditions – Conditions that must exist before a function execution.
Resources – That which the function needs or consumes to produce the
output.
Control – How the function is monitored or controlled; plan, programme,
instructions etc.
Time – Temporal constraints affecting the function.
Output – That which is the result of the function; an entity or state
change, finishing time or duration.
Instantiation – A ‘time-sliced’ map of system activity showing the system
state at a particular time. This is likely to show one or more processes
underway.
Functional Resonance – the detectable signal that emerges from the
unintended interaction of the normal variabilities of many signals.
Figure 4-1 provides a visualisation of some aspects of the FRAM taxonomy. For
this study a convention has been adopted in that activities linking functions are
shown as dotted lines to illustrate the potential for their existence. The
illustration of a particular process or instantiation of a system state shows the
activities underway at the time as solid lines. Figure 4-1 also shows an
instantiation of an example process, which is shown by the purple lines. A
system will be complete when all functions are linked by potential activities or to
external dependencies or outputs.
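The taxonomy above can be expressed as a small data structure. The Python sketch below is an illustration only and does not reproduce the spreadsheet or visualisation tool; the class names, the `potential_activities` helper and the example link from ‘Rectification & Line Control Boards’ to ‘Fault Diagnosis’ are assumptions made for the example. Potential activities (the dotted lines) are derived by matching an upstream output to an entry in an aspect of a different function.

```python
from dataclasses import dataclass, field

INCOMING = ("input", "preconditions", "resources", "control", "time")


@dataclass
class Function:
    """A FRAM function described by its six aspects."""
    name: str
    input: list = field(default_factory=list)
    preconditions: list = field(default_factory=list)
    resources: list = field(default_factory=list)
    control: list = field(default_factory=list)
    time: list = field(default_factory=list)
    output: list = field(default_factory=list)


@dataclass(frozen=True)
class Activity:
    """A potential link from an upstream output to a downstream aspect."""
    upstream: str
    downstream: str
    aspect: str
    item: str


def potential_activities(functions):
    """Derive every potential activity by matching outputs to aspect entries."""
    links = []
    for up in functions:
        for item in up.output:
            for down in functions:
                if down is up:
                    continue
                for aspect in INCOMING:
                    if item in getattr(down, aspect):
                        links.append(Activity(up.name, down.name, aspect, item))
    return links


# Hypothetical two-function excerpt; the upstream output is assumed.
boards = Function("Rectification & Line Control Boards",
                  output=["Fault shown on rects control board"])
diagnosis = Function("Fault Diagnosis",
                     input=["Fault shown on rects control board"],
                     output=["Corrective Maintenance"])

links = potential_activities([boards, diagnosis])
# An instantiation is then the subset of activities under way at one time.
instantiation = [a for a in links if a.downstream == "Fault Diagnosis"]
print(links)
```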
Figure 4-1 FRAM Model Visualisation Demonstrating Taxonomy
[Figure: functions A to G, each drawn with its six aspects (I, P, R, C, T, O) and labelled as an upstream or downstream function, together with an external dependency; callouts identify an activity, a function, the aspects and a process.]
4.3 FRAM Step 0 – Recognise the Purpose of the FRAM
Analysis
The primary purpose of the analysis was, in line with the research objective, to
allow airworthiness risk assessment to be conducted. Hollnagel (2012) offers
the choice of conducting a FRAM assessment for either incident analysis or risk
assessment. The research objective is to provide a tool for airworthiness
management and given that resilience engineering is concerned with both
reactive and proactive management of risks, risk assessment is the primary
purpose of the model. This is likely to produce a more complete system than
that created to analyse particular incidents. In order for risk assessments to be
carried out the model must aid understanding of how the system operates and
the effect of any future disturbances. The scope of the analysis and the system
boundary was defined as follows:
The fundamental purpose of the system is to provide airworthy aircraft
for operations.
Only the functions that have the potential to affect the airworthiness of
the aircraft fleet were considered to be part of the system. For example,
whilst feeding and paying the technicians who maintain the aircraft is
important; these functions are considered as constant in output and
therefore not in scope.
Factors external to the system could be modelled as functions
themselves; Hollnagel (2012) describes these as ‘background
functions’. These external relationships are not described; activities
simply link to ‘External Factors’.
‘Management’ implies control of the system, therefore the system
boundary is set to encompass only those functions where the users of
the tools created through this application of the FRAM will have an
ability to exert control. Thus the regulatory system is considered an
external factor, noting that other studies have modelled this relationship
with FRAM (Herrera, 2010).
Environmental factors such as the weather were considered not to vary and
were therefore outside of the system boundary; these were mapped as
external factors linked to specific aspects of some functions.
The total British military fleet of Tornado aircraft was considered.
Functions performed by the Front Line Command, Defence Equipment &
Support and industry contractors were all considered to be part of the
system.
The System is concerned with the management of an in-service aircraft
fleet with an adequate safety case. There is no consideration of the
functions required to design or manufacture the base-line aircraft.
The aircraft itself is modelled as a series of functions which interact
with functions carried out by the aircrew and maintenance teams.
4.4 FRAM Step 1a – Identify and Describe the Initial Function
List.
The difficulty with modelling systems for the purpose of risk assessment rather
than accident analysis is that more imagination is required to identify functions.
In a complex sociotechnical management system there is no shortage of
candidate functions; the difficulty lies in avoidance of duplication and modelling
an acceptable level of detail. The iterative nature of FRAM means that
unnecessary or duplicate functions will be removed at later stages as the model
is refined. Experience of aircraft operations was initially used to list a number of
candidate functions. This was conducted by noting various functions carried out
by the various organisations involved with the Tornado Force. For example,
starting at a frontline squadron, the daily activity was envisaged by mental
walkthrough of activity, recording the various tasks required to handle and
maintain the aircraft. This ‘thought experiment’ then moved further back through
the organisation. Once an initial list was generated the various policy
documents detailed below were consulted as memory joggers to identify
additional functions.
Tornado Continuing Airworthiness Management Exposition (MOD., 2013)
No. 1 Group Air Safety Management Plan (Dudman, 2012)
RAF Marham Air Safety Management Plan
Tornado Equipment Safety Management Plan (Woodbridge, 2012)
It was important at this stage to guard against confusing a task with a function
(Hollnagel, 2012). The Tornado Continuing Airworthiness Management
Exposition (MOD., 2013) was particularly helpful in this context because, having
only recently been written, it was assumed to provide a reasonably close
match to ‘work as done’. Deviations between ‘work as imagined’ and ‘work as
done’ became clearer as the model grew. Once a reasonably complete first set
of function names was produced, these were recorded on a spreadsheet as a
number of FRAM frames, as shown in Table 4-1. For each function, the aspects
were examined and recorded. An initial draft of potential aspect descriptions
was created from prior knowledge of the airworthiness system. This first draft
was then cross checked against the policy documents for consistency, with any
conflicts noted for later checking. As described in later steps, this draft was
subject to complete revision based on interviews with subject matter experts
which either validated or changed the first iteration of both the list of functions
and the description of each function’s aspects. Each function was
assigned a serial number; the order of which is not significant.
Table 4-1 Example FRAM frame for Fault Diagnosis
52 | Name of Function | Fault Diagnosis
Aspect | Description of Aspect
Input | Fault shown on rects control board
Output | Corrective Maintenance
Precondition | Authorised maintenance personnel; Aircraft in correct fuel state
Resource | Tools and TME; Approved data (Maintenance Procedures); Authorised maintenance personnel; Information from aircrew (debrief); Any unauthorised aide-memoires; GSE; Spare parts
Control | Approved data (Fault Diagnosis)
Time | Maintenance Programme; Flying Programme
In most cases it was effective to start the analysis with the input condition,
which generally led to some work on defining what represented a precondition
as opposed to the input. In order to bound the scope of the function, it was then
useful to define the output and at this stage some effort was required to cross
check against other functions to ensure that the output formed some link with
another function. This inevitably provided ideas for further functions, which were
added to the list. Resources were then identified; some care was taken to
generate a consistent set of named resources across various functions. If future
FRAM analyses are carried out on similar airworthiness systems, it would be
efficient to develop an initial taxonomy of resources based on those listed in this
model. It should be noted that the FRAM requires that functions should be
described as verbs, given that by definition they must perform some action. The
aircraft is described as a series of functions in terms of its subsystems;
structure, propulsion and so on. For the sake of brevity these are simply given
nouns, using the same terminology and subsystem structure that is used to sub-
divide airworthiness management tasks within the engineering authority. Clearly
however, all of these technological subsystems do provide a function. For
example in the case of the aircraft structure this is to react the loads imposed by
aircraft operation. Similarly the function of the propulsion system is to provide
thrust, electrical power, reduce fuel and also to record engine health data.
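The care taken to keep resource names consistent across functions could be supported by a simple check of the kind sketched below in Python. This is a hedged illustration and was not part of the study: the `resource_variants` helper is hypothetical, and the ‘Corrective Maintenance’ resource list is invented to show two inconsistent spellings. Names are compared after lower-casing and collapsing whitespace only; genuine synonyms would still need human judgement.

```python
from collections import defaultdict


def resource_variants(frames):
    """Group resource names that differ only in case or spacing.

    frames: dict mapping function name -> list of resource names.
    Returns {normalised name: set of spellings} for names with more than
    one spelling across the model.
    """
    spellings = defaultdict(set)
    for resources in frames.values():
        for name in resources:
            # Lower-case and collapse runs of whitespace to a single space.
            key = " ".join(name.lower().split())
            spellings[key].add(name)
    return {key: found for key, found in spellings.items() if len(found) > 1}


frames = {
    "Fault Diagnosis": ["Tools and TME", "GSE", "Spare parts"],
    # Invented entries with deliberately inconsistent spellings.
    "Corrective Maintenance": ["Tools and  TME", "Spare Parts", "GSE"],
}
variants = resource_variants(frames)
for key, found in sorted(variants.items()):
    print(key, sorted(found))
```

The set of normalised keys that survive such a check would also be a natural starting point for the initial taxonomy of resources suggested above.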
4.5 FRAM Step 1b – Verify Functions with Experts
The FRAM is designed to be conducted with groups of experts trained in its
use; this was impractical due to resource constraints and the experimental
nature of the approach. In order to verify the accuracy of each function at least
one person who currently formed part of the function was identified and was
voluntarily co-opted into working through the function as described in Chapter 3
(with the exception of technological functions).
Figure 4-2 TASM Step 12 – Screen Capture Showing Applicable Spreadsheet
Areas
The final and complete list of functions is given in Table 4-2.
Table 4-2 Listing of TASM Functions
Number | Function
1 | Flight Servicing
2 | Scheduled Maintenance
3 | Ground Handling
4 | 3 Month Flying Programme
5 | Task Maintenance
6 | Record Work Done on ac
7 | Train Maintenance Personnel
8 | Provide Authorised Maintenance Personnel
9 | Occurrence Reporting, Investigation & Follow Up
10 | Supply Chain
11 | Fit/Remove Role Equipment/Weapons/Explosives/Ejection Seats
12 | Maintain Ground Support Equipment (GSE)
13 | Provide & Account for Tools and Test Equipment
14 | Refuel/Defuel
15 | Assure Quality
16 | Coordinate Maintenance Documentation
17 | Defer Faults
18 | Locally Manufacture Parts
19 | Engine Health Monitoring
20 | Ground Services (Cooling, Power, Dehumidification, Steps, Staging, Bungs, Blanks)
21 | Force and A4 Operations
22 | Maintenance Programme Development
23 | Modify Aircraft
24 | Apply Special Instructions (Technical)
25 | Report Fault
26 | Replacement of service life limited parts
27 | Airworthiness Review Certification
28 | Repair Spares - Industry
29 | Publish Aircrew Publications
30 | Publish Approved Data (Tech Manuals & Policy)
31 | Publish Special Instructions (Technical)
32 | Cost Benefit Analysis / Hazard Analysis / ALARP Decision
33 | Acquire Spare Parts
34 | Repair/Maintain Spares R2
35 | Monitor Reliability Data
36 | Publish Release To Service
37 | Independent Advice
38 | Store, Service, Repair Weapons and Role Equipment
39 | Demand Spare Parts
40 | Repair Aircraft
41 | Structural Inspections & Corrosion Control
42 | Fault Diagnosis
43 | Corrective Maintenance
44 | Technical Assistance Process
45 | Avionic Flight Systems
46 | Defensive Aids
47 | Avionic Communications
48 | Armament & Electrical Systems
49 | Mechanical Systems
50 | Aircraft Structure
51 | Propulsion
52 | Crew Escape Systems
53 | Weapons
54 | Operate Aircraft
55 | Pre-Flight Checks
56 | Produce Airworthy Survival Equipment
57 | Handover
58 | Supervise Maintenance
59 | Independent Inspection
60 | Plan Weekly-Daily Flying Programme
61 | Rectification & Line Control Boards
62 | Manage Maintenance Extensions
63 | Configuration Management (LITS)
64 | Operate Shift Pattern
65 | Software
66 | Engine Performance Monitoring
67 | Engine Fleet Monitoring
68 | Design Organisation
69 | Chief Air Engineer
4.6 Step 2 – Identification of Output Variability
The purpose of the second step was to identify and characterise the potential
variability of the output of each function. It was first necessary to classify the
type of function; broad inferences could then be drawn from the literature as to
the likely nature of the variability. This was compared to the data gathered from
interviews, ASIMS and general system experience. Output variability was
described in two generic dimensions; frequency and amplitude. Frequency
referred to how often the output of the function typically varied and the
amplitude was a measure of this variation in terms of deviation from a normal
level. Figure 4-3 provides a graphical representation of the notion of output
variability:
Figure 4-3 Visualising Functional Output Variability
4.7 Step 2a – Identify the Type of Function
Hollnagel (2012) identifies three classifications of function; Technological,
Human and Organisational. The difficulty of classifying each function varies.
Some, such as the function carried out by an aircraft system e.g. ‘Defensive
Aids’ were clearly technological. The attribution of either ‘human’ or
‘organisational’ characteristics to functions was largely down to the number of
people involved. Broadly defined functions such as ‘Supply Chain’ were clearly
organisational in nature. Others such as ‘Refuel/Defuel’ are carried out by only
one or two people and hence were classified as ‘human’. In other cases, such
as ‘Scheduled Maintenance’ this was less clear, as the function represented the
conglomeration of a number of human functions but also required organisation
with a hierarchical structure. As this sub-step only provided an initial pointer
towards identifying output variability, these distinctions were not critical. As
described in chapter three, ASIMS data was used to show reported functional
variability. Figure 4-4 shows the number of times that functional variability was
reported. The majority of reports related to variability in technological functions,
which was because technological functions are those whose output has the most
direct impact on flight safety. In general, ASIMS reports did not identify
organisational or human factors related causes for occurrences. This is either
because the incidents were purely related to the reliability of the technology or
because investigations did not probe deeply enough into the incidents to uncover
such causes. In addition, the majority of occurrences did not result in any harm, and
near misses due to human or organisational factors may not be
reported at the same rate as reliability issues.
Figure 4-4 Instances of Functional Output Variability Recorded in Occurrence
Reports 2012/13
Figure 4-5 Instances of Reported Functional Output Variability by Function Type
Figure 4-6 Total Instances of Functional Output Variability Recorded in
Occurrence Reports 2012/13
4.8 Step 2b – Identify Internal Sources of Output Variability
Using system experience, interview data and ASIMS data described above, the
sources of internal variability for each function were noted and then
characterised. Internal sources of output variability are those which are
produced from within the function due to its inherent nature. For example, technological
functions may suffer component failure due to wear-out, while human functions are
subject to a variety of psychological and physiological variations.
Table 4-3 Summary of Internal Variability (Hollnagel, 2012)
Type of function | Possible internal sources of performance variability | Likelihood of performance variability
Technological | Few, well known | Low
Human | Very many | High frequency, large amplitude
Organisational | Many, function specific or relating to ‘culture’ | Low frequency, large amplitude
At this point in the analysis, notes were also made relating to any internal
damping mechanisms, for later reference. Damping mechanisms might include
internal redundancy in the case of technological functions, for instance a fail-
safe structure might continue to react loads to the full specification despite the
failure of one load pathway. In the case of an organisation, overlapping
responsibilities might provide cross checking of activity and reduce output
variability.
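The generic expectations in Table 4-3 can serve as a starting pointer for each function once its type is known. The Python sketch below is illustrative only: the table values are transcribed from Table 4-3, the three classifications are those stated in the text, and the `starting_expectation` helper is a hypothetical name. It does not replace the function-specific notes drawn from interviews, ASIMS data and system experience.

```python
# Generic expectations transcribed from Table 4-3 (Hollnagel, 2012):
# function type -> (possible internal sources, likelihood of variability).
INTERNAL_VARIABILITY = {
    "Technological": ("Few, well known", "Low"),
    "Human": ("Very many", "High frequency, large amplitude"),
    "Organisational": (
        "Many, function specific or relating to 'culture'",
        "Low frequency, large amplitude",
    ),
}

# Classifications stated in the text for three example functions.
FUNCTION_TYPE = {
    "Defensive Aids": "Technological",
    "Refuel/Defuel": "Human",
    "Supply Chain": "Organisational",
}


def starting_expectation(function_name):
    """Return (sources, likelihood) as a first pointer for a named function."""
    function_type = FUNCTION_TYPE[function_name]  # KeyError if unclassified
    return INTERNAL_VARIABILITY[function_type]


for name in FUNCTION_TYPE:
    sources, likelihood = starting_expectation(name)
    print(f"{name}: {likelihood} ({sources})")
```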
4.9 Step 2c – Identify External Sources of Output Variability
External output variability can be traced to some external dependency or linked
function in a process. The function ‘Ground Handling’ requires a variety of
resources in order for it to work (mechanics, drivers, tow tractor, etc.) and if
these aspects of the function vary in some respect then the potential exists for
the output of the ground handling function to also vary. For example, if the
ground handling team contained a particularly inexperienced worker then the
output of the function may potentially vary. Of course, damping factors whether
internal or external might remove this potential function output variability.
Damping factors could include additional supervision or time to complete the
task. As well as external variability within the defined function’s aspects (input,
precondition, resources, control and time) there are system-wide external
factors to consider that might exert influence on some or all functions, leading to
output variability. Such factors cannot be easily mapped in the FRAM Model;
they include environmental factors such as weather, infrastructure such as
heating, lighting, office space and IT reliability and also more intangible factors
such as cultural dimensions (such as ‘Just’, ‘Safety’ or ‘Reporting’ cultures).
Where external system-wide factors were potentially significant these were
noted at this step. The same data sources used for internal variability were also
used to produce notes on the external sources of variability for each function.
Table 4-4 Summary of External Variability (Hollnagel, 2012)
Type of function | Possible external sources of performance variability | Likelihood of performance variability
Technological | Maintenance, misuse | Low
Human | Very many, social and organisational | High frequency, large amplitude
Organisational | Many, instrumental or ‘culture’ | Low frequency, large amplitude
Initial notes on internal and external sources of output variability were entered in
the FRAM Model as shown in Table 4-5:
Table 4-5 Example TASM Recording of Step 2a-c for Function 67 - Engine Fleet
Monitoring
Type of Function | Organisational. This contains a variety of technological and human judgement functions which combine to provide an overall organisational function.
Internal Variability | Internal variability is caused by human judgement elements of the function.
External Variability | There is a variety of commercial and operational production pressures that influence this function. The ability and expertise of front line squadrons also provides context to the advice given out from Propulsion Support Team /Rolls-Royce.
4.10 Step 2d – Most Likely Dimension of Output Variability
Steps 2b and c identified the sources of output variability; the next step
characterised the potential output variability in its most likely dimensions. The
principles of conservation of energy and mass dictate that output must be in
some form of mass or energy transfer. For many functions this also provides for
some form of information transfer in various media (verbal, electronic, visual,
etc.). In order to keep the model at a manageable size, not all functional outputs
are described in exhaustive detail. The level of detail is in itself an ‘efficiency
thoroughness trade-off’; the validity of the judgement will be iteratively assessed
and adjusted as the model is used. All outputs were linked to aspects of other
functions, apart from the aircraft functions themselves which interact with the
external environment. The self-contained nature of the system provided a
mechanism for checking the internal consistency of the model – all outputs must
link to another function or to the external environment. Hollnagel (2012)
provides two options for characterising output variability; either a ‘simple’
solution or an ‘elaborate’ solution. The simple solution provides characterisation
in terms of time or precision. Given the broad scope of this model and the
potential wide range of activity covered by a single functional output line, the
elaborate solution was used to characterise output variability. Hollnagel (2012)
identifies 8 manifestations of output variability which are further divided into four
subgroups.
Table 4-6 Elaborate Description of Output Variability (Hollnagel, 2012)
Manifestation of Variability | Description
Timing/Duration | Too early / too late / omission.
Force/Distance/Direction | Too weak / too strong / too short / not far enough / wrong direction / too long / too far / wrong type of movement.
Wrong Object | Wrong object or points to wrong object.
Sequence (of actions or information) | Omission, jumping, repetition, reversal, wrong part.
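The internal consistency check described above (every output must link to another function or to the external environment) can be sketched in a few lines. This is a minimal illustration only, not the spreadsheet's actual mechanism: the `Function` class, the `EXTERNAL` marker and the example link are assumptions introduced here.

```python
from dataclasses import dataclass, field

# Hypothetical marker for links that leave the model (e.g. aircraft functions
# interacting with the external environment).
EXTERNAL = "EXTERNAL ENVIRONMENT"


@dataclass
class Function:
    name: str
    outputs: list = field(default_factory=list)


def unlinked_outputs(functions, links):
    """Return (function, output) pairs with no downstream destination.

    `links` maps (function name, output name) to a list of destinations,
    each being another function's name or EXTERNAL.
    """
    known = {f.name for f in functions} | {EXTERNAL}
    dangling = []
    for f in functions:
        for out in f.outputs:
            targets = [t for t in links.get((f.name, out), []) if t in known]
            if not targets:
                dangling.append((f.name, out))
    return dangling


# Illustrative data: the link shown is an assumption, not taken from the model.
functions = [
    Function("Flight Servicing",
             ["Any faults recorded", "Flight Servicing Certificate Signed"]),
    Function("Report Faults & Husbandry"),
]
links = {("Flight Servicing", "Any faults recorded"): ["Report Faults & Husbandry"]}

for name, out in unlinked_outputs(functions, links):
    print(f"Output not linked: {name} -> {out}")
```

An empty result would indicate that the model is closed in the sense described; any pair returned points to an output that still needs a downstream aspect or an external destination.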
Hollnagel (2012) emphasises the difference between actual variability and
potential variability. The main purpose of this model is to allow risk assessment;
potential variability is therefore the important issue and the subject of the initial
assessment that forms the basis of the model. Hollnagel describes potential
variability as what ‘could possibly go right or wrong’. Given the broad scope of
this model, this has been further clarified to the most likely potential variability.
This means that there is a steady state starting point from which the model can
be iteratively manipulated. The FRAM spreadsheet uses ‘drop-down’ selections
to allow allocation of ‘most likely’ output variability. The term ‘most likely’ allows
for the fact that some outputs may potentially be able to produce a variety of
manifestations. In particular instantiations of the model, these may not
correspond to the exact activity that is occurring. It is important to emphasise
that the model classifies the most likely output variability, not the most likely
output. There may therefore be a more likely form of output, but it is the rarer,
more variable form that is captured in the model. For example, Table 4-7 shows
the characterisation of the output variability for flight servicing. One output of
this function is ‘replenishment of aircraft systems’ (with oils, greases and
gases). The most likely output variability is described as replenishment being
omitted or the wrong fluid being used. Of course, in the majority of cases (or
instantiations) of this function in operation, the correct fluid will be used in the
correct quantity, hence exhibiting no variability.
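One way to picture a single 'most likely output variability' entry is as a small record holding the dimension, a free-text description and the two ratings. The sketch below is an assumption about how such a record could be represented in code (the model itself is a spreadsheet with drop-down selections); the class and field names are hypothetical, while the example values are the flight servicing entries of Table 4-7.

```python
from dataclasses import dataclass
from enum import Enum


class Dimension(Enum):
    TIMING_DURATION = "Timing/Duration"
    FORCE_DISTANCE_DIRECTION = "Force/Distance/Direction"
    WRONG_OBJECT = "Wrong Object"
    SEQUENCE = "Sequence"


class Level(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


@dataclass(frozen=True)
class OutputVariability:
    output: str           # name of the functional output
    dimension: Dimension  # most likely dimension of output variability
    description: str      # free-text description of the variability
    frequency: Level      # how often the output varies
    amplitude: Level      # how far it varies when it does


flight_servicing = [
    OutputVariability("Any faults recorded", Dimension.SEQUENCE,
                      "Omission", Level.HIGH, Level.MEDIUM),
    OutputVariability("Husbandry jobs recorded in log", Dimension.SEQUENCE,
                      "Omission", Level.HIGH, Level.LOW),
    OutputVariability("Flight Servicing Certificate Signed", Dimension.WRONG_OBJECT,
                      "Sign up for wrong tail number or omit full information",
                      Level.MEDIUM, Level.HIGH),
]

# Outputs whose most likely variability has a high amplitude.
high_amplitude = [v.output for v in flight_servicing if v.amplitude is Level.HIGH]
print(high_amplitude)
```

Restricting the dimension and ratings to enumerations mirrors the drop-down selections: only the classifications defined for the model can be entered.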
Table 4-7 Characterising Output Variability – Flight Servicing
Outputs | Most Likely Dimension of Output Variability | Description of Most Likely Output Variability | Frequency of Output Performance Variability | Amplitude of Output Performance Variability
AC visually inspected (Avionics, Electrical, Structure, Mechanical, Crew Escape, Weapons, Propulsion) | Sequence | Omission | High | Medium
Any faults recorded | Sequence | Omission | High | Medium
Husbandry jobs recorded in log | Sequence | Omission | High | Low
Flight Servicing Certificate Signed | Wrong Object | Sign up for wrong tail number or omit full information | Medium | High
As shown in Table 4-8, the FRAM spreadsheet was developed to define output
performance variability in one of three gradations for both frequency of most
likely variability and its most likely amplitude. It is important to note that these
characterisations are related to the observed performance of the system, which
is not necessarily an accurate indicator of future performance. Frequency of
variability was defined as the rate of occurrence of a performance deviation
from some accepted level. To provide consistency across the model a set of
qualitative and quantitative descriptions were established:
Table 4-8 Classifications for Frequency of Output Variability
Frequency of Variability | Qualitative Description | Quantitative Description
High | A not unusual occurrence in a monthly period across the whole fleet. | Occurs 1 to 10^-2 times per event/flying hour/work hour
Medium | The output of this function infrequently varies from the standard or prescribed form. | Occurs 10^-2 to 10^-4 times per event/flying hour/work hour
Low | Very rarely varies from the standard or prescribed form of output | Occurs 10^-4 or less times per event/flying hour/work hour
In many cases variability is a designed-in or an inherent part of the system and
other functions serve the purpose of damping the effects of the performance
variability. For example, it is expected that the aircraft structure (a function) will
develop cracks; this is what happens to aero-structures in service. However,
anticipating this, the design organisation specifies a maintenance schedule to
check for cracks. There is then a process (sequence of activities linking
functions) to mitigate the effects of cracking before the structure function is
allowed to vary in its output to the extent that loads are not reacted and
structural integrity is lost. This quality of complex socio-technological systems
makes it difficult to dissect the amplitude of performance variability from its
potential effect on the performance of downstream functions. In the case of a
cracked structure, the designer’s (and regulator’s) intent was that the cracks
must be spotted before they propagate to a length where integrity will be lost. In
this respect the only variability of the structure function occurs once integrity is
lost. If left uninspected the structure would of course eventually fail. Because
the system is in a state of balance, both frequency and amplitude are assessed
at the current system state (start of any iteration). Given the disparate nature of
functional output, from lubricant top-ups to an operational plan, it is not possible
to give quantitative descriptions of amplitude. A three-tier qualitative system was
therefore employed, as set out in Table 4-9.
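The quantitative frequency bands of Table 4-8 can be read as a simple threshold rule on an observed rate. The function below is a hedged sketch of that reading, not part of the FRAM spreadsheet: the function name is hypothetical, and because the table's bands share their boundary values, the choice to place a rate of exactly 10^-2 in 'Medium' and exactly 10^-4 in 'Low' is an assumption.

```python
def frequency_band(deviations: int, exposure: float) -> str:
    """Classify an observed deviation rate into the bands of Table 4-8.

    `deviations` is the number of observed performance deviations and
    `exposure` is the number of events, flying hours or work hours over
    which they were observed. Boundary handling is an assumption.
    """
    if deviations < 0:
        raise ValueError("deviations must not be negative")
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    rate = deviations / exposure
    if rate > 1e-2:
        return "High"
    if rate > 1e-4:
        return "Medium"
    return "Low"


print(frequency_band(5, 100))       # rate 0.05
print(frequency_band(1, 1000))      # rate 0.001
print(frequency_band(1, 100000))    # rate 0.00001
```

As noted in the text, such a classification reflects observed performance only and is not necessarily an accurate indicator of future performance.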
Table 4-9 Classification of Amplitude of Performance Variability
Amplitude of Variability | Timing/Duration | Force/Distance/Direction | Wrong Object | Sequence
High | Complete critical omissions or too late/early to produce useful effect | Gross error in force/distance/direction of output - requires major restorative action to correct | Totally wrong object is output or pointed at | Sequence completely jumbled / large or critical sections missed / critically wrong part inserted
Medium | Less than critical omissions or effect is late/early enough to cause difficulty for downstream functions | Error in force/distance/direction requires some restorative action to correct | Nearby or similar object is pointed at or output | Some significant omission/skipping/reversals or additional parts
Low | Omissions or late/early output cause minor difficulty to upstream functions | Minor error of force/distance/direction - requires little if any correction | Minor difference between the object output/pointed at and the correct object | Minor omission/skipping/reversals or additional parts
[Figure: screen capture of the spreadsheet highlighting the Step 2 (Identify Output Variability) areas, sub-steps a to d.]
Figure 4-7 TASM Step 2 – Screen Capture Showing Applicable Spreadsheet
Areas
4.11 Step 3 – Aggregation of Variability
Once most likely output variability has been modelled, it is then necessary to
show where these outputs link to other functions and thereafter to aggregate the
effects of these varying upstream outputs on the downstream function. Figure 4-8
shows how the model uses cell-linking functionality to provide the input to
step three, from the details entered in step two.
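The cell-linking idea, in which step three draws its inputs from the step two entries of the upstream function, can be illustrated with a simple lookup. This is a sketch under stated assumptions: the dictionary layout, the key format and the function name are hypothetical, and the values shown are taken from the flight servicing rows of Table 4-10.

```python
# Step 2 results for upstream outputs, keyed by
# (upstream function number, upstream output name).
step2_results = {
    (10, "Part delivered"): {
        "dimension": "Timing/Duration", "frequency": "High", "amplitude": "Medium"},
    (13, "Tools and TME"): {
        "dimension": "Wrong Object", "frequency": "Medium", "amplitude": "Low"},
}

# Step 1 links for the downstream function (Flight Servicing):
# aspect description -> key of the upstream output that feeds it.
flight_servicing_links = {
    "Fuels & Lubricants (from supply chain)": (10, "Part delivered"),
    "Tools & Test Equipment": (13, "Tools and TME"),
}


def step3_inputs(links, results):
    """Look up the Step 2 characterisation behind each downstream aspect."""
    rows = {}
    for aspect, key in links.items():
        if key not in results:
            raise KeyError(f"no Step 2 entry for upstream output {key}")
        rows[aspect] = dict(results[key])  # copy so Step 3 cannot alter Step 2
    return rows


for aspect, row in step3_inputs(flight_servicing_links, step2_results).items():
    print(aspect, "<-", row)
```

Raising an error for a missing upstream entry plays the same role as a broken cell link in the spreadsheet: it shows that a downstream aspect refers to an output that has not yet been characterised.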
Figure 4-8 Tracing Output Downstream Dependencies (Screen Capture)
Hollnagel (2012) suggests a variety of possible effects on downstream functions
based on the simple solution to characterising variability. These are used as a
guide for potential effects on the more complicated set of variability descriptions
used in the model. The potential effects are expressed as free text and, where
relevant, highlight the most likely upstream outputs that will cause the
downstream function to vary. The effects of each upstream function's output
variability are considered independently. For functions with multiple outputs and
upstream aspects there will be a large number of potential combinations of
variability. It is therefore not possible to express the overall effect of upstream
variability other than in a particular instantiation of the model. However, as a
visual aid, each upstream aspect is rated according to the extent to which its most
likely potential output variability will affect the downstream function in question.
The possible effect on this (downstream) Function Output Variability Score is
given as either ‘increasing’, ‘no change’ or ‘decreasing’ in terms of upstream
output variability. This score provides an estimate of the likely downstream ability
to damp out upstream variability. To contrast with the frequency and amplitude
ratings of the upstream aspects, the possible effect on the downstream function is
shown as a shade of purple in Table 4-10:
Table 4-10 Aggregation of Variability for Flight Servicing
Name of Function: Flight Servicing
Columns 1 to 2 belong to Step 1 - Identify and Describe the Functions; columns 3 to 10 belong to Step 3 - Aggregation of Variability.
Aspect | Description of Aspect | Upstream Function Number | Upstream Function Name | Upstream Function Aspect | Most Likely Dimension of Upstream Output Variability | Description of Most Likely Upstream Output Variability | Frequency of Upstream Output Performance Variability | Amplitude of Upstream Output Performance Variability | Possible effect on this (downstream) Function | Possible effect on this (downstream) Function Output Variability (Damping)
Input | Line Controller indicates task on boards | 61 | Rectification & Line Control Boards | Maintenance Information for tasking | Sequence | Inappropriate/unworkable plan | Medium | Medium | Potential to sway ETTO and cause omissions and errors | INCREASE
Output | AC systems replenished (propulsion, mechanical) | | | | | | | | |
Output | AC visually inspected (Avionics, Electrical, Structure, Mechanical, Crew Escape, Weapons, Propulsion) | | | | | | | | |
Output | Any faults recorded | | | | | | | | |
Output | Husbandry jobs recorded in log | | | | | | | | |
Output | Flight Servicing Certificate Signed | | | | | | | | |
Precondition | Maintenance Activity complete | 16 | Coordinate Maintenance Documentation | F700 ready for flight servicing and ac captain (pre-flight checks) | Timing/Duration | F700 not available for crew walk | Medium | Medium | Unlikely to start flight servicing if pre-condition not in place. | NO CHANGE
Precondition | AC available at groundcrew location | 54 | Operate Aircraft | Return aircraft to groundcrew | Wrong Object | Parked in incorrect location | Low | High | Not possible to start flight servicing without access to aircraft | DECREASE
Resource | Fuels & Lubricants (from supply chain) | 10 | Supply Chain | Part Delivered to Corrective Maintenance, Scheduled Maintenance, Repair Aircraft, replacement or life limited parts, weapons & role equipment, tools & test equipment | Timing/Duration | Not delivered in time to meet requirement | High | Medium | Not possible to fully complete flight servicing without necessary consumables | INCREASE
Resource | Tools & Test Equipment | 13 | Provide & Account for Tools and Test Equipment | Tools and TME | Wrong Object | Incorrect tool | Medium | Low | Increases likelihood of using unsafe work-arounds. | INCREASE
Resource | Authorised manpower | 8 | Provide Authorised Maintenance Personnel Appropriately (to requirement) | Authorised Maintenance Personnel (record work done, fuel, scheduled maintenance, report faults, conduct quality tasks etc) | Sequence | Omission of a required authorised skill | High | Medium | Potential for unauthorised (and not competent) personnel to carry out servicing; however likely to be damped out by maintenance tasking function. | INCREASE
Control | Flight Servicing Schedule (approved data) | 30 | Publish Approved Data (Tech Manuals & Policy) | Flight Servicing Notes | Sequence | Omission | Medium | High | Misleading or inaccurate information causes variability. | INCREASE
Control | Supplementary Flight Servicing requirements (from defer faults) | 17 | Defer Faults | ADF or Lim entry to close job card hence allows co-ordination of maintenance documentation | Sequence | Element of ADF/Lim insufficiently defined or omitted | High | Medium | Additional tasking within the flight servicing increases human performance issues. | INCREASE
Time | Daily/Weekly Flying Programme | 60 | Plan Weekly-Daily Flying Programme | Flying Programme (Fuel, flight service, xx) | Sequence | Inappropriate/unworkable plan | Medium | Medium | Insufficient time likely to cause inappropriate ETTO. | INCREASE
The amplitude and frequency classifications for upstream output variability
were then combined with the Possible Effect rating to produce a
Rough Downstream Function Variability Score. This Rough Score was
calculated as follows:
Numerical Score | High | Medium | Low
(a) Frequency of Upstream Output Performance Variability | 3 | 2 | 1
(b) Amplitude of Upstream Output Performance Variability | 3 | 2 | 1
Numerical Score | INCREASE | NO CHANGE | DECREASE
(c) Possible effect on this (downstream) Function Output Variability (Damping) | 3 | 2 | 1
Figure 4-9 Rough Score Matrix
Equation 6 - Rough Downstream Function Variability Score
Rough Score = a × b × c
This gave a score on a scale of 0 to 27; with the numerical values in Figure 4-9
the attainable products range from 1 to 27:
[Figure: horizontal scale marked 0, 3, 6, 9, 12, 15, 18, 21, 24, 27. At the low end, the most likely upstream output does not affect downstream functional variability. At the high end, the most likely upstream output variability is highly likely to significantly affect the output variability of the downstream function. The scale represents an increasing effect on downstream function output variability, which may be manifested in either frequency or amplitude of downstream output variability.]
Figure 4-10 Rough Downstream Function Variability Score
This rough score is shown in the FRAM Model against each aspect of every
function, except for those aspects linked to external dependencies. These
external dependencies were assumed to be constant.
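As a worked illustration of Equation 6, the rough score can be computed directly from the three ratings. The sketch below uses the numerical values of the Rough Score Matrix and three flight servicing rows from Table 4-10; the function and variable names are assumptions made for the example and do not describe how the spreadsheet is implemented.

```python
# Numerical values from the Rough Score Matrix (Figure 4-9).
LEVEL_SCORE = {"High": 3, "Medium": 2, "Low": 1}
EFFECT_SCORE = {"INCREASE": 3, "NO CHANGE": 2, "DECREASE": 1}


def rough_score(frequency: str, amplitude: str, effect: str) -> int:
    """Rough Score = a x b x c for one upstream aspect."""
    a = LEVEL_SCORE[frequency]
    b = LEVEL_SCORE[amplitude]
    c = EFFECT_SCORE[effect]
    return a * b * c


# (upstream function, frequency, amplitude, possible effect) from Table 4-10.
rows = [
    ("Supply Chain", "High", "Medium", "INCREASE"),
    ("Coordinate Maintenance Documentation", "Medium", "Medium", "NO CHANGE"),
    ("Operate Aircraft", "Low", "High", "DECREASE"),
]

scores = {name: rough_score(f, a, e) for name, f, a, e in rows}
for name, score in scores.items():
    print(f"{name}: {score}")
```

For these rows the products are 3 × 2 × 3 = 18, 2 × 2 × 2 = 8 and 1 × 3 × 1 = 3, so the late delivery of consumables from the supply chain stands out as the aspect most likely to increase the output variability of flight servicing.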
Figure 4-11 TASM Step 3 – Screen Capture Showing Applicable Spreadsheet
Areas
4.12 Step 4 – Consequences of the Analysis
Traditional safety models focus on elimination of hazards, prevention of unsafe
conditions and protection from the consequences of unsafe conditions if they
occur. By contrast a FRAM Model may be used to prevent functional resonance
occurring in an activity linking two functions. Constructing the model has two
consequences: it shows how to monitor performance variability and how to
provide damping to prevent adverse performance variability.
4.12.1 Step 4a – Damping Factors
A number of damping factors to prevent adverse performance variability are
suggested in the final section of the model. These may be in the form of
additional functions, activities or changes to the performance of existing
functions or activities. Alternatively there may be some way in which additional
internal functional damping could be introduced.
4.12.2 Step 4b – Performance Indicators
The monitoring of safety performance indicators is a much-studied
technique. FRAM offers a clear conceptual starting point from which such
indicators may be developed. Dependent on the outcome of the analysis, safety
indicators may be conceived to monitor either overall performance of the
function or particular activities. This is potentially the key means by which the
model can serve a useful purpose. The potential performance indicators shown
in the model represent an initial suggestion for review by subject matter experts;
where possible existing data should be used to generate indicators without
additional work.
Figure 4-12 TASM Step 4 – Screen Capture Showing Applicable Spreadsheet
Areas
Table 4-11 Example of Step 4 - Flight Servicing
Name of Function: Flight Servicing
Aspect | Description of Aspect (Step 1 - Identify and Describe the Functions) | Damping Factors / Potential Performance Indicators (Step 4 - Consequences of the Analysis)
Input | Line Controller indicates task on boards | Pre flight checks, fault reporting, engineering management supervision
Output | AC systems replenished (propulsion, mechanical) | Related fault reporting
Output | AC visually inspected (Avionics, Electrical, Structure, Mechanical, Crew Escape, Weapons, Propulsion) | Aircrew pre-flight checks - feedback info. Husbandry checks and Airworthiness Review
Output | Any faults recorded | Comparison of flt servicing fault reporting across shifts/sqns etc
Output | Husbandry jobs recorded in log | Comparison of flt servicing husbandry reporting across shifts/sqns etc
Output | Flight Servicing Certificate Signed | Captured in ASIMS reports if found
Precondition | Maintenance Activity complete | N/A
Precondition | AC available at groundcrew location | N/A
Resource | Fuels & Lubricants (from supply chain) | Pre flight checks, fault reporting, engineering management supervision
Resource | Tools & Test Equipment | Approved data specifies tools + Pre flight checks, fault reporting, engineering management supervision
Resource | Authorised manpower | Pre flight checks, fault reporting, engineering management supervision
Control | Flight Servicing Schedule (approved data) | The F765 process allows for reporting unsatisfactory features in the approved data
Control | Supplementary Flight Servicing requirements (from defer faults) | Feedback to engineering management on any inconsistencies with supplementary flight servicing requirements
Time | Daily/Weekly Flying Programme | Feedback to engineering management as to likely feasibility of the plan. Line controller is experienced technician and is able to judge plan.
4.13 Summary of TASM Layout
The interconnected nature of the TASM Model means that it is challenging to
follow for those not involved in its creation. A summary example is given in
Figure 4-13, which shows a representation of the FRAM carried out on the
relationship between two functions: Function A, which is upstream of Function B.
In this case Function B relies upon Function A to provide a time signal to start or
stop the function (or some element of it). This time aspect relationship is
identified in step one, along with the activities linking the other aspects of
Function B to other upstream functions.
Figure 4-13 Example FRAM for 2 Functions, A and B
[Figure: hexagon diagram of downstream Function B, each hexagon labelled with the aspects O, C, P, I, T, R, linked to several upstream functions (including Function A and an external dependency) and to a further downstream function, annotated with the four spreadsheet steps as follows.
STEP 1 - Identify and Describe the Functions (Downstream Function B): Input - what causes the function to start? Output - what is produced? Precondition - what condition is required to allow the function to occur? Resource - what is used or consumed? Control - what defines the operation? Time - what sets the schedule?
STEP 2 - Identification of Output Variability (a, b, c, d): type of function (Human; why this type?); internal/endogenous variability (describe sources); external/exogenous variability (describe sources); outputs with most likely dimension of output variability (Sequence), description of most likely output variability (Omission), frequency of output performance variability (Low) and amplitude of output performance variability (Medium).
STEP 3 - Aggregation of Variability (Time aspect, taken from Function A Step 2): upstream function A, description of time signal; most likely dimension of upstream variability (Sequence); description of how Function A's output variability is manifested; frequency of upstream performance variability (Medium); amplitude of upstream performance variability (Medium); possible effect on this (downstream) function (how does variation in the time signal from Function A affect the output of Function B?); possible effect on this (downstream) function output variability (INCREASE).
STEP 4 - Consequences of the Analysis: damping factors (what may damp output variability? other functions? external factors? internal factors?); potential performance indicators (identify potential performance indicators for the output).]
Step 2a then identifies the function type (shown as human), followed by 2b
describing sources of output variability caused internally within the function.
Step 2c identifies external sources of variability, whether these are activities
linking Function B to other functions or other general ‘environmental’ factors.
Step 2d first repeats the identities of the output activities shown in
step one. The most likely form of output variability is selected from the
phenotypes discussed previously (shown here as ‘Sequence’). A more detailed
description of the most likely output variability is given (shown as ‘omissions’ in
the sequence). The frequency and amplitude of the most likely variability are
then given as broad qualitative ratings (shown as ‘low’ and
‘medium’ respectively). Step three first takes the results from Function A’s step
two with respect to the output providing Function B’s time aspect. It
then goes on to assess the effect of Function A’s output variability on the
performance of Function B. This leads to a categorisation of the potential
effect on Function B’s output variability (shown here as an ‘Increase’ in
variability). Step four first looks at what damping factors might reduce
Function B’s output variability and then at what potential performance
indicators might be available to measure the output variability from Function B.
Note that performance indicators will also be considered for
Function A’s output, and these will also be of use in monitoring the performance of
Function B. This process is repeated for all aspects of Function B and
then for all the other functions in the system. The process becomes iterative as
the linking activities are assessed from both sides. It is not possible to
exhaustively test the model to establish whether all of the activities between
functions have been captured. Completeness can only be assessed by building the
model simultaneously with the visualisation tool. This tool is described in the
following chapter.
5 TORNADO AIRWORTHINESS SYSTEM MODEL
VISUALISATION TOOL
5.1 Need for the Tool
Whilst the spreadsheet of the TASM is the actual model, its size and inherent
complexity make it difficult to interpret. There is a requirement for some other
form of representing the model. As shown in Figure 4-1, functions and their
aspects can be visualised as a series of hexagons linked by lines. This
technique formed the basis of a visualisation tool built using Microsoft Visio.
5.2 Microsoft Visio
If the model is to be of practical use for the RAF, then it must be able to be
interrogated using standard software available on Defence Information
Infrastructure (DII) computer systems. Software for viewing Visio drawings is
available as standard and the full version of Visio is available at an additional
cost to units. Furthermore it is also possible to incorporate interactive Visio
drawings into web-based Microsoft Sharepoint sites which are used for internal
communication, storage of documents and other tools. Visio was therefore
selected as the basis for the FRAM Visualisation Tool because of its ability to
host drawings that can be manipulated both during development and by end
users to aid interpretation. Key to this part of the project was the ability to
assign objects within the drawing to ‘layers’.
5.3 Building the Tool
The visualisation tool was developed concurrently with the spreadsheet model.
As the links within the spreadsheet are not easily viewable, the visualisation
provided a method of cross-checking the model for completeness. The
visualisation was developed in the following steps.
5.3.1 General Functional Areas
Because of the complexity of the model, it was important to minimise the
average distance between linked functional aspects so as to minimise the
number of drawing elements placed on top of each other. The first step was to
decide on groupings for functions. These were laid out as shown in Figure 5-1,
with the technological functions representing the physical aircraft system in the
top right-hand corner (blue), with the more human related functions involved in
line operations (peach) and maintenance (cream) to the left of the aircraft. The
green area hosts those functions involved in continued airworthiness
management and associated operations management. Type airworthiness
functions carried out by the DE&S Project Team and industry design
organisations are shown in the bottom left in mauve. The purple area in the top
left hosts functions in the supply and repair chain. Grey areas host functions
that are not described within the model; these are external resources or
regulations. It is important to note that these areas are a general aid to building
the visualisation and also an aid to interpretation. They are not definitive
features of the functions overlaid on them – many functions sit between the
areas and there is a limit to how they can be represented in a two dimensional
representation.
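The layout aim stated at the start of this section, minimising the average distance between linked functional aspects, can be expressed as a simple measure for comparing candidate layouts. The sketch below is illustrative only: the layout was produced by hand in Visio, and the function name, coordinates and links used here are hypothetical.

```python
from math import hypot


def mean_link_length(positions, links):
    """Average straight-line distance between the endpoints of each link.

    `positions` maps a function name to its (x, y) drawing coordinates and
    `links` is a list of (upstream, downstream) function name pairs.
    """
    if not links:
        return 0.0
    total = 0.0
    for a, b in links:
        (xa, ya), (xb, yb) = positions[a], positions[b]
        total += hypot(xa - xb, ya - yb)
    return total / len(links)


# Hypothetical coordinates (arbitrary drawing units) and links.
links = [("Flight Servicing", "Operate Aircraft"),
         ("Supply Chain", "Flight Servicing")]
layout_a = {"Flight Servicing": (0, 0), "Operate Aircraft": (3, 4),
            "Supply Chain": (0, 2)}
layout_b = {"Flight Servicing": (0, 0), "Operate Aircraft": (6, 8),
            "Supply Chain": (0, 2)}

print(mean_link_length(layout_a, links))
print(mean_link_length(layout_b, links))
```

A lower value suggests fewer long connectors crossing the drawing, which is the practical reason for grouping closely linked functions in the same area.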
[Figure: background layout of the visualisation showing the functional areas LINE OPERATIONS; AIRCRAFT SYSTEM; AIRCRAFT MAINTENANCE; CONTINUING AIRWORTHINESS & OPERATIONS SUPPORT; SUPPLY & REPAIR CHAIN; PROJECT TEAM/TYPE AIRWORTHINESS AUTHORITY & INDUSTRY SUPPORT; REGULATION; EXTERNAL RESOURCES AND FUNCTIONS. © Crown Copyright, 2013]
Figure 5-1 Visualisation Functional Groupings
5.3.2 Functions
The next step was to add the functions; these were shown in the same manner
as described in chapter two and in Figure 5-2:
[Figure: a function drawn as a hexagon with its six aspects at the corners: Time (T), Control (C), Output (O), Resources (R), Preconditions (P) and Input (I).]
Figure 5-2 A Function and Its Aspects
Additionally a colour code was developed to show as pink those functions with
the potential to directly affect air safety by means of their outputs. These are
primarily the technological functions that comprise the aircraft system and
associated airborne equipment. Functions which directly affect the condition of
aircraft systems are shown as yellow, with other functions shown in grey. These
functions were overlaid as a new layer on the background, along with a key.
The functions are drawn as part of a single ‘functions’ layer and also as part of
layers specific to each function. An illustration of the background, ‘functions’
and ‘callouts’ layers is shown at Figure 5-3.
[Figure: screen capture of the visualisation background (LINE OPERATIONS; AIRCRAFT SYSTEM; AIRCRAFT MAINTENANCE; CONTINUING AIRWORTHINESS & OPERATIONS SUPPORT; SUPPLY & REPAIR CHAIN; PROJECT TEAM/TYPE AIRWORTHINESS AUTHORITY & INDUSTRY SUPPORT; EXTERNAL RESOURCES AND FUNCTIONS) with each function overlaid as a hexagon labelled O, C, P, I, T, R. © Crown Copyright, 2014.
Callouts: Anywhere in system - Reporting / Just Culture + Occurrence or Perception of Risk somewhere in system; Anywhere in system - Quality Culture.
Functions shown (model number in brackets where legible): Maintenance Personnel; Locally Manufacture Parts; Publish SI(T)s; Engine Performance Monitoring; Repair Spares – Industry (28); Repair/Maintain Spares R2 (34); Demand & Return Spare Parts (39); Independent Inspection (59); Record Work done on Aircraft (6); Defer Faults (17); Tools & Test Equipment (13); Operate Shift Pattern (64); Co-ordinate Maintenance Documentation (16); Ground Handling (3); Fuel/Defuel (14); Avionic Communications (47); Avionic Flight Systems (45); Software (65); Mechanical Systems (49); Armament & Electrical Systems (48); Propulsion (51); Replacement of service life limited parts (26); Scheduled Maintenance (2); Task Maintenance (5); Flight Servicing (1); Pre-Flight Checks (55); Operate Aircraft (54); Crew Escape System (52); Aircraft Structure (50); Supply Chain (10); Acquire Spare Parts (33); Store & Maintain Weapons & RE (38); Structural Inspections (41); Repair Aircraft (40); Corrective Maintenance (43); Apply SI(T)s (24); Fault Diagnosis (42); Supervise Maintenance (58); Handover (57); Fit/Remove Role & Arm Equipment (11); Rectification and Line Control Boards (61); Ground Services (20); Report Faults & Husbandry (25); Weapons (53); Defensive Aids (46); Survival Equipment (56); Train Maintenance Personnel (7); Airworthiness Review Certification (27); Chief Air Engineer (69); Force & A4 Operations (21); Plan Weekly/Daily Flying Programme (60); Occurrence Reporting (9); Maintain GSE (12); Configuration Management (LITS) (63); Manage Maintenance Extensions (62); Technical Assistance Process (44); Modify Aircraft (23); Monitor Reliability Data (35); Maintenance Programme Development (22); Independent Technical Advice (37); Publish Approved Data (30); Release to Service (36); Engine Health Monitoring (19); 3 Month Flying Programme (4); Assure Quality (15); Engine Fleet Monitoring (67); Publish Aircrew Publications (29); Cost/Benefit and Hazard Analysis (32); Design Organisations (68).]
Figure 5-3 Screen Capture of Visualisation Tool with Functions Added
5.3.3 External Dependencies
The next step in the development was to add the external dependencies to the
functions. In order to differentiate these dependencies from the activities linking
functions, the dependencies are shown as straight lines. They
are drawn on a separate layer called ‘External’, and each dependency
is also assigned to the layer of each function to which it is linked. As Figure
5-5 shows, the visualisation already begins to resemble a complicated circuit
diagram; however, further development steps begin to make it clearer to
interpret.
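The effect of assigning each drawing object to more than one layer can be illustrated with a small model of layer membership. This sketch is written in plain Python and does not use any Visio interface; the `Shape` class is hypothetical, the link between ‘Weather’ and ‘Operate Aircraft’ is assumed for the example, and the rule that an object is displayed when at least one of its layers is switched on is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Shape:
    name: str
    layers: set = field(default_factory=set)


def visible_shapes(shapes, visible_layers):
    """Names of shapes that sit on at least one visible layer."""
    return [s.name for s in shapes if s.layers & visible_layers]


shapes = [
    # Each function sits on the shared 'functions' layer and on its own layer.
    Shape("Flight Servicing", {"functions", "Flight Servicing"}),
    Shape("Operate Aircraft", {"functions", "Operate Aircraft"}),
    # An external dependency sits on 'External' and on the layer of each
    # function it is linked to (link assumed for illustration).
    Shape("Weather", {"External", "Operate Aircraft"}),
]

# Show everything relevant to one function by switching on only its layer.
print(visible_shapes(shapes, {"Operate Aircraft"}))
# Show all functions without their external dependencies.
print(visible_shapes(shapes, {"functions"}))
```

Multiple membership is what allows an end user to isolate a single function together with its dependencies, or to view one class of object across the whole drawing.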
[Figure: screen capture of the visualisation with the same background areas and function hexagons as Figure 5-3, with the external dependencies added as straight lines. © Crown Copyright, 2014.
Callouts: Anywhere in system - Reporting / Just Culture + Occurrence or Perception of Risk somewhere in system; Anywhere in system - Quality Culture.
External dependencies and labels shown: Licensed Hangar/Parking Space; 71(IR) Sqn – Non Destructive Testing; Force Level 0 Plan; Crew Training Plan; Squadron Planning Staff; Squadron Management Tools; SQEP Engineering Management; ESLOPS (Aircraft State Database); Personal Notes; LITS Instructions; Codification Process; MJDI System; BAES Supply IT System; Storage; Transport; Supply Orders; JSP800/886; JSP 886 Pipeline Times; Weather; Bowser & Driver; Strategic Fleet Plan; Joint Business Agreement; ATTAC Contract (BAE Systems); Capability Development Programme; GR4mations IT Tool; MILITARY EFFECT; Workshop Infra & Tools; Local Finance; Testing; AP100E-15; RB199 Ground Support Station; Jetscan; Detuner / HP Bay; ROCET Contract (Rolls Royce); JAMES (IT system); Capability Requirements Management; Investment Appraisal & Business Case; Commercial Arrangements; Project Management; 5000 Series Regulatory Articles; F799 Instructions for Use – Maintenance Log; Airworthiness/Safety Delegation Holders; Trilogi System; 4000 Series Regulatory Articles; RESOLVE; CAMO Staff; Manual of Airworthiness Processes -01; Other Nations: Tornado Tech Warning/Special Technical Order; Project Commercial & Financial Advice; Tornado Equipment Safety Management Plan; Commodity Internal Business Agreement; Inventory Management Staff; Explosives Regulations; Supply Personnel; Integrated Engineering Database; EDSR (Drawings database); NETMA; PROQUIS; External Communications; Dynamic Environment; Aircraft Abandoned; Flight Authorisation Process; Qualified and Current Aircrew; AP100B-01 Handover Policy; Duty Auth; Squadron Golden Rules; Maintenance Personnel Assigned to Post; Phase 1 & 2 Training; Trainee Maintenance Personnel; Rigs; Air Safety Management Information System; External Occurrence Investigators; Air Safety Cell; Air Safety Management Plans; Quality Staff; External Audit; Quality System Plans & Regulation; Archived Data; Symptom Capture Tool; Reliability Database; 71(IR) Sqn – Repair Team; Handling Squadron; Business Procedure BS013; RA 1300; RTSA; ITEA Contract; Codification; LITS Servers; TAG Team; SEMA; DAOS; Baseline Design.]
Warton Manpower Drawing SetDevelopment Aircraft Flight Trials
Materials
Figure 5-4 Screen Capture of Visualisation Tool with External Dependencies Added
84
5.3.4 Functional Activities
The most time-consuming and difficult part of developing the visualisation tool
was the addition of functional activities. These had to be drawn to link all of the
more than 900 functional aspects described in the spreadsheet model. As far as
possible they were drawn as arcs around other functions; however, this was
not always possible without introducing unnecessary complication. A small
minority of activities could not be shown directly on the visualisation because
they would have made the diagram confusing.
Functions that also draw activity from quality, safety or reporting cultures have
these activities shown as cloud shapes, which avoids the need to link those
functions to every other function. The addition of approximately 900 activities
produces a diagram that itself becomes complicated enough to be described as
both complex and intractable. However, it is important to note that the activity
lines represent only the potential for these activities to occur and link the
functions. Any particular instantiation of the model, that is, a representation of
total activity at a specific moment, will not need to show every single activity
occurring.
Figure 5-6 shows the visualisation tool with all of the activities shown. This
illustrates that it is impossible to interpret the system with all activities shown at
once.
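The distinction between potential activities and an instantiation can be illustrated with a minimal sketch. The Python below is not part of the TASM spreadsheet or the Visio tool; the `Coupling` class, the `instantiate` helper and the three example couplings are assumptions made purely for illustration, although the function names are those used in the model.

```python
from dataclasses import dataclass

# The six FRAM aspects, as listed in the visualisation tool key.
ASPECTS = {"I": "Input", "O": "Output", "P": "Preconditions",
           "R": "Resources", "T": "Time", "C": "Control"}


@dataclass(frozen=True)
class Coupling:
    """A potential activity: the output of one function feeding an aspect of another."""
    upstream: str    # function whose Output is used
    downstream: str  # function that receives it
    aspect: str      # aspect of the downstream function: I, P, R, T or C


def instantiate(potential, active_functions):
    """Return the couplings whose upstream and downstream functions are both active."""
    active = set(active_functions)
    return [c for c in potential
            if c.upstream in active and c.downstream in active]


# Hypothetical couplings for illustration only; not taken from the TASM spreadsheet.
potential = [
    Coupling("Publish Approved Data", "Flight Servicing", "C"),
    Coupling("Flight Servicing", "Pre-Flight Checks", "P"),
    Coupling("Supply Chain", "Scheduled Maintenance", "R"),
]

instantiation = instantiate(potential, ["Publish Approved Data", "Flight Servicing"])
for c in instantiation:
    print(f"{c.upstream} -> {c.downstream} ({ASPECTS[c.aspect]})")
```

Under these assumptions only one of the three potential couplings appears in the instantiation, which mirrors the point that a time-slice shows far fewer activity lines than the full diagram.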
[Screen capture. Swimlanes: Line Operations; Aircraft System; Aircraft Maintenance; Continuing Airworthiness & Operations Support; Supply & Repair Chain; Project Team/Type Airworthiness Authority & Industry Support; External Resources and Functions.]
Figure 5-6 Screen Capture of Visualisation Tool with all Functional Activities Shown
5.4 Exploiting the Tool
The tool produces a complex and interesting visualisation of the entire
airworthiness management system. Exploiting the tool requires particular
processes (that is, series of linked functions) to be analysed to ascertain
whether functional resonance has occurred or is likely to occur. The benefit of
using Visio to develop the tool is the ability to decompose the diagram by
selecting its constituent layers. For example, Figure 5-7 shows the external
dependencies and activities linked to the aspects of the ‘Train Maintenance
Personnel’ function. This is easily achieved by selecting the ‘Train
Maintenance Personnel’ layer within Visio. Figures 5-8 and 5-9 show how the
layers may be manipulated both within Visio and the DII Visio viewing tool. This
functionality will be exploited in the examples given in the following chapters.
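The effect of selecting a single layer can be sketched outside Visio as a simple filter over a list of couplings. The example below is only an illustration of the idea: the `select_layer` helper is hypothetical, and the couplings and the aspects they attach to are assumed for the purpose of the example rather than copied from the TASM spreadsheet.

```python
from collections import defaultdict

# Each tuple: (upstream function or external dependency, downstream function, aspect).
# Hypothetical entries for illustration; the real couplings are held in the TASM spreadsheet.
COUPLINGS = [
    ("Phase 1 & 2 Training", "Train Maintenance Personnel", "P"),
    ("Trainee Maintenance Personnel", "Train Maintenance Personnel", "I"),
    ("Rigs", "Train Maintenance Personnel", "R"),
    ("Train Maintenance Personnel", "Supervise Maintenance", "R"),
    ("Supply Chain", "Scheduled Maintenance", "R"),
]


def select_layer(couplings, function):
    """Group the couplings that touch `function` by the aspect of that function they attach to."""
    layer = defaultdict(list)
    for upstream, downstream, aspect in couplings:
        if downstream == function:
            layer[aspect].append(upstream)
        elif upstream == function:
            # The function's own Output feeding another function.
            layer["O"].append(downstream)
    return dict(layer)


layer = select_layer(COUPLINGS, "Train Maintenance Personnel")
print(layer)
```

Couplings that do not touch the selected function, such as the assumed link between ‘Supply Chain’ and ‘Scheduled Maintenance’, are left out, much as deselected layers are hidden in the diagram.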
[Screen capture. Swimlanes: Line Operations; Aircraft System; Aircraft Maintenance; Continuing Airworthiness & Operations Support; Supply & Repair Chain; Project Team/Type Airworthiness Authority & Industry Support; Regulation; External Resources and Functions.]
Figure 5-7 Activities and Dependencies Linked to Aspects of the ‘Train Maintenance Personnel’ Function
Figure 5-8 Selecting Layers within Visio – Screen Capture
Figure 5-9 DII Visio Viewer – Screen Capture
5.5 Summary
A visualisation tool to complement the TASM has been created using MS Visio.
The tool allows linked functions to be highlighted using the layers feature of the
Visio application or the Visio Viewer tool within the MOD’s standard IT system.
An overview of the tool is given in the key included within it and reproduced at
Figure 5-10. The tool allows processes to be investigated for the purpose of risk
assessment or incident investigation. When used with the TASM spreadsheet,
experienced engineers or safety managers will be able to assist in engineering
resilience into the Tornado airworthiness system by adjusting controls on
existing processes so as to prevent harmful functional resonance occurring.
Such system adjustments will need to be based on assessment of the risks
posed by particular hazards, which may only become apparent through
investigation of incidents using the tool. Examples of incident investigation and
risk assessment are given in chapters 6 and 7. System adjustments themselves
may take any form that alters the way in which particular functions perform. For
example, if a reliability problem arose with a particular technical subsystem,
resources might be increased, for instance by providing additional funding to
procure more spares to feed scheduled maintenance. Control of the
maintenance function would need to change by changing the output of the
‘Publish Approved Data’ function. Whilst all of these things could have been done
without the use of the tools, it is hoped that FRAM will provide insights into
‘whole system’ operation and emergent behaviour that would otherwise be
difficult to achieve.
TORNADO GR4 AIRWORTHINESS SYSTEM MODEL – Visualisation Tool
Using the Functional Resonance Analysis Method (FRAM) to model Complex Organisational, Human Factors and Technological Functions
Purpose of the tool: To allow visualisation of specific instantiations* of the Tornado GR4 airworthiness system to enable air safety/airworthiness occurrence investigation, airworthiness risk assessment and system improvement activity.
A function (identified by a serial number) has 6 types of aspects, which define its interaction with the system: Input (I), Output (O), Preconditions (P), Resources (R), Time (T) and Control (C).
Functions have potential couplings between their aspects. These couplings only exist for finite periods of time and represent activities.
A process can be shown by an instantiation* of the model, showing a series of coupled functions forming a process.
*An instantiation is a ‘time-slice’ of the system showing a specific series of activities.
The complexity of the system makes the model intractable if all processes are considered together. Using the interactive layers, various processes can be visualised and cross-referred to the spreadsheet model.
The FRAM Spreadsheet Model contains information regarding likely variability of functional outputs; if inadequately controlled this variability may lead to hazards and accidents.
Function symbols: function with potential to produce direct air safety hazards through its output; function which directly affects condition of aircraft & equipment.
Instructions for Highlighting a Process: Click the ‘Layers’ button above. Select a tick against each function identified as a part of the process. Select a new colour for each function that has been ticked; this must be the same colour for each function. If you wish to also highlight the external processes involved, both the ‘0 - External’ and ‘0 – External Resources’ layers must be selected and given the same colour as the functions.
Tracking the Process Further into the System: Simply keep selecting and colouring functions. You can highlight all potential activities by selecting the layer ‘0 – BLUE’. The background can be selected or deselected using the ‘0 - Background’ layer.
Printing an Instantiation: The Internet Explorer print function produces a poor quality image. Instead, press Ctrl + Prt Scr on your keyboard, paste into a Word document and then use the crop tool.
Figure 5-10 Visualisation Tool Key
6 USING THE TORNADO AIRWORTHINESS SYSTEM
MODEL FOR INCIDENT ANALYSIS
Chapter four described Step zero in the FRAM used to build the Tornado
Airworthiness System Model (TASM); this specified that the main purpose of the
TASM was for risk assessment. However, using the visualisation tool allows
ready decomposition of the model into parts pertinent to particular incidents.
Particular processes can be highlighted, with other functions being left as
background functions on the assumption that their variability was not significant
in controlling the processes involved in the incident.
6.1 Case for Using FRAM for Incident Modelling
Chapter one discusses commonly applied accident models, whether they are
technological, human or organisational. The military air safety management
system uses an Occurrence Investigation process manned by local personnel to
understand any occurrences that had the potential to pose an unacceptable air
safety risk. For accidents or serious occurrences the MAA will convene Service
Inquiries to investigate using experts from the military air accident investigation
branch. Similar arrangements exist within civilian operators and regulators. The
purpose of applying FRAM to incident analysis is to provide a resilience
engineering perspective on how incidents occurred and to
provide recommendations that are more likely than those of traditional methods to
prevent recurrence of similar or unrelated incidents. By understanding how
functional performance variability combined to produce an adverse outcome it
should be possible to understand how performance conditions might be shaped
or controlled to produce more desirable outcomes in the future. In order to
explore this hypothesis, two particular incidents that have occurred within the
RAF Tornado Force were selected and analysed. As the following analyses rely
only on data from existing occurrence reports, no new findings will be
highlighted – this chapter just demonstrates how incidents can be described
using the TASM.
6.2 Incident One – Thrust Reverser Incidents
Tornado employs a thrust reverse system to provide braking on landing in order
to slow the aircraft to safe taxying speeds. In the event that the thrust reversers fail to
operate, wheel braking may be used, although this increases the likelihood
of fire hazards from hot brakes, both to the aircraft and to ground crews.
Significantly, thrust reverse is also required in the event of a high-speed
abort during take-off. Thrust reversers deploy as ‘clam-shell’ buckets directly
into the jet efflux, aft of the final nozzle in the RB199 engine exhaust system.
Figure 6-1 Tornado GR4 with Thrust Reversers Deployed (Cooke, 2004)
6.2.1 Description of Incidents
Tornado has experienced a recent history of thrust reverser incidents, some of
which are summarised here, using data taken from ASIMS:
Table 6-1 Thrust Reverser Air Safety Occurrence Reports 2012/13
| # | Report ID | Date of Occurrence | Brief Title |
|---|---|---|---|
| 1 | asor\Marham - RAF\2(AC) Sqn\Tornado\13\8805 | 20/09/2013 | Thrust Reverse Failure |
| 2 | asor\OOA Kandahar\TorDet (LOS) - KAF\Tornado\13\8559 | 13/09/2013 | Thrust Reverse Failure on Landing |
| 3 | asor\Marham - RAF\2(AC) Sqn\Tornado\13\6791 | 25/07/2013 | Thrust reverse failure on landing. |
| 4 | asor\Marham - RAF\9 Sqn\Tornado\13\4415 | 21/05/2013 | Thrust Reverse Fault on Taxi |
| 5 | asor\OOA Kandahar\TorDet (MRM) - KAF\Tornado\13\3056 | 16/04/2013 | TR Failure on Landing |
| 6 | asor\Lossiemouth - RAF\XV(R) Sqn\Tornado\13\2506 | 25/03/2013 | Thrust reverse failure on landing |
| 7 | asor\Lossiemouth - RAF\XV(R) Sqn\Tornado\13\2299 | 18/03/2013 | Thrust Reverse failing to stow correctly |
| 8 | asor\Marham - RAF\2(AC) Sqn\Tornado\13\1778 | 01/03/2013 | Thrust Reverse Failure on Landing |
| 9 | asor\Marham - RAF\2(AC) Sqn\Tornado\13\1440 | 19/02/2013 | Thrust-Reverse Failure |
| 10 | asor\OOA Kandahar\TorDet (MRM) - KAF\Tornado\13\1276 | 13/02/2013 | TR bucket failed to stow |
| 11 | asor\OOA Kandahar\TorDet (MRM) - KAF\Tornado\12\21294 | 11/12/2012 | Thrust reverse failure on landing |
The recent history shown in Table 6-1 has been preceded by a number of
detailed investigations into earlier incidents shown in Table 6-2:
Table 6-2 Thrust Reverse Occurrences with Detailed Investigation
| # | Report ID | Date of Occurrence | Brief Title |
|---|---|---|---|
| 12 | asor\Marham – RAF\31 Sqn\Tornado\12\18524 | 18/09/2012 | Lift Dump and thrust reverser failure on landing |
| 13 | asor\Marham – RAF\9 Sqn\Tornado\12\17514 | 16/08/2012 | Thrust Reverse Failure on Landing |
| 14 | asor\OOA Kandahar\TorDet (LOS) – KAF\Tornado\10\133434 | 20/08/2010 | Thrust Reverse Failure and Brake Fire |
| 15 | asor\Lossiemouth – RAF\14 Sqn\Tornado\10\133083 | 04/08/2010 | Thrust Reverse Failure on Landing |
In report number 12 the cause of the system failure was not positively
determined, although a throttle box was changed as there was speculation that
this may have caused a wiring fault. In reports 12-14 circuit breakers (CBs)
had been pulled some time prior to the flight; consequently, when the pilot
selected reverse thrust, the system did not operate.
6.2.2 Summary of the Investigations
The ASIMS record contained various detailed investigations into occurrences
12-14; these all centred on the reasons for the circuit breakers remaining pulled
after the aircraft was released for flight. The investigations worked within a
frame of reference that relied on a hazards and barriers model of the situation.
In each case a number of missed opportunities were identified where the error
could have been spotted. Three particular circuit breakers (CBs) were the focus
of the investigations and, in different combinations, were responsible for the
failure of the thrust reversers to deploy. These circuit breakers had been
legitimately pulled to inhibit the thrust reverse system as a result of a
requirement to conduct engine ground runs prior to the flights on which each
occurrence happened. The pulled circuit breakers prevented the activation of relays
which would put the electrical system into the on-the-ground rather than
airborne state once the aircraft had landed. This relay system is in and of itself a
safety barrier to the operation of thrust reversers whilst airborne.
The engine ground runs were conducted because maintenance of the Environmental
Conditioning System (ECS) had been carried out. ECS has suffered numerous
reliability issues since the mid-life upgrade of the fleet to GR4 standard and had
been subject to considerable work by the Design Organisation (DO) and the
Engineering Authority (EA) since that upgrade programme. This resulted in the
issue of a complicated Routine Technical Instruction (RTI), a category of
Special Instruction (Technical) instigated by the EA, that required the
ECS to be tested and adjusted during engine ground runs. This RTI was not
required in all cases; where it was not, engine ground runs were still required
for other reasons, as a result of mandatory maintenance procedures called up
during fault rectification. The difficulty in carrying out the RTI was raised as a
factor in some of the investigations.
The approved maintenance data required yellow clip-on safety tags to be fitted
to any CB that had been pulled, so that its inoperative state was highly visible.
It had become normal practice for this not to happen during maintenance on front line
squadrons, the reason being that in many cases CBs are set and reset multiple
times during fault diagnosis. However, in some instances CBs were in fact
pulled as a safety measure during engine ground runs, where continual setting
and resetting was not a factor. Nonetheless, the practice of failing to safety-tag
pulled CBs had become an organisational norm.
There was clearly a need to reset CBs so that the system would perform as
demanded in the air. However, maintenance procedures often mandated a
complicated series of CB setting and resetting. It was highlighted that the layout
of these procedures in the written form was difficult to follow. In some cases
there appears to have been some confusion surrounding which technician had
performed the final check of the CB panel; however, in all cases this was
recorded as having been done. A final opportunity to spot the pulled CBs was
missed during the servicing of each aircraft before flight. This action was
specifically mandated by the flight servicing notes and further amplified through
the addition of a supplementary flight servicing requirement in the aircraft
logbook. The flight servicing task was self-supervised.
Shortly after occurrence 15 a local technical instruction was
enacted; the instruction mandated another independent check of CBs as an
additional barrier to failure. The instruction did not include the CB that prevented
the thrust reversers deploying in occurrence 14. In case 14, the approved data
was not rigorously followed as the technicians realised that this would require
functional tests to be duplicated. An unforeseen consequence of this approach
was the removal of an opportunity to check the condition of the CB, although, to
add to the confusion, the maintenance procedure that was not followed
contained an error in that it required the CB to be pulled and not reset.
Checking the CBs visually was not easy due to their confined location on the
roof of the nose undercarriage bay; a technique often employed was to sweep a
hand over the panel to feel any raised CBs. In case 12 there was mention of
distraction caused by work being carried out by other tradesmen on systems on
the same aircraft, as well as the implications of the shift system being
employed.
The investigations made some recommendations towards
considering a post-taxi check of the thrust reverser system. This was rejected
on the basis of independent technical advice from QinetiQ following a ground
trial.
6.2.3 Instantiation of the FRAM Model
The FRAM model described in Chapter 4 and shown in detail at Appendix A
seeks to describe all potential and actual activities in the system linking the
various functions. In order to understand the incidents in question this model
must be ‘time-sliced’ to produce an instantiation of activity during the function. It
is not, however, quite as simple as describing the activities underway at a single
moment with respect to a single aircraft. The time-slice can be considered to be
moving as the activities permeate downstream through the model. For example,
the issue of approved data will have happened well before any downstream
coupling occurs between that function and any human function which requires it
as a control. The first step in describing an instantiation of the model with
reference to this series of incidents is to list the functions and make a note of
which have been referenced (directly or by implication) in the investigation
reports as contributing in some way to the incident:
Table 6-3 Thrust Reverser FRAM Instantiation
| Number | Type | Function | Variability noted in DASOR |
|---|---|---|---|
| 1 | Human | Flight Servicing | CB check missed or ineffective |
| 5 | Human | Task Maintenance | Discontinuity in tasking shifts |
| 6 | Human | Record Work Done on Aircraft | Current CB configuration not correctly recorded; not recorded as work progressed |
| 11 | Human | Fit/Remove Role Equipment/Weapons/Explosives/Ejection Seats | Unrelated ‘Litening’ Pod fit task caused distraction |
| 15 | Organisational | Assure Quality | CB clip normalised practice not dealt with |
| 23 | Organisational | Modify Aircraft | GR4 modification caused ECS reliability issues |
| 24 | Organisational | Apply Special Instructions (Technical) | Application of complicated RTI required CBs to be disturbed - not completed correctly |
| 30 | Organisational | Publish Approved Data (Tech Manuals & Policy) | Maintenance procedures require CBs to be disturbed |
| 31 | Organisational | Publish Special Instructions (Technical) | EA mandated repeated disturbance of CBs |
| 32 | Human | Cost Benefit Analysis / Hazard Analysis / ALARP Decision | EA decision not to mandate thrust reverser system tests during taxy because of FOD risk |
| 35 | Human | Monitor Reliability Data | RTI was means of achieving required reliability data |
| 37 | Organisational | Independent Advice | QQ advice on FOD ingestion during taxi checks |
| 42 | Human | Fault Diagnosis | ECS fault diagnosis called up engine ground runs |
| 43 | Human | Corrective Maintenance | Replacement of parts called up engine ground runs |
| 48 | Technological | Armament & Electrical Systems | Electrical system did not function in CB pulled state |
| 49 | Technological | Mechanical Systems | ECS not performing to specification |
| 51 | Technological | Propulsion | Thrust reverse system did not operate - failure to provide thrust in required direction |
| 54 | Human | Operate Aircraft | Pilot selected thrust reverse |
| 55 | Human | Pre-Flight Checks | Did not highlight incorrectly set CBs or thrust reverse serviceability |
| 58 | Human | Supervise Maintenance | Supervision (including self-supervision) not adequate to identify failure to set CBs |
| 59 | Human | Independent Inspection | Not in place - now instigated |
| 60 | Human | Plan Weekly-Daily Flying Programme | Task did not match maintenance personnel resource |
| 64 | Organisational | Operate Shift Pattern | Caused discontinuity in tasking |
These functions of interest can be selected as layers within the Visualisation
Tool, with colour added to highlight coupled functions. Because 23 functions
have been identified as having variability the visualisation of the FRAM Model
Instantiation is still very complicated.
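The selection step described above can be sketched in a few lines of Python. This is only an illustration: the function numbers and types are transcribed from Table 6-3, while the names `VARIABLE_FUNCTIONS`, `selected_layers` and `by_type` are hypothetical and do not correspond to anything in the Visualisation Tool itself.

```python
from collections import Counter

# (function number, type) pairs transcribed from Table 6-3.
VARIABLE_FUNCTIONS = [
    (1, "Human"), (5, "Human"), (6, "Human"), (11, "Human"),
    (15, "Organisational"), (23, "Organisational"), (24, "Organisational"),
    (30, "Organisational"), (31, "Organisational"), (32, "Human"),
    (35, "Human"), (37, "Organisational"), (42, "Human"), (43, "Human"),
    (48, "Technological"), (49, "Technological"), (51, "Technological"),
    (54, "Human"), (55, "Human"), (58, "Human"), (59, "Human"),
    (60, "Human"), (64, "Organisational"),
]

# The set of layers an analyst would switch on for this instantiation.
selected_layers = {number for number, _ in VARIABLE_FUNCTIONS}

# A simple breakdown of the selected functions by type.
by_type = Counter(kind for _, kind in VARIABLE_FUNCTIONS)

print(len(selected_layers), dict(by_type))
```

Even this simple count shows why the resulting picture remains crowded: the selection spans human, organisational and technological functions rather than a single part of the system.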
Figure 6-2 Thrust Reverser Incidents Visualisation
[Figure not reproduced: TASM visualisation in which each numbered function is drawn as a FRAM hexagon with Output, Control, Precondition, Input, Time and Resource (O, C, P, I, T, R) aspects.]
6.2.4 The Sources of Variability
The output of the propulsion system and the electrical system on which it relies
varied outside of the required performance envelope in that the thrust reverse
did not deploy because the upstream electrical function did not provide the
required output. In this case the electrical system output was an extreme case
of output variability – there was no power supplied to the thrust reverse circuit
when demanded. There were potentially other dimensions in which the
electrical output might have varied e.g. power, current, voltage etc.
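Figure 6-3 Propulsion & Electrical System
[Figure not reproduced. Labels shown: Armament & Electrical Systems (48); No Electrical Supply to Thrust Reversers; Propulsion (51); No Reverse Thrust; Dynamic Environment.]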
Clearly this situation arose because the upstream maintenance functional
output meant that the electrical system was in the wrong configuration (CB
pulled). Thrust reversers are used on nearly every Tornado sortie – what then
was the key element of variability that made these instances different to most
other times the aircraft was operated? In each case, human maintenance
activity was required on a system that had an upstream connection with the
electrical system prior to the occurrence. Every Tornado flight requires a
significant degree of variable human functions to allow it to take place. Using
FRAM and the TASM, an occurrence investigator needs to establish:
- How did the functions come to combine in a manner that was potentially
hazardous to the system?
- Given that functional output variability is normally sufficiently damped so as
not to produce a hazardous output (e.g. thrust reverse normally operates
correctly), what damping function that is normally present was not
adequate in this case?
The TASM visualisation tool can be used to trace back through the system to
identify where functional resonance has occurred. Figure 6-4 highlights
potentially functionally resonant activities, which can then be examined in the
FRAM Model – shown with red outlines in Table 6-4:
Figure 6-4 Electrical System Potential Functionally Resonant Activities
[Figure not reproduced: TASM visualisation with the activities linked to the Armament & Electrical Systems function (48) highlighted, including the output ‘No Electrical Supply to Thrust Reversers’.]
Table 6-4 FRAM Model of Electrical System
Name of Function: Armament & Electrical Systems
Aspect | Description of Aspect | Upstream Function Number | Upstream Function Name | Upstream Function Aspect | Most Likely Dimension of Upstream Output Variability | Description of Most Likely Upstream Output Variability | Frequency of Upstream Output Performance Variability | Amplitude of Upstream Output Performance Variability | Possible effect on this (downstream) Function | Possible effect on this (downstream) Function Output Variability (Damping) | Rough Downstream Function Variability Score
Input | Ground Services (Electrical Power Generation) | 20 | Ground Services (Cooling, Power, Dehumidification, Steps, Staging, Bungs, Blanks) | AC connected/removed to ground services - Arm Elect, Structure, Mech Sys, | Sequence | Omissions - items left attached or not fitted | Medium | Medium | No electrical output during maintenance | INCREASE | 12
Input | Propulsion System (Electrical Power Generation) | 51 | Propulsion | Electrical Power | Force/Distance/Direction | Fail to provide power | Low | High | No electrical power | INCREASE | 9
Output | Electrical Power/Signals | | | | | | | | | |
Precondition | Apply Special Instructions (Technical) | 24 | Apply Special Instruction (Technical) | Special Instruction (Technical) Applied to applicable | Timing/Duration | Instruction not complied within specified time | High | High | Unsafe condition develops | INCREASE | 27
Precondition | Scheduled Maintenance | 2 | Scheduled Maintenance | AC Inspected (all systems) | Sequence | Omission | High | Medium | Unsafe condition develops due to component failure | INCREASE | 18
Precondition | Repair Maintenance | 40 | Repair Aircraft | Aircraft Structure Repair | Timing/Duration | Not completed within time | High | Medium | Incorrect functional output or unsafe condition | INCREASE | 18
Precondition | Corrective Maintenance | 43 | Corrective Maintenance | System Restored to correct function | Force/Distance/Direction | Ineffective maintenance action | High | High | Function impaired | INCREASE | 27
Precondition | Modify Aircraft | 23 | Modify Aircraft | Aircraft Systems Modified under Service | Sequence | Modification occurs in wrong sequence - config control | Medium | High | Function impaired | INCREASE | 18
Precondition | Fit/Remove Role Equipment and Weapons | 11 | Fit/Remove Role Equipment/Weapons/Explosives/Ejection Seats | Aircraft in Changed Role Fit (Arm Elec/Weapons/Crew | Timing/Duration | Takes longer than forecast/required | High | Low | Incorrect connection impairs signal to weapons | INCREASE | 9
Precondition | Pre-Flight Checks | 55 | Pre-Flight Checks | Armament & Electrical Systems Checked | Timing/Duration | Omissions | High | Medium | Function impaired remains in a failed state whilst airborne | NO CHANGE | 12
Resource | Structure | 50 | Aircraft Structure | Armament & Electrical System Loads are | Force/Distance/Direction | Fails to react load | Low | High | Potential for electrical shorting or sparking | INCREASE | 9
Control | Operate Aircraft | 54 | Operate Aircraft | Inputs to aircraft systems | Force/Distance/Direction | Incorrect control input | High | High | Potential for unsafe condition | INCREASE | 27
Time | Not initially described | | | | | | | | Not initially described | NO CHANGE | 0
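The ‘Rough Downstream Function Variability Score’ column in Table 6-4 appears to follow a simple multiplicative rule. The sketch below is an inference from the printed values only, not a rule stated in this section: it assumes Low, Medium and High map to 1, 2 and 3, that NO CHANGE and INCREASE carry weights of 2 and 3, and that an undescribed coupling scores 0. The names `LEVEL`, `DAMPING` and `rough_score` are hypothetical. No row in the table shows a damping (decreasing) effect, so no weight is assumed for that case.

```python
# Ordinal weights inferred from the scores printed in Table 6-4 (an assumption).
LEVEL = {"Low": 1, "Medium": 2, "High": 3}
DAMPING = {"NO CHANGE": 2, "INCREASE": 3}


def rough_score(frequency, amplitude, effect):
    """Frequency x amplitude x effect weight; 0 when variability is undescribed."""
    if frequency is None or amplitude is None:
        return 0
    return LEVEL[frequency] * LEVEL[amplitude] * DAMPING[effect]


# (upstream function number, frequency, amplitude, effect, printed score),
# transcribed row by row from Table 6-4; the last row is the Time aspect.
ROWS = [
    (20, "Medium", "Medium", "INCREASE", 12),
    (51, "Low", "High", "INCREASE", 9),
    (24, "High", "High", "INCREASE", 27),
    (2, "High", "Medium", "INCREASE", 18),
    (40, "High", "Medium", "INCREASE", 18),
    (43, "High", "High", "INCREASE", 27),
    (23, "Medium", "High", "INCREASE", 18),
    (11, "High", "Low", "INCREASE", 9),
    (55, "High", "Medium", "NO CHANGE", 12),
    (50, "Low", "High", "INCREASE", 9),
    (54, "High", "High", "INCREASE", 27),
    (None, None, None, "NO CHANGE", 0),
]

# Rows where the assumed rule disagrees with the printed score.
mismatches = [row for row in ROWS if rough_score(row[1], row[2], row[3]) != row[4]]
print(mismatches)
```

Under these assumed weights every printed score in the table is reproduced, which suggests the score is intended as a coarse ranking aid rather than a calibrated measure.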
It is important to note that the visualisation tool automatically highlights all
activities which are linked to the electrical system function and any other
function identified in Table 6-3; this does not necessarily mean that these
activities were functionally resonant. To understand the relationship further it is
necessary to compare the model data to the occurrence report described
above. This shows that neither the operator function (pilot) nor the propulsion
system (providing power) output variability was significant during the
occurrence. This left the Apply Special Instructions (Technical), Corrective
Maintenance and Pre-Flight Checks aspects of the Electrical system function.
These three upstream functions are linked by various activities to the
‘precondition’ aspect of the Electrical System function. In this occurrence, all
three of these functions should have resulted in the CBs being correctly set.
The variation in their functional output meant that the CBs were incorrectly set
and the preconditions (otherwise termed ‘execution conditions’) for the Electrical
System were not present and therefore the electrical signal was not sent to the
thrust reverse element of the propulsion system. Table 6-5 shows these three
preconditions highlighted within the FRAM Model.
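The narrowing described in this paragraph can be sketched as a filter over the upstream couplings of the Electrical System function. The sketch is illustrative only: the couplings are transcribed from Table 6-4 and the noted functions from Table 6-3, while the names `UPSTREAM_OF_48`, `NOTED_IN_REPORTS`, `NOT_SIGNIFICANT` and `shortlist` are hypothetical and are not part of the TASM or the visualisation tool.

```python
# Upstream functions coupled to Armament & Electrical Systems (48), grouped by
# the aspect they feed, transcribed from Table 6-4.
UPSTREAM_OF_48 = {
    "Input": [20, 51],
    "Precondition": [24, 2, 40, 43, 23, 11, 55],
    "Resource": [50],
    "Control": [54],
}

# Functions with variability noted in the DASORs, transcribed from Table 6-3.
NOTED_IN_REPORTS = {1, 5, 6, 11, 15, 23, 24, 30, 31, 32, 35, 37, 42, 43,
                    48, 49, 51, 54, 55, 58, 59, 60, 64}

# Upstream functions whose output variability the text judges not significant
# (Propulsion and Operate Aircraft).
NOT_SIGNIFICANT = {51, 54}


def shortlist(upstream, noted, excluded=frozenset()):
    """Keep, per aspect, the upstream functions that were noted and not excluded."""
    result = {}
    for aspect, functions in upstream.items():
        kept = [f for f in functions if f in noted and f not in excluded]
        if kept:
            result[aspect] = kept
    return result


highlighted = shortlist(UPSTREAM_OF_48, NOTED_IN_REPORTS)
remaining = shortlist(UPSTREAM_OF_48, NOTED_IN_REPORTS, NOT_SIGNIFICANT)
print(highlighted)
print(remaining)
```

Under these assumptions only precondition couplings remain once Propulsion and Operate Aircraft are set aside. The text then concentrates on functions 24, 43 and 55; that further narrowing rests on the investigator's reading of the occurrence reports, which the sketch does not attempt to reproduce.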
Table 6-5 Electrical System Precondition Variability
Name of Function: Armament & Electrical Systems
[The content of this table is identical to Table 6-4; the three precondition rows discussed above are reproduced here.]
Aspect | Description of Aspect | Upstream Function Number | Upstream Function Name | Upstream Function Aspect | Most Likely Dimension of Upstream Output Variability | Description of Most Likely Upstream Output Variability | Frequency of Upstream Output Performance Variability | Amplitude of Upstream Output Performance Variability | Possible effect on this (downstream) Function | Possible effect on this (downstream) Function Output Variability (Damping) | Rough Downstream Function Variability Score
Precondition | Apply Special Instructions (Technical) | 24 | Apply Special Instruction (Technical) | Special Instruction (Technical) Applied to applicable | Timing/Duration | Instruction not complied within specified time | High | High | Unsafe condition develops | INCREASE | 27
Precondition | Corrective Maintenance | 43 | Corrective Maintenance | System Restored to correct function | Force/Distance/Direction | Ineffective maintenance action | High | High | Function impaired | INCREASE | 27
Precondition | Pre-Flight Checks | 55 | Pre-Flight Checks | Armament & Electrical Systems Checked | Timing/Duration | Omissions | High | Medium | Function impaired remains in a failed state whilst airborne | NO CHANGE | 12
By iteratively examining both the TASM and the visualisation tool it is possible
to construct an instantiation of the occurrence, in so far as the investigation
report details it. Copying and pasting the functions and their linked activities
into a new Visio drawing creates a slightly clearer picture, as shown in
Figure 6-5. This instantiation uses the investigation report
to describe how the activities shown in the TASM at Appendix A varied in this
series of occurrences.
6.2.5 Insights from TASM
Figure 6-5 contains a number of insights that can be drawn from the accident
reports. The functions coloured red produced variability in their output, either
by leaving CBs in the incorrect position or by failing to spot this
condition before the aircraft flew. Purple functions provided a damping external
control to the red functions. In the reported occurrences these damping controls
were inadequate to ensure that the red functional output was delivered inside
appropriate bounds. For example, in the case of supervision, the function failed
to adequately monitor and adjust the output of the functions that carried out
work on the aircraft (corrective maintenance, apply SI(T), etc.). Moving further
back into the system, it is possible to see how functions carried out by the Type
Airworthiness Authority staff did not provide additional damping mechanisms
through mandating pre-flight checks or independent checks of the CBs. In the
latter case, this additional control loop was provided later. The balance of the
system was tipped by the need to carry out Routine Technical Inspections on
the Environmental Conditioning System, resulting in the disturbance of the CBs.
This process was in itself a method of controlling the performance of the ECS,
which had changed as a result of a modification some years prior to the occurrences.
Examination of Figure 6-5 shows that the Provide Maintenance Personnel
function was a significant source of variable system performance. A potential
control loop existed to damp the effects of this variability: the
planning process. This is a sub-process within the visualisation shown at Figure
6-5. Adjustments could have been made to maintain system stability by
adjusting the three month flying programme to account for the lack of SQEP.
This would have flowed through to the weekly flying programme, which provides
a time constraint on the performance of the various maintenance functions. This
method of incident analysis does not provide a chain of events or failure mode
explanation of what happened; rather, it paints a picture of a system tipped out
of balance, resulting in a hazardous physical output.
Figure 6-5 Instantiation of Thrust Reverse Occurrence Reports
[Figure not reproduced: FRAM instantiation linking the functions listed in Table 6-3, together with Maintenance Personnel (8) and Publish Aircrew Publications (29), through labelled activities such as ‘CBs not correctly reset’ and ‘No Electrical Supply to Thrust Reversers’.]
Legend: Excess Output Variability; Inadequate Damping; Direct Interface with Aircraft; Aircraft Systems; Harmful Variability; Variable Activity.
Notes: Red lines show activities with unacceptable variability. Black lines show other activities recorded or inferred as having significant variability in the Occurrence Investigations. Other parts of the system are not shown for clarity. Variability could be traced back to other functions not currently shown, with further investigation.
6.2.6 Incident 2 – Missing Rigging Pin
The Tornado is equipped with a first generation fly-by-wire flight control system,
with a reversionary mechanical control system as a back-up. Flying control
surfaces are mechanically actuated through hydraulics. When components
within the system are disturbed or removed for maintenance, there is a
requirement to fit rigging pins to hold the remaining components in the correct
configuration. The issue of such pins is strictly controlled by tool control
processes mandated by regulation. These processes are designed to prevent
pins being inadvertently left fitted to the aircraft and restricting control
movement in flight.
6.2.7 Description of Incident
An incident (MAA, 2011b) occurred during maintenance when a set of rigging
pins was found to be deficient of a single pin, after they were provided from tool
stores to aid work on an aircraft. Having found that the pin was missing, all
further flights on the Squadron were delayed whilst a search for the missing
item was conducted. Following an examination of paperwork records, the pin
was eventually found inside a second aircraft where the set had previously been
used. This aircraft was in the process of being rebuilt to an airworthy condition
following maintenance. The incident clearly presented a near-miss in that it was
only by chance that the set of pins was re-used before the aircraft with the loose
pin inside was flown. The loose article hazard presented by the pin was
sufficient to potentially cause a control restriction and conceivably cause loss of
the aircraft.
6.2.8 Summary of Investigation
The Occurrence Investigation report identified a series of contributory factors.
The occurrence took place during a period of disruption caused by military
operations over Libya (Operation ELLAMY). The units remaining in the UK were left
deficient of a variety of technicians and the maintenance and flying task was felt
not to have been adequately reduced to compensate. The first element of the
maintenance task required removal of the High Lift and Wing Sweep Control
Unit (HLWSCU). At this stage the required rigging pins were not fitted, this
having become the normal practice at the unit concerned. When installation of
the component occurred some days later, the technician involved at the time
realised that pins had not been fitted and withdrew a set from the tool stores. Of
the set of four pins that were required to be fitted for the HLWSCU replacement,
only three were in fact fitted, with two of them being placed correctly and the
third placed in an incorrect position and not fully engaged in the mechanism
(see dashed line in Figure 6-6). There was some confusion between the
technician and his supervisor over the correct location for the third pin, although
neither of them referred to the approved data.
Figure 6-6 Location of Where Lost Pin was Installed (MAA, 2011b)
The task was identified as being particularly complicated with less than totally
clear information provided by maintenance procedures (approved data).
Compounding the issue was the lack of experience of the supervisor and the
overall lack of resources on the maintenance unit due to manpower being
drawn to the Libyan air campaign. As shown in Figure 6-7, the access to the
area was difficult for the technicians.
Figure 6-7 General Installation Location of Lost Pin (MAA, 2011b)
A week then elapsed before other functional tests were carried out; these
initially failed because the rigging pins prevented control movement. The
technician performing the test realised this and removed the two pins from the
normal HLWSCU location but omitted to remove the third, erroneously placed
pin. Because the third pin had not been correctly inserted, it failed to prevent
movement of the controls during functional testing and its presence remained
unnoticed. Once mechanical aspects of the task were complete, all tools were
returned, including the set of pins. The technician who returned the set failed
to check its contents and returned it directly to the tool stores shelf. Seeing that
it had been returned to the shelf, the worker in tool stores did not check its
contents either. The investigation concluded that the tool stores worker’s
training for the task was inadequate. A further 100% check was required at the
end of the shift but the inspector who should have carried this out did not do so.
During the series of handovers as the task progressed, regulation required
100% tool checks at each handover. It is not clear whether these were carried
out; however, the configuration of the box in which the pin set is kept is not
conducive to highlighting missing items, e.g. there is no ‘shadowing’ of the
items (see Figure 6-8). The contents list was also not clear.
Figure 6-8 Pin Location in Tool Kit (MAA, 2011b)
6.2.9 Instantiation of the TASM
The first step was to take the points raised in the DASOR investigation and map
these to functions within the FRAM model:
Table 6-6 Functional Variability Noted From Investigation
Number | Type | Function | Variation noted in DASOR
4 | Organisational | 3 Month Flying Programme | Output not matched to shift resources
5 | Human | Task Maintenance | Lack of continuity in tasking e.g. broken activity
6 | Human | Record Work Done on ac | Pins not accounted for in records
7 | Organisational | Train Maintenance Personnel | Insufficient on the job training for supervisor
8 | Organisational | Provide Authorised Maintenance Personnel | Insufficient manpower to match the flying programme
13 | Organisational | Provide & Account for Tools and Test Equipment | Pin returned to store with missing item not checked
21 | Organisational | Force and A4 Operations | Most experienced manpower diverted to Operations elsewhere
30 | Organisational | Publish Approved Data (Tech Manuals & Policy) | Insufficient specification of rigging pin placements
43 | Human | Corrective Maintenance | Not carried out iaw approved data; functional test passed with pin in place
49 | Technological | Mechanical Systems | Failures required corrective maintenance, pinning required
50 | Technological | Aircraft Structure | Pin did not interface correctly with structure
57 | Human | Handover | 100% tool check not complete at handover; pin location not specified
58 | Human | Supervise Maintenance | Supervision not close enough to spot errors
64 | Organisational | Operate Shift Pattern | Shift pattern broke task into fractured elements causing discontinuity
Using the FRAM Visualisation these functional layers can be selected to
produce the diagram shown in Figure 6-9.
As with the previous investigation, the visualisation tool only provides an initial
step in the investigation process, highlighting all links between those functions
noted as having significant output variability. These linked functions can be
easily selected and pasted into a new drawing where pertinent information
relating to the variability can be added to the diagram to produce a more
complete picture of the incident, illustrated in Figure 6-10.
Figure 6-9 Visualisation Tool Output for Rigging Tool Occurrence
[Figure not reproduced: TASM visualisation with the functional layers listed in Table 6-6 selected.]
Note 2: a Visio software bug means that some connections become ‘un-glued’ when copying and pasting as images; this results in some lines being erroneously pasted into the corner of the drawing. On-screen performance is not affected.
Figure 6-10 Instantiation of Rigging Pin Occurrence
[Diagram content. Functions shown (number, name): 4 3 Month Flying Programme; 5 Task Maintenance; 6 Record Work done on Aircraft; 7 Train Maintenance Personnel; 8 Maintenance Personnel; 13 Tools & Test Equipment; 21 Force & A4 Operations; 30 Publish Approved Data; 43 Corrective Maintenance; 49 Mechanical Systems; 57 Handover; 58 Supervise Maintenance; 60 Plan Weekly/Daily Flying Programme; 64 Operate Shift Pattern.
Variability labels on the linking activities: Flying Requirement Not Sufficiently Reduced for Additional Operational Requirement; Reduced number of SQEP Technicians on Shift List; Reduced Engineering Resources; Lack of Continuity in Tasking; Insufficient Manpower to Meet Requirement; Insufficient Time to Conduct Continuous Work on Task; Rigging Pin Left In Mechanical System; Inadequate Time Allowed to Conduct Handover; Supervisor was not SQEP; Supervisor Had Received Insufficient On-the-Job Training; Trained Personnel Diverted to Operations – No Back Fill; Flying Requirement not Matched to Engineering Resource (shown on several links); Discontinuity in Allocation of Personnel to Task; Supervision did not highlight errors in pin placement; NCO in Charge of Tool Stores was Not Available; Supervisor not SQEP; Placement of Rigging Pin Inadequately Defined; Rigging Pin Set Issued with Pin Missing; Location of Rigging Pins Not Described in Handover; HLWSCU Fit/Removal/Test Repeatedly Interrupted to Divert Resources; Position of Rigging Pins Inadequately Documented; Supervision Broken Across Shifts; Supervisor Time Required for Higher Priority Tasks; Insufficient Total Manpower to Match the Task; Rigging Pin Set Returned with Pin Missing; Tool Stores Worker Inadequately Trained; Functional Test Passed with Pin In place; Rigging Pin Incorrectly Positioned; Handover Tool Checks not Completed; Supervision did not ensure correct pin placement.]
Notes: Red lines show activities with unacceptable variability. Black lines show other activities recorded or inferred as
having significant variability in the Occurrence Investigations. Other parts of the system are not shown for clarity. With further investigation, variability could be traced back to other functions not
currently shown.
[Key to function symbols: Excess Output Variability; Inadequate Damping; Direct Interface with Aircraft; Aircraft Systems. Key to lines: Harmful Variability; Variable Activity.]
6.2.10 Insights from TASM
The TASM again shows this incident as a control problem; Figure 6-10 provides
an instantiation of the TASM. It shows in red a number of activities which linked
functions and caused output variability to permeate downstream through the
system. Other activities are shown where they are mentioned in the RAF
investigation. Many of these activities could have exerted more control over the
functions that produced variable output. In some cases a complete control loop
was missing – despite the practice of CBs not being ‘safety-tagged’ when in the
pulled condition, there was no quality control over this practice. The quality
function is shown as having an unlinked activity, which for the purpose of this
instantiation meant that there was no output. The main variability in question in
this occurrence was that surrounding the performance of tool control processes
and the way the corrective maintenance was conducted on the mechanical
system. There was a potentially harmful variability from the corrective
maintenance function in that the rigging pin was left in the system, which
resonated with the way that the mechanical system performed under the
functional test – passing with the pin in place. If the whole system had been
operating within acceptable bounds of control then the supervision and tool
control functions would have provided further damping through checks and
adjustments on the way that the scheduled maintenance was conducted. It was
serendipity rather than ‘design for resilience’ that provided a warning that
there was a tool control issue before the aircraft was released in an
unairworthy condition. A source of variability that is shown to permeate through
the system was the operational plan to divert resources to Operation ELLAMY.
The flying and shift programming functions did not adjust their outputs to
compensate adequately and neither did the maintenance tasking function. A
potential damping mechanism was therefore lost. The DASOR included an error
management investigation which focussed on the organisational and human
failings in the scenario. The benefit in using FRAM is that it shows how all of
these aspects of the situation are linked.
7 USING THE TORNADO AIRWORTHINESS SYSTEM
MODEL FOR RISK ANALYSIS
The use of FRAM for risk analysis has been the subject of discussion amongst
researchers and an accepted form of practice has not yet been developed. This
chapter seeks to contribute to this development process. The FRAM attempts to
provide a more complete solution for managing risks in complex systems in
comparison to other more linear methods. In this chapter the existing risk
management process is described along with the current theoretical basis for
risk management. A new theoretical basis is proposed and then a new risk
assessment process is given. This is followed by a detailed example. The new
theoretical basis and assessment technique are then combined to give a new
approach to risk management.
7.1 Case for Using TASM for Risk Analysis
Airworthiness (or ‘equipment safety’) risk management systems employed by
the MOD in relation to Tornado include the construction and management of a
Weapon System Safety Case, which uses Goal Structuring Notation (GSN). Part
of the safety case argument (Manson, 2001) is the requirement to manage
equipment safety risks to within the MAA’s targets for Risk of Death from All
Causes as required by RA1210 (MAA, 2012a). This is achieved through the use
of an Equipment Hazard Management Process (LI-BS0056) summarised in
Figure 7-1 (MOD, 2013a). A Fault Tree Loss Model is used to aid this process
of assessment. The Air Safety Duty Holder then maintains a platform risk
register, which forms part of the overall Operational Safety Case. Bow tie
models are beginning to be used to aid the assessment of operating safety
risks. All of these models are based on the assumption that safety is a resultant
property of the aggregated activities, arguments or elements of the model or
safety case. The resilience engineering understanding of safety is that it is
equivalent to its converse condition (an accident) and both of these system
states are emergent properties. The purpose of developing the TASM for risk
assessment is to provide a resilience engineering view of safety risks as a more
realistic contrast to linear methods such as bow tie, which have the potential to
produce a false level of accuracy when applied to a human-centric
system such as the Tornado airworthiness system. A resilience engineering
perspective may produce either a more positive or a more negative view of a
particular risk but, depending on the level of complexity that applies to the process under
consideration, it is unlikely to be able to produce a quantifiable assessment. To
achieve quantification, Bayesian or fuzzy logic principles are required. It should
be noted of course that the risk assessment process is an implied part of the
‘cost-benefit analysis’ function carried out by the Tornado Engineering Authority.
Figure 7-1 Tornado Process for Emergent Airworthiness Issues (MOD, 2013)
With regard to safety critical complex systems in general, isolating individual
potential risks is challenging due to the interconnected nature of such systems.
Initially FRAM analysis will provide a Resilience Engineering assessment of
risks that have already been identified within a system. It may allow a more
realistic understanding of the nature of particular hazards and how they may be
avoided or mitigated. Typically, hazard logs and risk registers record isolated
hazards/risks; FRAM has the potential to more accurately describe how both
hazards and mitigating (or damping) factors are linked. Many accident reports
detail seemingly unlikely combinations of unfortunate circumstances; FRAM
seeks to deal more effectively with harmful combinations of varying functional
output. This provides a novel approach to assessing Common Cause
failures during airworthiness assessments, particularly with respect to
maintenance or design ‘error’. As the approach develops, it may be possible to
identify the potential for previously unexpected risks to emerge. Risk analysis
techniques need to work from a theoretical basis for the origin of risk.
7.2 Current Theoretical Basis for Airworthiness Risk
Management
The current theoretical basis for managing Tornado airworthiness risk is shown
in Figure 7-2, which is adapted from the Hazard log structure illustrated in Local
Instruction BS0056 (MOD, 2013a). The overall risk to life from a particular
accident scenario is calculated by means of adjusting the historical reliability or
event rate data within the fault tree loss model to reflect the new issue
identified. Alternatively, a qualitative assessment based on engineering judgement,
using broad likelihood and consequence categories, may be employed. Figure
7-2 does not illustrate a process; the arrows indicate the aggregation of risk.
The current theory, as illustrated, starts with various system controls which
prevent accident causes emerging, which in turn lead to hazards which are then
subject to additional controls. Potential accident scenarios may develop
dependent on the likelihood of the preceding elements in the chain. These
potential accidents may develop through a series of events, some of which may
prevent the situation developing into an actual accident. The likelihood and the
severity of the accident, should it develop, are based on an arithmetic or qualitative
aggregation of all the preceding elements in the chain. This provides an
estimate for risk to life for a particular scenario which is then managed
(including changing elements in the chain) in the manner outlined in Figure 7-1.
The current theoretical basis implies that unless explicitly connected in some
way, adjustments to the various controls will provide a resultant increase or
decrease in the risk to life attributable to a particular potential accident scenario.
The advantage of the current theory is that it provides a basis from which risks
can be separated, considered in isolation and managed in an auditable manner.
An overall quantitative or qualitative estimation of risk to life is based on a linear
aggregation of estimated probabilities of hazards occurring and the
effectiveness of the various mitigating pre- or post-accident controls. Historical
data in the form of a loss model is used alongside data relating to specific
issues, such as failure rate data from inspections or tests (MOD, 2013). The
combination of loss models, hazard logs and risk registers is used to model
the system on the basis of the theory shown in Figure 7-2.
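The linear aggregation described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each control fails independently with a fixed probability and that the chain is simply multiplied through. The function name, the figures and the independence assumption are hypothetical and are not taken from the Tornado loss model.

```python
from math import prod


def scenario_likelihood(cause_rate, control_failure_probs):
    """Multiply a cause rate through a chain of controls.

    cause_rate: assumed rate of the initiating cause (e.g. per flying hour).
    control_failure_probs: assumed probability that each control fails to
    stop the sequence; the controls are treated as independent, which is
    an illustrative simplification.
    """
    return cause_rate * prod(control_failure_probs)


# Hypothetical figures, not drawn from any real data set.
baseline = scenario_likelihood(1e-3, [0.1, 0.05])
degraded = scenario_likelihood(1e-3, [0.2, 0.05])  # first control twice as likely to fail
```

Under this kind of model, doubling the failure probability of one control doubles the estimated likelihood regardless of what is happening elsewhere in the system. It is this separability that allows risks to be considered in isolation.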
7.3 Proposal of FRAM Based Airworthiness Risk Theory
Figure 7-2 Current Theoretical Basis for Tornado Airworthiness Risk Management
[Diagram elements: Controls; Cause; Hazard; Potential Accidents; Controls & Events; Accident; Risk to Life.]
Resilience engineering theory, and FRAM in particular, proposes that accidents
are an emergent result of system performance and that accidents themselves are
mitigated or prevented by the system behaviour once a hazard has begun to
emerge. In the TASM, potential accident sequences³ may be modelled in very
broad terms through the ‘operate aircraft’ and the various aircraft subsystem
technological functions. Figure 7-3 proposes a theoretical model for the
derivation of risks to life that is an alternative or complement to the current
theory shown in Figure 7-2. In this case it is not possible to directly calculate
quantitative risks without some means of describing the TASM itself in
quantitative terms, through some numerical or algebraic calculation of model
behaviour. As the TASM
already contains some qualitative descriptions of performance, it should also be
possible to provide a qualitative output in terms of risk. The risk to life may only
be calculated by estimating the likelihood of the system developing into a
functionally resonant state (or states) where an accident is generated.
³ With TASM an accident sequence starts when an aircraft is operating hazardously. It does not
refer to the way the airworthiness management system is behaving at any particular time.
Figure 7-3 Proposed Functional Resonance Risk Management Theory -
Visualisation of a Generic Hazardous Process
[Diagram elements, each function drawn as a FRAM hexagon with aspect labels O, C, P, I, T and R: Upstream Forcing Function (two shown); Upstream Damping Function (two shown); Upstream Background Function; External Dependency; Hazard Generating Function; HAZARD; Upstream/Downstream Damping Function; Downstream Aircraft System or Operating Function; ACCIDENT.]
The theory shown in Figure 7-3 makes it difficult to extract a meaningful
description of any risk from the system, as this risk is the product of both the
damping and varying performance of a function. The literature does not provide
examples of how this can be done using the existing FRAM; a bespoke
technique has therefore been developed. For an in-service system such as
Tornado, many risks are currently recorded and it is likely other potential risks
remain unrecorded. Such unknown risks may become apparent through close
examination and experimentation with the TASM. The reassessment of
currently recorded risk is easier and is the focus of this initial study. Clearly
airworthiness risk to life will only manifest itself through unacceptable variation
in the output of one of the FRAM functions relating to a physical element of the
aircraft. For example, the mechanical system might prevent proper control of
the aircraft or the structure may fail to react loads through a loss of structural
integrity. Defining an associated risk likelihood element depends on the
upstream performance variability of the system. Likelihood of hazardous
variability will also depend on the effectiveness of upstream/downstream
functions in providing damping against harmful variability. Where this
upstream/downstream damping fails to control the hazardous variability,
functional resonance occurs. In order to examine the risk associated with a
particular hazard, the following terms are defined:
Hazardous Process – All functions and activities contributing to the
hazard generation.
Hazard Generating Function – The aircraft system function whose
variable output directly generates a hazardous condition.
Upstream Forcing Function – A function whose output contributes to
forcing the generation of a hazardously variable output from a
downstream function.
Upstream Damping Function – A function whose output reduces the
variability of a downstream Hazard Generating Function.
Background Function – A function whose output can be assumed not
to vary to any significant degree and does not contribute to the variability
of the hazard generating function.
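The terms defined above can be sketched as a small data structure. This is an illustration only: the class and field names are hypothetical and do not describe the TASM spreadsheet model. The example borrows the corrosion hazard discussed in this section, and the roles given to Corrective Maintenance and Survival Equipment are assumptions made for the sake of the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    HAZARD_GENERATING = "hazard generating"
    FORCING = "upstream forcing"
    DAMPING = "upstream damping"
    BACKGROUND = "background"


@dataclass
class Function:
    number: int
    name: str
    role: Role = Role.BACKGROUND


@dataclass
class HazardousProcess:
    hazard: str
    functions: list = field(default_factory=list)

    def contributing(self):
        """Return the functions that shape the hazard (all except background)."""
        return [f for f in self.functions if f.role is not Role.BACKGROUND]


corrosion = HazardousProcess(
    hazard="Corrosion of the aircraft structure",
    functions=[
        Function(50, "Aircraft Structure", Role.HAZARD_GENERATING),
        Function(43, "Corrective Maintenance", Role.FORCING),  # assumed source of maintenance damage
        Function(41, "Structural Inspections", Role.DAMPING),
        Function(56, "Survival Equipment"),  # assumed background function
    ],
)
```

Keeping the role as an attribute of the function within a given hazardous process, rather than of the function in general, reflects the point that the same function may force one hazard and damp another.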
A hazardous process and hence a hazard can emerge in any part of the
system. An airworthiness related accident can only occur as a result of one or
more hazardous processes producing uncontrollable variable output from
aircraft system functions (structure fails, electrical fire, loss of power etc.). In the
majority of cases this would be deemed a technical failure in the existing
Tornado air safety risk management process. For example, a hazard might be
corrosion of the aircraft structure. An accident relating to this hazard would
involve the loss of structural integrity as a result of an out of limits variability in
the react-loads output from the aircraft structure function. Corrosion can be
considered to be largely due to internal variability as it is related to the material
of the structure. This is of course based on the assumption that the environment
is not a function within the TASM. If there is significant variability in the
environment experienced across the fleet, then this should be mapped as a new
function in the TASM. Corrosion may also be caused by external variability from
other functions – damage inflicted during maintenance or contamination from
other aircraft systems for example. These would be upstream forcing functions
and could link together in a hazardous process. Damping of this negative
variability is provided by a series of control processes acting on the aircraft
structure function, for example structural inspections called up during scheduled
maintenance and specified by the Engineering Authority in the approved data as
a result of a cost benefit analysis. Airworthiness related issues could also play a
part in generating accidents by providing an upstream forcing function for the
‘operate aircraft’ function whilst still providing an output that varies
within the bounds of the system specification. For example, the output of the
avionics flight system may provide an accurate but potentially confusing signal
to aircrew, which when combined with the internal variability of the ‘operate
aircraft’ function may result in an accident. Test and Evaluation is intended to
identify such hazardous variability.
7.4 Proposal for a FRAM Based Risk Assessment Process
As already discussed, quantitative risk assessment is difficult using the FRAM.
It would be possible to apply probabilistic data to the outputs of various
functions in the TASM; however, without using fuzzy or Bayesian
techniques it is not possible to model how these probabilities aggregate through
any particular process. To do so would require assumptions to be made that all
functions not considered are background functions and will not vary in output as
a result of the variability in the process under consideration. An inspection of the
TASM shows that the level of connectivity between all functions means that any
such assumption is of dubious validity. Figure 7-4 shows the risk assessment
methodology developed for this project. This iterative process expands through
the TASM allowing all functions in the hazardous process to be highlighted and
then the hazardous output variability to be re-assessed based on the forcing
and damping functions in play.
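The iterative expansion through the model can be sketched as a simple graph walk. This is a minimal sketch under stated assumptions: the links between functions are supplied as plain dictionaries, and the classification of each upstream function is supplied by the analyst. The function and argument names are hypothetical, and the downstream links are assumed to contain only those functions the analyst has judged to be affected. The function numbers in the example are those used in the worked example in Section 7.5; the set of accident-capable functions is an assumption.

```python
from collections import deque


def assess(initial_hgf, upstream, downstream, classify, can_cause_accident):
    """Walk the model outwards from an initial hazard generating function (HGF).

    upstream / downstream: dicts mapping a function to its linked functions.
    classify(hgf, fn): returns "forcing", "damping" or "background".
    can_cause_accident(fn): True when the output of fn can constitute an accident.
    All of these stand in for analyst judgement.
    """
    layers = {}
    queue, seen = deque([initial_hgf]), {initial_hgf}
    while queue:
        hgf = queue.popleft()
        layers[hgf] = {"forcing": [], "damping": [], "background": []}
        for fn in upstream.get(hgf, []):
            layers[hgf][classify(hgf, fn)].append(fn)
        if can_cause_accident(hgf):
            continue  # end of this branch; output variability is assessed here
        for nxt in downstream.get(hgf, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return layers


roles = {(63, 6): "forcing", (63, 16): "forcing", (63, 8): "damping",
         (63, 34): "damping", (63, 28): "damping", (63, 30): "damping"}
layers = assess(
    63,
    upstream={63: [6, 16, 8, 34, 28, 30]},
    downstream={63: [2, 21]},
    classify=lambda hgf, fn: roles.get((hgf, fn), "background"),
    can_cause_accident=lambda fn: fn in {49, 50, 51},  # assumed aircraft system functions
)
```

The returned layers correspond to what would be highlighted in the visualisation; the qualitative reassessment of output variability remains a matter of judgement and is not automated here.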
Figure 7-4 FRAM Model Risk Assessment Process
[Flowchart boxes, in reading order:
1. Identify Hazard.
2. Identify the initial/next Hazard Generating Function and select in Visualisation Tool.
3. Select Upstream Function.
4. Decision: Is Upstream Function Forcing, Damping or Background? Forcing: assign to a Forcing Layer. Damping: assign to a Damping Layer. Background: assign to a Background Layer.
5. Decision: Any Upstream Functions remaining? (Yes/No)
6. Produce/Print Visualisation Tool Output highlighting damping and forcing functions and activities.
7. Decision: Can Function Output constitute accident? (Yes/No)
8. Highlight Damping and Forcing activities within FRAM Spreadsheet Model.
9. For each Hazard Generating Function assess whether combination of highlighted dependent activities (damping + forcing) changes Hazardous Output variability.
10. Decision: Output Variability Needs Adjustment? (Yes/No)
11. Adjust frequency/amplitude of hazardous output variability.
12. Make qualitative assessment as to how likely it is that hazardous output variability will exceed safe level.
13. Record Risk in terms of Likelihood and Severity.]
The TASM includes a risk assessment in terms of frequency and amplitude of
variability and Figure 7-5 shows how functional variability may differ. There
will be some level which defines a safe limit, although this may differ according to
the total system performance condition.
[Plot: horizontal axis Time; vertical axis Output (Force, Information, etc.). Two traces, Digital Variability and Complicated Variability, are shown against the safe limits of variability; an accident sequence initiates where the output becomes unsafe.]
Figure 7-5 Demonstration of Variability Resulting in Accident Sequence
If this variability is digital in nature (e.g. failed or not failed) then the frequency of
failure may be treated as analogous to the likelihood element of a traditional risk
rating. If the variable output is more complicated in nature (as shown in Figure
7-5) then this is more problematic to map to a likelihood measure. Severity of
any potential accident depends on the extent of out of control variability in the
activities linking functions directly associated with any accident. This is
problematic as the starting point for risk analysis is usually upstream in the
system and there are a multitude of potential outcomes from this upstream
variability. The FRAM output could more reasonably be mapped to an area
rather than a point on a likelihood-severity risk matrix.
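For the digital case, the analogy between frequency of failure and likelihood can be sketched as follows. This is an illustration only: it assumes that the output of a function can be sampled at regular intervals and compared with fixed safe limits, which the TASM does not itself provide. The function name and the sample values are hypothetical.

```python
def exceedance_frequency(samples, lower, upper):
    """Count excursions of an output outside its safe limits.

    Returns (number of separate excursions, fraction of samples outside limits).
    """
    outside = [not (lower <= s <= upper) for s in samples]
    # An excursion starts wherever a sample is outside and the previous one was not.
    excursions = sum(
        1 for prev, cur in zip([False] + outside, outside) if cur and not prev
    )
    return excursions, sum(outside) / len(samples)


# Hypothetical digital output: 0 = not failed, 1 = failed.
digital = [0, 0, 1, 0, 0, 1, 1, 0]
excursions, fraction = exceedance_frequency(digital, -0.5, 0.5)
```

For more complicated variability the same count says little about how far or for how long the output strays, which is one reason the result is better treated as an area than as a point on a risk matrix.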
Figure 7-5 shows that for every type of functional output from an aircraft system
there will be some level at which output will become unsafe; this depends on
the damping in place downstream at that moment. Where downstream damping
is inadequate this is where functional resonance occurs and system
performance becomes harmful and thus an accident develops. Fundamentally,
resilience engineering presents a very different notion of the origin of hazards
and the nature of accidents. This does not easily fit within the bounds of current risk
management practice or regulation – RA 1210 (MAA, 2013b).
7.5 Risk Example – Operation of Components in Excess of
Cleared Life
Within the Tornado airworthiness system there have been historical difficulties
managing the quality of data relating to component lives within the Logistics
Information System (LITS). There have also been difficulties in tracking the
consumption of component shelf life for items in storage. This leads to a hazard
arising from out-of-life components that may be installed and then operated in
the aircraft systems in excess of their cleared life. Consequently there is an air
safety risk due to potential component failure. The issue clearly encompasses
technical, organisational and human functions; as such it makes an interesting
case study for the TASM. The mitigating/damping actions currently in place
include a Tornado Asset Gateway (TAG) Team whose task was the restoration
of LITS records for a series of safety critical airworthiness items.
7.5.1 Generating a FRAM Model Risk Assessment
The initial Hazard Generating Function is the ‘Configuration Management
Function’ – this function seeks to ensure that records are accurately maintained
on the lifing and modification state of components on and off the aircraft. The
primary tool to achieve this is LITS. Various records on this issue have been
examined; the ‘Annex A’ process highlighted the issue as it emerged (Freed
and Priday, 2008) and Aitken (2009) provided detailed follow-up examination.
The MOD then contracted BAE Systems to provide a Tornado Asset Gateway
(TAG) Team to resolve the problem (Singleton, 2009) and QinetiQ provided
independent advice (Jeffery, 2009). Once the issue had been resolved to the
satisfaction of the Operational Duty Holder, an ALARP report was raised
(Bagwell, 2011). All of this information allowed a FRAM
Model for the residual risk to be constructed using the TASM as a baseline.
Aspects of the Hazard Generating Function have been highlighted as
forcing (red), damping (purple) or background functions in Table 7-1. An
additional column has been added to show the manner in which damping or
forcing is achieved. Following a first iteration of the process for identifying
downstream functions (as described in Figure 7-4) a visualisation was
produced:
Figure 7-5 Operation of Components Beyond Cleared Life - First Stage Risk
Visualisation, Excluding Background Functions
It is worth noting that the scope of the configuration management in the TASM
is broad – it encompasses all aspects of managing the state of the components
fitted to the aircraft. That includes the compatibility of modifications and the
control of which items can be fitted for what length of time.
[Diagram content. Functions shown (number, name): 6 Record Work done on Aircraft; 8 Maintenance Personnel; 16 Co-ordinate Maintenance Documentation; 28 Repair Spares – Industry; 30 Publish Approved Data; 34 Repair/Maintain Spares R2; 63 Configuration Management (LITS).
Labels on the linking activities: Erroneous Entries in LITS; TAG Team Restoring Configuration Control for safety critical assets; ERCs provide hard copy cross check for component life (shown on two links); Clear LITS Policy to Follow; Difficult to complete LITS actions if data corrupt.]
Table 7-1 Configuration Management Aspects
Name of Function: Configuration Management (LITS)

| Aspect | Description of Aspect | Upstream Function Number | Upstream Function Name | Upstream Function Aspect | Actual Variability in Risk Assessment |
|---|---|---|---|---|---|
| Input | Work Done on Aircraft Recorded | 6 | Record Work Done on ac | LITS Record (Configuration Management) | Erroneous Entries in LITS |
| | Lifing policy change from EA | 30 | Publish Approved Data (Tech Manuals & Policy) | Schedule of Life Limited Parts | |
| | Component Life extension | 62 | Manage Maintenance Extensions | Maintenance Extended on LITS | |
| Output | Information reports to EA and CAMSS | | | | |
| | Life limited item reports to Maintenance Organisations | | | | |
| | Life limited items reports to FOC Maintenance Taskers | | | | |
| | Pre-printed maintenance work orders and data input facility to maintenance orgs. | | | | |
| | LITS asset gateway to update records for items delivered from supply chain | | | | |
| | Modification retro-list | | | | |
| Precondition | Modification approved by TAA in TLARC | 23 | Modify Aircraft | TAA approval through TLARC and Issue of Approved Data (SM leaflet etc) | |
| | Previous Work (config change) Correctly recorded in LITS | 6 | Record Work Done on ac | LITS Record (Configuration Management) | Erroneous Entries in LITS |
| | LITS team in EA, LITS Team on units | 8 | Provide Authorised Maintenance Personnel | Appropriately (to requirement) Authorised Maintenance Personnel (record work done, fuel, scheduled maintenance, report faults, conduct quality tasks etc) | TAG Team Restoring Configuration Control for safety critical assets |
| | Component Engineering Record Cards update - R2 | 34 | Repair/Maintain Spares R2 | Update Log cards | ERCs provide hard copy cross check for life |
| | Component Engineering Record Cards update - industry | 28 | Repair Spares - Industry | Update Component Log Card | ERCs provide hard copy cross check for life |
| Control | LITS policy - DAP 300A-01 & 2(R) 1A | 30 | Publish Approved Data (Tech Manuals & Policy) | Support Policy | Clear LITS Policy to Follow |
| Time | Required to Coordinate maintenance documentation (individual updates) | 16 | Coordinate Maintenance Documentation | F700 ready for flight servicing and ac captain (pre-flight checks) | Difficult to complete LITS actions if data corrupt |

An erroneous configuration management output in terms of lifing information
constitutes a hazard. In order to understand how the hazardous variability
potentially permeates through the system to generate an accident it is
necessary to look at the output from the configuration management function
using the process shown in Figure 7-4. Following the process leads to the
identification of eight potential downstream hazard generating functions. Tracing these
functions within the FRAM Spreadsheet model reveals whether or not the
variability of lifing data is likely to cause downstream variability in these
functions. In some cases, the upstream activity required by the downstream
function does not relate to lifing; these downstream functions therefore remain
background functions rather than additional hazard generating functions. In
other cases the configuration management output provides data for functions
with an element of data checking and correction – these functions provide an
upstream/downstream damping process. This step is summarised in Table 7-2.
Two further downstream hazard generating functions are identified (functions 2
and 21).
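The screening summarised in Table 7-2 can be expressed as a simple filter. The assessments below are transcribed from the downstream rows of Table 7-2; the tuple layout, the function name and the string-prefix test are illustrative assumptions rather than features of the FRAM Spreadsheet model.

```python
# (function number, function name, assessment recorded in Table 7-2)
downstream_rows = [
    (2, "Scheduled Maintenance", "Errors in lifing details"),
    (16, "Coordinate Maintenance Documentation", "No effect - No corrupt lifing data involved"),
    (17, "Defer Faults", "No effect - No corrupt lifing data involved"),
    (21, "Force & A4 Operations",
     "Failure to task removal of life limited items during scheduled maintenance"),
    (25, "Report Faults and Husbandry", "No effect - No corrupt lifing data involved"),
    (28, "Repair Spares - Industry", "Variation damped out - ERCs provide hard copy cross check for life"),
    (34, "Repair/Maintain Spares R2", "Variation damped out - ERCs provide hard copy cross check for life"),
    (27, "Airworthiness Review Certification",
     "Variation damped out - ARC process provides a dip check of data accuracy"),
]


def further_hgfs(rows):
    """Keep functions whose recorded variability is neither absent nor damped out."""
    dismissed = ("No effect", "Variation damped out")
    return [num for num, _name, assessment in rows
            if not assessment.startswith(dismissed)]


next_stage = further_hgfs(downstream_rows)
```

Functions screened out as "No effect" remain background functions, while those marked "Variation damped out" act as damping functions for this hazard.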
Table 7-2 Summary of Second Stage of Risk Assessment
Hazard Generating Function 1: 63

| Group | Function Number | Function Name | Activity | Actual Variability in Risk Assessment |
|---|---|---|---|---|
| Forcing Functions | 6 | Record Work Done on ac | LITS Record (Configuration Management) | Erroneous Entries in LITS |
| | 6 | Record Work Done on ac | LITS Record (Configuration Management) | Erroneous Entries in LITS |
| | 16 | Coordinate Maintenance Documentation | F700 ready for flight servicing and ac captain (pre-flight checks) | Difficult to complete LITS actions if data corrupt |
| Damping Functions | 8 | Provide Authorised Maintenance Personnel | Appropriately (to requirement) Authorised Maintenance Personnel | TAG Team Restoring Configuration Control for safety critical assets |
| | 34 | Repair/Maintain Spares R2 | Update Log cards | ERCs provide hard copy cross check for life |
| | 28 | Repair Spares - Industry | Update Component Log Card | ERCs provide hard copy cross check for life |
| | 30 | Publish Approved Data (Tech Manuals & Policy) | Support Policy | Clear LITS Policy to Follow |
| Potential Downstream Hazard Generating Functions | 2 | Scheduled Maintenance | Life limited item reports to Maintenance Organisations | Errors in Lifing details |
| | 16 | Coordinate Maintenance Documentation | Pre-printed maintenance work orders and data input facility to maintenance orgs. | No effect - No corrupt lifing data involved |
| | 17 | Defer Faults | Pre-printed maintenance work orders and data input facility to maintenance orgs. | No effect - No corrupt lifing data involved |
| | 21 | Force & A4 Operations | Life limited items reports to FOC Maintenance | Failure to task removal of life limited items during scheduled maintenance |
| | 25 | Report Faults and Husbandry | Pre-printed maintenance work orders and data input facility to maintenance orgs. | No effect - No corrupt lifing data involved |
| | 28 | Repair Spares - Industry | LITS asset gateway to update records for items delivered from supply chain | Variation damped out - ERCs provide hard copy cross check for life |
| | 34 | Repair/Maintain Spares R2 | Pre-printed maintenance work orders and data input facility to maintenance orgs. | Variation damped out - ERCs provide hard copy cross check for life |
| | 27 | Airworthiness Review Certification | Information reports to EA and CAMSS | Variation damped out - ARC process provides a dip check of data accuracy |

The next stage in the process is to repeat the analysis for these two
downstream hazard generating functions, again using both the Visualisation
Tool and the TASM. In both cases we are interested in the variability of the
output relating to the replacement of life limited parts – this calls up a specific
additional function, which becomes another hazard generating function.
Table 7-3 Stage 2 - Scheduled Maintenance Function
Hazard Generating Function 2a: Function 2
Function Number | Function Name | Activity | Actual Variability in Risk Assessment
63 | Configuration Management (LITS) | LITS 'pull' at start of shift (configuration management) | Hazardous Variability from Stage 2 - Erroneous LITS lifing data
10 | Supply Chain | Spare Parts (supply chain) | Potential for over-life components to be delivered from supply chain
21 | Force and A4 Operations | Force Operations Tasking | Potential for life limited part removal not to be included within CMU scheduled maintenance tasking.
5 | Task Maintenance | Rectification Control Tasking | Experienced Task Controllers spot items that have unusual life attached
62 | Manage Maintenance Extensions | Extend Maintenance (lined up to available window) | Experienced engineering management spot unusual life attached to item that requires extension
19 | Engine Health Monitoring | Engine Health Monitoring | Effects of over-life components within propulsion system may be spotted
15 | Assure Quality | Defence Quality Assurance Field Force Checks | Experienced DQAFF auditors may spot over-life items
62 | Manage Maintenance Extensions | Extend Maintenance (lined up to available window) | Experienced engineering management spot unusual life attached to item that requires extension
Various | Various | AC Inspected (all systems) | Physical degradation of over-life parts may be spotted.
26 | Replacement of service life limited parts | Life limited parts change out tasking | Failure to initiate removal of over/due life parts
6 | Record Work Done on ac | Record Work Done on Aircraft | Record of installed life may allow later audit and recovery of situation
25 | Report Fault | Emergent Work (Report Faults) | Age related failure of life limited parts may be spotted and tasked for recovery
39 | Demand Spare Parts | Demand Spares | No effect
41 | Structural Inspections & Corrosion Control | Structural Inspection & Corrosion Control | Age related degradation of life limited parts may be spotted
27 | Airworthiness Review Certification | Airworthiness Review | Lifing errors may be spotted downstream at review
Row groups: Forcing Functions; Damping Functions; Potential Downstream Hazard Generating Functions
Table 7-4 Stage 2 - Force and A4 Operations Function (Part 1)
Hazard Generating Function 2b: Function 21
Group | Function Number | Function Name | Activity | Actual Variability in Risk Assessment
Forcing Function | 63 | Configuration Management (LITS) | LITS mod configuration information | Erroneous lifing details
Damping Function | 30 | Publish Approved Data (Tech Manuals & Policy) | Maintenance Schedule (approved data) | Maintenance Schedule provides a hardcopy cross check for lifing details
Potential Downstream Hazard Generating Functions | 69 | Chief Air Engineer Authorisation | Force Operations Tasking |
Potential Downstream Hazard Generating Functions | 40 | Repair Aircraft | Force Operations Tasking |
Potential Downstream Hazard Generating Functions | 31 | Publish Special Instructions (Technical) | Force Operations Tasking |
Potential Downstream Hazard Generating Functions | 30 | Publish Approved Data (Tech Manuals & Policy) | Force Operations Tasking |
Potential Downstream Hazard Generating Functions | 29 | Publish Aircrew Publications | Force Operations Tasking |
Potential Downstream Hazard Generating Functions | 27 | Airworthiness Review Certification | Force Operations Tasking | Lifing errors may be spotted downstream at review
Potential Downstream Hazard Generating Functions | 7 | Train Maintenance Personnel | Force Operations Tasking |
Potential Downstream Hazard Generating Functions | 4 | 3 Month Flying Programme | Force Operations Tasking |
Potential Downstream Hazard Generating Functions | 2 | Scheduled Maintenance | Maintenance Schedule (From MOC via FOC) | Potential for life limited part removal not to be included within CMU scheduled maintenance tasking.
Potential Downstream Hazard Generating Functions | 18 | Locally Manufacture Parts | Scheduled Maintenance and modification programme |
Potential Downstream Hazard Generating Functions | 23 | Modify Aircraft | Scheduled Maintenance and modification programme |
Table 7-5 Stage 2 - Force and A4 Operations Function (Part 2)
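Hazard Generating Function 2b (continued): Function 21
Group | Function Number | Function Name | Activity | Actual Variability in Risk Assessment
Potential Downstream Hazard Generating Functions | 24 | Apply Special Instruction (Technical) | Scheduled Maintenance and modification programme |
Potential Downstream Hazard Generating Functions | 26 | Replacement of service life limited parts | Scheduled Maintenance and modification programme | Failure to initiate removal of over/due life parts
Potential Downstream Hazard Generating Functions | 29 | Publish Aircrew Publications | Scheduled Maintenance and modification programme |
Potential Downstream Hazard Generating Functions | 36 | Release To Service | Scheduled Maintenance and modification programme |
Potential Downstream Hazard Generating Functions | 41 | Structural Inspections & Corrosion Control | Scheduled Maintenance and modification programme | Age related degradation of life limited parts may be spotted
Potential Downstream Hazard Generating Functions | 42 | Fault Diagnosis | Scheduled Maintenance and modification programme |
Potential Downstream Hazard Generating Functions | 43 | Corrective Maintenance | Scheduled Maintenance and modification programme |
Potential Downstream Hazard Generating Functions | 44 | Technical Assistance Process | Scheduled Maintenance and modification programme |
Potential Downstream Hazard Generating Functions | 62 | Manage Maintenance Extensions | Scheduled Maintenance and modification programme | Experienced engineering management spot unusual life attached to item that requires extension
Potential Downstream Hazard Generating Functions | 10 | Supply Chain | Supply Chain Prioritisation |
Potential Downstream Hazard Generating Functions | 20 | Ground Services | Mission Critical GSE list |
Potential Downstream Hazard Generating Functions | 38 | Store, Service, Repair Weapons and Role Equipment | Role Equipment/Weapon Prioritisation |
Potential Downstream Hazard Generating Functions | 4 | 3 Month Flying Programme | Level 0 Plan |
Potential Downstream Hazard Generating Functions | 36 | Release To Service | Level 0 Plan |
Potential Downstream Hazard Generating Functions | 7 | Train Maintenance Personnel | Force Operations (Manning) Prioritise Allocation |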
Table 7-6 Stage 3 Replacement of Life Limited Parts Function
The data in Tables 7-3 to 7-6 is combined within a visualisation in Figure 7-6.
This diagram illustrates the relationship between the configuration management
function (mainly exercised through the LITS computer system), the tasking
function carried out by Force Operations, the scheduled maintenance activity
carried out on the aircraft and the specific function involved in fitting and
removing life limited items. It demonstrates how various other functions force
variability in the hazard generating functions and also how other functions
provide damping on the frequency and amplitude of that variability. In the case
of the third hazard generating function, ‘Replacement of Life Limited Parts’, the
dimension of output under consideration is whether the part currently fitted is
Hazard Generating Function 3: Function 26
Function Number | Function Name | Activity | Actual Variability in Risk Assessment
5 | Scheduled Maintenance | Task Job | Failure to task removal of component before authorised life expires
10 | Supply Chain | Spare Parts | Component Supplied beyond authorised service life
63 | Configuration Management (LITS) | LITS (Configuration Management) | Erroneous LITS information
8 | Provide Authorised Maintenance Personnel | Authorised Maintenance Personnel | Experienced personnel inspect items and paperwork before fitting
30 | Publish Approved Data (Tech Manuals & Policy) | Approved Data | Approved life limits in Schedule of Life Limited items will allow cross-checking
45 | Avionic Flight Systems | Replace Life Limited Parts | Failure to operate, fire hazards
47 | Avionic Communications | Replacement of Life Limited Parts | Failure to operate, fire hazards
49 | Mechanical Systems | Replacement of Life Limited Parts | Failure to operate, Leaks, fire etc
50 | Aircraft Structure | Replacement of Life Limited Parts | Failure to react loads (loss of structural integrity)
48 | Armament and Electrical Systems | Replacement of Life Limited Parts | Failure to operate, fire hazards
Row groups: Forcing Functions; Damping Functions; Potential Downstream Hazard Generating Functions
within its authorised service life (as opposed to other considerations such as
whether the installation is achieved satisfactorily). Figure 7-7 then demonstrates
how an accident could develop as a result of the variability from the
‘Replacement of Life Limited Parts’ function. Four further functions are identified
as being additional downstream hazard generating functions – these are all
physically part of the aircraft system and therefore output variability from these
functions has the potential to generate an airworthiness related accident. The
TASM has been further dissected to show those activities that have the
potential to vary and generate an accident; in turn some of the activities are
linked to further downstream aircraft system functions. The operate aircraft
function has some considerable ability to damp out variability in aircraft system
functions – in other words the crew would often be able to deal with technical
failures of life limited items and still land safely, with actions such as using
redundant systems or limiting the flight envelope (G limits etc.). However, there
are some cases in which variability would be too high in amplitude for the crew
to deal with successfully, such as a sudden loss of structural integrity. Particular
risks such as fire have been highlighted in the diagram; these may occur due to
internal failure within a system producing an unlinked output. In most cases this
would actually be a product of out-of-control activity linking functions – such as
a fuel leak interacting with an electrical component chafing on structure. An
analysis of the stage four hazard generating functions also highlighted that
there are additional damping factors that may prevent excess variability in the
aircraft system outputs. These damping activities include inspection and
functional testing of the systems to highlight variable output (failures) prior to
flight. The damping would be accomplished through further downstream processes,
e.g. fault reporting and maintenance, that are not shown but can be traced out in
the full FRAM Visualisation tool.
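The staged propagation described above can be sketched as a breadth-first traversal over the hazardous couplings listed in the hazard generating function tables of this chapter. This is an illustrative sketch only: the dictionary is a hand-transcribed subset of those tables, keyed by TASM function number, and the names `HAZARD_COUPLINGS` and `downstream_stages` are hypothetical and are not part of the FRAM Visualisation tool.

```python
from collections import deque

# Hazardous couplings transcribed from the stage tables (upstream -> downstream).
# Keys and values are TASM function numbers; only couplings listed with a
# hazardous downstream variability are included (an illustrative subset).
HAZARD_COUPLINGS = {
    63: [2, 21],               # Configuration Management (LITS)
    21: [2, 26],               # Force & A4 Operations
    2: [26],                   # Scheduled Maintenance
    26: [45, 47, 48, 49, 50],  # Replacement of service life limited parts
}


def downstream_stages(start):
    """Return {function number: stage}, where the start function is stage 1."""
    stages = {start: 1}
    queue = deque([start])
    while queue:
        upstream = queue.popleft()
        for downstream in HAZARD_COUPLINGS.get(upstream, []):
            if downstream not in stages:  # keep the earliest stage reached
                stages[downstream] = stages[upstream] + 1
                queue.append(downstream)
    return stages


stages = downstream_stages(63)
for number, stage in sorted(stages.items(), key=lambda item: (item[1], item[0])):
    print(f"stage {stage}: function {number}")
```

Under these assumptions the aircraft system functions are reached at the fourth stage, which matches the staging used in the text. Damping functions are deliberately left out of the sketch.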
Figure 7-6 Visualisation of Hazard Generation Process
[Figure 7-6: FRAM visualisation. Hazard generating functions shown: HGF 1 Configuration Management (LITS) (63), HGF 2a Scheduled Maintenance (2), HGF 2b Force & A4 Operations (21) and HGF 3 Replacement of service life limited parts (26). Other functions shown: Record Work Done on Aircraft (6), Maintenance Personnel (8), Supply Chain (10), Assure Quality (15), Co-ordinate Maintenance Documentation (16), Engine Health Monitoring (19), Repair Spares - Industry (28), Publish Approved Data (30), Repair/Maintain Spares R2 (34), Task Maintenance (5) and Manage Maintenance Extensions (62). Couplings are labelled with the variability descriptions given in the preceding tables, for example 'Erroneous Entries in LITS', 'Incorrect items specified in LITS pull (Forward)', 'Incorrect life limited items specified from LITS pull (Depth)' and 'FOC omits tasking for life limited item change'.]
Figure 7-7 Visualisation of Potential Accident Processes
[Figure 7-7: FRAM visualisation. Functions shown: Replacement of service life limited parts (26, HGF 3), Avionic Flight Systems (45), Defensive Aids (46), Avionic Communications (47), Armament & Electrical Systems (48), Mechanical Systems (49), Aircraft Structure (50), Propulsion (51), Crew Escape System (52), Weapons (53), Operate Aircraft (54), Software (65), Flight Servicing (1), Structural Inspections (41), Corrective Maintenance (43) and Pre-Flight Checks (55). Hazard generating functions are marked HGF 4a to HGF 4e and lead to POTENTIAL ACCIDENT. Coupling labels include: Life Expired Item Fitted, Failure to React Loads, Power Failure, Electrical Signal Failure, Communication Failure, Incorrect Information Signalled, Fire, Electrocution etc, Aircraft Not Controllable/Life Support Failure, Isolate Malfunctioning System, Exploit Redundancies, Restrict Envelope, Inspection, Functional testing and Fault indications.]
7.5.2 Insights into Risk
Figure 7-7 represents a multitude of potential scenarios and is a baseline from
which a risk assessment could be conducted for any specific component. The
loose-ended activities shown (largely fire) represent activity that
involves what would normally be background environmental functions, e.g. the
atmosphere. Damping processes such as fire suppression (part of the
mechanical system) could also be plotted.
The risk assessment so far shows how difficult it is to ‘reverse engineer’ a
potential emergent accident process when considering all potential
dependencies and connections, rather than relying on linear assumptions. This
specific hazard is of interest because specific issues with upstream variability
had been noted – how can the risk of this variability triggering a downstream
accident sequence be isolated from the variability in the rest of the system?
Certain assumptions have to be continually made as to the variability of other
functions in the hypothetical system instantiation that is created for the purpose
of assessing the risk. The current safety management system for Tornado
separates a log of hazards and maps these to a number of ‘accident sets’ and
thereafter, at a top level, to a set of air safety risks. The FRAM provides an
alternative model for the analysis of these risks. The final stage in the process
described by Figure 7-6 requires a judgement on how the aggregated upstream
variability is likely to affect the output variability and then to express this in terms
of likelihood and severity. The general layout of the reduced FRAM Model frame
is shown in Table 7-7:
Table 7-7 Example Accident Generating Function FRAM Frame Layout
Name of Function: Avionic Flight Systems
Aspect | Description of Aspect | Upstream Function Number | Upstream Function Name | Upstream Function Aspect | Actual Variability | Frequency of Upstream Output Performance | Amplitude of Upstream Output Performance | Possible effect on this (downstream) Function | Possible effect on this (downstream) Function Output Variability | Rough Downstream Function Variability Score | Potential Damping Factors to counter upstream-downstream coupling
Input | Operate Aircraft | 54 | Operate Aircraft | Inputs to aircraft systems | Isolate Malfunctioning System | High | High | Likely to induce system state that does not produce required outcome | DECREASE | | Aircrew training and STANEVAL
Output | Aircraft Control Signals (mechanical systems); Information to Aircrew (operate aircraft) | | | | | | | | | |
Precondition | Flight Servicing | 1 | Flight Servicing | AC visually inspected (Avionics, Electrical, Structure, Mechanical, Crew Escape, Weapons, Propulsion) | Inspection | High | Medium | Induces fault - contamination etc | DECREASE | | Pre-flight checks
Precondition | Apply Special Instructions (Technical) | 24 | Apply Special Instruction (Technical) | Special Instruction (Technical) Applied to applicable aircraft/equipment | | | | Unsafe condition develops if not rectified | | | Maintenance document coordination
Precondition | Scheduled Maintenance | 2 | Scheduled Maintenance | Life limited parts change out tasking | | | | Incorrect functional output or unsafe condition | | | Maintenance document coordination
Precondition | Repair Maintenance | 40 | Repair Aircraft | Aircraft Structure Repair | | | | Function impaired | | | Maintenance document coordination
Precondition | Corrective Maintenance | 43 | Corrective Maintenance | System Restored to correct function | Functional testing | High | High | Function impaired | DECREASE | | Maintenance document coordination
Precondition | Modify Aircraft | 23 | Modify Aircraft | Aircraft Systems Modified under Service Mod or Designer Mod | | | | Function impaired | | | Maintenance document coordination
Precondition | Aircraft Structure | 50 | Aircraft Structure | Avionic Flight System Loads are reacted | Failure to react loads | Low | High | Incorrect functional output or unsafe condition | INCREASE | 9 | Pre-flight checks
Precondition | Pre-Flight Checks | 55 | Pre-Flight Checks | Avionic Flight Systems Checked | Inspection | High | Medium | Function impaired remains in a failed state whilst airborne | DECREASE | | Ground handler spots issues on dispatch
Resource | Electrical Power | 48 | Armament & Electrical Systems | Electrical Power/Signals | Power Failure | Low | High | Electronic failure | INCREASE | 9 | Redundancy within electrical system e.g. battery
Resource | Replace Life Limited Parts | 26 | Replacement of service life limited parts | Part Replaced (system concerned) | Life Expired Items Fitted | Medium | High | Burn-out of components - function impaired | INCREASE | 18 | Redundancy and internal failure monitoring
Control | Software | 65 | Software | Avionic Flight Systems Signal | Fault indications | Low | High | Spurious or misleading signals or failure condition | DECREASE | | Aircrew training and STANEVAL
Time | Not initially described | | Not initially described | | | | | | | |
Essentially a judgement is required as to the combined effect of the forcing
functions (red) and the damping functions (purple) taking into account their
respective frequencies and amplitude of variability. The effect will be manifested
in the output from the Hazard Generating function. The initial TASM provides a
prior assessment of the output variability of all of the stage four Hazard
Generating Functions. As these functions are all technological safety critical
systems that form the aircraft they are all also assessed as having a low
frequency of variability (i.e. their output is highly reliable) and the amplitude
of variability is high. The thinking behind the high amplitude categorisation is
that performance is more likely to cease entirely than be degraded; clearly
however there are more frequent but low amplitude (degraded system
performance) failures – the output from these functions is complicated. Table 7-8
shows how this is recorded in the baseline FRAM Model.
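The judgement described above combines a frequency rating and an amplitude rating for each upstream coupling. A minimal sketch of one way to turn those ratings into the 'Rough Downstream Function Variability Score' column of Table 7-7 is given below. The weights are an assumption: they are not stated in the text and are chosen here only because they reproduce the two scores visible in the table (Low/High = 9 and Medium/High = 18). The names `FREQUENCY_WEIGHT`, `AMPLITUDE_WEIGHT` and `variability_score` are hypothetical.

```python
# Hypothetical ordinal weights; assumed for illustration, not taken from the thesis.
FREQUENCY_WEIGHT = {"Low": 1, "Medium": 2, "High": 3}
AMPLITUDE_WEIGHT = {"Low": 1, "Medium": 3, "High": 9}


def variability_score(frequency, amplitude):
    """Rough score for one upstream coupling: frequency weight x amplitude weight."""
    return FREQUENCY_WEIGHT[frequency] * AMPLITUDE_WEIGHT[amplitude]


# Forcing couplings into Avionic Flight Systems, with the ratings from Table 7-7.
forcing = {
    "Aircraft Structure (50)": ("Low", "High"),
    "Armament & Electrical Systems (48)": ("Low", "High"),
    "Replacement of service life limited parts (26)": ("Medium", "High"),
}

scores = {name: variability_score(f, a) for name, (f, a) in forcing.items()}
for name, score in scores.items():
    print(f"{name}: {score}")
```

A multiplicative scheme of this kind gives only a rough ranking of couplings. It does not replace the judgement on how forcing and damping combine.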
Considering the risk posed by operation of components in excess of cleared
life, it is likely that the frequency of variability will increase if the hazard
propagates through the system to this fourth stage. The key point to consider
here is how meaningful any assessment of a generic risk based on this initial
hazard can be. For example a structural component may present a more
serious mode of failure than an avionic component. There will be a variety of
levels of redundancy within the different systems. Using the FRAM it is only
possible to generate a risk assessment in the case of a specific component or
class of components. As previously described this would also require functions
not involved in the hazardous or associated damping processes to remain
constant. This could be set by describing allowable boundaries for the
performance indicators described in the TASM.
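One way to picture such allowable boundaries is as a range check on each performance indicator for the functions assumed to remain constant. The sketch below is illustrative only: the function numbers are those used in the TASM, but the indicator names, limits and observed values are hypothetical and are not drawn from the model.

```python
from dataclasses import dataclass


@dataclass
class IndicatorBound:
    """Allowable range for one performance indicator (hypothetical values)."""
    function_number: int
    indicator: str
    lower: float
    upper: float

    def within(self, value):
        return self.lower <= value <= self.upper


# Hypothetical indicators for functions outside the hazard process being assessed.
BOUNDS = [
    IndicatorBound(7, "trained personnel available (%)", 85.0, 100.0),
    IndicatorBound(39, "spares demand satisfaction time (days)", 0.0, 10.0),
]


def assessment_is_valid(observations):
    """The risk assessment holds only while every observed indicator is in range."""
    return all(b.within(observations[b.indicator]) for b in BOUNDS)


print(assessment_is_valid({
    "trained personnel available (%)": 92.0,
    "spares demand satisfaction time (days)": 14.0,
}))
```

In this sketch a single indicator drifting outside its range is enough to invalidate the assessment, reflecting the point that the result is only valid for a specific system state.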
Table 7-8 Avionic Flight Systems Output – Baseline FRAM Model
Name of Function: Avionic Flight Systems (Function 45)
Step 1 - Identify and Describe the Functions
Aspect | Description of Aspect
Input | Operate Aircraft
Output | Aircraft Control Signals (mechanical systems); Information to Aircrew (operate aircraft)
Precondition | Flight Servicing; Apply Special Instructions (Technical); Scheduled Maintenance; Repair Maintenance; Corrective Maintenance; Modify Aircraft; Aircraft Structure; Pre-Flight Checks
Resource | Electrical Power; Replace Life Limited Parts
Control | Software
Time | Not initially described
Step 2 - Identification of Output Variability
Outputs | Most Likely Dimension of Output Variability | Description of Most Likely Output Variability | Frequency of Variability | Amplitude of Variability
Aircraft Control Signals (mechanical systems) | Sequence | Incorrect information signalled | Low | High
Information to Aircrew (operate aircraft) | Sequence | Incorrect information signalled | Low | High
7.6 Proposal for a FRAM Based Risk Management
Whilst it is difficult to produce a quantified risk assessment along the lines of
current practice using a FRAM model, the model does provide for a greater
level of understanding of the system. There is potential to use this feature of the
approach to facilitate system redesign. A typical hazards and barrier approach
to safety would involve introducing additional barriers. Resilience engineering
principles advocate that safety may be increased by increasing the instances of
successful operations of a function or a system to prevent the instances of
harmful operations. With this in mind, focus on system re-design should be on
strengthening the damping internal to hazard generating functions and that
provided by the wider system. Figure 7-8 shows the means by which risks can
be managed, using the output from the various performance indicators given in
the TASM to allow managers to make adjustments through system redesign.
Redesign could be through improvements in the way functions operate in terms
of internal process, addition of new or better resource or by amending
controlling outputs from other functions.
Figure 7-8 Proposed Risk Management Process
[Figure 7-8: diagram relating Functional Performance Variability, System Damping or Forcing, Potential Accidents, Risk to Life, Performance Indicators and Adjustments to System.]
This can be achieved by understanding the ‘work as done’ rather than as
imagined or prescribed. The only way to achieve this is through
dialogue with the individuals involved in the process or, where data systems
such as LITS are involved, by analysing the flow of that data. Similarly those
potential performance indicators already provided within the Baseline FRAM
Model should be collected and monitored. Risk management in this manner will
allow for risks to life to be mitigated to an appropriate ALARP level.
7.7 Chapter Summary
At this stage of development it has only been possible to describe an outline
process for risk assessment methodology. This requires further work and it is
likely that more advanced mathematical techniques will be required. Whilst it is
more difficult to apply the FRAM for risk assessment than other more traditional
techniques, resilience engineering principles would suggest that a more
accurate result is more likely, albeit with a considerable level of uncertainty and
validity limited to specific system states. In order to progress, the FRAM should
first be used as a monitoring tool, developing leading safety performance
indicators as described in the model. Thereafter it can be used as an anticipating
tool to deal with threats from emergent safety risk, noting Hollnagel’s (2011) four
cornerstones of resilience.
8 DISCUSSION
Chapters four and five described how a Tornado Airworthiness System Model
(TASM) and associated Visualisation Tool were created using the FRAM. The
use of these tools was then discussed both for incident investigation and risk
assessment in chapters six and seven. This chapter discusses the results of
these exercises and considers how well the research objective has been met,
with reference to the current literature on resilience engineering reviewed in
Chapter one.
8.1 Applicability of the Resilience Engineering Paradigm to
Airworthiness
Chapter 1 highlighted various views on the progress of safety science over the
last century. Experience within high hazard industries such as military
aerospace suggests that accident reports impose a certain degree of hindsight
simplification to deconstruct complexity (Dekker et al., 2011). The literature
review highlighted that resilience engineering is emerging as a new paradigm in
safety. Dekker (2011) also describes resilience engineering as a ‘post
Newtonian’ analysis of safety, which aptly describes the change in perspective
required to adopt its precepts. The boundaries of resilience engineering practice
are far from clear and it is not possible to isolate it from other elements of safety
engineering and management best practice. The theory is however markedly
different to most existing notions of safety.
The reality of everyday work in high hazard industries is one of compromise and
pragmatic application of safety regulation – through approximate adjustments.
Yet, given that such industries are now successfully meeting safety targets it
would appear that current safety management and analysis techniques may be
sufficient. Clearly better regulation and understanding of human factors and
organisational behaviour are responsible to some extent for increasing levels of
safety but will an enhanced understanding of the science of complexity enhance
safety? The Functional Resonance Analysis Method provides one way in which
resilience can truly be engineered into systems at organisational, technological
and human levels. It has the potential, not yet fully realised, to replace or
augment many existing safety engineering techniques. Resilience engineering
could potentially produce a paradigm shift in safety engineering, although much
further work is required to operationalise FRAM, which stands out as the most
useful methodology. Much of the literature on resilience engineering focusses
on operational aspects of high hazard industries. Continuing airworthiness is
within that particular scope and FRAM may well offer practical ways to address
continuing concerns over maintenance error. Wider than that, a focus on safety
being driven by increasing the probability of success instead of the prevention
of failure is an attractive engineering design philosophy in that it promotes a
more efficient use of mass, power and volume – an application that requires
further investigation. This study however focussed on the organisational
aspects of maintenance and modification. The key finding was that even by
modelling the socio-technical system at a relatively high level of abstraction
(with TASM) it became clear that all aspects of the airworthiness management
system have evolved to become extremely tightly connected in interlocking
processes.
Airworthiness regulation and practice are managed in a compartmentalised and
therefore mostly linear fashion. In the civil sector this is through a system of
licences for individual maintenance personnel and approvals for design,
production, maintenance and training organisations. Increasingly the UK military
regulator is adopting similar practice, which facilitates operational flexibility as it
provides a variety of potential options for sourcing airworthiness related activity.
In many cases, for example depth maintenance or maintenance programme
development, these functions have been contracted to industry. Type
Airworthiness Authorities must maintain oversight of the complexities of the
whole system. The technical safety cases for air platforms such as Tornado
(Mason, 2012) make use of Goal Structuring Notation to break the argument for
a safe system down into manageable and auditable portions. Such
techniques do not effectively deal with the interconnected nature of support
systems. Fundamentally, existing airworthiness practice assumes that safety is
a resultant rather than an emergent system property. The evidence presented in
the literature review makes a compelling case for the latter assumption. The
tools available to practitioners need to develop to match this new paradigm.
Ideas of complexity and resilience in safety are becoming more pertinent. There
is a clear trend for aircraft systems to become more complex, due to their size
(e.g. A380) or their advanced software driven systems (e.g. F-35 Lightning II).
The system of systems approach is becoming more relevant as individual
aircraft become more integrated into Air Traffic Management Systems in order
to increase efficiency (open skies) or safety (TCAS). Similarly, the advent of
Network Enabled Capability sees military air platforms integrated into wider
systems, which in the case of Remotely Piloted Air Systems (e.g. Reaper) have
the ability to directly control the aircraft. Hence the scope of airworthiness must
be extended beyond the actual aircraft itself, a key issue in the regulation of civil
unmanned aircraft (Hodson, 2008). In continuing airworthiness, maintenance is
conducted using networked systems to provide approved data to technicians
and interact with the aircraft directly to diagnose faults. The supply
chain is similarly becoming fused with the aircraft (e.g. Boeing Gold system). On
the organisational front, there is a trend for traditional company or military
structures to become fragmented with various aspects of aircraft acquisition,
design, manufacture, support and maintenance outsourced or subcontracted.
Thus it could be argued that there is now a requirement to be able to model
these complexities and estimate how they might interact in productive or
counter-productive ways.
The practice of airworthiness management has remained strongly rooted in the
technical era of safety management. Accepted practice for linear combinations
of reliability assessment such as fault trees do not meet the ideal of resilience
engineering at the system level. The resilience engineering framework itself
draws on the rigour of systems engineering and some of the insights of
complexity theory. Resilience engineering is positioning itself as a successor to
conventional forms of safety management; this requires further justification
through real world examples. The line between reliability and safety assessment
has often seemed to become blurred; this is understandable for simple aircraft
systems. However software and human factors have made aircraft systems
increasingly intractable – it is not possible to predict their performance under all
conceivable conditions. For this reason Development Assurance Levels or
Safety Integrity Levels are used to control the design of safety critical software.
This is an example of resilience engineering already in practice – in the case of
software it is not possible to analyse it in a Newtonian-linear manner so instead
upstream controls are used on the development process in order to maximise
the likelihood of successful operation. It is quite common, particularly in military
aviation, for aircraft to be operated in a manner quite different to the way the
designer had originally envisaged. It is for this reason that both ‘type’ and
‘continuing’ airworthiness management are important. Airworthiness activity is in
practice delegated across multiple organisational boundaries. Managing the
output of these various organisations is therefore important. Human factors in
maintenance are also a subject of much debate and research (RAeS, 2013).
Research interviews found common complaints about the proliferation of regulation,
assurance activity and other barriers. Barriers designed with good intentions
have the potential to shape functions in unexpected ways, producing new
emergent properties from the system. This is especially the case when work is
viewed from an efficiency-thoroughness-trade-off perspective. If additional work
is required with no increase in resources it is inevitable that thoroughness will
suffer in some other area.
Accident theory has tended to focus on producing techniques that prove useful
in a large number of circumstances, perhaps at the expense of
a more rigorous theoretical basis. Given the complexity of sociotechnical
systems and the property of incompressibility, how can an analyst be sure that
a model is viable? Also, given the low probability of occurrence of the
catastrophic accidents that we are concerned with, how is it possible to validate
these models? In terms of accident investigation, FRAM deals with
incompressibility through its fractal property – functions can be continually
decomposed into further functions until an appropriate level of detail is provided.
Incompressibility leads to the requirement for dealing with cross-scale
interactions between very large and very small functions. For accident
investigation the Tornado Airworthiness System Model (TASM) can be altered at
will to account for this requirement and the facts as they are found. This process
cannot be reverse engineered for risk assessment, and the TASM represents a
first guess at an appropriate level of decomposition. Assuming that the system
is in fact non-linear, the associated properties mean that a catastrophic
out-of-control condition may develop from an element in the system that exists
below the level of detail modelled. There is not yet a satisfactory answer for how to
deal with this.
Madni’s (2007) diagram at Figure 2-7 shows a variety of activities that would be
familiar to current safety managers in many industries. How, then, is
resilience engineering different from safety management? Resilience engineering
seeks to engineer safety into the system at all stages of the lifecycle,
emphasising pro-activity rather than reactivity. It seeks to enhance performance
in synergy with the operational output rather than through the introduction of
additional checks and barriers.
Resilience engineering offers a theoretical basis for understanding how
airworthiness management currently operates in practice and perhaps how it
might be better designed or ‘engineered’, providing a framework in which
technical, organisational and human factors risks can be managed more
holistically. It goes further than the trend for employing safety management
systems or striving to shape safety cultures. More widely, it offers a positive
outlook in which safety can only be improved by improving the way that
functions perform. New tools and techniques are required in order to bring
resilience engineering theory into practice. The literature presents a variety of
possible tools that may be applicable but none has yet gained widespread use
in comparable hazardous industries. The tool that appears to have the most
potential is the Functional Resonance Analysis Method; it was therefore
selected for use in the case study organisation.
8.2 The Tornado Airworthiness System Model – Initial Version
The key point which the model emphasises is the connectivity between the
different functions involved in Tornado airworthiness. This is where the detail of
the model is contained and where its value lies. How the functions
themselves operate internally is not explicitly described, although much is
implicit from the activities which link functions. This is a weakness of the model:
it does not allow predictions to be made as to future behaviour or risks of
particular outcomes based on internal variability. Its strength, however, is its
relative completeness. The iterative process of construction means that
activities highlighted during the data gathering phase must either link two
functions or lead out to some external or background function. In many cases this
property of completeness led to the identification of previously unmodelled
functions. Another weakness is the level of detail in which activities are
specified; these are generally described in a one-dimensional manner whereas
in reality each activity describes multiple changes of state, flows of information
or conversions of energy. Similarly, variability is
described in its ‘most likely’ form, whereas the form of variability that may prove
to be worst could be completely different in nature. However complete the
model is in this initial version, it is anticipated that if it is used over time it will be
added to and refined.
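The completeness property can be pictured as a simple consistency check: every activity must start and end at a function that exists in the model, and any endpoint that does not is a candidate un-modelled function. The sketch below is illustrative only. The function names are borrowed from the TASM, but the record layout and the helper `find_unmodelled_functions` are hypothetical and do not describe how the spreadsheet itself is built.

```python
# Hypothetical record layout: the set of functions already in the model,
# and each activity as an (upstream function, downstream function) pair.
modelled_functions = {"Flight Servicing", "Supply Chain", "Defer Faults"}

activities = [
    ("Supply Chain", "Flight Servicing"),
    ("Defer Faults", "Flight Servicing"),
    ("Flight Servicing", "Operate Aircraft"),
]


def find_unmodelled_functions(functions, links):
    """Return activity endpoints that are not yet functions in the model."""
    missing = set()
    for upstream, downstream in links:
        for name in (upstream, downstream):
            if name not in functions:
                missing.add(name)
    return sorted(missing)


unmodelled = find_unmodelled_functions(modelled_functions, activities)
print(unmodelled)
```

In this toy data set the third activity points at a function that has not been defined, so the check flags it for addition to the model or for assignment to a background function.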
8.3 Incident Investigation
The accident or occurrence model currently employed within the MOD’s
ASIMS is essentially linear in nature; it classifies the occurrence by
cause, event descriptors and contributory factors. Use of the FRAM model
allows a more nuanced approach to the investigation. It is anticipated that future
investigations will be able to use the model as a guiding framework.
Although ASIMS has a specific taxonomy which must be used,
there is a facility to attach files, so outputs from the model can easily be
archived within the system. The most important aspect of using the TASM during
incident investigation is its encapsulation of resilience engineering precepts and
more specifically the basic principles of FRAM. It is hoped that this will help
prevent future investigators relying on assumptions linked to more linear ages of
safety thinking.
8.3.1 Data Collection
A key part of investigating any incident or accident is the collection of evidence.
This data is then organised in some manner to provide a narrative description of
what happened. At this stage mental models of causation become relevant; the
TASM provides both a starting point and an aid to deciding on the completeness
of the data collection stage of any investigation. In broad terms it provides a
model for how each function performs; if investigations show that there was
unexpected variability in the output of a specific function, conclusions as to
why this occurred should not be drawn until all aspects of that function have
been investigated. A ‘guilty until proven innocent’ approach to output variability
needs to be taken, so that the emphasis is on showing that output variability was
within safe bounds during the accident model instantiation. Using the model as
an investigative framework helps to guard against the arbitrary allocation of
‘root’ causes as described in Chapter one. Conversely, using the model as a
framework does pose a risk that investigations will be drawn into avenues that
do not fit the evidence as a result of assumptions made when constructing the
model. Any investigator must be highly sceptical as to the accuracy of the
model and, where functions and activities occur that do not seem to fit,
should propose alterations to the baseline model. That said, the level at which
the functions and activities have been mapped is sufficiently high to encompass
most levels of variability. The two case studies provided in Chapter six
demonstrate that this is the case.
8.3.2 Aids to Investigation
The use of Visio to provide an interactive visualisation tool allows for quick and
easy manipulation of potential scenarios within the model. This is highly useful
at the overview level, but as detail is investigated it becomes difficult to trace
variability through the model. For further detail the spreadsheet model is
particularly useful, and the detailed interactions between functions may be
tracked using the Excel ‘Trace Dependents’ tool.
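Tracing dependents through the spreadsheet amounts to following couplings downstream from one function until no new functions are reached. The following is a minimal sketch of that idea outside Excel, assuming the couplings are held as a dictionary; the `couplings` data and the `trace_dependents` helper are illustrative assumptions, not an export of the TASM.

```python
from collections import deque

# Assumed couplings: upstream function -> functions that use its output.
# The loop between Flight Servicing and Operate Aircraft is deliberate,
# to show that feedback paths do not cause endless tracing.
couplings = {
    "Supply Chain": ["Flight Servicing", "Scheduled Maintenance"],
    "Flight Servicing": ["Operate Aircraft"],
    "Scheduled Maintenance": ["Operate Aircraft"],
    "Operate Aircraft": ["Flight Servicing"],
}


def trace_dependents(graph, start):
    """Breadth-first list of every function downstream of `start`."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in graph.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


downstream = trace_dependents(couplings, "Supply Chain")
print(downstream)
```

The order of the result reflects distance from the starting function, which is a rough guide to where variability would be felt first.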
8.4 Risk Assessment
Risk assessment is particularly problematic for the FRAM. Linear analysis
methods are able to aggregate risks and provide quantitative and
semi-quantitative risk assessments for particular accident scenarios or sets of
scenarios, whereas non-linear mathematical methods are required to estimate
probabilities for risks when the system is analysed from a FRAM perspective.
Linear methodologies such as Bow-Tie or Boolean fault and event trees rely on
binary interpretations of functions or events, e.g. the system/human/process has
failed or not failed. The FRAM functions used in the TASM do not specify all of
the potential dimensions of the output. For example, as a technological function,
the mechanical systems will produce hydraulically actuated flying control
movement, breathing air and cabin conditioning amongst other outputs.
Similarly, as an organisational function, scheduled maintenance produces
outputs as diverse as surface finishing, inspection,
lubrication and functional testing. Human functions further defy description, with
a huge variety of possible outputs. In order to overcome this, the number of
functions could be increased by further decomposing the current set of
functions. The nature of FRAM is that functions are fractal and can be
decomposed into an infinite number of functions depending on the level of detail
required in the model. These possible functions below the level of detail used in
the model may produce some or all of the outputs of the higher level functions.
This is illustrated in Figure 8-1, which also demonstrates how functional outputs
may take a variety of dimensions.
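The fractal property can be sketched as a recursive data structure in which a function either carries its own outputs or delegates them to lower level functions. This is an illustrative assumption about representation rather than the structure of the TASM; the class `Function` and the two child function names are hypothetical, while the outputs of scheduled maintenance are those listed in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Function:
    name: str
    outputs: list = field(default_factory=list)
    children: list = field(default_factory=list)  # lower level functions

    def leaf_outputs(self):
        """Collect outputs from the lowest level of decomposition present."""
        if not self.children:
            return list(self.outputs)
        collected = []
        for child in self.children:
            collected.extend(child.leaf_outputs())
        return collected


scheduled = Function(
    "Scheduled Maintenance",
    outputs=["surface finishing", "inspection", "lubrication", "functional testing"],
    children=[
        # Hypothetical decomposition into two lower level functions.
        Function("Inspect and Test", outputs=["inspection", "functional testing"]),
        Function("Lubricate and Finish", outputs=["lubrication", "surface finishing"]),
    ],
)

# True when the lower level functions together reproduce the parent's outputs.
covered = set(scheduled.leaf_outputs()) == set(scheduled.outputs)
print(covered)
```

Each child could itself be given children, so the same check applies at whatever level of detail the analyst chooses to stop.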
Figure 8-1 Fractal Property of the FRAM - Function Decomposed into Lower
Level Functions
Estimation of functional output variability relies on ‘expert’ opinion on the likely
performance of the function, based on the combination of upstream
activities presented. This provides a less than satisfactory estimate of risk,
particularly given the doubts raised about the ability of ‘experts’ to produce
reliable estimates of risk for low probability events (Brooker, 2011). Rather than
a final method for estimating risk, the FRAM provides a framework from which
such a method may be developed in the future. A key point to be drawn from
the resilience engineering perspective is the need to improve the success rate
of the system and thus prevent the emergence of hazards and risks. This is a
different perspective for duty holders and regulators to consider, although it is
difficult to reconcile with the current legal system and the
need to attribute blame and culpability in the event of an accident.
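The TASM spreadsheet records expert judgements as Low, Medium or High ratings of the frequency and amplitude of upstream variability, together with whether the coupling is expected to increase, decrease or leave unchanged the variability of the downstream function. The sketch below shows one way such ratings might be combined into a single number. The numeric weights are an assumption: they were chosen because they reproduce the scores shown on the Flight Servicing sheet, not because the text states this rule.

```python
# Assumed weights, not taken from the thesis text.
LEVEL = {"Low": 1, "Medium": 2, "High": 3}
EFFECT = {"DECREASE": 1, "NO CHANGE": 2, "INCREASE": 3}


def coupling_score(frequency, amplitude, effect):
    """Combine the three ratings for one upstream coupling by multiplication."""
    return LEVEL[frequency] * LEVEL[amplitude] * EFFECT[effect]


# (frequency, amplitude, effect, score) as they appear on the
# Flight Servicing sheet of the TASM spreadsheet.
sheet_rows = [
    ("Medium", "Medium", "INCREASE", 12),
    ("Medium", "Medium", "NO CHANGE", 8),
    ("Low", "High", "DECREASE", 3),
    ("High", "Medium", "INCREASE", 18),
    ("Medium", "Low", "INCREASE", 6),
]

matches = [coupling_score(f, a, e) == s for f, a, e, s in sheet_rows]
print(all(matches))
```

A multiplicative score of this kind ranks couplings for attention; it is an ordinal guide only and should not be read as a probability.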
(Figure 8-1 shows five FRAM functions, each labelled with its six aspects: Input,
Output, Precondition, Resource, Time and Control.)
8.4.1 Hazard Management vs Functional Resonance Management
The key to managing risk to life in current practice is the management of
hazards. Such hazards are derived from historical records or from functional
hazard assessment (FHA) techniques. Chapter 7 outlines a methodology which
to some extent seeks to replace some of these techniques. It is unlikely that the
process of managing functional resonance using the TASM can completely
replace these standard techniques. In order to fully evaluate all potential
accidents and the antecedent hazard generating functions within the TASM, it
would be necessary to conduct a similar FRAM modelling exercise for the
operation of the aircraft.
8.5 Utility of the TASM for Type Airworthiness Activities
The TAA’s responsibilities are set out in RA1015 (MAA, 2012a); the TASM will
facilitate many of these responsibilities in its current version. As is discussed
below, there is scope to develop this tool further and this will provide greater
assistance to the TAA. Table 8-1 details whether each of the
responsibilities required by the RA can be assisted by the TASM. The key way
in which the TASM is useful is that it provides a model for how the aircraft
system interacts with its supporting personnel and organisations; this will allow
the safety management system to be adapted to provide enhanced resilience.
Type airworthiness activities are predominantly organisational in nature and
therefore more susceptible to low frequency variability, known as organisational
drift, which is more difficult to sense and control. The use of a rigorous approach
such as the TASM will enable this feature of organisational safety to be dealt
with more effectively.
Table 8-1 Utility of TASM for TAA Activities
RA1015 Type Airworthiness Authority Roles and Responsibilities

a. Managing, on behalf of the OCD, airworthiness activity of an air system during
   its development.
   Use of TASM Version 1: Not applicable to Tornado but useful baseline for future
   acquisitions.
   Potential Future Uses: Predictive modelling and design of support systems
   (organisations) alongside technological design.

b. Compiling certification evidence to support military type certification/
   Release to Service Recommendation (RTSR)/Release to Service (RTS).
   Use of TASM Version 1: N/A
   Potential Future Uses: N/A

c. The completeness and accuracy of the Approved Data, elements of the Aircraft
   Document Set (ADS), and the upkeep of the Type including all Design.
   Use of TASM Version 1: Greater understanding of downstream effects of changes
   to ADS and aircraft modifications. For modifications, use model to demonstrate
   that excess variability will not be created.
   Potential Future Uses: Integrated data will provide near real time feedback of
   the effectiveness of ADS.

d. Developing, maintaining and enhancing a Safety Management System, compliant
   with the OCD approved project airworthiness strategy, which will contribute to
   the Operating Duty Holder’s (ODH’s) Air System Safety Case for each type.
   Use of TASM Version 1: Provides a framework on which the SMS can be based
   using resilience engineering principles.
   Potential Future Uses: Integrated data will provide near real time feedback of
   the effectiveness of SMS.

e. Ensuring that appropriate action is taken in response to airworthiness issues
   including, but not limited to, the issuing of Technical Instructions, and
   recommending to the OCD the stoppage of, or major restriction to, flying.
   Use of TASM Version 1: Planning the effective implementation and evaluating
   the second and third order effects of introducing SI(T)s. Assessing the full
   implications of (and resilience against) any newly identified hazard prior to
   any stoppage or restriction of flying recommendation.
   Potential Future Uses: Integrated data will allow more useful and near real
   time feedback, allowing more precisely targeted restrictions in flying where
   necessary.

f. Collecting, investigating and analysing reports of, and information related
   to, failures, malfunctions, defects or other occurrences to confirm that the
   type design remains airworthy.
   Use of TASM Version 1: Analysis of technical failure and malfunctions from a
   performance variability/resilience point of view. Analysis of occurrences
   using resilience engineering principles.
   Potential Future Uses: More detailed FRAM model of aircraft systems to replace
   fault tree based loss model.

g. Informing the type designer, other operators and the MAA of the outcome of any
   investigation into a significant airworthiness occurrence.
   Use of TASM: Presentation of airworthiness occurrence reports using FRAM
   visualisations.

h. Maintaining the Structural, Propulsion and Systems Integrity of platform type
   through life including platform type service experience against the design
   assumptions.
   Use of TASM: Testing the design assumptions (probably now only for
   modifications) against the in-service model for airworthiness management.

i. Ensuring that there is adequate co-ordination between design and production
   organizations.
   Use of TASM Version 1: N/A
   Potential Future Uses: Using data integration to understand and monitor any
   downstream effects of DO-PO issues.

j. Acceptance of build deviations from design, by tail number.
   Use of TASM Version 1: N/A
   Potential Future Uses: N/A

k. Retaining, or having access to, all relevant design information, drawings,
   build information and test & inspection reports to provide the information
   needed to support the safety argument that underpins Type Airworthiness.
   Use of TASM Version 1: N/A
   Potential Future Uses: N/A

l. Ensuring timely update and communication of changes to the Aircraft Document
   Set (ADS).
   Use of TASM Version 1: N/A
   Potential Future Uses: N/A

m. The provision of modifications necessitated by in-service experience or as
   requested by the DHs for safety, operational, or economic reasons.
   Use of TASM Version 1: Risk assessment and prediction of effectiveness of any
   safety modifications.
   Potential Future Uses: Greater quantification of risk to provide evidence for
   modification business cases.

n. The competency assessment and sub-delegation of elements of airworthiness
   authority within their Platform PT to relevant PT staff, where necessary, by
   means of a LoAA. Such LoAAs may be given only to individuals who require
   airworthiness authority to alter Approved Data and the ADS without reference
   to higher authority.
   Use of TASM Version 1: N/A
   Potential Future Uses: Integrated data will provide feedback on effectiveness
   of airworthiness decision making.

o. Ensuring that all sub-delegations are reviewed at least annually.
   Use of TASM Version 1: N/A
   Potential Future Uses: N/A
8.6 Utility of the TASM for Continuing Airworthiness Activities
The TASM details how airworthiness is managed across type and continuing
airworthiness areas. These boundaries are somewhat more blurred than is the
case in the civil sector because of the way the industries have evolved – the
Tornado Continuing Airworthiness Management Exposition (Casey, 2013)
highlights how the Continuing Airworthiness Management Organisation (CAMO)
tasks have been distributed across a variety of organisational boundaries. In
particular, many of the tasks have been contracted to industry but the MOD
organisation responsible for these contracts works for the TAA. Whilst the
background layer in the TASM broadly shows areas of responsibility, CAMO
tasks do in fact stretch across most areas of the TASM. Table 8-2 shows how
the TASM will assist the CAMO in undertaking its tasks. The CAM has
responsibility for more dynamic elements of the system compared to the steadier
state activities undertaken by TAA staff. Modelling this system will at least
promote greater understanding of how the various components interact. Whereas
the TASM is of more use to the TAA as a risk assessment tool, to the CAMO it is
more useful as a general management tool. It may provide a useful check
on future system changes, allowing analysis of downstream implications of any
change prior to implementation. It could be argued that a knowledgeable and
experienced manager would have an instinctive handle on the implications of
change without use of the tool. The modelling process has shown that a high level of
complexity exists, so it is moot whether an intuitive decision making
process would be able to consider all factors in as complete a manner without a
model. The scope of management understanding of the control mechanisms
available to adjust the system is currently limited by experience.
Table 8-2 Potential CAMO Use of TASM
RA 4947 CAMO Tasks

a. Develop and control a maintenance programme including any applicable
   reliability programme, proposing amendments and additions to the maintenance
   schedule to the TAA.
   Use of TASM Version 1: Qualitative understanding of the airworthiness system
   to understand any second and third order implications of changes to the
   maintenance programme.
   Potential Future Uses: Integrate reliability data into TASM to provide
   enhanced analysis of organisational or human causes for repeat arisings.

b. Manage the embodiment of modifications and repairs.
   Use of TASM Version 1: Assessment of the effect of tasking modification and
   repairs by generating an instantiation of the system.
   Potential Future Uses: Incorporate reporting of modification satisfaction into
   the TASM to provide feedback as to the success of the plan.

c. Ensure that all maintenance is carried out to the required standard and in
   accordance with the maintenance programme, and released in accordance with MRP
   Maintenance Certification Regulation.
   Use of TASM Version 1: Qualitative understanding as to whether the system as
   currently constructed is capable of reliably implementing the maintenance
   programme - for example assessing the resources required for variations in the
   programme.
   Potential Future Uses: Incorporation of resource data (e.g. manning
   spreadsheets) into the model.

d. Ensure that all applicable SI(T)s are applied.
   Use of TASM Version 1: Assessment of the system's ability to reliably carry out
   SI(T)s without excessive variability; understand all of the upstream
   dependencies on the SI(T) functions.
   Potential Future Uses: Incorporate SI(T) satisfaction data into the TASM.
   Better predictive capability for SI(T) satisfaction rate.

e. Ensure that all faults reported, or those discovered during scheduled
   maintenance, are managed correctly by a Military Maintenance Organization or
   MRP/Mil Part 145 Approved Maintenance Organization.
   Use of TASM Version 1: Analysis of the maintenance organisation's likelihood
   of managing faults correctly.
   Potential Future Uses: Use predictive data to provide indications of when
   organisations may fail.

f. Co-ordinate scheduled maintenance, the application of SI(T)s and the
   replacement of service life limited parts.
   Use of TASM Version 1: Provides a top level map of these activities.
   Potential Future Uses: Leading performance indicators to provide warning of
   potential failures.

g. Manage and archive all continuing airworthiness records and the
   MF700/operator's technical log.
   Use of TASM Version 1: N/A
   Potential Future Uses: N/A

h. Ensure that the weight and moment statement reflects the current status of the
   aircraft.
   Use of TASM Version 1: N/A
   Potential Future Uses: N/A

i. Initiate and coordinate any necessary actions and follow up activity
   highlighted by an occurrence report.
   Use of TASM Version 1: Allows more effective and more easily implemented
   recommendations to be generated from occurrence reports.
   Potential Future Uses: Incorporation of ASIMS data into the TASM.
8.7 Utility of TASM for Duty Holder Activity
The TASM was designed as an airworthiness management tool rather than an
air safety tool. Nonetheless it will provide a useful facility for the Duty Holder
and his staff engaged in air safety management; principally it will give
non-airworthiness staff a better overview and understanding of the airworthiness
system and thus enable greater challenge of the advice of specialists.
Table 8-3 Aviation Duty Holder Use of TASM
RA 1020 Roles & Responsibilities: Aviation Duty Holder

a. Cease routine aviation operations if RtL are identified that are not
   demonstrably at least Tolerable and ALARP.
   Use of TASM Version 1: More effective assessment of risks using a resilience
   point of view.
   Potential Future Uses: Potential for quantification of risk analyses.

b. Establish and maintain an effective ASMS that, wherever possible, exploits the
   MOD’s existing aviation regulatory structures, publications and management
   practices, in order to demonstrate an acceptable means of compliance with the
   requirements in RA1200.
   Use of TASM Version 1: Evolve ASMS into a resilience engineering based system.
   Potential Future Uses: Introduce leading indicators to increase effectiveness
   of the ASMS - incorporated into the model.

c. Promote and lead by example a questioning Air Safety culture.
   Use of TASM Version 1: Provide DH with an overview of the system to allow more
   effective questioning of the CAM and the TAA.
   Potential Future Uses: Hold TAA and the CAM to account using quantitative
   performance indicators.

d. If necessary, challenge formally any option or action that is proposed or
   implemented by DH-facing organizations that may result in the activities for
   which they are responsible not being Tolerable and ALARP.
   Use of TASM Version 1: Provide a ready model which can be used to demonstrate
   the effect of DH facing organisations on airworthiness.
   Potential Future Uses: Quantify the effect of DH facing actions, using
   integrated data.

The duty holder is responsible for ‘holding’ the risk to life as a result of any
airworthiness issue that may be present in the system or may arise in the future.
For all of the reasons described in Chapter 2, this responsibility can be
discharged with greater realism if these risks are analysed from a resilience rather
than a traditional linear perspective. It is anticipated that the lack of clarity over
risk may cause significant concern, and this risk assessment element is the key
area for further development. However, there is a concern that linear
methods currently provide a spurious level of accuracy in their risk modelling;
duty holders may need to accept a greater degree of uncertainty around these
estimates. The difficulty is that the aim of the duty holder in dealing with any
airworthiness related risk to life is to ensure that risk has been reduced to
ALARP and at least within tolerable bounds. Greater uncertainty over
categorisation of risk to life may lead to greater conservatism in dealing with
emerging risks. Whilst this may be beneficial for safety it would be
disadvantageous from an operational perspective. To counter this problem it
should be emphasised that the model shows, in greater detail than has been
described before, the multiple layers (or loops) of control that provide damping
against harmful activity within the system. A specific insight that was gained
from the model is the degree of connectivity of a few specific functions. These
functions are highlighted visually within the model. It would be possible to
conduct a similar exercise for flight safety using a FRAM model and perhaps to
link the two models together.
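The degree of connectivity of a function can be approximated by counting the couplings that start or end at it. The following minimal sketch assumes the couplings are available as pairs of function names; the handful of pairs used here are taken from the Flight Servicing sheet for illustration and are far from the full TASM, and the `connectivity` helper is hypothetical.

```python
from collections import Counter

# Illustrative subset of couplings (upstream function, downstream function).
couplings = [
    ("Supply Chain", "Flight Servicing"),
    ("Defer Faults", "Flight Servicing"),
    ("Operate Aircraft", "Flight Servicing"),
    ("Publish Approved Data", "Flight Servicing"),
    ("Supply Chain", "Scheduled Maintenance"),
]


def connectivity(links):
    """Count how many couplings touch each function, upstream or downstream."""
    degree = Counter()
    for upstream, downstream in links:
        degree[upstream] += 1
        degree[downstream] += 1
    return degree


degree = connectivity(couplings)
most_connected = degree.most_common(1)[0][0]
print(most_connected, degree[most_connected])
```

Functions at the top of such a ranking are natural candidates for visual highlighting, since variability arriving at or leaving them has the most routes through the system.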
8.8 Potential Use for System Improvement
The literature review outlined successes in using FRAM for process
improvement activity in other industries. Experience has shown that the RAF
has discontinued the use of a variety of ‘lean’ methodologies in the years
following their initial introduction. These linear methodologies have often failed
to deal adequately with either the complexity or the variability of aircraft
maintenance operations. One criticism is that lean sought to
impose production line methodologies on maintenance, which was a poor fit
and, as Carney (2010) found, disadvantageous for safety. The FRAM
has great potential to provide an alternative or complementary means of
achieving process improvement. For example, the TASM highlights the
variability in the way that maintenance is tasked; sometimes tasking is
generated through handovers, sometimes it is written on boards and sometimes
tasks are given verbally. A more detailed FRAM mapping of this element of the
TASM, worked through in a facilitated workshop with a variety of
personnel involved in the activity day-to-day, may assist in developing a more
efficient and safer process. This type of system improvement workshop ought to
become the de facto response to any occurrence investigation. The current
system employed by the RAF provides for a hierarchical review and
implementation of recommendations made by occurrence investigators and
review groups. Whilst this accords with the need to vest decision making with
those ultimately responsible for the risk to life, it does remove decision making
further from those who may be best placed to understand the complexities of
the system. Further work is required on how best to implement FRAM into
decision making on occurrence report recommendations.
8.9 Potential for Further Development of the TASM
This project sought to test whether the FRAM could be applied to airworthiness
management and how useful a tool could be created. The version that is
presented in this report is an initial baseline version and, whilst it ought to prove
a useful tool in its own right, there is much scope for further development.
Currently the model exists as two files: a spreadsheet and a Visio
drawing which can be manipulated interactively using the layers feature. It is
possible to embed Visio drawings within Microsoft SharePoint sites as used
within the military IT systems (MOSS). It is also possible to attach data files
drawn from Excel to shapes within Visio drawings. Figure 8-2 shows a potential
development pathway for the TASM. The envisaged end state for development
is a tool hosted on standard desktop IT, providing a ‘dashboard’ type function to
show how the whole system is performing. It should be able to display output
from a variety of data sources such as LITS reports, manning information
spreadsheets, ASIMS data and quality audit reports.
Figure 8-2 TASM Development Pathway
8.9.1 Increased Model Fidelity
Throughout the development of the model to date there has been a continuous
set of assumptions and simplifications made regarding the operation of the real
world system. Chapter one discussed the nature of complexity and its inherent
incompressibility. With this in mind it is important to remember that the model
TORNADO GR4 AIRWORTHINESS SYSTEM MODEL
Functional Resonance Analysis Method

Step 1 - Identify and Describe the Functions
Function 1. Name of Function: Flight Servicing. Type of Function: Human. Carried
out by small groups or individually.
Internal Variability: Relatively high degree of potential variability for human
factors reasons (identify key issues from Error Management Info). Variation due to:
social/cultural factors such as normalised behaviours; environmental factors -
lighting, climate, shelter/hangarage; organisational factors - availability of tools,
test equipment, fuels, lubricants, ground support equipment and authorised manpower.

Step 2 - Identification of Output Variability
Each output is followed by: Most Likely Dimension of Output Variability; Description
of Most Likely Output Variability; Frequency of Output Performance Variability;
Amplitude of Output Performance Variability; Step 4 - Consequences of the Analysis.
Output: AC systems replenished (propulsion, mechanical). Sequence; Omission or wrong
  fluid used; High; Medium; Related fault reporting.
Output: AC visually inspected (Avionics, Electrical, Structure, Mechanical, Crew
  Escape, Weapons, Propulsion). Sequence; Omission; High; Medium; Aircrew pre-flight
  checks - feedback info. Husbandry checks and Airworthiness Review.
Output: Any faults recorded. Sequence; Omission; High; Medium; Comparison of flt
  servicing fault reporting across shifts/sqns etc.
Output: Husbandry jobs recorded in log. Sequence; Omission; High; Low; Comparison of
  flt servicing husbandry reporting across shifts/sqns etc.
Output: Flight Servicing Certificate Signed. Wrong Object; Sign up for wrong tail
  number or omit full information; Medium; High; Captured in ASIMS reports if found.

External Variability
Each aspect is followed by: Upstream Function (Number, Name, Aspect); Most Likely
Dimension of Upstream Variability; Description of Most Likely Upstream Variability;
Frequency of Upstream Performance Variability; Amplitude of Upstream Performance
Variability; Possible effect on this (downstream) Function, with the direction of
change and score as recorded; Step 4 - Consequences of the Analysis.
Input: Line Controller indicates task on boards.
  Upstream Function: 61, Rectification & Line Control Boards, Maintenance Information
  for tasking. Sequence; Missed maintenance requirements; Medium; Medium.
  Possible effect: Potential to sway ETTO and cause omissions and errors (INCREASE, 12).
  Step 4: Pre flight checks, fault reporting, engineering management supervision.
Precondition: Maintenance Activity complete.
  Upstream Function: 16, Coordinate Maintenance Documentation, F700 ready for flight
  servicing and ac captain (pre-flight checks). Timing/Duration; F700 not available
  for crew walk; Medium; Medium.
  Possible effect: Unlikely to start flight servicing if pre-condition not in place
  (NO CHANGE, 8).
  Step 4: N/A.
Precondition: AC available at groundcrew location.
  Upstream Function: 54, Operate Aircraft, Return aircraft to groundcrew. Wrong
  Object; Parked in incorrect location; Low; High.
  Possible effect: Not possible to start flight servicing without access to aircraft
  (DECREASE, 3).
  Step 4: N/A.
Resource: Fuels & Lubricants (from supply chain).
  Upstream Function: 10, Supply Chain, Part Delivered to Corrective Maintenance,
  Scheduled Maintenance, Repair Aircraft, replacement or life limited parts, weapons
  & role equipment, tools & test equipment. Timing/Duration; Not delivered in time to
  meet requirement; High; Medium.
  Possible effect: Not possible to fully complete flight servicing without necessary
  consumables (INCREASE, 18).
  Step 4: Pre flight checks, fault reporting, engineering management supervision.
Resource: Tools & Test Equipment.
  Upstream Function: 13, Provide & Account for Tools and Test Equipment, Tools and
  TME. Wrong Object; Incorrect tool; Medium; Low.
  Possible effect: Increases likelihood of using unsafe work-around (INCREASE, 6).
  Step 4: Approved data specifies tools + Pre flight checks, fault reporting,
  engineering management supervision.
Resource: Authorised manpower.
  Upstream Function: 8, Provide Authorised Maintenance Personnel Appropriately (to
  requirement), Authorised Maintenance Personnel (record work done, fuel, scheduled
  maintenance, report faults, conduct quality tasks etc). Sequence; Omission of a
  required authorised skill; High; Medium.
  Possible effect: Potential for unauthorised (not competent) personnel to carry out
  servicing; however likely to be damped out by maintenance tasking function
  (INCREASE, 18).
  Step 4: Pre flight checks, fault reporting, engineering management supervision.
Control: Flight Servicing Schedule (approved data).
  Upstream Function: 30, Publish Approved Data (Tech Manuals & Policy), Flight
  Servicing Notes. Sequence; Omission; Medium; High.
  Possible effect: Misleading or inaccurate information causes variability
  (INCREASE, 18).
  Step 4: The F765 process allows for reporting unsatisfactory features in the
  approved data.
Control: Supplementary Flight Servicing requirements (from defer faults).
  Upstream Function: 17, Defer Faults, ADF or Lim entry to close job card hence
  allows co-ordination of maintenance documentation. Sequence; Element of ADF/Lim
  insufficiently defined or omitted; High; Medium.
  Possible effect: Additional tasking within the flight servicing increases human
  performance issues (INCREASE, 18).
  Step 4: Feedback to engineering management on any inconsistencies with
  supplementary flight servicing requirements.
Time: Daily/Weekly Flying Programme.
  Upstream Function: 60, Plan Weekly-Daily Flying Programme, Flying Programme (Fuel,
  flight service, etc). Sequence; Inappropriate/unworkable plan; Medium; Medium.
  Possible effect: Insufficient time likely to cause inappropriate ETTO (INCREASE, 12).
  Step 4: Feedback to engineering management as to likely feasibility of the plan.
  Line controller is experienced technician and is able to judge plan.
Function Output Variability
Step 3 - Aggregation of Variability
Rough Downstream
Function Variability Score
Potential Damping Factors to counter upstream-
downstream couplingPotential Performance Indicators
[Figure labels: TASM; Visualisation Tool; Occurrence Investigations; Review of Existing Hazard Log]
TORNADO GR4 AIRWORTHINESS SYSTEM MODEL
Functional Resonance Analysis Method

Step 1 - Identify and Describe the Functions
Name of Function: 1 Flight Servicing
Type of Function: Human
Internal Variability: Carried out by small groups or individually. Relatively high degree of potential variability for human factors reasons (identify key issues from Error Management Info). Variation due to:
  Social/cultural factors such as normalised behaviours.
  Environmental factors - lighting, climate, shelter/hangarage.
  Organisational factors - availability of tools, test equipment, fuels, lubricants, ground support equipment and authorised manpower.

Step 2 - Identification of Output Variability
Output: AC systems replenished (propulsion, mechanical)
  Most likely output variability: Sequence - Omission or wrong fluid used (frequency High, amplitude Medium)
  Potential damping factors / performance indicators: Related fault reporting
Output: AC visually inspected (Avionics, Electrical, Structure, Mechanical, Crew Escape, Weapons, Propulsion)
  Most likely output variability: Sequence - Omission (frequency High, amplitude Medium)
  Potential damping factors / performance indicators: Aircrew pre-flight checks - feedback info. Husbandry checks and Airworthiness Review
Output: Any faults recorded
  Most likely output variability: Sequence - Omission (frequency High, amplitude Medium)
  Potential damping factors / performance indicators: Comparison of flt servicing fault reporting across shifts/sqns etc
Output: Husbandry jobs recorded in log
  Most likely output variability: Sequence - Omission (frequency High, amplitude Low)
  Potential damping factors / performance indicators: Comparison of flt servicing husbandry reporting across shifts/sqns etc
Output: Flight Servicing Certificate Signed
  Most likely output variability: Wrong Object - Sign up for wrong tail number or omit full information (frequency Medium, amplitude High)
  Potential damping factors / performance indicators: Captured in ASIMS reports if found

Step 3 - Aggregation of Variability (External Variability) and Step 4 - Consequences of the Analysis
Input: Line Controller indicates task on boards
  Upstream function: 61 Rectification & Line Control Boards
  Upstream output: Maintenance Information for tasking
  Most likely upstream variability: Sequence - Missed maintenance requirements (frequency Medium, amplitude Medium)
  Possible effect on this (downstream) function: Potential to sway ETTO and cause omissions and errors
  Effect on output variability: INCREASE; rough downstream function variability score: 12
  Potential damping factors / performance indicators: Pre flight checks, fault reporting, engineering management supervision
Precondition: Maintenance Activity complete
  Upstream function: 16 Coordinate Maintenance Documentation
  Upstream output: F700 ready for flight servicing and ac captain (pre-flight checks)
  Most likely upstream variability: Timing/Duration - F700 not available for crew walk (frequency Medium, amplitude Medium)
  Possible effect on this (downstream) function: Unlikely to start flight servicing if pre-condition not in place.
  Effect on output variability: NO CHANGE; rough downstream function variability score: 8
  Potential damping factors / performance indicators: N/A
Precondition: AC available at groundcrew location
  Upstream function: 54 Operate Aircraft
  Upstream output: Return aircraft to groundcrew
  Most likely upstream variability: Wrong Object - Parked in incorrect location (frequency Low, amplitude High)
  Possible effect on this (downstream) function: Not possible to start flight servicing without access to aircraft
  Effect on output variability: DECREASE; rough downstream function variability score: 3
  Potential damping factors / performance indicators: N/A
Resource: Fuels & Lubricants (from supply chain)
  Upstream function: 10 Supply Chain
  Upstream output: Part Delivered to Corrective Maintenance, Scheduled Maintenance, Repair Aircraft, replacement or life limited parts, weapons & role equipment, tools & test equipment
  Most likely upstream variability: Timing/Duration - Not delivered in time to meet requirement (frequency High, amplitude Medium)
  Possible effect on this (downstream) function: Not possible to fully complete flight servicing without necessary consumables
  Effect on output variability: INCREASE; rough downstream function variability score: 18
  Potential damping factors / performance indicators: Pre flight checks, fault reporting, engineering management supervision
Resource: Tools & Test Equipment
  Upstream function: 13 Provide & Account for Tools and Test Equipment
  Upstream output: Tools and TME
  Most likely upstream variability: Wrong Object - Incorrect tool (frequency Medium, amplitude Low)
  Possible effect on this (downstream) function: Increases likelihood of using unsafe work-around.
  Effect on output variability: INCREASE; rough downstream function variability score: 6
  Potential damping factors / performance indicators: Approved data specifies tools + Pre flight checks, fault reporting, engineering management supervision
Resource: Authorised manpower
  Upstream function: 8 Provide Authorised Maintenance Personnel Appropriately (to requirement)
  Upstream output: Authorised Maintenance Personnel (record work done, fuel, scheduled maintenance, report faults, conduct quality tasks etc)
  Most likely upstream variability: Sequence - Omission of a required authorised skill (frequency High, amplitude Medium)
  Possible effect on this (downstream) function: Potential for unauthorised (not competent) personnel to carry out servicing; however likely to be damped out by maintenance tasking function.
  Effect on output variability: INCREASE; rough downstream function variability score: 18
  Potential damping factors / performance indicators: Pre flight checks, fault reporting, engineering management supervision
Control: Flight Servicing Schedule (approved data)
  Upstream function: 30 Publish Approved Data (Tech Manuals & Policy)
  Upstream output: Flight Servicing Notes
  Most likely upstream variability: Sequence - Omission (frequency Medium, amplitude High)
  Possible effect on this (downstream) function: Misleading or inaccurate information causes variability.
  Effect on output variability: INCREASE; rough downstream function variability score: 18
  Potential damping factors / performance indicators: The F765 process allows for reporting unsatisfactory features in the approved data
Control: Supplementary Flight Servicing requirements (from defer faults)
  Upstream function: 17 Defer Faults
  Upstream output: ADF or Lim entry to close job card hence allows co-ordination of maintenance documentation
  Most likely upstream variability: Sequence - Element of ADF/Lim insufficiently defined or omitted (frequency High, amplitude Medium)
  Possible effect on this (downstream) function: Additional tasking within the flight servicing increases human performance issues.
  Effect on output variability: INCREASE; rough downstream function variability score: 18
  Potential damping factors / performance indicators: Feedback to engineering management on any inconsistencies with supplementary flight servicing requirements
Time: Daily/Weekly Flying Programme
  Upstream function: 60 Plan Weekly-Daily Flying Programme
  Upstream output: Flying Programme (Fuel, flight service, etc)
  Most likely upstream variability: Sequence - Inappropriate/unworkable plan (frequency Medium, amplitude Medium)
  Possible effect on this (downstream) function: Insufficient time likely to cause inappropriate ETTO.
  Effect on output variability: INCREASE; rough downstream function variability score: 12
  Potential damping factors / performance indicators: Feedback to engineering management as to likely feasibility of the plan. Line controller is experienced technician and is able to judge plan.
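The rough downstream function variability scores tabulated for the Flight Servicing function are all consistent with a simple product of three ordinal values. The sketch below reproduces them under that assumption. The numeric scales (Low/Medium/High as 1/2/3 and DECREASE/NO CHANGE/INCREASE as 1/2/3) are inferred from the tabulated figures and are not stated in this part of the text, so they should be read as a hypothesis rather than as the scoring rule actually used in the TASM.

```python
# Assumed numeric scales: inferred from the tabulated scores, not stated in the text.
LEVEL = {"Low": 1, "Medium": 2, "High": 3}
EFFECT = {"DECREASE": 1, "NO CHANGE": 2, "INCREASE": 3}


def rough_score(frequency: str, amplitude: str, effect: str) -> int:
    """Return frequency x amplitude x effect for one upstream coupling."""
    return LEVEL[frequency] * LEVEL[amplitude] * EFFECT[effect]


# (upstream function, frequency, amplitude, effect, score shown in the table)
COUPLINGS = [
    ("Rectification & Line Control Boards", "Medium", "Medium", "INCREASE", 12),
    ("Coordinate Maintenance Documentation", "Medium", "Medium", "NO CHANGE", 8),
    ("Operate Aircraft", "Low", "High", "DECREASE", 3),
    ("Supply Chain", "High", "Medium", "INCREASE", 18),
    ("Provide & Account for Tools and Test Equipment", "Medium", "Low", "INCREASE", 6),
    ("Provide Authorised Maintenance Personnel", "High", "Medium", "INCREASE", 18),
    ("Publish Approved Data", "Medium", "High", "INCREASE", 18),
    ("Defer Faults", "High", "Medium", "INCREASE", 18),
    ("Plan Weekly-Daily Flying Programme", "Medium", "Medium", "INCREASE", 12),
]


def mismatches() -> list[str]:
    """Names of couplings whose tabulated score differs from the assumed rule."""
    return [
        name
        for name, freq, amp, effect, tabulated in COUPLINGS
        if rough_score(freq, amp, effect) != tabulated
    ]


for name, freq, amp, effect, tabulated in COUPLINGS:
    print(f"{name}: computed {rough_score(freq, amp, effect)}, tabulated {tabulated}")
```

Ranking the couplings by this score would place the supply chain, authorised manpower, approved data and deferred fault couplings (all 18) ahead of the others, which matches the emphasis in the table.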
[Figure labels: Hyperlink TASM to visualisation tool; Hyperlink safety indicators to tool; Develop interactive SharePoint dashboard tool; Apply Bayesian logic to model to generate risk assessment methodologies]
can only provide a rough description of what is happening within the system or
how it is likely to behave in any future scenarios. The only way that the fidelity of
the model can be improved is to exercise it against various scenarios and make
iterative adjustments. This process would need careful configuration control as
is currently exercised for the loss model and for risk registers.
8.9.2 Application of Bayesian and/or Fuzzy Logic
The inability to generate simple risk assessments is likely to be seen by users
as a key weakness of the FRAM approach. Whilst this level of uncertainty is
potentially a more realistic assessment of the risk, it would be useful to be able
to more accurately assess risk in order that different courses of action can be
compared. For example, one solution to the configuration management issues
highlighted in chapter seven would be to further automate the data capture
process, perhaps with portable devices that could be used alongside the
aircraft. This would remove some of the higher levels of variability that are
provided by human elements of functions such as ‘scheduled maintenance’ and
‘record work done’. It would however require a substantial investment to
introduce such a capability. This would require a business case to allow public
funds to be committed. Existing processes (MOD, 2013) require quantitative risk
assessment to achieve this and generally use ‘waterfall diagrams’ to
demonstrate how risk is mitigated over time. There is therefore a clear
requirement to introduce some further elements of quantification into the FRAM
and the TASM. The most promising approach at present is that
outlined by Slater (2013), who has developed a desktop interface to allow
development of Bayesian logic dependency diagrams. Slater’s tool does not
require a risk assessor to become competent themselves in Bayesian
mathematics. This approach uses FRAM as a framework on which Bayesian
decision nets can be constructed.
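To illustrate how a single FRAM coupling might be given a Bayesian treatment so that courses of action can be compared, the following minimal sketch treats one upstream function and one downstream function as a two-node network. It is not Slater's tool, and every probability in it is a hypothetical placeholder chosen only to show the arithmetic; the state names and the 'automated data capture' option are assumptions made for the example.

```python
def p_downstream(p_up: float, p_down_if_up: float, p_down_if_not_up: float) -> float:
    """P(D): law of total probability over the upstream state."""
    return p_down_if_up * p_up + p_down_if_not_up * (1.0 - p_up)


def p_upstream_given_downstream(
    p_up: float, p_down_if_up: float, p_down_if_not_up: float
) -> float:
    """P(U | D): Bayes' rule, i.e. how likely the upstream function was the source."""
    return p_down_if_up * p_up / p_downstream(p_up, p_down_if_up, p_down_if_not_up)


# Hypothetical figures, for illustration only.
P_DOWN_IF_UP = 0.6       # P(D | U): downstream output varies when upstream output varies
P_DOWN_IF_NOT_UP = 0.1   # P(D | not U): downstream output varies anyway
P_UP_BASELINE = 0.2      # P(U) with manual recording (assumed)
P_UP_AUTOMATED = 0.05    # P(U) with automated data capture (assumed)

baseline_risk = p_downstream(P_UP_BASELINE, P_DOWN_IF_UP, P_DOWN_IF_NOT_UP)
automated_risk = p_downstream(P_UP_AUTOMATED, P_DOWN_IF_UP, P_DOWN_IF_NOT_UP)
posterior = p_upstream_given_downstream(P_UP_BASELINE, P_DOWN_IF_UP, P_DOWN_IF_NOT_UP)

print(f"P(downstream variability), baseline:  {baseline_risk:.3f}")
print(f"P(downstream variability), automated: {automated_risk:.3f}")
print(f"P(upstream was variable | downstream variability): {posterior:.3f}")
```

A full model would chain many such nodes along the couplings already recorded in the FRAM, which is where a dedicated tool becomes necessary. The value of the sketch is only to show that two options can be compared on a common numerical basis once conditional probabilities have been elicited.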
8.9.3 Expansion into Operational Safety Management
The entirety of the system for operating Tornado is captured within the ‘Operate
Aircraft’ function. It is recorded in this way because the aircrew interact with the
aircraft systems and hence affect the airworthiness of the aircraft in the short
term during a particular sortie or over the long term as patterns of usage affect
the condition of the aircraft systems. Clearly operating the aircraft is a human
function and is also heavily involved in most air accidents. So whilst outside of
the scope of this particular study, there is likely to be significant benefit in
developing a FRAM model to understand flight safety elements of aircraft
operations. This could be linked to the TASM to understand how airworthiness
and flight safety risks are interlinked.
8.10 Chapter Summary
This discussion centred on the applicability of resilience engineering concepts
to the practice of airworthiness engineering. It discussed how the framework
under which airworthiness related safety investigations are conducted could be
adapted to the new ideas. The increase of both realism and uncertainty in risk
assessment using resilience engineering techniques was discussed. The
potential for future development of the TASM was also described.
9 CONCLUSIONS
9.1 Summary
The project first reviewed the literature on the general background theories to
safety science and engineering. Three broad themes were identified as having
reached maturity: the technological age, based on Boolean logic and
reliability studies; the age of human factors; and the age of the
organisational accident. More recent developments included a study of the
effects of complexity and control theory in order to understand safety. From
these roots resilience engineering has been highlighted by some as a new
paradigm in safety. The literature around resilience engineering was found to be
somewhat fractured; however, several key themes were identified
and discussed. Principally, a resilience engineering perspective views safety as
the system’s ability to perform under disturbed and potentially unexpected
conditions. Safety therefore becomes a control problem rather than a reliability
problem. Safety is an emergent property rather than a direct linear resultant
property determined by the reliability of the system’s components and their
mode of use. The Functional Resonance Analysis Method (FRAM) was
identified as having the greatest potential to operationalise these new theories
in existing airworthiness organisational systems. The methodology describes
the system in terms of its functions, linked by activities. Harmful activity may
emerge as a result of the non-linear combination of variable functional outputs,
occasionally causing functional resonance which may propagate in the form of
uncontrollable output variability across the system. Using FRAM a spreadsheet
model was developed for Tornado Airworthiness; alongside this an interactive
visualisation tool was also developed. The model was based on a number of
interviews with personnel within the airworthiness system and on data from the
MOD’s Air Safety Information Management System, alongside a large amount
of policy documentation. This Tornado Airworthiness System Model (TASM)
was tested by taking the results from two separate incidents and describing the
scenarios in terms of functional resonance. This identified that the model was
consistent with both scenarios but also raised various questions over the
assumptions behind the investigations. The TASM was also used to investigate
the risk posed by the operation of components in excess of their cleared life on
the Tornado. This analysis highlighted that the model in its current form was not
able to quantify the risk in anything other than very general terms. However the
model did illustrate how the various factors were responsible for either forcing or
damping the variability of the functional output of the ‘replace life limited parts’
function within the model. This method of analysing risk scenarios provides
additional insight that traditional reporting techniques do not. Resilience
engineering, and the FRAM in particular, were shown to offer a great deal of
insight into how airworthiness may be more effectively managed. The research
objectives were:
Review the theoretical background to safety management and the
implications for airworthiness management.
Review the concepts of Resilience Engineering with an emphasis on
application to airworthiness management.
Establish a theoretical framework for a model of an airworthiness
management system.
Gather and use primary research data to establish and validate a model
of the airworthiness management system for the RAF Tornado Force.
Using the model, develop a tool to enhance the airworthiness
management system of the RAF Tornado Force.
All of these objectives were met and it can be concluded that the project has
produced an operationally useful tool which will enhance the management of
airworthiness across the RAF’s Tornado fleet using the latest safety thinking.
9.2 Recommendations
Whilst resilience engineering in general and the TASM in particular require
extensive development, the following specific recommendations are given with
respect to the RAF Tornado case study.
9.2.1 Manage Airworthiness as a Control Problem
Quantitative or probabilistic risk assessments are well suited to reliability
analysis of components or subsystems. Such analyses are of dubious validity
when considering complex systems and even more so where there is a large
element of human and organisational interaction. These cases apply to
airworthiness issues and as such it is better to combine reliability analyses with
a treatment of the achievement of airworthy systems as an ongoing control
problem.
9.2.2 Use the TASM to Control the Airworthiness System
Control of airworthiness systems can effectively be modelled by the Functional
Resonance Analysis Method, where harmful activity occurs when the output
from one function becomes coupled in a resonant manner with the aspect of
another function. The Tornado Airworthiness System Model (TASM) provides a
baseline model from which such analyses can be carried out.
The TASM will prove to be a powerful tool for occurrence investigation and
should be used as a baseline from which to conduct such investigations.
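The coupling between the output of one function and an aspect of another can be sketched as a small data structure. The example below is an illustrative assumption about how such couplings might be recorded, not a description of how the TASM spreadsheet is implemented; the function names are taken from the Flight Servicing entries of the model, while the class and helper names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# The five aspects of a FRAM function that can receive an upstream output.
RECEIVING_ASPECTS = ("Input", "Precondition", "Resource", "Control", "Time")


@dataclass(frozen=True)
class Coupling:
    upstream: str    # function whose output varies
    output: str      # the output that is passed on
    downstream: str  # function that receives the output
    aspect: str      # aspect of the downstream function it arrives at

    def __post_init__(self) -> None:
        if self.aspect not in RECEIVING_ASPECTS:
            raise ValueError(f"unknown aspect: {self.aspect}")


COUPLINGS = [
    Coupling("Supply Chain", "Fuels & Lubricants", "Flight Servicing", "Resource"),
    Coupling("Provide & Account for Tools and Test Equipment", "Tools and TME",
             "Flight Servicing", "Resource"),
    Coupling("Publish Approved Data", "Flight Servicing Notes",
             "Flight Servicing", "Control"),
    Coupling("Plan Weekly-Daily Flying Programme", "Flying Programme",
             "Flight Servicing", "Time"),
    Coupling("Operate Aircraft", "Return aircraft to groundcrew",
             "Flight Servicing", "Precondition"),
]


def upstream_by_aspect(function: str, couplings: list[Coupling]) -> dict[str, list[str]]:
    """Group the upstream functions feeding `function` by the aspect they couple to."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for coupling in couplings:
        if coupling.downstream == function:
            grouped[coupling.aspect].append(coupling.upstream)
    return dict(grouped)


for aspect, sources in upstream_by_aspect("Flight Servicing", COUPLINGS).items():
    print(f"{aspect}: {', '.join(sources)}")
```

An investigation that starts from a variable output of 'Flight Servicing' could use such a lookup to list the upstream functions to be examined first.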
9.2.3 Review Airworthiness Risk from a Resilience Perspective
Where it is necessary for the Tornado TAA and Duty Holder to sentence
emerging air safety risks that have any connection to airworthiness
management, the TASM should be used to review the risk from a resilience
engineering perspective alongside existing methodologies required by the MAA.
9.2.4 Use FRAM as a Means to Improve System Resilience and
Efficiency
Where incidents occur, the TASM should be the baseline investigative tool.
Other quality and continuous improvement activity should use the TASM and
FRAM as a means to seek out improvements in safety and in efficiency across
organisations involved in airworthiness. In particular FRAM can be used as an
alternative to linear ‘lean’ techniques when dealing with complex working
environments.
9.3 Potential for Further Research and Development
This project has provided an initial look into resilience engineering with
respect to airworthiness. There is a large amount of further research that can be
conducted into this area. General themes should encompass:
The use of FRAM as a technique for investigating air accidents and air
safety occurrence reports.
The development of FRAM to produce quantitative and qualitative risk
assessments, particularly focussing on how it may be used as a
framework to develop Bayesian probability models.
Development of techniques, protocols and standards for conducting
FRAM workshops, whether for the analysis of safety issues or for the
purpose of improving quality or safety.
This project has created an initial version of the TASM, which while useful, will
require extensive further development:
Integration of the FRAM spreadsheet as data attached to the functional
shapes within the visualisation tool to allow easier interpretation.
Integration of the visualisation tool into a Microsoft SharePoint site to allow
further development as an airworthiness ‘dashboard’.
Development of the leading safety indicators identified within the TASM.
Linking of existing and new data sources as leading safety indicators in
a TASM ‘dashboard’ to provide a mechanism for day-to-day
management of the airworthiness system and enhance the ability of both
the CAM and the TAA to take appropriate airworthiness decisions.
9.4 Concluding Remarks
This study has taken a new set of safety science concepts and has sought to
apply them to the management of airworthiness. This activity has been largely
successful although inevitably there will need to be a further continuous process
of iteration and improvement to the tools produced. The background to this
project was the questions posed by the Nimrod Review. In the light of the
new safety paradigm described by resilience engineering, it is clear that in the
case of Nimrod the airworthiness system had gradually slipped out of control
and that a variety of functions had begun to resonate with each other, resulting
eventually in uncontrollable interaction between the fuel, mechanical and
electrical systems to produce the catastrophic loss of the aircraft and crew.
Resilience engineering and FRAM provide a basis for more effective future
control of the organisational, technological and human functions involved in
airworthiness. Better upstream management of airworthiness controls will
prevent some future pilot having to “fight with the controls” in the face of some
potential downstream catastrophe.
REFERENCES
Anon (2011) ‘Supervision High Up on the Equator - the Puma Force in Kenya’, Air Clues, July [Online], Available at: http://www.raf.mod.uk/rafcms/mediafiles/29D67908_5056_A318_A8AFDA410071E0B8.pdf (Accessed: 1 December 2013).
Aitken, H. (2009) LITS Business Data Corruption, MOD: Internal, DES/WYT/595441/4 20 May 09.
Apostolakis, G. E. (2004) ‘How useful is quantitative risk assessment?’, Risk Analysis, vol. 24, no. 3, pp. 515-520.
Bagwell, G. (2011) 1 Gp ODH ALARP Statement - Operation of Components in Excess of Cleared Life, MOD Internal RESTRICTED, TOR 01.
Beauchamp, E. (2006) ‘Learning from Diversity: Model-Based Evaluation of Opportunities for Process (Re)-Design and Increasing Company Resilience’, The Second Resilience Engineering Symposium, Antibes – Juan-Les-Pins, France 8-10 November 2006: Resilience Engineering Association, pp. 23.
Belmonte, F., Schön, W., Heurley, L. and Capel, R. (2011) ‘Interdisciplinary safety analysis of complex socio-technological systems based on the functional resonance accident model: An application to railway traffic supervision’, Reliability Engineering & System Safety, vol. 96, no. 2, pp. 237-249.
Bendat, J. S. (1998) Nonlinear system techniques and applications, New York: Wiley.
Brooker, P. (2011) ‘Experts, Bayesian Belief Networks, rare events and aviation risk estimates’, Safety Science, vol. 49, no. 8–9, pp. 1142-1155.
Cambon, J., Guarnieri, F. and Groeneweg, J. (2006) ‘Towards a new tool for measuring Safety Management Systems performance’, Learning from Diversity: The Second Resilience Engineering Symposium, Antibes – Juan-Les-Pins, France 8-10 November 2006: Resilience Engineering Association, pp. 53.
Carney, P. (2010) Critical Analysis of the Airworthiness Impact of Lean Production Principles in a Depth Maintenance Organisation. MSc thesis, Cranfield University.
Casey, T. (2013) Tornado Continuing Airworthiness Management Exposition (CAME) MOD Internal RESTRICTED, CAMO/CERT/2012/018.
Cilliers, P. (2005) ‘Knowing complex systems’, in Richardson, K. (ed.) Managing Organizational Complexity: Philosophy, Theory, and Application, Greenwich, CT: ISCE Publishing, pp. 7-19.
Cooke, P. (2004) Panavia Tornado GR4 [Online], Available at: http://www.airliners.net/photo/UK---Air/Panavia-Tornado-GR4/0636414/L/ (Accessed 5 March 2014).
Coury, B., Kolly, J., Gormley, E. and Dietz, A. (2008) ‘The central role of principal issues in aviation accident investigation’, Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Vol. 52, Sage Publications, pp. 99.
Crown Copyright (2009) 31 Squadron Tornado [Online], available at: http://www.raf.mod.uk/gallery/tornadogallery.cfm?start=1&viewmedia=4#pageContent (Accessed 5 March 2014).
de Carvalho, P. V. R. (2011) ‘The use of Functional Resonance Analysis Method (FRAM) in a mid-air collision to understand some characteristics of the air traffic management system resilience’, Reliability Engineering & System Safety, vol. 96, no. 11, pp. 1482-1498.
De Landre, J., Gibb, G. and Walters, N. (2006) ‘Using Incident Investigation Tools Proactively for Incident Prevention’, Meeting of the Australian and New Zealand Society of Air Safety Investigators. Australia: Australian and New Zealand Society of Air Safety Investigators [Online]. Available at: http://asasi.org/papers.htm (Accessed 13 November 2013).
Dekker, S. (2003) ‘When human error becomes a crime’, Human Factors and Aerospace Safety, vol. 3, pp. 83-92.
Dekker, S. (2005) ‘9 Why we need new accident models’, Contemporary issues in human factors and aviation safety, pp. 181.
Dekker, S., Cilliers, P. and Hofmeyr, J. (2011) ‘The complexity of failure: Implications of complexity theory for safety investigations’, Safety Science, vol. 49, no. 6, pp. 939-945.
Dudman, D. (2012) ‘No 1 Group Air Safety Management Plan’, 3rd ed., Royal Air Force Internal, Defence Intranet.
Edwards, J. R. D., Davey, J. and Armstrong, K. (2013) ‘Returning to the roots of culture: A review and re-conceptualisation of safety culture’, Safety Science, vol. 55, no. 0, pp. 70-80.
Espejo, R. (1989) ‘A cybernetic method to study organizations’, The Viable System Model: Interpretations and Applications of Stafford Beer’s VSM, pp. 361-382.
Freed and Priday, R. (2008) ‘Annex A to BP 1301 - Initial Report of Serious Occurrence or Fault’, MOD Internal.
Gale, I., Keeling, A. and Strasdin, S. (2013) ‘Perfect Storm’, Air Clues, July [Online], Available at: http://www.raf.mod.uk/rafcms/mediafiles/3AE4263C_5056_A318_A883FF5D10B24E91.pdf
Grøtan, T. O., Størseth, F. and Albrechtsen, E. (2011) ‘Scientific foundations of addressing risk in complex and dynamic environments’, Reliability Engineering & System Safety, vol. 96, no. 6, pp. 706-712.
Haddon-Cave, C. (2009) The Nimrod Review, London: The Stationery Office.
Hale, A. and Borys, D. (2013) ‘Working to rule or working safely? Part 2: The management of safety rules and procedures’, Safety Science, vol. 55, no. 0, pp. 222-231.
Heinrich, H. W., Petersen, D. and Roos, N. (1950) Industrial accident prevention, New York: McGraw-Hill.
Herrera, I. (2012) Proactive safety performance indicators. PhD thesis Norges teknisk-naturvitenskapelige universitet, Institutt for produksjons- og kvalitetsteknikk [Online]. Available at: http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-16990.
Hitchens, D. (2003) Advanced Systems Thinking, Engineering and Management, 1st ed., Norwood: Artech House.
Hodson, C. J. (2008) Civil Airworthiness for a UAV Control Station. MSc thesis. University of York [Online]. Available at: http://www-users.cs.york.ac.uk/~mark/projects/cjh507_project.pdf
Hollnagel, E. (2011) Resilience engineering in practice: A guidebook, Farnham, Surrey: Ashgate Publishing.
Hollnagel, E. (2012) FRAM: The Functional Resonance Analysis Method Modelling Complex Socio-Technical Systems, Farnham, Surrey: Ashgate Publishing.
Hollnagel, E. (2014) The Functional Resonance Analysis Method, 20 March [Online] Available at: www.functionalresonance.com.
Hollnagel, E. and Woods, D. (2005) Joint cognitive systems: Foundations of cognitive systems engineering, NW: CRC Press.
Hollnagel, E., Woods, D. and Leveson, N. (2007) Resilience Engineering Concepts and Precepts, Farnham, Surrey: Ashgate Publishing.
Hounsgaard, J. (2013) Using FRAM as a Quality Improvement Tool in Health Care [Online], available at: http://functionalresonance.com/onewebmedia/FRAMily_2013_Hounsgaard.pdf
ICAO (2001) Annex 13 to the Convention on International Civil Aviation - Aircraft Accident and Incident Investigation, 9th ed., ICAO [Online]. Available at: http://www.cad.gov.rs/docs/udesi/an13_cons.pdf.
Jeffery, D. (2009) Tornado Configuration Control and Impact on Continued Airworthiness, QinetiQ RESTRICTED, QINETIQ/MS/SES/CR0902379/1.
Johansson, B. and Lindgren, M. (2008) ‘A quick and dirty evaluation of resilience enhancing properties in safety critical systems’, Proceedings of the third symposium on resilience engineering, Juan-les-Pins, France, pp. 133.
Johnson, C. and Holloway, C. (2004) ‘On the over-emphasis of human ‘error’ as a cause of aviation accidents: ‘systemic failures’ and ‘human error’ in US NTSB and Canadian TSB aviation reports 1996–2003’, Proceedings of the 22nd International System Safety Conference (ISSC). Providence, RI: Systems Safety Society, Citeseer.
Kelly, T. P. and McDermid, J. A. (1999) ‘A Systematic Approach to Safety Case Maintenance’, Computer Safety, Reliability and Security 18th International Conference, SAFECOMP’99. Toulouse, France: Springer, pp. 13-26.
Kontogiannis, T. and Malakis, S. (2012a) ‘Recursive modelling of loss of control in human and organizational processes: A systemic model for accident analysis’, Accident Analysis & Prevention, vol. 48, no. 0, pp. 303-316.
Kontogiannis, T. and Malakis, S. (2012b) ‘A systemic analysis of patterns of organizational breakdowns in accidents: A case from Helicopter Emergency Medical Service (HEMS) operations’, Reliability Engineering & System Safety, vol. 99, no. 0, pp. 193-208.
Le Coze, J. (2013) ‘New models for new times. An anti-dualist move’, Safety Science, vol. 59, no. 0, pp. 200-218.
Leonhardt, J., Macchi, L., Hollnagel, E. and Kirwan, B. (2009) A White Paper on Resilience Engineering for ATM, EUROCONTROL [Online], Available at: www.eurocontrol.int.
Leveson, N. (2011) Engineering a safer world: Systems thinking applied to safety, London: MIT Press.
Lloyd, E. and Tye, W. (1982) Systematic safety, London: Civil Aviation Authority.
Lundberg, J. (2008) ‘FRAM as a risk assessment method for nuclear fuel transportation’, 3rd IET International Conference on System Safety. 20 – 22 October. NEC, Birmingham: Institute of Engineering and Technology.
Luxhøj, J. T. (2003) Probabilistic Causal Analysis for System Safety Risk Assessments in Commercial Air Transport, Department of Industrial and Systems Engineering, Rutgers University [Online]. Available at: shemesh.larc.nasa.gov/ira03/p02-luxhoj.pdf
Luxhøj, J. T. and Williams, T. P. (1996) ‘Integrated decision support for aviation safety inspectors’, Finite Elements in Analysis and Design, vol. 23, no. 2–4, pp. 381-403.
MAA (2011a) Air Safety Information Management System User Manual, MAA [Online]. Available at: http://www.maa.mod.uk/linkedfiles/occurrence_reporting/20111005asims_user_guide_v42_finalu.pdf.
MAA (2011b) Missing Rigging Pin, asor\Lossiemouth - RAF\XV(R) Sqn\Tornado\11\9110, MAA: Air Safety Information Management System (MOD Internal System).
MAA (2012a) Gen1000 Series Regulatory Articles, 2nd ed., MAA [Online]. Available at: http://www.maa.mod.uk/linkedfiles/regulation/gen1000seriesprint.pdf.
MAA (2012b) MAA02: Military Aviation Authority Master Glossary, Issue 3, MAA [Online]. Available at: http://www.maa.mod.uk/linkedfiles/regulation/maa02.pdf.
MAA (2013a) RA 1205 – Air System Safety Cases, 2nd ed., MAA [Online]. Available at: http://www.maa.mod.uk/linkedfiles/regulation/gen1000seriesprint.pdf.
MAA (2013b) RA 1210 – Ownership and Management of Operating Risk (Risk to Life), 2nd ed., MAA [Online]. Available at: http://www.maa.mod.uk/linkedfiles/regulation/gen1000seriesprint.pdf.
Madni, A. M. and Jackson, S. (2009) ‘Towards a Conceptual Framework for Resilience Engineering’, Systems Journal, IEEE, vol. 3, no. 2, pp. 181-191.
Manson, S. M. (2001) ‘Simplifying complexity: a review of complexity theory’, Geoforum, vol. 32, no. 3, pp. 405-414.
Mason, M. (2012) Tornado Weapon System Safety Case Report Issue 1, EFIPT-ABW/06/01/13/06, MOD: Internal (RESTRICTED).
McDonald, N. (2008) ‘Challenges facing Resilience Engineering as a Theoretical and Practical Project’, Proceedings of the third symposium on resilience engineering, Juan-les-Pins, France, pp205-2010
McKenzie, K. (2012) MR.2 XV230 in the circuit at Kinloss in 2000, available at: http://www.aeroflight.co.uk/wp-content/uploads/2010/03/XV230-02.jpg (Accessed 8 October 2013).
MOD (2007) Safety Management Requirements for Defence Systems, Defence Standard 00-56, Issue 4, MOD.
MOD (2013a) Tornado Local Instruction - Equipment Risk Management, LI BS0056 Version 1.2, MOD: Internal (RESTRICTED).
MOD (2013b) Tornado Continuous Airworthiness Management Exposition, CAMO/CERT/2012/018, MOD: Internal (RESTRICTED).
Nathanael, D. and Marmaras, N. (2006) ‘The interplay between work practices and prescription: a key issue for organizational resilience’, Proceedings of the second symposium on resilience engineering, Juan-les-Pins, France pp. 229.
Oliver, D., Kelliher, T. and Keegan Jr, J. (1997) Engineering Complex Systems, McGraw-Hill.
Oxstrand, J. and Sylvander, C. (2010) ‘Resilience engineering: Fancy talk for safety culture: A Nordic perspective on resilience engineering’, Resilient Control Systems (ISRCS) 2010 3rd International Symposium on, IEEE, pp. 135.
Pasman, H. J., Knegtering, B. and Rogers, W. J. (2013) ‘A holistic approach to control process safety risks: Possible ways forward’, Reliability Engineering and System Safety, vol. 117, pp. 21-29.
RAeS (2013) ‘The Way We Do Things Around Here’ Culture in The Aviation Maintenance and Engineering Environment, Royal Aeronautical Society [Online]. Available at: http://aerosociety.com/Assets/Docs/Events/728/728Programme.pdf (Accessed 8th March).
Rasmussen, J. (1997) ‘Risk management in a dynamic society: a modelling problem’, Safety Science, vol. 27, no. 2–3, pp. 183-213.
Reason, J. (1997) Managing the Risks of Organizational Accidents, 1st ed, Farnham, Surrey: Ashgate.
Reason, J. T. and Hobbs, A. (2003) Managing maintenance error: a practical guide, Farnham, Surrey: Ashgate.
SAE (1996) Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment, ARP4761, 1st ed. Washington: Society of Automotive Engineers.
SAE (2010) Guidelines for Development of Civil Aircraft and Systems, ARP4754 Rev A, Washington: Society of Automotive Engineers.
Saleh, J. H., Marais, K. B., Bakolas, E. and Cowlagi, R. V. (2010) ‘Highlights from the literature on accident causation and system safety: Review of major ideas, recent contributions, and challenges’, Reliability Engineering & System Safety, vol. 95, no. 11, pp. 1105-1116.
Salmon, P. M., Cornelissen, M. and Trotter, M. J. (2012) ‘Systems-based accident analysis methods: A comparison of Accimap, HFACS, and STAMP’, Safety Science, vol. 50, no. 4, pp. 1158-1170.
Saurin, T. A. and Carim Junior, G. C. (2012) ‘A framework for identifying and analyzing sources of resilience and brittleness: A case study of two air taxi carriers’, International Journal of Industrial Ergonomics, vol. 42, no. 3, pp. 312-324.
Schafer, D. (2012) A Resilience Engineering Primer, Michigan State University [Online]. Available at: https://www.msu.edu/~tariq/Resilience%20engineering%20primer.pdf (Accessed 25 October 2013).
Shirali, G. A., Mohammadfam, I. and Ebrahimipour, V. (2013) ‘A new method for quantitative assessment of resilience engineering by PCA and NT approach: A case study in a process industry’, Reliability Engineering and System Safety, vol. 119, pp. 88-94.
Singleton, C. (2009) Tornado Asset Gateway Proof of Concept - Final Report, 20090519_TAGProofOfConceptFinalReport_R, MOD: Internal (RESTRICTED).
Slater, D. (2013) SIOPS - The New HAZOPS?, Cambrensis [Online]. Available at: http://www.cambrensis.org/wp-content/uploads/2012/05/A-System-Integrity-and-operability-Study.pdf (Accessed 4 April 2014).
Stolker, R., Karydas, D. and Rouvroye, J. (2008) ‘A comprehensive approach to assess operational resilience’, Proceedings of the third symposium on resilience engineering, Juan-les-Pins, France, pp. 28.
Stoop, J. (2013) To Certify, to Investigate or to Engineer, that is the Question, Resilience Engineering Association [Online], Available at: http://www.resilience-engineering-association.org/download/resources/symposium/symposium-2013/Stoop%20(REA%202013).%20To%20certify,%20to%20investigate%20or%20to%20engineer,%20that%20is%20the%20question.pdf. (Accessed 4 April 2014).
Sugden, G. (2011) Tornado Loss Model and Loss Model Database - December 2011 Update, BAE-WAW-RP-TOR-TGP-5209, BAE Systems: Internal (RESTRICTED).
Vugrin, E. D., Camphouse, R. C. and Sunderland, D., Quantitative Resilience Analysis Through Control Design, SAND2009-5957, Livermore, CA: Sandia National Laboratories [Online]. Available at: http://prod.sandia.gov/techlib/access-control.cgi/2009/095957.pdf (Accessed 4 April 2014).
Wilson, E. S. (2012) The Interaction Of Organisational, Human and Technology Factors On The Effectiveness Of Safety Management Systems And Value Achieved From Deploying New Technology, PhD Thesis. University of New South Wales [Online]. Available at: unsworks.unsw.edu.au/fapi/datastream/unsworks:10843/SOURCE01 (Accessed 4 April 2014).
Wilson, E. (2008) ‘Toward a model of the impact organisation, human and technology factors have on the effectiveness of safety management systems’, Journal of Achievements in Materials and Manufacturing Engineering, vol. 31, no. 2, pp. 827-836.
Woltjer, R. (2007) ‘A systemic functional resonance analysis of the Alaska Airlines flight 261 accident’, Human Factors and Economic Aspects on Safety, pp. 83.
Woodbridge, K. (2012) Tornado Equipment Safety Management Plan, 8.0th ed., MOD: Internal (RESTRICTED).
Zarboutis, N. and Wright, P. (2006) ‘Using complexity theories to reveal emerged patterns that erode the resilience of complex systems’, Proceedings of the Second Symposium on Resilience Engineering, Juan-les-Pins, France, pp. 1999.
Appendix A – TORNADO AIRWORTHINESS FRAM MODEL
The following file was submitted electronically: TASM V1.xls
Appendix B – TORNADO AIRWORTHINESS MODEL VISUALISATION
The following files were submitted electronically:
TASM Visualisation Tool V1.vis (Visio file)
TASM Visualisation Tool V1.pdf (large image of the Visio file showing all layers)
TASM Visualisation Tool V1 BLUEPRINT.pdf (large image of the Visio file highlighting connections)
Appendix C – PARTICIPANTS BRIEFING SHEET
RESILIENCE ENGINEERING STUDY
Thank you for agreeing to take part in this postgraduate research study undertaken with Cranfield University. The aim is to improve the management of airworthiness in the RAF using Resilience Engineering principles.
What is Resilience?
Resilience is the intrinsic ability of a system to adjust its functioning prior to, during, or following changes and disturbances, so that it can sustain required operations under both expected and unexpected conditions.
What is Resilience Engineering?
Resilience Engineering is the practice of designing resilience into a system, or modifying a system to make it more resilient, whether the system is a piece of technology (such as a Tornado) or a complicated organisation (the RAF and its contractors). It is a move away from ‘linear’ thinking, which has produced overly simplistic models of safety such as the (in)famous ‘Swiss cheese’ or ‘bow ties’. It describes complex systems at a manageable level of detail without discarding critical connections.
Principles
1. The orders and instructions we work to never quite match the real world. Individuals and organisations must therefore adjust what they do to match current demands and resources. This adjustment is generally an approximation.
2. Some adverse events can be attributed to a breakdown or malfunctioning of components and normal system functions, but others cannot. The latter can best be understood as the result of unexpected combinations of performance variability.
3. Safety management cannot be based exclusively on hindsight (occurrence investigations), nor rely on error tabulation and the calculation of failure probabilities (risk registers). Safety management must be proactive as well as reactive.
4. Safety cannot be isolated from the core business of producing aircraft, nor vice versa. Safety is the prerequisite for productivity, and productivity is the prerequisite for safety. Safety must therefore be achieved by improvements rather than by constraining how we work with a multitude of ‘safety barriers’.
How? – Understand Combinations of Performance Variability; Functional Resonance
The study will map the whole socio-technical system that produces an airworthy Tornado. This includes everything from an AMM servicing a jet, to a design engineer producing a modification, to the fleet planning office. The whole system comprises a variety of processes, each of which is made up of a variety of functions (or activities). Functions are linked together by a variety of aspects. Your subject matter expertise is needed to understand the different aspects of your function.
Aspects of Functions using an aircraft take-off as an example
Input – that which the function processes or transforms, or that which starts the function. Example: clearance to take off from ATC.
Preconditions – conditions that must exist before the function can be executed. Example: aircraft on the runway.
Resources – that which the function needs or consumes to produce the output. Example: aircraft, fuel, etc.
Control – how the function is monitored or controlled (plan, programme, instructions). Example: checklist.
Time – temporal constraints affecting the function, such as finishing time or duration. Example: take-off slot.
Output – that which is the result of the function, either an entity or a state change. Example: aircraft becomes airborne.
A Function and its Aspects:
A model built using the Functional Resonance Analysis Method. Source: Leonhardt et al. (2009)
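The six aspects listed in this briefing sheet, and the way an output of one function couples to an aspect of another, can be sketched as a small data structure. This is an illustrative sketch only and is not the TASM spreadsheet model: the class name `FramFunction`, the helper `couplings` and the two upstream functions ("Issue take-off clearance", "Taxi to runway") are hypothetical, while the take-off function uses the example values given above.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# The five aspects that can receive another function's output.
RECEIVING_ASPECTS = ("input", "precondition", "resource", "control", "time")


@dataclass
class FramFunction:
    """One FRAM function described by its six aspects (hypothetical class)."""
    name: str
    input: Set[str] = field(default_factory=set)
    precondition: Set[str] = field(default_factory=set)
    resource: Set[str] = field(default_factory=set)
    control: Set[str] = field(default_factory=set)
    time: Set[str] = field(default_factory=set)
    output: Set[str] = field(default_factory=set)


def couplings(functions: List[FramFunction]) -> List[Tuple[str, str, str, str]]:
    """List (upstream, downstream, aspect, item) wherever an output of one
    function appears under a receiving aspect of another function."""
    links = []
    for up in functions:
        for down in functions:
            if up is down:
                continue
            for aspect in RECEIVING_ASPECTS:
                for item in sorted(up.output & getattr(down, aspect)):
                    links.append((up.name, down.name, aspect, item))
    return links


# Take-off example from the briefing sheet.
take_off = FramFunction(
    name="Take off",
    input={"clearance to take off"},
    precondition={"aircraft on the runway"},
    resource={"aircraft", "fuel"},
    control={"checklist"},
    time={"take-off slot"},
    output={"aircraft airborne"},
)

# Hypothetical upstream functions, added only to show how couplings arise.
clearance = FramFunction(name="Issue take-off clearance",
                         output={"clearance to take off"})
taxi = FramFunction(name="Taxi to runway",
                    output={"aircraft on the runway"})

links = couplings([clearance, taxi, take_off])
for link in links:
    print(link)
```

Matching items by exact text is a simplification chosen for brevity; a fuller model would need agreed identifiers for each item so that differently worded descriptions of the same thing still couple.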