Introduction to FRAM: A Method and its Principles · 2018-05-09 · for various kinds of analyses (reactive, proactive). A FRAM model represents the functions that sufficient and

© Erik Hollnagel, 2014

Introduction to FRAM: A Method and its Principles

Professor Erik Hollnagel

E-mail: [email protected]

University of Southern Denmark, Institute of Regional Health Research, Odense, Denmark

Region of Southern Denmark, Centre for Quality, Middelfart, Denmark


Models and methods

An analysis of something inevitably involves some assumptions about how that something happens. These assumptions correspond to a model: a simplified explanation of how something can happen and of how the ‘world’ is organised. The organisation usually implies some kind of hierarchical ordering of layers, parts, or components (structural models).

The model defines what the method can be used for, and therefore also sets the limits of the method.

The FRAM is a method to develop a representation or model of how something happens. This model can then be the basis for various kinds of analyses (reactive, proactive). A FRAM model represents the functions that are sufficient and necessary for an activity to take place: not when it goes wrong but when it goes right.


The causality credo

Adverse outcomes happen because something has gone wrong. Adverse outcomes therefore have causes, which can be found and treated.

All accidents can be prevented (zero harm).

Find the component that failed by reasoning backwards from the final consequence.

Accidents result from a combination of active failures (unsafe acts) and latent conditions (hazards).

Find the probability that components “break”, either alone or in simple combinations.

Look for combinations of failures and latent conditions that may constitute a risk.

Accident investigation Risk analysis


Common assumptions (~ 1970)

The failure probability of elements can be analysed/described individually

The order or sequence of events is predetermined and fixed

When combinations occur they can be described as linear (tractable, non-interacting)

The influence from context/conditions is limited and quantifiable

The function of each element is bimodal (true/false, work/fail)

System can be decomposed into meaningful elements (components, events)


Revised assumptions - 2014

While many adverse events can be attributed to failures and malfunctions of everyday functions, many others must be understood as the result of combinations of variability of everyday performance.

Risk and safety analyses should acknowledge the importance of variability of everyday performance and how this creates conditions that may lead to both positive and adverse outcomes.

Outcomes are determined by relations rather than by factors, and by performance variability rather than by failure probability.

The function of the system is not bimodal, but everyday performance is – and must be – variable.

Systems cannot be decomposed in a meaningful way (no natural elements or components)

[Figure: FRAM model with ten functions, each drawn with its six aspects (I, P, C, O, R, T): Certification, Lubrication, Maintenance oversight, Horizontal stabilizer movement, Jackscrew up-down movement, Aircraft design, Aircraft pitch control, Limiting stabilizer movement, End-play checking, Jackscrew replacement. Coupling labels: FAA, Mechanics, High workload, Grease, Interval approvals, Expertise, Controlled stabilizer movement, Aircraft design knowledge, Limited stabilizer movement, Aircraft, Lubrication, Allowable end-play, Excessive end-play, Equipment, Redundant design, Procedures.]


Principles for FRAM

I: The principle of equivalence of successes and failures.

II: The principle of approximate adjustments.

III: The principle of emergence.

IV: The principle of functional resonance.


I: Equivalence of successes and failures

Failure is normally explained as a breakdown or malfunctioning of a system and/or its components. This view assumes that success and failure are of a fundamentally different nature (the ‘hypothesis of different causes’).

Resilience Engineering and Safety-II recognise that individuals and organisations must adjust to the current conditions in everything they do. Because information, resources and time are always finite, the adjustments will always be approximate.

Performance adjustments lead to both kinds of outcome:

Acceptable outcomes: Performance variability (approximate adjustments) is the reason why everyday work is safe and effective.

Unacceptable outcomes: Performance variability (approximate adjustments) is also the reason why things sometimes go wrong.


Actions and “errors”

1. An action is chosen to fit the current situation.

2. An action that leads to the expected outcome is seen as a correct action.

3. An action that leads to unexpected outcomes is classified as an “error”.

4. In hindsight, the alternative “correct” action is identified.

The difference can be difficult to define objectively.

[Figure: Outcome of previous action, Action, Expected outcome, Unexpected outcome.]

"Knowledge and error flow from the same mental sources, only success can tell one from the other." (Ernst Mach, 1838-1916)


II: Approximate adjustments

Availability of resources (time, manpower, materials, information, etc.) may be limited and uncertain.

People adjust what they do to match the situation.

Performance variability is inevitable, ubiquitous, and necessary.

Because of resource limitations, performance adjustments will always be approximate.

Performance variability is the reason why things sometimes go wrong.

Performance variability is the reason why everyday work is safe and effective.


Efficiency-Thoroughness Trade-Off

Thoroughness: Time to think. Recognising situation. Choosing and planning.

Efficiency: Time to do. Implementing plans. Executing actions.

If thoroughness dominates, there may be too little time to carry out the actions: neglect pending actions, miss new events.

If efficiency dominates, actions may be badly prepared or wrong: miss pre-conditions, look for expected results.

[Figure: time & resources needed versus time & resources available.]


Some ETTO heuristics

Idiosyncratic (work related)

Looks fine
Not really important
Normally OK, no need to check
I’ve done it millions of times before
Will be checked by someone else
Has been checked by someone else
This way is much quicker
No time (or resources) to do it now
Can’t remember how to do it
We always do it this way
It looks like X (so it probably is X)
We must get this done
Must be ready in time
Must not use too much of X

Cognitive (individual)

Judgement under uncertainty
Cognitive primitives (SM – FG)
Reactions to information input overload and underload
Cognitive style
Confirmation bias
Reject conflicting information

Collective (organisation)

Negative reporting
Reduce redundancy
Meet “production” targets
Reduce unnecessary cost
Double-bind


The wet floor

A mill employee slipped and fell on a wet floor and fractured his kneecap. For more than six years it had been the practice to wet down too great an area of floor space at one time and to delay unnecessarily the process of wiping up.

Slipping on the part of one or more employees was a daily occurrence. The ratio of no-injury slips to the injury was 1,800 to 1. (Heinrich, 1931)


III: Principle of emergence

The variability of normal performance is rarely large enough to be the cause of an accident in itself or even to constitute a malfunction. The variability from multiple functions may combine in unexpected ways, leading to consequences that are disproportionally large, hence produce non-linear effects. Both failures and normal performance are emergent rather than resultant phenomena, because neither can be attributed to or explained only by referring to the (mal)functions of specific components or parts.

Socio-technical systems are intractable because they change and develop in response to conditions and demands. It is therefore impossible to know all the couplings in the system, hence impossible to anticipate more than the regular events. The couplings are mostly useful, but can also constitute a risk.



The small world problem

Stanley Milgram (1933-1984)

Travers & Milgram (1969). An experimental study of the small world problem. Sociometry, 32(4), 425-443.

A “target person” (Boston) and three groups of “starting persons” were selected (Nebraska: n=296, Boston: n=100). The target was identified by name, address, occupation, place of work, college & graduation year, military service, wife’s maiden name, and hometown. Each starter was given a document and asked to move it by mail toward the target, via first-name acquaintances, who were asked to repeat the procedure.

What is the probability that any two persons, selected arbitrarily from a large population, will know each other, or be linked via common acquaintances?
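The question above can be made concrete as a search for the shortest chain of acquaintances between two people. The sketch below is only an illustration: the network, the names in it, and the function `chain_length` are hypothetical and are not taken from the Travers & Milgram study.

```python
from collections import deque


def chain_length(acquaintances, start, target):
    """Return the number of links in the shortest acquaintance chain
    from start to target, or None if no chain exists (breadth-first search)."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, steps = queue.popleft()
        for friend in acquaintances.get(person, ()):
            if friend == target:
                return steps + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, steps + 1))
    return None


# Hypothetical first-name acquaintance network (who would forward to whom).
network = {
    "starter": ["a", "b"],
    "a": ["c"],
    "b": [],
    "c": ["target"],
}

links = chain_length(network, "starter", "target")
print(links)  # three links, i.e. two intermediaries (a and c)
```

In a large population most couplings of this kind are unknown in advance, which is the point the slide makes about socio-technical systems.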


Stable vs. transient causes

Final effects are (relatively) stable changes to some part of the system.

Effects are ‘real.’

Causes are assumed to be stable. Causes can be ‘found’ by backwards tracing from the effect. Causes are ‘real.’

Causes can be associated with components or functions that in some way have ‘failed.’ The ‘failure’ is either visible after the fact, or can be deduced from the facts.


Stable vs. transient causes

Final outcomes are (relatively) stable changes to some part of the system.

Effects are ‘real.’

Causes represent a pattern that existed at one point in time. But they are inferred rather than ‘found.’ Causes are ‘elusive.’

Outcomes ‘emerge’ from transient (short-lived) intersections of conditions and events.

Outcomes cannot be traced back to specific components or functions. Outcomes are emergent because the conditions that can explain them were transient.


IV: Resonance

[Figure: three curves over time. Natural oscillation: natural frequency, fixed amplitude. Forcing function: same frequency as the natural oscillation. Natural oscillation + forcing function: resonance, same frequency but increased amplitude.]
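The growth in amplitude can be stated with the standard textbook model of a driven, damped oscillator. This model is brought in here for illustration and is not part of the slides; the symbols (mass m, damping ratio ζ, natural frequency ω₀, forcing amplitude F₀, forcing frequency ω) are assumptions of the sketch.

```latex
\ddot{x} + 2\zeta\omega_0\,\dot{x} + \omega_0^{2}\,x = \frac{F_0}{m}\cos(\omega t)
\quad\Longrightarrow\quad
A(\omega) = \frac{F_0/m}{\sqrt{\left(\omega_0^{2}-\omega^{2}\right)^{2} + \left(2\zeta\omega_0\,\omega\right)^{2}}},
\qquad
A(\omega_0) = \frac{F_0}{2\,m\,\zeta\,\omega_0^{2}}
```

When the forcing frequency equals the natural frequency, the first term under the square root vanishes, so the steady-state amplitude is limited only by the damping.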


Stochastic resonance

[Figure: signal, random noise, and mixed signal + random noise plotted over time against a detection threshold.]

Stochastic resonance is the enhanced sensitivity of a device to a weak signal that occurs when random noise is added to the mix.
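A small simulation can illustrate the definition. The numbers below (signal amplitude 0.8, threshold 1.0, Gaussian noise with standard deviation 0.3, period of 200 samples) are arbitrary assumptions chosen so that the signal alone never reaches the threshold; they do not come from the slides.

```python
import math
import random


def detections(samples, threshold):
    """Indices at which a sample exceeds the detection threshold."""
    return [i for i, x in enumerate(samples) if x > threshold]


random.seed(0)  # fixed seed so the run is repeatable
n = 2000
threshold = 1.0

# Weak periodic signal: its peak (0.8) stays below the threshold (1.0).
signal = [0.8 * math.sin(2 * math.pi * i / 200) for i in range(n)]
noise = [random.gauss(0.0, 0.3) for _ in range(n)]
mixed = [s + z for s, z in zip(signal, noise)]

signal_hits = detections(signal, threshold)
mixed_hits = detections(mixed, threshold)

# Average value of the underlying signal at the moments of detection.
mean_signal_at_hits = sum(signal[i] for i in mixed_hits) / len(mixed_hits)

print(len(signal_hits), len(mixed_hits), round(mean_signal_at_hits, 2))
```

The signal alone produces no detections, while the mixed signal does, and the detections fall mostly where the underlying signal is near its peaks. In that sense the added noise makes the weak signal visible.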


Functional resonance

[Figure: performance variability plotted over time.]

For each function, the others constitute the environment.

Every function has a normal, weak variability.

The pooled variability of the “environment” may lead to resonance, hence to a noticeable “signal”.

Functional resonance is the detectable signal that emerges from the unintended interaction of the normal variabilities of many signals.


Tacoma Narrows Bridge

Opened July 1, 1940

Collapsed November 7, 1940


London Millennium Bridge

Opened June 10, 2000

Closed June 12, 2000. Reason: bridge swayed severely as people walked across it.

Reopened after reconstruction, January 2002


FRAM analysis steps

0. Define the purpose of modelling and describe the situation being analysed: an event that has occurred (incident/accident), a possible future scenario (risk), or the consequences of a design/modification.

1. Identify the essential functions in the event ('foreground' functions) when things go right; characterise each using the six basic aspects.

2. Complete the FRAM model by ensuring all defined aspects are described for at least two functions (as Output and as [Input, Precondition, Resource, Control, Time]).

3. Describe the actual / potential variability of 'foreground' functions and 'background' functions (context). Identify functional resonance based on potential / actual dependencies (couplings) among functions.

4. Propose ways to monitor and dampen performance variability (indicators, barriers, design / modification, etc.).
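The completeness condition in step 2 can be checked mechanically. The sketch below is one possible reading of it, not the FMV's behaviour: it flags every aspect label that is described for only one function. The dictionary layout and the function `dangling_aspects` are hypothetical, and the assignment of the lift example's labels to particular aspects is an assumption, since the original diagram is not reproduced in the text.

```python
RECEIVING = ("input", "precondition", "resource", "control", "time")


def dangling_aspects(model):
    """Report labels that appear on only one side of a coupling.

    never_produced: used as I/P/R/C/T but not the Output of any function.
    never_used:     produced as an Output but not used by any function.
    """
    produced = {label for f in model.values() for label in f.get("output", [])}
    consumed = {
        label
        for f in model.values()
        for aspect in RECEIVING
        for label in f.get(aspect, [])
    }
    return {
        "never_produced": sorted(consumed - produced),
        "never_used": sorted(produced - consumed),
    }


# Assumed aspect assignment for the lift example (illustration only).
lift_model = {
    "Preparing work": {
        "control": ["Work planning"],
        "output": ["Lift has been delivered"],
    },
    "Tilt lift to tilt-back position": {
        "input": ["Lift has been delivered"],
        "precondition": ["Tilt area clear", "Platform lowered",
                         "Outriggers in folded position"],
        "resource": ["Competence in operation"],
        "output": ["Lift in tilt-back position"],
    },
    "Move lift": {
        "input": ["Instruction to move lift"],
        "precondition": ["Lift in tilt-back position"],
        "control": ["Operating manual (35 pages)"],
        "output": ["Lift stored in work area"],
    },
}

report = dangling_aspects(lift_model)
for kind, labels in report.items():
    print(kind, labels)
```

Each flagged label points to a function that still has to be added, typically as a background function, or to an output for which no downstream function has been described yet.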


Identifying Functions: General

PURPOSE: To find out what went wrong or malfunctioned (cause or root cause). Accident investigations start from the observed (adverse) outcome(s), and trace the developments backwards until an acceptable cause is found.

PURPOSE: A FRAM analysis aims to identify how the system should have functioned (or should function) for everything to succeed (i.e., everyday performance), and to understand the variability which alone or in combination prevented that (or may prevent that) from happening.

MODEL: A FRAM model describes a system’s functions and the potential couplings among functions. The model does not describe or depict an actual sequence of events, i.e., an accident scenario.

INSTANTIATION: An accident scenario can be the result of an instantiation of the model. The instantiation is a “map” of how functions are coupled, or may become coupled, under given conditions, whether favourable or unfavourable.
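The difference between a model and an instantiation can be sketched in code: the model lists potential couplings, and an instantiation keeps only those realised under given conditions, together with how the output actually turned out. The `Coupling` type, the `instantiate` function, and the aspect assignments below are hypothetical; the labels are borrowed from the lift example.

```python
from typing import NamedTuple


class Coupling(NamedTuple):
    upstream: str    # function that produces the output
    output: str      # label of the output
    downstream: str  # function that receives it
    aspect: str      # aspect of the downstream function it arrives at


# The model: potential couplings, with no sequence of events implied.
MODEL = [
    Coupling("Preparing work", "Lift has been delivered",
             "Tilt lift to tilt-back position", "input"),
    Coupling("Tilt lift to tilt-back position", "Lift in tilt-back position",
             "Move lift", "precondition"),
]


def instantiate(model, observed):
    """Keep the couplings whose output was observed, paired with a
    description of how that output actually turned out."""
    return [(c, observed[c.output]) for c in model if c.output in observed]


# Conditions of one particular event (assumed for illustration).
event = {"Lift has been delivered": "Lift delivered too early"}

realised = instantiate(MODEL, event)
for coupling, variability in realised:
    print(coupling.upstream, "->", coupling.downstream, ":", variability)
```

The same model yields a different instantiation for every set of conditions, which is why the model itself is not an accident scenario.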


Describing a FRAM function

Input: That which activates the function and/or is used or transformed to produce the output. Constitutes the link to upstream functions.

Output: That which is the result of the function. Constitutes the links to downstream functions.

Resources (execution conditions): That which is needed or consumed by the function when it is active (matter, energy, competence, software, manpower).

Control: That which supervises or regulates the function. E.g., plans, procedures, guidelines or other functions.

Precondition: System conditions that must be fulfilled before a function can be carried out.

Time: Temporal aspects that affect how the function is carried out (constraint, resource).
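As a data structure, a function with its six aspects might be recorded as follows. This is a minimal sketch: the class `FramFunction` and its field names are hypothetical and are not the FMV's file format, and which aspect each label of "Move lift" belongs to is an assumption made for illustration.

```python
from dataclasses import dataclass, field

ASPECTS = ("input", "output", "precondition", "resource", "control", "time")


@dataclass
class FramFunction:
    """One function and the labels attached to each of its six aspects."""
    name: str
    input: list[str] = field(default_factory=list)
    output: list[str] = field(default_factory=list)
    precondition: list[str] = field(default_factory=list)
    resource: list[str] = field(default_factory=list)
    control: list[str] = field(default_factory=list)
    time: list[str] = field(default_factory=list)

    def described_aspects(self) -> list[str]:
        """Names of the aspects that have at least one label."""
        return [a for a in ASPECTS if getattr(self, a)]


# Assumed aspect assignment for the "Move lift" function.
move_lift = FramFunction(
    name="Move lift",
    input=["Instruction to move lift"],
    output=["Lift stored in work area"],
    precondition=["Lift in tilt-back position"],
    control=["Operating manual (35 pages)"],
)

print(move_lift.described_aspects())
```

Aspects may be left empty in this sketch; a function is described only as far as the analysis needs.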


Describing the aspects

The aspects of a function are described using the FRAM Model Visualiser (FMV). The FMV provides a structured way of defining, editing, and revising functions.


Identifying Functions: Details

Where to begin: A FRAM analysis can in principle begin with any function. The analysis will show the need for other functions to be included, i.e., functions that are coupled or linked through various relations. FRAM defines six types of relations.

Level of description: There is no single, correct level of description. A FRAM model will typically comprise functions described on different levels.

Level of detail: If there can be significant variability in a foreground function, then it is possible to go deeper into the analysis of that function, and possibly break it down into subfunctions.

System boundary (stop rule): The analysis may go beyond the boundaries of the system as initially defined. If some background function can vary and thereby affect foreground functions “inside” the system, then it should be considered a foreground function.

Foreground and background: Functions are pragmatically labelled as being either foreground or background functions.


Foreground and background (functions)

FRAM uses a distinction between foreground and background functions, which may all affect performance variability. Foreground functions are directly associated with the activity being modelled and may vary significantly during a scenario. Background functions refer to common conditions that may vary more slowly.

The distinction between foreground and background functions is relative rather than absolute. A ‘background’ function may be analysed further, and thereby becomes a ‘foreground’ function.

Both sets of functions should be calibrated as far as possible using information extracted from accident databases.


Why do functions vary?

The variability of the output can be a result of:

The variability of the function itself, i.e., a result of the nature of the function. This can be described as internal or endogenous variability.

The variability of the working environment, i.e., the conditions under which the function is carried out. This can be described as external or exogenous variability.

Influences from upstream functions, where the outputs from upstream functions (as input, precondition, resource, control, or time) may vary.

General principle: Variability of function → variability of output from function.

The performance of a function, hence the output, may also vary due to a combination of the three conditions: internal variability, external variability, and coupling.


Simple description of variability

In the FRAM, the variability of the output should be considered relative to its use by a downstream function. Output variability can be described in terms of timing and precision.

With regard to precision, outputs can be imprecise, precise, or acceptable – relative to a downstream function. An imprecise output is either incomplete, inaccurate, ambiguous or misleading so that it does not meet the requirements of downstream functions. A precise output corresponds to the requirements of the downstream functions. An acceptable output can be used by the downstream functions, but requires some adjustment or variability of the downstream functions. These may use additional time and resources, hence increase variability.

With regard to timing, outputs can be produced too early, on time, too late, or not at all. If there is a noticeable delay in the propagation, then the transmission of the output may be described as a function in its own right.


Upstream-downstream couplings

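Table: upstream output variability (rows) against the aspect of the downstream function that receives it (columns). The cells are empty in the source.

Columns: Input, Pre-condition, Resource, Control, Time

Rows (Timing): Too early, On time, Too late, Omission

Rows (Precision): Imprecise, Acceptable, Precise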
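The table can be generated as an empty grid for the analyst to fill in, one cell per combination of output variability and receiving aspect. The enumeration names and the function `empty_coupling_table` are hypothetical; the categories themselves are the ones listed in the table.

```python
from enum import Enum
from itertools import product


class Timing(Enum):
    TOO_EARLY = "Too early"
    ON_TIME = "On time"
    TOO_LATE = "Too late"
    OMISSION = "Omission"


class Precision(Enum):
    IMPRECISE = "Imprecise"
    ACCEPTABLE = "Acceptable"
    PRECISE = "Precise"


# Aspects of the downstream function that can receive an upstream output.
ASPECTS = ("Input", "Pre-condition", "Resource", "Control", "Time")


def empty_coupling_table():
    """One empty cell (None) per (output variability, receiving aspect) pair."""
    rows = list(Timing) + list(Precision)
    return {(row, aspect): None for row, aspect in product(rows, ASPECTS)}


table = empty_coupling_table()
print(len(table))  # 7 rows x 5 columns = 35 cells to be assessed
```

What goes into each cell, i.e. the likely effect on the downstream function, is a judgement the analyst makes for the system at hand; the sketch deliberately leaves every cell empty.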


A FRAM model of everyday operation

[Figure: FRAM model with three functions, each drawn with its six aspects (I, P, C, O, R, T): Preparing work, Tilt lift to tilt-back position, Move lift. Labels: [Work planning], [Competence in operation], [Lift has been delivered], [Tilt area clear], [Platform lowered], [Outriggers in folded position], [Operating manual (35 pages)], [Lift in tilt-back position], [Instruction to move lift], [Lift stored in work area].]


A FRAM instantiation of the event

[Figure: the same three functions (Preparing work, Tilt lift to tilt-back position, Move lift) with the labels of the everyday model: [Work planning], [Competence in operation], [Lift has been delivered], [Tilt area clear], [Platform lowered], [Outriggers in folded position], [Operating manual (35 pages)], [Lift in tilt-back position], [Instruction to move lift], [Lift stored in work area]. Labels added for the event: [Monsoon (rain)], [Lift delivered too early], [Tilting lift took too long].]


A FRAM instantiation of the situation

[Figure: instantiation diagram; all labels are marked TBD.]
