Tel: (+44) 01492 879813 Mob: (+44) 07984 [email protected]
1
Summing up
Andy Brazier
2
Overview
Where to go from here
Learn from accidents and incidents.
3
Required approach
Policy – includes human factors aims
Organising – responsibilities and competence
Developing procedures
Competency assurance
Management of change
Planning – ensure degree of effort is commensurate with risk
Monitor, audit and review
Ensure human failures are properly considered
Ensure root causes are identified
Ensure human factors solutions to human factors problems
4
Develop a task analysis report
Identify critical tasks
Task analysis and HAZOP
Risk control measures
ALARP demonstration
Referred to from the COMAH report
As you may do for a QRA report.
5
Specific requirements
Competence assurance program
Ergonomic standards
Procedures
Interface design
Staffing level assessment
Fatigue assessment and management
Design and procurement procedures.
6
Demonstrate human factors risks are ALARP
As Low As Reasonably Practicable
Presumption is that you will implement ‘good practice’ risk reduction measures
Need to demonstrate sacrifice is grossly disproportionate to the benefit
Risk reduction would be minimal
Would lead to greater risk elsewhere
Holistic approach
Risk of the whole facility.
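The gross disproportion test above can be illustrated with a short calculation. This is a minimal sketch, not a regulatory method: the function name, the monetary figures and the `disproportion_factor` are all hypothetical, and the sketch assumes both the sacrifice and the benefit have already been expressed in the same units. It does not replace the presumption that good practice measures are implemented.

```python
def is_grossly_disproportionate(cost: float, benefit: float,
                                disproportion_factor: float) -> bool:
    """Return True if the sacrifice (cost) exceeds the benefit by more than
    the chosen factor. All three inputs are illustrative assumptions."""
    if cost < 0 or benefit < 0:
        raise ValueError("cost and benefit must be non-negative")
    if disproportion_factor < 1:
        raise ValueError("disproportion_factor must be at least 1")
    # Multiplying avoids dividing by zero when the risk reduction is nil
    return cost > disproportion_factor * benefit


# Hypothetical measure: costs 1,000,000 and delivers a benefit valued at 10,000
print(is_grossly_disproportionate(1_000_000, 10_000, disproportion_factor=10))
# Hypothetical measure: costs 50,000 for the same benefit
print(is_grossly_disproportionate(50_000, 10_000, disproportion_factor=10))
```

Only the first hypothetical measure could be argued against on these numbers; the second would be expected to be implemented.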
7
Good practice
Examples:
HSC Approved Code of Practice (ACOP)
HSE guidance
Publications from other government departments
Standards (e.g. BS and ISO)
Trade association publications
Need to take into account
Individual and societal risks and concerns
Inherent safety, eliminate hazard, avoid risk
Minimal use of procedures and PPE
Clearly defined scope
8
Demonstrating ALARP
Answer these two questions
What more could be done?
Why have we not done it?
9
Could you automate more tasks?
May prevent operator errors
Reduces operators’ opportunities to maintain an understanding of how the plant operates
Introduces opportunities for maintenance errors.
10
Could you provide more automatic protection?
Can mitigate effects of operating and maintenance errors
Operators can become over-reliant
Increases complexity of operation
Can create a culture where overrides and inhibits are tolerated
Vulnerable to errors in calibration and failure to reactivate after maintenance.
11
Could you employ more people?
Increases what can be done in high demand situations
May reduce pressure on teams under normal conditions (fewer violations?)
Makes training people easier
Increases opportunities for covering absence
You need to demonstrate that:
You have enough people to avoid and respond to major accidents
Staffing arrangements are optimum.
12
Could you provide more procedures?
Writing a procedure does not reduce risk
More procedures increase the likelihood that procedures are not used
For some tasks, a good procedure is a good way of minimising the likelihood of errors
A holistic system of good quality procedures and job aids is likely to be the best solution.
13
Could you provide more training?
More of the same or different
Increase understanding of the plant
More opportunities to practice infrequent tasks
You need to demonstrate that:
You know what competencies people need
That your staff have the necessary competencies.
14
Basis for enforcement
Significant gap between necessary measures and controls in place
Risk of recurrence following incident or near miss
Evidence of potentially serious risk from human factors issues
Lack of expertise where there is a substantive human factors issue.
15
Have enforced because of
Organisational change
Hours of work
Workload and staffing
Competence assurance
Human factors risk assessment for batch process
No appeals on notices issued to date
16
Expectations
Appropriate balance between hardware and human issues
Ensure human contribution to major accidents is considered, not just personal safety
Include management, technical and support staff, as well as operators
Consider systems.
17
Process safety performance indicators
Effectiveness of training programme
Frequency of accidental releases
Process disturbances
Activations of protective devices
Time taken to detect and respond to events
Component malfunctions
Outstanding maintenance or inspection
Procedure reviews
Occurrences of staffing levels below minimum
Non-compliance with maximum working times
18
“An airline would not make the mistake of measuring air safety by looking at the number of routine injuries occurring to its staff”
A. Hopkins - Lessons from Longford
19
Learn from accidents and incidents
Investigate, analyse and share
Cannot learn everything from your own incidents
Not many significant events
Limited resources to investigate
Internal mindset
Look at one incident at a time
Don’t
Dismiss because hazards, equipment, controls are different
Skim through the headlines only
Focus on the last big one.
20
All major accidents
One or more ‘fatal errors’
Conditions that made the error likely
System failures contributing to the accident’s likelihood and consequence
All accidents preceded by similar near misses
Management did not recognise the warning signs.
21
BP Texas City
Process industry has, quite rightly, looked carefully at this accident
It seemed as if, to some people, the causes were novel and unheard of in the industry
I believe the reports actually reflect the current consensus of what causes major accidents.
22
Piper Alpha
Permit to work failures
Well established system
Compliant
Not working in practice
23
Procedures are essential but…
It is easy to be reassured that written systems and procedures are being used
No news is good news?
People think they are following the procedure but have not actually understood what is required
People think the procedure is only a guide
People daren’t say they don’t follow the procedure
Assume people will adapt & take short cuts
Audit what people do, not just the paper.
24
Chernobyl
Communication failures
Management secretive about design weaknesses
Operators did not challenge instructions.
25
Error is a natural part of communication
It is not what you say, it is what people think you mean
Some messages are taken literally
Other times people ‘read between the lines’
If people are not told about problems
They will make the wrong decisions
Will not understand why they need to follow procedures
More/better communication is required when unusual events are happening.
26
Clapham Junction
Technician errors
Highly trained
Experienced.
27
Training ≠ Competence
Training courses have limited impact
Most learning is achieved ‘on the job’
Needs to be planned
Trainees need to be supervised
Time served does not replace the need for competence assessment
Competent people still make mistakes
Given more complex and demanding tasks
Indispensable means less able to take a break.
28
Herald of Free Enterprise
Door left open
Ship’s Master did not know
Vulnerable design
29
Layers of protection
Understand
How many?
Are they independent?
Don’t assume they will work
Always obtain positive indications of operation
Make sure people understand their safety responsibilities
Learn from near misses
Not just failures, but also what prevented an accident
If you don’t act, people will assume all is safe.
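The question "Are they independent?" matters because the benefit of stacking layers depends on it. The sketch below is illustrative only: the function name and every probability are hypothetical, and it assumes a single shared cause (for example, one calibration error or one override practice) that can defeat all layers at once.

```python
from math import prod


def combined_failure_probability(layer_pfds, common_cause=0.0):
    """Chance that every layer fails on the same demand.

    layer_pfds   -- assumed probability of failure on demand for each layer
    common_cause -- assumed probability of a shared failure defeating all layers
    """
    layers = list(layer_pfds)
    for p in layers + [common_cause]:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    independent = prod(layers)  # only valid if the layers really are independent
    # Either the shared cause defeats everything, or it is absent and
    # the layers have to fail one by one
    return common_cause + (1.0 - common_cause) * independent


# Three hypothetical layers, each failing on 1 demand in 10
print(combined_failure_probability([0.1, 0.1, 0.1]))
print(combined_failure_probability([0.1, 0.1, 0.1], common_cause=0.01))
```

With these made-up figures, a 1 in 100 shared cause makes the combined protection roughly ten times less reliable than the independent calculation suggests, which is why counting layers is not enough.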
30
Bhopal
Methyl isocyanate
Runaway reaction
Unable to contain vapours
31
Reduced throughput does not mean reduced risk
Delaying maintenance
Reduced budget or staff
People get used to systems being inoperable
People are more interested in plants that make money
High rate is more likely to be steady state.
32
Mexico City
Fractured pipe
Slow response
Too late to prevent escalation
33
Detect → Diagnose → Respond
Have to succeed in all three stages
AND not OR gate logic
Prompt alarms
Competent people
Plant knowledge and understanding
Decision making
Resources
People
Equipment.
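The AND gate point can be shown with simple arithmetic. This is a minimal sketch with hypothetical success probabilities, assuming the three stages succeed or fail independently of each other; the function name is illustrative.

```python
def response_success(p_detect: float, p_diagnose: float, p_respond: float) -> float:
    """Probability that detection AND diagnosis AND response all succeed,
    assuming the three stages are independent."""
    for p in (p_detect, p_diagnose, p_respond):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    # AND gate: the stage probabilities multiply, so the weakest stage
    # caps the overall result
    return p_detect * p_diagnose * p_respond


# Hypothetical figures: each stage succeeds 9 times out of 10
overall = response_success(0.9, 0.9, 0.9)
print(f"{overall:.3f}")  # 0.729
```

Even with every stage at 90%, the overall chance of a successful response is below 73%, and a complete failure at any one stage gives zero regardless of how good the other two are.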
34
BP Texas City
People in the wrong place at the wrong time
Trailers in plant area
Area not cleared during start up
Slow to raise the alarm
A good safety record has its downside.
35
Generic Learning
Big accidents start small
Accidents occur most during unusual circumstances
If you haven’t got it, it can’t hurt you
Keep people away from hazards
Written systems & procedures provide poor risk control
Most learning is on the job
Error is a natural part of communication
People who are tired make more mistakes
Safety devices can create complacency
Don’t assume safety devices are working.
36
Generic Learning (cont.)
Everyone needs to act if they know something is unsafe
You need to challenge your emergency arrangements
People must be prepared to raise the alarm
Anyone who may have to deal with the consequences of an accident has to know what they are dealing with
Make sure you learn from near misses
All incidents have multiple causes and this should be seen in your investigations
Don’t overlook sabotage
Non-operational parts of the business can be hazardous
Don’t believe your safety is good (enough).
37
Conclusions
Before major accidents most managers didn’t have particular concerns about safety
Not perfect, but did not foresee the risk
Reassured that systems were in place without having good evidence that they were effective
Only heard or listened to good news
The biggest risks occur because of the errors and poor judgements made by those managers
High reliability organisations expect failures and so work hard to avoid them
38