Alarm Management: Real-Time Advanced Techniques
Bill Hollifield, PAS Principal Alarm Management and HMI Consultant
Mary Kay O'Connor Process Safety Center International Symposium, October 2011
Slide 2© PAS 2010
About Bill Hollifield
• BSME, MBA
• Industry veteran of 35+ years
• Global experience in
• Alarm Management
• High Performance operator HMI
• Co-author and committee member for:
• EPRI Alarm Management Guidelines (co-author)
• ANSI/ISA 18.2 Alarm Management Standard (committee member)
• EEMUA 191 (EEMUA Industry Review Group member)
• RP-1167 Alarm Management for Pipeline Systems (committee member)
Slide 3© PAS 2010
Presentation Contents
• Regulatory Implications of the ISA-18.2 Standard
• Implementing the Basics of Alarm Management
• Advanced Alarm Management Techniques
• Alarm Documentation in the Operator HMI
• Automated Alarm Audit and Enforcement
• Alarm Shelving
• State-Based Alarming
• Alarm Flood Suppression
• Summary and Questions
Slide 4© PAS 2010
The Alarm Problem in a Nutshell
• Thousands of Alarm Events Cannot be Evaluated By The Operator!
• Which alarms are safe to be ignored by the operator?
[Chart: Alarms Per Day, recorded over 8 weeks (axis 0 to 6000), compared with Max. Acceptable (300) and Manageable (150)]
[Chart: Configured Alarms Per Operator Position, 1960 to 2000 (axis 0 to 4000)]
Slide 5© PAS 2010
ANSI/ISA 18.2 Standard
• Management of Alarm Systems for the Process Industries
• A vital and essential next step for alarm management
• Began in 2003
• Released June 2009
• It includes “the WHAT”:
• A framework of alarm management life cycle steps and activities
• Mandatory and recommended practices
• Additional content will be published in follow-up “Technical Reports”
• It does not have “the HOW”:
• Detailed or specific “How to” guidance
• Work practice examples
• Specific method recommendations or details
Email me for a detailed white paper on understanding and
applying ISA-18.2
Slide 6© PAS 2010
ISA-18.2 Application
• Does ISA-18.2 Apply to You?
• YES – if you have a DCS, SCADA systems, PLCs, or Safety Systems, or anything where an operator responds to alarms!
• Petrochemical, Chemical, Refining, Platform, Pipelines, Power Plants, Pharmaceuticals, Mining & Metals. Also for continuous, batch, semi-batch, or discrete processes.
• Grandfathering
• ISA-18.2 states: “The practices and procedures of this standard shall be applied to existing systems in a reasonable time as determined by the owner/operator.”
• Why should you care? ISA standards are not enforceable (???)
Slide 7© PAS 2010
ISA-18.2 Regulatory Impact
• ISA-18.2 is a “recognized and generally accepted good engineering practice” (RAGAGEP)
• OSHA and other agencies (e.g. FDA, PHMSA) have “general duty” clauses
• “The employer shall document that equipment complies with recognized and generally accepted good engineering practices”
• Regulated industries must, at least, demonstrate that they are doing something “just as good or better” than a standard.
Standards become de facto regulations.
Slide 8© PAS 2010
ISA-18.2 Regulatory Impact
• OSHA Regional PSM Coordinators and the CSB (Chemical Safety Board) are internally distributing ISA-18.2 to their inspectors.
• The IEC is officially adopting ISA-18.2 as a combination international IEC/ISA standard (IEC 62682 Ed. 1.0)
• PHMSA is aware of ISA-18.2, and the API-1167 Recommended Practice for Pipeline Alarm Systems is in close alignment with ISA-18.2.
Standards become de facto regulations.
Slide 9© PAS 2010
ISA-18.2 Regulatory Impact
• From an OSHA Presentation, October 2009 at the ISA Expo
• ISA standards and ASME codes are specifically used as the basis of fines and enforcement actions!
OSHA takes RAGAGEP seriously!
Slide 10© PAS 2010
ISA-18.2 Regulatory Impact
• ANOTHER EXAMPLE:
• September 30, 2009: OSHA fines BP Texas City an additional $87 Million from the 2005 explosion
• PDF documents are available at www.galvestondailynews.com
• The OSHA documents reference RAGAGEP
• specifically citing failure to follow ISA Standards and ASME codes as the basis for the fines!
• So - time to get started on Alarm Management
Slide 11© PAS 2010
The PAS Seven Steps
• BASIC
• Step 1: Develop, Adopt and Maintain an Alarm Philosophy
• Step 2: Collect Data And Benchmark Your Systems
• Step 3: Perform “Bad Actor” Alarm Resolution
• Step 4: Perform Alarm Documentation and Rationalization
• ADVANCED
• Step 5: Implement Automated Alarm Audit and Enforcement
• Step 6: Implement Real Time Alarm Management
• Step 7: Control and Maintain Your Improved System
Slide 12© PAS 2010
Step 1: Alarm Philosophy
CONTENTS Of An Alarm Philosophy
1.0 Alarm Philosophy Introduction
2.0 Purpose and Use
3.0 Alarm Definition and Criteria
4.0 Alarm Annunciation and Response
4.1 Navigation and Alarm Response
4.2 Use of External Annunciators
4.3 Hardwired Switches
4.4 Annunciated Alarm Priority
5.0 Alarm System Performance
5.1 Alarm System Champion
5.2 Alarm System KPIs
5.3 Alarm Performance Report
6.0 Alarm Handling Methods
6.1 Nuisance Alarms
6.2 Alarm Shelving
6.3 State-Based Alarms
6.4 Alarm Flood Suppression
6.5 Operator Alert Systems
7.0 Alarm Rationalization
7.1 Areas of Impact and Severity of Consequences
7.2 Maximum Time for Response and Correction
7.3 Priority Matrix
7.4 Alarm Documentation
7.5 Alarm Trip Point Selection
7.6 The Focused D&R Option
8.0 Specific Alarm Design Considerations
8.1 Handling of Alarms from Instrument Malfunctions
8.2 Alarms for Redundant Sensors and Voting Systems
8.3 External Device Health and Status Alarms
8.4 ESD Systems
8.5 ESD Bypasses
8.6 Duplicate Alarms
8.7 Consequential Alarms
8.8 Pre-Alarms
8.9 Flammable and Toxic Gas Detectors
8.10 Safety Shower and Eyebath Actuation Alarms
8.11 Building-Related Alarms
8.12 Alarm Handling for Programs
8.13 Alarms to Initiate Manual Tasks
8.14 DCS System Status Alarms
8.15 Point and Program References to Alarms
8.16 Operator Messaging System
9.0 Management of Change
10.0 Training
11.0 Alarm Maintenance Workflow Process
Plus Appendices
“We don’t need no
stinkin’ rules!”
An Alarm Philosophy is a comprehensive document on “how to do alarms right!”
It is required by ISA 18.2
Slide 13© PAS 2010
What is an alarm?
• Operator Action Is:
• Manipulation of the control system to effect process change
• Directing others to make changes or take actions
• Changing operating mode
• Manual changes
• Begin troubleshooting / analysis of a situation
• Contacting other people or groups regarding a situation
• Logging conditions for later examination, maintenance, or repair
• Operator Action is Not:
• Writing something down in a logbook
• Thinking “OK, That’s nice to know.”
• Thinking “OK, The next shift can deal with that tomorrow.”
• Thinking “OK, the system is working normally.”
An audible and/or visible means of indicating to the operator an equipment malfunction, process deviation, or abnormal condition requiring a response.
Do not use alarm systems for inappropriate things!
Slide 14© PAS 2010
Step 2: Alarm Analysis
Alarm Analysis - Specific Problem Identification
[Chart: Top 10 Most Frequent Annunciated Alarms, showing alarm count (axis 0 to 180,000) and cumulative % for tags 43MV022.BADPV, 43MV006.BADPV, 43MV024.BADPV, 43PAH397.OFFNRM, 43MV010.BADPV, 43MV018.BADPV, 43MV022.CMDDIS, 43MV010.CMDDIS, 43MV018.CMDDIS, 43FC155.PVLO]
[Chart: Annunciated Alarms per 10 Minutes over 42 days (axis 0 to 700). Highest 10-minute rate = 852; peaks exceed 700. Alarm flood = 10+ alarms in 10 minutes.]
On-going Analysis is required by ISA 18.2
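The 10-minute rate analysis on this slide can be approximated with a few lines of code. The following is a minimal sketch, not a description of any vendor tool: it assumes alarm events are available as Python `datetime` timestamps, counts them in fixed (not rolling) 10-minute windows, and applies the slide's simplified flood definition of 10 or more alarms in 10 minutes. The function names and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)
FLOOD_THRESHOLD = 10  # "10+ in 10 minutes", as stated on the slide


def alarms_per_window(timestamps, start):
    """Count annunciated alarms in consecutive fixed 10-minute windows from `start`."""
    counts = Counter()
    for ts in timestamps:
        index = (ts - start) // WINDOW  # timedelta // timedelta gives an int
        counts[index] += 1
    return counts


def flood_windows(counts):
    """Return the indexes of windows whose count meets the flood threshold."""
    return sorted(i for i, n in counts.items() if n >= FLOOD_THRESHOLD)


# Hypothetical data: 12 alarms in the first window, 3 in the third.
start = datetime(2011, 10, 1, 0, 0)
events = [start + timedelta(seconds=30 * k) for k in range(12)]
events += [start + timedelta(minutes=21 + k) for k in range(3)]

counts = alarms_per_window(events, start)
peak = max(counts.values())
floods = flood_windows(counts)
```

Fixed windows keep the sketch short; a rolling window would catch floods that straddle a window boundary, at the cost of more bookkeeping.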
Slide 15© PAS 2010
Alarm System Performance Targets (From ISA-18.2)
Alarm Performance Metrics per Operator Position, based upon at least 30 days of data
Annunciated alarms per time (target value very likely to be acceptable / target value maximum manageable):
• Annunciated alarms per day per operator position: ~150 alarms per day / ~300 alarms per day
• Annunciated alarms per hour per operator position: ~6 (average) / ~12 (average)
• Annunciated alarms per 10 minutes per operator position: ~1 (average) / ~2 (average)
Other metrics and target values:
• Percentage of hours containing >30 alarms: ~<1%
• Percentage of 10-minute periods containing >5 alarms: ~<1%
• Maximum number of alarms in a 10-minute period: 10 or less
• Percentage of time alarm system is in a flood condition: ~<1%
• Percentage contribution of the top 10 most frequent alarms to the overall alarm load: ~<1% to 5% maximum, with action plans to address deficiencies
• Quantity of chattering and fleeting alarms: zero, with action plans to correct any that occur
• Stale alarms: less than 5 present on any day, with action plans to address
• Annunciated or configured priority distribution: 3 priorities: ~80% P3, ~15% P2, ~5% P1; or 4 priorities: ~80% P3, ~15% P2, ~5% P1, ~<1% “Priority Critical.” Other special-purpose priorities are excluded from the calculations
• Unauthorized alarm suppression: zero alarms suppressed outside of controlled or approved methodologies
• Improper alarm attribute change: zero alarm attribute changes outside of approved methodologies or MOC
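Two of these targets are easy to check programmatically. The sketch below is illustrative only: it treats the approximate targets (~150 and ~300 alarms per day) as hard thresholds, which is a simplifying assumption, and the function names and sample counts are hypothetical.

```python
def classify_daily_rate(alarms_per_day):
    """Compare an average daily alarm rate per operator position to the two targets."""
    if alarms_per_day <= 150:
        return "very likely to be acceptable"
    if alarms_per_day <= 300:
        return "maximum manageable"
    return "exceeds targets"


def priority_distribution(counts_by_priority):
    """Return each priority's share of the alarms as a percentage (1 decimal)."""
    total = sum(counts_by_priority.values())
    return {p: round(100.0 * n / total, 1) for p, n in counts_by_priority.items()}


# Hypothetical counts that match the ~5% / ~15% / ~80% target split.
dist = priority_distribution({"P1": 50, "P2": 150, "P3": 800})
rating = classify_daily_rate(5000)
```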
Slide 16© PAS 2010
[Chart: Top 10 Most Frequent Annunciated Alarms, showing alarm count and cumulative %]
Step 3: Fix Your “Bad Actor” Alarms!
• The “top 10” alarms usually make up 20% to 80% of the entire alarm system load
• Many types: Chattering, Fleeting, Frequent, Stale, Duplicate, Nuisance Diagnostic, etc.
• The methods are simple to learn and apply.
Exactly How To Solve
Them
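Finding the bad actors starts with a frequency count. The following is a minimal sketch, assuming the alarm journal can be reduced to a list of tag names, one entry per annunciation; the function name and the sample journal are hypothetical.

```python
from collections import Counter


def top_n_contribution(alarm_tags, n=10):
    """Return the n most frequent tags and their share of the total load in percent."""
    counts = Counter(alarm_tags)
    top = counts.most_common(n)
    total = sum(counts.values())
    share = 100.0 * sum(c for _, c in top) / total if total else 0.0
    return top, share


# Hypothetical journal: two chattering tags dominate the load.
journal = ["TAG_A"] * 60 + ["TAG_B"] * 20 + [f"TAG_{i}" for i in range(20)]
top, share = top_n_contribution(journal, n=2)
```

In this made-up journal, two tags account for 80 of 100 annunciations, which is the kind of concentration the first bullet describes.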
Slide 17© PAS 2010
“Bad Actor” Alarms: Expected Gain
• Average system load improvement is ~60% from resolving Bad Actor alarms
PAS Bad Actor Alarm Work Process Results
System       Baseline Alarms    Reduction from PAS Bad Actor Recommendations    % Reduction
System 1     339,521            325,423                                         95.8%
System 2     225,668            133,307                                         59.1%
System 3     414,887            333,395                                         80.4%
System 4     64,695             46,749                                          72.3%
System 5     93,848             71,372                                          76.1%
System 6     79,434             72,935                                          91.8%
System 7     482,375            413,094                                         85.6%
System 8     644,487            593,904                                         92.2%
System 9     183,312            77,417                                          42.2%
System 10    106,212            38,566                                          36.3%
System 11    91,686             29,188                                          31.8%
System 12    39,305             8,625                                           21.9%
System 13    33,115             22,646                                          68.4%
System 14    44,527             24,882                                          55.9%
System 15    58,049             51,782                                          89.2%
System 16    13,598             4,138                                           30.4%
System 17    21,071             8,516                                           40.4%
System 18    20,739             13,152                                          63.4%
System 19    5,567              2,247                                           40.4%
System 20    1,271              868                                             68.3%
Slide 18© PAS 2010
Step 4: Documentation and Rationalization
• Ensures your actual alarms comply with your alarm philosophy
• Documents your alarms (Set Points, Causes, Consequences, Corrective Actions), creating a Master Alarm Database.
• Required by ISA-18.2
[Diagram: inputs to Documentation and Rationalization: Process History; Alarm and Control Configuration; SOP, EOP, HAZOP, etc.; Plant Experience & Knowledge (Process, Equipment, Operations, Procedures); P&IDs and Operating Graphics; Alarm Statistical Analysis; ESD / APC Expertise]
Fix problems while they are small
Staged approaches can save money
Slide 19© PAS 2010
Alarm Priority Determination
Typical Grid-Based Priority Determination:
Impact Category: Personnel
• NONE: No injury or health effect
• MINOR: Slight injury (first aid) or health effect, no disability, no lost time recordable
• MAJOR: Lost time recordable but no permanent disability. Reversible health effects (such as skin irritation).
• SEVERE: Lost time injury, or worker disabling, or severe injuries, or life threatening
Impact Category: Public or Environment
• NONE: No effect
• MINOR: Minimal exposure. No impact. Does not cross fence line. Contained release. Little, if any, clean up. Source eliminated. Negligible financial consequences.
• MAJOR: Exposed to hazards that may cause injury. Hospitalizations and medical first aid possible. Damage claims. Contamination causes some non-permanent damage.
• SEVERE: Uncontained release of hazardous materials with major environmental impact and 3rd party impact. Exposed to life-threatening hazard. Disruption of basic services. Impact involving the community. Catastrophic property damage. Extensive cleanup measures and financial consequences.
Impact Category: Costs or Value of Production Loss
• NONE: No loss
• MINOR: Event costing <$10,000, notification only at Department Head level
• MAJOR: Event costing $10,000 - $100,000, notification at Site Manager level
• SEVERE: Event costing >$100,000, notification above Site Manager level
Severity of consequence, plus Time Available to Respond (<3 Minutes, 3 - 10 Minutes, 10 - 30 Minutes, >30 Minutes), determines Alarm Priority.
Alarm Priority Determination (columns are Severity of Consequences)
Time Available    None        Minor    Major    Severe
<3 Min            No Alarm    MED      HIGH     HIGH
3-10 Min          No Alarm    LOW      MED      MED
10-30 Min         No Alarm    LOW      LOW      MED
>30 Min           No Alarm    Re-engineer the Alarm for Urgency
Alarms where operator action is the primary method by which harm to a person is avoided shall be configured at the highest DCS priority
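The priority grid on this slide is a simple two-key lookup. The sketch below encodes it as a dictionary, reading the flattened grid as None/Minor/Major/Severe columns against the four time bands. The handling of the exact band boundaries (3, 10, and 30 minutes) is an assumption, since the slide does not say which band they belong to, and the function names are hypothetical. The rule that alarms protecting people from harm get the highest DCS priority is a separate override and is not modeled here.

```python
# Priority grid from the slide: time band -> severity -> priority.
PRIORITY_GRID = {
    "<3 min": {"minor": "MED", "major": "HIGH", "severe": "HIGH"},
    "3-10 min": {"minor": "LOW", "major": "MED", "severe": "MED"},
    "10-30 min": {"minor": "LOW", "major": "LOW", "severe": "MED"},
}


def time_band(minutes):
    """Map the time available to respond onto the grid's bands (boundaries assumed)."""
    if minutes < 3:
        return "<3 min"
    if minutes <= 10:
        return "3-10 min"
    if minutes <= 30:
        return "10-30 min"
    return ">30 min"


def alarm_priority(severity, minutes_to_respond):
    """Return the grid's priority for a severity ('none', 'minor', 'major', 'severe')."""
    if severity == "none":
        return "No Alarm"
    band = time_band(minutes_to_respond)
    if band == ">30 min":
        return "Re-engineer the Alarm for Urgency"
    return PRIORITY_GRID[band][severity]
```

For example, `alarm_priority("major", 20)` returns `"LOW"`, matching the 10-30 minute row of the grid.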
Slide 20© PAS 2010
Advanced Alarm Management Techniques
Do the basics first!
Then, consider the advanced!
Slide 21© PAS 2010
Embed ALARM Information into the HMI
TI-468-02 Column Overhead Temperature
Alarm: PVHI Setting: 120 deg C Priority: 3
Class: Minor Financial Response Time: <15 min
Alarm Consequences:
• Off-spec Production
• Lowered efficiency
Alarm Causes and Corrective Actions:
• Excess steam: Adjust base steam rate
• Pressure excursion: Check pressure and feed parameters vs. SOP 468-1
• Insufficient reflux: Adjust reflux per computation; check controller for cascade mode
• Feed composition variance: Check feed composition
[Graphic element: 120.1 deg C, with priority 3 alarm indicator]
Slide 22© PAS 2010
Audit / Enforce Proper Alarm Settings
• Alarm Configuration security is often ineffective.
• “Alarm Creep” occurs after D&R unless positive steps are taken.
• Best Practice: Automatically audit alarm settings to ensure they are not improperly changed.
Summary of Changes in Alarms Needing Management of Change (MOC)
Type of Change          Quantity During Analysis Period
Alarm Enable State      79
Alarm Trip Points       181
Alarm Priority          92
Tag Range               121
Tag Execution Status    175
Total                   648
Average Per Day         5.6
Monitoring for Unauthorized Alarm
System Change is Required by ISA-18.2
Slide 23© PAS 2010
Alarm Audit and Enforce
Periodically audit alarm values from DCS, compare to Master Alarm Database
Optional and with control: Enforce alarm settings to DCS
PlantState Suite server or custom application
Master Alarm Database
Automatically Generate Exception Reports
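The audit step amounts to comparing two sets of settings and reporting the differences. The following is a minimal sketch of that comparison for a custom application, not a description of any commercial product. It assumes both the master alarm database and the live DCS settings have already been read into dictionaries keyed by (tag, alarm type); the attribute names and sample values are hypothetical.

```python
def audit_alarm_settings(master, live):
    """Compare live DCS alarm settings with the master alarm database.

    Both arguments map (tag, alarm_type) -> dict of attribute values.
    Returns one exception record per mismatched attribute or missing alarm.
    """
    exceptions = []
    for key, expected in master.items():
        actual = live.get(key)
        if actual is None:
            # The alarm is documented but not present in the live configuration.
            exceptions.append({"alarm": key, "attribute": None,
                               "expected": expected, "actual": None})
            continue
        for attr, want in expected.items():
            got = actual.get(attr)
            if got != want:
                exceptions.append({"alarm": key, "attribute": attr,
                                   "expected": want, "actual": got})
    return exceptions


# Hypothetical example: the trip point was changed without MOC.
master = {("TI-468-02", "PVHI"): {"trip_point": 120.0, "priority": 3, "enabled": True}}
live = {("TI-468-02", "PVHI"): {"trip_point": 125.0, "priority": 3, "enabled": True}}
report = audit_alarm_settings(master, live)
```

Enforcement, the optional step on the slide, would write the `expected` value back to the DCS for each exception; that is deliberately left out here because it needs site-specific controls.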
Slide 24© PAS 2010
Alarm Suppression and Shelving
• Suppression: A control system capability to temporarily remove an alarm from service
• Operator initiated, usually to temporarily deal with nuisance or inappropriate alarms
• Is often POORLY MANAGED and can have SIGNIFICANT DANGEROUS CONSEQUENCES
• Paper-based systems DO NOT WORK
• Shelving: Alarm Suppression done RIGHT!
• Proper administrative control, appropriate restrictions
• Cannot be overlooked or forgotten about
• Properly tracked
• Call-up lists at shift-change time
• Limited duration of shelving
• Shelve individual alarms, not all alarms on a point
• Reasons and source for shelving are documented
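The shelving requirements above map naturally onto a small data structure. This is a minimal sketch under stated assumptions: the 12-hour maximum duration is an invented value (the slide only says duration is limited), and the class and method names are hypothetical. It shelves individual alarms by (tag, alarm type), requires a reason and a source, expires entries automatically, and produces a call-up list for shift change.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_SHELVE = timedelta(hours=12)  # assumed site limit, not from the slide


@dataclass
class ShelvedAlarm:
    tag: str
    alarm_type: str
    reason: str
    shelved_by: str
    expires_at: datetime


class Shelf:
    def __init__(self):
        self._items = {}

    def shelve(self, tag, alarm_type, reason, shelved_by, now, duration):
        """Shelve one alarm on a point, with a documented reason and a time limit."""
        if not reason.strip():
            raise ValueError("a reason is required")
        if duration > MAX_SHELVE:
            raise ValueError("duration exceeds the maximum shelving time")
        self._items[(tag, alarm_type)] = ShelvedAlarm(
            tag, alarm_type, reason, shelved_by, now + duration)

    def expire(self, now):
        """Unshelve alarms whose time limit has passed and return their keys."""
        expired = [k for k, item in self._items.items() if item.expires_at <= now]
        for k in expired:
            del self._items[k]
        return expired

    def shift_change_list(self):
        """Return everything currently shelved, soonest expiry first."""
        return sorted(self._items.values(), key=lambda item: item.expires_at)
```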
Slide 25© PAS 2010
State-Based Alarming: Does One Size Fit All?
• IF Your Process:
• Makes Multiple Products or Grades
• Uses Multiple Differing Feedstocks
• Has Parallel Operating Trains
• Has Different Modes of Operation
• Runs at Different Rates
• Different plant states will cause nuisance or inappropriate alarms if alarm settings are not properly changed.
• State-based alarming technology lets you have multiple alarm settings that are optimum and correct for all your operating states.
Detect Plant State Change
Automatically Alter Alarm Settings to Match New State
Slide 26© PAS 2010
State-Based Alarming: Implementation
• Define the Operating States
• (Startup, Alternate Products, Alternate Feeds, Half Rates, Idle, etc.)
• Identify alarms that need different conditions (setpoints, logical conditions, priorities, suppression status)
• Record the alternate alarm settings for each state.
• Generally, only a few (<10%) of alarms in a system need state-based modification
• Implement the State Detection Logic and State Modification Mechanisms. (Create your own, or acquire them from an appropriate commercial source.)
Slide 27© PAS 2010
State Detection Logic - Example
Compressor Example States: RUNNING (default) and SHUTDOWN
The State is RUNNING when: Flow > 25 AND Discharge Pressure > 50 AND Amps > 20
Otherwise it is SHUTDOWN.
Use any reliable combination; at least 2 sensors are recommended.
Or, detect the SHUTDOWN state instead; the choice depends on the process and equipment.
Optionally include Operator Confirmation of State!
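The compressor example translates directly into code. The sketch below uses the three thresholds from the slide; the slide gives no engineering units, so none are assumed, and the function name is hypothetical. Operator confirmation, which the slide mentions as optional, is not modeled.

```python
def compressor_state(flow, discharge_pressure, amps):
    """Return 'RUNNING' only when all three sensors agree; otherwise 'SHUTDOWN'."""
    running = flow > 25 and discharge_pressure > 50 and amps > 20
    return "RUNNING" if running else "SHUTDOWN"
```

Requiring all three conditions means a single failed sensor pushes the result to SHUTDOWN rather than RUNNING, so the choice of which state to detect positively should reflect which mistake is less harmful for the equipment in question.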
Slide 28© PAS 2010
Alarm Modifications – the Shutdown State
Compressor States: RUNNING (default) and SHUTDOWN
When the State is SHUTDOWN, modify or suppress the following alarms:
• Low Flow
• Low Discharge Pressure
• High Suction Pressure
• Low Oil Pressure
• Low Amps
• Low Speed
• Several BAD VALUE alarms
…and so forth: the expected diagnostics plus closely related, expected process alarms.
Method Acts as Flood Suppression for Equipment Trips
Slide 29© PAS 2010
“Shutdown State” Alarm Settings
• “Shutdown” does not mean “turn off the alarms!”
• Safe design – don’t take any high energy alarms out of service.
• Needed to alert you to isolation failures, unexpected reactions and other hazards.
• Better design – tighten the high energy alarm settings to provide an early warning of isolation failures.
Tank 405, Full Rate State Alarm Settings:
• Low Level: 5%
• High Level: 90%
• High Pressure: 250 psig
Tank 405, Shutdown State Alarm Settings:
• Low Level: Not Configured
• High Level: 2%
• High Pressure: 5 psig
Slide 30© PAS 2010
Requirements of Advanced Solutions
• Advanced (Real-time) Alarm Management Solutions must be coordinated.
• Each advanced function (Shelving, Audit/Enforce, State-based Alarming) should understand and react appropriately to what the other functions are doing.
• Avoid conflict and erroneous error reporting.
• Consider these issues as you decide whether to generate these capabilities by custom programming or acquire a commercial solution with a significant installed base!
Slide 31© PAS 2010
And Don’t Forget the Window to Alarms
Let’s do something about these very poor operator HMIs!
But that’s another topic entirely… on how to do an HMI right!
Slide 32© PAS 2010
Avoid getting to know…
…your regulatory inspectors really well
They just want to help you
Slide 33© PAS 2010
Summary
• Poorly performing alarm systems are contributing factors to major accidents and poor operating performance.
• Proper Alarm System Management and Alarm System Performance is essential to maximum-efficiency operations.
• Effective operator HMIs are a key factor in incident mitigation.
• The solutions to the problems are well known and fully documented.
ANSI/ISA 18.2 Management of Alarm Systems for the Process Industries
“WHAT” “HOW”
Slide 34© PAS 2010
Questions
• Bill Hollifield ([email protected])
• www.pas.com
• +1.281.286.6565
Questions? E-mail me for white papers summarizing
ISA-18.2 and High Performance HMI.