Copyright 2007, Offshore Technology Conference
This paper was prepared for presentation at the 2007 Offshore Technology Conference held in Houston, Texas, U.S.A., 30 April–3 May 2007.
This paper was selected for presentation by an OTC Program Committee following review of information contained in an abstract submitted by the author(s). Contents of the paper, as presented, have not been reviewed by the Offshore Technology Conference and are subject to correction by the author(s). The material, as presented, does not necessarily reflect any position of the Offshore Technology Conference, its officers, or members. Papers presented at OTC are subject to publication review by Sponsor Society Committees of the Offshore Technology Conference. Electronic reproduction, distribution, or storage of any part of this paper for commercial purposes without the written consent of the Offshore Technology Conference is prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may not be copied. The abstract must contain conspicuous acknowledgment of where and by whom the paper was presented. Write Librarian, OTC, P.O. Box 833836, Richardson, TX 75083-3836, U.S.A., fax 01-972-952-9435.
Abstract
This paper shows how to deal properly with "Safety Integrity Levels" (SIL) as per IEC 61508 [1] and IEC 61511 [2] for "High Integrity Protection Systems" (HIPS), which are more and more extensively used in the oil industry to replace traditional protection systems. While IEC 61508 and IEC 61511 are rather efficient from an organizational point of view, some difficulties unfortunately exist at the definition and calculation levels. The formulae proposed in part 6 of IEC 61508 are, for example, not really tractable for actual industrial systems. This paper describes the probabilistic methods and tools that we have developed in our company to overcome these difficulties. Three main conventional methods are investigated: "Fault Trees" which, when properly handled, are very efficient for low demand topside HIPS; the Markovian approach, which is interesting but tractable only for very small systems; and Monte Carlo simulation on behavioural models (Petri nets or the AltaRica Data Flow formal language), which is efficient in all cases. Results are given on simple examples in order to show the principles of the various approaches. It is interesting to notice that using those approaches is simpler than what is proposed in the standards. Therefore, until the publication of an updated version improving IEC 61508 part 6, it seems better to replace it by sound conventional methods and tools adapted to SIL calculations for production systems. We have begun to disseminate these approaches to our contractors.
Introduction
In the oil industry, the traditional protection systems defined in API 14C are more and more often replaced by safety instrumented systems: the so-called HIPS (High Integrity Protection Systems). Therefore, according to the IEC 61508 and IEC 61511 standards, their SILs (Safety Integrity Levels) shall be calculated.
Unfortunately, some difficulties arise when using the above standards [3, 4]. They are often ignored by those who perform SIL studies. The main ones are:
1. insufficient failure taxonomy and definitions,
2. handling of test and maintenance procedures,
3. introduction of the Safe Failure Fraction (SFF), which is not a relevant concept,
4. Probability of Failure on Demand (PFD) and Probability of Failure per Hour (PFH) calculations.
After a brief presentation of the first three problems, the fourth one is detailed in more depth to show what we have done to cope with the various SIL assessment problems encountered in the oil industry:
1. topside HIPS, easily tested and maintained,
2. subsea HIPS, difficult to test and maintain,
3. preventive HIPS.
According to the standards, topside and subsea HIPS are so-called "low demand mode" safety instrumented systems (SIS), while preventive HIPS are so-called "continuous mode" SIS. This paper is mainly focused on methods and tools devoted to low demand mode HIPS.
Failure taxonomy
In the IEC 61508 and IEC 61511 standards, failures are split into dangerous or safe and detected or undetected. This is a little different from the classical failure taxonomy:
• safe versus unsafe,
• revealed versus hidden,
• time dependent versus on demand.
While the dangerous failure definition is very similar to the classical unsafe failure (i.e. a failure which tends to inhibit the safety function), this is not the case for the safe failure. In the standards it is only a failure which is not dangerous, whereas in the traditional approach it is a failure which tends to anticipate the safety action.
The classification "detected versus undetected" of the standards is similar to "revealed versus hidden". The problem is that users who read the standards too quickly think that they can straightforwardly equate revealed failures with safe failures. Of course, this is generally not true.
In the third class of failures, only the time dependent failures are recognized by the standards. The true "on demand failures" are completely ignored and, even worse, are hidden behind the term Probability of Failure on Demand (PFD), which encompasses only time dependent failures occurring during the test interval. This is a big problem, as those failures, which are likely to arise each time a demand (including tests) produces a change in the states of some items (e.g. rupture of the spring of a relay, blockage of a valve, ...), cannot be detected by any test.
OTC 18504
High-Integrity Protection Systems (HIPS): Methods and Tools for Efficient Safety Integrity Levels Analysis and Calculations
Jean-Pierre Signoret, Total
Low demand versus continuous demand modes
The standards identify two modes of functioning: SIS working in low demand mode of operation and SIS working in high demand or continuous modes. The calculation of the so-called Probability of Failure on Demand (PFD) is required for the former, whereas the calculation of the so-called Probability of Failure per Hour (PFH) is required for the latter.
When the demand frequency is low compared to the test frequency, a failure occurring during the test interval is likely to be detected and repaired before the occurrence of a demand. As the SIS behaves almost independently of the demand from the Equipment Under Control (EUC), the accident frequency is equal to the SIS "average unavailability" multiplied by the demand frequency. Then, the PFD of the standards is simply the traditional unavailability of the classical approach.
When the demand frequency increases until it becomes of the same order as, or higher than, the test frequency, there is almost no chance of detecting and repairing a failure before a demand occurs. If the demand becomes continuous, this chance even drops to 0. Then, an accident occurs as soon as the SIS fails and the probability of accident is equal to the unreliability F(T) of the SIS over [0, T]. Unlike the low demand case, this cannot be directly equated with the PFH as per the standards, and it is more difficult to find a sound equivalent in the classical approach. The simplest way may be to consider PFH = F(T)/T.
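As a rough numerical illustration of the two cases above, the following sketch computes the accident frequency in low demand mode (average unavailability multiplied by demand frequency) and the ratio F(T)/T for continuous mode. It assumes a single non-repaired component with a constant failure rate, so that F(T) = 1 - exp(-λT). This assumption, the function names and the numerical values are illustrative only and do not come from the standards.

```python
import math


def accident_frequency_low_demand(pfd_avg, demand_frequency):
    """Low demand mode: average unavailability multiplied by demand frequency."""
    return pfd_avg * demand_frequency


def unreliability(failure_rate, horizon):
    """F(T) for a non-repaired component with constant failure rate (assumption)."""
    return 1.0 - math.exp(-failure_rate * horizon)


def pfh_estimate(failure_rate, horizon):
    """Continuous mode: the simple proposal PFH = F(T) / T."""
    return unreliability(failure_rate, horizon) / horizon


if __name__ == "__main__":
    # Hypothetical values: PFDavg of 1e-2 and one demand every 10 years.
    print(accident_frequency_low_demand(1e-2, 0.1))  # accidents per year
    # Hypothetical values: failure rate of 1e-6 per hour over one year.
    print(pfh_estimate(1e-6, 8760.0))  # per hour, slightly below 1e-6
```

For a small λT, F(T)/T stays close to the failure rate itself, which is why this ratio behaves like a frequency of failure per hour.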
Then, except for the use of a new name for a classical parameter, there is no problem for low demand mode, as a classical approach may be used. This is more difficult for high demand or continuous mode, where the standards introduce the new PFH concept, which has no clear mathematical definition. Therefore, it is a good idea to come back to the sound probabilistic concepts of unavailability and reliability when assessing the SIL of safety instrumented systems, and this is what we do in our company when dealing with our HIPS.
Models and tools
Generalities
Probabilistic calculations are described in part 6 of IEC 61508, which gives a list of simplified formulae for some particular cases and describes some examples of how several components are combined to model systems. Unfortunately, neither the method used to establish the formulae nor the underlying hypotheses under which they are valid are provided. This would not be a problem in itself, as part 6 is only informative: its content is not intended to cope with all the problems encountered and there is no obligation to use it. The problem arises because, instead of considering this part as simple information, a lot of users use it as if it were normative. They trust that they just have to apply it to obtain relevant results and, even worse, some providers have developed software based on it. Then everybody, without the slightest idea of what a probability may be, feels entitled to perform SIL calculations ... This is very dangerous indeed!
What is presented in part 6 does not reflect the state of the art in probabilistic calculations for industrial systems. It is not really a method of analysis, and the underlying mathematical background and hypotheses are not clearly stated. Using the formulae without understanding these hypotheses is likely to produce non conservative results, which is not acceptable from a safety point of view.
Three years ago, we noticed that the SIL studies from our contractors were very poor and diagnosed that the common cause failure was part 6 of IEC 61508. This is why we decided to develop a sound methodology from the methods and tools in use in house since the early eighties:
• fault tree approach, because it is a method widely used by most of our reliability contractors,
• Markovian approach, because it is sometimes known by our contractors,
• behavioural modelling (Petri nets or AltaRica DF language) and Monte Carlo simulation, because it solves all the difficulties encountered.
Single component analysis
For a single component with a dangerous undetected failure rate λ and a test interval τ, IEC 61508 part 6 gives the traditional, widely used formula:
$PFD_{avg} \approx \lambda \tau / 2$ (1)
This very simple formula is valid only when the underlying hypothesis is met: the EUC is stopped during both tests and maintenance. Unfortunately, this is almost never true for actual industrial systems, for which the use of formula 1 leads to non conservative results. In fact, a lot of other parameters have to be taken into consideration to properly model components as actually used in industry. For example:
• μ: repair rate,
• γ: on demand failure probability,
• θ: test staggering,
• π: test duration.
With the above parameters, PFDavg becomes:
$PFD_{avg} \approx \lambda \tau / 2 + \lambda / \mu + \gamma / (\mu \tau) + \pi / \tau$ (2)
This is more complex than formula 1! Test staggering has no effect on the average, but other parameters may be considered, such as test coverage or human errors. A thorough analysis of the component is needed to identify which parameters to handle according to the actual study.
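To give an order of magnitude for the additional terms, here is a minimal sketch that evaluates formulae 1 and 2 side by side. It assumes that formula 2 reads λτ/2 + λ/μ + γ/(μτ) + π/τ, with `lam` the dangerous undetected failure rate, `tau` the test interval, `mu` the repair rate, `gamma` the on demand failure probability and `pi` the test duration. The function names, the reading of each term given in the comments and the numerical values are illustrative assumptions.

```python
def pfd_avg_formula_1(lam, tau):
    """Formula 1: hidden failures only, lam * tau / 2."""
    return lam * tau / 2.0


def pfd_avg_formula_2(lam, tau, mu, gamma, pi):
    """Formula 2: adds repair, on demand failure and test duration terms."""
    hidden = lam * tau / 2.0        # failure waiting for the next test
    repair = lam / mu               # time under repair after detection
    on_demand = gamma / (mu * tau)  # failures caused by the test itself
    testing = pi / tau              # unavailability during off line tests
    return hidden + repair + on_demand + testing


if __name__ == "__main__":
    # Hypothetical values: rates in 1/h, durations in h.
    lam, tau, mu, gamma, pi = 5e-5, 4380.0, 0.05, 0.01, 10.0
    print(pfd_avg_formula_1(lam, tau))
    print(pfd_avg_formula_2(lam, tau, mu, gamma, pi))
```

With these values the extra terms add a few percent to formula 1. A longer test duration or a slower repair would increase their share accordingly.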
Figure 1: PFD(t) of a single component
[Plot: saw-tooth curve of PFD(t) versus Time over 0 to 8000, with "SIL0" and "SIL1" zones and PFDavg = 8.12 × 10^-2. Legend values: 5 × 10^-5 h^-1, 0.01, 0.05, 4380 h, 2190 h, 10 h.]
As shown on figure 1, PFDavg is not a good representation of the component behaviour because its unavailability PFD(t) is a time dependent saw-tooth curve which may spread over several SIL zones. On figure 1, 29% of the time is spent in "SIL0" whereas PFDavg gives SIL1. While an average is a very good aggregated parameter for a cloud of dots, it may give misleading indications for continuous curves. On figure 1, 3.5 months are spent in "SIL0" before each of the tests.
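The time spent above a SIL boundary can be checked with a short numerical sketch. It assumes a simplified saw-tooth that accounts only for hidden failures, with perfect and instantaneous tests at θ, θ + τ, θ + 2τ, and so on. The values 5 × 10^-5 h^-1, 4380 h and 2190 h are taken as the failure rate, test interval and test staggering, which is an assumed reading of the figure 1 legend. The function names are hypothetical. Since on demand failures, repair and test duration are ignored, the average is not expected to match the PFDavg of figure 1 exactly.

```python
import math


def pfd_sawtooth(t, lam, tau, theta):
    """PFD(t) with tests at theta, theta + tau, ... (hidden failures only)."""
    age = t if t < theta else (t - theta) % tau
    return 1.0 - math.exp(-lam * age)


def summarize(lam, tau, theta, horizon, threshold, steps=80000):
    """Return (average, maximum, fraction of time with PFD(t) above threshold)."""
    dt = horizon / steps
    values = [pfd_sawtooth((i + 0.5) * dt, lam, tau, theta) for i in range(steps)]
    average = sum(values) / steps
    fraction_above = sum(1 for v in values if v > threshold) / steps
    return average, max(values), fraction_above


if __name__ == "__main__":
    # 1e-1 is the upper bound of the SIL1 band for PFDavg.
    avg, peak, frac = summarize(5e-5, 4380.0, 2190.0, 8000.0, 1e-1)
    print(f"average={avg:.3e} max={peak:.3e} time above 1e-1={frac:.1%}")
```

Under these assumptions the fraction of time above 10^-1 should come out close to the 29% quoted in the text, even though the average itself remains inside the SIL1 band.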
Figure 2 shows in detail what happens when a test is performed. The jump corresponds to the on demand failure due to the test itself. After that, the test is performed and its duration is π. At the end of the test, there are two possibilities: either the component is available or it is in a revealed failure state (unavailable). The competition between these two situations gives the decreasing part of the curve. It reaches its minimum at the MTTR (i.e. 1/μ) and after that increases again as shown on figure 1.
Figure 2: Detail of the test zone
On figure 2, the component remains available for its safety function during the tests, but if it were tested off line it would be unavailable over the whole test duration and a contribution π/τ would be added to PFDavg, as shown in formula 2. This may be the main contributor to PFDavg and it is obviously forgotten when using formula 1. Of course, methods and tools are needed to draw the previous curves, and this is what we are going to explain now.
Fault Tree (FT) approach
Most of our HIPS are HIPPS (High Integrity Pressure Protection Systems) operating in low demand mode, and PFDavg has to be calculated according to the standards. The fault tree approach, designed precisely for unavailability calculations, seems to be the right tool to do that. Nevertheless, this shall be done cautiously because it works only if the leaves (i.e. the failure events) are independent. A strong warning shall be given here: the PFDavg of individual leaves cannot be combined through a FT to calculate the PFDavg of the top event. Formulae like 1 or 2 shall not be used directly in fault tree calculations, even if this is a common practice implemented in some FT software packages, which mislead their users into performing bad calculations. This is very dangerous, as results become more and more non conservative as fault tolerance increases (and higher SILs are targeted).
Fortunately, PFDavg can be very easily assessed just by averaging the instantaneous unavailability PFD(t) of the top event over the relevant period [0, T]. As shown on figure 3, this may be done just by using the instantaneous unavailabilities PFDi(t) of each leaf [5].
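The warning above can be illustrated numerically. The sketch below assumes n identical, independent leaves under an AND gate (the system fails when all n leaves have failed), all tested at the same time every τ, with only hidden failures modelled. It compares the average of the top event PFD(t) with the result obtained by wrongly combining the PFDavg of the leaves. The function names and the values λ = 10^-4 and τ = 1000 are illustrative assumptions.

```python
import math


def leaf_pfd(t, lam, tau):
    """Instantaneous unavailability of a leaf tested every tau."""
    return 1.0 - math.exp(-lam * (t % tau))


def compare(n, lam, tau, steps=10000):
    """Return (true PFDavg, naive PFDavg) for n redundant identical leaves."""
    dt = tau / steps
    leaf_values = [leaf_pfd((i + 0.5) * dt, lam, tau) for i in range(steps)]
    # Correct: average the top event unavailability PFD(t) = PFDi(t) ** n.
    true_avg = sum(q ** n for q in leaf_values) / steps
    # Wrong: combine the averages of the leaves through the AND gate.
    naive_avg = (sum(leaf_values) / steps) ** n
    return true_avg, naive_avg


if __name__ == "__main__":
    for n in (1, 2, 3):
        true_avg, naive_avg = compare(n, 1e-4, 1000.0)
        print(n, true_avg, naive_avg, true_avg / naive_avg)
```

The ratio between the two results grows with n (roughly 4/3 for two leaves and about 2 for three), which is the sense in which the naive combination becomes more and more non conservative as fault tolerance increases.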
Various sources of dependencies have to be considered:
• limited number of repair teams. This is generally negligible for safety systems, which are reliable and have priority for repair (i.e. the probability of having to repair 2 safety failures at the same time is low),
• repair at the second failure. This is a strong dependency which cannot be managed by FT,
• reliability calculation. This induces strong dependencies between all the leaves and, except in particular cases, FT are not able to perform genuine reliability calculations.
Therefore, in the oil industry, the fault tree approach is mainly efficient for low demand topside HIPS. It must be used cautiously for preventive topside HIPS and shall be discarded for subsea HIPS.
Figure 3: Example of fault tree
Figure 3 illustrates a very simple system made of 3 identical components working in 2oo3. Only the dangerous undetected failure rate (λ) and the test interval (τ) have been modelled, as this is enough to draw two important conclusions:
• the difference between PFDavg and the maximum PFD is big (about 2.5 times),
• the equivalent failure rate of the system is obviously not constant between tests.
On figure 3, the three components are tested at the same time, but it is interesting to see what happens when tests are staggered.
Figure 4: Test staggering effect
As shown on this figure, staggering the tests makes both PFDavg and PFDmax decrease. This is due to two different effects:
• the maximum decreases because the tests are more homogeneously distributed over time,
• the average decreases because the test frequency of the common cause failures (CCF) has been multiplied by three.
Therefore, staggering the tests is a very good way to improve PFDavg (i.e. SIL), to decrease the spread of the saw-tooth curve and to diminish the impact of common cause failures. This very important characteristic is completely missed by IEC 61508 part 6. All that we have presented above has been introduced in a special SIL menu of ARALIA Workshop, which is the software used in our office.
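The staggering effect can be explored with a minimal sketch of the 2oo3 example. It is not the ARALIA Workshop calculation. It assumes three independent leaves with failure rate λ = 10^-4 and test interval τ = 1000, plus a common cause leaf of rate βλ with β = 10% that is revealed by any component test. All items are new at t = 0, and in the staggered case the first tests take place at τ/3, 2τ/3 and τ. The function names and these modelling choices are assumptions.

```python
import math


def age(t, first_test, period):
    """Time since the last test, or since t = 0 before the first test."""
    return t if t < first_test else (t - first_test) % period


def unavailability(t, rate, first_test, period):
    """Hidden-failure unavailability of a periodically tested item."""
    return 1.0 - math.exp(-rate * age(t, first_test, period))


def top_event_pfd(t, lam, beta, tau, first_tests, ccf_period):
    """PFD(t) of a 2oo3 system with an additional CCF leaf."""
    q1, q2, q3 = (unavailability(t, lam, first, tau) for first in first_tests)
    two_failed = q1 * q2 + q1 * q3 + q2 * q3 - 2.0 * q1 * q2 * q3
    ccf = unavailability(t, beta * lam, ccf_period, ccf_period)
    return 1.0 - (1.0 - two_failed) * (1.0 - ccf)


def mean_and_max(lam, beta, tau, first_tests, ccf_period, horizon, steps=10000):
    """Average and maximum of the top event PFD(t) over [0, horizon]."""
    dt = horizon / steps
    values = [
        top_event_pfd((i + 0.5) * dt, lam, beta, tau, first_tests, ccf_period)
        for i in range(steps)
    ]
    return sum(values) / steps, max(values)


if __name__ == "__main__":
    lam, beta, tau, horizon = 1e-4, 0.1, 1000.0, 5000.0
    simultaneous = mean_and_max(lam, beta, tau, (tau, tau, tau), tau, horizon)
    staggered = mean_and_max(
        lam, beta, tau, (tau / 3, 2 * tau / 3, tau), tau / 3, horizon
    )
    print("simultaneous: mean=%.2e max=%.2e" % simultaneous)
    print("staggered:    mean=%.2e max=%.2e" % staggered)
```

Under these assumptions both the mean and the maximum drop when the tests are staggered, and the values come out close to the Max and Mean figures quoted on figures 3 and 4.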
[Figure 3 plot data: 2oo3 fault tree (TOP) with leaves 1, 2, 3 and a CCF leaf (10%); legend values 1 × 10^-4 and 1000; PFD(t) plotted from 0 to 5000; Max: 3.5e-2, Mean: 1.4e-2]
[Figure 4 plot data: same fault tree with staggered tests; PFD(t) plotted from 0 to 5000; Max: 1.4e-2, Mean: 7.3e-3]
[Figure 2 plot data: PFD(t) versus Time, from 6000 to 7200]
token (small circle in black) in the various places (represented by circles). It is currently running:
• from this state the sensor may fail by itself (failure rate λ) or by a common cause failure (message ?DCC received from another sub PN),
• when failed, it enters a waiting-for-detection state,
• the failure is detected only when a rig reaches the location above the subsea platform and performs a test (a token arrives in the place Rig),
• when the failure is detected, it has to wait to be repaired until a rig is available to do that (message ?StR),
• then the repair is started and, when finished, the sensor becomes available again.
Figure 8: PN of a subsea sensor
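The behaviour listed above can be sketched as a very small token game. This is only an illustration of the sequence of states, not the GRIF model. Place names, transition names and guard names are hypothetical labels, the two failure causes are merged into one transition, guards are reduced to booleans, and all timing (delays, failure rate, rig mobilisation) is ignored.

```python
# Each transition moves the single token from one place to another.
# Names are illustrative labels, not GRIF identifiers.
TRANSITIONS = {
    # name: (input place, output place, guard or None)
    "failure": ("running", "waiting_for_detection", None),
    "detection": ("waiting_for_detection", "waiting_for_repair", "rig_on_location"),
    "start_repair": ("waiting_for_repair", "under_repair", "rig_available"),
    "end_repair": ("under_repair", "running", None),
}


def enabled(place, conditions):
    """Names of the transitions that can fire from `place`."""
    return [
        name
        for name, (source, _target, guard) in TRANSITIONS.items()
        if source == place and (guard is None or conditions.get(guard, False))
    ]


def fire(place, name, conditions):
    """Fire a transition and return the new place of the token."""
    if name not in enabled(place, conditions):
        raise ValueError(f"transition {name!r} is not enabled in {place!r}")
    return TRANSITIONS[name][1]


if __name__ == "__main__":
    place = "running"
    place = fire(place, "failure", {})
    print(place, enabled(place, {}))  # no rig on location: nothing can fire
    place = fire(place, "detection", {"rig_on_location": True})
    place = fire(place, "start_repair", {"rig_available": True})
    place = fire(place, "end_repair", {})
    print(place)
```

The point of the guards is that a failed sensor stays unavailable until the rig conditions are met, which is the kind of dependency that cannot be represented by independent fault tree leaves.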
Figure 9 shows an example of sub Petri nets as they are actually input into the Petri net module of our GRIF software package, which implements generalized stochastic Petri nets enhanced by the use of predicates and assertions. This is a very powerful tool that we use for both our RAM (Reliability, Availability and Maintainability) and SIL calculations.
Figure 9: PN with predicates and assertions
When using such sub Petri nets, it is rather easy to build step by step the Petri net modelling the behaviour of a whole safety instrumented system like the one on figure 6. Of course, results obtained in this way give curves which are less smooth than those obtained analytically (FT, Markov) because only a few well-chosen points can be calculated. Nevertheless, figure 10 is similar to figure 7. On this curve the 90% confidence bounds of the simulation are shown, and we can see that the Monte Carlo simulation is rather accurate. It has to be noted that this curve has been drawn only to assess the maximum PFD. The PFDavg, which is calculated straightforwardly just by estimating the time spent in the failed state, gives the same results as the fault tree and Markovian approaches.
Figure 10: Results from Monte Carlo simulation
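The idea of estimating PFDavg from the time spent in the failed state can be sketched as follows. This is a plain Monte Carlo loop on a single periodically tested component, not a Petri net simulation. It assumes perfect tests, instantaneous repair at each test and a horizon equal to a whole number of test intervals. The function names, the seed and the numerical values are illustrative assumptions.

```python
import math
import random


def simulate_pfd_avg(lam, tau, n_tests, n_histories, seed=12345):
    """Estimate PFDavg and its 90% confidence bounds by Monte Carlo."""
    rng = random.Random(seed)
    horizon = n_tests * tau
    total = 0.0
    total_sq = 0.0
    for _ in range(n_histories):
        failed_time = 0.0
        for _ in range(n_tests):
            # The component is as good as new after each test (assumption).
            time_to_failure = rng.expovariate(lam)
            if time_to_failure < tau:
                failed_time += tau - time_to_failure  # hidden until next test
        fraction = failed_time / horizon
        total += fraction
        total_sq += fraction * fraction
    mean = total / n_histories
    variance = max(total_sq / n_histories - mean * mean, 0.0)
    # 1.645 is the 95th percentile of the standard normal distribution.
    half_width = 1.645 * math.sqrt(variance / n_histories)
    return mean, mean - half_width, mean + half_width


def analytic_pfd_avg(lam, tau):
    """Exact average of 1 - exp(-lam * t) over one test interval."""
    return 1.0 - (1.0 - math.exp(-lam * tau)) / (lam * tau)


if __name__ == "__main__":
    estimate, low, high = simulate_pfd_avg(1e-4, 1000.0, 5, 20000)
    print(f"simulation: {estimate:.4e} [{low:.4e}, {high:.4e}]")
    print(f"analytic:   {analytic_pfd_avg(1e-4, 1000.0):.4e}")
```

The confidence interval narrows as the square root of the number of histories, so the agreement with the analytic value can be made as tight as needed at the price of computing time.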
The above approach is very powerful for SIL calculations, but unfortunately some analysts are reluctant to handle PN (especially when they think that the simplistic calculations of IEC 61508 part 6 are sufficient!). This is why, five years ago, we developed a tool that hides Petri nets behind reliability block diagrams (RBD) thanks to the use of libraries of pre-established sub models [9]. We then developed a library of periodically tested components in order to use this tool for SIL calculations. The principle is very simple:
• building a model like the RBD on figure 6,
• attributing the relevant sub PN model to each module by picking it from the library,
• launching the calculations to obtain the results.
The overall Petri net is automatically generated and calculated, and it is not even necessary to have heard about PN to use this tool! Of course, it is always possible to look at the generated PN if desired. Used on the simple HIPS example, this leads to exactly the same results as presented on figure 10.
Conclusion
The problems encountered when using the IEC 61508 and IEC 61511 standards may be easily overcome provided that relevant methods and tools are used. Fortunately, these methods and tools exist, and the development of some of them began a long time ago, in the early eighties and even in the seventies. Our company has adapted several reliability software packages, ARALIA Workshop, GRIF and COMBAVA, to perform SIL calculations according to what is presented in this paper. This constitutes a powerful set of tools able to manage any SIL calculation on sound bases, and we have begun their dissemination to our contractors.
It is rather interesting to notice that, most of the time, it is easier to perform rigorous calculations by using the right methods and tools than to try to apply ad-hoc formulae like those presented in the standards. Then, until the publication of updated versions of the standards improving IEC 61508 part 6, it seems better to set it aside and replace it by more accurate and efficient methods like those presented in this paper, which better serve the purpose of SIL calculations for oil production systems.
References
1. IEC 61508: "Functional safety of electrical/electronic/programmable electronic safety-related systems. Parts 1-7" (1998, 2000).
2. IEC 61511: "Functional safety. Safety instrumented systems for the process sector. Parts 1-3" (2003).
3. Signoret, J-P: "Managing risks in HIPS by making SIL calculations effective". Published in the proceedings of the seminar IQPC2006, Aberdeen, Great Britain (2006).
4. Dutuit, Y., Innal, F., Rauzy, A., Signoret, J-P: "An attempt to understand better and apply some recommendations of IEC 61508 standard". Published in the proceedings of the international seminar ESREDA, Trondheim, Norway (2006).
5. Rauzy, A., Dutuit, Y., Signoret, J-P: "Assessment of safety integrity levels with fault trees". Published in the proceedings of the international conference ESREL, Estoril, Portugal (2006).
6. Signoret, J-P: "Modeling the behavior of complex industrial systems with stochastic Petri nets". Published in the proceedings of the international conference ESREL 1998, Trondheim, Norway (1998).
7. Dutuit, Y., Signoret, J-P: "Tutorial on dynamic system modelling by using stochastic Petri nets and Monte Carlo simulation". Presented at the international conferences Konbin03, Gdansk, Poland and ESREL 2003, Maastricht, the Netherlands (2003).
8. Arnold, A., Griffault, A., Rauzy, A., Point, G.: "The AltaRica language and its semantics". Fundamenta Informaticae, 34 (2000) 109.
9. Signoret, J-P, Chabot, J-L, Hutinet, T.: "Hiding a stochastic Petri net behind a reliability block diagram". Published in the proceedings of the international conference ESREL, Lyon, France (2002).