7/30/2019 Reliability Engineering Lec Notes #6
SYSTEM RELIABILITY MODELING AND PREDICTION -
STATIC METHODS
Reliability Block Diagrams
A reliability block diagram is a graphical procedure which describes the system
operation in terms of successful "signal" transmission between the system units.
[Figure: Two Unit Series System (left) and Two Unit Active-Parallel System (right)]
Consider a system which consists of two units both of which must function for the system
to function (series system). Assume component failures are statistically independent and
let
A1 : Unit#1 functions at time t; P(A1) = R1(t)
A2 : Unit#2 functions at time t; P(A2) = R2(t)
A : system functions at time t; P(A) = R(t)
Then
P(A) = P(A1A2) = P(A1)P(A2)
=> R(t) = R1(t)R2(t)
If the system functions when either Unit#1 or Unit#2 functions (active-parallel system),
then
P(A) = P(A1 + A2) = P(A1) + P(A2) - P(A1)P(A2)
=> R(t) = R1(t) + R2(t) - R1(t)R2(t)
If the units are identical with constant failure rate $\lambda$,
$R_{series}(t) = e^{-2\lambda t}$
$R_{parallel}(t) = 2e^{-\lambda t} - e^{-2\lambda t}$
[Figure: $R(t)$ and $F(t) = 1 - R(t)$ plotted against $\lambda t$ from 0 to 2 for a single unit ($R = e^{-\lambda t}$, dotted), the two-unit series system ($R = e^{-2\lambda t}$, dash-dot) and the two-unit active-parallel system ($R = 2e^{-\lambda t} - e^{-2\lambda t}$, solid)]
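In general,
$R(t) = \prod_{n=1}^{N} R_n(t)$ for an N-unit series system
$R(t) = 1 - \prod_{n=1}^{N} [1 - R_n(t)]$ for an N-unit parallel (active) system
$R(t) = \sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!} [R(t)]^n [1 - R(t)]^{N-n}$ for an M-out-of-N (good) system
with N identical units, where $R(t)$ on the right-hand side is the reliability of a single unit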
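The three general formulas can be checked numerically with a short sketch. The function names and the values of `lam` and `t` below are illustrative assumptions, not part of the notes.

```python
from math import comb, exp, prod

def series(rels):
    """Series system: every unit must function."""
    return prod(rels)

def active_parallel(rels):
    """Active-parallel system: at least one unit must function."""
    return 1.0 - prod(1.0 - r for r in rels)

def m_out_of_n(m, n, r):
    """M-out-of-N (good) system of n identical units with unit reliability r."""
    return sum(comb(n, k) * r**k * (1.0 - r)**(n - k) for k in range(m, n + 1))

# Two identical units with an assumed constant failure rate lam, evaluated at time t.
lam, t = 0.5, 1.0
r = exp(-lam * t)
print(series([r, r]))           # compare with exp(-2*lam*t)
print(active_parallel([r, r]))  # compare with 2*exp(-lam*t) - exp(-2*lam*t)
print(m_out_of_n(2, 3, r))      # compare with 3*r**2 - 2*r**3
```

Note that `m_out_of_n(n, n, r)` reduces to the series case and `m_out_of_n(1, n, r)` to the active-parallel case for identical units.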
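For a system with N identical, randomly failing units with failure rate $\lambda$:
$MTTF_{series} = \int_0^\infty dt\, e^{-N\lambda t} = \frac{1}{N\lambda}$
With the substitution $y = 1 - e^{-\lambda t}$, so that $dt = dy / [\lambda(1 - y)]$,
$MTTF_{parallel} = \int_0^\infty dt\, \{1 - [1 - e^{-\lambda t}]^N\} = \frac{1}{\lambda}\int_0^1 dy\, \frac{1 - y^N}{1 - y}$
$= \frac{1}{\lambda}\int_0^1 dy \sum_{n=0}^{N-1} y^n = \frac{1}{\lambda}\sum_{n=0}^{N-1} \frac{1}{n+1} = \frac{1}{\lambda}\sum_{n=1}^{N} \frac{1}{n}$
$MTTF_{M-out-of-N} = \int_0^\infty dt \sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!}\, e^{-n\lambda t}[1 - e^{-\lambda t}]^{N-n}$
$= \sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!} \int_0^\infty dt\, e^{-n\lambda t}[1 - e^{-\lambda t}]^{N-n}$
$= \frac{1}{\lambda}\sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!} \int_0^1 dy\, (1 - y)^{n-1} y^{N-n}$
Let
$I_n = \int_0^1 dy\, (1 - y)^{n-1} y^{N-n}$.
Then, integrating by parts,
$I_n = \frac{n-1}{N-n+1}\int_0^1 dy\, (1 - y)^{n-2} y^{N-n+1} = \frac{n-1}{N-n+1}\, I_{n-1}$ with $I_1 = \int_0^1 dy\, y^{N-1} = \frac{1}{N}$
$=> I_n = \frac{(n-1)!}{N(N-1)\cdots(N-n+1)} = \frac{(n-1)!\,(N-n)!}{N!}$
$=> MTTF_{M-out-of-N} = \frac{1}{\lambda}\sum_{n=M}^{N} \frac{1}{n}$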
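As a sanity check on the closed-form result, the sketch below compares the harmonic-sum expression for the MTTF with a direct numerical integration of the M-out-of-N reliability function. The function names, the integration cutoff `t_max` and the step count are assumptions chosen for illustration.

```python
from math import comb, exp

def mttf_closed_form(m, n, lam):
    """(1/lam) * sum_{k=m}^{n} 1/k, the result of the derivation above."""
    return sum(1.0 / k for k in range(m, n + 1)) / lam

def reliability(m, n, lam, t):
    """M-out-of-N reliability for identical units with constant failure rate lam."""
    r = exp(-lam * t)
    return sum(comb(n, k) * r**k * (1.0 - r)**(n - k) for k in range(m, n + 1))

def mttf_numeric(m, n, lam, t_max=40.0, steps=20_000):
    """Trapezoidal integration of R(t) over [0, t_max]; t_max is an assumed cutoff."""
    h = t_max / steps
    total = 0.5 * (reliability(m, n, lam, 0.0) + reliability(m, n, lam, t_max))
    total += sum(reliability(m, n, lam, i * h) for i in range(1, steps))
    return total * h

lam = 1.0  # assumed failure rate
for m, n in [(3, 3), (1, 3), (2, 3)]:  # series, active-parallel, 2-out-of-3
    print(m, n, mttf_closed_form(m, n, lam), mttf_numeric(m, n, lam))
```

The case M = N recovers the series result 1/(N lam), and M = 1 recovers the active-parallel result.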
Example 1
A system consists of 7 units connected as shown in the following reliability block diagram.
Units 1 through 4 are different (with 2, 3, and 4 in active-parallel) and 3 units of
type 5 constitute a 2-out-of-3 system. If $R_i(t)$ (i = 1, 2, ..., 5) denotes the reliability
function of each unit as a function of time, find the reliability function for the system.
[Figure: reliability block diagram for Example 1 with units 1, 2, 3, 4 and three units of type 5]
Solution
$R_{234}(t) = R_2(t) + R_3(t) + R_4(t) - R_2(t)R_3(t) - R_2(t)R_4(t) - R_3(t)R_4(t) + R_2(t)R_3(t)R_4(t)$
$R_{1234}(t) = R_1(t) + R_{234}(t) - R_1(t)R_{234}(t)$
Also, since for an M-out-of-N (good) system with identical units
$R_{(M)N}(t) = \sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!} [R(t)]^n [1 - R(t)]^{N-n}$,
we have
$R_{(55)5}(t) = 3R_5^2(t)[1 - R_5(t)] + R_5^3(t) = 3R_5^2(t) - 2R_5^3(t)$
and the reliability function for the system is
$R_{sys}(t) = R_{1234}(t)\,R_{(55)5}(t)$
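The solution of Example 1 can be written as a small function of the five unit reliabilities. This is a sketch that follows the formulas of the solution as stated; the function name and the numerical inputs are illustrative assumptions.

```python
def example1_reliability(r1, r2, r3, r4, r5):
    """System reliability for Example 1 from the unit reliabilities at a given time."""
    # Units 2, 3 and 4 in active-parallel: 1 - (1-R2)(1-R3)(1-R4).
    r234 = 1.0 - (1.0 - r2) * (1.0 - r3) * (1.0 - r4)
    # Unit 1 combined with the 2-3-4 group, as in the solution.
    r1234 = r1 + r234 - r1 * r234
    # 2-out-of-3 group of identical type-5 units.
    r555 = 3.0 * r5**2 - 2.0 * r5**3
    return r1234 * r555

# Assumed unit reliabilities at some fixed time t.
print(example1_reliability(0.5, 0.5, 0.5, 0.5, 0.9))
```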
Example 2
Find R(t) and the MTTF for the system whose reliability block diagram is given below. In
calculating the MTTF, assume all components are identical and fail randomly with failure rate
$\lambda$.
[Figure: reliability block diagram for Example 2 with units 1 through 5]
Solution
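Choosing unit 4 as the "keystone element" and writing $\bar{4}$ for "unit 4 failed",
$R_{sys} = R_4 R(sys|4) + (1 - R_4) R(sys|\bar{4})$
$R(sys|4) = 1 - (1 - R_2)(1 - R_3)(1 - R_5)$
$R(sys|\bar{4}) = R_1(R_2 + R_3 - R_2R_3)$
$R_{sys} = R_4[1 - (1 - R_2)(1 - R_3)(1 - R_5)] + (1 - R_4)R_1(R_2 + R_3 - R_2R_3)$
If all components are identical and fail randomly with failure rate $\lambda$,
$R_{sys}(t) = e^{-\lambda t}[1 - (1 - e^{-\lambda t})^3] + (1 - e^{-\lambda t})e^{-\lambda t}(2e^{-\lambda t} - e^{-2\lambda t})$
$=> R_{sys}(t) = 5e^{-2\lambda t} - 6e^{-3\lambda t} + 2e^{-4\lambda t}$
$=> MTTF = \int_0^\infty dt\, R_{sys}(t) = \frac{1}{\lambda}$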
We could have obtained the same results by choosing the "keystone element" as unit 3,
i.e.
$R_{sys} = R_3 R(sys|3) + (1 - R_3) R(sys|\bar{3})$
$R(sys|3) = R_1 + R_4 - R_1R_4$
[Figure: Reliability block diagram for Example 2 with component 3 failed (units 1, 2, 4 and 5 remain)]
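$R(sys|\bar{3}) = R_2 R(sys|2\bar{3}) + (1 - R_2) R(sys|\bar{2}\bar{3})$
$= R_2(R_1 + R_4 - R_1R_4) + (1 - R_2)R_4R_5$
$=> R_{sys} = R_3(R_1 + R_4 - R_1R_4) + (1 - R_3)[R_2(R_1 + R_4 - R_1R_4) + (1 - R_2)R_4R_5]$
$= R_4[1 - (1 - R_2)(1 - R_3)(1 - R_5)] + (1 - R_4)R_1(R_2 + R_3 - R_2R_3)$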
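The equivalence of the two keystone decompositions can be spot-checked numerically. The sketch below codes the unit-4 and unit-3 expansions separately and compares them on random unit reliabilities; the function names and the random test points are illustrative assumptions.

```python
import random

def rsys_keystone_4(r1, r2, r3, r4, r5):
    """Decomposition on unit 4 (keystone element)."""
    works = 1.0 - (1.0 - r2) * (1.0 - r3) * (1.0 - r5)  # R(sys | unit 4 works)
    failed = r1 * (r2 + r3 - r2 * r3)                    # R(sys | unit 4 failed)
    return r4 * works + (1.0 - r4) * failed

def rsys_keystone_3(r1, r2, r3, r4, r5):
    """Decomposition on unit 3, then on unit 2 when unit 3 has failed."""
    r14 = r1 + r4 - r1 * r4                              # R(sys | unit 3 works)
    failed = r2 * r14 + (1.0 - r2) * r4 * r5             # R(sys | unit 3 failed)
    return r3 * r14 + (1.0 - r3) * failed

random.seed(0)
max_diff = 0.0
for _ in range(1000):
    rels = [random.random() for _ in range(5)]
    max_diff = max(max_diff, abs(rsys_keystone_4(*rels) - rsys_keystone_3(*rels)))
print(max_diff)  # expected to be at the level of floating-point round-off
```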
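Note that if the link were not present
$R_L = R_4R_5 + R_1(R_2 + R_3 - R_2R_3)(1 - R_4R_5)$
and for all components identical and failing randomly with failure rate $\lambda$
$R_L(t) = 3e^{-2\lambda t} - e^{-3\lambda t} - 2e^{-4\lambda t} + e^{-5\lambda t}$
$=> MTTF_L = \int_0^\infty dt\, R_L(t) = \frac{13}{15\lambda}$
which is smaller than the MTTF of $1/\lambda$ obtained with the link, and shows that the presence of the link improves the system reliability.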
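Because each reliability function here is a sum of terms c_k exp(-k lam t), the MTTF follows term by term from the integral of exp(-k lam t), which is 1/(k lam). A minimal sketch using exact fractions is given below; the helper name and the dictionary representation of the coefficients are assumptions made for illustration.

```python
from fractions import Fraction

def mttf_times_lambda(coeffs):
    """coeffs maps k -> c_k for R(t) = sum_k c_k * exp(-k*lam*t); returns lam * MTTF."""
    return sum(Fraction(c, k) for k, c in coeffs.items())

with_link = {2: 5, 3: -6, 4: 2}            # 5e^{-2 lam t} - 6e^{-3 lam t} + 2e^{-4 lam t}
without_link = {2: 3, 3: -1, 4: -2, 5: 1}  # 3e^{-2 lam t} - e^{-3 lam t} - 2e^{-4 lam t} + e^{-5 lam t}

print(mttf_times_lambda(with_link))     # 1
print(mttf_times_lambda(without_link))  # 13/15
```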
Failure Modes and Effects Analysis (FMEA)
FMEA was first developed by the aerospace industry in the mid-1960s. An FMEA
• describes inherent causes of events that lead to system failure,
• determines their consequences, and
• devises methods to minimize their occurrence or recurrence.
There are basically two types of FMEA:
Design FMEA is used to evaluate the failure modes and their effects for a product before
it is released to production and is normally applied at the component and subsystem levels.
Its objectives are:
• Identify failure modes and rank them according to their effect on the product performance.
• Identify design actions to eliminate potential failure modes or reduce the occurrence
of the respective failures.
• Document the rationale behind product design changes.
Process FMEA is used to analyze manufacturing and assembly processes. Its objectives
are to identify:
• failure modes that can be associated with manufacturing and assembly process deficiencies,
• highly critical process characteristics that may cause the occurrence of particular
failure modes,
• sources of manufacturing/assembly process variations.
An example of FMEA for transportation applications (using the SAE J1739 FMEA Procedure)
is given below. The design controls are:
1. Prevent the failure cause/mechanism or mode from occurring, or reduce its rate of
occurrence.
2. Detect the failure cause/mechanism and lead to corrective actions.
3. Detect the failure mode.
RPN: Risk Priority Number
Occurrence Rating Scale
Severity Rating Scale
Detection Rating Scale
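The Risk Priority Number is conventionally computed as the product of the severity, occurrence and detection ratings, each typically on a 1 to 10 scale. The sketch below ranks failure modes by that product; the failure modes and their ratings are invented for illustration and do not come from the example tables.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    severity: int    # rating from the severity scale (assumed 1-10)
    occurrence: int  # rating from the occurrence scale (assumed 1-10)
    detection: int   # rating from the detection scale (assumed 1-10)

    @property
    def rpn(self) -> int:
        """Risk Priority Number as the product of the three ratings."""
        return self.severity * self.occurrence * self.detection

# Hypothetical failure modes with hypothetical ratings.
modes = [
    FailureMode("seal leak", severity=7, occurrence=4, detection=3),
    FailureMode("connector corrosion", severity=5, occurrence=6, detection=6),
    FailureMode("bracket fracture", severity=9, occurrence=2, detection=2),
]

ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for m in ranked:
    print(m.name, m.rpn)
```

Ranking by the product alone can hide a high-severity mode behind a low occurrence rating, so severity is often also reviewed separately.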
Some limitations of FMEA:
• Limited insight into probabilistic system behavior.
• FMEA is performed for only 1 failure at a time. There may be multiple failure
modes with comparable likelihoods.
• Limited insight into the functional relationships between components.
• Time element in system operation cannot be represented.
Fault Tree/Event Tree Methodology
Fault-trees are logic diagrams that link primary or secondary faults (Basic Events) to
an undesirable event (Top Event).
Example 1
Construct a fault-tree with Top Event "Circuit breaker does not open upon demand" for
the system below:
[Figure: system consisting of Control Circuit A, Control Circuit B, Relay A, Relay B, the Trip Coil and the Circuit Breaker]
Solution
[Figure: fault tree with gates a, b, c and d]
Circuit Breaker Does Not Open (Top Event, OR gate a)
    Circuit Breaker Mechanism Fails Closed
    Voltage Present Across the Trip Coil (AND gate b)
        Relay A Contacts Stay Closed (OR gate c)
            Relay A Fails Closed
            Control Circuit A Fails On
        Relay B Contacts Stay Closed (OR gate d)
            Relay B Fails Closed
            Control Circuit B Fails On
Example 2
Construct a fault-tree with Top Event "Latch does not trip" for the system below:
[Figure: system consisting of Hydraulic Control A, Actuator A, Hydraulic Control B, Actuator B and the Linkage]
Solution
[Figure: fault tree with gates a, b, c and d]
Latch Does Not Trip (Top Event, gate a)
    Linkage Fails Extended
    Actuators Fail to Retract (gate b)
        Actuator A Fails to Retract (gate c)
            Actuator A Fails Extended
            Hydraulic Control A Fails Extended
        Actuator B Fails to Retract (gate d)
            Hydraulic Control B Fails Extended
            Actuator B Fails Extended
Example 3
For the system of Example 1 find an expression which yields the probability of Top
Event occurrence in terms of the probability of basic event occurrence.
Solution
Let
A: Circuit breaker mechanism fails closed
B: Relay A fails closed
C: Control circuit A fails on
D: Relay B fails closed
E: Control circuit B fails on
[Figure: fault tree of Example 1 with the Top Event at gate a, intermediate gates b, c and d, and basic events A, B, C, D, E]
Then
a = A + b
b = cd
c = B + C
d = D + E
which gives
a = A + b = A + cd = A + (B + C)(D + E).
From the rules of Boolean Algebra (or Event Algebra) given in Appendix B:
A + (B + C)(D + E) = A + B(D + E) + C(D + E) (Distributive Law)
= A + BD + BE + CD + CE (Distributive Law)
Each of A, BD, BE, CD, CE is called a cut set (in this case also a minimal cut set). Then
P(a) = P[A] + P[BD] + P[CD] + P[BE] + P[CE] - P[BCE]
- P[CD(BE + CE)] - P[BD(CD + BE + CE)] - P[A(BD + CD + BE + CE)]
(using the Commutative and Idempotent Laws)
= P[A] + P[BD] - P[ABD] + P[CD] - P[ACD] - P[BCD] +
P[ABCD] + P[BE] - P[ABE] + P[CE] - P[ACE] - P[BCE] +
P[ABCE] - P[BDE] + P[ABDE] - P[CDE] + P[ACDE] +
P[BCDE] - P[ABCDE]
(using the Associative, Distributive and Idempotent Laws).
It is often reasonable to assume that P[A], P[BD], P[CD], P[BE] and P[CE] are much
larger than the other probabilities (i.e. the rare event approximation), which implies that the
Top Event probability is approximately the sum of the minimal cut set probabilities, i.e.
P(a) ≈ P[A] + P[BD] + P[BE] + P[CD] + P[CE].
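To see how close the rare event approximation is, the sketch below compares it with the exact Top Event probability obtained by enumerating all 32 basic-event states. It assumes the basic events are statistically independent and uses an assumed probability of 0.01 for each; the function names are illustrative.

```python
from itertools import product

def top_event(a, b, c, d, e):
    """Boolean structure of the fault tree: a = A + (B + C)(D + E)."""
    return a or ((b or c) and (d or e))

def exact_probability(p):
    """Sum the probabilities of all basic-event states in which the Top Event occurs."""
    names = "ABCDE"
    total = 0.0
    for state in product([False, True], repeat=5):
        if top_event(*state):
            weight = 1.0
            for name, occurred in zip(names, state):
                weight *= p[name] if occurred else 1.0 - p[name]
            total += weight
    return total

def rare_event_probability(p):
    """Sum of the minimal cut set probabilities (independent basic events assumed)."""
    total = 0.0
    for cut_set in ["A", "BD", "BE", "CD", "CE"]:
        prob = 1.0
        for name in cut_set:
            prob *= p[name]
        total += prob
    return total

p = {name: 0.01 for name in "ABCDE"}  # assumed basic event probabilities
print(exact_probability(p), rare_event_probability(p))
```

The rare event sum is never smaller than the exact value, because it ignores the negative intersection terms of lowest order.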
Statistical Importance
Statistical importance is a measure of the significance of a given basic event to the Top
Event. If X is the event of interest, then one definition of statistical importance (Im) is
Im = Pr(Minimal Cut Sets Containing X) / Pr(Top Event)
Example 4
If P(A) = 0.001/demand, P(B) = P(D) = 0.001/demand and P(C) = P(E) = 0.005/demand in
Example 3, use the rare event approximation to identify the component that needs most
frequent inspection to prevent the Top Event "Circuit breaker does not open upon
demand".
Solution
This component can be identified as the one with the highest statistical importance to the
Top Event. Then using the rare event approximation from Example 3,
$Im(A) = \frac{P[A]}{P[A] + P[BD] + P[BE] + P[CD] + P[CE]} = \frac{0.001}{0.001 + (0.001)^2 + 2(0.001)(0.005) + (0.005)^2} = 0.9653$
$Im(B) = \frac{P[BD] + P[BE]}{P[A] + P[BD] + P[BE] + P[CD] + P[CE]} = \frac{(0.001)^2 + (0.001)(0.005)}{0.001 + (0.001)^2 + 2(0.001)(0.005) + (0.005)^2} = 0.0058$
$Im(C) = \frac{P[CD] + P[CE]}{P[A] + P[BD] + P[BE] + P[CD] + P[CE]} = \frac{(0.001)(0.005) + (0.005)^2}{0.001 + (0.001)^2 + 2(0.001)(0.005) + (0.005)^2} = 0.0290$
$Im(D) = \frac{P[BD] + P[CD]}{P[A] + P[BD] + P[BE] + P[CD] + P[CE]} = \frac{(0.001)^2 + (0.001)(0.005)}{0.001 + (0.001)^2 + 2(0.001)(0.005) + (0.005)^2} = 0.0058$
$Im(E) = \frac{P[BE] + P[CE]}{P[A] + P[BD] + P[BE] + P[CD] + P[CE]} = \frac{(0.001)(0.005) + (0.005)^2}{0.001 + (0.001)^2 + 2(0.001)(0.005) + (0.005)^2} = 0.0290$
The results show that the circuit breaker mechanism should be inspected most frequently.
Note that we have assumed that the events B, C, D, E are statistically independent as per
given data.
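The importance calculation of Example 4 can be reproduced with a few lines. This is a sketch under the same assumptions as the example (rare event approximation, statistically independent basic events); the function names are illustrative.

```python
from math import prod

# Basic event probabilities per demand, as given in Example 4.
p = {"A": 0.001, "B": 0.001, "C": 0.005, "D": 0.001, "E": 0.005}
cut_sets = ["A", "BD", "BE", "CD", "CE"]  # minimal cut sets of the Top Event

def cut_set_probability(cut_set, p):
    """Probability of a cut set for independent basic events."""
    return prod(p[event] for event in cut_set)

def importance(event, cut_sets, p):
    """Pr(minimal cut sets containing the event) / Pr(Top Event), rare event approximation."""
    top = sum(cut_set_probability(cs, p) for cs in cut_sets)
    containing = sum(cut_set_probability(cs, p) for cs in cut_sets if event in cs)
    return containing / top

ranking = sorted(p, key=lambda ev: importance(ev, cut_sets, p), reverse=True)
for ev in ranking:
    print(ev, round(importance(ev, cut_sets, p), 4))
```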
Event-Trees
Event-trees are used to identify the possible outcomes of a given initiating event and also
to quantify the probability of their occurrence. Event-trees are often used in conjunction
with fault-trees to quantify branch probabilities as illustrated in the example below for
fire readiness.
[Figure: event tree for fire readiness. Branch points from left to right: Initiating Event (I), Evacuation (S1 or F1), Fire Containment (S2 or F2), Fire Control (S3 or F3), where S denotes Success and F denotes Failure.]
Sequence       Sequence Probability
I S1 S2 S3     P(I)P(S1|I)P(S2|IS1)P(S3|IS1S2)
I S1 S2 F3     P(I)P(S1|I)P(S2|IS1)P(F3|IS1S2)
I S1 F2 S3     P(I)P(S1|I)P(F2|IS1)P(S3|IS1F2)
I S1 F2 F3     P(I)P(S1|I)P(F2|IS1)P(F3|IS1F2)
I F1 S2 S3     P(I)P(F1|I)P(S2|IF1)P(S3|IF1S2)
I F1 S2 F3     P(I)P(F1|I)P(S2|IF1)P(F3|IF1S2)
I F1 F2 S3     P(I)P(F1|I)P(F2|IF1)P(S3|IF1F2)
I F1 F2 F3     P(I)P(F1|I)P(F2|IF1)P(F3|IF1F2)
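Each sequence probability is the product of the initiating event probability and the conditional branch probabilities along the path. The sketch below enumerates the eight sequences for three branch points; all numerical values are hypothetical, and the table layout (conditional success probability keyed by the outcomes of the earlier branch points) is an assumption made for illustration.

```python
from itertools import product

p_initiating = 1e-3  # hypothetical P(I)

# Hypothetical conditional success probabilities, keyed by the earlier outcomes.
cond_success = [
    {(): 0.95},                                       # P(S1 | I)
    {("S",): 0.90, ("F",): 0.80},                     # P(S2 | I, outcome 1)
    {("S", "S"): 0.99, ("S", "F"): 0.70,
     ("F", "S"): 0.95, ("F", "F"): 0.50},             # P(S3 | I, outcomes 1 and 2)
]

def sequence_probabilities(p_init, tables):
    """Map each outcome string such as 'SFS' to its sequence probability."""
    result = {}
    for outcome in product("SF", repeat=len(tables)):
        prob = p_init
        for stage, table in enumerate(tables):
            p_s = table[outcome[:stage]]
            prob *= p_s if outcome[stage] == "S" else 1.0 - p_s
        result["".join(outcome)] = prob
    return result

sequences = sequence_probabilities(p_initiating, cond_success)
for name, prob in sequences.items():
    print(name, prob)
```

Since the branches at each point are exhaustive, the eight sequence probabilities add up to P(I).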
Root Cause Analysis
Root causes are the most basic causes that can be reasonably identified by experts and
can be corrected so as to minimize their recurrence. Several structured techniques are
used for root cause analysis, including change analysis, barrier analysis, events and causal
factors analysis, tree diagrams, management oversight and risk tree analysis (MORT) and
fishbone diagrams. Some other less structured approaches are process control charts,
trend analyses and Pareto diagrams. Root cause analysis consists of three steps:
1. Determine events and causal factors
2. Code and document root causes
3. Generate recommendations
An example using the tree approach to Step 1 is given below. Step 2 consists of following
each path to the top event to determine its relevance for the particular incident (e.g. by
asking "if not?"). Once root causes are identified, corrective and preventive recommendations
are made. For more information on root cause analysis see Ref.[6].
[Figure: tree diagram with top event "Aerosol Inhalation While Spray Painting", branches Personnel, Procedures, and Material or Equipment, and causal factors: poor work practices, inattention, lack of supervision, no written procedures, verbal instructions unclear, work procedure inadequate, defective mask, inadequate ventilation]
Statistically Dependent Failures
Statistically dependent failures are defined as events in which the probability of each failure
is dependent on the occurrence of other failures. In general, statistically dependent
failures are handled using Markov models, which we will discuss in Dynamic Methods.
However, in systems with redundant identical components static techniques may be used.
We will illustrate the $\beta$-factor method for a 2-component parallel system. For generalization
of the $\beta$-factor method and other methods see Ref.[2].
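Consider a 2-component parallel system where each component can individually fail
with rate $\lambda_R$ or fail due to a common cause (e.g. loss of power) with rate $\lambda_C$. Then the
reliability function for the system is
$R(t) = e^{-\lambda_C t}\left[1 - \left(1 - e^{-\lambda_R t}\right)^2\right] = e^{-\lambda_C t}\left[2e^{-\lambda_R t} - e^{-2\lambda_R t}\right]$
Let
$\beta = \frac{\lambda_C}{\lambda_C + \lambda_R} \equiv \frac{\lambda_C}{\lambda}$.
Then $\lambda_C = \beta\lambda$ and $\lambda_R = (1 - \beta)\lambda$ and
$R(t) = e^{-\beta\lambda t}\left[2e^{-(1-\beta)\lambda t} - e^{-2(1-\beta)\lambda t}\right] = 2e^{-\lambda t} - e^{-(2-\beta)\lambda t} = e^{-\lambda t}\left[2 - e^{-(1-\beta)\lambda t}\right]$.
The $\beta$-factor method assumes that $\lambda t$ is small enough that
$e^{-\lambda t} \approx 1 - \lambda t$
and
$e^{-(1-\beta)\lambda t} \approx 1 - (1 - \beta)\lambda t$.
Then
$R(t) \approx 1 - \beta\lambda t - (1 - \beta)(\lambda t)^2$
or
$F(t) = 1 - R(t) \approx \beta\lambda t + (1 - \beta)(\lambda t)^2$.
Note that since $\beta$ can be interpreted as the probability that a component failure is due to
the common cause event, the first term gives the probability of system failure due to the
common cause event and the second term gives the probability of system failure due to
the non-common cause failure of the components.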
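A quick numerical comparison shows how the small-time approximation behaves relative to the exact expression. In this sketch `lam` denotes the total component failure rate and `beta` the common cause fraction; the values of `beta` and of lam*t are assumptions chosen for illustration.

```python
from math import exp

def unreliability_exact(lam, beta, t):
    """1 - R(t) for the 2-component parallel system with common cause failures."""
    lam_c = beta * lam          # common cause rate
    lam_r = (1.0 - beta) * lam  # individual (independent) failure rate
    r = exp(-lam_c * t) * (1.0 - (1.0 - exp(-lam_r * t)) ** 2)
    return 1.0 - r

def unreliability_approx(lam, beta, t):
    """Small-time approximation: beta*lam*t + (1 - beta)*(lam*t)**2."""
    x = lam * t
    return beta * x + (1.0 - beta) * x ** 2

lam, beta = 1.0, 0.1  # assumed values
for t in (0.001, 0.01, 0.1):
    print(t, unreliability_exact(lam, beta, t), unreliability_approx(lam, beta, t))
```

The two expressions agree closely for small lam*t and drift apart as lam*t grows, which is the regime the approximation is not meant for.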
New Static Methods
While the fault-tree/event-tree approach is perhaps the most commonly used technique
for system reliability modeling, construction of fault-trees is difficult when the system
operation involves control loop action. Some alternative techniques that have been
proposed include influence diagrams, directed graphs (digraphs) and the GO-FLOW
methodology. Since the digraph approach can be used to simplify fault-tree construction,
we will illustrate this technique through a simple example. For influence diagrams see
Ref.[7] and for the GO-FLOW methodology see the Supplementary Material under
Course Notes on the web.
Consider the pressure tank system shown below. The switch is normally closed and the
motor drives a pump which feeds air into the tank. The air is discharged through the dis-
charge valve at periodic intervals. A timer set to these intervals opens the contacts before
an overpressure condition occurs and pumping stops. If the timer contacts fail-closed, the
operator observes from the pressure gauge that the tank pressure is high and manually
opens the switch. There are 2 control loops:
Loop 1: Tank, pressure gauge, operator, switch.
Loop 2: Tank, relief valve.
[Figures: Pressure Tank System; Digraph for the Pressure Tank System]
The digraph is a tool to describe the cause-effect relationships between system components
and variables. A digraph consists of nodes, which represent the system variables
and components, and edges, which connect the nodes. The numbers represent the direction
and the qualitative magnitude of the gains between the variables. The gains multiply
at the nodes. For example, if tank pressure $P_{tank}$ increases, the gauge pressure $P_{gauge}$
increases (+1 into $P_{gauge}$), which alerts the Operator (+1), who then opens the switch (+1)
and reduces the current $I_{switch}$ to the switch (-1 × +1 = -1). With the switch open, current
$I_{pump}$ to the pump motor decreases (-1 × +1 = -1) and tank pressure stops increasing
(-1 × +1 = -1). Subsequently, if everything works as designed, an increase in $P_{tank}$ leads to a
decrease through the action of the feedback loop. The fault tree is constructed by considering
the events that cause the loops to lead to the top event.
[Figure: fault tree with Top Event "Pressure Tank Rupture", involving TFC, "Zero Gain through Loop 1" (OF, SFC, GS) and "Zero Gain through Loop 2" (RVS)]
TFC: Timer Contacts Fail Closed
GS: Gauge Stuck
RVS: Relief Valve Stuck
OF: Operator Fails
SFC: Switch Fails Closed
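The idea that gains multiply around a loop, and that a component failure produces zero gain through the loop, can be sketched in a few lines. The edge list below is one reading of the prose description of Loop 1, not a transcription of the digraph figure, and the assignment of each failure to a particular edge is an assumption.

```python
from math import prod

# (edge, gain, failure that zeroes the edge or None); assumed reading of Loop 1.
loop1 = [
    ("P_tank -> P_gauge", +1, "GS"),    # gauge stuck: pressure rise is not displayed
    ("P_gauge -> Operator", +1, "OF"),  # operator fails: no response to the gauge
    ("Operator -> Switch", +1, "SFC"),  # switch fails closed: it cannot be opened
    ("Switch -> I_switch", -1, None),   # opening the switch reduces the current
    ("I_switch -> I_pump", +1, None),
    ("I_pump -> P_tank", +1, None),
]

def loop_gain(edges, failed=()):
    """Product of the edge gains; an edge whose failure has occurred contributes 0."""
    return prod(0 if fault in failed else gain for _, gain, fault in edges)

print(loop_gain(loop1))                 # negative gain: the loop opposes the pressure rise
print(loop_gain(loop1, failed={"GS"}))  # zero gain: the feedback is lost
```

In this picture each of GS, OF and SFC alone is enough to give zero gain through Loop 1, which matches the events listed under that branch of the fault tree.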