Reliability Engineering Lec Notes #6


  • 7/30/2019 Reliability Engineering Lec Notes #6


    SYSTEM RELIABILITY MODELING AND PREDICTION -

    STATIC METHODS

    Reliability Block Diagrams

    A reliability block diagram is a graphical procedure which describes the system

    operation in terms of successful "signal" transmission between the system units.

    [Figure: reliability block diagrams for a two-unit series system and a two-unit active-parallel system]

    Consider a system which consists of two units both of which must function for the system

    to function (series system). Assume component failures are statistically independent and

    let

    A1 : Unit#1 functions at time t; P(A1) = R1(t)

    A2 : Unit#2 functions at time t; P(A2) = R2(t)

    A : system functions at time t; P(A) = R(t)

    Then

    P(A) = P(A1A2) = P(A1)P(A2)

    => R(t) = R1(t)R2(t)

    If the system functions when either Unit#1 or Unit#2 functions (active-parallel system),
    then

    P(A) = P(A1 + A2) = P(A1) + P(A2) - P(A1)P(A2)

    => R(t) = R1(t) + R2(t) - R1(t)R2(t)

    If the units are identical with constant failure rate $\lambda$,

    $$R_{series}(t) = e^{-2\lambda t}$$


    $$R_{parallel}(t) = 2e^{-\lambda t} - e^{-2\lambda t}$$

    [Figure: R(t) and F(t) = 1 - R(t) plotted against $\lambda t$ from 0 to 2 for the two-unit series system, $R(t) = e^{-2\lambda t}$; the two-unit active-parallel system, $R(t) = 2e^{-\lambda t} - e^{-2\lambda t}$; and a single unit, $R(t) = e^{-\lambda t}$]

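    In general,

    $$R(t) = \prod_{n=1}^{N} R_n(t)$$ for an N-unit series system,

    $$R(t) = 1 - \prod_{n=1}^{N} [1 - R_n(t)]$$ for an N-unit parallel (active) system,

    $$R(t) = \sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!}\,[R(t)]^n [1 - R(t)]^{N-n}$$ for an M-out-of-N (good) system
    with N identical units, where R(t) on the right-hand side is the reliability of a single unit.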
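The three general formulas above translate directly into code. The following is a minimal sketch, assuming statistically independent units as in the text; the function names are illustrative and not part of the notes.

```python
from math import comb, prod


def series_reliability(unit_rels):
    """Series system: every unit must function."""
    return prod(unit_rels)


def parallel_reliability(unit_rels):
    """Active-parallel system: at least one unit must function."""
    return 1.0 - prod(1.0 - r for r in unit_rels)


def m_out_of_n_reliability(m, n, r):
    """At least m of n identical units, each with reliability r, must function."""
    return sum(comb(n, k) * r**k * (1.0 - r) ** (n - k) for k in range(m, n + 1))


# Two identical units with R = 0.9 each, then a 2-out-of-3 arrangement
print(series_reliability([0.9, 0.9]))
print(parallel_reliability([0.9, 0.9]))
print(m_out_of_n_reliability(2, 3, 0.9))
```

Note that the M-out-of-N expression reduces to the parallel case for M = 1 and to the series case for M = N, which is a quick consistency check on any implementation.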


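    For a system with N identical, randomly failing units (constant failure rate $\lambda$),

    $$MTTF_{series} = \int_0^\infty dt\, e^{-N\lambda t} = \frac{1}{N\lambda}$$

    and, with the substitution $y = 1 - e^{-\lambda t}$ (so that $dt = dy/[\lambda(1-y)]$),

    $$MTTF_{parallel} = \int_0^\infty dt\, \{1 - [1 - e^{-\lambda t}]^N\} = \frac{1}{\lambda}\int_0^1 dy\, \frac{1 - y^N}{1 - y}$$

    $$= \frac{1}{\lambda}\int_0^1 dy \sum_{n=0}^{N-1} y^n = \frac{1}{\lambda}\sum_{n=0}^{N-1} \frac{1}{n+1} = \frac{1}{\lambda}\sum_{n=1}^{N} \frac{1}{n}$$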

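    $$MTTF_{M\text{-out-of-}N} = \int_0^\infty dt \sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!}\, e^{-n\lambda t}[1 - e^{-\lambda t}]^{N-n}$$

    $$= \sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!} \int_0^\infty dt\, e^{-n\lambda t}[1 - e^{-\lambda t}]^{N-n}$$

    $$= \frac{1}{\lambda}\sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!} \int_0^1 dy\, (1-y)^{n-1} y^{N-n}$$

    Let

    $$I_n = \int_0^1 dy\, (1-y)^{n-1} y^{N-n}.$$

    Then, integrating by parts,

    $$I_n = \frac{n-1}{N-n+1}\int_0^1 dy\, (1-y)^{n-2} y^{N-n+1} = \frac{n-1}{N-n+1}\, I_{n-1} \quad \text{with} \quad I_1 = \int_0^1 dy\, y^{N-1} = \frac{1}{N}$$

    $$\Rightarrow I_n = \frac{(n-1)!}{N(N-1)\cdots(N-n+1)} = \frac{(n-1)!\,(N-n)!}{N!}$$

    $$\Rightarrow MTTF_{M\text{-out-of-}N} = \frac{1}{\lambda}\sum_{n=M}^{N} \frac{1}{n}$$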
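The closed-form result can be checked numerically. The sketch below, which is an illustration and not part of the notes, compares the harmonic-sum formula with a trapezoidal integration of R(t) for an M-out-of-N system of identical units; the integration cutoff and step count are arbitrary choices.

```python
from math import comb, exp


def mttf_m_out_of_n(m, n, lam):
    """Closed form derived above: (1/lambda) * sum_{k=m}^{n} 1/k."""
    return sum(1.0 / k for k in range(m, n + 1)) / lam


def reliability_m_out_of_n(m, n, lam, t):
    """R(t) for an M-out-of-N system of identical units with failure rate lam."""
    r = exp(-lam * t)
    return sum(comb(n, k) * r**k * (1.0 - r) ** (n - k) for k in range(m, n + 1))


def mttf_numeric(m, n, lam, cutoff=40.0, steps=20_000):
    """Trapezoidal estimate of the integral of R(t) from 0 to cutoff/lam."""
    t_max = cutoff / lam
    h = t_max / steps
    total = 0.5 * (
        reliability_m_out_of_n(m, n, lam, 0.0)
        + reliability_m_out_of_n(m, n, lam, t_max)
    )
    for i in range(1, steps):
        total += reliability_m_out_of_n(m, n, lam, i * h)
    return total * h


print(mttf_m_out_of_n(2, 3, 0.5), mttf_numeric(2, 3, 0.5))
```

Setting M = N recovers the series result 1/(N lambda), and M = 1 recovers the parallel result.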


    Example 1

    A system consists of 7 units connected as shown in the following reliability block
    diagram. Units 1 through 4 are different (with 2, 3, and 4 in active-parallel) and 3 units of
    type 5 constitute a 2-out-of-3 system. If $R_i(t)$ (i = 1, 2, . . . , 5) denotes the reliability
    function of each unit as a function of time, find the reliability function for the system.

    [Figure: reliability block diagram for Example 1 with units 1, 2, 3, 4 and three units of type 5]

    Solution

    $$R_{234}(t) = R_2(t) + R_3(t) + R_4(t) - R_2(t)R_3(t) - R_2(t)R_4(t) - R_3(t)R_4(t) + R_2(t)R_3(t)R_4(t)$$

    $$R_{1234}(t) = R_1(t) + R_{234}(t) - R_1(t)R_{234}(t)$$

    Also, since for an M-out-of-N (good) system with identical units

    $$R^{(M)}_N(t) = \sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!}\,[R(t)]^n [1 - R(t)]^{N-n},$$

    we have

    $$R_{(55)5}(t) = 3R_5^2(t)[1 - R_5(t)] + R_5^3(t) = 3R_5^2(t) - 2R_5^3(t)$$

    and the reliability function for the system is

    $$R_{sys}(t) = R_{1234}(t)\,R_{(55)5}(t)$$
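As an illustration of the solution to Example 1, the sketch below evaluates the system reliability at one instant from the five unit reliabilities. The function name and the numerical values are assumptions made for the example; the combination rules are those of the solution.

```python
def example1_system_reliability(r1, r2, r3, r4, r5):
    """System reliability for Example 1 at a fixed time, given unit reliabilities."""
    # Units 2, 3 and 4 in active-parallel
    r234 = 1.0 - (1.0 - r2) * (1.0 - r3) * (1.0 - r4)
    # Unit 1 combined with the 2-3-4 group as in the solution
    r1234 = r1 + r234 - r1 * r234
    # 2-out-of-3 group of identical type-5 units
    r555 = 3.0 * r5**2 - 2.0 * r5**3
    return r1234 * r555


print(example1_system_reliability(0.9, 0.9, 0.9, 0.9, 0.9))
```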


    Example 2

    Find R(t) and the MTTF for the system whose reliability block diagram is given below. In
    calculating the MTTF, assume all components are identical and fail randomly with failure rate
    $\lambda$.

    [Figure: reliability block diagram for Example 2 with units 1, 2, 3, 4 and 5]

    Solution

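    Conditioning on the state of unit 4, with $\bar{R}_4 = 1 - R_4$ and $\bar{4}$ denoting unit 4 failed,

    $$R_{sys} = R_4\,R(sys|4) + \bar{R}_4\,R(sys|\bar{4})$$

    $$R(sys|4) = 1 - (1 - R_2)(1 - R_3)(1 - R_5)$$

    $$R(sys|\bar{4}) = R_1(R_2 + R_3 - R_2R_3)$$

    $$R_{sys} = R_4[1 - (1 - R_2)(1 - R_3)(1 - R_5)] + (1 - R_4)R_1(R_2 + R_3 - R_2R_3)$$

    If all components are identical and fail randomly with failure rate $\lambda$,

    $$R_{sys}(t) = e^{-\lambda t}[1 - (1 - e^{-\lambda t})^3] + (1 - e^{-\lambda t})\,e^{-\lambda t}(2e^{-\lambda t} - e^{-2\lambda t})$$

    $$\Rightarrow R_{sys}(t) = 5e^{-2\lambda t} - 6e^{-3\lambda t} + 2e^{-4\lambda t}$$

    $$\Rightarrow MTTF = \int_0^\infty dt\, R_{sys}(t) = \frac{1}{\lambda}$$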

    We could have obtained the same results by choosing the "keystone element" as unit 3,
    i.e.

    $$R_{sys} = R_3\,R(sys|3) + \bar{R}_3\,R(sys|\bar{3})$$

    $$R(sys|3) = R_1 + R_4 - R_1R_4$$


    [Figure: reliability block diagram for Example 2 with component 3 failed (units 1, 2, 4 and 5 remain)]

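    $$R(sys|\bar{3}) = R_2\,R(sys|2\bar{3}) + \bar{R}_2\,R(sys|\bar{2}\bar{3})$$

    $$= R_2(R_1 + R_4 - R_1R_4) + (1 - R_2)R_4R_5$$

    $$\Rightarrow R_{sys} = R_3(R_1 + R_4 - R_1R_4) + (1 - R_3)[R_2(R_1 + R_4 - R_1R_4) + (1 - R_2)R_4R_5]$$

    $$= R_4[1 - (1 - R_2)(1 - R_3)(1 - R_5)] + (1 - R_4)R_1(R_2 + R_3 - R_2R_3)$$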

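    Note that if the link were not present

    $$R_L = R_4R_5 + R_1(R_2 + R_3 - R_2R_3)(1 - R_4R_5)$$

    and for all components identical and failing randomly with failure rate $\lambda$

    $$R_L(t) = 3e^{-2\lambda t} - e^{-3\lambda t} - 2e^{-4\lambda t} + e^{-5\lambda t}$$

    $$\Rightarrow MTTF_L = \int_0^\infty dt\, R_L(t) = \frac{13}{15\lambda}$$

    which is smaller than the MTTF of $1/\lambda$ obtained with the link, showing that the presence
    of the link improves the system reliability.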
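The results of Example 2 can be cross-checked by brute force. The sketch below is an illustration rather than the method of the notes: it enumerates all 2^5 unit states using structure functions read off from the decomposition above (an assumption about the diagram, which is not reproduced here), and integrates each exponential term exactly to get the MTTF in units of 1/lambda.

```python
from fractions import Fraction
from itertools import product


def works_with_link(s1, s2, s3, s4, s5):
    """Unit 4 up: any of 2, 3, 5 suffices. Unit 4 down: unit 1 and (2 or 3)."""
    return (s4 and (s2 or s3 or s5)) or ((not s4) and s1 and (s2 or s3))


def works_without_link(s1, s2, s3, s4, s5):
    """Without the link: path 4-5, or unit 1 followed by (2 or 3)."""
    return (s4 and s5) or (s1 and (s2 or s3))


def reliability(structure, r):
    """Sum the probabilities of all working states for identical units."""
    total = 0.0
    for state in product([True, False], repeat=5):
        if structure(*state):
            up = sum(state)
            total += r**up * (1.0 - r) ** (5 - up)
    return total


def mttf_in_units_of_inverse_lambda(coeffs):
    """For R(t) = sum_k c_k exp(-k lambda t), the integral is sum_k c_k / k."""
    return sum(Fraction(c, k) for k, c in coeffs.items())


WITH_LINK = {2: 5, 3: -6, 4: 2}            # 5e^-2x - 6e^-3x + 2e^-4x
WITHOUT_LINK = {2: 3, 3: -1, 4: -2, 5: 1}  # 3e^-2x - e^-3x - 2e^-4x + e^-5x

print(mttf_in_units_of_inverse_lambda(WITH_LINK))
print(mttf_in_units_of_inverse_lambda(WITHOUT_LINK))
```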

    Failure Modes and Effects Analysis (FMEA)

    The FMEA was first developed by the aerospace industry in the mid-1960s. An FMEA

    • describes inherent causes of events that lead to system failure,
    • determines their consequences, and
    • devises methods to minimize their occurrence or recurrence.

    There are basically two types of FMEA:

    Design FMEA is used to evaluate the failure modes and their effects for a product before
    it is released to production and is normally applied at the component and subsystem
    levels. Its objectives are:

    • Identify failure modes and rank them according to their effect on the product
      performance.
    • Identify design actions to eliminate potential failure modes or reduce the occurrence
      of the respective failures.
    • Document the rationale behind product design changes.

    Process FMEA is used to analyze manufacturing and assembly processes. Its objectives
    are to identify:

    • failure modes that can be associated with manufacturing and assembly process
      deficiencies,
    • highly critical process characteristics that may cause the occurrence of particular
      failure modes,
    • sources of manufacturing/assembly process variations.

    An example of FMEA for transportation applications (using the SAE J1739 FMEA Procedure)
    is given below. The design controls are:

    1. Prevent the failure cause/mechanism or mode from occurring, or reduce its rate of
       occurrence.
    2. Detect the failure cause/mechanism and lead to corrective actions.
    3. Detect the failure mode.


    RPN: Risk Priority Number

    Occurrence Rating Scale


    Severity Rating Scale


    Detection Rating Scale
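The occurrence, severity and detection ratings are combined into the Risk Priority Number. The sketch below assumes the conventional definition, RPN = severity x occurrence x detection, with each rating on a 1 to 10 scale; the failure modes and ratings shown are hypothetical and are not taken from the worksheet in these notes.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # assumed 1 (minor) to 10 (hazardous)
    occurrence: int  # assumed 1 (remote) to 10 (almost certain)
    detection: int   # assumed 1 (almost certain to detect) to 10 (undetectable)

    def rpn(self) -> int:
        """Risk Priority Number under the assumed product definition."""
        for rating in (self.severity, self.occurrence, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("ratings are assumed to lie between 1 and 10")
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn(), reverse=True)


# Hypothetical entries for illustration only
modes = [
    FailureMode("seal leak", severity=8, occurrence=3, detection=4),
    FailureMode("connector corrosion", severity=5, occurrence=4, detection=6),
]
for mode in rank_by_rpn(modes):
    print(mode.name, mode.rpn())
```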

    Some limitations of FMEA:

    • Limited insight into probabilistic system behavior.
    • FMEA is performed for only one failure at a time. There may be multiple failure
      modes with comparable likelihoods.
    • Limited insight into the functional relationships between components.
    • Time element in system operation cannot be represented.


    Fault Tree/Event Tree Methodology

    Fault-trees are logic diagrams that link primary or secondary faults (Basic Events) to

    an undesirable event (Top Event).

    Example 1

    Construct a fault-tree with Top Event "Circuit breaker does not open upon demand" for

    the system below:

    [Figure: circuit breaker trip system consisting of Control Circuit A, Control Circuit B, Relay A, Relay B, the Trip Coil and the Circuit Breaker]

    Solution

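    [Figure: fault tree for Example 1, summarized below with gates a and c, d as OR gates and gate b as an AND gate]

    Circuit Breaker Does Not Open (gate a)
      Circuit Breaker Mechanism Fails Closed
      Voltage Present Across the Trip Coil (gate b)
        Relay A Contacts Stay Closed (gate c)
          Relay A Fails Closed
          Control Circuit A Fails On
        Relay B Contacts Stay Closed (gate d)
          Relay B Fails Closed
          Control Circuit B Fails On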


    Example 2

    Construct a fault-tree with Top Event "Latch does not trip" for the system below:

    [Figure: latch system consisting of Hydraulic Control A, Actuator A, Hydraulic Control B, Actuator B and the Linkage]

    Solution

    [Figure: fault tree for Example 2 with gates a, b, c and d, summarized below]

    Latch Does Not Trip (gate a)
      Linkage Fails Extended
      Actuators Fail to Retract (gate b)
        Actuator A Fails to Retract (gate c)
          Actuator A Fails Extended
          Hydraulic Control A Fails Extended
        Actuator B Fails to Retract (gate d)
          Actuator B Fails Extended
          Hydraulic Control B Fails Extended


    Example 3

    For the system of Example 1 find an expression which yields the probability of Top

    Event occurrence in terms of the probability of basic event occurrence.

    Solution

    Let

    A: Circuit breaker mechanism fails closed

    B: Relay A fails closed

    C: Control circuit A fails on

    D: Relay B fails closed

    E: Control circuit B fails on

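    [Figure: fault tree for Example 1 with the Top Event at gate a, basic event A and gate b below it, gates c and d below gate b, basic events B and C below gate c, and basic events D and E below gate d]

    Then

    a = A + b,   b = cd,   c = B + C,   d = D + E

    which gives

    a = A + b = A + cd = A + (B + C)(D + E).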


    From the rules of Boolean Algebra (or Event Algebra) given in Appendix B:

    A + (B + C)(D + E) = A + B(D + E) + C(D + E)    (Distributive Law)

    = A + BD + BE + CD + CE    (Distributive Law)

    Each of A, BD, BE, CD, CE is called a cut set (in this case also a minimal cut set). Then

    P(a) = P[A] + P[BD] + P[CD] + P[BE] + P[CE] - P[BCE]
           - P[CD(BE + CE)] - P[BD(CD + BE + CE)] - P[A(BD + CD + BE + CE)]

    (using the Commutative and Idempotent Laws)

    = P[A] + P[BD] - P[ABD] + P[CD] - P[ACD] - P[BCD] +
      P[ABCD] + P[BE] - P[ABE] + P[CE] - P[ACE] - P[BCE] +
      P[ABCE] - P[BDE] + P[ABDE] - P[CDE] + P[ACDE] +
      P[BCDE] - P[ABCDE]

    (using the Associative, Distributive and Idempotent Laws).

    It is often reasonable to assume that P[A], P[BD], P[CD], P[BE] and P[CE] are much
    larger than the other probabilities (i.e. the rare event approximation), which implies that the
    Top Event probability is approximately the sum of the minimal cut set probabilities, i.e.

    P(a) ≈ P[A] + P[BD] + P[BE] + P[CD] + P[CE].
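To see how close the rare event approximation is, the sketch below compares it with the exact Top Event probability obtained by enumerating all combinations of the basic events. It assumes, as an added assumption for this illustration, that A through E are all statistically independent; the probabilities used are the per-demand values quoted in Example 4.

```python
from itertools import product

NAMES = "ABCDE"
MINIMAL_CUT_SETS = ["A", "BD", "BE", "CD", "CE"]


def top_event(a, b, c, d, e):
    """Structure from Example 3: a = A + (B + C)(D + E)."""
    return a or ((b or c) and (d or e))


def exact_top_probability(p):
    """Sum the probability of every basic-event combination that triggers the Top Event."""
    total = 0.0
    for state in product([True, False], repeat=len(NAMES)):
        if top_event(*state):
            weight = 1.0
            for name, occurred in zip(NAMES, state):
                weight *= p[name] if occurred else 1.0 - p[name]
            total += weight
    return total


def rare_event_top_probability(p):
    """Sum of minimal cut set probabilities (independent basic events assumed)."""
    total = 0.0
    for cut_set in MINIMAL_CUT_SETS:
        term = 1.0
        for name in cut_set:
            term *= p[name]
        total += term
    return total


p = {"A": 0.001, "B": 0.001, "C": 0.005, "D": 0.001, "E": 0.005}
print(exact_top_probability(p), rare_event_top_probability(p))
```

The approximation drops only negative higher-order terms in the leading correction, so it slightly overestimates the exact value when the basic event probabilities are small.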

    Statistical Importance

    Statistical importance is a measure of the significance of a given basic event to the Top
    Event. If X is the event of interest, then one definition of statistical importance (Im) is

    Im = Pr(Minimal Cut Sets Containing X) / Pr(Top Event)

    Example 4

    If P(A) = 0.001/demand, P(B) = P(D) = 0.001/demand and P(C) = P(E) = 0.005/demand in
    Example 3, use the rare event approximation to identify the component that needs the most
    frequent inspection to prevent the Top Event "Circuit breaker does not open upon
    demand".

    Solution

    This component can be identified as the one with the highest statistical importance to the
    Top Event. Using the rare event approximation from Example 3, every importance has the
    same denominator:

    P[A] + P[BD] + P[BE] + P[CD] + P[CE] = 0.001 + (0.001)^2 + 2(0.001)(0.005) + (0.005)^2 = 0.001036

    Then

    Im(A) = P(A) / 0.001036 = 0.001 / 0.001036 = 0.9653

    Im(B) = [P(BD) + P(BE)] / 0.001036 = [(0.001)^2 + (0.001)(0.005)] / 0.001036 = 0.0058

    Im(C) = [P(CD) + P(CE)] / 0.001036 = [(0.001)(0.005) + (0.005)^2] / 0.001036 = 0.0290

    Im(D) = [P(BD) + P(CD)] / 0.001036 = [(0.001)^2 + (0.001)(0.005)] / 0.001036 = 0.0058

    Im(E) = [P(BE) + P(CE)] / 0.001036 = [(0.001)(0.005) + (0.005)^2] / 0.001036 = 0.0290

    The results show that the circuit breaker mechanism should be inspected most frequently.
    Note that we have assumed that the events B, C, D, E are statistically independent as per
    given data.
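The importance calculation of Example 4 can be reproduced in a few lines. This is a minimal sketch under the same assumptions as the solution (rare event approximation, independent basic events); the function and variable names are illustrative.

```python
MINIMAL_CUT_SETS = ["A", "BD", "BE", "CD", "CE"]


def cut_set_probability(cut_set, p):
    """Probability of a cut set as the product of its basic event probabilities."""
    prob = 1.0
    for name in cut_set:
        prob *= p[name]
    return prob


def statistical_importance(event, p):
    """Pr(minimal cut sets containing the event) / Pr(Top Event), rare event approximation."""
    top = sum(cut_set_probability(cs, p) for cs in MINIMAL_CUT_SETS)
    containing = sum(
        cut_set_probability(cs, p) for cs in MINIMAL_CUT_SETS if event in cs
    )
    return containing / top


p = {"A": 0.001, "B": 0.001, "C": 0.005, "D": 0.001, "E": 0.005}
for event in "ABCDE":
    print(event, round(statistical_importance(event, p), 4))
```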


    Event-Trees

    Event-trees are used to identify the possible outcomes of a given initiating event and also
    to quantify the probability of their occurrence. Event-trees are often used in conjunction
    with fault-trees to quantify branch probabilities as illustrated in the example below for
    fire readiness.

    [Figure: event tree for fire readiness, summarized below; S denotes Success and F denotes Failure]

    Initiating   Evacuation   Fire          Fire      Sequence
    Event                     Containment   Control   Probability

    I            S1           S2            S3        P(I)P(S1|I)P(S2|IS1)P(S3|IS1S2)
    I            S1           S2            F3        P(I)P(S1|I)P(S2|IS1)P(F3|IS1S2)
    I            S1           F2            S3        P(I)P(S1|I)P(F2|IS1)P(S3|IS1F2)
    I            S1           F2            F3        P(I)P(S1|I)P(F2|IS1)P(F3|IS1F2)
    I            F1           S2            S3        P(I)P(F1|I)P(S2|IF1)P(S3|IF1S2)
    I            F1           S2            F3        P(I)P(F1|I)P(S2|IF1)P(F3|IF1S2)
    I            F1           F2            S3        P(I)P(F1|I)P(F2|IF1)P(S3|IF1F2)
    I            F1           F2            F3        P(I)P(F1|I)P(F2|IF1)P(F3|IF1F2)
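The sequence probabilities of the event tree are products of conditional branch probabilities. The sketch below enumerates the eight sequences for the three stages of the fire readiness example; the initiating event probability and the conditional success probabilities are hypothetical numbers chosen only to make the example run.

```python
from itertools import product

STAGES = ("evacuation", "containment", "control")


def sequence_probabilities(p_init, p_success):
    """Return {sequence: probability}; p_success(stage, history) gives the
    conditional success probability of a stage given the earlier outcomes."""
    result = {}
    for outcomes in product("SF", repeat=len(STAGES)):
        prob = p_init
        for i, outcome in enumerate(outcomes):
            ps = p_success(STAGES[i], outcomes[:i])
            prob *= ps if outcome == "S" else 1.0 - ps
        result["".join(outcomes)] = prob
    return result


def example_p_success(stage, history):
    """Hypothetical model: each earlier failure halves the chance of success."""
    base = {"evacuation": 0.95, "containment": 0.90, "control": 0.80}[stage]
    return base * 0.5 ** history.count("F")


probs = sequence_probabilities(0.01, example_p_success)
for sequence, prob in probs.items():
    print(sequence, prob)
```

Because each branch point splits into success and failure, the eight sequence probabilities must add up to P(I), which is a useful check on the branch data.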


    Root Cause Analysis

    Root causes are the most basic causes that can be reasonably identified by experts and
    can be corrected so as to minimize their recurrence. Several structured techniques are
    used for root cause analysis, including change analysis, barrier analysis, events and causal
    factors analysis, tree diagrams, management oversight and risk tree analysis (MORT) and
    fishbone diagrams. Some other less structured approaches are process control charts,
    trend analyses and Pareto diagrams. Root cause analysis consists of three steps:

    1. Determine events and causal factors
    2. Code and document root causes
    3. Generate recommendations

    An example using the tree approach to Step 1 is given below. Step 2 consists of following
    each path to the top event to determine its relevance for the particular incident (e.g. by
    asking "if not?"). Once root causes are identified, corrective and preventive
    recommendations are made. For more information on root cause analysis see Ref.[6].

    Aerosol Inhalation While Spray Painting
      Personnel
        Poor work practices
        Inattention
        Lack of supervision
      Procedures
        No written procedures
        Verbal instructions unclear
        Work procedure inadequate
      Material or Equipment
        Defective mask
        Inadequate ventilation

    Statistically Dependent Failures

    Statistically dependent failures are defined as events in which the probability of each
    failure is dependent on the occurrence of other failures. In general, statistically dependent
    failures are handled using Markov models, which we will discuss in Dynamic Methods.
    However, in systems with redundant identical components static techniques may be used.
    We will illustrate the $\beta$ factor method for a 2-component parallel system. For
    generalization of the $\beta$ factor method and other methods see Ref.[2].

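    Consider a 2-component parallel system where each component can individually fail
    with rate $\lambda_R$ or fail due to a common cause (e.g. loss of power) with rate $\lambda_C$.
    Then the reliability function for the system is

    $$R(t) = e^{-\lambda_C t}\left[1 - \left(1 - e^{-\lambda_R t}\right)^2\right] = e^{-\lambda_C t}\left[2e^{-\lambda_R t} - e^{-2\lambda_R t}\right]$$

    Let

    $$\beta = \frac{\lambda_C}{\lambda_C + \lambda_R} \equiv \frac{\lambda_C}{\lambda}.$$

    Then $\lambda_C = \beta\lambda$ and $\lambda_R = (1 - \beta)\lambda$ and

    $$R(t) = e^{-\beta\lambda t}\left[2e^{-(1-\beta)\lambda t} - e^{-2(1-\beta)\lambda t}\right] = 2e^{-\lambda t} - e^{-(2-\beta)\lambda t} = e^{-\lambda t}\left[2 - e^{-(1-\beta)\lambda t}\right].$$

    The $\beta$ factor method assumes that $\lambda t$ is small enough that

    $$e^{-\lambda t} \approx 1 - \lambda t$$

    and

    $$e^{-(1-\beta)\lambda t} \approx 1 - (1 - \beta)\lambda t.$$

    Then

    $$R(t) = 1 - \beta\lambda t - (1 - \beta)(\lambda t)^2$$

    or

    $$F(t) = 1 - R(t) = \beta\lambda t + (1 - \beta)(\lambda t)^2.$$

    Note that since $\beta$ can be interpreted as the probability that a component failure is due to
    the common cause event, the first term gives the probability of system failure due to
    the common cause event and the second term gives the probability of system failure due to
    the non-common cause failure of the components.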
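A quick numerical comparison shows when the small-time approximation is acceptable. The sketch below evaluates the exact two-component reliability and the approximate unreliability derived above; the parameter values are hypothetical.

```python
from math import exp


def exact_reliability(lam, beta, t):
    """Exact R(t) for the 2-component parallel system with common cause failures."""
    lam_c = beta * lam          # common cause failure rate
    lam_r = (1.0 - beta) * lam  # independent (random) failure rate
    return exp(-lam_c * t) * (2.0 * exp(-lam_r * t) - exp(-2.0 * lam_r * t))


def beta_factor_unreliability(lam, beta, t):
    """Approximate F(t) = beta*lam*t + (1 - beta)*(lam*t)**2, valid for small lam*t."""
    x = lam * t
    return beta * x + (1.0 - beta) * x**2


# Hypothetical values: lam = 1e-3 per hour, 10% common cause fraction, t = 10 hours
lam, beta, t = 1e-3, 0.1, 10.0
print(1.0 - exact_reliability(lam, beta, t), beta_factor_unreliability(lam, beta, t))
```

For these values the common cause term dominates the system unreliability even though only a tenth of component failures are attributed to the common cause.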

    New Static Methods

    While the fault-tree/event-tree approach is perhaps the most commonly used technique
    for system reliability modeling, construction of fault-trees is difficult when the system
    operation involves control loop action. Some alternative techniques that have been
    proposed include influence diagrams, directed graphs (digraphs) and the GO-FLOW
    methodology. Since the digraph approach can be used to simplify fault-tree construction,
    we will illustrate this technique through a simple example. For influence diagrams see
    Ref.[7] and for the GO-FLOW methodology see the Supplementary Material under
    Course Notes on the web.

    Consider the pressure tank system shown below. The switch is normally closed and the
    motor drives a pump which feeds air into the tank. The air is discharged through the
    discharge valve at periodic intervals. A timer set to these intervals opens the contacts before
    an overpressure condition occurs and pumping stops. If the timer contacts fail closed, the
    operator observes from the pressure gauge that the tank pressure is high and manually
    opens the switch. There are two control loops:

    Loop 1: Tank, pressure gauge, operator, switch.
    Loop 2: Tank, relief valve.

    [Figure: Pressure Tank System]

    [Figure: Digraph for the Pressure Tank System]

    The digraph is a tool to describe the cause-effect relationships between system components
    and variables. A digraph consists of nodes, which represent the system variables
    and components, and edges, which connect the nodes. The numbers represent the direction
    and the qualitative magnitude of the gains between the variables. The gains multiply
    at the nodes. For example, if tank pressure $P_{tank}$ increases, the gauge pressure $P_{gauge}$
    increases (+1 into $P_{gauge}$), which alerts the Operator (+1), who then opens the switch (+1)
    and reduces the current $I_{switch}$ to the switch (-1 × +1 = -1). With the switch open, current
    $I_{pump}$ to the pump motor decreases (-1 × +1 = -1) and tank pressure stops increasing
    (-1 × +1 = -1). Subsequently, if everything works as designed, an increase in $P_{tank}$ leads to a
    decrease through the action of the feedback loop. The fault tree is constructed by
    considering the events that cause the loops to lead to the top event.
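The rule that gains multiply along a path can be sketched in code. The edge list below is one reading of the Loop 1 description (the digraph figure itself is not available here), so the node names and individual edge gains are assumptions; a failed component is modeled as an edge whose gain drops to zero.

```python
# Assumed edges of Loop 1: (source node, destination node, qualitative gain)
LOOP_1 = [
    ("P_tank", "P_gauge", +1),
    ("P_gauge", "Operator", +1),
    ("Operator", "Switch", +1),
    ("Switch", "I_switch", -1),  # opening the switch reduces the current
    ("I_switch", "I_pump", +1),
    ("I_pump", "P_tank", +1),
]


def loop_gain(edges, failed_edges=frozenset()):
    """Multiply the gains around the loop; a failed edge contributes zero gain."""
    gain = 1
    for source, destination, edge_gain in edges:
        gain *= 0 if (source, destination) in failed_edges else edge_gain
    return gain


print(loop_gain(LOOP_1))                                        # intact loop
print(loop_gain(LOOP_1, failed_edges={("P_tank", "P_gauge")}))  # gauge stuck
```

A net gain of -1 corresponds to negative feedback, while any single failed edge gives zero gain through the loop, which is the kind of intermediate event used when building the fault tree.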

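    [Figure: fault tree for the top event "Pressure Tank Rupture", with intermediate events "Zero Gain through Loop 1" and "Zero Gain through Loop 2" and basic events OF, SFC, GS, TFC and RVS]

    TFC: Timer Contacts Fail Closed
    GS: Gauge Stuck
    RVS: Relief Valve Stuck
    OF: Operator Fails
    SFC: Switch Fails Closed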