2011 04 05 Safety and Reliability Patterns


  • 7/30/2019 2011 04 05 Safety and Reliability Patterns


    About safety and reliability

    Reliability: a measure of up-time, availability or the probability of

    successful computation

    Often measured with, for example, MTBF
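
As a rough illustration of how such a figure is obtained, the sketch below computes MTBF from an observed operating time and failure count, and derives steady-state availability from it. MTTR (mean time to repair), the function names and the sample numbers are assumptions introduced here, not values from the slides.

```python
def mtbf(operating_hours: float, failures: int) -> float:
    """Mean time between failures: operating time divided by failure count."""
    if failures <= 0:
        raise ValueError("need at least one observed failure")
    return operating_hours / failures


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: 10,000 h of operation, 4 failures, 2 h mean repair time.
example_mtbf = mtbf(10_000, 4)
example_availability = availability(example_mtbf, 2.0)
```

Note that a high MTBF says nothing about safety by itself; it only describes how often the system is expected to fail.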

    Safety: a safe system does not incur too much risk to persons or

    equipment.

    Safety is distinct from reliability; however, safety systems must be

    reliable. Patterns that are actually related to reliability are often

    called safety patterns.

    Accidents occur because of errors or failures.

    Errors are (systematic and) always present but may not be visible all

    the time.

    Failures include, for example, bit flips and breaking of hardware, so

    they are not always present in the system.


    4/6/2011


    About errors and failures

    Different means to handle errors and failures

    Failures (random): homogeneous redundancy (copies)

    Errors: heterogeneous redundancy

    Author: a safety-critical system must contain and properly manage

    redundancy.

    What can be achieved with redundancy is reliability:

    Ability to detect failures and enter safe-state.

    Ability to continue providing the service even in the presence of failures.

    The safety and reliability patterns presented in the book are all

    quite well-known and presented in numerous publications.

    The patterns to be workshopped are more related to developing

    safety systems.

    Division of responsibilities between safety and basic control systems.


    Ones complement pattern

    Problem: how to detect corruption of a small set of data, caused

    by, for example, EMI or heat.

    Solution: the data is stored twice, once in normal format and

    once in ones complement format. When the data is read, the

    ones complement copy can be inverted back to normal and

    compared to the original.

    If the values do not match, error processing can be initiated.

    - No way to decide which copy is correct.

    - Uses twice as much memory for storage.

    - Small performance hit.
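
The store-twice-and-compare idea can be sketched as follows. This is a minimal illustration, assuming a 16-bit word; the class name `OnesComplementCell`, the mask and the exception types are hypothetical and not taken from the slides or the book.

```python
MASK16 = 0xFFFF  # assumed word width: 16 bits


class OnesComplementCell:
    """Stores a 16-bit value together with its bitwise inverse."""

    def __init__(self, value: int = 0) -> None:
        self.write(value)

    def write(self, value: int) -> None:
        if not 0 <= value <= MASK16:
            raise ValueError("value does not fit in 16 bits")
        self.normal = value
        self.inverted = ~value & MASK16  # ones complement copy

    def read(self) -> int:
        # Invert the stored complement and compare it with the normal copy.
        if (~self.inverted & MASK16) != self.normal:
            raise RuntimeError("data corruption detected")
        return self.normal
```

Flipping a bit in either copy (for example `cell.normal ^= 0x0004`) makes the next `read()` raise. The check only reports that the two copies disagree; it cannot tell which one is still valid.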


    CRC pattern

    Problem: how to detect corruption of large data sets, caused

    by, for example, EMI or heat.

    Solution: a CRC value is calculated from the data and stored in

    addition to the actual data. When the data is read, the CRC value

    can be re-calculated and compared to the original CRC.

    If the values do not match, error processing can be initiated.

    - Good detection of single and multiple bit errors

    - Small performance hit.
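
A minimal sketch of the store-and-recheck cycle is shown below. It assumes CRC-32 as provided by Python's standard `zlib.crc32`; the slides do not name a particular polynomial, and the helper names `protect` and `check` are hypothetical.

```python
import zlib
from typing import Tuple


def protect(data: bytes) -> Tuple[bytes, int]:
    """Return the data together with its CRC-32 value for storage."""
    return data, zlib.crc32(data)


def check(data: bytes, stored_crc: int) -> bool:
    """Recalculate the CRC over the data and compare it with the stored one."""
    return zlib.crc32(data) == stored_crc


payload, crc = protect(b"sensor calibration table")
# Simulate a single bit flip in the first byte of the stored data.
corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]
```

Here `check(payload, crc)` succeeds while `check(corrupted, crc)` fails, at which point error processing would be initiated. Unlike the ones complement approach, the overhead is a few bytes per data set rather than a full second copy.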


    Smart Data pattern


    Channel pattern

    A channel is an architectural structure that processes data from

    raw acquisition through a series of processing steps to physical

    actuation. (end-to-end processing)

    A basis for a set of patterns including, for example, the protected

    single channel pattern, the dual channel pattern, and more.

    In the book, the author does not state an explicit problem or

    consequences because the channel is treated as a base for the

    other channel-related patterns.
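
One way to picture a channel is as an ordered pipeline of processing stages. The sketch below is only an illustration: the `Channel` class, the two stage functions, and the 10-bit ADC with a 5 V reference are all assumptions made for the example.

```python
from typing import Callable, Sequence

Stage = Callable[[float], float]


class Channel:
    """Runs a value through an ordered series of processing stages."""

    def __init__(self, stages: Sequence[Stage]) -> None:
        self.stages = list(stages)

    def process(self, raw: float) -> float:
        value = raw
        for stage in self.stages:
            value = stage(value)
        return value


# Hypothetical stages: ADC counts to volts, then volts to an actuator command.
def counts_to_volts(counts: float) -> float:
    return counts * 5.0 / 1023.0  # assumed 10-bit ADC, 5 V reference


def volts_to_command(volts: float) -> float:
    return min(max(volts / 5.0, 0.0), 1.0)  # clamp the command to 0..1


channel = Channel([counts_to_volts, volts_to_command])
```

Calling `channel.process(raw)` carries one reading from acquisition to the value handed to the actuator, which is the end-to-end unit that the other channel patterns replicate or protect.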


    Protected single channel pattern

    Problem: how to improve reliability without increasing

    development and hardware costs as much as real redundancy

    would.

    Solution: a single channel is used to process the data from

    sensors to actuators. The reliability is enhanced through the

    addition of checks at key points in the channel.

    - Not able to continue functioning in the presence of faults (but

    may be able to detect faults and enter safe-state).
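
The idea of adding checks at key points can be sketched as below. The range limits, the `SafeStateError` exception, the function names and the choice of `0.0` as the safe output are assumptions made for illustration only.

```python
class SafeStateError(Exception):
    """Raised when a check fails and the channel must enter its safe state."""


def acquire(raw_counts: int) -> float:
    # Check 1: the raw reading must lie within the sensor range (assumed 10-bit ADC).
    if not 0 <= raw_counts <= 1023:
        raise SafeStateError("sensor reading out of range")
    return raw_counts * 5.0 / 1023.0


def compute_command(volts: float) -> float:
    command = volts / 5.0
    # Check 2: the computed command must be a valid actuator setting.
    if not 0.0 <= command <= 1.0:
        raise SafeStateError("command out of range")
    return command


def run_channel(raw_counts: int) -> float:
    """Single channel with checks; returns the safe output when a check fails."""
    try:
        return compute_command(acquire(raw_counts))
    except SafeStateError:
        return 0.0  # assumed safe state: actuator fully closed
```

A failed check stops normal processing and forces the safe output. There is no second channel to take over, which matches the limitation noted above.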


    Dual channel pattern

    Problem: How to improve reliability and provide protection

    against single-point faults.

    Solution: reliability can be improved by offering multiple channels.

    If the channels are identical (homogeneous redundancy), the

    pattern can address random faults. If the channels use different

    designs or implementations, the pattern can address both random

    and systematic faults. Depending on which pattern is used, it may

    enter safe-state or switch to another channel when a fault is

    detected.

    - Logic is needed to manage the channels and to determine which

    will be active.

    - (The logic may be a single point of failure?)

    - Costs
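
The switch-over variant can be sketched as follows. This is an assumed arrangement, not the book's implementation: a channel signals a detected fault by returning `None`, and `make_channel`, `DualChannel` and `SAFE_OUTPUT` are hypothetical names.

```python
from typing import Callable, Optional

Channel = Callable[[float], Optional[float]]  # None signals a detected fault


def make_channel(gain: float, faulty: bool = False) -> Channel:
    """Build a hypothetical channel; `faulty` injects a permanent fault."""
    def channel(raw: float) -> Optional[float]:
        return None if faulty else raw * gain
    return channel


class DualChannel:
    """Switches from the active channel to the standby when a fault is detected."""

    SAFE_OUTPUT = 0.0  # assumed safe-state output

    def __init__(self, first: Channel, second: Channel) -> None:
        self.channels = [first, second]
        self.active = 0  # index of the channel currently in use

    def process(self, raw: float) -> float:
        while self.active < len(self.channels):
            result = self.channels[self.active](raw)
            if result is not None:
                return result
            self.active += 1  # fault detected: switch to the next channel
        return self.SAFE_OUTPUT  # no healthy channel left: enter safe state
```

With a faulty first channel, `DualChannel(make_channel(2.0, faulty=True), make_channel(2.0)).process(3.0)` is served by the second channel. The `process` method is the management logic mentioned above, and a defect in it would affect both channels.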


    More channel patterns

    Homogeneous redundancy pattern

    Uses two identical channels

    Heterogeneous redundancy pattern

    Uses dual channels of different designs or implementations

    Triple modular redundancy (TMR) pattern

    Uses three channels of typically identical designs and a voting

    mechanism to decide the actual output.

    Can continue to deliver services in the presence of a fault, provided

    that the fault is isolated within a channel.
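
The voting mechanism in TMR can be sketched as a simple majority vote over the three channel outputs. The function name `vote` and the choice to raise an error when all three disagree are assumptions; a real voter might instead enter a safe state.

```python
from collections import Counter
from typing import Sequence


def vote(outputs: Sequence[int]) -> int:
    """Return the value produced by at least two of the three channels."""
    if len(outputs) != 3:
        raise ValueError("TMR expects exactly three channel outputs")
    value, count = Counter(outputs).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no majority: all three channels disagree")
    return value
```

For example, `vote([7, 9, 7])` masks the single deviating channel and yields `7`, so service continues as long as the fault stays within one channel.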
