4-15-99 1
Steven F. Mattern
Science and Engineering Associates, Inc.
(505) 346-9839
4-15-99 2
Today's Topics
Software Safety Analysis
– A Historical Perspective
– A Personal Perspective
Elements of Sound Safety Engineering
Structured Software Safety
Object Oriented Software Safety
Products Produced (examples)
4-15-99 3
Failures…A Performance Issue
Undesired Performance
Unintended Performance
Fault-Induced Performance
Unsafe Performance
4-15-99 4
Undesired Event Analysis
• System Safety Engineering
– Hazards
– Hazard Causes
– Hazard Mitigation and Fault Avoidance
• Reliability Engineering
– Faults and Failure Modes
– Fault Pathways to Undesired Events
– Fault Detection, Fault Tolerance, and Fault Recovery
• Operational Effectiveness
– Faults and Failure Conditions
– Fault Pathways to Undesired Events
– Undesired Event Mitigation or Fail Safe/Fail Operational
“CAUSED BY” a Combination of:
Hardware
Software
Human Error
Software-Influenced Human Error
HMI
Data Input
4-15-99 5
Top-Down Approach
UNDESIRED EVENT
Software, Hardware, Human Error
Software Failure Modes and/or Causal Factors
System-Level Undesired Events and/or Conditions
BRIDGING Between System-Level Events and the System Hardware, Software, and the Operator/Maintainer...
THIS WAY ...VIA System Safety Engineering Analysis Techniques
4-15-99 6
Elements of System Safety
• Definition of Safety-Critical Functions
• Tailored Safety Requirements & Guidelines
• Identification of System/Subsystem Hazards & Failure Modes
• Determination of System-Level Effects
• Categorization of Hazard Severity & Likelihood
• Identification of Hazard Causes (HW/SW/Human Interaction)
• Derivation of Functional Hazard Mitigation Requirements
• Determination of Safety Requirements Implementation
• Determination of Residual Safety Risk
• Final Categorization of Hazard Severity & Likelihood
4-15-99 7
Sources of System Safety Requirements
• Generic Software Safety Requirements
• Guidelines and Specifications
• Derived Functional Safety Requirements
• Lessons Learned
• Design Specifications
• Similar System Analysis
• User Inputs
• Systems Engineering Automated Environment
• Hazard Causal Factor Analysis
• System & Subsystem Hazard Analysis
• System Requirements
• System Architecture
• Initial Constraints
• Functional Requirements
SYSTEM SAFETY REQUIREMENTS
4-15-99 8
To The Depth Required
ROOT HAZARD
• Failure Mode A: Causal Factors (H/W, S/W, HE, S/WIHE)
• Failure Mode B: Causal Factors (H/W, S/W, HE, S/WIHE)
• Failure Mode C: Causal Factors (H/W, S/W, HE, S/WIHE)
• Failure Mode D: Causal Factors (H/W, S/W, HE, S/WIHE)
S/WIHE: Software-Influenced Human Error
PDR-Level: Initial Depth of Analysis
CDR-Level: To the Depth Required to Mitigate Effectively
4-15-99 9
Methods For Causal Factor Analyses
• Safety-Critical Functions Analysis
• System States/Modes Analysis
• Hazards Analysis (System & Subsystem Level)
• Fault Tree Analysis (Usually Limited to System-Level Hazards)
• Failure Modes & Effects Criticality Analysis
• Hybrid Event Trees or Reliability Block Diagrams
• Software Data Flow Analysis
• Software Functional Flow Analysis
• Interface Analysis (Hardware/Software/Operator/Maintainer)
Tailored For Customer, Program, & Environment
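As an illustration of the fault tree method listed above, the following is a minimal sketch of a tree with AND/OR gates evaluated against a set of active basic events. The tree shape, gate choices, and event names are hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """A fault tree node: a basic event (no children) or an AND/OR gate."""
    name: str
    gate: str = "BASIC"  # "BASIC", "AND", or "OR"
    children: List["Node"] = field(default_factory=list)

    def occurs(self, active: Set[str]) -> bool:
        """Return True if this event occurs given the active basic events."""
        if self.gate == "BASIC":
            return self.name in active
        results = [child.occurs(active) for child in self.children]
        return all(results) if self.gate == "AND" else any(results)


# Hypothetical tree: the root hazard occurs if failure mode A or B occurs.
failure_mode_a = Node("Failure Mode A", "OR", [
    Node("H/W fault"),
    Node("S/W fault"),
])
# Failure mode B needs both a human error and software that fails to reject it.
failure_mode_b = Node("Failure Mode B", "AND", [
    Node("Human error"),
    Node("S/W fails to reject entry"),
])
root_hazard = Node("Root Hazard", "OR", [failure_mode_a, failure_mode_b])

print(root_hazard.occurs({"Human error"}))                               # False
print(root_hazard.occurs({"Human error", "S/W fails to reject entry"}))  # True
print(root_hazard.occurs({"S/W fault"}))                                 # True
```

The AND gate under Failure Mode B shows why a mitigation that removes one contributing cause (here, software rejecting the bad entry) can be enough to block that path to the hazard.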
4-15-99 10
Potential Influence of Software
A REQUIRED FUNCTION DOES NOT OCCUR
Failure of the software to perform a required function; that is, the function is never executed, or no output is produced.
AN UNDESIRED EVENT OCCURS
The software performs a function not required (i.e., getting the wrong answer, issuing the wrong control instruction, or doing the right thing but under inappropriate conditions).
AN INCORRECT SEQUENCE OF REQUIRED EVENTS
The software possesses sequencing problems. For example, failing to ensure that two events happen at the same time, at different times, or in a particular order.
TIMING FAILURES IN EVENT SEQUENCES
The software exceeds maximum time constraints between events, fails to ensure minimum time constraints between events, or possesses duration failures.
AN INCORRECT RESPONSE TO A SAFETY-CRITICAL EVENT
The software fails to recognize a hazardous condition requiring corrective action, fails to initiate a fault tolerant response to a recognized safety-critical function, or produces the wrong response to a hazardous condition or failure mode.
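The missing-function, sequencing, and timing categories above lend themselves to an automated check of a recorded event log. The sketch below is one possible way to do that; the event names, the constraint values, and the function `check_constraints` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# (earlier event, later event, min seconds, max seconds); None means unbounded.
Constraint = Tuple[str, str, Optional[float], Optional[float]]


def check_constraints(timestamps: Dict[str, float],
                      constraints: List[Constraint]) -> List[str]:
    """Return a description of every constraint violated by the event log."""
    violations = []
    for first, second, min_gap, max_gap in constraints:
        # A required event that never occurred.
        if first not in timestamps or second not in timestamps:
            missing = first if first not in timestamps else second
            violations.append(f"required event did not occur: {missing}")
            continue
        gap = timestamps[second] - timestamps[first]
        if gap < 0:
            violations.append(f"incorrect sequence: {second} before {first}")
        elif min_gap is not None and gap < min_gap:
            violations.append(f"timing: {first} -> {second} faster than {min_gap}s")
        elif max_gap is not None and gap > max_gap:
            violations.append(f"timing: {first} -> {second} slower than {max_gap}s")
    return violations


# Hypothetical constraints: at least 0.5 s and at most 10 s between "arm" and
# "fire", and no more than 2 s between "fire" and "safe".
constraints = [("arm", "fire", 0.5, 10.0), ("fire", "safe", None, 2.0)]
log = {"arm": 0.0, "fire": 0.2, "safe": 5.0}

for problem in check_constraints(log, constraints):
    print(problem)
```

With the sample log, both constraints are violated: the first gap is shorter than the minimum and the second is longer than the maximum.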
4-15-99 11
Object Oriented Design
Diagram elements: Hazards (HAZ), Software Req (SWR), CSCI Scenario (SCE), Segment Scenario (SSC), Segment Behavior (SBE), Interface (CIM/CID), Testing (VER_SWR)
4-15-99 12
MCS SCE00820 Activate a Scheduled Mission
CUI SCE00??? Begin Resupply Mission
SAS SCE00899 Perform Docked Resupply (Fwd Veh)
#5 <Commence_Execution_Order>
#1 MCS <Commence_Execution_Command>
#2 Notify Crew to Begin Resupply Mission (CUI)
#3 Uses SAS SCE00882
#1 VMG <Dock_Deploy_Notification>
#2 Uses SAS SCE00868
#1 <Inventory_Request> (RPC and ACU)
#2 <Ammunition_Inventory_Data> (RPC)
SAS SCE00882 Develop Resupply Op Request (Fwd Veh)
SAS SCE00868 Obtain Vehicle Inventory
VMG SCE02057 Deploy SPH
#6 <Dock_Deploy_Notification>
Notes:
1. Cannot find the Crew Notification within CUI to begin resupply mission. SCE00899 sends message…but to where? The SAS to CUI interface describes Resupply_Authorization_Request, but it does not appear in the CUI SRS.
2. This scenario uses the term “Uses”. Is “Uses” the same as “Invokes”?
3. It is unclear whether SCE00868 and SCE00871 are processed concurrently or in series.
4. This message is used to request resupply guidances. Resupply guidances are provided by the POC, and include substitution rules, end-state guidance, controlled supply rate, and resupply thresholds. However, there is no development of this message in the TDA SRS…No message received and no message provided to SAS.
5. AFT SCE01132 does not receive this message from RPC SCE00498, does not obtain inventory data from any known source, nor does it send the data back to RPC SCE00498.
6. Same comment as #5.
7. Assume the command to Retrieve Projectile Inventory Data is processed internal to RPC SCE00498.
8. ACU SCE02338 has not yet been developed.
RPC SCE00489 Provide Inventory Data
AFT SCE01132 Provides Propellant Inventory Data
#1 <Inventory_Request>
#2 <Propellant_Inventory_Request> (AFT)
#3 Retrieve the Projectile Inventory Data
#4 <Propellant_Inventory_Request>
#5 <Ammunition_Inventory_Data>
ACU SCE02338 Perform Resupply Fuel Transfer (Sender)
Safety Analysis of Scenarios
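Several of the notes on this slide describe messages that one scenario sends but no scenario receives. A check of that kind can be sketched as below. The scenario IDs and message names are taken from the slide, but the send/receive assignments are an assumed reading of the diagram, and the `scenarios` structure and `unmatched_messages` function are hypothetical.

```python
from typing import Dict, List, Set

# Each scenario lists the messages it sends and the messages it receives.
# Assumed reading of the diagram, for illustration only.
scenarios: Dict[str, Dict[str, Set[str]]] = {
    "SAS SCE00868": {"sends": {"Inventory_Request"},
                     "receives": {"Ammunition_Inventory_Data"}},
    "RPC SCE00489": {"sends": {"Propellant_Inventory_Request",
                               "Ammunition_Inventory_Data"},
                     "receives": {"Inventory_Request"}},
    # Per the notes, the AFT scenario neither receives nor returns the request.
    "AFT SCE01132": {"sends": set(), "receives": set()},
}


def unmatched_messages(model: Dict[str, Dict[str, Set[str]]]) -> Dict[str, List[str]]:
    """Find messages that are sent but never received, and the reverse."""
    sent = set().union(*(s["sends"] for s in model.values()))
    received = set().union(*(s["receives"] for s in model.values()))
    return {
        "sent_not_received": sorted(sent - received),
        "received_not_sent": sorted(received - sent),
    }


print(unmatched_messages(scenarios))
```

Under these assumptions the check flags `Propellant_Inventory_Request` as sent but never received, which is the kind of gap the notes raise by hand.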
4-15-99 13
Contextual Applicability to Hazards
QUESTIONS/DISCUSSION
1. IFC SCE00598, Step #6 sends the <Warning_Order> to both RPC SCE00391 and RPC SCE00493. Additionally, <Warning_Order> is not listed in the IFC SRS Input/Output Tables.
Steps for Scenario Execute A Fire Command (SCE00598)
Columns: #, Stimulus (As Req.), Scenario Steps, Response (As Req.)
5. Scenario step: IFC determines whether a check firing condition exists.
6. Scenario step: IFC sends ammunition order to RPC. Response: [RPC: Warning_Order]
7. Stimulus: [RPC: Ammunition_Order_Data]. Scenario step: RPC sends projectile weights and preliminary propellant temperature for all rounds contained in the ammunition order to IFC.
RPC SCE00493 shows receipt of <Ammunition_Order> in Step #1, while RPC SCE00391 shows receipt of <Ammunition_Order> as the scenario stimulus. Do we need to make SCE00391 Step #1 the Rx of <Ammunition_Order>? Finally, RPC SCE00391 Step #1 indicates that the scenario “Invokes RPC SCE00493”, raising an apparent conflict with scenario activation in response to <Ammunition_Order>.
Steps for Scenario SPH Supply Multiple Rounds for Firing (SCE00391)
Columns: #, Stimulus (As Req.), Scenario Steps, Response (As Req.)
1. Scenario step: Invoke Scenario [SCE00493] "SPH Select Inventory for Firing"
2. Scenario step: Verify the ability to support the fire mission event
4-15-99 14
Test Requirements Based Upon Causes & Mitigation
UNDESIRED EVENT
Failure Mode A
Failure Mode B
Failure Mode C
Failure Mode D
Hardware Failure
Human Error
Software Error
Timing Error
Sequencing Error
Algorithm Error
• What Causes, Initiates, or Influences the Undesired Event?
• What Requirements Have Been Implemented to Mitigate?
• How Will Testing Be Accomplished to Prove Implementation of Mitigation Requirements?
• Testing for Functionality and Loss of Functionality.
• How Will the System Respond to Failure? Fault Tolerance/Recovery
4-15-99 15
Testing Scenario Completeness
Diagram elements: Hazards (SHZ), Hazards (HAZ), CSCI Scenario (SCE), Testing (VER_SWR)
Levels: Segment, Element, CSCI
Labels: Context, Detail, Completeness
4-15-99 16
Software Requirements Traceability Matrix
Columns: System Safety Requirement, Software Requirement Description, Affected CSCI, Test Procedure Requirement, Test Results
System Safety Requirements Verification Matrix
Columns: System Safety Requirement, Test or Analysis Activity, Verified Date, Comments
Example entries, listed column by column:
1.2.3.1, 1.2.3.3, 1.3.4.1
The WCS Shall Monitor the Status of the JWS Missiles That Are Powered Up.
The WCS Shall Safe and Deselect Any JWS Missile That Fails BIT
The JWS Missile Shall Withhold Active RF Emissions Until Terminal Impact Phase
The JWS WCS Shall Include Processing Elements to Verify IPL Results
The JWS WCS Shall Include Software Elements to Safe the RM During Unsafe States
The JWS Missile Shall Include Software to Prevent Active RF Until Terminal Phase
JWS WCS, JWS WCS, JWS Missile
FTH, WDC, FIC
VU0012, VU10002, VU234003
FAIL, PASS, PASS
5/7/02, 5/9/02, 6/3/02
Failed 1ST Attempt; See Test Log 1001; None
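A traceability matrix like the one above can be checked mechanically for requirements that lack a test procedure or a passing result. The sketch below uses the requirement IDs, CSCI names, test procedure IDs, and results from the slide; the row alignment between those columns is an assumption (values are paired by position), and `TraceRow` and `open_items` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TraceRow:
    requirement_id: str
    affected_csci: str
    test_procedure: str
    result: str  # "PASS", "FAIL", or "" if not yet run


def open_items(rows: List[TraceRow]) -> List[str]:
    """Return requirement IDs that lack a test procedure or a passing result."""
    return [r.requirement_id for r in rows
            if not r.test_procedure or r.result != "PASS"]


# Assumed row alignment: values paired by their position in each column.
rows = [
    TraceRow("1.2.3.1", "FTH", "VU0012", "FAIL"),
    TraceRow("1.2.3.3", "WDC", "VU10002", "PASS"),
    TraceRow("1.3.4.1", "FIC", "VU234003", "PASS"),
]

print(open_items(rows))  # ['1.2.3.1']
```

Any requirement returned by `open_items` would still need evidence of hazard control before it could be closed.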
AN/SWY-1/2/3 Safety Program
HAZARD ACTION REPORT
HAR Number:
Date:
FROM:
TO: Commander, Naval Surface Warfare Center, Dahlgren Division, Code G71 (J. Bowden), 17320 Dahlgren Road, Dahlgren, VA 22448-5001
ACTION ADDRESS:
SYSTEM:
COMPONENT:
SYSTEM PHASE OR OPERATION:
HAZARD DESCRIPTION:
SUMMARY OF FAILURE MODES AND CAUSAL FACTORS:
HAZARD CONTROL/MITIGATION ACTION(S):
REFERENCES:
ORIGINATOR:
TELEPHONE:
FINAL RESOLUTION:
HAZARD RESOLVED & HAR CLOSED:
USE ADDITIONAL SHEETS AS NECESSARY
PAGE _ OF _
HRI:
PROGRAM MANAGER, DATE
PRINCIPAL FOR SAFETY, DATE
Evidence of Hazard Control
4-15-99 17
Initial Software Safety Constraints
EXAMPLE
1.0 Man Machine Interface (MMI)
1.1 PROCESS CANCELLATION
1.1.1 The System shall be designed such that the operator may exit from a potentially unsafe state with a single action.
1.1.2 Exiting from an unsafe state or condition shall place the system in a known safe state, report the failure, and display the system status to the operator.
1.2 SAFETY-CRITICAL PROCESS INITIATION
1.2.1 Safety-Critical operator displays, legends, and other interface functions shall be clear, concise, and unambiguous.
1.2.2 Safety-Critical displays shall be duplicated, where possible, on separate display devices.
1.2.3 Safety-Critical alerts to the crew shall be readily distinguishable from routine alerts.
1.2.4 Upon detection of an unsafe state, the system shall alert the operator to the anomaly detected, the action taken, and the resulting system configuration and status.
1.3 OPERATOR ENTRY ERRORS
1.3.1 The software shall be capable of detecting improper operator entries, or sequence of entries, and thereby prevent the execution of safety-critical functions.
1.3.2 The software shall alert the operator to an erroneous entry.
1.3.3 Operator alerts shall indicate the error and corrective action.
1.3.4 After operator corrective action to an erroneous entry, the software shall provide positive confirmation of a valid data entry, and a real-time indication that the system is functioning properly.
1.3.5 Safety-Critical functions which require several seconds or longer to process shall provide a status indicator to the operator during processing.
1.4 POSITIVE FEEDBACK
1.4.1 Software control of safety-critical functions shall have positive feedback indicators to the operator to provide assurance that the system is functioning properly.
1.5 SAFETY-CRITICAL ALERTS
1.5.1 The operator shall not be able to clear a safety-critical alert without taking corrective action.
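The operator entry requirements (1.3.1 through 1.3.4) can be illustrated with a small validation routine that rejects an improper entry, reports the error together with the corrective action, and confirms a valid entry. This is only a sketch: the numeric range, the message wording, and the names `EntryResult` and `validate_entry` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EntryResult:
    accepted: bool  # False blocks the safety-critical function (1.3.1)
    message: str    # error plus corrective action (1.3.3) or confirmation (1.3.4)


def validate_entry(raw: str, low: float = 0.0, high: float = 75.0) -> EntryResult:
    """Validate a hypothetical numeric operator entry against an allowed range."""
    try:
        value = float(raw)
    except ValueError:
        return EntryResult(False, f"Invalid entry '{raw}': enter a number.")
    # NaN and infinities fail this comparison and are rejected as well.
    if not (low <= value <= high):
        return EntryResult(
            False, f"Entry {value} out of range: enter a value from {low} to {high}.")
    return EntryResult(True, f"Entry {value} accepted.")


print(validate_entry("abc").message)
print(validate_entry("90").message)
print(validate_entry("45").message)
```

The caller would proceed with the safety-critical function only when `accepted` is true, and would show `message` to the operator in either case.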
4-15-99 18
Safety-Critical Function Analysis
Example
Analysis steps:
1. Define System Function
2. Determine Ramifications of Function Failure
3. Determine Subsystems Impacted
4. Determine Safety Significance
Columns: System Function, Ramification of Failure, Relevant Subsystems, Safety Significance
1. Engagement Control (A. Check-Fire; Manual. B. Cease-Fire; Automatic)
Ramification of failure: Fail to break engagement of an incorrectly designated target; fire projectile into friendly forces if a friendly target is incorrectly identified as hostile
Relevant subsystems: Tactical & Technical Software; Data Displays & Controls; Control (Armament)
Safety significance: Safety-Critical Function
2. Mode Control (A. Tactical: move, shoot, resupply. B. Support: training, test, & maintenance)
Ramification of failure: Inadvertent weapon firing
Relevant subsystems: Tactical & Technical Software; Data Displays & Controls; Training; Maintenance
Safety significance: Safety-Critical Function
3. Initialization, Re-Initialization
Ramification of failure: Fail to define baseline data; results in weapon firing into friendly forces
Relevant subsystems: Tactical & Technical Software; Data Displays & Controls; General Processing; Mass Memory; Sensor/Actuator Interface; Servo Amplifier
Safety significance: Safety-Critical Function
4. Re-Configuration
Ramification of failure: Fail to define baseline data; results in weapon firing into friendly forces (dependent on modes and states of the weapon system)
Relevant subsystems: Tactical & Technical Software; Data Displays & Controls; General Processing; Mass Memory; Sensor/Actuator Interface; Servo Amplifier
Safety significance: Safety-Critical Function
5. Transmit Friend Signal To Friendly Forces
Ramification of failure: Fail to protect self from friendly forces
Relevant subsystems: Sensors & Peripherals
Safety significance: Safety-Critical Function
6. Target Integrity
Ramification of failure: Fire projectile into friendly forces
Relevant subsystems: Tactical & Technical Software
Safety significance: Safety-Critical Function
7. Receive, Verify, & Process Battlefield Information
Ramification of failure: Improper actions leading to designation of incorrect target. Fail to accomplish offensive or defensive operation. Fail to protect self or friendly forces
Relevant subsystems: Tactical & Technical Software; Data Displays & Controls
Safety significance: Safety-Critical Function
4-15-99 19
Safe States & Modes Analysis
Example
Analysis steps:
1. Define System States and Substates
2. Define Primary Capabilities of the System State
3. Identify Prohibited Functions of the System State
4. Identify Essential Functions of the System State
Columns: SYSTEM STATE, PRIMARY CAPABILITIES, PROHIBITED FUNCTIONS, ESSENTIAL FUNCTIONS
1. Unpowered State
Primary capabilities: Weapon System Is Shut Down and Powered Off
Prohibited functions: System Operation of Any Kind
Essential functions: Weapon System Physical Security
2. Maintenance State
Primary capabilities: Perform Intrusive Fault Protection and Isolation. Perform Servicing, Repair, and LRU Replacement Actions, Etc.
Prohibited functions: Weapon Control System Test, Training, and Tactical Operation
Essential functions: Primary Power On
3. Test State
Primary capabilities: Test WCS Software and Hardware End-to-End Using Synthetic Training Targets and Operational Software
Prohibited functions: Tactical Links Installed; Arm/Enable Switch On
Essential functions: Break-Engage Active; Primary Power On; Training Software Load
4. Training State
Primary capabilities: Perform WCS Operator Training Using Simulated Environments Using Training Software. No Live-Fire, Simulated Target
Prohibited functions: Tactical Links Installed; Arm/Enable Switch On
Essential functions: Primary Power On; Sim Software Loaded; Fire Command Cleared
5. Tactical State
Primary capabilities: Weapon System Active for Live-Fire
Prohibited functions: Training, Maintenance, and Test Modes
Essential functions: Primary Power On
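A states and modes table of this kind can be turned into a run-time or test-time check: for a given state, no prohibited condition may be present and every essential condition must be. The sketch below encodes two of the states. The condition strings follow the slide, but their pairing with the Training and Tactical states assumes the table columns align by position, and `STATE_RULES` and `check_state` are hypothetical names.

```python
from typing import Dict, List, Set

# Assumed alignment of the slide's columns, for two of the five states.
STATE_RULES: Dict[str, Dict[str, Set[str]]] = {
    "Training": {
        "prohibited": {"Tactical Links Installed", "Arm/Enable Switch On"},
        "essential": {"Primary Power On", "Sim Software Loaded",
                      "Fire Command Cleared"},
    },
    "Tactical": {
        "prohibited": {"Training Mode", "Maintenance Mode", "Test Mode"},
        "essential": {"Primary Power On"},
    },
}


def check_state(state: str, conditions: Set[str]) -> List[str]:
    """List prohibited conditions present and essential conditions missing."""
    rules = STATE_RULES[state]
    problems = [f"prohibited in {state}: {c}"
                for c in sorted(rules["prohibited"] & conditions)]
    problems += [f"missing in {state}: {c}"
                 for c in sorted(rules["essential"] - conditions)]
    return problems


current = {"Primary Power On", "Sim Software Loaded", "Arm/Enable Switch On"}
for problem in check_state("Training", current):
    print(problem)
```

For the sample conditions, the check reports that the arm/enable switch is on during training and that the fire command has not been cleared.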
4-15-99 20
Functional Analysis of Software Architecture
• Graphical Representation for Each Hazard or Failure Mode
• Links Specific Software Modules to Undesired Events
• Safety Implication/Effects Communicated to the Design/Domain Expert
Example Analysis
Diagram labels: A40, B40, C40
LTE: Controller Unit LCR
LEE: Event Queue Processing Unit LCR
LRE: Event Queue Processing Unit LCR
LBE: Bearing/Jam Line Event Queue Processing LCR
LSP: Sector Processing Unit LCR
LGD: Get Data Unit LCR
LTP: Processing Unit LCR
LME: Modification Event Processing LCR
LNE: Track Not Eligible Unit LCR
LNW: #1 Solution Processing Unit LCR
LRW: #2 Solution Processing Unit LCR
LSD: Save Data Unit LCR
4-15-99 21
Do SILs Make Sense?
• Definition of Safety-Critical Functions
• Tailored Safety Requirements & Guidelines
• Identification of System/Subsystem Hazards & Failure Modes
• Determination of System-Level Effects
• Categorization of Hazard Severity & Likelihood
• Identification of Hazard Causes (HW/SW/Human Interaction)
• Derivation of Functional Hazard Mitigation Requirements
• Determination of Safety Requirements Implementation
• Determination of Residual Safety Risk
• Final Categorization of Hazard Severity & Likelihood
Elements of System Safety
4-15-99 22
The Role of SILs
• Forces the Functional Linking of Software Architecture to Undesired Events or Hazards of the System
• Forces “Cause Analysis” to be Accomplished to Prove Software Influences or Causes the Hazard to Initiate
• Forces the Software Development Team to Interact with the Safety Team
• Forces a Defined Protocol of Software Development Activity in the Design, Code, Test, V&V, and Configuration Management for each of the SIL Categories
4-15-99 23
Issues To Consider
• Solving Problems = High Visibility
  Preventing Problems = Low Visibility
• Size and Complexity of the Software
• Software Development Life Cycle, Tools/Techniques
– Structured Design
– Object Oriented Design
• Interfaces (Hardware, Software, Human)
– Functionality of Interface
– Ramification of Loss of Interface
• Role of Systems Engineering
• Requirements versus Recommendations/Considerations
• Budgets for “Specialty Engineering”