Intro to STPA (System-Theoretic Process Analysis)
John Thomas
Any questions? Email me: Thomas@STAMP-consulting.com
STPA Benefits
• Identification of potential causes of system hazards involving complex interactions among
– Hardware
– Software
– Humans
• More complete and less costly hazard analysis
• Building safety in from the beginning of aircraft development
– Enhance model-based system engineering
– Document functional design
– Identify software safety requirements
– Manage change analysis more cost effectively
– Identify unintended functions (unknown unknowns) early
Boeing 787 Lithium Battery Fires
• 2013 – 2014
• Reliability analysis predicted 10 million flight hours between battery failures
• Two fires caused by battery failures in 52,000 flight hours
• Does not include 3 other less-reported incidents of smoke in battery compartment
A Challenge: Getting accurate failure estimates
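The gap between the prediction and the fleet experience can be made concrete with a little arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative, and treating the two fires as a simple point estimate of the failure rate is a simplifying assumption.

```python
# Figures quoted on the slide above.
predicted_hours_per_failure = 10_000_000  # reliability analysis prediction
observed_failures = 2                     # fires caused by battery failures
observed_flight_hours = 52_000

# Observed mean flight hours between failures (simple point estimate).
observed_hours_per_failure = observed_flight_hours / observed_failures

# How many times more often failures occurred than the analysis predicted.
ratio = predicted_hours_per_failure / observed_hours_per_failure

print(f"Observed: one failure per {observed_hours_per_failure:,.0f} flight hours")
print(f"Roughly {ratio:.0f} times more frequent than predicted")
```

Counting the three smoke incidents as well would widen the gap further.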
Boeing 787 Lithium Battery Fires
• A module monitors for smoke in the battery bay and will activate fans and valves for venting
• Power management system detects rapid battery discharge and will begin power shedding
• Result: power management system shut down electronics, including the ventilation system
• Smoke vented to cabin
Another Challenge: Accidents without a component failure
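A toy model may help show how two components that each meet their own requirement can still combine into a hazard. This is a deliberately simplified sketch, not the actual 787 logic: the state dictionary, the function names, and the assumption that ventilation is among the shed loads are all hypothetical.

```python
def power_manager(state):
    """On rapid battery discharge, begin power shedding (as described above)."""
    if state["rapid_discharge"]:
        # Assumption: ventilation is one of the loads that gets shed.
        state["powered"] -= {"electronics", "ventilation"}


def smoke_monitor(state):
    """On smoke in the battery bay, command fans and valves to vent."""
    if state["smoke_detected"]:
        state["venting_commanded"] = True


def venting_effective(state):
    """Venting only works if it was commanded AND the ventilation has power."""
    return state["venting_commanded"] and "ventilation" in state["powered"]


state = {
    "smoke_detected": True,
    "rapid_discharge": True,
    "powered": {"electronics", "ventilation"},
    "venting_commanded": False,
}

power_manager(state)   # behaves as specified
smoke_monitor(state)   # behaves as specified

smoke_reaches_cabin = state["smoke_detected"] and not venting_effective(state)
print("venting commanded:", state["venting_commanded"])
print("smoke reaches cabin:", smoke_reaches_cabin)
```

Neither function contains a bug or a failed component; the unsafe outcome comes from the interaction between them.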
A new view
[Figure: control loop. A Controller holding a Process Model (beliefs) sends Control Actions to a Controlled Process and receives Feedback.]
• Provides another way to think about accidents
• Forms foundation for STAMP/STPA
• For each system we discuss, let’s consider how this applies
STAMP: basic control loop
[Figure: a Controller containing a Control Algorithm and a Process Model sends Control Actions to a Controlled Process and receives Feedback.]
• Control actions are provided to affect a controlled process
• Feedback may be used to monitor the process
• Process model (beliefs) formed based on feedback and other information
• Control algorithm determines appropriate control actions given current beliefs
Captures software errors, human errors, flawed requirements, …
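The four elements of the loop can be sketched in a few lines of code. This is a minimal illustration under assumed names: the altitude-hold example, the class names, and the 100 ft step are hypothetical and not taken from the slides.

```python
class ControlledProcess:
    """The physical process being controlled (here: a toy altitude)."""

    def __init__(self, altitude_ft):
        self.altitude_ft = altitude_ft

    def apply(self, action):
        # Control actions affect the controlled process.
        if action == "climb":
            self.altitude_ft += 100

    def feedback(self):
        # Feedback lets the controller monitor the process.
        return self.altitude_ft


class Controller:
    def __init__(self, target_ft):
        self.target_ft = target_ft
        self.believed_altitude_ft = None  # process model (beliefs)

    def update_process_model(self, feedback):
        # Beliefs are formed from feedback; wrong feedback means wrong beliefs.
        self.believed_altitude_ft = feedback

    def control_algorithm(self):
        # Decide the control action from current beliefs, not from reality.
        if self.believed_altitude_ft is None:
            return "hold"
        return "climb" if self.believed_altitude_ft < self.target_ft else "hold"


process = ControlledProcess(altitude_ft=800)
controller = Controller(target_ft=1000)

for _ in range(5):
    controller.update_process_model(process.feedback())
    process.apply(controller.control_algorithm())

print(process.altitude_ft, controller.believed_altitude_ft)
```

Note that `control_algorithm` never reads the process directly. If the feedback were missing or wrong, the controller would act on a false belief while still working as designed.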
Bombardier Learjet 60 Accident
• Tires disintegrated on takeoff, pilots tried to abort
• Automation ignored pilot commands for the thrust reversers
• The tire explosion damaged landing gear sensors
• Software believed aircraft in flight
• Automation increased thrust
• Aircraft was destroyed
© John Thomas
The automation operated exactly as designed!
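The point can be illustrated with a heavily simplified sketch of reverse-thrust logic gated on an air-ground belief. This is not the actual Learjet 60 software: the function names, the rule that every sensor must report "on ground", and the returned action labels are assumptions made for illustration.

```python
def update_process_model(air_ground_signals):
    """Form a belief about the aircraft state from the air-ground sensors.

    Assumed rule: believe 'ground' only if every sensor reports on-ground.
    """
    return "ground" if all(air_ground_signals) else "air"


def control_algorithm(believed_state, pilot_requests_reverse):
    """Allow reverse thrust only when the aircraft is believed to be on the ground."""
    if pilot_requests_reverse and believed_state == "ground":
        return "deploy_reversers"
    # Reverse thrust is inhibited whenever the aircraft is believed to be in flight.
    return "stow_reversers"


# Intact sensors during a rejected takeoff: the pilot command is honored.
normal = control_algorithm(update_process_model([True, True]), True)

# Sensors damaged by tire debris report "not on ground": same code, same
# pilot command, but the belief is wrong and the command is ignored.
damaged = control_algorithm(update_process_model([False, False]), True)

print(normal, damaged)
```

Both calls run the same correct code. The unsafe result comes from a process model that no longer matches reality, which a component-failure analysis of the software alone would not reveal.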
Bombardier Learjet 60 Accident
• NTSB Recommendations
• “Identify the deficiencies in Learjet's system safety analyses, both for the original Learjet 60 design and for the modifications after the 2001 accident, that failed to properly address the thrust reverser system design flaws related to this accident, and require Learjet to perform a system safety assessment in accordance with 14 Code of Federal Regulations 25.1309 for all other systems that also rely on air-ground signal integrity and ensure that hazards resulting from a loss of signal integrity are appropriately mitigated to fully comply with this regulation.”
• “Revise available safety assessment guidance (such as Advisory Circular 25.1309-1A) for manufacturers to adequately address the deficiencies identified in Safety Recommendation [6], require that designated engineering representatives and their Federal Aviation Administration (FAA) mentors are trained on this methodology, and modify FAA design oversight procedures to ensure that manufacturers are performing system safety analyses for all new or modified designs that effectively identify and properly mitigate hazards for all phases of flight, including foreseeable events during those phases (such as a rejected takeoff).”
• https://www.ntsb.gov/news/events/Pages/Runway_Overrun_During_Rejected_Takeoff_Global_Exec_Aviation_Bombardier_Learjet_60_N999LJ_Columbia_South_Carolina_September.aspx
A new view
[Figure: control loop. A Controller containing a Control Algorithm and a Process Model sends Control Actions to a Controlled Process and receives Feedback. Thomas, 2017]
• Provides another way to think about accidents
• Forms foundation for STAMP/STPA
• For each system we discuss, let’s consider how this applies
Unmanned Predator-B Crash (US CBP)
See http://bit.ly/2m7d2Qs
[Figure: high-level control structure (inadequate control and feedback highlighted).]
• Operators and consoles: GA-ASI Pilots, Camera Operator, PPO-1 (Flight), PPO-2 (Camera)
• Predator B Aircraft: Navigation, A/P, Engine controller, Engine, Control Surfaces, Imaging equipment
• Control actions: Autonomous/manual mode, Waypoints; Iris control, etc.
• Feedback: Navigation mode, flight status, etc.; Propulsion condition
Four types of unsafe control actions:
1) Control commands required for safety are not given
2) Unsafe ones are given
3) Potentially safe commands but given too early, too late
4) Control action stops too soon or applied too long
[Figure: control loop. A Controller containing a Control Algorithm and a Process Model sends Control Actions to a Controlled Process and receives Feedback.]
John Thomas (Leveson, 2012)
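Because the four types apply to every control action, candidate unsafe control actions can be enumerated mechanically and then reviewed by an analyst. The sketch below is one possible way to do that; the enum names, the `candidate_ucas` helper, and the example controller and commands are illustrative assumptions.

```python
from enum import Enum
from itertools import product


class UCAType(Enum):
    NOT_PROVIDED = "not provided when required for safety"
    PROVIDED_UNSAFE = "provided when unsafe"
    WRONG_TIMING = "provided too early or too late"
    WRONG_DURATION = "stopped too soon or applied too long"


def candidate_ucas(controller, control_actions):
    """Pair every control action with every UCA type as a prompt for analysis."""
    return [
        f"{controller}: '{action}' {uca_type.value}"
        for action, uca_type in product(control_actions, UCAType)
    ]


rows = candidate_ucas("Autopilot", ["pitch command", "roll command"])
for row in rows:
    print(row)
```

Each row is only a prompt. The analyst still has to decide in which contexts, if any, that combination actually leads to a hazard.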
[Figure: aircraft control structure (Thomas, 2017), shown three times to highlight software-hardware interactions, human-automation interactions, and human-hardware interactions.]
• Flight Crew to Autopilot and Flight Director System (AFDS): A/P on/off, A/P pitch mode, A/P lateral mode, A/P targets, F/D on/off
• AFDS to Flight Crew: A/P mode, status, F/D guidance
• Commands to control surfaces: Pitch commands, Roll commands, Trim commands
• Feedback from control surfaces: Position, status
• Pilot direct control only: Speedbrakes, Flaps, Landing Gear
• Pilot direct control or Autopilot: Elevators, Ailerons/Flaperons, Trim
Example Safety Control Structure (Leveson, 2012)
STAMP and STPA
• STAMP: Theory (safety, security, etc. is a control problem)
• Methodology: CAST (Accident Analysis), STPA (Hazard Analysis)
STPA: System-Theoretic Process Analysis
System-Theoretic Process Analysis (STPA)
• Identify system accidents, hazards
• Draw functional control structure
• Identify unsafe control actions
• Identify accident scenarios
(Leveson, 2012)
Accidents and hazards
• A-1. Loss of life or serious injury to people
• A-2. Damage to the aircraft or objects outside the aircraft
• Example Aircraft-level Hazards:
– H-1: Aircraft violate minimum separation standards
– H-2: Controlled flight into terrain
– H-3: Uncontrolled flight
– H-4: Loss of airframe integrity
– H-5: Aircraft environment not suitable for humans (e.g. exceeds limits for temperature, oxygen, attitude, rate of movement, etc.)
– Etc.
Control structure
[Figure: hierarchical control structure arranged along a vertical axis labeled "Control, Authority": Air Traffic Control, Flight Crew, Automated Controllers, Physical processes. Thomas, 2017]
Unsafe Control Actions (UCA)
[Figure: Cmd X in the control structure of Flight Crew, Automated Controllers, and Physical processes. Thomas, 2017]
• Not provided causes hazard
• Providing causes hazard
• Too early, too late, out of order
• Stopped too soon, applied too long
Generating requirements and constraints
Controller functional safety requirements, one per unsafe control action type for Cmd:
• Not provided causes hazard: Controller X shall provide CMD when D
• Providing causes hazard: Controller X shall not provide CMD when E
• Too early, too late, out of order: Controller X shall provide CMD within Y seconds of F
• Stopped too soon, applied too long: Controller X shall stop providing CMD within Z seconds of G
High-level responsibilities
• Controller X will need to collect information from components Y, Z
• Controller X will need to detect A and respond with B
• Etc.
Thomas, 2017
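The mapping from unsafe control action type to requirement wording is regular enough to express as templates. The following is a minimal sketch: the dictionary keys, the `requirement` helper, and its parameters are hypothetical, while the sentence patterns follow the slide.

```python
# One requirement template per unsafe control action type (wording from the slide).
TEMPLATES = {
    "not_provided": "{controller} shall provide {cmd} when {context}",
    "providing": "{controller} shall not provide {cmd} when {context}",
    "timing": "{controller} shall provide {cmd} within {limit} seconds of {context}",
    "duration": "{controller} shall stop providing {cmd} within {limit} seconds of {context}",
}


def requirement(uca_type, controller, cmd, context, limit=None):
    """Turn an identified unsafe control action into a requirement sentence."""
    template = TEMPLATES[uca_type]
    if "{limit}" in template and limit is None:
        raise ValueError(f"UCA type '{uca_type}' needs a time limit")
    return template.format(controller=controller, cmd=cmd, context=context, limit=limit)


print(requirement("not_provided", "Controller X", "CMD", "D"))
print(requirement("timing", "Controller X", "CMD", "F", limit="Y"))
```

The same templates work for human controllers by passing "Crew" as the controller, which yields procedure statements rather than software requirements.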
AC 23.1309-1E: “It is necessary to consider the possibility of requirement, design, and implementation errors in order to comply with the requirements of § 23.1309.”
STPA can identify these errors so they can be fixed
Identify Accident Scenarios
What could cause Unsafe Control Actions?
Scenarios
• Controller incorrectly believes X because …
• Controller control algorithm does not enforce Y because …
• Incorrect feedback Z received because …
• Sensor failure causes …
• Etc.
Thomas, 2017
Identify Accident Scenarios
Control actions not executed or not followed properly
Scenarios
• Cmd sent but not received because …
• Cmd received but ignored because …
• Actuator failure causes …
Thomas, 2017
Design decisions and recommendations
Scenarios lead to:
• Design decisions
– Component A will need to respond within B seconds to avoid C
– Component F will need to automatically operate within G seconds when H
– Etc.
• Recommendations
– Controller X should take into consideration D to prevent E
– Component I and J should be operated at the same time to prevent K
– Etc.
Rationale and assumptions identified
Every recommendation and decision is traceable
Thomas, 2017
Unsafe Control Actions (UCA)
• Not provided causes hazard
• Providing causes hazard
• Too early, too late, out of order
• Stopped too soon, applied too long
High-level responsibilities
• Crew will need to detect A and respond with B
• Crew will need information about Y, Z
• Etc.
Thomas, 2017
Unsafe Control Actions (UCA)
Crew procedures, one per unsafe control action type for Cmd X:
• Not provided causes hazard: Crew shall provide CMD when D
• Providing causes hazard: Crew shall not provide CMD when E
• Too early, too late, out of order: Crew shall provide CMD within Y seconds of F
• Stopped too soon, applied too long: Crew shall stop providing CMD within Z seconds of G
Thomas, 2017
Identify Accident Scenarios
What could cause Unsafe Control Actions?
Scenarios
• Crew responded to failure in A by …
• Crew incorrectly believes X because …
• Crew does not perform Y because …
• Crew received incorrect feedback Z because …
• Etc.
Thomas, 2017
AC 23.1309-1E: “The additional hazards to be minimized include those caused by inappropriate actions by a crewmember in response to the failure, or those that could occur after a failure.”
STPA captures crew actions with or without failures
Identify Accident Scenarios
Control actions not executed or not followed properly
Scenarios
• Crew cmd sent but not received because …
• Crew cmd received but ignored because …
• Actuator failure causes …
Thomas, 2017
14 CFR 23.1309 (d): “Systems and controls, including indications and annunciations, must be designed to minimize crew errors which could create additional hazards.”
STPA provides the “how”
Design decisions and recommendations
Scenarios lead to:
• Design decisions
– Crew must be notified of A within B seconds to avoid C
– Component F should operate automatically when H
– Etc.
• Recommendations
– Crew X should take into consideration D to prevent E
– Crew should operate I and J at the same time to prevent K
– Etc.
Rationale and assumptions identified
Every recommendation and decision is traceable
Thomas, 2017
Traceability is maintained throughout
[Figure: traceability chain from analysis to analysis outputs. Thomas, 2017]
• Analysis: System-level Accidents, Hazards; Unsafe Control Actions; Scenarios
• Analysis Outputs: High-level responsibilities; Controller functional safety requirements (automation); Procedures (humans); Design Decisions; Design Recommendations
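Traceability of this kind can be represented as a simple graph of "derived from" links. The sketch below is one assumed representation: the identifiers UCA-1, SC-1, and REC-1, the specific links, and the `trace_to_roots` helper are hypothetical and only illustrate the idea.

```python
# Each item points to the item(s) it was derived from (hypothetical links).
DERIVED_FROM = {
    "H-1": ["A-1"],      # hazard -> accident
    "UCA-1": ["H-1"],    # unsafe control action -> hazard
    "SC-1": ["UCA-1"],   # scenario -> unsafe control action
    "REC-1": ["SC-1"],   # design recommendation -> scenario
}


def trace_to_roots(item, links):
    """Return every chain from an item back to a top-level accident."""
    parents = links.get(item, [])
    if not parents:
        return [[item]]  # nothing upstream: this is a root
    return [
        [item] + chain
        for parent in parents
        for chain in trace_to_roots(parent, links)
    ]


for chain in trace_to_roots("REC-1", DERIVED_FROM):
    print(" -> ".join(chain))
```

With links like these, any recommendation can be justified by walking back to the hazard and accident it addresses, and a design change can be checked by walking the links in the other direction.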
Short Examples
Embraer applications
• Embraer Air Management System
• Identified 700+ design recommendations to eliminate or mitigate hazards (satisfy the safety constraints).
[Chart legend: Traditionally captured with existing processes; Traditionally captured in advanced stages; Captured only with STPA]
Embraer Aircraft Smoke Control System analysis
Recommended