Risk Informed Design and Test On NASA’s Constellation Program John V. Turner, PhD Constellation Program Risk Manager Used with permission


Page 1: Turner.john

Risk Informed Design and Test on NASA’s Constellation Program

John V. Turner, PhD
Constellation Program Risk Manager

Used with permission

Page 2: Turner.john

NASA CxP John V. Turner, PMC 2009

Program Goals

• NASA identified goals for the CxP related to ISS support and lunar exploration
  – Intent is to lay groundwork for Mars exploration as well
• Exploration Systems Architecture Study (ESAS) conducted to develop an exploration systems architecture to support these missions
• Constellation program chartered to develop and field this architecture

Page 3: Turner.john

The Challenge

• Develop an architecture that optimally meets goals and objectives, within cost and schedule, and with acceptable safety and mission success risk
• Risk Informed Design: aims to support design activities in identifying acceptable and optimal safety
• Risk Informed Test: aims to support test activities in identifying ways to best reduce uncertainty and risk (uncover defects in design, manufacturing, and processing prior to IOC)

Page 4: Turner.john

Risk Timeline – ISS Mission

[Figure: risk intensity plotted against Mission Elapsed Time. Phases shown: Ground Ops, Crew Ingress, First Stage Ignition, Staging, Second Stage, MECO, Orbit Ops, Docked at ISS, Entry / Landing / Rescue. Callouts: “8-10 Minutes”, “180-210 Days!”, “A Leading Risk!”. Timeframes and intensity are illustrative, not to actual scale.]

• Risk changes in character, intensity, and source over time
• Risk prevention and mitigation must be considered in every system and activity across all mission phases
• Understanding the integrated implications of system risks is critical to success
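The contrast between a short, intense phase and a long, quiet one can be made concrete with a constant-hazard-rate sketch. Only the durations follow the slide (roughly 9 minutes of ascent, roughly 195 days docked); the hazard rates are invented placeholders, not Constellation data, and the phases are assumed independent.

```python
import math

# Hypothetical per-hour hazard rates; only the durations come from the slide.
PHASES = {
    # name: (duration_hours, hazard_rate_per_hour)
    "ascent": (9 / 60, 1.0e-2),            # ~8-10 minutes, high intensity
    "docked_at_iss": (195 * 24, 1.0e-6),   # ~180-210 days, low intensity
}

def phase_risk(duration_hours, hazard_rate_per_hour):
    """Probability of at least one failure under a constant hazard rate."""
    return 1.0 - math.exp(-hazard_rate_per_hour * duration_hours)

def mission_risk(phases):
    """Combine phase risks, assuming the phases fail independently."""
    survive = 1.0
    for duration, rate in phases.values():
        survive *= 1.0 - phase_risk(duration, rate)
    return 1.0 - survive

risks = {name: phase_risk(*params) for name, params in PHASES.items()}
total = mission_risk(PHASES)

for name, risk in risks.items():
    print(f"{name}: {risk:.2e}")
print(f"mission: {total:.2e}")
```

With these placeholder rates the docked phase carries more risk than ascent even though its hazard rate is four orders of magnitude lower, which is the kind of effect the timeline is meant to expose.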

Page 5: Turner.john

Sources of Failure

• Where do defects enter into the flight equipment and operations that result in failure?
• Defects can arise through
  – Actual system design flaws
  – Inadequate testing to uncover defects
  – Manufacturing errors
  – Integration or processing errors
  – Bad decisions during real time
• Note: History indicates that manufacturing, integration and processing are very significant defect sources
• Goal: Put in place processes that identify and eliminate defects leading to failure

Defect Source: Design | Test | Manufacturing, Integration / Processing | Operations
Mitigation: RI Design | RI Test | Robust Quality Assurance | Mission Operations

Page 6: Turner.john

Risk Informed Design (RID)

• Probabilistic requirements to drive risk performance in the design
• Loss of Crew (LOC) and Loss of Mission (LOM) risk factored into significant design and planning trades
• Risk assessment embedded in Integrated Design Analysis Cycles to inform all key analysis tasks
• “Zero Based Design”
• Risk Informed Test Plans
• Focus additional analysis and test resources on high risk / high uncertainty areas

Page 7: Turner.john

RID Approach

• Premise: Risk is a design commodity like mass or power
• Qualitative and quantitative risk analyses expose dominant risk contributors and support design and planning trades to assign critical design commodities (mass, volume, power, cost, etc.)
• Iterative systems engineering design cycles incorporate risk in the trade space and identify design solutions that are risk informed
• Risk analysis considers all significant failure types, including: functional, phenomenological, software, human reliability, common cause, and external or environmental events
• Complexity and fidelity of analysis are consistent with the available data and information during each design cycle

Page 8: Turner.john

“Zero Based Design”

1. Early design concepts are defined with minimally required functionality to perform the mission and no redundancy
  – Focus on implementing “Key Driving Requirements” vs establishing a fully functional, acceptably safe, or highly reliable design.
  – Risk analyses are performed during this phase to understand the risk vulnerabilities of this “zero based design” (ZBD).

Page 9: Turner.john

“Zero Based Design”

2. Prioritize design enhancements with a focus on enhanced functionality and LOC risk.
  – Focus: “Make the design work”, “Make the design safe”
  – Identify optimal use of design commodities, cost, and schedule to reduce risk, with priority on diversity vs simple redundancy.
  – Major Premise: Simple redundancy is one option to improve safety and reliability. It is not the only option. It is not always the safest or most cost effective option.
  – Compare different investment portfolios using FOMs derived from key risk commodities, including LOC risk
  – Goal: Spend scarce risk mitigation resources (mass, power, volume, cost) most effectively to maximally address risk
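The portfolio comparison in step 2 can be sketched as a small search over candidate enhancement sets under a commodity budget. Every option name, mass, and risk-reduction value below is hypothetical, and summing LOC reductions is a simplifying assumption; an actual study would re-run the risk model for each portfolio rather than add deltas.

```python
from itertools import combinations

# Hypothetical enhancement options: (name, mass_kg, loc_risk_reduction)
OPTIONS = [
    ("redundant_string", 120.0, 4.0e-4),
    ("dissimilar_backup", 60.0, 3.5e-4),
    ("extra_shielding", 90.0, 2.0e-4),
    ("manual_override", 15.0, 1.0e-4),
]

def best_portfolio(options, mass_budget_kg):
    """Exhaustively find the option set with the largest summed LOC
    risk reduction that fits within the mass budget."""
    best, best_gain = (), 0.0
    for size in range(1, len(options) + 1):
        for combo in combinations(options, size):
            mass = sum(option[1] for option in combo)
            gain = sum(option[2] for option in combo)
            if mass <= mass_budget_kg and gain > best_gain:
                best, best_gain = combo, gain
    return [option[0] for option in best], best_gain

chosen, gain = best_portfolio(OPTIONS, mass_budget_kg=170.0)
print(chosen, f"{gain:.1e}")
```

In this made-up case the simple redundant string is left out: three lighter, more diverse options fit the same mass budget and buy back more LOC risk, which mirrors the premise that redundancy is one option but not always the most effective one.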

Page 10: Turner.john

“Zero Based Design”

3. Finally, additional enhancements are considered which more fully address functional requirements and focus on reliability and loss of mission (LOM) risk.
  – A portfolio approach to comparing investments is again used
  – Ensures that the final design iteration produces a vehicle that better meets functional requirements, safely, reliably, and within budget.

Page 11: Turner.john

Zero Based Design Summary

• “Build-Up” approach from the zero based design to a risk balanced system assures that affirmative rationale is used for the design, its complexity, and the existence of each system element.
  – Rationale exists to justify resource allocations such as mass, power, and cost.
  – Build-up approach lessens the likelihood of having to make dramatic design changes later in the design cycle to resolve critical commodity shortfalls and get back “in the box.”
• This approach is described in detail in two NESC reports:
  – “Crew Exploration Vehicle Smart Buyer Design Team Final Report”
  – “DDT&E Considerations for Safe and Reliable Human Rated Spacecraft Systems”

Page 12: Turner.john

Results

• Program
  – Original ESAS Loss of Crew (LOC) and Loss of Mission (LOM) requirements were derived using an initial architecture trade study that included conceptual design concepts and underestimated certain significant risk drivers
  – Requirements have been adjusted based on current design and environments, improved analysis, and a better understanding of what is challenging yet achievable
  – CxP architecture currently meeting mission level LOC and LOM requirements
• Orion Project
  – Orion early design conducted prior to inauguration of RID activities
  – Began RID design cycles in late 2007
  – Significant design changes (4X improvement in LOC, 3X in LOM)
  – Implemented Apollo 13 Low Power Emergency Return Capability
  – Improvements in safety and mission success while resolving mass challenges

Page 13: Turner.john

Results

• Ares Project
  – Ares conducted RID design trades early in the DDT&E process and incorporated design changes in multiple subsystems
  – Ares I risk analysis currently projects significant improvement in reliability over previous manned launch systems
• Altair Project
  – Conducted ZBD approach from project initiation, completed LOM and LOC risk buyback design iterations from the zero based design configuration
  – Significantly improved safety and mission success and developed a stronger design concept to enter the next stage of design

Page 14: Turner.john

Some Lessons Learned

• RID brings designers and analysts together early to evaluate sources of risk, the integrated implications of risks, and the efficacy of different design implementations in maximizing safety and mission success
• RID drives designers toward dissimilar or functional redundancy vs the traditional redundant system approach
  – Reduces the weight penalty incurred by the traditional method
• Requirements are met more effectively with respect to use of design commodities
  – Design features can be prioritized to determine where reductions are best applied in the event of mass issues
• Risk Informed Campaign Analysis provides insight into “program” vs “mission” success as a function of system design issues
• Evaluating DRM LOM requires a strong understanding of operational flexibility; this forces early operations criteria development and operations driven design
• Current methods for evaluating Maturity Growth require improvement
  – Assumed maturity for design analysis
  – Need a better way to address maturity growth and determine early mission risk

Page 15: Turner.john

Some Lessons Learned

• The tools used to model LOC and LOM should evolve from early concept development to verification phase
  – Simple, historical data driven models early
  – Conventional Linked Fault Tree / Event Tree models later
  – Models increase in complexity and fidelity with the design
• Application of Qualitative Top Down Functional Modeling to identify significant hazards that should drive both the Integrated Hazard Analysis and PRA Master Logic Diagrams
• Consistency and Visibility are Critical!
  – Models
  – Data
  – Methods
  – Tools
• Three types of risk analysis to support RID
  – Mission Risk Models
  – Hazard Quantification
  – Focused assessments and trades
  – Different methods potentially used for each

Page 16: Turner.john

Risk Informed Design Continuum

[Figure: CxP design analysis cycles laid out along the program timeline.
Axis labels: Risk Analysis Fidelity, Design Fidelity, Design Analysis Cycles
Milestones: SRR, SDR, PDR, CDR
Phases: Early Concept Exploration; Define initial mission architecture; Define Requirements; Early Design; Preliminary Design; Detailed Design; Verification (one label reads TBD)
Models: simple models to complex models
Data: heritage and surrogate data to test / demonstrated data
Use: architecture trades, then design improvement, then verification]

Page 17: Turner.john

Architecture Trade Studies

[Figure: chart titled “Mars Mission Architecture Risk Assessment”. Reference missions (Architectures 1 through 10, listed in the order 2, 9, 4, 7, 1, 3, 8, 5, 10, 6) are plotted against Risk FOM on an axis from 0.00 to 3.00. Legend: Systems Reliability, Entry / Landing, Mars Orbit Insertion, Launch / Integration, Trans Mars Injection, Mars Ascent, Trans Earth Injection, Other Hazards.]

Example Only – Not Real Data

Page 18: Turner.john

Architecture and System Level Assessments

Example Only – Not Real Data

Page 19: Turner.john

LOC Uncertainty Results

Example Only – Not Real Data

Page 20: Turner.john


Prioritizing Design Mitigation

Example Only – Not Real Data

Page 21: Turner.john

Mission Success Depends Upon a Combination of Many Variables

Launch:
• Time increment between launches
• Launch Availability
• Launch Probability
• Order of Launches

LEO Loiter:
• LEO Loiter Duration
• Ascent Rendezvous Opportunities
• TLI Windows

Vehicle Reliability:
• LOM/LOC

Target Characteristics:
• Redundant Landing Sites
• Multiple opportunities to access a select landing site
• Lighting constraints at target

Launch Strategy:
• Two launch
• Single Launch

Vehicle Performance:
• Orbital Mechanics Variation Tolerance
• Additional Propulsive Capability
• Vehicle Life
• Launch Mass Constraints
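How a few of these variables interact can be illustrated with a toy two-launch model. This is only a sketch under stated assumptions: the first vehicle is already loitering in LEO, the second gets one independent launch attempt every `days_between_attempts` until the loiter limit expires, and every number and function name is hypothetical rather than taken from CxP analysis.

```python
def p_launch_within_window(p_per_attempt, attempts):
    """Chance that at least one of `attempts` independent attempts lifts off."""
    return 1.0 - (1.0 - p_per_attempt) ** attempts

def two_launch_success(p_per_attempt, loiter_days, days_between_attempts, vehicle_lom):
    """Toy mission success: second launch gets off within the loiter window,
    and the vehicles then avoid loss of mission."""
    attempts = int(loiter_days // days_between_attempts) + 1
    return p_launch_within_window(p_per_attempt, attempts) * (1.0 - vehicle_lom)

# Hypothetical inputs: 70% launch probability per attempt, 5% vehicle LOM.
short_loiter = two_launch_success(0.7, loiter_days=4, days_between_attempts=2, vehicle_lom=0.05)
long_loiter = two_launch_success(0.7, loiter_days=14, days_between_attempts=2, vehicle_lom=0.05)

print(f"4-day loiter:  {short_loiter:.4f}")
print(f"14-day loiter: {long_loiter:.4f}")
```

Even in this simplified form, extending loiter duration raises mission success until vehicle reliability becomes the limiting term, which is why the variables have to be evaluated together rather than one at a time.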

Page 22: Turner.john

RITOS Overview

• RITOS Objective: Elicit expert opinion and historical data related to top program flight risk drivers in order to:
  1. Better understand the risks and associated uncertainties
  2. Identify potential mitigations and/or controls and effective test and verification strategies
  3. Qualitatively assess the adequacy of the currently planned mitigations/controls and test and verification activities
• RITOS Approach:
  – Identify top program risk drivers based on SR&QA products, history, and judgment
  – Elicit expert opinion and historical data related to the risk driver
  – Assess currently planned mitigation/control and test and verification strategies based on elicitation results, historical data, and judgment
  – Provide recommendations to T&V for enhancing the currently planned approach to risk driver mitigation/control and test and verification

Page 23: Turner.john

SR&QA Scope

♦ Are planned analysis and test (ground/flight) adequate to characterize and burn down risk? (SR&QA Focus)
  • Type, scope, and fidelity of tests
  • Frequency of tests
♦ Is the plan executable? (SE&I Focus)
  • Budget
    − Enough $
  • Schedule
    − Fabrication / Integration / Need dates
    − Test, Fix, Fly
  • Analysis and Reaction Time
  • Facilities
    − Do we have the right facilities
    − Availability
  • Test articles
    − Availability, fidelity, re-use issues, timing

Page 24: Turner.john

Risk Topic Selection

• Risk topics are chosen based on their priority in the various SR&QA risk product results, historical data, and SR&QA judgment
• Initial topic list:
  – MMOD Impact to Orion for ISS DRM
  – First Stage/Upper Stage Separation
  – Orion descent and landing
  – Upper Stage Engine
  – Launch Abort System
  – Upper Stage/Orion Separation
  – Thermal Protection System
• List can be further expanded as new risk topics are identified

Page 25: Turner.john

Expert Elicitation

• Each risk topic is researched in order to understand the mechanisms and/or phenomena that drive the risk
• Attempt to identify an expert from each discipline area related to the risk
  – External candidates
  – Historical failure experts
  – CxP internal subject matter experts
  – NESC panelists from applicable studies
• Elicitation is a structured one-on-one discussion with the candidate in which various topics related to the risk are discussed in the context of:
  – Risk calculation, characterization, and uncertainty
  – Test and Verification
  – Mitigations and Controls
• Following elicitation, results are combined into themes and organized such that they are useful to the assessment

Page 26: Turner.john

Results Format

• RITOS objective is to provide results that are beneficial to CxP IT&V and SR&QA
• RITOS approach can be modified for each risk topic to accommodate IT&V needs
• Results are qualitative, but can provide a “sanity check” of T&V plans
• Results could be useful in prioritizing test objectives
• Results will be presented in two formats:
  1. Bulleted form as elicitation result conclusions
  2. Swimlane chart depiction of currently planned T&V activities with RITOS recommendations mapped into the process flow

Page 27: Turner.john

RITOS Progress to Date

[Figure: status chart tracking each risk topic (MMOD, FS/US Sep, US Engine, ED&L, TPS, LAS, US/Orion Sep) through four steps: Initial Research; Candidate ID / Question Dev; Schedule and Conduct Elicitations, Compile Results; Assess Existing Test Plan Status. Legend: On-hold, Reduced progress, Normal progress. The per-topic status is not recoverable from the extracted text.]

Page 28: Turner.john

RITOS Lessons Learned

• Obtaining access to CxP internal subject matter experts has proven to be challenging. Working through Level II representatives has helped but has not fully solved the problem.
• Coordination with Projects is challenging and time consuming. Where a study was delayed, we moved ahead to future topics to keep progressing.
• Obtaining test plans is challenging, and in some cases test plans do not exist. Where test plans are not available, we package results in a way that can be used during test plan development. The test plan will be assessed once it becomes available.
• Typical RITOS elicitation results are qualitative, but can provide a sanity check of test plans and insight into test prioritization when reductions are being considered. Conclusions obtained from elicitation themes are provided to Cx IT&V.

Page 29: Turner.john

Conclusions

• Risk Informed Design provides a methodology to incorporate risk information early in the design process and obtain a more optimal balance of design commodities and risk than traditional rule of thumb safety design criteria
• Risk Informed Test utilizes risk information to identify areas where test can be used more effectively to reduce uncertainty and risk prior to transition to operations
• Experience to date in the Constellation program indicates the value of RID and RIT, but additional work is needed to develop more consistent methods and tools to accomplish them
• In order to eliminate defects and thus reduce actual failures, programs and projects need to proactively address defect sources in Design, Test, Mission Assurance, and Operations
  – This presentation only addresses two of these four “buckets”

Page 30: Turner.john


Backup

Page 31: Turner.john

COMPONENTS AND FLOW OF A TYPICAL PRA MODEL

[Figure: flow diagram of a typical PRA model. The connections between boxes are not recoverable from the extracted text; the labels are:
Inputs: Phase I Results, FMEAs/CILs, Hazard Reports, Functional Analyses, Previous Risk Assessments; Flight Rules, Training Manuals, System Architecture, Engineering Expertise; MADS, PRACA, Industry databases, Other assessments (e.g. off-line simulation models)
Model elements: MLD Development, List of Initiating Events, Event Trees, Fault Trees, Data Analyses, Assumptions, SAPHIRE
End States: list of consequences of interest (for Shuttle: LOCV, Loss of Crew & Vehicle)
Outputs: Cut Sets, Relative risk drivers, Risk Levels for selected end states, Reviewed by Program Organizations
Example cut sets: CCF A,B,C 1E-3; Gas Explosion 2E-4; A fails, B fails, C fails 1.5E-4; etc.]

• Something that this graphic does not display is the necessary engineering analysis that must be done to support success criteria and capacity
• A large number of pages of detailed documentation are required
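The example cut sets on this slide can be rolled up into an end-state probability and ranked as relative risk drivers. The sketch below uses two common approximations, the rare-event sum and the minimal cut set upper bound; treating the cut sets as independent is an assumption made here for illustration, and the function names are not taken from any particular PRA tool.

```python
import math

# Cut-set probabilities as listed on the slide (illustrative values).
CUT_SETS = {
    "CCF A,B,C": 1.0e-3,
    "Gas Explosion": 2.0e-4,
    "A fails, B fails, C fails": 1.5e-4,
}

def rare_event_sum(probabilities):
    """Rare-event approximation: add the cut-set probabilities."""
    return sum(probabilities)

def min_cut_upper_bound(probabilities):
    """Minimal cut set upper bound, assuming independent cut sets."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)

def ranked_contributions(cut_sets):
    """Fraction of the rare-event total owed to each cut set, largest first."""
    total = rare_event_sum(cut_sets.values())
    shares = [(name, p / total) for name, p in cut_sets.items()]
    return sorted(shares, key=lambda item: item[1], reverse=True)

rare = rare_event_sum(CUT_SETS.values())
mcub = min_cut_upper_bound(CUT_SETS.values())
ranked = ranked_contributions(CUT_SETS)

print(f"rare-event sum: {rare:.3e}")
print(f"min cut upper bound: {mcub:.3e}")
for name, share in ranked:
    print(f"{name}: {share:.0%}")
```

On these numbers the common cause failure dominates the independent triple failure of A, B, and C, which is the kind of insight the “relative risk drivers” output is meant to provide.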

Page 32: Turner.john

RITOS Process

RITOS Overview, 10/19/2009