14
Californ ia Institut e of Technolo gy Formalized Pilot Study of Safety-Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. The work was sponsored by the NASA Office of Safety and Mission Assurance under the Software Assurance Research Program led by the NASA Software IV&V Facility. This activity is managed locally at JPL through the Assurance and Technology Program Office OSMA Software Assurance Symposium September 4-6, 2002

California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

Embed Size (px)

Citation preview

Page 1: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

California Institute of Technology

Formalized Pilot Study of Safety-Critical Software Anomalies

Dr. Robyn Lutz and Carmen Mikulski

This research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. The work was sponsored by

the NASA Office of Safety and Mission Assurance under the Software Assurance Research Program led by the NASA Software IV&V Facility.

This activity is managed locally at JPL through the Assurance and Technology Program Office

OSMA Software Assurance Symposium

September 4-6, 2002

Page 2: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

SAS’02 2

California Institute of TechnologyTopics

• Overview• Results

– Quantitative analysis– Evolution of requirements– Pattern identification & unexpected

patterns• Work-in-progress• Benefits

Page 3: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

SAS’02 3

California Institute of Technology

Overview

• Goal: To reduce the number of safety-critical software anomalies that occur during flight by providing a quantitative analysis of previous anomalies as a foundation for process improvement.

• Approach: Analyzed anomaly data using adaptation of Orthogonal Defect Classification (ODC) method– Developed at IBM; widely used by industry– Quantitative approach– Used here to detect patterns in anomaly data– More information at

http://www.research.ibm.com/softeng

• Evaluated ODC for NASA use using a Formalized Pilot Study [Glass, 97]

Page 4: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

SAS’02 4

California Institute of Technology

Overview: Status

• Year 3 of planned 3-year study Plan Design Conduct Evaluate Use

• FY’03 extension proposed to extend ODC work to pre-launch and transition to projects (Deep Impact, contractor-developed software, Mars Exploration Rover testing)

• Adapted ODC categories to operational spacecraft software at JPL:– Activity: what was taking place when anomaly

occurred?– Trigger: what was the catalyst?– Target: what was fixed?– Type: what kind of fix was done?

Page 5: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

SAS’02 5

California Institute of Technology

Results: ODC Adaptation

Activities Triggers

Targets Types

Software Configuration Function/Algorithm Hardware Configuration

Ground Software Interfaces

System Test Start/Restart, Shutdown Assignment/Initialization Command Sequence Test

Timing

Inspection/Review Function/Algorithm Recovery Interfaces Normal Activity Flight Software Assignment/Initialization Flight Operations Data Access/Delivery Timing Special Procedure Flight Rule Hardware Failure Build /Package Install Dependency Unknown Unknown Packaging Scripts Ground Resources Resource Conflict Info. Development Documentation Procedures Hardware Hardware None/Unknown Nothing Fixed Unknown

Adapted ODC classification to post-launch spacecraft Incident Surprise Anomalies (ISAs)

Page 6: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

SAS’02 6

California Institute of TechnologyResults: Quantitative Analysis

• Analyzed 189 Incident/Surprise/Anomaly reports (ISAs) of highest criticality from 7 spacecraft– Cassini, Deep Space 1, Galileo, Mars Climate Orbiter, Mars

Global Surveyor, Mars Polar Lander, Stardust

• Institutional defect database Access database of data of interest Excel spreadsheet with ODC categories Pivot tables with multiple views of data

• Frequency counts of Activity, Trigger, Target, Type, Trigger within Activity, Type within Target, etc.

• User-selectable representation of results • User-selectable sets of spacecraft for comparison• Provides rapid quantification of data

Page 7: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

SAS’02 7

California Institute of Technology

Results: Quantitative Analysis

Distribution of Triggers within Activity

0

10

20

30

40

50

60

70

80

Dat

a A

cces

s/D

eliv

ery

Har

dwar

e F

ailu

re

Nor

mal

Act

ivity

Rec

over

y

Spe

cial

Pro

cedu

re

Cm

d S

eq T

est

Har

dwar

eC

onfig

urat

ion

Insp

ectio

n/R

evie

w

Sof

twar

e C

onfig

urat

ion

Sta

rt/R

esta

rt/S

hutd

own

Unk

now

n

Flight Operations System Test Unknow n

Total

PROJECT (All)

Count of Trigger

Activity Trigger

Drop More Series Fields Here

Page 8: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

SAS’02 8

California Institute of Technology

Results: Quantitative Analysis

Flig

ht S

oftw

are

Gro

und

Softw

are

Sys

tem

Tes

t - T

imin

g

Sys

tem

Tes

t - In

terf

aces

Sys

tem

Tes

t -F

unct

ion/

Alg

orith

m

Sys

tem

Tes

t -A

ssig

nmen

t/Ini

tializ

atio

n

Flig

ht O

pera

tions

- T

imin

g

Flig

ht O

pera

tions

- In

terf

aces

Flig

ht O

pera

tions

-F

unct

ion/

Alg

orith

m

Flig

ht O

pera

tions

-A

ssig

nmen

t/Ini

tializ

atio

n

0

2

4

6

8

10

12

14

16

Data Access/Delivery

Hardw are Failure

Normal Activity

Recovery

Special Procedure Flight Softw are

Ground Resources

Ground Softw are

Hardw are

Information Development

None/Unknow n0

5

10

15

20

25

30

Trigger vs. Target

Ground/Flight S/W vs. Type within Activity

Page 9: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

SAS’02 9

California Institute of Technology

Results: Evolution of Requirements• Anomalies sometimes result in changes to

software requirements• Finding:

– Change to handle rare event or scenario (software adds fault tolerance)

– Change to compensate for hardware failure or limitations (software adds robustness)

• Contradicts assumption that “what breaks is what gets fixed”

Example: Damaged Solar Array Panel cannot deploy as planned

Activity = Flight Operations (occurred during flight) Trigger = Hardware failure (Solar Array panel incorrect position--

broken piece rotated & prevented latching)

Target = Flight Software (Fixed via changes to flight software)

Type = Function/Algorithm (Added a solar-array-powered hold capability to s/w)

Page 10: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

SAS’02 10

California Institute of TechnologyResults: Pattern Identification

• Sample Question: What is the typical signature of a post-launch critical software anomaly?

• Finding: – Activity = Flight Operations– Trigger = Data Access/Delivery– Target = Information Development– Type = Procedures

• Example: Star Scanner anomaly– Activity = occurred during flight – Trigger = star scanner telemetry froze – Target = fix was new description of star calibration – Type = procedure written

Page 11: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

SAS’02 11

California Institute of Technology

Results: Unexpected Patterns

Examples of Unexpected ISA patterns:

Process Recommendation: Example (from spacecraft):

22% of critical ISAs had ground software as Target (fix)

Software QA for ground software

Unable to process multiple submissions. Fixed code.

23% of critical ISAs had procedures as Type

Assemble checklist of needed procedures for future projects

Not in inertial mode during star calibration. Additions made to checklist to prevent in future.

Of these, 41% had Data access / delivery as Trigger

Better communication of changes and updates to operations

Multiple queries for spacecraft engineering and monitor data failed. Streamlined notification to operators of problems.

34% of critical ISAs involving system test had software configuration as Trigger (cause) ; 24% had hardware configuration as Trigger

Additional end-to-end configuration testing

OPS personnel did not have a green command system for the uplink of two trajectory-correction command files. Problems resulted from a firewall configuration change.

Page 12: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

SAS’02 12

California Institute of Technology

Work-In-Progress

• Assembling process recommendations tied to specific findings and unexpected patterns; in review by projects

• Working to incorporate ODC classifications into next-generation problem-failure reporting database (to support automation & visualization)

• Disseminating results: invited presentations to JPL Software Quality Improvement task, to JPL Mission Assurance Managers, to MER, informal briefings to other flight projects; at Assurance Technology Conference (B. Sigal), included in talk at Metrics 2002 (A. Nikora), at 2001 IFIP WG 2.9 Workshop on Requirements Engineering; papers in 5th IEEE Int’l Symposium on Requirements Engineering and The Journal of Systems and Software (to appear).

Page 13: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

SAS’02 13

California Institute of Technology

Work-In-Progress

• Collaborating with Mars Exploration Rover to experimentally extend ODC approach to pre-launch software problem/failure testing reports– Adjusted ODC classifications to testing phases

(build, integration, acceptance)– Delivered experimental ODC analysis of 155

Problem/ Failure Reports to MER – Feedback from Project has been noteworthy– Results can support tracking trends and progress:

• Graphical summaries • Comparisons of testing phases

– Results can provide better understanding of typical problem signatures

Page 14: California Institute of Technology Formalized Pilot Study of Safety- Critical Software Anomalies Dr. Robyn Lutz and Carmen Mikulski This research was carried

SAS’02 14

California Institute of TechnologyBenefits

• User selects preferred representation (e.g., 3-D bar graphs) and set of projects to view

• Data mines historical and current databases of anomaly and problem reports to feed-forward into future projects

• Uses metrics information to identify and focus on problem areas

• Provides quantitative foundation for process improvement

• Equips us with a methodology to continue to learn as projects and processes evolve