Decision and Risk Analysis with Applications
Introduction
DC335 Martin Neil
Slide 2
Learning Goals
• At the end of this course you should be able to:
– Quantify and reason about risk
– Use, in depth, decision support tools
– Analyse and design probabilistic risk models for a wide range of application areas
– Reason about and control software engineering risk
– Understand and use basic probability theory
• These skills will be useful when:
– Designing and building Bayesian expert systems
– Anticipating and controlling project risks
– Performing formal risk assessments of safety or mission critical systems
– Reasoning about risk and uncertainty in everyday situations
Slide 3
Reading
• No recommended text. Any introductory book on probability will help
• References:
1. Larson H J, "Introduction to Probability", Addison-Wesley, 1995.
2. Lewis H W, "Why flip a coin: the art and science of good decisions", John Wiley & Sons, 1997.
3. Dewdney A K, "200% of nothing", John Wiley & Sons, 1997.
4. Lindley D V, "Making Decisions", John Wiley & Sons, 1994.
5. Lee P, "Bayesian Statistics: An Introduction", Arnold, 1997 (first published 1989).
6. Jensen F V, "An Introduction to Bayesian Networks", UCL Press, 1996.
7. Pearl J, "Probabilistic reasoning in intelligent systems", Morgan Kaufmann, Palo Alto, CA, 1988.
Slide 4
Software Engineering is Risky!
• Risks in development, operation and maintenance
• Typical Risks:
– User harm or financial loss
– Poor quality products
– Poor quality design processes
– Poor tools and methods
– Poor skills and low expertise
– Budget overruns and late delivery
– Damage to organisational or personal reputation
Slide 5
Basic Definitions and Terminology
• Risk 1 - “chance of bad consequences, loss, failure”
• Risk 2 - chance of event × severity of event
• Decision - “act of deciding in favour or against a possible action”
• Risk analysis - the science and art of specifying actions and events and identifying the probability and desirability of their occurrence
• Decision support - the provision of aids to help decision makers explore risky problems and predict possible outcomes
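The "Risk 2" definition (chance × severity) can be sketched in a few lines. This is only an illustration: the `Risk` class, the event names and every number below are hypothetical and are not taken from the course.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # chance of the event, between 0 and 1 (assumed value)
    severity: float     # loss if the event occurs, in pounds (assumed value)

    @property
    def exposure(self) -> float:
        # "Risk 2": chance of event multiplied by severity of event
        return self.probability * self.severity


def rank_by_exposure(risks):
    """Return the risks sorted from highest to lowest exposure."""
    return sorted(risks, key=lambda risk: risk.exposure, reverse=True)


# Hypothetical figures, chosen only to show the arithmetic
risks = [
    Risk("late delivery", 0.40, 50_000),
    Risk("budget overrun", 0.25, 120_000),
    Risk("user financial loss", 0.01, 1_000_000),
]

for risk in rank_by_exposure(risks):
    print(f"{risk.name}: {risk.exposure:,.0f}")
```

Note that a rare but severe event and a frequent but minor one can end up with similar exposure, which is one reason intuition about risk can mislead.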
Slide 6
Why do we need to improve decision making?
• People are not always rational! Perception of risk/chance may not match reality.
• Examples:
– The national lottery
– Modes of transport: Airplane Vs Car Vs Cruise
– Automatic train protection
• Possible biases:
– Availability of more recent cases
– Ignorance of statistical evidence
– Emphasis on easier to remember dramatic events
– Large single consequences often outweigh multiple small consequences
– Illusion of control
• Society is becoming increasingly risk averse
Slide 7
Why do we need to formally evaluate risk?
• Most everyday decisions do not require explicit mathematical models
• Formal risk analyses involve statistical or probabilistic models
• Pros and cons:
– Formal models are open to debate; their forecasts can be repeated; they can help us learn to improve
– Informal models are personal, not open; assumptions are hidden; they are not often open to verification
– Informal models are faster to formulate and use
• Risk analysis assumes a rational perspective:
– Face up to cold reality: model how the world works, not how we would like it to be
– Equal emphasis given to desirable and undesirable outcomes
Slide 8
The Ubiquity of Software
• Society’s increasing reliance on software
• Software used in:
– Medical devices (pacemakers, radiotherapy machines)
– Consumer goods (fuzzy controllers in washing machines and cameras)
– Industrial plant and machinery (automated assembly lines)
– Commercial systems (world banking system, ATMs)
– Transport (fly-by-wire aircraft, car engine management)
– Internet
• Software faults cause system failures
Slide 9
Banking Failures
• Chemical Bank’s ATMs
– 100 000 customers debited twice
– Fault in software
– Underlying cause was bank merger - software changed to cope with merged ATM systems
• VISA UK data centre
– Programmer error caused hundreds of valid cards to be rejected for several hours
Slide 10
Military Systems Failures
• Patriot Missile failure during Gulf War
– Failed to track and intercept Iraqi Scud missile
– Scud struck a US Army barracks, leading to 28 deaths
– Software fault - clock timer error of about 0.34 second
• Star Wars missile crash - Cape Canaveral
– Aries rocket blown up 23 seconds after it was launched
– Instead of heading northeast over the Atlantic it sped south
– Technician accidentally loaded wrong software
– Cost of launch was $5 million
Slide 11
London Ambulance Service Failure
• Novel computer-aided dispatch system collapsed
• System wasn’t tracking accurately the position and status of each ambulance
• Led to downward spiral of delays
• Ambulance crews accustomed to arriving in minutes now took hours
• Multiple causes
– Software assumed perfect ambulance position information
– Recent change introduced memory leak
– Operators were “out of the loop”
Slide 12
Therac-25 Radiotherapy Machine Failure
• Malfunction killed at least two patients; six received severe overdose
• Software designers did not anticipate use of keyboard’s arrow keys
• Possible reasons for failure
– Safety analysis omitted the possibility of a software fault
– Overconfidence in software led to removal of hardware protection
– Programming done to commercial rather than safety-critical standards
Slide 13
Typical Therac-25 Facility
©2000, John Wiley & Sons, Inc. Horstmann/Java Essentials, 2/e
Slide 14
NASA’s Space Shuttle
• First actual launch 12 April 1981 - 3 years late, millions of $ overspend.
• First planned launch cancelled because of computer synchronisation fault
• In 1989 “Program Notes and Waivers” book detailed software faults
Slide 15
Airbus A320
• First civilian fly-by-wire aircraft
• Computer controls:
– Electrical Flight Control System (EFCS) qualifies A320 as a fly-by-wire aircraft
• Accidents:
– Habsheim airshow (Habsheim Video)
– Bangalore
– Warsaw
– Strasbourg
• Poorer safety record than conventional aircraft
• Boeing 777 has been successful so far
Slide 16
Some Well-Known Commercial Failures
• ICL poll tax
– Company successfully sued for supply of faulty software
– ICL’s limited liability defence rejected by judge
• Pepsi Cola
– Fault led to printing of 500,000 winning numbers rather than one
– Company faced a substantial liability
• Air tours
– Air companies are completely dependent on automatic booking systems
– Failure in booking system led to loss of £5m
Slide 17
Basic Risk Assessment Cycle
• Risk identification
– Identify list of risky events that may cause loss
– Identify chance of event occurring
– Measure magnitude of possible losses
– Determine whether total risk is acceptable
• Risk analysis
– Sort events into controllable and uncontrollable sub-sets
– Identify relationships between them (root causes, intermediate events, consequences)
• Risk control
– Identify remedial or mitigating actions for controllable events
– For each action specify magnitude of cost to act
– Choose action(s) that minimise risk for least cost
• Risk monitoring
– Continue to measure and monitor risk
– If risks exceed acceptable level or new risks occur repeat cycle
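The risk control step ("choose action(s) that minimise risk for least cost") can be sketched as a small search over combinations of actions. Everything here is an assumption made for illustration: the loss, the baseline chance, the action names, their costs, and in particular the simplification that each action multiplies the chance of the event by a fixed factor independently of the others.

```python
from itertools import combinations

LOSS = 200_000.0          # hypothetical loss if the risky event occurs
BASE_PROBABILITY = 0.30   # hypothetical chance of the event if nothing is done

# Hypothetical actions: (name, cost to act, factor applied to the chance of the event)
ACTIONS = [
    ("train testers", 15_000.0, 0.6),
    ("buy testing tools", 10_000.0, 0.8),
    ("code inspections", 20_000.0, 0.5),
]


def total_expected_cost(chosen):
    """Cost of the chosen actions plus the expected loss that remains."""
    probability = BASE_PROBABILITY
    cost = 0.0
    for _name, action_cost, factor in chosen:
        probability *= factor   # assumes the actions act independently
        cost += action_cost
    return cost + probability * LOSS


def best_combination(actions):
    """Try every subset of actions, including doing nothing."""
    candidates = [
        combo
        for size in range(len(actions) + 1)
        for combo in combinations(actions, size)
    ]
    return min(candidates, key=total_expected_cost)


best = best_combination(ACTIONS)
print([name for name, _cost, _factor in best], total_expected_cost(best))
```

With these made-up figures a single action turns out cheaper overall than applying every action at once, which shows why the cost to act has to be weighed against the risk it removes.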
Slide 18
Example risk process for testing (1)
• Risks identified
– Software contains severe defects
– Testing and debugging budget will be overspent and the target release date missed
• Results of risk analysis
– Testing quality is poor
– Testers lack the testing techniques
– System requirements are out of date
– Quality of design and coding work is poor
– Budget and schedule fixed
Slide 19
Example risk process for testing (2)
• Risk controls
– Improve tester methods and techniques by recruiting consultant to train in-house personnel
– Buy in testing tools and send testers on testing course
– Quality of coding and design work improved by doing code inspections
– Quality of coding and testing improved by using OO methods
– Plus all combinations of the above improvements
• Risk monitors
– Defects found during inspections
– Effort spent in inspections
– Defects found in testing
– Budget spent in testing
Slide 20
Ubiquity of uncertainty
• Rich language: likely, probable, credible, plausible, possible, chance, frequent, odds
• Uncertainty about propositions and future or past events that are currently unobserved
– Did Cleopatra have a big nose?
– Will it rain tomorrow?
– Baby boy or girl?
• Aim to measure uncertainty numerically for:
– making practical decisions
– predicting events (past or future)
– explaining phenomena (science)
Slide 21
Misunderstanding Randomness and Chance
• Jackie Mason
“I always carry a bomb on the plane; that way I know I’m safe because the chances of another bomber on board are infinitesimally low”
• Which sequence of coin flips is more likely from a fair coin?
A: H H H T T T
B: H T H T H T
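For the coin flip question, a short calculation shows that any specific sequence of six flips of a fair coin is equally likely, even though B "looks" more random than A. The sketch below assumes independent flips; the function name `sequence_probability` is just illustrative.

```python
from fractions import Fraction
from itertools import product


def sequence_probability(sequence, p_heads=Fraction(1, 2)):
    """Probability of one exact sequence of independent flips."""
    probability = Fraction(1)
    for flip in sequence:
        probability *= p_heads if flip == "H" else 1 - p_heads
    return probability


p_a = sequence_probability("HHHTTT")
p_b = sequence_probability("HTHTHT")

# All 2**6 possible sequences; their probabilities must sum to 1
all_sequences = ["".join(flips) for flips in product("HT", repeat=6)]
total = sum(sequence_probability(s) for s in all_sequences)

print(p_a, p_b, len(all_sequences), total)
```

Both sequences have probability 1/64; the intuition that A is less likely confuses "a particular sequence" with "a sequence of this general kind".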
Slide 22
Everyday examples
• Explaining coincidences
– What is p(at least one pair of same birthdays in class of 40)?
1%, 5%, 40%, 90%, 100%?
– Woman in USA won lottery jackpot twice. Is this a rare event?
1 in 10 million, 1 in million, 1 in 1000, 1 in 10?
• Rational lottery player: p(win a higher jackpot | win) if:
– Choose two consecutive numbers
– Choose numbers nearer middle of row
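The birthday question can be worked out directly. This is a sketch under the usual simplifying assumptions, which are not stated on the slide: 365 equally likely birthdays, no leap years, and independent birthdays.

```python
def p_shared_birthday(group_size, days_in_year=365):
    """Probability that at least two people in the group share a birthday."""
    p_all_different = 1.0
    for i in range(group_size):
        # The (i+1)-th person must avoid the i birthdays already taken
        p_all_different *= (days_in_year - i) / days_in_year
    return 1.0 - p_all_different


for n in (23, 40):
    print(n, round(p_shared_birthday(n), 3))
```

For a class of 40 the result is roughly 0.89, so of the options offered 90% is the closest, which is far higher than most people guess.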
Slide 23
Game Show!
• Suppose you are a contestant on a television show and are allowed to pick one of three envelopes (red, green and blue). Two envelopes are empty and the third contains a £1m prize.
• After you choose an envelope, the host takes one of the other envelopes that he knows to be empty and reveals it as empty. He then asks whether you would like to switch or stick.
• After your decision he then reveals the contents of your envelope.
Would you stick or switch?… and why?
Slide 24
Game Show Answer
Red     Green   Blue    Win if
X O     r       r       Stick
X       O       r       Switch
X       r       O       Switch
O       X       r       Switch
r       X O     r       Stick
r       X       O       Switch
O       r       X       Switch
r       O       X       Switch
r       r       X O     Stick
X – prize, r – revealed (either one where two are marked), O – initial choice
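The nine equally likely cases (prize position × initial choice) can also be enumerated exactly. The sketch assumes the host always reveals an empty envelope other than the one chosen, and picks at random when two such envelopes are available; the function name is illustrative.

```python
from fractions import Fraction
from itertools import product

ENVELOPES = ("red", "green", "blue")


def win_probabilities():
    """Return (p_stick_wins, p_switch_wins) by exact enumeration."""
    p_stick = Fraction(0)
    p_switch = Fraction(0)
    for prize, choice in product(ENVELOPES, repeat=2):
        p_case = Fraction(1, 9)  # prize and choice are independent and uniform
        # The host may only reveal an empty envelope that was not chosen
        revealable = [e for e in ENVELOPES if e != prize and e != choice]
        for revealed in revealable:
            p_branch = p_case / len(revealable)
            # Switching means taking the envelope that is neither chosen nor revealed
            switched = next(e for e in ENVELOPES if e != choice and e != revealed)
            if choice == prize:
                p_stick += p_branch
            if switched == prize:
                p_switch += p_branch
    return p_stick, p_switch


p_stick, p_switch = win_probabilities()
print(p_stick, p_switch)  # prints 1/3 2/3
```

Sticking wins only when the first choice was right (3 of the 9 cases), so switching wins in the remaining 6.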
Slide 25
Imprecision Vs Uncertain Inference
• Consider some propositions
– “John is tall” Vs “John may grow to be 6 ft 2 in”
– “It might rain heavily tomorrow” Vs “it might rain 2 in tomorrow”
• Imprecision of attribute
– 6 ft 2 in is more precise than “tall”
– 2 in is more precise than “rain heavily”
• Uncertainty in inference
– “might” rain tomorrow… because of storm front moving in
– “may” grow to be… because his father was tall
• Probability is restricted to measuring uncertainty not imprecision
• Ambiguity - disagreements over definitions (schizophrenia, s/w reliability)
Slide 26
Subjective Vs Objective Probability
• Statistical events are those where extensive repetition under similar conditions is observed
– long-run probability of heads or tails in repeated coin tosses = 0.5
– frequency of rain at this time of year
– traditional concern with gambling games
• Non-statistical events are unique or novel, with little or no previous experience
– will a large meteorite hit the earth in the next decade?
– did Shakespeare write all the plays attributed to him?
• Events that have patterns in nature are said to be objective because of consensus amongst observers. Subjective probabilities are based on personal (unshared) experience.
Slide 27
Definitions
• p(E) - probability of event E
• Subjective: probability as degree of belief in truth or falseness of a proposition/event given prior knowledge H
– Depends on viewpoint and prior experience
» Londoner - p(next bus is red | knowledge about routes and times)
» Tourist - p(next bus is red | assumption that all London buses are red)
• Classical: ratio of frequency of occurrence
– p(next bus is red) = # red buses / # total buses
• Frequentist: probability is the limiting value, as the number of trials becomes infinite, of the frequency of occurrence of some event
– p(coin flip is Heads) = 0.5
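The classical and frequentist definitions can be contrasted with a short sketch. The bus counts are invented for illustration, and the coin is simulated with Python's pseudo-random generator (seeded so the run is repeatable), so this is a demonstration of the idea and not a proof of convergence.

```python
import random


def classical_probability(favourable, total):
    """Classical definition: ratio of favourable cases to all cases."""
    return favourable / total


def relative_frequency(trials, seed=0):
    """Frequentist view: fraction of heads observed in simulated fair flips."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials


# Hypothetical fleet: 30 red buses out of 120 in total
print(classical_probability(30, 120))

# The observed frequency tends to settle near 0.5 as the number of trials grows
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```

The subjective definition has no counterpart here by design: it depends on the prior knowledge H of whoever is making the judgement.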
Slide 28
Problems in Data-driven Approach
• Validity restricted to easily collected data or properly controlled experiments
– But most important issues in software engineering involve soft or people factors
– How much data can you afford to collect before making a decision?
• Data sets must represent homogeneous samples drawn from well defined population
– Are all functions in a program the same?
– Are all projects the same?
• Strength of relationships determined by correlation between variables
– High correlation between shoe size and IQ!
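The shoe size example can be reproduced with invented data. In this sketch both quantities are generated from a third variable (a child's age) and never from each other, yet their correlation is very high. The data, the formulas and the use of a raw test score in place of IQ are all assumptions made purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical children aged 5 to 15
ages = list(range(5, 16))
# Shoe size depends only on age, plus a small fixed wobble
shoe_sizes = [20 + 1.5 * age + (0.5 if age % 2 else -0.5) for age in ages]
# Raw test score depends only on age, plus a different small wobble
test_scores = [10 + 4 * age + 2 * ((age % 3) - 1) for age in ages]

r = pearson(shoe_sizes, test_scores)
print(round(r, 3))
```

The correlation says nothing about shoe size causing higher scores; it arises because age drives both, which is the kind of hidden factor a purely data-driven model can miss.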
Slide 29
Assessing Risk of Road Fatalities: Naïve Approach
[Diagram: Season → Number of Fatalities. Colder months are associated with fewer fatalities.]
Slide 30
Assessing Risk of Road Fatalities: Causal/explanatory model
[Diagram: causal model with nodes Season, Weather, Road Conditions, Average speed, Number of journeys, Danger level, Number of Fatalities]
Slide 31
Decision Support Tool Demonstration
Slide 32
Lesson Summary
• Computers increasingly used in safety, financial and mission critical systems
• Need to assess the risks to producer and user
• High-stakes decisions need to be made rationally using formal tools
• Risk analysis process required
• Subjective and objective probabilities needed to assess risk as part of this process
• Bayesian belief network (BBN) decision support tools as a risk model
Slide 33
Course Overview
• Bayesian Probability Theory
• Bayesian Network modelling with examples
• Measurement of risk
• Principles of risk management
• Building causal models in practice
• Real world examples
• Decision support environments
• Labs using Hugin Bayesian Network tool
• Tutorials on probability theory and risk