CS, AU Henrik Bærbak Christensen 1
Critical Systems
Sommerville 7th Ed
Chapter 3
Critical Systems
Sommerville: a critical system is one where dependability is the most important quality. Three main types:
– Safety-critical systems: A system whose failure may result in injury, loss of life or serious environmental damage
– Mission-critical systems: A system whose failure may result in the failure of some goal-directed activity
– Business-critical systems: A system whose failure may result in very high costs for users of the system.
Dependability
Dependability equals trustworthiness
– Degree of user confidence that the system will operate as they expect
– Not a numerical/quantitative measure but a relative/perceived measure (from very high to very low)
Engineering of dependable systems often
– Is conservative: only proven methods are used
– Is more costly: may, for example, use formal methods
– Must consider the socio-technical system
• Humans to handle errors; humans as a source of errors
Dependability subqualities
Dependability
Availability:
– Probability that it will be able to deliver useful service at any given time
Reliability:
– Probability that it will correctly deliver services as expected over a given period of time
Safety:
– Judgment of how likely it is that the system will cause damage to people or environment
Security:
– Judgment of how likely it is that the system can resist accidental or deliberate intrusions
Dependability versus performance
Dependability costs performance, but usually dependability is more important than performance:
– Undependable systems are unused
– Failure may cost fortunes
– Dependability cannot be retrofitted
– Lack of performance can be compensated for
– Untrustworthy systems may lose information
Measuring
Two of the four aspects are measured qualitatively, that is, based upon judgment: security and safety.
Often one talks about integrity levels.
– Level 1 is better than level 2, etc.
Example: NASA Space Shuttle mission software
– Fault severity levels
• Level 0 = Loss of craft and crew
• Level 1 = Failure of mission
• Level 2 …
Measuring
Two of the four aspects may be measured quantitatively:
– Availability: Probability that a system at a point in time will be operational and able to provide services
– Reliability: Probability that a software system will not cause the failure of the system for a specified time under specified conditions.
Exercises
How do availability and reliability as defined by Sommerville fit the definitions by IEEE and Bass?
Why does Bass not cover qualities such as Safety and Reliability?
An available system – does it really require “at any time?”
Other sub qualities
Other sub qualities of dependability
– Repairability: time to repair
– Maintainability: cost of introducing change
– Survivability: ability to continue to deliver services while under attack or while part of the system is disabled [particularly important to web systems]
– Error tolerance: the extent to which the system has been designed so that user input errors are avoided and tolerated
Cost
Dependable systems are costly!
Reliability and Availability
The two
These two qualities are similar but not the same.
– Both are probabilities, but
– A system can be highly available without being highly reliable
• Telephone switch systems: no dial tone, just try again
– A connection may fail, but if reconnecting is quick, then no harm is done
Availability depends on the time it takes to fix the error
– A: fails once a year, fixing takes three days
– B: fails once a month, fixing takes 10 minutes
– A is the more reliable, B is the more available
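The A/B comparison can be checked with a little arithmetic. The sketch below assumes a 365-day year and treats availability as the fraction of time the system is up (one minus downtime over total time). The function name `availability` and the use of the failure count as a rough stand-in for reliability are illustrative assumptions, not definitions taken from Sommerville.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: 365-day year


def availability(failures_per_year, repair_minutes):
    """Fraction of the year the system is up (simple uptime ratio)."""
    downtime = failures_per_year * repair_minutes
    return 1 - downtime / MINUTES_PER_YEAR


# System A: fails once a year, fixing takes three days
avail_a = availability(1, 3 * 24 * 60)
# System B: fails once a month, fixing takes 10 minutes
avail_b = availability(12, 10)

print(f"A: {avail_a:.4%} available, 1 failure per year")
print(f"B: {avail_b:.4%} available, 12 failures per year")
```

Under these assumptions A is up roughly 99.18% of the time and B roughly 99.98%, even though B fails twelve times as often.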
The two
However, of course they are related
– An unreliable system will most certainly be unavailable…
Why does Bass not mention reliability but does mention availability?
Ensuring reliability
Reliability is compromised by failures, so it can be enhanced by several measures.
– Fault avoidance: simply avoid introducing defects!
– Fault detection and removal: find and remove the defects before they cause failures.
– Fault tolerance: ensure that faults do not lead to failures.
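As a minimal sketch of fault tolerance, assume a hypothetical `average` function that contains a defect: it does not handle an empty list. The wrapper `tolerant_average` and its fallback value are illustrative assumptions. The point is only that the fault is still present but no longer leads to a failure visible to the caller.

```python
def average(values):
    # Deliberate defect for illustration: division by zero on an empty list
    return sum(values) / len(values)


def tolerant_average(values, fallback=0.0):
    """Fault tolerance: catch the error at run time and recover."""
    try:
        return average(values)
    except ZeroDivisionError:
        # The fault is still in average(), but it does not become a failure
        return fallback


print(tolerant_average([2, 4, 6]))
print(tolerant_average([]))
```

By contrast, fault avoidance would mean never writing the defect in the first place, and fault detection and removal would mean a test with an empty list exposing it before release.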
Run-time cycle Revisited
Faults cause failures when faulty code is executed with inputs that expose the fault.
– I_e: input that will lead the system into error state
[Figure labels: Input space, I_e, Program execution, state, error states]
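The idea that only a subset I_e of the input space exposes a fault can be illustrated with a small sketch. The faulty `is_leap_year` function, the reference implementation and the chosen year range are all hypothetical. The sketch compares outputs rather than internal states, which is a simplification of the picture above.

```python
def is_leap_year(year):
    # Deliberate defect for illustration: ignores the 100 and 400 rules
    return year % 4 == 0


def is_leap_year_correct(year):
    # Reference behaviour (Gregorian calendar rule)
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


input_space = range(1890, 2111)

# I_e: the inputs for which executing the faulty code gives a wrong result
I_e = [y for y in input_space if is_leap_year(y) != is_leap_year_correct(y)]

print(I_e)
print(f"{len(I_e)} of {len(input_space)} inputs expose the fault")
```

For every other input in the range the faulty code behaves correctly, so the fault can stay dormant for a long time.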
How does each technique cope?
A) Avoidance? B) Detection and removal? C) Tolerance?
[Figure labels: Input space, I_e, Program execution, state, error states]
Safety
Terminology
Safety brings its own vocabulary
– Accident: Unplanned event or series of events which results in death, injury, or damage to property or the environment.
– Hazard: A condition with the potential for causing or contributing to an accident.
– Damage: A measure of the loss resulting from the accident.
– Hazard severity: Assessment of the worst damage that could result from a hazard.
– Hazard probability: Probability of the events occurring which create a hazard.
Exercise
Therac-25 Cancer Radiation Therapy
– Malfunction 54…
A software error killed Cox and Kidd. It involved the apparently straightforward operation of switching the machine between two operating modes. Linear accelerators, including the Therac-25, can produce two kinds of radiation beams: electron beams and X-rays. Patients are treated with both kinds. First, an electron beam is generated. It may irradiate the patient directly; alternatively, an X-ray beam can be created by placing a metal target into the electron beam: as electrons are absorbed in the target, X-rays emerge from the other side. However, the efficiency of this X-ray-producing process is very poor, so the intensity of the electron beam has to be massively increased when the target is in place. The electron beam intensity in X-ray mode can be over 100 times as great as during an electron beam treatment.
However, if the operator selected X-rays by mistake, realized her error, and then selected electrons--all within 8 seconds [1, 13]--the target was withdrawn but the full-intensity beam was turned on. This error--trivial to commit-- killed Cox and Kidd. Measurements at Tyler by physicist Fritz Hager, in which he reproduced the accident using a model of a patient called a "phantom," indicated that Kidd received a dose of about 25,000 rads-- more than 100 times the prescribed dose [1, 2, 5].
What are the accident, hazard, damage, hazard severity, hazard probability…
Techniques
Hazard avoidance
Hazard detection and removal
Damage limitation
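To make the three techniques concrete, here is a minimal sketch loosely inspired by the beam and target situation in the exercise. It is not the Therac-25 software. The class `BeamController`, its modes and the intensity numbers are all hypothetical and chosen only for illustration.

```python
class HazardError(Exception):
    """Raised when a hazardous configuration is detected."""


class BeamController:
    # Hypothetical intensity caps per mode, in arbitrary units
    MAX_INTENSITY = {"electron": 1, "xray": 100}

    def __init__(self):
        self.mode = "electron"
        self.target_in_place = False
        self.intensity = 1

    def select_mode(self, mode):
        # Hazard avoidance: mode, target and intensity always change
        # together, so this method never configures high intensity
        # without the target in place.
        self.mode = mode
        self.target_in_place = (mode == "xray")
        self.intensity = self.MAX_INTENSITY[mode]

    def fire(self):
        # Hazard detection and removal: re-check the state before acting
        if self.intensity > 1 and not self.target_in_place:
            raise HazardError("high intensity without target")
        # Damage limitation: never exceed the cap for the current mode
        return min(self.intensity, self.MAX_INTENSITY[self.mode])


controller = BeamController()
controller.select_mode("xray")      # operator selects X-rays by mistake
controller.select_mode("electron")  # and corrects the selection
print(controller.fire())
```

The design choice is to keep all three layers: even if the avoidance in `select_mode` were bypassed, the check in `fire` would still refuse the hazardous state, and the cap bounds the delivered intensity.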
Security
Terminology
Security also brings its own vocabulary
– Exposure: Possible loss or harm to the system.
– Vulnerability: A weakness in the computer-based system that can be exploited to cause harm or loss.
– Attack: An exploitation of a vulnerability.
– Threats: Circumstances that have the potential to cause loss or harm. (Vulnerability subjected to attack)
– Control: Protective measure that reduces a system’s vulnerability.
Types of damage
Denial of service: System is forced into a state where its normal services become unavailable.
Corruption of programs or data: Software components are altered in unauthorized ways.
Disclosure of confidential information: An attack exposes confidential information to unauthorized personnel.
Techniques
Vulnerability avoidance
Attack detection and neutralization