Probabilistic Analysis Using a Theorem Prover
Osman Hasan Sofiene Tahar
Hardware Verification Group Concordia University
Montreal, Canada
CADE-22 Tutorial August 2, 2009
Probabilistic Analysis using a Theorem Prover O. Hasan and S. Tahar
Objectives
• Probabilistic Theorem Proving: “A formal verification technique for systems with random or unpredictable components”
  ◦ Why do we need it?
  ◦ What is it?
  ◦ How can we apply it to the analysis of real-world applications?
Outline
• Introduction and Motivation
• Probabilistic Theorem Proving
• Case Studies
  ◦ Coupon Collector’s Problem
  ◦ Stop-and-Wait Protocol
  ◦ Reconfigurable Memory Arrays
• Conclusions
Why System Verification?
[Figure: examples of hardware and software systems]
• Therac-25
  ◦ Software bug in a cancer therapy machine
  ◦ 3 deaths and 3 severe injuries between 1985 and 1987
• FDIV bug in the Intel Pentium
  ◦ Hardware error in the floating-point division unit
  ◦ Resulted in a net loss of US $500M to the company in 1994
• Mars Polar Lander
  ◦ Engine shutdown due to spurious signals that gave a false indication that the spacecraft had landed on Mars
  ◦ Resulted in a loss of US $370M in 1999
• Mars Climate Orbiter
  ◦ Conversion error from English units to metric units
  ◦ Resulted in a loss of US $125M in 1999
• Faulty systems can be disastrous
• Unfortunately, many other examples can be found, and the list is still growing!
• System verification is the process that allows us to find and fix errors in the design phase, where it is cheaper to do so
System Verification
[Figure: a hardware/software system is abstracted into a system model; the model and the properties are fed into a computer-based analysis framework, which answers whether each property is satisfied (Yes/No)]
Simulation
• State-of-the-art system verification approach
• Step 1: Construct a computer-based model of the system
• Step 2: Analyze the behavior of the system model under a number of test cases to deduce properties of interest
Simulation – Example
• 8-bit adder with inputs x, y and output z
• Model: VHDL/Verilog
• Test cases:

Test vectors (x, y) | System output (z) | z = x + y
(1, 1)              | 2                 | True
(4, 0)              | 4                 | True
(100, 100)          | 200               | True
(127, 127)          | 254               | True

• Deduction: The property is true as it is found to be true for all the test vectors used
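The risk in deducing a property from a handful of test vectors can be made concrete with a small sketch. It assumes, purely for illustration, that the adder drops its carry-out so that the output is only 8 bits wide; the slides do not specify this, and `adder8` is a hypothetical model rather than the design shown.

```python
from itertools import product

WIDTH = 8
MASK = (1 << WIDTH) - 1  # keeps the low 8 bits

def adder8(x: int, y: int) -> int:
    """Hypothetical model of an 8-bit adder whose carry-out is dropped (assumption)."""
    return (x + y) & MASK

# The four test vectors listed on the slide.
test_vectors = [(1, 1), (4, 0), (100, 100), (127, 127)]
sampled_ok = all(adder8(x, y) == x + y for x, y in test_vectors)

# Exhaustive check over all 2^16 input pairs.
failures = [(x, y) for x, y in product(range(1 << WIDTH), repeat=2)
            if adder8(x, y) != x + y]

print(sampled_ok)     # whether the sampled vectors all satisfy z = x + y
print(len(failures))  # number of input pairs that violate z = x + y
print(failures[0])    # one concrete counterexample
```

Under this assumption the sampled vectors all pass while the exhaustive search still finds violating inputs, which is the gap that simulation alone cannot close for larger designs.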
Simulation
• Easy to use
• May generate inaccurate results
  ◦ Practically impossible to test all possible cases when dealing with large systems
    ▪ Hardware designs with over a million gates
    ▪ Windows kernel
  ◦ Example
    ▪ 64-bit floating-point division routine
    ▪ There are 2^128 input combinations
    ▪ At 1 test/µs, exhaustive testing takes about 10^25 years
Formal Verification
• Precise and accurate system analysis approach
• Based on mathematical techniques
  ◦ Construct a computer-based mathematical model of the system (implementation)
  ◦ Use mathematical reasoning to check whether the implementation satisfies the properties of interest (specifications) in a computerized environment
• Can be difficult and time-consuming
Formal Verification Techniques
• Model Checking
• Theorem Proving
Model Checking
[Figure: the system is modeled as a finite-state machine (FSM) and the properties are expressed in temporal logic; the model checker returns True if the model satisfies the specification, and a counterexample otherwise]
Model Checking – Example – Traffic Light Controller
• The objective is to prioritize the highway traffic
• The highway light remains green until there is a car at the farm road
• The farm light remains green for only x time units
• There is a yellow light in every transition
Model Checking – Example – Traffic Light Controller
[Figure: state diagram with four states: (Hwy = GREEN, Farm = RED), (Hwy = YELLOW, Farm = RED), (Hwy = RED, Farm = GREEN), (Hwy = RED, Farm = YELLOW). Transition labels: car present on the farm road, no car on the farm road, time x has elapsed, time x has not elapsed]
Model Checking – Example – Traffic Light Controller
[Figure: the system description and the temporal logic properties are fed to the model checker, which returns True, or False together with a counterexample]
• Properties
  ◦ The two lights are never green at the same time
  ◦ The two lights are never red at the same time
  ◦ If a car arrives at the farm road, it will eventually get access
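The idea of exhaustively exploring the state space can be sketched with a tiny explicit-state checker. The four states and their transitions below are an assumed reading of the traffic light state diagram, and the state names (`HG`, `HY`, `FG`, `FY`) are hypothetical; real model checkers such as SMV work on temporal logic and symbolic representations, which this sketch does not attempt.

```python
from collections import deque

# (highway light, farm light) for each state; names are illustrative only.
LIGHTS = {
    "HG": ("GREEN", "RED"),
    "HY": ("YELLOW", "RED"),
    "FG": ("RED", "GREEN"),
    "FY": ("RED", "YELLOW"),
}

def successors(state):
    """Assumed transition relation of the controller."""
    if state == "HG":
        return {"HG", "HY"}   # no car on the farm road / car present
    if state == "HY":
        return {"FG"}
    if state == "FG":
        return {"FG", "FY"}   # time x has not elapsed / has elapsed
    return {"HG"}             # from FY back to highway green

def reachable(initial):
    """Breadth-first search over all states reachable from `initial`."""
    seen, queue = {initial}, deque([initial])
    while queue:
        current = queue.popleft()
        for nxt in successors(current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def check_invariant(initial, invariant):
    """Return None if the invariant holds everywhere, else a violating state."""
    for state in sorted(reachable(initial)):
        if not invariant(LIGHTS[state]):
            return state
    return None

print(check_invariant("HG", lambda l: l != ("GREEN", "GREEN")))  # both green?
print(check_invariant("HG", lambda l: l != ("RED", "RED")))      # both red?
```

A property that fails yields a concrete violating state, which mirrors the diagnostic counterexample a model checker reports.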
Model Checking
• Advantages
  ◦ Automatic (push-button analysis tools)
  ◦ No proofs involved
  ◦ Diagnostic counterexamples
• Disadvantages
  ◦ Limited expressiveness
  ◦ State-space explosion problem
• Model Checking Tools
  ◦ SMV (Symbolic Model Verifier), Carnegie Mellon U.
  ◦ VIS (Verification Interacting with Synthesis), U. of California, Berkeley
  ◦ Formal Check, Cadence Labs
Theorem Proving
[Figure: the system is modeled as a function in logic and each property is expressed as a theorem in logic; the theorem prover is used to construct formal proofs of the system properties]
Logic
• Formal language for
  ◦ Modeling systems
  ◦ Modeling system properties
• Types of logic
  ◦ Propositional logic
    ▪ Boolean algebra, variables ∈ {T, F}
  ◦ First-order logic (predicate logic)
    ▪ Quantification over variables (∀: for all, ∃: there exists)
  ◦ Higher-order logic
    ▪ Quantification over sets and functions
• Decidability: there is an algorithm for deciding the truth of a formula (theorem)
[Figure: spectrum from Propositional Logic through First-Order Logic to Higher-Order Logic; the propositional end is less expressive (-) but decidable (+), the higher-order end is very expressive (+) but undecidable (-)]
Theorem Prover
• A theorem prover consists of
  ◦ A notation (syntax) to express logic
  ◦ A small set of fundamental axioms (facts)
    ▪ A Boolean variable can be True or False: ∀ a. (a = T) ∨ (a = F)
  ◦ A small set of inference (deduction) rules
    ▪ Equality is transitive: ∀ a b c. (a = b) ∧ (b = c) ⇒ (a = c)
• Soundness is assured, as every new theorem must be derived from
  ◦ The basic axioms and primitive inference rules
  ◦ Any other already proved theorems or inference rules
• Theory (a collection of verified theorems in a file)
  ◦ Facilitates the reuse of pre-verified results
Theorem Proving – Example
• Check if y > x for the given system (x is a natural number)
[Figure: system block with input x and output y = (x+1)^2]

Step | Goal                      | Justification
1    | y > x                     | Problem statement
2    | (x+1)^2 > x               | Implementation
3    | (x+1).(x+1) > x           | Definition of square
4    | (x+1).x + (x+1).1 > x     | Distributivity
5    | x.x + 1.x + x.1 + 1.1 > x | Distributivity
6    | x.x + x + x + 1 > x       | Multiplicative identity
7    | x.x + x + 1 + x > x       | Additive commutativity
8    | x.x + x + 1 > 0           | Addition cancellation
9    | True                      | Natural numbers are ≥ 0
Theorem Proving
• Advantages
  ◦ High expressiveness
    ▪ Can be used to analyze essentially any system that can be expressed mathematically
  ◦ Less risk of mistakes (human errors)
  ◦ Some parts of the proofs can be automated
• Disadvantages
  ◦ Detailed and explicit human guidance required
  ◦ The state of the art is limited
• Theorem Proving Tools
  ◦ Boyer-Moore (first-order logic), U. of Texas, Austin
  ◦ PVS (higher-order logic), Stanford Research Institute
  ◦ HOL (higher-order logic), U. of Cambridge, UK
Some Formal Verification Myths
• Formal verification can only be used by mathematicians
  ◦ The tools are primarily based on mathematical concepts that are usually transparent to the user
• The reasoning process is itself prone to errors, so why bother?
  ◦ We aim to reduce design bugs, not eliminate them
• Using formal verification tends to slow the design process
  ◦ The early detection of design bugs allows us to speed up the overall design process
Formal Verification Challenge
[Figure: sources of randomness that challenge formal verification: environmental conditions, aging phenomena, probabilistic choice, unpredictable inputs, noise]
Probabilistic System Verification
[Figure: the hardware/software system model, with its random components expressed as random variables, and its probabilistic and statistical properties are fed to a computer-based analysis framework, which answers whether each property is satisfied]
Probabilistic Analysis Basics – Random Variables
• Discrete random variables
  ◦ Attain a countable number of values
  ◦ Example
    ▪ Dice: values in [1, 6]
• Continuous random variables
  ◦ Attain an uncountable (infinite) number of values
  ◦ Example
    ▪ Uniform (all real numbers in an interval [a, b])
Probabilistic Analysis Basics – Probabilistic Properties
Property                               | Description
Probability Mass Function (PMF)        | Probability that the random variable is equal to some number n
Cumulative Distribution Function (CDF) | Probability that the random variable is less than or equal to some number n
Probability Density Function (PDF)     | Slope of the CDF for continuous random variables
[Figures: example plots for discrete and continuous random variables]
Probabilistic Analysis Basics – Statistical Properties
Property                 | Description
Expectation              | Long-run average value of a random variable
Variance                 | Measure of dispersion of a random variable
Tail Distribution Bounds | Upper limits on the probability that the random variable acquires values far from its expectation (Markov’s and Chebyshev’s inequalities)
Probabilistic Analysis Approaches
                  | Simulation                            | Model Checking (formal)     | Theorem Proving (formal)
Random Components | Approximate random variable functions | Probabilistic state machine | Precise random variable functions
Analysis          | Observing some test cases             | Exhaustive verification     | Mathematical reasoning
Accuracy          | ✗                                     | ✓                           | ✓
Expressiveness    | ✓                                     | ✗                           | ✓
No CPU Time Issue | ✗                                     | ✗                           | ✓
Automation        | ✓                                     | ✓                           | ✗
Outline
✓ Introduction and Motivation
• Probabilistic Theorem Proving
• Case Studies
• Conclusions
HOL Theorem Prover
• Higher-order logic theorem prover
  ◦ University of Cambridge, UK
• Notation: ML
• 5 axioms
• 8 primitive inference rules
• Numerous proof assistants are available
• Built-in mathematical theories of Booleans, lists, sets, integers, real analysis and probability
Probabilistic Theorem Proving
[Figure: proposed framework. The system description, with its random components formalized as discrete and continuous random variables, gives the system model; the system properties are stated as theorems over discrete and continuous random variables. The theorem prover, supported by probabilistic analysis theorems on probabilistic properties (PMF, CDF) and statistical properties (expectation, variance), produces formal proofs of the properties]
Probabilities in HOL
• Formal Verification of Probabilistic Algorithms in HOL, PhD thesis, U. of Cambridge, UK [Hurd, 2002]
• A probabilistic algorithm that
  ◦ Accepts: α
  ◦ Returns: β
  can be modeled in HOL as a deterministic function
      f : α → B∞ → (β × B∞)
  that passes around the infinite Bernoulli sequence (B∞), which provides the source of randomness
Probabilistic Algorithms in HOL – Example 1
• Coin flip (Head, Tail)
      flip : B∞ → (flip_outcome × B∞)

Definition: Coin Flip
⊢ flip s = (if (top element of s) then Head else Tail, remaining portion of s)
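The "pass the sequence around" style can be imitated outside HOL. The sketch below models an infinite Boolean sequence as a function from index to bit; this representation and the helper names `shd`/`stl` written as Python functions are assumptions chosen to echo the HOL definition, not the HOL formalization itself.

```python
from typing import Callable, Tuple

# An infinite Boolean sequence, modeled as a function from index to bit.
BoolSeq = Callable[[int], bool]

def shd(s: BoolSeq) -> bool:
    """Head (top element) of the sequence."""
    return s(0)

def stl(s: BoolSeq) -> BoolSeq:
    """Tail (remaining portion) of the sequence."""
    return lambda i: s(i + 1)

def flip(s: BoolSeq) -> Tuple[str, BoolSeq]:
    """Deterministic function that consumes one bit and returns the rest."""
    return ("Head" if shd(s) else "Tail", stl(s))

# Usage with an arbitrary, fixed sequence T, F, T, F, ...
alternating: BoolSeq = lambda i: i % 2 == 0
first, rest = flip(alternating)
second, _ = flip(rest)
print(first, second)
```

All randomness lives in the sequence argument, so `flip` itself stays a pure function; probabilities are then statements about the set of sequences that produce a given outcome.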
Probabilistic Algorithms in HOL – Example 2
• An n-bit discrete Standard Uniform random variable
      std_unif_disc : num → B∞ → (real × B∞)
• Algorithm:
      $U_n = \sum_{i=1}^{n} (1/2)^i B_i$
  where B_i = 1 if the i-th element of the Bernoulli sequence is T, else 0
• Example: {H1, T2, H3, ..., Hn} → (1/2^1 + 0/2^2 + 1/2^3 + … + 1/2^n) = (0.101...1)_2
[Figures: PMF plots. For n = 2, the values 0, 1/4, 2/4, 3/4 each have probability 1/4; for n = 3, each value has probability 1/8; in general, the values 0, 1/2^n, ..., (2^n - 1)/2^n each have probability 1/2^n]
Probabilistic Algorithms in HOL – Example 2 (Formalization)
• s = (B0, B1, B2, B3, …)
• std_unif_disc 0 s = (0, (B0, B1, B2, B3, …))
• std_unif_disc 1 s = (if B0 then ((1/2) + 0) else 0, (B1, B2, B3, …))
• std_unif_disc 2 s = (if B1 then ((1/4) + fst (std_unif_disc 1 s)) else (fst (std_unif_disc 1 s)), (B2, B3, …))

Definition: Discrete Standard Uniform Random Variable
⊢ std_unif_disc 0 s = (0, s) ∧
  ∀ n. std_unif_disc (n + 1) s =
    (if (shd (snd (std_unif_disc n s)))
       then ((1/2)^(n+1) + fst (std_unif_disc n s))
       else (fst (std_unif_disc n s)),
     stl (snd (std_unif_disc n s)))
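The recursive definition can be mirrored in a short sketch that uses a finite tuple of bits as a stand-in for the infinite Bernoulli sequence (an assumption made so the code can run), with exact rational arithmetic. Enumerating all 2^n bit prefixes then gives the PMF by counting.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def std_unif_disc(n, s):
    """Mirror of the recursive definition; s is a finite tuple of bits (assumption).

    Returns the n-bit value and the unused remainder of the sequence.
    """
    if n == 0:
        return Fraction(0), s
    value, rest = std_unif_disc(n - 1, s)
    head, tail = rest[0], rest[1:]
    if head:
        value += Fraction(1, 2 ** n)
    return value, tail

def pmf(n):
    """PMF obtained by enumerating all 2^n equally likely bit prefixes."""
    counts = Counter(std_unif_disc(n, bits)[0]
                     for bits in product([False, True], repeat=n))
    return {value: Fraction(c, 2 ** n) for value, c in counts.items()}

print(std_unif_disc(3, (True, False, True, True)))
print(pmf(2))
```

Each of the 2^n values appears exactly once in the enumeration, which is the counting argument behind the 1/2^n PMF.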
Probability Theory in HOL
• Probability space (Ω, Σ, P)
• Ω: sample space
  ◦ Set of Boolean sequences
• Σ: events
  ◦ Sigma algebra on Ω: a set of subsets of Ω that is closed under complements and countable unions
• P: probability
  ◦ Function that maps the elements of Σ to the real interval [0, 1]
Probability Theory in HOL
Theorems: Basic Probability Axioms
Probability of the sample space         | ⊢ P (Ω) = 1
Probability bounds                      | ⊢ ∀ A. 0 ≤ P (A) ≤ 1
Probability is monotonically increasing | ⊢ ∀ A B. A ⊆ B ⇒ P (A) ≤ P (B)
Probability is additive                 | ⊢ ∀ A B. A ∩ B = ∅ ⇒ P (A ∪ B) = P (A) + P (B)
Probability of a complement set         | ⊢ ∀ A. P (¬A) = 1 - P (A)

Theorem: Probability of an Element of the Boolean Sequence
⊢ ∀ b. P {s | shd s = b} = ½
Probabilistic Algorithms in HOL – Example 1 (Verification)
• Coin flip (Head, Tail)

Definition: Coin Flip
⊢ flip s = (if (top element of s) then Head else Tail, remaining portion of s)

Theorem: PMF of Coin Flip
⊢ P {s | fst (flip s) = Head} = ½
Probabilistic Algorithms in HOL – Example 2 (Verification)
Theorem: PMF
⊢ ∀ m n x. P {s | fst (std_unif_disc n s) = x} =
    if (x < 0) then 0 else
      (if (x ≥ 1) then 0 else
        (if (x = m/2^n) then (1/2)^n else 0))

Theorem: CDF
⊢ ∀ m n x. P {s | fst (std_unif_disc n s) ≤ x} =
    if (x < 0) then 0 else
      (if (x ≥ 1) then 1 else
        (if (x = m/2^n) then ((m+1)/2^n) else 0))

[Figures: PMF plot in which the values 0, 1/2^n, 2/2^n, ..., (2^n - 1)/2^n each have probability 1/2^n; CDF plot rising in steps of 1/2^n up to 1]
Probabilistic Algorithms in HOL
• The approach described so far is limited to probabilistic algorithms that
  ◦ Can acquire a finite number of values (2^n)
  ◦ Have an occurrence probability of 1/2^n for each value, where n is the number of elements of the Boolean sequence used
• Not all algorithms satisfy these conditions
  ◦ Example: Geometric random variable
    ▪ Returns the index of the first success in an infinite number of coin flips or Bernoulli trials
• Probabilistic while loop
  ◦ The probability of loop termination is 1
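Why such a loop has no fixed bound on the bits it consumes, yet still terminates with probability 1, can be illustrated with a fair-coin sketch. It works on finite bit prefixes (an assumption needed for enumeration); the function names are hypothetical and this is not the HOL while-loop construct.

```python
from fractions import Fraction
from itertools import product

def first_success(bits):
    """1-based index of the first True, or None if the prefix ends first."""
    index = 0
    while index < len(bits):
        if bits[index]:
            return index + 1
        index += 1
    return None  # loop has not terminated within this prefix

def termination_probability(k):
    """Fraction of length-k prefixes on which the loop has already terminated."""
    prefixes = list(product([False, True], repeat=k))
    done = sum(first_success(p) is not None for p in prefixes)
    return Fraction(done, len(prefixes))

for k in (1, 4, 10):
    print(k, termination_probability(k))
```

For a fair coin the printed fractions equal 1 - 1/2^k, which approaches 1 as k grows even though no finite k covers every sequence.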
Discrete Random Variables in HOL
Theorems: Some Formalized Discrete Random Variables
Random variable | PMF
Uniform(m)      | ⊢ ∀ m x. x < m ⇒ P {s | fst (prob_unif m s) = x} = 1/m
Bernoulli(p)    | ⊢ ∀ p. 0 ≤ p ∧ p < 1 ⇒ P {s | fst (prob_bern p s) = x} = p
Geometric(p)    | ⊢ ∀ n p. 0 < p ∧ p ≤ 1 ⇒ P {s | fst (prob_geom p s) = (n + 1)} = p(1 - p)^n
Binomial(m,p)   | ⊢ ∀ m n p. 0 < p ∧ p ≤ 1 ⇒ P {s | fst (prob_bino m p s) = n} = (binomial m n) p^n (1 - p)^(m - n)
Probabilistic Theorem Proving
[Figure: probabilistic theorem proving framework (system model with random components, probabilistic analysis theorems for discrete and continuous random variables, theorem prover, formal proofs of properties)]
Continuous Random Variables in HOL
• Sampling algorithms are non-terminating
  ◦ Tedious formalization and verification
• Inverse Transform Method
  ◦ Extensively used non-uniform random number generation method
  ◦ Formalization of Continuous Probability Distributions, Automated Deduction [Hasan and Tahar, 2007]
[Figure: Standard Uniform random number generator on [0, 1] → Inverse Transform Method → random numbers from continuous distributions (closed-form CDF)]
Standard Uniform Random Variable
• Continuous Uniform random variable on [0, 1]
• Algorithm
      $U = \sum_{i=1}^{\infty} (1/2)^i X_i$, where $X_i = 1$ if the i-th coin flip is a head, else 0
  ▪ {H, T, H, H, ...} → (1/2^1 + 0/2^2 + 1/2^3 + 1/2^4 + …) = (0.1011...)_2
Formalization of Standard Uniform Random Variable
• Step 1
  ◦ Discrete Standard Uniform random variable
    ▪ std_unif_disc
  ◦ Algorithm: $U_n = \sum_{i=1}^{n} (1/2)^i B_i$, where B_i = 1 if the i-th element of the Bernoulli sequence is T, else 0
  ◦ Example: {H1, T2, H3, ..., Hn} → (1/2^1 + 0/2^2 + 1/2^3 + … + 1/2^n) = (0.101...1)_2
[Figures: PMF plots. For n = 2, the values 0, 1/4, 2/4, 3/4 each have probability 1/4; in general, the values 0, 1/2^n, ..., (2^n - 1)/2^n each have probability 1/2^n]
Formalization of Standard Uniform Random Variable
• Step 1
  ◦ Discrete Standard Uniform random variable
    ▪ std_unif_disc: $U_n = \sum_{i=1}^{n} (1/2)^i B_i$

Definition: Discrete Standard Uniform Random Variable
⊢ std_unif_disc 0 s = (0, s) ∧
  ∀ n. std_unif_disc (n + 1) s =
    (if (shd (snd (std_unif_disc n s)))
       then ((1/2)^(n+1) + fst (std_unif_disc n s))
       else (fst (std_unif_disc n s)),
     stl (snd (std_unif_disc n s)))
Formalization of Standard Uniform Random Variable
• Step 1
  ◦ Discrete Standard Uniform random variable
    ▪ std_unif_disc: $U_n = \sum_{i=1}^{n} (1/2)^i B_i$
• Step 2
  ◦ As n tends to infinity: $U = \lim_{n \to \infty} U_n$

Definition: Standard Uniform Random Variable
⊢ ∀ s. std_unif_cont s = lim (λn. fst (std_unif_disc n s))
Verification of Standard Uniform Random Variable
Theorem: CDF of Standard Uniform Random Variable
⊢ ∀ x. P {s | std_unif_cont s ≤ x} =
    if (x < 0) then 0 else (if (x < 1) then x else 1)

• Proof sketch:
  ◦ Verify the CDF of the discrete Uniform random variable
  ◦ Take the limit as n approaches infinity
Cumulative Distribution Function
• Completely characterizes both discrete and continuous random variables
      $F_R(x) = \Pr(R \le x)$

Definition: Cumulative Distribution Function (CDF)
⊢ ∀ R x. cdf R x = P {s | R s ≤ x}
Verification of CDF Properties
Theorems: CDF Properties
Bounds                    | ⊢ ∀ R x. 0 ≤ cdf R x ≤ 1
Monotonic                 | ⊢ ∀ R a b. (a < b) ⇒ (cdf R a ≤ cdf R b)
Interval probability      | ⊢ ∀ R a b. (a < b) ⇒ (P {s | (a < R s) ∧ (R s ≤ b)} = cdf R b - cdf R a)
Negative infinity         | ⊢ ∀ R. lim (λn. cdf R (-n)) = 0
Positive infinity         | ⊢ ∀ R. lim (λn. cdf R n) = 1
Continuous from the right | ⊢ ∀ R a. lim (λn. cdf R (a + 1/(n+1))) = cdf R a
Limit from the left       | ⊢ ∀ R a. lim (λn. cdf R (a - 1/(n+1))) = P {s | R s < a}
Inverse Transform Method
• Random variable X with a well-defined CDF F
• U: Standard Uniform random variable
• F^{-1}: inverse function of F
      $X = F^{-1}(U)$
• The proof utilizes the CDF of the Standard Uniform random variable and the CDF properties

Theorem: Inverse Transform Method
⊢ ∀ f f_inv x. (is_cont_cdf_fn f) ∧ (inv_cdf_fn f_inv f) ⇒
    P {s | f_inv (std_unif_cont s) ≤ x} = f x
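As an informal, numerical counterpart of the Inverse Transform Method theorem, the sketch below draws Standard Uniform samples, pushes them through the inverse of the Exponential CDF, and compares the empirical CDF with F. The pseudo-random generator, seed, rate value, and function names are assumptions for illustration; a sampling experiment only suggests the result, whereas the HOL theorem proves it.

```python
import math
import random

def exp_cdf(a: float, x: float) -> float:
    """CDF of the Exponential distribution with rate a."""
    return 0.0 if x <= 0 else 1.0 - math.exp(-a * x)

def exp_inv_cdf(a: float, u: float) -> float:
    """Inverse CDF: -(1/a) ln(1 - u), for u in [0, 1)."""
    return -math.log(1.0 - u) / a

def sample_exponential(a: float, n: int, seed: int = 0):
    """Inverse transform sampling: apply F^{-1} to Standard Uniform samples."""
    rng = random.Random(seed)
    return [exp_inv_cdf(a, rng.random()) for _ in range(n)]

def empirical_cdf(samples, x: float) -> float:
    """Fraction of samples that are less than or equal to x."""
    return sum(v <= x for v in samples) / len(samples)

samples = sample_exponential(1.5, 100_000)
for x in (0.25, 1.0, 2.0):
    print(x, round(empirical_cdf(samples, x), 4), round(exp_cdf(1.5, x), 4))
```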
Example – Exponential Random Variable
• CDF: F(x) = 0 if x ≤ 0; 1 - exp(-ax) if 0 < x
      (λx. if x ≤ 0 then 0 else (1 - exp (-ax)))
• Inverse CDF: (λx. -(1/a) ln (1 - x))
[Figure: CDF plots for a = 0.5, a = 1.0 and a = 1.5]

Theorem: Valid CDF Function
⊢ ∀ a. is_cont_cdf_fn (λx. if x ≤ 0 then 0 else (1 - exp (-ax)))

Theorem: Valid CDF Inverse Function
⊢ ∀ a. inv_cdf_fn (λx. if x ≤ 0 then 0 else (1 - exp (-ax))) (λx. -(1/a) ln (1 - x))
Example – Exponential Random Variable
Definition: Exponential Random Variable
⊢ ∀ a s. exp_rv a s = (λx. -(1/a) ln (1 - x)) (std_unif_cont s)

Theorem: CDF of Exponential Random Variable
⊢ ∀ a x. (0 < a) ⇒ cdf (λs. exp_rv a s) x =
    if x ≤ 0 then 0 else (1 - exp (-ax))

• Proof
  ◦ Inverse Transform Method theorem
  ◦ Real analysis
Continuous Random Variables in HOL
Theorems: Continuous Random Variables
Random variable | HOL function                                                                  | CDF
Exponential(a)  | ⊢ ∀ a s. exp_rv a s = (λx. -(1/a) ln (1 - x)) (std_unif_cont s)               | 0 if x ≤ 0; 1 - exp(-ax) if 0 < x
Uniform(a,b)    | ⊢ ∀ a b s. uniform_rv a b s = (λx. (b - a)x + a) (std_unif_cont s)            | 0 if x ≤ a; (x - a)/(b - a) if a < x ≤ b; 1 if b < x
Rayleigh(l)     | ⊢ ∀ l s. rayleigh_rv l s = (λx. l sqrt (-2 ln (1 - x))) (std_unif_cont s)     | 0 if x ≤ 0; 1 - exp(-x^2/(2 l^2)) if 0 < x
Triangular(a)   | ⊢ ∀ a s. triangular_rv a s = (λx. a (1 - sqrt (1 - x))) (std_unif_cont s)     | 0 if x ≤ 0; (2/a)(x - x^2/(2a)) if x < a; 1 if a ≤ x
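Each HOL function in the table applies an inverse CDF to the Standard Uniform random variable, so a quick sanity check is that composing the listed CDF with its inverse returns the input. The sketch below tests this numerically on a grid; the Python function names and parameter values are assumptions, and floating-point agreement is only an informal check, not a proof.

```python
import math

def uniform_cdf(a, b, x):
    if x <= a:
        return 0.0
    return (x - a) / (b - a) if x <= b else 1.0

def uniform_inv(a, b, u):
    return (b - a) * u + a

def rayleigh_cdf(l, x):
    return 0.0 if x <= 0 else 1.0 - math.exp(-x * x / (2 * l * l))

def rayleigh_inv(l, u):
    return l * math.sqrt(-2.0 * math.log(1.0 - u))

def triangular_cdf(a, x):
    if x <= 0:
        return 0.0
    return (2.0 / a) * (x - x * x / (2.0 * a)) if x < a else 1.0

def triangular_inv(a, u):
    return a * (1.0 - math.sqrt(1.0 - u))

def max_roundtrip_error(cdf, inv, points=99):
    """Largest |F(F^{-1}(u)) - u| over an evenly spaced grid in (0, 1)."""
    grid = [k / (points + 1) for k in range(1, points + 1)]
    return max(abs(cdf(inv(u)) - u) for u in grid)

print(max_roundtrip_error(lambda x: uniform_cdf(2, 5, x), lambda u: uniform_inv(2, 5, u)))
print(max_roundtrip_error(lambda x: rayleigh_cdf(1.5, x), lambda u: rayleigh_inv(1.5, u)))
print(max_roundtrip_error(lambda x: triangular_cdf(3, x), lambda u: triangular_inv(3, u)))
```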
Probabilistic Theorem Proving
[Figure: probabilistic theorem proving framework (system model with random components, probabilistic analysis theorems for discrete and continuous random variables, theorem prover, formal proofs of properties)]
Formalization of Expectation
• Using Theorem Proving to Verify Expectation and Variance for Discrete Random Variables, JAR [Hasan and Tahar, 2008]
• Summarizes the distribution characteristics of a random variable in a single number
      $Ex[f(R)] = \sum_{n=0}^{\infty} f(n) \Pr(R = n)$

Definition: Expectation of a Function of a Random Variable
⊢ ∀ f R. expec_fn f R = Σ_{n=0}^{∞} (f n) P {s | fst (R s) = n}

Definition: Expectation of a Random Variable
⊢ ∀ R. expec R = expec_fn (λn. n) R
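For a random variable with finitely many values, the two definitions reduce to a finite weighted sum. The sketch below represents a distribution as a dictionary from value to probability (an assumption; the HOL definitions work with functions over Boolean sequences) and uses a uniform distribution on 0..m-1 as the example.

```python
from fractions import Fraction

def expec_fn(f, pmf):
    """Expectation of f(R): sum of f(n) * Pr(R = n) over the support."""
    return sum(f(n) * p for n, p in pmf.items())

def expec(pmf):
    """Expectation of R, i.e. expec_fn applied to the identity function."""
    return expec_fn(lambda n: n, pmf)

def uniform_pmf(m):
    """Illustrative PMF: each value x < m has probability 1/m."""
    return {x: Fraction(1, m) for x in range(m)}

pmf = uniform_pmf(6)
print(expec(pmf))                       # expectation of R
print(expec_fn(lambda n: n * n, pmf))   # expectation of R^2
```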
Verification of Expectation Properties
• Linearity of expectation
      $Ex[\sum_{i=1}^{n} R_i] = \sum_{i=1}^{n} Ex[R_i]$
• Helpful in reasoning about systems involving multiple random variables
• Proof sketch
  ▪ 2 random variables (real analysis + probability theory)
  ▪ General case (induction)

Theorem: Linearity of Expectation
⊢ ∀ L. (∀ R. R ∈ L ⇒ (∃ x. (Σ_{i=0}^{∞} i P {s | fst (R s) = i}) = x)) ⇒
    expec (sum_rv_lst L) = Σ_{n=0}^{length L - 1} expec (el (length L - (n + 1)) L)
Verification of Expectation Properties
• Expectation of a random variable added to and multiplied by constants
      $Ex[a + bR] = a + b\,Ex[R]$
• Proof
  ▪ Linearity of expectation property
  ▪ (real analysis + probability theory)

Theorem: Random Variable Added and Multiplied by Constants
⊢ ∀ R a b. (∃ x. (Σ_{i=0}^{∞} i P {s | fst (R s) = i}) = x) ⇒
    expec (bind R (λm. unit (a + b m))) = a + b (expec R)
Formalization of Variance
• Measure of dispersion of a random variable
      $Var[R] = Ex[(R - Ex[R])^2]$

Definition: Variance of a Random Variable
⊢ ∀ R. variance R = expec_fn (λn. (n - expec R)^2) R
Verification of Variance Properties
• Variance in terms of moments
      $Var[R] = Ex[R^2] - (Ex[R])^2$
• Proof
  ▪ Definitions of variance and expectation
  ▪ (real analysis + probability theory)

Theorem: Variance in Terms of Moments
⊢ ∀ R. (∃ x. (Σ_{i=0}^{∞} i P {s | fst (R s) = i}) = x) ∧
       (∃ x. (Σ_{i=0}^{∞} i^2 P {s | fst (R s) = i}) = x) ⇒
    variance R = expec_fn (λn. n^2) R - (expec R)^2
Verification of Variance Properties
• Linearity of variance
      $Var[\sum_{i=1}^{n} R_i] = \sum_{i=1}^{n} Var[R_i]$
• Proof sketch
  ▪ 2 random variables (real analysis + probability theory + linearity of expectation property)
  ▪ General case (induction)

Theorem: Linearity of Variance
⊢ ∀ L. (∀ R. R ∈ L ⇒ (∃ x. (Σ_{i=0}^{∞} i P {s | fst (R s) = i}) = x) ∧
                      (∃ x. (Σ_{i=0}^{∞} i^2 P {s | fst (R s) = i}) = x)) ⇒
    variance (sum_rv_lst L) = Σ_{n=0}^{length L - 1} variance (el (length L - (n + 1)) L)
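The variance identities can be checked exactly on a small example. The sketch below assumes finite PMFs stored as dictionaries and builds the distribution of a sum of two independent random variables by convolution; independence is an assumption of this sketch, and the fair die is only an illustrative choice.

```python
from fractions import Fraction
from itertools import product

def expec_fn(f, pmf):
    return sum(f(n) * p for n, p in pmf.items())

def variance(pmf):
    """Var[R] = Ex[(R - Ex[R])^2]."""
    mu = expec_fn(lambda n: n, pmf)
    return expec_fn(lambda n: (n - mu) ** 2, pmf)

def sum_independent(pmf_a, pmf_b):
    """PMF of A + B for independent A and B (discrete convolution)."""
    out = {}
    for (x, px), (y, py) in product(pmf_a.items(), pmf_b.items()):
        out[x + y] = out.get(x + y, 0) + px * py
    return out

die = {k: Fraction(1, 6) for k in range(1, 7)}
two_dice = sum_independent(die, die)

# Variance in terms of moments, and linearity of variance for the sum.
moments_form = expec_fn(lambda n: n * n, die) - expec_fn(lambda n: n, die) ** 2
print(variance(die), moments_form)
print(variance(two_dice), 2 * variance(die))
```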
Tail Distribution Bounds
• Upper limits on the probability that the random variable acquires values far from its expectation
• Useful in estimating failure probabilities
• Markov’s inequality
• Chebyshev’s inequality
Markov’s Inequality
• Obtains a weak tail bound in terms of the expectation of a random variable
      $\Pr(R \ge a) \le Ex[R]/a$
• Proof
  ▪ Definition of expectation and its properties
  ▪ (real analysis + probability theory)

Theorem: Markov’s Inequality
⊢ ∀ R a. (∃ x. (Σ_{n=0}^{∞} n P {s | fst (R s) = n}) = x) ∧ (0 < a) ⇒
    P {s | fst (R s) ≥ a} ≤ (expec R)/a
Chebyshev’s Inequality
• Relatively stronger tail bound in terms of the expectation and variance of a random variable
      $\Pr(|R - Ex[R]| \ge a) \le Var[R]/a^2$
• Proof
  ▪ Definitions of expectation and variance and their properties
  ▪ (real analysis + probability theory)

Theorem: Chebyshev’s Inequality
⊢ ∀ R a. (0 < a) ∧
         (∃ x. (Σ_{i=0}^{∞} i P {s | fst (R s) = i}) = x) ∧
         (∃ x. (Σ_{i=0}^{∞} i^2 P {s | fst (R s) = i}) = x) ⇒
    P {s | abs (fst (R s) - expec R) ≥ a} ≤ (variance R)/a^2
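Both inequalities can be compared against exact tail probabilities on a small distribution. The sketch below uses a Binomial(10, 1/2) PMF with rational arithmetic; the choice of distribution and thresholds is an assumption made only for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(m, p):
    """PMF of Binomial(m, p): (binomial m n) p^n (1 - p)^(m - n)."""
    return {n: comb(m, n) * p ** n * (1 - p) ** (m - n) for n in range(m + 1)}

def expec(pmf):
    return sum(n * q for n, q in pmf.items())

def variance(pmf):
    mu = expec(pmf)
    return sum((n - mu) ** 2 * q for n, q in pmf.items())

def prob(pmf, event):
    """Probability of the set of values satisfying `event`."""
    return sum(q for n, q in pmf.items() if event(n))

pmf = binomial_pmf(10, Fraction(1, 2))
mu = expec(pmf)

markov_tail = prob(pmf, lambda n: n >= 8)          # Pr(R >= 8)
markov_bound = mu / 8                              # Ex[R] / a
cheb_tail = prob(pmf, lambda n: abs(n - mu) >= 3)  # Pr(|R - Ex[R]| >= 3)
cheb_bound = variance(pmf) / 3 ** 2                # Var[R] / a^2

print(markov_tail, "<=", markov_bound)
print(cheb_tail, "<=", cheb_bound)
```

In this example the Chebyshev bound is noticeably closer to the true tail than the Markov bound, in line with the "weak" versus "relatively stronger" description.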
Example – Geometric Random Variable

Theorem: Expectation of Geometric Random Variable
⊢ ∀ p. (0 < p) ∧ (p ≤ 1) ⇒ expec (λs. prob_geom p s) = 1/p

Theorem: Variance of Geometric Random Variable
⊢ ∀ p. (0 < p) ∧ (p ≤ 1) ⇒ variance (λs. prob_geom p s) = (1 - p)/p^2

• Proof
  ▪ Definitions of expectation and variance and their properties
  ▪ (real analysis + probability theory)
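The two closed forms can be checked numerically by truncating the defining series, using the Geometric PMF Pr(R = n + 1) = p(1 - p)^n. The truncation length and the value p = 0.25 are assumptions; the sketch approximates the infinite sums rather than evaluating them exactly.

```python
def geometric_moments(p: float, terms: int = 5000):
    """Approximate expectation and variance by truncating the infinite series."""
    mean = sum((n + 1) * p * (1 - p) ** n for n in range(terms))
    second = sum((n + 1) ** 2 * p * (1 - p) ** n for n in range(terms))
    return mean, second - mean ** 2  # variance via Ex[R^2] - (Ex[R])^2

p = 0.25
mean, var = geometric_moments(p)
print(mean, 1 / p)             # series value vs. 1/p
print(var, (1 - p) / p ** 2)   # series value vs. (1 - p)/p^2
```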
Verification of Expectation and Variance Relations
Theorems: Discrete Random Variables
Random variable | HOL function | Expectation | Variance
Uniform(m)      | prob_unif    | m/2         | ((m + 1)^2 - 1)/12
Bernoulli(p)    | prob_bern    | p           | p(1 - p)
Geometric(p)    | prob_geom    | 1/p         | (1 - p)/p^2
Binomial(m,p)   | prob_bino    | mp          | mp(1 - p)
Probabilistic Theorem Proving
[Figure: probabilistic theorem proving framework (system model with random components, probabilistic analysis theorems for discrete and continuous random variables, theorem prover, formal proofs of properties)]
Expectation of Continuous Random Variables
• Riemann integral
      $Ex[X] = \int_{-\infty}^{+\infty} x f_X(x)\,dx$
  ◦ f_X: probability density function (PDF) of random variable X
  ◦ Well known, which facilitates the reasoning process
  ◦ Limited to random variables with well-defined PDFs
  ◦ Requires extended real numbers R̄ = R ∪ {-∞, +∞}
• Lebesgue integral
      $Ex[X] = \int_{\Omega} X\,dP$
  ◦ Ω: sample space and P: probability function
  ◦ Most general definition of expectation
    ▪ Caters for both discrete and continuous random variables
  ◦ Analytically complex to handle
Expectation of Continuous Random Variables
• Formal Reasoning about Expectation Properties for Continuous Random Variables, Formal Methods [Hasan, Abbasi, Akbarpour and Tahar, 2009]
• Formalization of Lebesgue Integral in HOL, U. of Cambridge, UK [Aaron, 2009]
      $Ex[X] = \int_{\Omega} X\,dP$

Definition: Expectation of a Continuous Random Variable
⊢ ∀ X. expec_cont (Ω, Σ, P) X = ∫_Ω X dP

[Figure: the Lebesgue-integral definition is used to derive simplified expressions that involve commonly used arithmetic operations]
Expectation of Continuous Random Variables
• Simplified Expression 1: Bounded random variables
      $E[X] = \lim_{n \to \infty} \left[ \sum_{i=0}^{2^n - 1} \left(a + \frac{i}{2^n}(b - a)\right) P\left\{ a + \frac{i}{2^n}(b - a) \le X < a + \frac{i+1}{2^n}(b - a) \right\} \right]$

Theorem: Expectation of Bounded Random Variables
⊢ ∀ a b X. (0 ≤ a) ∧ (a < b) ∧ (∀s. a ≤ X s ≤ b) ⇒
    expec_cont (Ω, Σ, P) X =
      lim_{n→∞} [ Σ_{i=0}^{2^n - 1} (a + (i/2^n)(b - a)) P {s | a + (i/2^n)(b - a) ≤ X s < a + ((i+1)/2^n)(b - a)} ]
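The bounded-case expression can be evaluated for a finite n to see how it approaches the expectation. The sketch below does this for a Uniform random variable on [a, b], computing each interval probability as a difference of CDF values (valid here because the distribution is continuous); the function names and the values a = 1, b = 3 are assumptions for illustration.

```python
from fractions import Fraction

def uniform_cdf(a, b, x):
    """CDF of the continuous Uniform distribution on [a, b]."""
    if x <= a:
        return Fraction(0)
    if x >= b:
        return Fraction(1)
    return Fraction(x - a) / Fraction(b - a)

def bounded_expectation_approx(cdf, a, b, n):
    """n-th term of the limit: sum over 2^n subintervals of [a, b]."""
    width = Fraction(b - a, 2 ** n)
    total = Fraction(0)
    for i in range(2 ** n):
        lo = a + i * width
        hi = lo + width
        total += lo * (cdf(hi) - cdf(lo))  # value times Pr(lo <= X < hi)
    return total

for n in (1, 4, 10):
    approx = bounded_expectation_approx(lambda x: uniform_cdf(1, 3, x), 1, 3, n)
    print(n, approx, float(approx))
```

The printed values increase towards (a + b)/2 = 2, the expectation listed for the Uniform random variable.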
Expectation of Continuous Random Variables
• Simplified Expression 2: Unbounded random variables
      $E[X] = \lim_{n \to \infty} \left[ \sum_{i=0}^{n 2^n - 1} \frac{i}{2^n} P\left\{ \frac{i}{2^n} \le X < \frac{i+1}{2^n} \right\} + n\,P(X \ge n) \right]$

Theorem: Expectation of Unbounded Random Variables
⊢ ∀ X. (∀s. 0 ≤ X s) ⇒
    expec_cont (Ω, Σ, P) X =
      lim_{n→∞} [ Σ_{i=0}^{n 2^n - 1} (i/2^n) P {s | i/2^n ≤ X s < (i+1)/2^n} + n P {s | X s ≥ n} ]
Example: Exponential Random Variable

Theorem: Expectation of Exponential Random Variable
⊢ ∀ a. (0 < a) ⇒ expec_cont (Ω, Σ, P) (λs. exp_rv a s) = 1/a

• Proof
  ◦ Start from $E[X] = \lim_{n \to \infty} \left[ \sum_{i=0}^{n 2^n - 1} \frac{i}{2^n} P\left\{ \frac{i}{2^n} \le X < \frac{i+1}{2^n} \right\} + n\,P(X \ge n) \right]$
  ◦ Evaluate the two probability terms using the CDF of the Exponential random variable
  ◦ Evaluate the infinite summation
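The proof steps can be imitated numerically: plug the Exponential CDF into the two probability terms of the unbounded-case expression and evaluate it for a finite n. This is a floating-point sketch with an assumed rate a = 2 and hypothetical function names; it illustrates convergence towards 1/a and is no substitute for the formal limit argument.

```python
import math

def exp_cdf(a: float, x: float) -> float:
    """CDF of the Exponential distribution with rate a."""
    return 0.0 if x <= 0 else 1.0 - math.exp(-a * x)

def unbounded_expectation_approx(cdf, n: int) -> float:
    """n-th term of the limit for a non-negative continuous random variable."""
    step = 1.0 / 2 ** n
    total = 0.0
    for i in range(n * 2 ** n):
        lo = i * step
        total += lo * (cdf(lo + step) - cdf(lo))  # (i/2^n) * Pr(lo <= X < lo + step)
    return total + n * (1.0 - cdf(n))             # plus n * Pr(X >= n)

a = 2.0
for n in (2, 6, 12):
    print(n, unbounded_expectation_approx(lambda x: exp_cdf(a, x), n), 1 / a)
```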
Verification of Expectation Relations
Theorems: Continuous Random Variables
Random variable | HOL function  | Expectation
Uniform(a,b)    | uniform_rv    | (a + b)/2
Triangular(0,b) | triangular_rv | b/3
Exponential(a)  | exp_rv        | 1/a
Probabilistic Theorem Proving
[Figure: probabilistic theorem proving framework (system model with random components, probabilistic analysis theorems for discrete and continuous random variables, theorem prover, formal proofs of properties)]
Effort Statistics
O. Hasan and S. Tahar
88
Formalization Approx. Lines of HOL code
Measure and Probability Theories 17,000
Discrete Random Variables Formalization and Probabilistic Properties
1,500
Discrete Random Variables Statistical Properties
7,500
Continuous Random Variables Formalization and Probabilistic Properties
7,000
Continuous Random Variables Statistical Properties
8,500
Outline
ü Introduction and Motivation
ü Probabilistic Theorem Proving
q Case Studies
q Coupon Collector’s Problem
q Stop-and-Wait Protocol
q Reconfigurable Memory Arrays
q Conclusions
Coupon Collector’s Problem
q Collect all n coupons and win!
q A collection of coupons with n distinct entries
q Each distinct coupon is uniformly distributed in the collection
q Coupons are drawn randomly and independently
q How many trials are required to acquire all n coupons?
Coupon Collector’s Problem
q How many message transmissions do we need, on average, to get all router IDs in the path?
q Tail Distribution Bounds:
q Pr (number of transmissions required to know all router IDs in the path > some threshold value)
[Figure: network diagram with numbered router IDs]
Coupon Collector’s Problem
q The process of acquiring a new coupon
q Geometric random variable
§ Number of trials to achieve the first success
q Coupon Collector’s Problem
q A sum of n Geometric random variables
$$ X = \sum_{i=1}^{n} X_i $$
q Where each X_i denotes the Geometric random variable for acquiring the i-th new coupon
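The decomposition into Geometric stages can be illustrated with a small simulation. The Python sketch below is only an informal check and plays no role in the formal proof: it draws coupons uniformly until all n types are seen and compares the sample mean with the sum of the stage means, assuming that the stage which starts with k coupons already held succeeds with probability (n - k)/n. All names are illustrative.

```python
import random


def trials_to_collect_all(n: int, rng: random.Random) -> int:
    """Draw uniformly from n coupon types until every type has been seen."""
    seen = set()
    trials = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        trials += 1
    return trials


def mean_trials(n: int, runs: int, seed: int = 0) -> float:
    """Sample mean of the number of trials over independent runs."""
    rng = random.Random(seed)
    return sum(trials_to_collect_all(n, rng) for _ in range(runs)) / runs


def sum_of_stage_means(n: int) -> float:
    """Sum of the Geometric means n/(n - k) for k = 0 .. n-1 coupons held."""
    return sum(n / (n - k) for k in range(n))


if __name__ == "__main__":
    print(mean_trials(10, 5000), sum_of_stage_means(10))
```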
Formalization of Coupon Collector’s Problem in HOL
q Coupons are identified by unique positive integers
q Accepts: Number of acquired coupons
q Returns: The corresponding coupon collector list
q Example:
q Input: 1 → [0]
q Input: 2 → [0, 1]
q And so on …
⊢ (coupon_lst 0 = [ ]) ∧
∀ n. (coupon_lst (n + 1) = n :: (coupon_lst n))
Definition: Coupon Collection List
Formalization of Coupon Collector’s Problem in HOL
q Accepts
q Coupon Collector’s list (Already acquired coupons)
q Total number of distinct coupons
q Returns
q List of Geometric random variables corresponding to the number of trials for all acquired coupons and the next coupon
q Success probability is modeled using the Uniform random variable
q Probability that a new coupon number is uniformly generated
⊢ ∀ n. (geom_rv_lst [ ] n = [prob_geom 1]) ∧
∀ h t n. (geom_rv_lst (h::t) n =
(prob_geom P{s | ¬(mem (fst (prob_unif n s)) (h::t))}) :: (geom_rv_lst t n))
Definition: Geometric Random Variable List
Formalization of Coupon Collector’s Problem in HOL
q Example
q Total number of distinct coupons: 5
§ [0]: [prob_geom P{s | ¬(mem (fst (prob_unif 5 s)) [0])}, prob_geom 1] = [prob_geom 4/5, prob_geom 1]
§ [0,1]: [prob_geom 3/5, prob_geom 4/5, prob_geom 1]
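The two recursive definitions can be mirrored in a few lines of Python to reproduce the example values. This is a sketch under stated assumptions, not the HOL code: the random variables themselves are not modeled, only the success probability passed to each prob_geom, and that probability is obtained by enumerating the n equally likely coupon numbers 0 .. n-1. The helper names prob_new_coupon and geom_success_probs are hypothetical.

```python
from fractions import Fraction


def coupon_lst(k: int) -> list[int]:
    """Mirror of coupon_lst: coupon_lst 0 = [], coupon_lst (n+1) = n :: coupon_lst n."""
    return [] if k == 0 else [k - 1] + coupon_lst(k - 1)


def prob_new_coupon(acquired: list[int], n: int) -> Fraction:
    """P(a uniform draw from 0 .. n-1 is not in `acquired`), by enumeration."""
    misses = sum(1 for c in range(n) if c not in acquired)
    return Fraction(misses, n)


def geom_success_probs(acquired: list[int], n: int) -> list[Fraction]:
    """Success probabilities of the Geometric variables built by geom_rv_lst."""
    if not acquired:
        return [Fraction(1)]  # the first coupon is always new
    return [prob_new_coupon(acquired, n)] + geom_success_probs(acquired[1:], n)


if __name__ == "__main__":
    print(geom_success_probs([0], 5))      # success probabilities 4/5 and 1
    print(geom_success_probs([0, 1], 5))   # success probabilities 3/5, 4/5 and 1
```

Because the definition conses the newest coupon onto the front, coupon_lst(2) evaluates to [1, 0] in this sketch; the resulting probabilities depend only on which coupons are in the list, not on their order.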
Formalization of Coupon Collector’s Problem in HOL
q Accepts
q Total number of coupons (n+1)
q Returns
q Sum of (n+1) Geometric random variables
§ Each Geometric random variable models the number of trials required to acquire a distinct coupon in the coupon collector’s problem
⊢ ∀ n. coupon_collector (n + 1) =
sum_rv_lst (geom_rv_lst (coupon_lst n) (n + 1))
Definition: Coupon Collector’s Problem
$$ X = \sum_{i=1}^{n} X_i $$
Verification of Coupon Collector’s Problem
q Proof
q PMF of Uniform random variable
q Set theory and Real Analysis
⊢ ∀ L n. (dist_lst L) ∧ (∀ a. mem a L ⇒ (a < (n + 1)))
⇒ (P {s | ¬(mem (fst (prob_unif (n + 1) s)) L)} = 1 - (length L) / (n + 1))
Theorem: Probability of Acquiring a New Coupon
Verification of Coupon Collector’s Problem
q Proof
q Expectation of Geometric random variable
q Linearity of Expectation property
q Real Analysis
⊢ ∀ n. expec (coupon_collector (n + 1)) =
$$ (n + 1) \sum_{i=0}^{n} \frac{1}{i + 1} $$
Theorem: Average of Coupon Collector’s problem
q Proof
q Variance of Geometric random variable
q Linearity of Variance property
q Real Analysis
⊢ ∀ n. variance (coupon_collector (n + 1)) ≤
$$ (n + 1)^2 \sum_{i=0}^{n} \frac{1}{(i + 1)^2} $$
Theorem: Variance Upper Bound of Coupon Collector’s Problem
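Both statements can be cross-checked with exact rational arithmetic. The Python sketch below is an informal check, not the HOL proof: it assumes the textbook facts that a Geometric variable with success probability p has mean 1/p and variance (1 - p)/p^2, that the stage with k coupons already held succeeds with probability (total - k)/total, and that the stages are independent so their variances add. The function names are illustrative.

```python
from fractions import Fraction


def success_probs(total: int) -> list[Fraction]:
    """Probability of drawing a new coupon when k of `total` types are held."""
    return [Fraction(total - k, total) for k in range(total)]


def exact_expectation(total: int) -> Fraction:
    """Sum of the Geometric means 1/p over all stages."""
    return sum(1 / p for p in success_probs(total))


def exact_variance(total: int) -> Fraction:
    """Sum of the Geometric variances (1 - p)/p^2 over all stages."""
    return sum((1 - p) / p ** 2 for p in success_probs(total))


def expectation_formula(n: int) -> Fraction:
    """(n + 1) * sum_{i=0}^{n} 1/(i + 1), for n + 1 distinct coupons."""
    return (n + 1) * sum(Fraction(1, i + 1) for i in range(n + 1))


def variance_bound(n: int) -> Fraction:
    """(n + 1)^2 * sum_{i=0}^{n} 1/(i + 1)^2, for n + 1 distinct coupons."""
    return (n + 1) ** 2 * sum(Fraction(1, (i + 1) ** 2) for i in range(n + 1))


if __name__ == "__main__":
    for n in range(5):
        print(n + 1, exact_expectation(n + 1), expectation_formula(n),
              exact_variance(n + 1), variance_bound(n))
```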
Verification of Coupon Collector’s Problem
q Proof
q Markov’s and Chebyshev’s inequalities
q Expectation and variance for Coupon Collector’s problem
q Real Analysis
⊢ ∀ n a. (0 < a) ⇒
P {s | fst (coupon_collector (n + 1) s) ≥ a} ≤
$$ \frac{n + 1}{a} \sum_{i=0}^{n} \frac{1}{i + 1} $$
Theorem: Tail Distribution Bound (Markov’s Inequality)
⊢ ∀ n a. (0 < a) ⇒
P {s | abs (fst (coupon_collector (n + 1) s) - expec (coupon_collector (n + 1))) ≥ a} ≤
$$ \frac{(n + 1)^2}{a^2} \sum_{i=0}^{n} \frac{1}{(i + 1)^2} $$
Theorem: Tail Distribution Bound (Chebyshev’s Inequality)
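To get a feel for how loose or tight the two tail bounds are, they can be compared with empirical tail frequencies. The Python sketch below is illustrative only: the thresholds, the number of runs, and the seed are arbitrary choices, the simulation draws coupons uniformly from n + 1 types, and none of the names come from the HOL development.

```python
import random


def markov_bound(n: int, a: float) -> float:
    """((n + 1)/a) * sum_{i=0}^{n} 1/(i + 1)."""
    return (n + 1) / a * sum(1.0 / (i + 1) for i in range(n + 1))


def chebyshev_bound(n: int, a: float) -> float:
    """((n + 1)^2 / a^2) * sum_{i=0}^{n} 1/(i + 1)^2."""
    return (n + 1) ** 2 / a ** 2 * sum(1.0 / (i + 1) ** 2 for i in range(n + 1))


def collect(total: int, rng: random.Random) -> int:
    """Number of uniform draws needed to see all `total` coupon types."""
    seen, trials = set(), 0
    while len(seen) < total:
        seen.add(rng.randrange(total))
        trials += 1
    return trials


def empirical_tails(n: int, a_markov: float, a_cheb: float,
                    runs: int = 5000, seed: int = 1) -> tuple[float, float]:
    """Observed frequencies of X >= a_markov and |X - E[X]| >= a_cheb."""
    rng = random.Random(seed)
    mean = (n + 1) * sum(1.0 / (i + 1) for i in range(n + 1))
    samples = [collect(n + 1, rng) for _ in range(runs)]
    upper = sum(x >= a_markov for x in samples) / runs
    deviation = sum(abs(x - mean) >= a_cheb for x in samples) / runs
    return upper, deviation


if __name__ == "__main__":
    print(empirical_tails(9, 60.0, 30.0),
          markov_bound(9, 60.0), chebyshev_bound(9, 30.0))
```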
Coupon Collector’s Problem - Summary
q Results exactly match the paper-and-pencil based analysis methods
q 100% precise
q Analysis was based on the pre-existing formalization and verification of Geometric and Uniform random variables, the linearity of expectation and variance properties, and Chebyshev’s and Markov’s inequalities
q ~1000 lines of HOL code
q ~100 man-hours
Outline
ü Introduction and Motivation
ü Probabilistic Theorem Proving
q Case Studies
ü Coupon Collector’s Problem
q Stop-and-Wait Protocol
q Reconfigurable Memory Arrays
q Conclusions
Why Stop-and-Wait Protocol?
q Stop-and-Wait Protocol
q Classical example of a real-time system
q Real-Time Systems
q Involve a subtle interaction of a number of distributed components
q Performance Analysis is not very straightforward
Both simulation and state-based formal techniques fail to produce reasonable results
Stop-and-Wait Protocol
q Message Delay for a single packet
q Unpredictable Characteristic
q Depends on the channel noise
q Channel Error probability: p
q Average Message Delay: ?
Formalization of the Stop-and-Wait Protocol
q source
q data messages list
q dataS, dataR, ackS, ackR
q (time → data message)
q sink, rem
q (time → data message list)
[Figure: block diagram with Sender (DATA-TRANS, ACK_RECV; source, rem t), Channel (DATA-CHAN, ACK-CHAN; Bernoulli Random Variable), and Receiver (DATA-RECV, ACK-TRANS; sink), connected through dataS, dataR, ackS, and ackR]
Formalization of the Stop-and-Wait Protocol
⊢ ∀ in out del p bseqt. DATA_CHAN in out del p bseqt =
∀ t. (if t < del then (*If current time is less than channel delay*)
        (out t = set_non_packet) ∧ (*No output*)
        (bseqt (t + 1) = bseqt t) (*Boolean seq. retains its value*)
      else (if good_packet (in (t - del)) then (*If a good packet arrives*)
              (if ¬fst (prob_bern p (bseqt t)) then (*If no noise effect*)
                 (out t = in (t - del)) ∧ (*Packet reaches output*)
                 (bseqt (t + 1) = snd (prob_bern p (bseqt t))) (*Update Boolean seq.*)
               else (out t = set_non_packet) ∧ (*No output*)
                    (bseqt (t + 1) = snd (prob_bern p (bseqt t)))) (*Update Boolean seq.*)
            else (out t = set_non_packet) ∧ (*No output*)
                 (bseqt (t + 1) = bseqt t))) (*Boolean seq. retains its value*)
Definition: Data Channel
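The behaviour described by the channel definition can be sketched as an executable simulation. The Python code below is an approximation made for illustration, not a translation of the HOL predicate: None stands in for set_non_packet, any non-None value is treated as a good packet, and a random.Random generator replaces the explicitly threaded Boolean sequence bseqt. The name data_chan and its parameters are assumptions.

```python
import random
from typing import Callable, List, Optional

Packet = Optional[str]  # None plays the role of set_non_packet


def data_chan(inp: Callable[[int], Packet], delay: int, p: float,
              horizon: int, rng: random.Random) -> List[Packet]:
    """Channel output for t = 0 .. horizon-1 with delay `delay` and error probability p."""
    out: List[Packet] = []
    for t in range(horizon):
        if t < delay:
            out.append(None)        # nothing can have arrived yet
            continue
        packet = inp(t - delay)
        if packet is None:
            out.append(None)        # no good packet: no Bernoulli trial is consumed
        elif rng.random() < p:
            out.append(None)        # Bernoulli trial is True: packet lost to noise
        else:
            out.append(packet)      # packet reaches the output
    return out


if __name__ == "__main__":
    source = lambda t: f"pkt{t}" if t % 2 == 0 else None
    print(data_chan(source, 3, 0.3, 12, random.Random(0)))
```

With p = 0 the output is simply the input shifted by the delay, and with p = 1 nothing ever gets through, which matches the two branches of the innermost conditional.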
Formalization of the Stop-and-Wait Protocol
STOP_WAIT source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p =
DATA_TRANS ws sn dataS s rem i ackS tout tf dtout dtf ∧
DATA_CHAN dataS dataR d tprop p bseqt ∧
DATA_RECV sn dataR sink r ∧
ACK_TRANS sn ackR r ackty ack msg ta dta rec_flag ∧
ACK_CHAN ackR ackS d tprop ∧
ACK_RECV ws sn ackS rem s ∧
INIT source rem s sink r i ackR dtout dtf dta tout tf ta rec_flag bseqt bseq
Definition: Stop-and-Wait Protocol
[Figure: block diagram with Sender (DATA-TRANS, ACK_RECV), Channel (DATA-CHAN, ACK-CHAN), and Receiver (DATA-RECV, ACK-TRANS), connected through source, sink, dataS, dataR, ackS, and ackR]
Functional Verification of the Stop-and-Wait Protocol
⊢ ∀ source sink. REQ source sink =
(∃ t. sink t = source) ∧ ∀ t n. is_prefix (sink t) (sink (t + n))
Definition: Functional Requirement for the Stop-and-Wait Protocol
q Ensure reliable data transfer from the sender to receiver
[Figure: block diagram with Sender (DATA-TRANS, ACK_RECV), Channel (DATA-CHAN, ACK-CHAN), and Receiver (DATA-RECV, ACK-TRANS), connected through source, sink, dataS, dataR, ackS, and ackR]
Functional Verification of the Stop-and-Wait Protocol
⊢ ∀ source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p.
STOP_WAIT source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p ∧
LIVE_ASSUMPTION abort (*Liveness constraint: Data will eventually be received*)
⇒ REQ source sink
Theorem: Functional Correctness for the Stop-and-Wait Protocol
q The formal model of the Stop-and-Wait protocol implies the functional requirement
q Proof
q Induction on the source list
q Stop-and-Wait protocol definition
Performance Analysis – Message Delay
q Transmission Trial (Noiseless Channel):
$$ t_f + t_a + 2(t_{prop} + t_{proc}) $$
q Transmission Trial (Channel Error):
$$ t_f + t_{out} $$
q Message Delay:
$$ (t_f + t_{out})(G_{(1-p)} - 1) + t_f + t_a + 2(t_{prop} + t_{proc}) $$
Performance Analysis in HOL – Message Delay
q Message Delay is a random variable
q Time required to successfully transmit a single data message
q rem t: remaining portion of source list at time t
q @t. P t: some t such that P t holds (Hilbert choice operator)
q bseqt t: Infinite Boolean Sequence at time t
⊢ ∀ rem source bseqt. MSG_DELAY rem source bseqt =
((@t. (rem t = TL source) ∧ (rem (t - 1) = source)),
bseqt (@t. (rem t = TL source) ∧ (rem (t - 1) = source)))
Definition: Message Transmission Delay
Performance Analysis in HOL – Message Delay
⊢ ∀ source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p.
STOP_WAIT source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p ∧
¬(NULL source) ∧ (*Source always has some data to be transmitted*)
tprop + 1 + ta + tprop + 1 ≤ tout ∧ (*tout is at least the round-trip delay of a message*)
LIVE_ASSUMPTION abort ∧ 0 ≤ p ∧ p < 1
⇒ (MSG_DELAY rem source bseqt =
((tf + tout) (fst (prob_geom (1 - p) bseq) - 1) + (tf + ta + 2 (tprop + tproc)),
snd (prob_geom (1 - p) bseq)))
Theorem: Average Message Delay for the Stop-and-Wait Protocol
$$ \text{Noisy Channel: } (t_f + t_{out})(G_{(1-p)} - 1) + t_f + t_a + 2(t_{prop} + t_{proc}) $$
q Proof
q Stop-and-Wait protocol definition
q Geometric random variable properties
Performance Analysis in HOL – Average Message Delay
q Proof
q Stop-and-Wait protocol definition
q Expectation and Geometric random variable properties
$$ \text{Average Message Delay: } (t_f + t_{out}) \frac{p}{1 - p} + t_f + t_a + 2(t_{prop} + t_{proc}) $$
⊢ ∀ source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p.
STOP_WAIT source sink rem s i r ws sn ackty maxP abort dataS dataR ackS ackR d tprop dtout dtf dta tf ack msg ta tout rec_flag bseqt bseq p ∧
¬(NULL source) ∧ tprop + 1 + ta + tprop + 1 ≤ tout ∧ LIVE_ASSUMPTION abort ∧ 0 ≤ p ∧ p < 1
⇒ expec (MSG_DELAY rem source bseqt) = (tf + tout) (p/(1-p)) + (tf + ta + 2 (tprop + tproc))
Theorem: Average Message Delay for the Stop-and-Wait Protocol
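The step from the message delay to its average uses the mean 1/(1 - p) of a Geometric variable with success probability 1 - p, so that E[G - 1] = p/(1 - p). The Python sketch below checks this informally by simulation. It is not derived from the HOL model: it samples the closed-form delay expression directly instead of simulating the protocol, and the parameter values, seed, and function names are arbitrary illustrative choices.

```python
import random


def geometric_trials(success_prob: float, rng: random.Random) -> int:
    """Number of Bernoulli trials up to and including the first success."""
    trials = 1
    while rng.random() >= success_prob:
        trials += 1
    return trials


def message_delay(p: float, tf: float, ta: float, tout: float,
                  tprop: float, tproc: float, rng: random.Random) -> float:
    """(tf + tout)(G - 1) + tf + ta + 2(tprop + tproc), G Geometric with parameter 1 - p."""
    g = geometric_trials(1.0 - p, rng)
    return (tf + tout) * (g - 1) + tf + ta + 2 * (tprop + tproc)


def average_delay_formula(p: float, tf: float, ta: float, tout: float,
                          tprop: float, tproc: float) -> float:
    """(tf + tout) p/(1 - p) + tf + ta + 2(tprop + tproc)."""
    return (tf + tout) * p / (1.0 - p) + tf + ta + 2 * (tprop + tproc)


def simulated_average(p: float, tf: float, ta: float, tout: float,
                      tprop: float, tproc: float,
                      runs: int = 20000, seed: int = 0) -> float:
    """Sample mean of the message delay over independent runs."""
    rng = random.Random(seed)
    return sum(message_delay(p, tf, ta, tout, tprop, tproc, rng)
               for _ in range(runs)) / runs


if __name__ == "__main__":
    args = (0.5, 1, 1, 10, 2, 1)  # p, tf, ta, tout, tprop, tproc
    print(simulated_average(*args), average_delay_formula(*args))
```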
Stop-and-Wait Protocol - Summary
q Performance Analysis Results exactly match the paper-and-pencil based analysis methods
q 100% precise
q Analysis was based on the pre-existing formalization and verification of Geometric and Bernoulli random variables and expectation properties
q ~6000 lines of HOL code
q ~300 man-hours
q A single Stop-and-Wait protocol model was used for both Performance Analysis and Functional Verification
Outline
ü Introduction and Motivation
ü Probabilistic Theorem Proving
ü Case Studies
ü Coupon Collector’s Problem
ü Stop-and-Wait Protocol
q Reconfigurable Memory Arrays
q Conclusions
Motivation
q Solution
q Add Redundancy
q Make Memory Reconfigurable
q How much redundancy?
q Probabilistic Techniques using Computer Simulation
§ Inaccurate
q Proposed Solution
q Theorem Proving
[Figure: memory fault types: Neighborhood Pattern Sensitive Faults, Transition Faults, Stuck-at Faults, Coupling Faults]
Reconfigurable Memory Arrays
q Memories fabricated with spare rows and columns
q Spares can be reconfigured to replace rows and columns with fabrication faults
q Repairability
q A memory array is repairable if a combination of spare rows and columns exists such that all faults from the memory array can be eliminated
Reconfigurable Memory Arrays
q Repairability is judged based on Probabilistic Techniques
q 3 Step Process
q Model fault occurrence behavior with an appropriate random variable
q Estimate statistical information regarding the number of faults, such as the average number of faults
q A memory is termed repairable if the available spare rows and columns can fix all the estimated faults with probability 1
Stuck-at Faults
q Most common Fabrication Fault
q Occurs when a memory cell never changes its state, i.e., it is always stuck in one state
q Stuck-at 1 Fault
q Stuck-at 0 Fault
Formal Stuck-at Fault Model for Reconfigurable Memory Arrays
q Memory Array modeled as a bipartite graph (R,C,F)
q R: set of vertices representing the memory rows
q C: set of vertices representing the memory columns
q F: set of edges, where each edge in this set represents a Stuck-at fault in the memory array and connects one vertex in R to a vertex in C
q Assumption
q Faults are independent and identically distributed with probability p
Formal Stuck-at Fault Model for Reconfigurable Memory Arrays
[Figure: an n x n memory array (Number of Rows = n, Number of Columns = n) with sr = a n spare rows and sc = b n spare columns, shown next to its bipartite graph with row vertices ri, rj, rk, column vertices cp, cq, cr, and fault edges F = {e1, e2, e3, e4}]
Formal Stuck-at Fault Model for Reconfigurable Memory Arrays
q The repair probability of a memory array is defined as:
$$ \Pr(|F| \le sr + sc) $$
where Pr and |F| represent the probability function and the cardinality of the set F, respectively
q The repair probability for a square memory array is given by:
$$ \Pr(|F| \le (a + b)\,n) $$
where $a = sr/n$ and $b = sc/n$.
q Our goal is to verify that the memory array is almost always repairable if the stuck-at fault occurrence probability is
$$ p = \frac{a + b}{n} - \frac{w(n)}{n \sqrt{n}} $$
where $w(n) \to \infty$ as $n \to \infty$.
Higher-order-logic Formalization
q mem_fault_model accepts three parameters: the cardinalities of the sets R and C and the probability of fault occurrence p
q It returns the total number of stuck-at faults in the memory array
q It performs a Bernoulli(p) trial for each cell in the memory and returns the number of True outcomes obtained
⊢ (∀ p. mem_fault_model_helper 0 p = unit 0) ∧
∀ c p. mem_fault_model_helper (c + 1) p =
bind (mem_fault_model_helper c p)
(λa. bind (prob_bern p) (λb. unit (if b then (a + 1) else a)))
⊢ (∀ c p. mem_fault_model 0 c p = unit 0) ∧
∀ r c p. mem_fault_model (r + 1) c p =
bind (mem_fault_model r c p)
(λa. bind (mem_fault_model_helper c p) (λb. unit (a + b)))
Definition: Stuck-At Fault Memory Model
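The unit/bind structure of the two recursive definitions can be imitated in Python to see how the fault count is accumulated cell by cell. This is a loose sketch and rests on a simplifying assumption: a probabilistic program is represented as a function of a random.Random generator, whose internal state stands in for the infinite Boolean sequence that the HOL functions thread through explicitly. The Python names reuse the HOL identifiers for readability but are not the HOL functions.

```python
import random
from typing import Callable

Prog = Callable[[random.Random], int]  # a randomized computation returning a count


def unit(x):
    """Return x without consuming any randomness."""
    return lambda rng: x


def bind(m, f):
    """Run m, feed its result to f, and run the resulting program."""
    return lambda rng: f(m(rng))(rng)


def prob_bern(p: float):
    """Bernoulli(p) trial: True with probability p."""
    return lambda rng: rng.random() < p


def mem_fault_model_helper(c: int, p: float) -> Prog:
    """Number of faulty cells in one row of c cells."""
    if c == 0:
        return unit(0)
    return bind(mem_fault_model_helper(c - 1, p),
                lambda a: bind(prob_bern(p), lambda b: unit(a + 1 if b else a)))


def mem_fault_model(r: int, c: int, p: float) -> Prog:
    """Total number of faulty cells in an r x c array."""
    if r == 0:
        return unit(0)
    return bind(mem_fault_model(r - 1, c, p),
                lambda a: bind(mem_fault_model_helper(c, p), lambda b: unit(a + b)))


if __name__ == "__main__":
    print(mem_fault_model(8, 8, 0.1)(random.Random(0)))
```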
Higher-order-logic Formalization
⊢ ∀ n a b w. mem_fault_model_rep n a b w = mem_fault_model n n
$$ \left( \frac{a + b}{n} - \frac{w(n)}{n \sqrt{n}} \right) $$
Definition: Stuck-at Fault Memory Model for Repairability Problem
q Function mem_fault_model_rep accepts four parameters:
q Cardinality of sets R and C of a square reconfigurable memory array as a natural number n
q The fractions of spare rows and columns as real numbers a and b
q Real sequence w of type (natural → real)
q Utilizes mem_fault_model and returns the number of stuck-at faults for the specific case of a square n x n memory array with fault occurrence probability
$$ \frac{a + b}{n} - \frac{w(n)}{n \sqrt{n}} $$
Alternate Expression for the Number of Faults
q The alternate expression is expressed in terms of the Binomial random variable
q Easy to use as we do not have to deal with the recursive definition
q Proof
q Independent and identically distributed stuck-at faults assumption
q Formal definitions of Bernoulli and Binomial random variables
⊢ ∀ n a b w. mem_fault_model_rep n a b w = prob_bino n²
$$ \left( \frac{a + b}{n} - \frac{w(n)}{n \sqrt{n}} \right) $$
Lemma: Number of Stuck-at Faults in terms of Binomial R.V.
Statistical Property 1
⊢ ∀ n a b w. (0 ≤ a) ∧ (a ≤ 1) ∧ (0 ≤ b) ∧ (b ≤ 1) ∧ (1 < n) ∧ (∀ n. (0 < w(n)) ∧ (w(n) < (a+b)√n)) ⇒ expec (λs. mem_fault_model_rep n a b w s) =
$$ n^2 \left( \frac{a + b}{n} - \frac{w(n)}{n \sqrt{n}} \right) $$
Theorem: Average Number of Stuck-at Faults
q Assumptions
q Fractions (a,b) are bounded by the interval [0,1]
q 1<n to ensure that memory array has more than one cell
q Bounds on w(n) ensure that the fault probability falls within the interval [0,1]
No such restriction is placed on w(n) in paper-and-pencil analysis
q Proof
q Expectation of Binomial random variable
Statistical Property 2
q Proof
q Expectation and Variance of Binomial Random Variable
⊢ ∀ n a b w. (0 ≤ a) ∧ (a ≤ 1) ∧ (0 ≤ b) ∧ (b ≤ 1) ∧ (1 < n) ∧ (∀ n. (0 < w(n)) ∧ (w(n) < (a+b)√n)) ⇒ variance (λs. mem_fault_model_rep n a b w s) =
$$ n^2 \left( \frac{a + b}{n} - \frac{w(n)}{n \sqrt{n}} \right) \left( 1 - \left( \frac{a + b}{n} - \frac{w(n)}{n \sqrt{n}} \right) \right) $$
Theorem: Variance of Stuck-at Faults
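The average and variance of the number of stuck-at faults follow from the Binomial mean and variance with n² trials. The Python sketch below compares the closed forms with moments computed directly from the Binomial PMF for a small array. It is an informal numerical check only; the values of n, a, b and the constant sequence w are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math
from typing import Callable, Tuple


def fault_probability(n: int, a: float, b: float, w: Callable[[int], float]) -> float:
    """p = (a + b)/n - w(n)/(n * sqrt(n))."""
    return (a + b) / n - w(n) / (n * math.sqrt(n))


def binomial_pmf(trials: int, p: float, k: int) -> float:
    """P(exactly k successes in `trials` independent Bernoulli(p) trials)."""
    return math.comb(trials, k) * p ** k * (1.0 - p) ** (trials - k)


def exact_moments(n: int, a: float, b: float,
                  w: Callable[[int], float]) -> Tuple[float, float]:
    """Mean and variance obtained by summing over the Binomial PMF."""
    p = fault_probability(n, a, b, w)
    cells = n * n
    mean = sum(k * binomial_pmf(cells, p, k) for k in range(cells + 1))
    var = sum((k - mean) ** 2 * binomial_pmf(cells, p, k) for k in range(cells + 1))
    return mean, var


def closed_form_moments(n: int, a: float, b: float,
                        w: Callable[[int], float]) -> Tuple[float, float]:
    """n^2 p and n^2 p (1 - p)."""
    p = fault_probability(n, a, b, w)
    return n * n * p, n * n * p * (1.0 - p)


if __name__ == "__main__":
    w = lambda n: 0.4  # illustrative sequence with 0 < w(n) < a + b
    print(exact_moments(4, 0.25, 0.25, w), closed_form_moments(4, 0.25, 0.25, w))
```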
Statistical Property 3
q Proof
q Probability Axioms
q Expectation and Variance of Binomial random variable
q Chebyshev’s inequality
⊢ ∀ n a b w. (0 ≤ a) ∧ (a ≤ 1) ∧ (0 ≤ b) ∧ (b ≤ 1) ∧ (1 < n) ∧ (∀ n. (0 < w(n)) ∧ (w(n) < (a+b)√n)) ⇒ P {s | (fst (mem_fault_model_rep n a b w s)) ≤ (a+b)n} ≥
$$ 1 - \frac{1}{\left( w(n) \sqrt{n} \right)^2} \; n^2 \left( \frac{a + b}{n} - \frac{w(n)}{n \sqrt{n}} \right) \left( 1 - \left( \frac{a + b}{n} - \frac{w(n)}{n \sqrt{n}} \right) \right) $$
Theorem: Tail Distribution Bound for Stuck-at Faults
Repairability Problem
q Repairability Problem
$$ \lim_{n \to \infty} \Pr(|F| \le (a + b)\,n) = 1 $$
⊢ ∀ a b w. (0 ≤ a) ∧ (a ≤ 1) ∧ (0 ≤ b) ∧ (b ≤ 1) ∧ (∀ n. (0 < w(n)) ∧ (w(n) < (a+b)√n)) ∧ (lim (λn. 1/w(n)) = 0) ⇒ (lim (λn. P{s | (fst (num_of_faults n a b w s)) ≤ (a+b)n}) = 1)
Theorem: Repairability Problem of Stuck-at Faults
q Proof
q Probability Axioms
q Tail Distribution Bound Theorem
q Real Analysis and Limit Theory
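The limit argument can be illustrated by evaluating the Chebyshev-based lower bound on the repair probability for growing n. The Python sketch below assumes the fault probability p = (a + b)/n - w(n)/(n sqrt(n)) and the bound 1 - n² p (1 - p) / (w(n) sqrt(n))², and picks w(n) = ln(n + 1) as one example of a sequence that tends to infinity. That choice, the values of a and b, and the function names are assumptions made for illustration.

```python
import math
from typing import Callable


def fault_probability(n: int, a: float, b: float, w: Callable[[int], float]) -> float:
    """p = (a + b)/n - w(n)/(n * sqrt(n))."""
    return (a + b) / n - w(n) / (n * math.sqrt(n))


def repair_lower_bound(n: int, a: float, b: float, w: Callable[[int], float]) -> float:
    """Lower bound on P(|F| <= (a + b) n): 1 - variance / (w(n) * sqrt(n))^2."""
    p = fault_probability(n, a, b, w)
    variance = n * n * p * (1.0 - p)
    return 1.0 - variance / (w(n) * math.sqrt(n)) ** 2


def slow_growth(n: int) -> float:
    """Example sequence w(n) = ln(n + 1), which grows without bound."""
    return math.log(n + 1)


if __name__ == "__main__":
    # The bound should creep towards 1 as n increases.
    for n in (10, 100, 10 ** 4, 10 ** 6):
        print(n, repair_lower_bound(n, 0.5, 0.5, slow_growth))
```

Since the variance is at most (a + b) n, the subtracted term is at most (a + b)/w(n)², which goes to 0 whenever 1/w(n) does; the printed values only show this trend for one particular w.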
Reconfigurable Memory Array - Summary
q The Analysis Results exactly match the paper-and-pencil based analysis methods
q 100% precise
q Analysis was based on the pre-existing formalization and verification of Bernoulli and Binomial random variables and Chebyshev’s inequality
q ~1200 lines of HOL code
q ~80 man-hours
Outline
ü Introduction and Motivation
ü Probabilistic Theorem Proving
ü Case Studies
ü Coupon Collector’s Problem
ü Stop-and-Wait Protocol
ü Reconfigurable Memory Arrays
q Conclusions
Objectives
q Probabilistic Theorem Proving
q Why do we need it?
§ Exact Answers (Useful for the analysis of safety-critical applications)
q What is it?
§ Mathematically reason about Probabilistic and Statistical properties of a system using a computer-based theorem prover
q How can we apply it for the performance analysis of real-world applications?
§ Mathematically model (Formalize) the system as a higher-order-logic function while modelling its random components with random variables
§ Formalize probabilistic and statistical properties as higher-order-logic theorems
§ Verify these theorems in a theorem prover
Conclusions
q Probabilistic Theorem Proving is not an alternative to approaches such as simulation or model checking
q Less critical sections of the system
q Simulation
q Critical sections of the system that can be expressed as a Markov Chain and can be handled without the state-space explosion problem
q Model Checking
q Critical sections of the system that cannot be handled by Model Checking
q Theorem Proving
Ongoing and Future Work
q Theoretical Foundations
q Probability Density Function (PDF)
q Continuous random variables for which CDF does not exist in a closed form
q Variance and Moments for Continuous Random Variables
q Multiple Continuous Random Variables
q Discrete Time Markov Chains
q Continuous Time Markov Chains
Ongoing and Future Work
q Applications
q Algorithms
§ Birthday Paradox
§ Hiring Problem
§ Hat-Check Problem
§ Quicksort
q Telecommunications
§ Automatic repeat request (ARQ) protocols
§ ARQ mechanism at the logical link control (LLC) layer of the General Packet Radio Service (GPRS)
§ Wireless Sensor Network Protocols
q VLSI and Digital Design
§ Irrepairability Analysis of Reconfigurable Memory Arrays
§ Reliability of Digital Logic Circuits
Thank you!
q For More Information
q Visit our website
§ http://hvg.ece.concordia.ca/Research/METH/PAHTP
q Contact
§ [email protected]
q Ask Now!
References
q Motivation and a concise description of the tutorial
q O. Hasan. Formal Probabilistic Analysis using Theorem Proving. Concordia University, Montreal, Canada, 2008.
q Formalization of Measure and Probability Theory and Discrete Random Variables
q J. Hurd. Formal Verification of Probabilistic Algorithms. PhD Thesis, University of Cambridge, Cambridge, UK, 2002.
q Formalization of Continuous Random Variables
q O. Hasan and S. Tahar. Formalization of the Standard Uniform Random Variable. Theoretical Computer Science, Vol. 382, No. 1, Elsevier, 2007, pp. 71-83.
q O. Hasan and S. Tahar. Formalization of Continuous Probability Distributions. In: Automated Deduction, LNCS 4603, Springer Verlag, 2007, pp. 2-18.
References
q Statistical Properties of Discrete Random Variables
q O. Hasan and S. Tahar. Verification of Expectation Properties for Discrete Random Variables in HOL. In: Theorem Proving in Higher-Order Logics, LNCS 4732, Springer Verlag, 2007, pp. 119-134.
q O. Hasan and S. Tahar. Using Theorem Proving to Verify Expectation and Variance for Discrete Random Variables. Journal of Automated Reasoning, Vol. 41, No. 3-4, Springer Verlag, 2008, pp. 295-323.
q O. Hasan and S. Tahar. Formal Verification of Tail Distribution Bounds in the HOL Theorem Prover. Mathematical Methods in the Applied Sciences, Vol. 32, No. 4, Wiley Interscience, March 2009, pp. 480-504.
q Statistical Properties of Continuous Random Variables
q A. Coble. On Probability, Measure, and Integration in HOL4. Technical Report, Computing Laboratory, University of Cambridge, UK, 2009, http://www.srcf.ucam.org/~arc54/techreport.pdf.
q O. Hasan, N. Abbasi, B. Akbarpour, S. Tahar and R. Akbarpour. Formal Reasoning about Expectation Properties for Continuous Random Variables. Formal Methods, Eindhoven, Netherlands, November 2009. (To appear)
References
q Applications
q O. Hasan and S. Tahar. Performance Analysis of ARQ Protocols using a Theorem Prover. In: IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS'08), IEEE Computer Society, Austin, Texas, USA, April 2008, pp. 85-94.
q O. Hasan, N. Abbasi and S. Tahar. Formal Probabilistic Analysis of Stuck-at Faults in Reconfigurable Memory Arrays. In: Integrated Formal Methods, LNCS 5423, Springer Verlag, 2009, pp. 277-291.
q O. Hasan and S. Tahar. Performance Analysis and Functional Verification of the Stop-and-Wait Protocol in HOL. Journal of Automated Reasoning, Vol. 42, No. 1, Springer Verlag, January 2009, pp. 1-33.
q O. Hasan and S. Tahar. Probabilistic Analysis of Wireless Systems using Theorem Proving. Electronic Notes in Theoretical Computer Science, Vol. 242, No. 2, Elsevier, July 2009, pp. 43-58.