© Stanley Chan 2019. All Rights Reserved.
ECE 302: Chapter 04: Continuous Random Variables
Fall 2019
Prof Stanley Chan
School of Electrical and Computer Engineering, Purdue University
1. Continuous Random Variable
Continuous Random Variable
Sample space becomes continuous
E.g., time, area
Characterized by histogram too!
Not PMF, but Probability Density Function (PDF)
Continuous Random Variable
Definition
The probability density function (PDF) of a random variable X is a function which, when integrated over an interval [a, b], yields the probability of obtaining a ≤ X(ξ) ≤ b. We denote the PDF of X as $f_X(x)$, and
$$\mathbb{P}[a \le X \le b] = \int_a^b f_X(x)\,dx. \tag{1}$$
Continuous and discrete unified!
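If X is continuous,
$$\mathbb{P}[a \le X \le b] = \int_a^b f_X(x)\,dx.$$
If X is discrete and $x_0$ is the only point in [a, b] that carries probability mass,
$$\mathbb{P}[a \le X \le b] = \mathbb{P}[X = x_0] = p_X(x_0) = \int_a^b \underbrace{p_X(x_0)\,\delta(x - x_0)}_{f_X(x)}\,dx.$$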
Property
A PDF $f_X(x)$ should satisfy
$$\int_{-\infty}^{\infty} f_X(x)\,dx = 1. \tag{2}$$
Example. Let $f_X(x) = c(1 - x^2)$ for $-1 \le x \le 1$, and 0 otherwise. Find c.
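One way to check an answer to this example: since $\int_{-1}^{1}(1 - x^2)\,dx = 2 - \tfrac{2}{3} = \tfrac{4}{3}$, the normalization condition (2) gives c = 3/4. The sketch below confirms this numerically with a simple midpoint rule; the helper name `integrate` is an illustrative choice, not part of the slides.

```python
def integrate(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


# Area under the unnormalized shape (1 - x^2) on [-1, 1]; analytically 4/3.
area = integrate(lambda x: 1.0 - x * x, -1.0, 1.0)

# The PDF must integrate to 1, so c is the reciprocal of that area.
c = 1.0 / area
```

With this value of c the density is $f_X(x) = \tfrac{3}{4}(1 - x^2)$ on [-1, 1].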
Expectation
Definition (Expectation)
The expectation of a continuous random variable X is
$$\mathbb{E}[X] = \int_{-\infty}^{\infty} x\, f_X(x)\,dx. \tag{3}$$
Expectation
Definition (Expectation of Function)
The expectation of a function g of a continuous random variable X is
$$\mathbb{E}[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\,dx. \tag{4}$$
Definition (Moment)
The kth moment of a continuous random variable X is
$$\mathbb{E}[X^k] = \int_{-\infty}^{\infty} x^k f_X(x)\,dx. \tag{5}$$
Variance
Definition (Variance)
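The variance of a continuous random variable X is
$$\mathrm{Var}[X] = \mathbb{E}[(X - \mu_X)^2] = \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx,$$
where $\mu_X \overset{\text{def}}{=} \mathbb{E}[X]$.
Remark: It also holds that
$$\mathrm{Var}[X] = \mathbb{E}[X^2] - \mathbb{E}[X]^2.$$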
2. Common Continuous Random Variables
Uniform Distribution
Definition (Uniform Distribution)
Let X be a continuous uniform random variable. The PDF of X is
$$f_X(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b, \\ 0, & \text{otherwise}, \end{cases} \tag{6}$$
where [a, b] is the interval on which X is defined. We write
X ∼ Uniform(a, b)
to say that X is drawn from a uniform distribution on an interval [a, b].
Mean and Variance
Proposition (Mean/Variance of Uniform Distribution)
If X ∼ Uniform(a, b), then
$$\mathbb{E}[X] = \frac{a + b}{2}, \quad \text{and} \quad \mathrm{Var}[X] = \frac{(b - a)^2}{12}.$$
Application of Uniform Distribution
Analysis of Uniform Quantizer
Assumption: X[n] is a random signal.
Quantization: partition the amplitude of X[n] into a discrete set of levels.
Application of Uniform Distribution
We can model the quantization error as a uniform random variable.
If we let ∆ be the height of the quantization interval, then
$$E_q[n] \sim \text{Uniform}\left[-\frac{\Delta}{2}, \frac{\Delta}{2}\right].$$
The mean and variance of $E_q[n]$ are
$$\mathbb{E}[E_q[n]] = 0, \qquad \mathrm{Var}[E_q[n]] = \frac{\Delta^2}{12}.$$
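The uniform error model can be checked with a small simulation. The sketch below assumes a round-to-nearest quantizer with step ∆ and an input drawn uniformly from [-1, 1]; both the quantizer type and the input distribution are illustrative assumptions, not taken from the slides.

```python
import random


def quantize(x, delta):
    """Round x to the nearest multiple of delta (assumed quantizer)."""
    return delta * round(x / delta)


def quantization_error_stats(delta, n=200_000, seed=0):
    """Empirical mean and variance of the error quantize(x) - x."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)  # hypothetical input sample
        errors.append(quantize(x, delta) - x)
    mean = sum(errors) / n
    var = sum((e - mean) ** 2 for e in errors) / n
    return mean, var


delta = 0.1
mean, var = quantization_error_stats(delta)
predicted_var = delta ** 2 / 12  # variance of Uniform[-delta/2, delta/2]
```

The empirical mean should be close to 0 and the empirical variance close to ∆²/12.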
Application of Uniform Distribution
Knowing the distribution of Eq[n] is important:
It helps us design error compensation algorithms
It helps us understand the limit of data compression
It helps us generalize the concept to more advanced coding schemes
R. Gray, Source Coding Theory, Kluwer Academic Publishers, 1990.
Exponential distribution
Definition (Exponential Distribution)
Let X be an exponential random variable. The PDF of X is
$$f_X(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0, \\ 0, & \text{otherwise}, \end{cases} \tag{7}$$
where λ > 0 is a parameter. We write
X ∼ Exponential(λ)
to say that X is drawn from an exponential distribution of parameter λ.
Example. Inter-arrival times of a Poisson process.
Effect of λ
Proposition (Mean/Variance of Exponential Distribution)
If X ∼ Exponential(λ), then
$$\mathbb{E}[X] = \frac{1}{\lambda}, \quad \text{and} \quad \mathrm{Var}[X] = \frac{1}{\lambda^2}.$$
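A quick Monte Carlo check of the proposition, as a sketch: it draws samples with Python's standard-library `random.expovariate` and compares the sample mean and variance with 1/λ and 1/λ². The choice λ = 2 and the sample size are arbitrary.

```python
import random


def exponential_stats(lam, n=200_000, seed=0):
    """Sample mean and variance of n draws from Exponential(lam)."""
    rng = random.Random(seed)
    samples = [rng.expovariate(lam) for _ in range(n)]
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, var


lam = 2.0
mean, var = exponential_stats(lam)
# Theory: mean = 1/lam = 0.5 and variance = 1/lam^2 = 0.25.
```

Larger λ concentrates the samples near zero, which shrinks both the mean and the variance.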
Neighbor of Exponential Distribution
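A closely related distribution to the exponential distribution is the Laplace distribution:
$$f_X(x) = \frac{\lambda}{2} e^{-\lambda |x|},$$
where the factor 1/2 makes the two-sided density integrate to 1.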
Example: Image statistics.
Neighbor of Exponential Distribution
• Instead of looking at the image intensity I directly, we can look at the gradient of the image:
$$\begin{bmatrix} \nabla_x I \\ \nabla_y I \end{bmatrix}.$$
• Image gradients are sparse.
3. Cumulative Distribution Function
Cumulative Distribution Function
Definition
The cumulative distribution function (CDF) of a continuous random variable X is
$$F_X(x) \overset{\text{def}}{=} \mathbb{P}[X \le x] = \int_{-\infty}^{x} f_X(x')\,dx'. \tag{8}$$
Example. Let $f_X(x) = c(1 - x^2)$ for $-1 \le x \le 1$, and 0 otherwise. Find $F_X(x)$.
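A possible solution sketch for this example, taking c = 3/4 (the value that makes the PDF integrate to 1): integrating gives $F_X(x) = \tfrac{1}{2} + \tfrac{3}{4}\left(x - \tfrac{x^3}{3}\right)$ for $-1 \le x \le 1$, with $F_X(x) = 0$ for $x < -1$ and $F_X(x) = 1$ for $x > 1$. The code compares this closed form with a numerical integral of the PDF; the function names are illustrative.

```python
def pdf(x):
    """f_X(x) = (3/4)(1 - x^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - x * x) if -1.0 <= x <= 1.0 else 0.0


def cdf_closed_form(x):
    """Closed-form CDF obtained by integrating pdf from -1 to x."""
    if x < -1.0:
        return 0.0
    if x > 1.0:
        return 1.0
    return 0.5 + 0.75 * (x - x ** 3 / 3.0)


def cdf_numeric(x, n=20_000):
    """Midpoint-rule integral of pdf from -1 up to x (pdf is 0 below -1)."""
    hi = min(max(x, -1.0), 1.0)
    h = (hi + 1.0) / n
    return h * sum(pdf(-1.0 + (k + 0.5) * h) for k in range(n))
```

Evaluating both functions at a few points (for instance x = -0.3, 0, 0.5) should give matching values, and the closed form hits 0 at x = -1 and 1 at x = 1.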
Properties of CDF
1. $F_X(-\infty) = 0$
2. $F_X(+\infty) = 1$
3. $F_X(x)$ is a non-decreasing function of x.
4. $0 \le F_X(x) \le 1$
5. $\mathbb{P}[a \le X \le b] = F_X(b) - F_X(a)$ for a continuous random variable X.
Properties of CDF
Before we discuss Properties 6-7, we need the following terms.
(i) $F_X(b)$: The value of $F_X(x)$ at x = b.
(ii) $\lim_{h \to 0^+} F_X(b - h)$: The limit of $F_X(x)$ from the left-hand side of x = b.
(iii) $\lim_{h \to 0^+} F_X(b + h)$: The limit of $F_X(x)$ from the right-hand side of x = b.
Properties of CDF
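We say that $F_X(x)$ is
Left-continuous at x = b if $\lim_{h \to 0^+} F_X(b - h) = F_X(b)$
Right-continuous at x = b if $\lim_{h \to 0^+} F_X(b + h) = F_X(b)$
Continuous at x = b if it is both left-continuous and right-continuous at x = b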
Properties of CDF
6. $F_X(x)$ is right-continuous. That is,
$$\lim_{h \to 0^+} F_X(b + h) = F_X(b).$$
7. $\mathbb{P}[X = b]$ is determined by
$$\mathbb{P}[X = b] = F_X(b) - \lim_{h \to 0^+} F_X(b - h).$$
Theorem (Fundamental theorem of calculus)
If a function f is continuous, then
$$f(x) = \frac{d}{dx} \int_a^x f(t)\,dt$$
for some constant a.
Theorem
The probability density function (PDF) is the derivative of the cumulative distribution function (CDF):
$$f_X(x) = \frac{dF_X(x)}{dx} = \frac{d}{dx} \int_{-\infty}^{x} f_X(x')\,dx', \tag{9}$$
provided $F_X$ is differentiable at x.
Example. Consider a CDF
$$F_X(x) = \begin{cases} 1 - \frac{1}{4} e^{-2x}, & x \ge 0, \\ 0, & x < 0. \end{cases}$$
Find $f_X(x)$.
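A possible solution sketch: this CDF jumps from 0 to 3/4 at x = 0, so by Property 7 there is a point mass $\mathbb{P}[X = 0] = 3/4$, and for x > 0 the derivative is $\tfrac{1}{2} e^{-2x}$. Using the delta-function convention from the earlier slide, $f_X(x) = \tfrac{3}{4}\delta(x) + \tfrac{1}{2} e^{-2x}$ for $x \ge 0$. The code checks the jump and the derivative numerically; the names are illustrative.

```python
import math


def cdf(x):
    """The CDF from the example."""
    return 1.0 - 0.25 * math.exp(-2.0 * x) if x >= 0 else 0.0


def pdf_continuous_part(x):
    """Derivative of the CDF for x > 0: (1/2) e^{-2x}."""
    return 0.5 * math.exp(-2.0 * x) if x > 0 else 0.0


# Size of the jump at x = 0, i.e. the point mass P[X = 0].
jump_at_zero = cdf(0.0) - cdf(-1e-12)

# Central finite difference of the CDF at an interior point.
x0, h = 0.7, 1e-6
numeric_derivative = (cdf(x0 + h) - cdf(x0 - h)) / (2.0 * h)
```

The point mass (3/4) plus the area under the continuous part (1/4) adds up to 1, as a valid PDF requires.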
Example. Consider a CDF
$$F_X(x) = \begin{cases} 0.2, & 0 \le x < 1, \\ 0.7, & 1 \le x < 2, \\ 0.9, & 2 \le x < 4, \\ 1, & x \ge 4. \end{cases}$$
Find $f_X(x)$.
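For a staircase CDF like this one, each jump is a point mass (Property 7), so the PDF is a sum of delta functions weighted by the jump sizes. The sketch below computes those jumps. It assumes $F_X(x) = 0$ for $x < 0$, which the example does not state explicitly.

```python
breakpoints = [0, 1, 2, 4]           # locations where the CDF jumps
cdf_values = [0.2, 0.7, 0.9, 1.0]    # CDF value just after each jump


def pmf_from_cdf(points, values):
    """Jump sizes of a staircase CDF, assuming the CDF is 0 before points[0]."""
    pmf = {}
    previous = 0.0
    for x, value in zip(points, values):
        pmf[x] = round(value - previous, 10)  # rounding hides float noise
        previous = value
    return pmf


pmf = pmf_from_cdf(breakpoints, cdf_values)
```

Under that assumption the result corresponds to $f_X(x) = 0.2\,\delta(x) + 0.5\,\delta(x - 1) + 0.2\,\delta(x - 2) + 0.1\,\delta(x - 4)$.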
Mean / Mode / Median
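Given a random variable X, can we define its mean/mode/median?
From PDF:
Mean: $\mathbb{E}[X] = \int_{-\infty}^{\infty} x\, f_X(x)\,dx$
Mode: the point x at which $f_X(x)$ attains its maximum
Median: the point c such that $\int_{-\infty}^{c} f_X(x)\,dx = \frac{1}{2}$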
Mean / Mode / Median
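From CDF:
Mean:
$$\mathbb{E}[X] = \int_0^{\infty} \left(1 - F_X(x')\right) dx' - \int_{-\infty}^{0} F_X(x')\,dx'. \tag{10}$$
Mode: the point x at which $F_X(x)$ has the steepest slope
Median: the point c such that $F_X(c) = \frac{1}{2}$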
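Equation (10) can be sanity-checked numerically. The sketch below applies it to an Exponential(λ) CDF, for which the mean should come out as 1/λ; the truncation of the infinite integrals to [-50, 50] and the midpoint rule are illustrative choices.

```python
import math


def mean_from_cdf(cdf, lo=-50.0, hi=50.0, n=100_000):
    """Approximate equation (10), truncating the integrals to [lo, hi]."""
    h_pos = hi / n
    positive_part = h_pos * sum(1.0 - cdf((k + 0.5) * h_pos) for k in range(n))
    h_neg = -lo / n
    negative_part = h_neg * sum(cdf(lo + (k + 0.5) * h_neg) for k in range(n))
    return positive_part - negative_part


lam = 2.0


def exp_cdf(x):
    """CDF of Exponential(lam): 1 - e^{-lam x} for x >= 0, else 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


mean = mean_from_cdf(exp_cdf)  # theory: 1 / lam = 0.5
```

For a nonnegative random variable the second integral vanishes, so the mean is just the area above the CDF.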
Application of CDF
Q-Q Plot - a tool to check how good your model is.
Example. Consider a dataset containing N data points. The histogram (empirical PDF) and the empirical CDF are as follows:
[Figure: histogram (empirical PDF) and empirical CDF of the dataset]
Is it a Gaussian distribution?
QQ-Plot
Why does it work?
Assume $x_1, \ldots, x_N$ are samples of a random variable X, sorted in increasing order.
Hypothesis: These data points are generated from a certain random variable $\hat{X}$. Let $F_{\hat{X}}$ be its CDF.
Let $y_1, \ldots, y_N$ be equally spaced points in the range (0, 1) of $F_{\hat{X}}$. Then the $z_i$'s are
$$z_i = F_{\hat{X}}^{-1}(y_i).$$
Testing: If X and $\hat{X}$ have the same distribution, then for large N, we must have
$$z_i = F_{\hat{X}}^{-1}(y_i) \approx x_i.$$
Therefore, we should see a straight line if we plot $x_i$ against $z_i$.
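The recipe above can be sketched with the standard library alone. Here the hypothesized model is a standard Gaussian (`statistics.NormalDist`), the probabilities are taken as $y_i = (i - 0.5)/N$, and the straightness of the plot is summarized by a correlation coefficient. These three choices are assumptions made for illustration; the slides do not fix them.

```python
import random
from statistics import NormalDist


def qq_points(samples, model=NormalDist()):
    """Return (z, x): model quantiles and sorted samples for a Q-Q plot."""
    xs = sorted(samples)
    n = len(xs)
    ys = [(i + 0.5) / n for i in range(n)]      # equally spaced in (0, 1)
    zs = [model.inv_cdf(y) for y in ys]         # z_i = F^{-1}(y_i)
    return zs, xs


def correlation(u, v):
    """Pearson correlation; close to 1 when the Q-Q plot is a straight line."""
    n = len(u)
    mu_u, mu_v = sum(u) / n, sum(v) / n
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    var_u = sum((a - mu_u) ** 2 for a in u)
    var_v = sum((b - mu_v) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


rng = random.Random(0)
gaussian_data = [rng.gauss(2.0, 3.0) for _ in range(2000)]
exponential_data = [rng.expovariate(1.0) for _ in range(2000)]

r_gaussian = correlation(*qq_points(gaussian_data))
r_exponential = correlation(*qq_points(exponential_data))
```

Gaussian data with any mean and variance still gives a straight line against standard Gaussian quantiles (only slope and intercept change), whereas the exponential data bends away from it.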
QQ-Plot
Figure: Left: Poor fit. In fact, the empirical data is generated from a t-distribution. Right: Good fit.
4. Gaussian Distribution
Gaussian Distribution
Definition (Gaussian Distribution)
Let X be a Gaussian random variable. The PDF of X is
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \tag{11}$$
where $(\mu, \sigma^2)$ are parameters of the distribution. We write
$$X \sim \mathcal{N}(\mu, \sigma^2)$$
to say that X is drawn from a Gaussian distribution of parameter $(\mu, \sigma^2)$.
Gaussian Distribution
Figure: Gaussian distribution
Proposition (Mean/Variance of Gaussian Distribution)
If $X \sim \mathcal{N}(\mu, \sigma^2)$, then
$$\mathbb{E}[X] = \mu, \quad \text{and} \quad \mathrm{Var}[X] = \sigma^2.$$
Gaussian Distribution
Proof.
Percentile of Gaussian Distribution
Standard Gaussian
Definition (Standard Gaussian)
A standard Gaussian (or standard Normal) random variable X has a PDF
$$f_X(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}. \tag{12}$$
That is, $X \sim \mathcal{N}(0, 1)$ is a Gaussian with $\mu = 0$ and $\sigma^2 = 1$.
Definition (CDF of Standard Gaussian)
The Φ(·) function of the standard Gaussian is
$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{x^2}{2}}\,dx. \tag{13}$$
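Φ has no elementary closed form, but it can be evaluated through the error function via the standard identity $\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. A minimal sketch using `math.erf` (the function name `Phi` is an illustrative choice):

```python
import math


def Phi(z):
    """Standard Gaussian CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


half = Phi(0.0)                 # symmetry of the PDF puts half the mass below 0
upper_tail = 1.0 - Phi(1.96)    # roughly 2.5% of the mass lies above 1.96
```

The symmetry $\Phi(-z) = 1 - \Phi(z)$ follows from the even PDF and is handy when reading tables that list only $z \ge 0$.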
Standardize Random Variable
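If $X \sim \mathcal{N}(\mu, \sigma^2)$, then
$$Z = \frac{X - \mu}{\sigma} \sim \mathcal{N}(0, 1).$$
Proof. Key: Change of variable.
$$\begin{aligned}
F_X(x) &= \int_{-\infty}^{x} f_X(x')\,dx' \\
&= \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x' - \mu)^2}{2\sigma^2}}\,dx' \\
&= \int_{-\infty}^{\frac{x - \mu}{\sigma}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x'^2}{2}}\,dx' \\
&= \Phi\left(\frac{x - \mu}{\sigma}\right).
\end{aligned}$$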
Standard Gaussian
Figure: Definition of Φ(y).
Example. Let $X \sim \mathcal{N}(\mu, \sigma^2)$. Find $\mathbb{P}[X \le b]$ and $\mathbb{P}[a \le X \le b]$.
Standard Gaussian
Example. X ∼ N (5, 16), find
(a) P[X > 3]
(b) If P[X < a] = 0.7910, find a.
(c) If P[X > b] = 0.1635, find b.
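A possible way to check answers to this example is to standardize and use Φ and its inverse. The sketch uses `statistics.NormalDist` from the standard library for Φ (`cdf`) and Φ⁻¹ (`inv_cdf`); note that $\sigma = \sqrt{16} = 4$.

```python
from statistics import NormalDist

Z = NormalDist()        # standard Gaussian: Z.cdf is Phi, Z.inv_cdf is its inverse
mu, sigma = 5.0, 4.0    # X ~ N(5, 16), so sigma = 4

# (a) P[X > 3] = 1 - Phi((3 - mu) / sigma)
p_a = 1.0 - Z.cdf((3.0 - mu) / sigma)

# (b) P[X < a] = 0.7910  =>  (a - mu) / sigma = Phi^{-1}(0.7910)
a = mu + sigma * Z.inv_cdf(0.7910)

# (c) P[X > b] = 0.1635  =>  (b - mu) / sigma = Phi^{-1}(1 - 0.1635)
b = mu + sigma * Z.inv_cdf(1.0 - 0.1635)
```

These evaluate to approximately 0.6915 for (a), a ≈ 8.24 for (b), and b ≈ 8.92 for (c), consistent with looking up z = 0.5, 0.81, and 0.98 in a standard Gaussian table.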
Example: Find the Outlier!
Find the outlier of this set of data:
[0.25, 0.31, 0.33, 0.32, 0.36, 0.28, 0.29, 0.26, 0.7, 0.34].
Compute the statistics.
µ = 0.344, σ = 0.129.
Standardize Z = (X − µ)/σ.
The z-values are:
-0.72, -0.26, -0.10, -0.18, 0.12, -0.49, -0.41, -0.64, 2.74, -0.03.
The probabilities P[Z < z] are:
0.23, 0.39, 0.45, 0.42, 0.54, 0.31, 0.33, 0.25, 0.9969, 0.48.
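The steps above can be reproduced in a few lines. This sketch assumes σ is the sample standard deviation (dividing by N − 1), which matches the slide's σ ≈ 0.129 up to rounding in the last digit, and it flags the point with the largest |z| as the outlier.

```python
from statistics import NormalDist, mean, stdev

data = [0.25, 0.31, 0.33, 0.32, 0.36, 0.28, 0.29, 0.26, 0.7, 0.34]

mu = mean(data)
sigma = stdev(data)  # sample standard deviation (divides by N - 1)

# Standardize each point, then look up P[Z < z] under a standard Gaussian.
z_scores = [(x - mu) / sigma for x in data]
probabilities = [NormalDist().cdf(z) for z in z_scores]

# The candidate outlier is the point farthest from the mean in z units.
outlier_index = max(range(len(data)), key=lambda i: abs(z_scores[i]))
outlier = data[outlier_index]
```

The value 0.7 stands out with z ≈ 2.74 and P[Z < z] ≈ 0.9969, meaning fewer than 1% of standard Gaussian draws would land above it.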
Linear Transform of Gaussian
If X is Gaussian, and if we let
Y = aX + b,
then Y is also Gaussian.
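Why?
Assume $X \sim \mathcal{N}(0, 1)$ and a > 0 (for a < 0 the inequality below flips). Otherwise, standardize Z = (X − µ)/σ.
$$\begin{aligned}
F_Y(y) &= \mathbb{P}[Y \le y] \\
&= \mathbb{P}[aX + b \le y] \\
&= \mathbb{P}[X \le (y - b)/a] \\
&= \int_{-\infty}^{(y - b)/a} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}\,dx.
\end{aligned}$$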
Linear Transform of Gaussian
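Therefore, by the Fundamental Theorem of Calculus,
$$\begin{aligned}
f_Y(y) &= \frac{d}{dy} F_Y(y) \\
&= \frac{d}{dy} \int_{-\infty}^{(y - b)/a} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}\,dx \\
&= \frac{d\left(\frac{y - b}{a}\right)}{dy} \cdot \frac{d}{d\left(\frac{y - b}{a}\right)} \int_{-\infty}^{(y - b)/a} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}\,dx \quad \text{(chain rule)} \\
&= \frac{1}{a} \cdot \frac{1}{\sqrt{2\pi}}\, e^{-\frac{((y - b)/a)^2}{2}} = \frac{1}{\sqrt{2\pi a^2}}\, e^{-\frac{(y - b)^2}{2a^2}}.
\end{aligned}$$
So Y is also Gaussian, with mean $\mathbb{E}[Y] = b$ and $\mathrm{Var}[Y] = a^2$.
In general: If X is Gaussian but not $\mathcal{N}(0, 1)$, then
$$\mathbb{E}[Y] = a\,\mathbb{E}[X] + b, \qquad \mathrm{Var}[Y] = a^2\,\mathrm{Var}[X].$$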
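The mean and variance formulas can be checked by simulation. The sketch below draws Gaussian samples with `random.gauss`, applies Y = aX + b, and compares the sample statistics with the predictions; the particular values a = 2, b = 3, µ = 1, σ = 0.5 are arbitrary.

```python
import random


def linear_transform_stats(a, b, mu, sigma, n=200_000, seed=0):
    """Sample mean and variance of Y = a X + b with X ~ N(mu, sigma^2)."""
    rng = random.Random(seed)
    ys = [a * rng.gauss(mu, sigma) + b for _ in range(n)]
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n
    return mean, var


a, b, mu, sigma = 2.0, 3.0, 1.0, 0.5
mean_y, var_y = linear_transform_stats(a, b, mu, sigma)

predicted_mean = a * mu + b          # 5.0
predicted_var = a ** 2 * sigma ** 2  # 1.0
```

The simulation only checks the first two moments; the Gaussian shape of Y is what the derivation above establishes.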
Detection
Problem: Consider two clusters of data points.
You want to build a simple classifier to determine whether a point belongs to $\mathcal{N}(\mu_1, \sigma_1^2)$ or $\mathcal{N}(\mu_2, \sigma_2^2)$.
Solution: Given the data point x, check whether one probability is larger than the other!
Detection
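Write down the two PDFs:
$$\frac{1}{\sqrt{2\pi\sigma_1^2}}\, e^{-\frac{(x - \mu_1)^2}{2\sigma_1^2}} \gtrless \frac{1}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(x - \mu_2)^2}{2\sigma_2^2}}$$
Simplified Case: When $\sigma_1 = \sigma_2 = \sigma$, and assuming $\mu_1 < \mu_2$. Then,
$$\begin{aligned}
e^{-\frac{(x - \mu_1)^2}{2\sigma^2}} &\gtrless e^{-\frac{(x - \mu_2)^2}{2\sigma^2}} \\
-\frac{(x - \mu_1)^2}{2\sigma^2} &\gtrless -\frac{(x - \mu_2)^2}{2\sigma^2} \\
(x - \mu_1)^2 &\lessgtr (x - \mu_2)^2 \\
x^2 - 2\mu_1 x + \mu_1^2 &\lessgtr x^2 - 2\mu_2 x + \mu_2^2 \\
x &\lessgtr \frac{\mu_1 + \mu_2}{2}.
\end{aligned}$$
Therefore, if $x < \frac{\mu_1 + \mu_2}{2}$, then it is more likely that it belongs to class 1.
Otherwise, it is more likely that it belongs to class 2.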
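The decision rule can be sketched directly by comparing the two PDF values. The function name `classify` and the tie-breaking toward class 2 are illustrative choices. With equal variances and µ1 < µ2, the comparison should agree with the midpoint threshold derived above.

```python
from statistics import NormalDist


def classify(x, mu1, sigma1, mu2, sigma2):
    """Return 1 or 2, whichever Gaussian has the larger PDF value at x."""
    p1 = NormalDist(mu1, sigma1).pdf(x)
    p2 = NormalDist(mu2, sigma2).pdf(x)
    return 1 if p1 > p2 else 2  # ties go to class 2 (arbitrary choice)


# Equal-variance case: the rule reduces to a threshold at (mu1 + mu2) / 2.
mu1, mu2, sigma = 0.0, 4.0, 1.0
threshold = (mu1 + mu2) / 2.0
labels = {x: classify(x, mu1, sigma, mu2, sigma) for x in (-1.0, 1.9, 2.1, 5.0)}
```

When the variances differ, the same function still works, but the decision boundary is no longer a single midpoint.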
5. Function of Random Variable
Function of Random Variable
Problem:
Given X .
Let Y = g(X ).
Want to find fY (y) and FY (y).
Example 1. Let X ∼ Uniform(0, 1). Let Y = 2X + 3. Find $f_Y(y)$.
Example 2. Let $X \sim \mathcal{N}(0, 1)$. Let $Y = X^2$. Find $f_Y(y)$.
Why should we care about this?
Needed by problem. E.g., power and voltage: $P = V^2/R$.
Needed by analysis. E.g., random phase cos(ωt + Θ).
Needed by design. E.g., variance stabilizing transform.
Examples
Example 1. Let X ∼ N (0, 1). Let Y = 2X + 3. Find fY (y) and FY (y).
Examples
Example 2. Let X ∼ Uniform(−1, 1). Suppose $Y = X^2$. Find $f_Y(y)$ and $F_Y(y)$.
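A possible solution sketch: for $0 \le y \le 1$, $F_Y(y) = \mathbb{P}[X^2 \le y] = \mathbb{P}[-\sqrt{y} \le X \le \sqrt{y}] = \sqrt{y}$, so $f_Y(y) = \frac{1}{2\sqrt{y}}$ for $0 < y \le 1$. The code compares this CDF with an empirical one from simulated samples; the function names and sample size are illustrative.

```python
import math
import random


def cdf_Y(y):
    """Derived CDF of Y = X^2 for X ~ Uniform(-1, 1)."""
    if y < 0.0:
        return 0.0
    if y > 1.0:
        return 1.0
    return math.sqrt(y)


def pdf_Y(y):
    """Derivative of cdf_Y on (0, 1]; zero elsewhere."""
    return 1.0 / (2.0 * math.sqrt(y)) if 0.0 < y <= 1.0 else 0.0


def empirical_cdf_Y(y, n=200_000, seed=0):
    """Fraction of simulated Y = X^2 values that are <= y."""
    rng = random.Random(seed)
    return sum(rng.uniform(-1.0, 1.0) ** 2 <= y for _ in range(n)) / n
```

Because g(x) = x² is not monotone on [-1, 1], both branches ±√y contribute, which is why the inequality has to be handled with care here.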
Examples
Example 3. Let X ∼ Uniform(0, 2π). Suppose Y = cos X. Find $f_Y(y)$ and $F_Y(y)$. Hint: $\frac{d}{dy} \cos^{-1} y = \frac{-1}{\sqrt{1 - y^2}}$.
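A possible solution sketch: for $-1 \le y \le 1$, $\cos X \le y$ exactly when $\cos^{-1} y \le X \le 2\pi - \cos^{-1} y$, so $F_Y(y) = 1 - \frac{\cos^{-1} y}{\pi}$, and by the hint $f_Y(y) = \frac{1}{\pi\sqrt{1 - y^2}}$ for $-1 < y < 1$. The code checks this against simulation; names and sample size are illustrative.

```python
import math
import random


def cdf_Y(y):
    """Derived CDF of Y = cos(X) for X ~ Uniform(0, 2*pi)."""
    if y < -1.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    return 1.0 - math.acos(y) / math.pi


def pdf_Y(y):
    """Derivative of cdf_Y on (-1, 1); zero elsewhere."""
    return 1.0 / (math.pi * math.sqrt(1.0 - y * y)) if -1.0 < y < 1.0 else 0.0


def empirical_cdf_Y(y, n=200_000, seed=0):
    """Fraction of simulated Y = cos(X) values that are <= y."""
    rng = random.Random(seed)
    count = sum(math.cos(rng.uniform(0.0, 2.0 * math.pi)) <= y for _ in range(n))
    return count / n
```

The density blows up near y = ±1 because the cosine is flat there, so X spends more of its range mapping to values near the extremes.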
General Procedure
As shown in the previous examples, the basic steps are
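$F_Y(y) = \mathbb{P}[Y \le y]$
$\mathbb{P}[Y \le y] = \mathbb{P}[g(X) \le y] = \mathbb{P}[X \le g^{-1}(y)]$, if g is increasing. Otherwise, pay attention to the inequality sign.
$\mathbb{P}[X \le g^{-1}(y)] = F_X(g^{-1}(y))$.
$f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{d}{dy} F_X(g^{-1}(y))$
The fundamental theorem of calculus is useful here:
$$\frac{d}{dy} F_X(g^{-1}(y)) = \frac{d}{dy} \int_{-\infty}^{g^{-1}(y)} f_X(x')\,dx'.$$
Chain rule:
$$\frac{d}{dy} \int_{-\infty}^{g^{-1}(y)} f_X(x')\,dx' = \frac{d g^{-1}(y)}{dy} \cdot \frac{d}{d g^{-1}(y)} \int_{-\infty}^{g^{-1}(y)} f_X(x')\,dx'.$$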
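As an illustration of the procedure, the sketch below applies it to Y = 2X + 3 with $X \sim \mathcal{N}(0, 1)$. Here g is increasing, $g^{-1}(y) = (y - 3)/2$, and $\frac{d g^{-1}(y)}{dy} = \frac{1}{2}$, so $f_Y(y) = \frac{1}{2} f_X\left(\frac{y - 3}{2}\right)$. The function names are illustrative, and `statistics.NormalDist` supplies $f_X$ and $F_X$.

```python
from statistics import NormalDist

X = NormalDist(0.0, 1.0)


def g_inverse(y):
    """Inverse of the increasing map g(x) = 2x + 3."""
    return (y - 3.0) / 2.0


def cdf_Y(y):
    """F_Y(y) = F_X(g^{-1}(y))."""
    return X.cdf(g_inverse(y))


def pdf_Y(y):
    """f_Y(y) = f_X(g^{-1}(y)) * d g^{-1}(y)/dy, where the derivative is 1/2."""
    return X.pdf(g_inverse(y)) * 0.5


# By the linear-transform result, Y should be N(3, 4), i.e. sigma = 2.
reference = NormalDist(3.0, 2.0)
```

Comparing `pdf_Y` and `cdf_Y` with `reference.pdf` and `reference.cdf` at a few points confirms that the procedure recovers $Y \sim \mathcal{N}(3, 4)$.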
Why Study Function of Random Variable?
Variance Stabilizing Transform
Most of the denoising algorithms are
Designed for Gaussian noise
Assume variance is constant throughout the image
Easy to analyze, easy to implement
But, most photon shot noise is
Poisson
If X ∼ Poisson(λ), then E[X ] = λ and Var[X ] = λ
Variance changes as pixel intensity changes.
Variance stabilizing transform:
Let $Y = \sqrt{X + 3/8}$
Var[Y ] ≈ 1/4, constant throughout the image
Anscombe, F. J. (1948), “The transformation of Poisson, binomial and negative-binomial data”, Biometrika, 35 (3-4), pp. 246-254.
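The stabilizing effect can be observed in simulation. The sketch below draws Poisson samples with Knuth's multiplication method (a standard-library-only sampler chosen for illustration, suitable for moderate λ) and compares Var[X] with Var[Y] for $Y = \sqrt{X + 3/8}$ at several intensities.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method for one Poisson(lam) draw."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def vst_variances(lam, n=20_000, seed=0):
    """Return (Var[X], Var[Y]) for X ~ Poisson(lam), Y = sqrt(X + 3/8)."""
    rng = random.Random(seed)
    xs = [sample_poisson(lam, rng) for _ in range(n)]
    ys = [math.sqrt(x + 3.0 / 8.0) for x in xs]
    return variance(xs), variance(ys)
```

Calling `vst_variances` for, say, λ = 10, 30, and 100 should show Var[X] growing with λ while Var[Y] stays near 1/4.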
Variance Stabilizing Transform
Figure: Top row: X (noisy input), Var[X] (before), Var[Y] (after). Bottom row: noisy input, direct denoise, transform-denoise.