Advanced Random Processes
Text: PROBABILITY AND RANDOM PROCESSES, by Davenport
Course Outline
- Basic Probability Theory; Random Variables and Vectors;
Conditional Probability and Densities; Expectation; Conditional Expectation
- Estimation with Static Models; Random Processes;
- Stationarity; Power Spectral Density
- Mean-Square Calculus; Linear System
- Kalman Filter
- Wiener Integrals; Wiener Filter
Grade
Mid-term (40 %), Final (40 %), Homework (20 %)
2. SAMPLE POINTS AND SAMPLE SPACES
Sample point
A sample point is a representation of a possible outcome of an experiment.
Sample space
A sample space is the totality of all possible sample points, that is, the representation of all possible outcomes of an experiment.
Event
An event is an outcome or a collection of outcomes. It is also defined as the corresponding sample point or set of sample points, respectively.
Event defined by listing
A = {s1, s2, ···, sn}    B = {s1, s2, s3, ···}
Event defined by description
A = {s : prop(s) is true}
where prop(s) is some proposition about s: for example, |s| < 1.
Implication or inclusion
A ⊂ B ⇔ (s ∈ A ⇒ s ∈ B)

Equality
A = B ⇔ A ⊂ B and B ⊂ A

Union
A ∪ B ≜ {s : s ∈ A or s ∈ B or both}
A ⊂ A ∪ B and B ⊂ A ∪ B
A ⊂ B ⇔ A ∪ B = B

Intersection
A ∩ B ≜ {s : s ∈ A and s ∈ B}
A ∩ B ⊂ A and A ∩ B ⊂ B
A ⊂ B ⇔ A ∩ B = A
Distributive laws
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
and
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)

Complement
Aᶜ ≜ {s : s ∈ S and s ∉ A}
(Aᶜ)ᶜ = A
A ⊂ B ⇒ Bᶜ ⊂ Aᶜ

Relative complement
B − A ≜ {s : s ∈ B and s ∉ A}
B − A = B ∩ Aᶜ

Null set φ
A ∩ Aᶜ = φ
Sᶜ = φ
A ∪ φ = A and A ∩ φ = φ
φ ⊂ A for any A ⊂ S

Disjoint events
The events A and B are disjoint if and only if A ∩ B = φ.

De Morgan’s rules
(A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
(A ∪ B)ᶜ = Aᶜ ∩ Bᶜ

Partitions of S
A1, A2, ···, An where
Ai ∩ Aj = φ for i ≠ j and ∪_{i=1}^{n} Ai = S
3. PROBABILITY

Probability Space
A Probability Space is a triple (S, A, P).
S = sample space
A = σ-algebra on S
P = probability measure
σ-Algebra
A is a nonempty class of subsets of S such that
(i) A ∈ A ⇒ Aᶜ ∈ A
(ii) A, B ∈ A ⇒ A ∪ B ∈ A
(iii) A1, A2, A3, ··· ∈ A ⇒ ∪_{i=1}^{∞} Ai ∈ A
Probability
Probability is a set function P : A → R which satisfies the following axioms:
(i) P[A] ≥ 0
(ii) P[S] = 1
(iii) Ai ∩ Aj = φ, i ≠ j ⇒ P[∪_{i=1}^{∞} Ai] = Σ_{i=1}^{∞} P[Ai] (countably additive)
Elementary properties of probability
P[Aᶜ] = 1 − P[A]
P[φ] = 0
P[A] ≤ 1
P[B − A] = P[B] − P[A ∩ B]
A ⊂ B ⇒ P[B − A] = P[B] − P[A]
A ⊂ B ⇒ P[A] ≤ P[B]
P[A ∪ B] = P[A] + P[B] − P[A ∩ B]
P[A ∪ B] ≤ P[A] + P[B]
Joint probability
If the sample space S is partitioned by the collection of events A1, A2, ···, Am, then
P[B] = P[B ∩ S] = P[B ∩ (∪_{j=1}^{m} Aj)] = P[∪_{j=1}^{m} (B ∩ Aj)]
= Σ_{j=1}^{m} P[B ∩ Aj]
= Σ_{j=1}^{m} P[B|Aj] P[Aj]
Conditional probability
P[B|A] ≜ P[A ∩ B] / P[A]
so long as P[A] > 0.
Bayes’ rule
Let B be an arbitrary event in a sample space S. Suppose that the events A1, A2, ···, Am
partition S and that P[Ai] > 0 for all i. Then
P[Ai|B] = P[Ai ∩ B] / P[B] = P[B|Ai] P[Ai] / Σ_{j=1}^{m} P[B|Aj] P[Aj]
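As a quick illustration of the total-probability and Bayes computations above, here is a minimal numerical sketch in Python (the priors and likelihoods are made-up example numbers, not from the text):

```python
# Bayes' rule over a partition A1, A2, A3 of S (example numbers, assumed).
prior = [0.5, 0.3, 0.2]        # P[Aj]
likelihood = [0.1, 0.4, 0.8]   # P[B|Aj]

# Total probability: P[B] = sum_j P[B|Aj] P[Aj]
p_b = sum(l * p for l, p in zip(likelihood, prior))

# Posterior: P[Ai|B] = P[B|Ai] P[Ai] / P[B]
posterior = [l * p / p_b for l, p in zip(likelihood, prior)]
print(p_b)        # 0.33
print(posterior)  # [0.1515..., 0.3636..., 0.4848...]  (sums to 1)
```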
Independent events
The events A and B are said to be statistically independent if
P[A ∩ B] = P[A] P[B]
The events A1, A2, ···, An are said to be mutually independent if and only if the relations
P[Ai ∩ Aj] = P[Ai] P[Aj]
P[Ai ∩ Aj ∩ Ak] = P[Ai] P[Aj] P[Ak]
···
P[A1 ∩ A2 ∩ ··· ∩ An] = P[A1] P[A2] ··· P[An]
hold for all combinations of the indices such that 1 ≤ i < j < k < ··· ≤ n.
Independent experiments
Suppose that we are concerned with the outcomes of n different experiments E1, E2, ···, En. Suppose further that the sample space Sk of the kth of these n experiments is partitioned by the mk events A_{k,ik}, ik = 1, 2, ···, mk. The n given experiments are then said to be statistically independent if the equation
P[A_{1,i1} ∩ A_{2,i2} ∩ ··· ∩ A_{n,in}] = P[A_{1,i1}] P[A_{2,i2}] ··· P[A_{n,in}]
holds for every possible set of n integers i1, i2, ···, in, where the index ik ranges from 1 to mk.
4. RANDOM VARIABLES
Set-indicator function
IA(s) = 1 if s ∈ A, and 0 if s ∉ A
Inverse image
X⁻¹(A) = {s ∈ S | X(s) ∈ A}

A-measurable
A map X : S → R is A-measurable if
{s | X(s) < a} ∈ A, for all a ∈ R
Random variable
A random variable X is an A-measurable function from the sample space S to R.
Induced probability
P[X ∈ A] ≜ P[X⁻¹(A)] = P[{s ∈ S | X(s) ∈ A}]
P[X < a] ≜ P[{s ∈ S | X(s) < a}]

Range
RX = range of X = {a ∈ R | a = X(s) for some s ∈ S}
Probability distribution function
FX(x) ≜ P[X ≤ x] = P[{s ∈ S | X(s) ≤ x}]

Properties of probability distribution function
FX(+∞) = 1 and FX(−∞) = 0
b > a ⇒ FX(b) ≥ FX(a) (monotone non-decreasing)
FX(a) = FX(a+0) = lim_{ε→0⁺} FX(a + ε) (right continuous)
FX(a−0) + P[X = a] = FX(a)
Decomposition of distribution functions
FX(x) = DX(x) + CX(x)
where DX is a step function and hence may be expressed as
DX(x) = Σ_i P[X = xi] U(x − xi)
and where CX is continuous everywhere.
Interval probability
P[X ∈ (a, b]] ≜ P[a < X ≤ b] = FX(b) − FX(a)

Probability density
fX(x) ≜ dFX(x)/dx

Properties of probability density function
fX(x) ≥ 0
FX(x) = ∫_{−∞}^{x} fX(ξ) dξ
∫_{−∞}^{∞} fX(ξ) dξ = 1

Calculation of probability
P[X ∈ A] = ∫_A fX(x) dx
Uniform density function
fX(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise

Exponential density function
fX(x) = a e^{−ax} for x ≥ 0, and 0 otherwise (a > 0)

Normal density function
fX(x) = (1/√(2π)) e^{−x²/2}

Rayleigh density function
fX(x) = (x/b) e^{−x²/2b} for x ≥ 0, and 0 otherwise (b > 0)

Cauchy density function
fX(x) = (a/π) · 1/(a² + x²)  (a > 0)
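These densities can be sampled by inverting the distribution function; a minimal sketch (assuming Python with numpy) for the exponential case, where FX(x) = 1 − e^{−ax} inverts to X = −ln(1 − U)/a with U uniform on (0, 1):

```python
import numpy as np

rng = np.random.default_rng(0)
a = 2.0                       # exponential parameter (example value)
u = rng.uniform(size=100_000)
x = -np.log(1.0 - u) / a      # inverse of F(x) = 1 - exp(-a*x)

# Sample moments approach E[X] = 1/a and var(X) = 1/a**2.
print(x.mean(), x.var())      # ~0.5, ~0.25
```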
5. RANDOM VECTORS

Random vector
X(s) = (X1(s), X2(s), ···, Xn(s))ᵀ
where Xi(s), i = 1, ···, n, is a random variable defined on S. Thus, a random vector is a finite family of random variables.
Joint-probability distribution function
FX,Y(x, y) ≜ P[X ≤ x, Y ≤ y] ≜ P[{s ∈ S | X(s) ≤ x and Y(s) ≤ y}]

Properties of joint-probability distribution function
FX,Y(−∞, y) = 0, FX,Y(x, −∞) = 0
FX,Y(+∞, +∞) = 1
FX,Y(x, +∞) = FX(x), FX,Y(+∞, y) = FY(y) (marginal distributions)
Joint-probability density
fX,Y(x, y) ≜ ∂²FX,Y(x, y)/∂x∂y

Properties of joint-probability density function
fX,Y(x, y) ≥ 0
∫_{−∞}^{∞} ∫_{−∞}^{∞} fX,Y(x, y) dx dy = 1
FX(x) = ∫_{−∞}^{∞} ∫_{−∞}^{x} fX,Y(ξ, η) dξ dη
fX(x) = ∫_{−∞}^{∞} fX,Y(x, η) dη
FY(y) = ∫_{−∞}^{y} ∫_{−∞}^{∞} fX,Y(ξ, η) dξ dη
fY(y) = ∫_{−∞}^{∞} fX,Y(ξ, y) dξ
Two-dimensional normal (or gaussian) density
fX,Y(x, y) = (1/(2π√(1 − ρ²))) exp[−(x² − 2ρxy + y²)/(2(1 − ρ²))]  (|ρ| < 1)

Probability calculation
P[(X, Y) ∈ A] = ∫∫_A fX,Y(x, y) dx dy

Ex:
fX,Y(x, y) = (1/8)(x + y) for 0 ≤ x, y ≤ 2, and 0 otherwise
fX(x) = ∫_{−∞}^{∞} fX,Y(x, y) dy = ∫_{0}^{2} (1/8)(x + y) dy = (1/4)(x + 1) for 0 ≤ x ≤ 2, and 0 otherwise
P[|X − Y| > 1] = 2 ∫_{1}^{2} ∫_{0}^{x−1} (1/8)(x + y) dy dx = 1/4
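A Monte Carlo check of this worked example (a sketch assuming numpy; (X, Y) is drawn from fX,Y by rejection sampling on the square [0, 2]², where the density is bounded by 1/2):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
x = rng.uniform(0, 2, n)
y = rng.uniform(0, 2, n)

# Accept with probability f(x, y) / max f = ((x + y)/8) / (1/2).
keep = rng.uniform(size=n) < ((x + y) / 8.0) / 0.5
x, y = x[keep], y[keep]

print(np.mean(np.abs(x - y) > 1))   # close to 1/4
```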
Conditional-probability distribution function
FX(x|Y ∈ B) ≜ P[X ≤ x | Y ∈ B] = P[X ≤ x, Y ∈ B] / P[Y ∈ B]
whenever P[Y ∈ B] > 0.
FX(−∞|Y ∈ B) = 0
FX(+∞|Y ∈ B) = 1

Conditional-probability density
fX(x|Y ∈ B) ≜ dFX(x|Y ∈ B)/dx
fX(x|Y ∈ B) ≥ 0
∫_{−∞}^{+∞} fX(x|Y ∈ B) dx = 1
FX(x|Y ∈ B) = ∫_{−∞}^{x} fX(ξ|Y ∈ B) dξ
P[X ∈ A|Y ∈ B] = ∫_A fX(ξ|Y ∈ B) dξ
Point conditioning conditional-probability distribution function
FX|Y(x|y) ≜ FX(x|Y = y) ≅ [∫_{y}^{y+dy} ∫_{−∞}^{x} fX,Y(ξ, η) dξ dη] / [fY(y) dy]
= [dy ∫_{−∞}^{x} fX,Y(ξ, y) dξ] / [fY(y) dy]
= [∫_{−∞}^{x} fX,Y(ξ, y) dξ] / fY(y)

Point conditioning conditional-probability density
fX|Y(x|y) ≜ dFX|Y(x|y)/dx = fX,Y(x, y) / fY(y)
P[X ∈ A|Y = y] = ∫_A fX|Y(ξ|y) dξ
P[X ∈ A] = ∫_{−∞}^{+∞} P[X ∈ A|Y = y] fY(y) dy

Independent random variables
The random variables X and Y are statistically independent if
FX,Y(x, y) = FX(x) FY(y)
or
fX,Y(x, y) = fX(x) fY(y)
X and Y independent ⇔ fX|Y(x|y) = fX(x)
6. Functions of random variables

X -- random variable with FX(x)
Y ≜ g(X)
{s ∈ S | g(X(s)) ≤ a} ∈ A, for all a ∈ R
Y -- random variable with FY(y)
FY(y) ≜ P[{x | g(x) ≤ y}]
fY(y) = dFY(y)/dy = ?
g is increasing ⇒ FY(y) = P[Y ≤ y] = P[X ≤ g⁻¹(y)] = FX(g⁻¹(y))
fY(y) = dFY(y)/dy = dFX(g⁻¹(y))/dy = fX(h(y)) · dh/dy
where h(y) ≜ g⁻¹(y)

g is decreasing ⇒ FY(y) = P[Y ≤ y] = P[X ≥ g⁻¹(y)] = 1 − FX(g⁻¹(y))
fY(y) = dFY(y)/dy = −dFX(g⁻¹(y))/dy = −fX(h(y)) · dh/dy
g is one-to-one ⇒ fY(y) = fX(h(y)) |dh/dy|

Ex: Y = sin X
fX(x) = 1/π for −π/2 < x < π/2, and 0 o/w
⇒ fY(y) = fX(sin⁻¹ y) |d(sin⁻¹ y)/dy| = fX(sin⁻¹ y) · 1/√(1 − y²)
fX(sin⁻¹ y) = 1/π for −1 < y < 1, and 0 o/w
⇒ fY(y) = 1/(π√(1 − y²)) for −1 < y < 1, and 0 o/w
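A quick histogram check of this result (a sketch assuming numpy): sample X uniformly on (−π/2, π/2) and compare the empirical density of Y = sin X with 1/(π√(1 − y²)):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-np.pi / 2, np.pi / 2, 1_000_000)  # fX = 1/pi on (-pi/2, pi/2)
y = np.sin(x)

hist, edges = np.histogram(y, bins=50, range=(-0.99, 0.99), density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
theory = 1.0 / (np.pi * np.sqrt(1.0 - mid**2))
print(np.max(np.abs(hist - theory)))   # small (sampling/binning error)
```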
g is NOT one-to-one ⇒ fY(y) = Σ_j fX(g_j⁻¹(y)) |d g_j⁻¹(y)/dy|

Ex: Y = |sin X|
fX(x) = 1/π for −π/2 < x < π/2, and 0 o/w
fY(y) = fX(sin⁻¹ y) |d(sin⁻¹ y)/dy| + fX(−sin⁻¹ y) |d(−sin⁻¹ y)/dy| = (fX(sin⁻¹ y) + fX(−sin⁻¹ y)) · 1/√(1 − y²)
fX(sin⁻¹ y) = 1/π for 0 < y < 1 (0 o/w), and fX(−sin⁻¹ y) = 1/π for 0 < y < 1 (0 o/w)
⇒ fY(y) = 2/(π√(1 − y²)) for 0 < y < 1, and 0 o/w
Z ≜ X + Y
FZ(z) = P[X + Y ≤ z] = ∫_{−∞}^{+∞} [∫_{−∞}^{z−x} fX,Y(x, y) dy] dx
fZ(z) = d/dz ∫_{−∞}^{+∞} [∫_{−∞}^{z−x} fX,Y(x, y) dy] dx
= ∫_{−∞}^{+∞} [d/dz ∫_{−∞}^{z−x} fX,Y(x, y) dy] dx
= ∫_{−∞}^{+∞} fX,Y(x, z − x) dx
X and Y independent ⇒ fZ(z) = ∫_{−∞}^{+∞} fX(x) fY(z − x) dx (convolution)
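The convolution formula can be checked numerically; a sketch (assuming numpy) for two independent uniform densities on [0, 1], whose sum has the triangular density on [0, 2]:

```python
import numpy as np

dx = 0.001
x = np.arange(0.0, 1.0, dx)
f_x = np.ones_like(x)              # uniform density on [0, 1)
f_y = np.ones_like(x)

# Discretized convolution integral: fZ(z) = sum over x of fX(x) fY(z - x) dx
f_z = np.convolve(f_x, f_y) * dx
z = np.arange(len(f_z)) * dx

# fZ is the triangle: z on [0, 1], 2 - z on [1, 2].
print(np.allclose(f_z, np.where(z <= 1, z, 2 - z), atol=0.01))  # True
```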
Z ≜ XY
FZ(z) = P[XY ≤ z] = ∫_{−∞}^{0} [∫_{z/x}^{∞} fX,Y(x, y) dy] dx + ∫_{0}^{+∞} [∫_{−∞}^{z/x} fX,Y(x, y) dy] dx
fZ(z) = ∫_{−∞}^{0} (−1/x) fX,Y(x, z/x) dx + ∫_{0}^{+∞} (1/x) fX,Y(x, z/x) dx = ∫_{−∞}^{+∞} (1/|x|) fX,Y(x, z/x) dx
X and Y independent ⇒ fZ(z) = ∫_{−∞}^{+∞} (1/|x|) fX(x) fY(z/x) dx
X = (X1, X2)ᵀ and Y = (Y1, Y2)ᵀ
Y ≜ g(X) (one-to-one)
fY(y) = fX(h(y)) · |det J(y)|, where X = h(Y) and J(y) is the Jacobian matrix
J(y) = [ ∂h1/∂y1  ∂h1/∂y2
         ∂h2/∂y1  ∂h2/∂y2 ]
7. Statistical averages

Statistical average or Expectation
E[X] ≜ Σ_k xk P[X = xk]
E[X] ≜ ∫_{−∞}^{∞} x fX(x) dx
E[g(X)] ≜ Σ_k g(xk) P[X = xk]
E[g(X)] ≜ ∫_{−∞}^{∞} g(x) fX(x) dx
Random vectors
X = (X1, X2, ···, Xn)ᵀ
E[X] ≜ (E[X1], E[X2], ···, E[Xn])ᵀ
E[g(X)] = Σ_{k1} Σ_{k2} ··· Σ_{kn} g(x_{k1}, x_{k2}, ···, x_{kn}) P[X1 = x_{k1}, X2 = x_{k2}, ···, Xn = x_{kn}]
E[g(X)] = ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} g(x1, x2, ···, xn) f_{X1,X2,···,Xn}(x1, x2, ···, xn) dx1 dx2 ··· dxn
General properties of the Expectation
E[IA(X)] = P [X ∈ A]
where IA is the set indicator of the event A ⊂ R.
E[aX] = aE[X], for any real a
E[a1X1 + a2X2] = a1E[X1] + a2E[X2], for any real a1 and a2
E[AX] = AE[X], for any real matrix A
|E[X]| ≤ E[|X|]
X(s) ≥ 0 for every s ∈ S ⇒ E[X] ≥ 0
X1(s) ≥ X2(s) for every s ∈ S ⇒ E[X1] ≥ E[X2]
X1 and X2 independent ⇒ E[X1X2] = E[X1]E[X2]

kth Moments
kth moment of X ≜ E[Xᵏ]

Variance
ΣX ≜ E[(X − E[X])²] = E[X²] − E[X]²
ΣX ≜ E[(X − E[X])(X − E[X])ᵀ] = E[XXᵀ] − E[X]E[X]ᵀ

Covariance
ΣXY ≜ E[(X − E[X])(Y − E[Y])ᵀ] = E[XYᵀ] − E[X]E[Y]ᵀ

Uncorrelated random variables
X and Y uncorrelated ⇔ ΣXY = 0 ⇔ E[XYᵀ] = E[X]E[Y]ᵀ

Orthogonal random variables
X and Y orthogonal ⇔ E[XYᵀ] = 0
Properties of the variance and the covariance
ΣXᵀ = ΣX (symmetric)
bᵀ ΣX b ≥ 0, for all b ∈ Rⁿ (positive semidefinite)
Σ_{AX+b} = A ΣX Aᵀ, for any real A and b
E[(X − c)²] ≥ ΣX, for any real c
X1, X2, X3 pairwise uncorrelated ⇒ Σ_{X1+X2+X3} = Σ_{X1} + Σ_{X2} + Σ_{X3}
Σ_{YX} = Σ_{XY}ᵀ
Σ_{AX+BY,Z} = A Σ_{X,Z} + B Σ_{Y,Z}, for any real A and B
Σ_{AX+BY} = Σ_{AX+BY,AX+BY} = A Σ_{X,AX+BY} + B Σ_{Y,AX+BY}
= A(Σ_{AX+BY,X})ᵀ + B(Σ_{AX+BY,Y})ᵀ = A(A Σ_{X,X} + B Σ_{Y,X})ᵀ + B(A Σ_{X,Y} + B Σ_{Y,Y})ᵀ
= A ΣX Aᵀ + A Σ_{X,Y} Bᵀ + B Σ_{Y,X} Aᵀ + B ΣY Bᵀ
X, Y uncorrelated ⇒ Σ_{AX+BY} = A ΣX Aᵀ + B ΣY Bᵀ
Bernoulli random variables
P[X = 1] = p and P[X = 0] = 1 − p
E[X] = p and E[Xᵏ] = p, k = 1, 2, 3, ···
ΣX = p(1 − p)

Binomial random variables
P[Y = k] = (n choose k) pᵏ qⁿ⁻ᵏ, k = 0, 1, 2, ···, n
where q = 1 − p.
E[Y] = np, ΣY = npq

Poisson random variables
P[X = k] = e^{−M} Mᵏ/k!, k = 0, 1, 2, ···
E[X] = M = ΣX
Uniform random variables
Let X be uniformly distributed over the interval [a, b]. Then
E[X] = (a + b)/2
E[Xᵏ] = (bᵏ + bᵏ⁻¹a + ··· + baᵏ⁻¹ + aᵏ)/(k + 1)

Exponential random variables
fT(t) = a e^{−at} for t ≥ 0, and 0 for t < 0
E[T] = 1/a
E[Tᵏ] = k!/aᵏ

Gaussian or normal random variables
fX(x) = (1/(√(2π)σ)) exp[−(x − m)²/(2σ²)]
E[X] = m, ΣX = σ²
f_{X1,X2}(x1, x2) = (1/(2πσ1σ2√(1 − ρ²))) exp[−(((x1−m1)/σ1)² − 2ρ((x1−m1)/σ1)((x2−m2)/σ2) + ((x2−m2)/σ2)²)/(2(1 − ρ²))]
E[Xi] = mi, Σ_{Xi} = σi², Σ_{X1,X2} = ρσ1σ2

X ∈ Rⁿ:
fX(x) = (1/((2π)^{n/2}|ΣX|^{1/2})) exp[−(1/2)(x − mX)ᵀ ΣX⁻¹ (x − mX)]

Rayleigh random variables
fR(r) = (r/b) e^{−r²/2b} for r ≥ 0, and 0 otherwise
E[R] = √(bπ/2), E[R²] = 2b
Chebyshev inequality
Let X be a r.v. with E[|X|ʳ] < ∞, for some r > 0. Then
P[|X| ≥ ε] ≤ E[|X|ʳ]/εʳ, for any ε > 0.
PF:
Y ≜ 0 if |X| < ε, and εʳ if |X| ≥ ε
E[Y] = 0 · P[Y = 0] + εʳ P[Y = εʳ] = εʳ P[|X| ≥ ε]
Y ≤ |X|ʳ ⇒ E[Y] ≤ E[|X|ʳ] ⇒ P[|X| ≥ ε] = E[Y]/εʳ ≤ E[|X|ʳ]/εʳ

Special case: X = Z − E[Z] and r = 2
P[|Z − E[Z]| ≥ ε] ≤ ΣZ/ε², for any ε > 0
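An empirical look at the special case r = 2 (a sketch assuming numpy, with Z standard normal so that ΣZ = 1):

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.normal(size=1_000_000)    # E[Z] = 0, var(Z) = 1

for eps in (1.0, 2.0, 3.0):
    empirical = np.mean(np.abs(z) >= eps)
    bound = 1.0 / eps**2          # Chebyshev: P[|Z| >= eps] <= var(Z)/eps^2
    print(eps, empirical, bound)  # empirical never exceeds the bound
```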
Cauchy-Schwarz inequality
Let the real random variables X and Y have finite second moments. Then
E[XY]² ≤ E[X²] E[Y²]
PF: For any real λ,
0 ≤ E[(λX + Y)²] = λ² E[X²] + 2λ E[XY] + E[Y²]
Since this quadratic in λ is nonnegative for all λ, its discriminant must satisfy 4E[XY]² − 4E[X²]E[Y²] ≤ 0, which gives the result.
Conditional Expectation
E[X|Y = y] ≜ Σ_j xj P[X = xj | Y = y]
E[X|Y = y] ≜ ∫_{−∞}^{∞} x fX|Y(x|y) dx
E[g(X)|Y = y] ≜ ∫_{−∞}^{∞} g(x) fX|Y(x|y) dx = h(y)
⇒ E[g(X)|Y] ≜ h(Y)
Σ_{X|Y} ≜ E[(X − E[X|Y])(X − E[X|Y])ᵀ | Y] = E[XXᵀ | Y] − E[X|Y] E[X|Y]ᵀ
Properties of the Conditional Expectation
E[IA(Y)|X = x] = P[Y ∈ A|X = x]
E[g(Y) E[h(X)|Y]] = E[g(Y) h(X)]
PF:
∫_{−∞}^{∞} g(y) E[h(X)|Y = y] fY(y) dy = ∫_{−∞}^{∞} g(y) [∫_{−∞}^{∞} h(x) fX|Y(x|y) dx] fY(y) dy
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(y) h(x) fX|Y(x|y) fY(y) dx dy = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(y) h(x) fX,Y(x, y) dx dy

Special case: g(Y) = 1, h(X) = X
E[E[X|Y]] = E[X]
E[AX|Y] = A E[X|Y]
E[X + Z|Y] = E[X|Y] + E[Z|Y]
E[g(Y)X|Y = y] = g(y) E[X|Y = y]
E[g(Y)X|Y] = g(Y) E[X|Y]
X and Y independent ⇒ E[h(X)|Y = y] = E[h(X)]
8. ESTIMATION, SAMPLING, AND PREDICTION

X & Y – jointly distributed
X – to be estimated
Y – observed

Question: Given the value Y = y, what is the “best” estimate x̂ of the value of X, that is, the x̂ that minimizes, over all x̂,
E[‖X − x̂‖² | Y = y] = E[(X − x̂)ᵀ(X − x̂) | Y = y]

Theorem: x̂ = E[X|Y = y], and the minimum value of the mean-squared error is
E[‖X − x̂‖² | Y = y] = E[(X − x̂)ᵀ(X − x̂) | Y = y]
= E[tr (X − x̂)(X − x̂)ᵀ | Y = y]
= tr E[(X − x̂)(X − x̂)ᵀ | Y = y] = tr Σ_{X|Y=y}
Proof.
E[(X − x̂)ᵀ(X − x̂) | Y = y] = E[XᵀX − x̂ᵀX − Xᵀx̂ + x̂ᵀx̂ | Y = y]
= E[XᵀX | Y = y] − x̂ᵀE[X | Y = y] − E[Xᵀ | Y = y] x̂ + x̂ᵀx̂
= E[XᵀX | Y = y] − 2x̂ᵀE[X | Y = y] + x̂ᵀx̂ + ‖E[X | Y = y]‖² − ‖E[X | Y = y]‖²
= ‖x̂ − E[X | Y = y]‖² + E[XᵀX | Y = y] − ‖E[X | Y = y]‖²
Remark: For an n × n matrix A = [aij], trace(A) = tr A ≜ Σ_{i=1}^{n} aii
(a) A is a scalar ⇒ tr A = A
(b) tr(AB) = tr(BA)
(c) tr(A + B) = tr A + tr B
Terminology:
(a) For any value y of Y, the “Best Estimate” is x̂ = E[X|Y = y].
(b) Let y vary. X̂ = E[X|Y] is the “Best Estimator”. Thus, the “Best Estimator” is a random variable.

Theorem: The estimator of X in terms of Y that minimizes E[‖X − g(Y)‖²] over all functions g is X̂ = E[X|Y].

Proof. See I.B. Rhodes, “A Tutorial Introduction to Estimation and Filtering,” IEEE Transactions on Automatic Control, Vol. 16, No. 6, 1971.
Properties of Best Estimator:
(a) Linear: E[AX + BZ + C|Y] = A E[X|Y] + B E[Z|Y] + C
(b) Unbiased: E[X̂] = E[E[X|Y]] = E[X]
(c) Projection Theorem: the error X − X̂ ≜ X̃ is orthogonal to the r.v. g(Y) for any scalar function g, i.e.,
E[g(Y) X̃ᵀ] = 0
PF:
E[g(Y) X̃ᵀ] = E[E[g(Y) X̃ᵀ | Y]] = E[g(Y)(E[Xᵀ|Y] − E[X̂ᵀ|Y])] = E[g(Y)(E[Xᵀ|Y] − E[Xᵀ|Y])] = 0
Definition:
(a) X & Y are L²-orthogonal if E[XᵀY] = 0 (denoted X ⊥ Y).
(Reminder: X & Y are orthogonal if E[XYᵀ] = 0; note E[XᵀY] = tr E[XYᵀ].)
(b) Let M be a subspace of 𝒳 (e.g., M = all n-vector valued functions f(Y)).
M⊥ ≜ {X ∈ 𝒳 | X ⊥ Y for all Y ∈ M}

Projection Theorem:
Let M be a subspace of 𝒳. Then there exists a unique pair of maps P : 𝒳 → M and Q : 𝒳 → M⊥ such that X = PX + QX, for all X ∈ 𝒳.
Also:
(a) X ∈ M ⇒ PX = X and QX = 0
X ∈ M⊥ ⇒ PX = 0 and QX = X
(b) For all X ∈ 𝒳, ‖X − PX‖ = min_{X̄ ∈ M} ‖X − X̄‖
i.e., the projection of X on M gives minimum error over all points in M.
(c) ‖X‖² = ‖PX‖² + ‖QX‖²
(d) P & Q are linear.
Problem: Find the best linear estimator X* = A*Y + b* that minimizes E[‖X − AY − b‖²] over all n × m matrices A and n × 1 vectors b.

Sol.) First, assume that X & Y have zero mean. Let
M = {all random vectors of the form AY + b}.
By the Projection Theorem, X − A*Y − b* ⊥ M.
That is, for all A & b,
E[(AY + b)ᵀ(X − A*Y − b*)] = tr E[(X − A*Y − b*)(AY + b)ᵀ] = tr[ΣXY Aᵀ − A*ΣY Aᵀ − b*bᵀ] = tr[(ΣXY − A*ΣY)Aᵀ] − bᵀb* = 0
Thus
A* = ΣXY ΣY⁻¹ and b* = 0
which implies that X* = ΣXY ΣY⁻¹ Y.
Assume non-zero mean. Then,
(X − mX)* = ΣXY ΣY⁻¹ (Y − mY)
Thus
X* = mX + ΣXY ΣY⁻¹ (Y − mY)
Basic Properties of Best Linear Estimator:
(a) Unbiased: E[X*] = E[X] = mX
(b) Let X̃ = X − X*. Then the error covariance is
Σ_X̃ = E[(X − X*)(X − X*)ᵀ] = ΣX − ΣXY ΣY⁻¹ ΣYX

Remark:
(a) If uncorrelated, the best linear estimator is
X* = mX and Σ_X̃ = ΣX (∵ ΣXY = 0)
(b) If independent,
the best linear estimator is X* = mX (∵ independent ⇒ uncorrelated)
and the best estimator is X̂ = mX.
Example:
fXY(x, y) = 3 for 0 ≤ y ≤ 1, 0 ≤ x ≤ y², and 0 otherwise
Estimate X by (a) a constant, (b) a linear estimator, and (c) a nonlinear estimator.

Sol)
fX(x) = 3(1 − √x) for 0 ≤ x ≤ 1, and 0 otherwise
(a)
E[X] = ∫_{−∞}^{∞} x fX(x) dx = ∫_{0}^{1} 3x(1 − √x) dx = [(3/2)x² − (6/5)x^{5/2}]₀¹ = 3/2 − 6/5 = 3/10
Error var. = E[(X − mX)²] = 37/700 ≅ 0.0529
(b)
fY(y) = 3y² for 0 ≤ y ≤ 1, and 0 otherwise
mY = 3/4, var(Y) = 3/80, cov(X, Y) = 1/40
X* = 3/10 + (1/40)/(3/80) · (Y − 3/4) = (2/3)Y − 1/5
Error var. = var(X) − cov(X, Y)²/var(Y) ≅ 0.0362
(c)
fX|Y(x|y) = 1/y² for 0 ≤ x ≤ y² ≤ 1, and 0 otherwise
E[X|Y = y] = ∫_{0}^{y²} x (1/y²) dx = (1/2)y²
X̂ = E[X|Y] = (1/2)Y²
Error var. = E[(X − X̂)²] ≅ 0.0357
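A Monte Carlo replay of the three estimators (a sketch assuming numpy; since fXY is the constant 3 on the region 0 ≤ x ≤ y² ≤ 1, rejection sampling from the unit square gives samples from it):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2_000_000
x = rng.uniform(0, 1, n)
y = rng.uniform(0, 1, n)
keep = x <= y**2                 # fXY = 3 (constant) on this region
x, y = x[keep], y[keep]

# Mean-squared errors of the constant, linear, and nonlinear estimators.
print(np.mean((x - 3/10)**2))            # ~0.0529
print(np.mean((x - (2/3*y - 1/5))**2))   # ~0.0362
print(np.mean((x - 0.5*y**2)**2))        # ~0.0357
```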
Matrix Inversion Lemma:
(P⁻¹ + HᵀR⁻¹H)⁻¹ = P − PHᵀ(HPHᵀ + R)⁻¹HP
(A + XᵀY)⁻¹ = A⁻¹ − A⁻¹Xᵀ(I + YA⁻¹Xᵀ)⁻¹YA⁻¹
PF: exercise.
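A quick numerical check of the first identity (a sketch assuming numpy, with arbitrary positive-definite P and R):

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 4, 3
A = rng.normal(size=(n, n)); P = A @ A.T + n * np.eye(n)   # P > 0
B = rng.normal(size=(m, m)); R = B @ B.T + m * np.eye(m)   # R > 0
H = rng.normal(size=(m, n))

lhs = np.linalg.inv(np.linalg.inv(P) + H.T @ np.linalg.inv(R) @ H)
rhs = P - P @ H.T @ np.linalg.inv(H @ P @ H.T + R) @ H @ P
print(np.allclose(lhs, rhs))   # True
```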
Best Linear Min. Var. Estimator of X given Y:
X* = X*_{|Y} = E*[X|Y] = mX + ΣXY ΣY⁻¹ (Y − mY)
More Properties of Best Linear Estimator:
(1) It depends only on 1st and 2nd moments.
(2) X & Y jointly Gaussian ⇒ E*[X|Y] = E[X|Y].
(3) E*[X|Y] is linear in its 1st argument.
(4) Assume that Y & Z are uncorrelated. Also, let X, Y, & Z have zero mean.
(a) E*[X|Y, Z] = E*[X|Y] + E*[X|Z]
Let X̃_{|Y} ≜ X − E*[X|Y] and X̃_{|Y,Z} ≜ X − E*[X|Y, Z]. Then,
Σ_{X̃|Y} = ΣX − ΣXY ΣY⁻¹ ΣYX
Σ_{X̃|Y,Z} = ΣX − ΣXY ΣY⁻¹ ΣYX − ΣXZ ΣZ⁻¹ ΣZX
(b) E*[X|Y, Z] = E*[X|Y] + E*[X̃_{|Y} | Z]
Σ_{X̃|Y,Z} = Σ_{X̃|Y} − Σ_{X̃|Y, Z} ΣZ⁻¹ Σ_{Z, X̃|Y}
(5) Let X, Y, & Z have zero mean.
E*[X|Y, Z] = E*[X|Y, Z̃_{|Y}]
= E*[X|Y] + E*[X|Z̃_{|Y}] (by 4(a))
= E*[X|Y] + E*[X̃_{|Y} | Z̃_{|Y}] (by 4(b))
Σ_{X̃|Y,Z} = Σ_{X̃|Y} − Σ_{X̃|Y, Z̃|Y} Σ_{Z̃|Y}⁻¹ Σ_{Z̃|Y, X̃|Y}
Z̃_{|Y} = innovation in Z w.r.t. Y
(6) X, Y1, ···, Yk+1 zero mean. Denote E*[X|Y1, ···, Yk+1] ≜ X*_{|k+1}
X*_{|k+1} = X*_{|k} + E*[X̃_{|k} | Ỹ_{k+1|k}]
where X̃_{|k} = X − X*_{|k} and
Ỹ_{k+1|k} = Yk+1 − E*[Yk+1|Y1, ···, Yk]
is the innovation in Yk+1 w.r.t. Y1, ···, Yk.
(7) Y1, Y2, Y3, ···, Yk+1 are linearly related to Y1, Ỹ_{2|1}, Ỹ_{3|2}, ···, Ỹ_{k+1|k} (“Gram–Schmidt orthogonalization”).
Sample Mean
Estimator of E[X] = m:
m̂n ≜ (1/n) Σ_{i=1}^{n} Xi
E[m̂n] = m
Xi, i = 1, 2, 3, ···, n independent ⇒ var(m̂n) = (1/n) var(X)
Assume independence. Then
lim_{n→∞} P[|m̂n − m| ≥ ε] = 0 (the weak law of large numbers)
Relative frequency
Suppose that we sample a random variable, say X, n times and determine for each sample whether or not some given event A occurs. Let nA denote the number of occurrences of A. The random variable nA/n characterizing the relative frequency of occurrence of the event A has the statistical properties
E[nA/n] = p and var(nA/n) = p(1 − p)/n
where
p ≜ P[X ∈ A]
P[|nA/n − p| ≥ ε] ≤ 1/(4nε²)
lim_{n→∞} P[|nA/n − p| ≥ ε] = 0 (Bernoulli theorem)
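A simulation sketch of the Bernoulli theorem (assuming numpy): the relative frequency nA/n settles toward p as n grows.

```python
import numpy as np

rng = np.random.default_rng(6)
p = 0.3                                 # p = P[X in A] (example value)
hits = rng.uniform(size=100_000) < p    # indicators of the event A

for n in (100, 1_000, 10_000, 100_000):
    print(n, hits[:n].mean())           # relative frequency nA/n -> p
```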
9. Random Processes

Random Process
An indexed family of random variables, {Xt, t ∈ T}, where T denotes the set of possible values of the index t.
If T is a countably infinite set, then the process is called a discrete-parameter random process; if T is a continuum, then the process is called a continuous-parameter random process.

Bernoulli process
A random process {Xn, n = 1, 2, 3, ···} in which the random variables Xn are Bernoulli random variables, for example, where
P[Xn = 1] = p and P[Xn = 0] = 1 − p
and where the Xn are statistically independent random variables.
E[Xn] = p
var(Xn) = pq = p(1 − p)
Binomial counting process
A random process {Yn, n = 1, 2, 3, ···} with
Yn ≜ Σ_{i=1}^{n} Xi
where {Xi, i = 1, 2, 3, ···} is an independent Bernoulli r.p.
P[Yn = k] = (n choose k) pᵏ(1 − p)ⁿ⁻ᵏ, for k = 0, 1, 2, ···, n
E[Yn] = np
var(Yn) = npq
cov(Ym, Yn) = pq min(m, n)
var(Ym − Yn) = |m − n| pq
Sine wave process
Xt ≜ V sin(Ωt + Φ), t ∈ R
where V, Ω, and Φ are r.v.’s.

Stationarity (strict sense)
A random process {Xt, t ∈ T} is stationary (in the strict sense) if and only if all of the finite-dimensional probability distribution functions are invariant under shifts of the time origin.

Mean function
mX(t) ≜ E[Xt]
Autocorrelation function
RX(t1, t2) ≜ E[Xt1 Xt2]

Covariance function
KX(t1, t2) ≜ cov(Xt1, Xt2) = RX(t1, t2) − mX(t1) mX(t2)

Cross-correlation function
RXY(t1, t2) ≜ E[Xt1 Yt2]

Cross-covariance function
KXY(t1, t2) ≜ cov(Xt1, Yt2) = RXY(t1, t2) − mX(t1) mY(t2)
Stationary random processes
Let {Xt, −∞ < t < +∞} be a strictly stationary real random process. It then follows that
mX(t) = E[Xt] = E[X0] = const
RX(t, t − τ) = RX(0, −τ)
We generally write in this case
E[Xt] = E[X] = mX
RX(t, t − τ) = RX(τ), t ∈ R
RX(−τ) = RX(τ)
|RX(τ)| ≤ RX(0)

Wide sense stationarity (wss)
Let {Xt, −∞ < t < +∞} be a real random process such that
E[Xt] = E[X0], t ∈ R
RX(t, t − τ) = RX(0, 0 − τ), t ∈ R, τ ∈ R
Then the given random process is said to be stationary in the wide sense.

Jointly wide sense stationary random processes
We say that the random processes {Xt, −∞ < t < +∞} and {Yt, −∞ < t < +∞} are jointly wss if each is wss and
RXY(t, t − τ) = RXY(0, −τ), t ∈ R, τ ∈ R
Sample mean
Consider the wide-sense stationary random process {Xt, −∞ < t < +∞} whose second moment is finite. Suppose that we sample that process at the n time instants t1, t2, ···, tn. The estimator
m̂n ≜ (1/n) Σ_{i=1}^{n} Xi
where Xi ≜ X_{ti}
is called the sample mean.
E[m̂n] = mX
var(m̂n) = (1/n²) Σ_{i=1}^{n} Σ_{j=1}^{n} KX(ti, tj)
Special cases are:
Xi pairwise uncorrelated ⇒ var(m̂n) = KX(0, 0)/n = σ²/n
⇒ lim_{n→∞} P[|m̂n − m| > ε] = 0 (the weak law of large numbers)
Xi highly correlated ⇒ var(m̂n) ≅ σ²
Periodic sampling
Let the wss random process {Xt, −∞ < t < +∞} be sampled periodically throughout the interval 0 ≤ t ≤ T in such a way that there are n sampling instants equally spaced throughout that interval (the last at t = T). The variance of the sample mean is given in this case by the formula
var(m̂n) = σ²/n + (2/n) Σ_{k=1}^{n−1} (1 − k/n) KX(k Δt)
where Δt ≜ T/n. It therefore follows that
lim_{n→∞} var(m̂n) = (2/T) ∫_{0}^{T} (1 − τ/T) KX(τ) dτ
if we pass to the limit n → ∞ while keeping T fixed.
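A numerical illustration of how correlation inflates var(m̂n) (a sketch assuming numpy, using an exponential covariance KX(τ) = σ²e^{−|τ|/τc} as an example model, which is not from the text):

```python
import numpy as np

sigma2, tau_c, T, n = 1.0, 1.0, 10.0, 100
t = np.linspace(T / n, T, n)     # n equally spaced sampling instants, last at T

# var(m_n) = (1/n^2) sum_i sum_j K_X(t_i, t_j)
K = sigma2 * np.exp(-np.abs(t[:, None] - t[None, :]) / tau_c)
print(K.sum() / n**2)            # far above the uncorrelated value sigma2/n = 0.01
```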
10. LINEAR TRANSFORMATIONS

n-dimensional case
Suppose that the m-dimensional real random vector
Y = (Y1, Y2, ···, Ym)
is generated from the n-dimensional real random vector
X = (X1, X2, ···, Xn)
by the transformation g, that is,
Y = g(X)
We say that g is a linear transformation if and only if it satisfies the relation
g(aW + bZ) = a g(W) + b g(Z), ∀a, b ∈ R
Y1 = g11 X1 + g12 X2 + ··· + g1n Xn
Y2 = g21 X1 + g22 X2 + ··· + g2n Xn
···
Ym = gm1 X1 + gm2 X2 + ··· + gmn Xn
Yi = Σ_{j=1}^{n} gij Xj, i = 1, 2, ···, m
E[Yi] = Σ_{j=1}^{n} gij E[Xj]
cov(Yi, Yk) = Σ_{j=1}^{n} Σ_{r=1}^{n} gij gkr cov(Xj, Xr)

Matrix formulation
X = (X1, X2)ᵀ, Y = (Y1, Y2, Y3)ᵀ
G = [ g11 g12
      g21 g22
      g31 g32 ]
Y = GX
E[Y] = G E[X], ΣY = G ΣX Gᵀ
Time averages
{Xt, −∞ < t < +∞} -- r.p.
Yt ≜ (1/T) ∫_{t−T}^{t} Xτ dτ
E[Yt] = (1/T) ∫_{t−T}^{t} E[Xτ] dτ

Output autocorrelation function
RY(t1, t2) = E[Yt1 Yt2] = E[(1/T) ∫_{t1−T}^{t1} Xα1 dα1 · (1/T) ∫_{t2−T}^{t2} Xα2 dα2] = E[(1/T²) ∫_{t1−T}^{t1} ∫_{t2−T}^{t2} Xα1 Xα2 dα1 dα2]
= (1/T²) ∫_{t1−T}^{t1} ∫_{t2−T}^{t2} RX(α1, α2) dα1 dα2 = (1/T²) ∫_{0}^{T} ∫_{0}^{T} RX(τ1 + t1 − T, τ2 + t2 − T) dτ1 dτ2

{Xt, −∞ < t < +∞} wss ⇒
E[Yt] = mX ≜ E[Xt]
RY(t1, t2) = (1/T²) ∫_{0}^{T} ∫_{0}^{T} RX(t1 − t2 + τ1 − τ2) dτ1 dτ2
RY(t, t) = (1/T²) ∫_{0}^{T} ∫_{0}^{T} RX(τ1 − τ2) dτ1 dτ2 = (2/T²) ∫_{0}^{T} ∫_{α1}^{T} RX(α1) dα2 dα1
= (2/T²) ∫_{0}^{T} (T − α1) RX(α1) dα1 = (2/T) ∫_{0}^{T} (1 − τ/T) RX(τ) dτ

var(Yt) = RY(t, t) − mX² = (2/T) ∫_{0}^{T} (1 − τ/T)[RX(τ) − mX²] dτ = (2/T) ∫_{0}^{T} (1 − τ/T) KX(τ) dτ

∫_{−∞}^{∞} |KX(τ)| dτ < C ⇒ var(Yt) < C/T
Weighting functions
Time-invariant linear system:
y(t) = ∫_{−∞}^{+∞} h(τ) x(t − τ) dτ
h(t) -- system weighting function

Output moments
Xt ~ random input
Yt = ∫_{−∞}^{+∞} h(τ) X_{t−τ} dτ
E[Yt] = ∫_{−∞}^{+∞} h(τ) E[X_{t−τ}] dτ
KY(t1, t2) = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} h(τ1) h(τ2) KX(t1 − τ1, t2 − τ2) dτ1 dτ2

Xt ~ wss:
E[Yt] = mX ∫_{−∞}^{+∞} h(τ) dτ
KY(τ) = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} h(τ1) h(τ2) KX(τ − τ1 + τ2) dτ1 dτ2
RYX(τ) = ∫_{−∞}^{+∞} h(t′) RX(τ − t′) dt′

System correlation function
Rh(τ) ≜ ∫_{−∞}^{+∞} h(t) h(t − τ) dt
RY(τ) = ∫_{−∞}^{+∞} Rh(t′) RX(τ − t′) dt′
var(Yt) = ∫_{−∞}^{+∞} Rh(t′) KX(t′) dt′
11. SPECTRAL ANALYSIS

Fourier transforms
X(f) ≜ ∫_{−∞}^{+∞} x(t) e^{−i2πft} dt
x(t) = ∫_{−∞}^{+∞} X(f) e^{i2πft} df

System functions
h(t) -- the weighting function of a stable, linear, time-invariant system.
H(f) ≜ ∫_{−∞}^{+∞} h(τ) e^{−i2πfτ} dτ (the system function)
h(τ) = ∫_{−∞}^{+∞} H(f) e^{i2πfτ} df
y(t) = ∫_{−∞}^{+∞} h(τ) x(t − τ) dτ ⇒ Y(f) = X(f) H(f)
Spectral density
SX(f) ≜ ∫_{−∞}^{+∞} RX(τ) e^{−i2πfτ} dτ
RX(τ) = ∫_{−∞}^{+∞} SX(f) e^{i2πfτ} df
SX(0) = ∫_{−∞}^{+∞} RX(τ) dτ
E[Xt²] = RX(0) = ∫_{−∞}^{+∞} SX(f) df
SX(f) ≥ 0, for all f
Xt real ⇒ SX(−f) = SX(f)

Spectral analysis of linear systems
Xt ~ wss input, Yt ~ wss output:
SY(f) = |H(f)|² SX(f)
E[Yt²] = ∫_{−∞}^{+∞} |H(f)|² SX(f) df
If H has the value unity over a narrow band of width Δf centered about a frequency f1, then
E[Yt²] ≅ 2 SX(f1) Δf

Cross-spectral density
SXY(f) ≜ ∫_{−∞}^{+∞} RXY(τ) e^{−i2πfτ} dτ
RXY(τ) = ∫_{−∞}^{+∞} SXY(f) e^{i2πfτ} df
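A discrete-time sketch of the relation SY = |H|²SX (assuming numpy; h is an arbitrary example FIR weighting function, and the input is unit-variance white noise so that SX ≡ 1):

```python
import numpy as np

rng = np.random.default_rng(7)
h = np.array([0.25, 0.5, 0.25])          # example weighting function
nfft, nblocks = 256, 400

S_y = np.zeros(nfft)
for _ in range(nblocks):
    x = rng.normal(size=nfft + len(h) - 1)   # white noise: S_X(f) = 1
    y = np.convolve(x, h, mode="valid")      # y_t = sum_k h_k x_{t-k}
    S_y += np.abs(np.fft.fft(y, nfft))**2 / nfft
S_y /= nblocks                                # averaged periodogram

H = np.fft.fft(h, nfft)
print(np.max(np.abs(S_y - np.abs(H)**2)))    # small: S_Y is near |H|^2 S_X
```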
12. SUMS OF INDEPENDENT RANDOM VARIABLES

Independent-increment process
The real random process {Yt, t ≥ 0} is said to be an independent-increment process if for every set of time instants
0 < t1 < t2 < ··· < tn
the increments
(Yt1 − Y0), (Yt2 − Yt1), ···, (Y_{tn} − Y_{t_{n−1}})
are mutually independent random variables and Y0 ≜ 0. Then
Y_{tn} = Σ_{i=1}^{n} Xi
where
Xi ≜ Y_{ti} − Y_{t_{i−1}}, i = 1, 2, ···, n

Independent-increment process with stationary increments
F_{Y_{t2+τ} − Y_{t1+τ}}(x) = F_{Y_{t2} − Y_{t1}}(x), τ ∈ R
E[Y_{t2+t1} − Y_{t1}] = E[Y_{t2} − Y0] = E[Y_{t2}] ⇒ E[Y_{t2+t1}] = E[Y_{t1}] + E[Y_{t2}]
E[Yt] = mt
where m ≜ E[Yt]|_{t=1}
RY(t2 + t1, t1) = E[Y_{t2+t1} Y_{t1}] = E[(Y_{t2+t1} − Y_{t1} + Y_{t1}) Y_{t1}] = E[Y_{t2+t1} − Y_{t1}] E[Y_{t1}] + E[Y_{t1}²]
= m² t2 t1 + E[Y_{t1}²]
E[(Y_{t2} − Y0)²] = E[(Y_{t2+t1} − Y_{t1})²] = RY(t2 + t1, t2 + t1) − 2RY(t2 + t1, t1) + RY(t1, t1)
= RY(t2 + t1, t2 + t1) − 2m² t2 t1 − RY(t1, t1)
KY(t2 + t1, t2 + t1) = RY(t2 + t1, t2 + t1) − m²(t2 + t1)²
= RY(t2, t2) + 2m² t2 t1 + RY(t1, t1) − m² t2² − m² t1² − 2m² t2 t1
= KY(t2, t2) + KY(t1, t1)
var(Yt) = KY(t, t) = σ² t
where σ² ≜ var(Yt)|_{t=1}
t2 ≥ t1 ⇒ RY(t2, t1) = E[Y_{t2} Y_{t1}] = E[(Y_{t2} − Y_{t1} + Y_{t1}) Y_{t1}] = E[Y_{t2} − Y_{t1}] E[Y_{t1}] + E[Y_{t1}²]
= m² t2 t1 + σ² t1
t2 ≥ t1 ⇒ KY(t2, t1) = RY(t2, t1) − m² t2 t1 = σ² t1
KY(t2, t1) = σ² min(t2, t1)
t2 ≥ t1 ⇒ var(Y_{t2} − Y_{t1}) = KY(t2, t2) + KY(t1, t1) − 2KY(t2, t1) = σ² t2 + σ² t1 − 2σ² t1 = σ²(t2 − t1)
var(Y_{t2} − Y_{t1}) = σ² |t2 − t1|
Characteristic function
φX(v) ≜ E[e^{ivX}] = ∫_{−∞}^{∞} fX(x) e^{ivx} dx (Fourier transform)
fX(x) = (1/2π) ∫_{−∞}^{∞} φX(v) e^{−ivx} dv
φX(v) = Σ_k P[X = xk] e^{ivxk}
|φX(v)| ≤ φX(0) = 1
Sums of independent random variables
Yn ≜ Σ_{i=1}^{n} Xi
where X1, X2, ···, Xn are mutually independent.
φ_{Yn}(v) = E[e^{iv Σ_{i=1}^{n} Xi}] = Π_{i=1}^{n} E[e^{ivXi}] = Π_{i=1}^{n} φ_{Xi}(v)
f_{Yn}(y) = f_{X1}(y) ∗ f_{X2}(y) ∗ ··· ∗ f_{Xn}(y)

Linear functions
Y ≜ aX + b, a, b ∈ R
⇒ φY(v) = E[e^{iv(aX+b)}] = e^{ivb} φX(av)
Gaussian random variables
Let X be a gaussian random variable.
φX(v) = exp(iv E[X] − v²σX²/2)
Let Y be a sum of n mutually independent gaussian random variables Xk.
φY(v) = exp(ivm − v²σ²/2)
where
m ≜ Σ_{k=1}^{n} mk and σ² ≜ Σ_{k=1}^{n} σk²
and where mk and σk² are the mean and variance, respectively, of Xk.

Cauchy random variables
fX(x) = 1/(π(1 + x²))
φX(v) = e^{−|v|}
Chi-squared random variables
fX(x) = x^{(n−2)/2} e^{−x/2} / (2^{n/2} Γ(n/2)) for x ≥ 0, and 0 for x < 0
where n is a positive integer.
φX(v) = (1 − i2v)^{−n/2}

Poisson random variables
P[X = k] = λᵏ e^{−λ}/k!, for k = 0, 1, 2, ···
φX(v) = exp[λ(e^{iv} − 1)]

Moment-generating property
E[Xᵏ] = (−i)ᵏ φX⁽ᵏ⁾(0)
φX(v) = Σ_{k=0}^{∞} E[Xᵏ](iv)ᵏ/k!
Joint characteristic functions
φ_{X1,X2}(v1, v2) = E[exp(i Σ_{k=1}^{2} vk Xk)]
φX(v) = E[exp(i vᵀX)]
where v ≜ (v1, v2)ᵀ and X ≜ (X1, X2)ᵀ
|φX(v)| ≤ φX(0) = 1

X ~ n × 1:
fX(x) = f_{X1,X2,···,Xn}(x1, ···, xn)
φ_{X1,···,Xn}(v1, ···, vn) = ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} exp(i Σ_{k=1}^{n} vk xk) f_{X1,···,Xn}(x1, ···, xn) dx1 ··· dxn
f_{X1,···,Xn}(x1, ···, xn) = (1/(2π)ⁿ) ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} exp(−i Σ_{k=1}^{n} vk xk) φ_{X1,···,Xn}(v1, ···, vn) dv1 ··· dvn
65
Independent random variables
Xk’s are mutually independent ⇔ φX1,X2,··· ,Xn(v1, v2, · · · , vn) =n∏
k=1
φXk(vk)
Moment-generating properties
E[Xm1 Xk
2 ] = (−i)m+k ∂m+kφX1,X2(v1, v2)∂vm
1 ∂vk2
∣∣∣∣v1=v2=0
φX1,X2(v1, v2) =∞∑
m=0
∞∑
k=0
E[Xm1 Xk
2 ](iv1)m
m!(iv2)k
k!
Independent-increment processes
Let Yt, t ≥ 0 be a real random process with stationary and independent increments and letY0 = 0. If, given the time instants
0 = t0 < t1 < t2 < · · · < tn
Xk , Ytk− Ytk−1 , for k = 1, 2, · · · , n
Then
Y_{tn} = Σ_{k=1}^{n} Xk
φ_{Y_{tn}}(v) = Π_{k=1}^{n} φ_{Xk}(v)
φ_{Y_{t1},Y_{t2},···,Y_{tn}}(v1, v2, ···, vn) = Π_{k=1}^{n} φ_{Xk}(Σ_{j=k}^{n} vj)
Probability generating function
Let X be a discrete random variable with nonnegative integer possible values.
ψX(z) ≜ E[z^X] = Σ_k P[X = k] zᵏ
E[X] = ψX′(1)
E[X(X − 1)···(X − n + 1)] = ψX⁽ⁿ⁾(1)
Joint-probability generating functions
Each of the Xk is a nonnegative, integer-valued random variable.
ψX(z) ≜ E[z1^{X1} z2^{X2} ··· zn^{Xn}] = E[Π_{k=1}^{n} zk^{Xk}]
Xk’s mutually independent ⇔ ψX(z) = Π_{k=1}^{n} ψ_{Xk}(zk)
13. The Poisson process

Poisson process
Let {Nt, 0 ≤ t < +∞} be a counting random process such that:
a. Nt assumes only nonnegative integer values and N0 ≜ 0,
b. the process has stationary and independent increments,
c. P[N_{t+Δt} − Nt = 1] = λΔt + o(Δt)  (λ > 0),
d. P[N_{t+Δt} − Nt > 1] = o(Δt),
where
lim_{Δt→0} o(Δt)/Δt = 0
It then follows that
P[N_{t+Δt} − Nt = 0] = 1 − λΔt + o(Δt)
and that
P[Nt = k] = e^{−λt}(λt)ᵏ/k!, k = 0, 1, 2, ···
That is, Nt is a Poisson random variable. The counting process {Nt, 0 ≤ t < ∞} is then called a Poisson counting process.
E[Nt] = λt
var(Nt) = λt
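A simulation sketch (assuming numpy), building Nt from exponential interarrival times and checking E[Nt] = var(Nt) = λt:

```python
import numpy as np

rng = np.random.default_rng(8)
lam, t, trials = 2.0, 5.0, 100_000

# Arrival times T_k = Z_1 + ... + Z_k from exponential interarrivals;
# N_t counts how many fall in (0, t]. (40 draws suffice since lam*t = 10.)
z = rng.exponential(1.0 / lam, size=(trials, 40))
n_t = np.sum(np.cumsum(z, axis=1) <= t, axis=1)

print(n_t.mean(), n_t.var())   # both close to lam * t = 10
```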
Arrival times
Let Tk be the random variable which describes the arrival time of the kth event counted by the counting random process {Nt, 0 ≤ t < ∞}. Then
F_{Tk}(t) = 1 − F_{Nt}(k − 1)
If {Nt, 0 ≤ t < ∞} is a Poisson counting process, then
F_{Tk}(t) = 1 − e^{−λt} Σ_{j=0}^{k−1} (λt)ʲ/j! for t ≥ 0, and 0 for t < 0
f_{Tk}(t) = λe^{−λt} (λt)^{k−1}/(k−1)! for t ≥ 0, and 0 for t < 0
that is, Tk has an Erlang probability density. In this case:
E[Tk] = k/λ
var(Tk) = k/λ²
φ_{Tk}(v) = 1/(1 − iv/λ)ᵏ
Interarrival times
Let {Nt, 0 ≤ t < +∞} be a counting process with the arrival times Tk, k = 1, 2, 3, ···. The durations
Z1 ≜ T1
Zk ≜ Tk − T_{k−1}, k = 2, 3, 4, ···
are called the interarrival times of the counting process. We then have
F_{Zk}(τ) = 1 − P[N_{t_{k−1}+τ} − N_{t_{k−1}} = 0]
If the counting process has stationary increments, then, for all k,
F_{Zk}(τ) = 1 − P[Nτ = 0]
Further, if the given counting process is Poisson, then
F_{Zk}(τ) = 1 − e^{−λτ} for τ ≥ 0, and 0 for τ < 0
f_{Zk}(τ) = λe^{−λτ} for τ ≥ 0, and 0 for τ < 0
In this case
E[Zk] = 1/λ, k = 1, 2, 3, ···
In any case
E[Tk] = k E[Zk]
Renewal counting processes
Let {Nt, 0 ≤ t < ∞} be a counting process. If the interarrival times of this counting process are mutually independent random variables, all with the same probability distribution function, then the given process is called a renewal counting process. The renewal function m(t) of a renewal counting process is the expected value of that process; that is,
m(t) ≜ E[Nt]
and its derivative
λ(t) ≜ dm(t)/dt
is called the renewal intensity of the process. It then follows that
m(t) = Σ_{k=1}^{∞} F_{Tk}(t)
and, if the various derivatives exist,
λ(t) = Σ_{k=1}^{∞} f_{Tk}(t)
On defining Λ(v) to be the Fourier transform of the renewal intensity, that is,
Λ(v) ≜ ∫_{−∞}^{+∞} λ(t) e^{ivt} dt
it then follows that
Λ(v) = φZ(v)/(1 − φZ(v))
where φZ is the common characteristic function of the interarrival times.

Unordered arrival times
Let {Nt, 0 ≤ t < ∞} be a Poisson counting process and suppose that Nt = k; that is, suppose that k events occur by time t. The unordered arrival times U1, U2, ···, Uk of those k events are then mutually independent random variables, each of which is uniformly distributed over the interval (0, t]:
f_{Ui}(ui|Nt = k) = 1/t for 0 < ui ≤ t, and 0 otherwise
for all i = 1, 2, ···, k.
Filtered Poisson processes
Let {Nt, 0 ≤ t < ∞} be a Poisson counting process. The random process {Xt, 0 ≤ t < ∞} in which
Xt ≜ Σ_{j=1}^{Nt} h(t − Uj)
where an event which occurs at time uj generates an outcome h(t − uj) at time t and where the random variables Uj are the unordered arrival times of the events which occur during the interval (0, t], is called a filtered Poisson process. The mean of a filtered Poisson process is
E[Xt] = λ ∫_{0}^{t} h(u) du
the variance is
var(Xt) = λ ∫_{0}^{t} h(u)² du
and the characteristic function of the random variable Xt is
φ_{Xt}(v) = exp[λ ∫_{0}^{t} (e^{ivh(u)} − 1) du]
Random partitioning
Let {Nt, 0 ≤ t < ∞} be a Poisson counting process and let {Xt, 0 ≤ t < ∞} be the corresponding filtered Poisson process in which
Xt ≜ Σ_{j=1}^{Nt} h(t − Uj)
We say that the random process {Zt, 0 ≤ t < ∞} is a randomly partitioned filtered Poisson random process if
Zt ≜ Σ_{j=1}^{Nt} Yj h(t − Uj)
where the partitioning random variables Yj are mutually independent random variables which are independent of the unordered arrival times Uj, and where each of the Yj has the same Bernoulli probability distribution
P[Yj = 1] = p and P[Yj = 0] = q ≜ 1 − p
where 0 < p < 1. In this case,
E[Zt] = pλ ∫_{0}^{t} h(u) du = p E[Xt]
The characteristic function of the randomly partitioned random variable Zt is
φ_{Zt}(v) = exp[pλ ∫_{0}^{t} (e^{ivh(u)} − 1) du]
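A simulation sketch of random partitioning (assuming numpy): keeping each arrival of a rate-λ Poisson process independently with probability p leaves a count with Poisson mean and variance pλt.

```python
import numpy as np

rng = np.random.default_rng(9)
lam, p, t, trials = 3.0, 0.4, 10.0, 50_000

counts = rng.poisson(lam * t, size=trials)   # N_t for each trial
kept = rng.binomial(counts, p)               # arrivals with Y_j = 1

print(kept.mean(), kept.var())   # both close to p * lam * t = 12
```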
14. Gaussian random processes

Gaussian random vectors
Y = (Y1, Y2, ···, Ym)
φY(v) = exp(i mYᵀ v − (1/2) vᵀ ΣY v)
fY(y) = exp[−(1/2)(y − mY)ᵀ ΣY⁻¹ (y − mY)] / ((2π)^{m/2} |ΣY|^{1/2})

Gaussian random processes
The real random process {Yt, t ∈ T} is said to be a gaussian random process if for every finite set of time instants tj ∈ T, the corresponding random variables Y_{tj} are jointly gaussian random variables.
Narrowband random processes
The random process {Xt, −∞ < t < ∞} is said to be a narrowband random process if it has a zero mean, is stationary in the wide sense, and if its spectral density SX differs from zero only in some narrow band of width Δf centered about some frequency f0 where
f0 >> Δf
A narrowband random process may be represented in terms of an envelope random process {Vt, −∞ < t < ∞} and a phase random process {Φt, −∞ < t < ∞} by using the relation
Xt = Vt cos(ω0 t + Φt)
where ω0 = 2πf0. Alternatively, a narrowband random process may also be represented in terms of cosine and sine component random processes {Xct, −∞ < t < ∞} and {Xst, −∞ < t < ∞}, respectively, by using the relation
Xt = Xct cos ω0t − Xst sin ω0t
The relations between these two representations are given by the formulas
Xct = Vt cos Φt and Xst = Vt sin Φt
which have the inverses
Vt = √(Xct² + Xst²) and Φt = tan⁻¹(Xst/Xct)
The random variables Xct, Xst, X_{c(t+τ)}, and X_{s(t+τ)} have the covariance matrix
R(τ) = [ RX(0)    0        Rc(τ)    Rcs(τ)
         0        RX(0)   −Rcs(τ)   Rc(τ)
         Rc(τ)   −Rcs(τ)   RX(0)    0
         Rcs(τ)   Rc(τ)    0        RX(0) ]
where
Rc(τ) = 2 ∫_{0}^{+∞} SX(f) cos[2π(f − f0)τ] df
and
Rcs(τ) = 2 ∫_{0}^{+∞} SX(f) sin[2π(f − f0)τ] df
Narrowband gaussian processes
The cosine- and sine-component random variables Xct and Xst of a gaussian narrowband random process are independent random variables with zero means, each with a variance equal to RX(0), and a joint-probability density
f_{Xct,Xst}(x, y) = exp[−(x² + y²)/(2RX(0))] / (2πRX(0))
The envelope and phase random variables Vt and Φt of a gaussian narrowband random process are also independent random variables. The envelope has the Rayleigh probability density
f_{Vt}(v) = (v/RX(0)) exp[−v²/(2RX(0))] for v ≥ 0, and 0 otherwise
and the phase is uniformly distributed over [0, 2π]:
f_{Φt}(φ) = 1/(2π) for 0 ≤ φ ≤ 2π, and 0 otherwise