Advanced Random Processes
Text: PROBABILITY AND RANDOM PROCESSES, by Davenport
Course Outline
- Basic Probability Theory; Random Variables and Vectors;
Conditional Probability and Densities; Expectation; Conditional Expectation
- Estimation with Static Models; Random Processes;
- Stationarity; Power Spectral Density
- Mean-Square Calculus; Linear System
- Kalman Filter
- Wiener Integrals; Wiener Filter
Grade
Mid-term (40 %), Final (40 %), Homework (20 %)
2. SAMPLE POINTS AND SAMPLE SPACES
Sample point
A sample point is a representation of a possible outcome of an experiment.
Sample space
A sample space is the totality of all possible sample points, that is, the representation of all possible outcomes of an experiment.
Event
An event is an outcome or a collection of outcomes. It is also defined as the corresponding sample point or set of sample points, respectively.
Event defined by listing
A = {s1, s2, ···, sn}    B = {s1, s2, s3, ···}
Event defined by description
A = {s : prop(s) is true}
where prop(s) is some proposition about s: for example, |s| < 1.
Implication or inclusion
A ⊂ B ⇔ (s ∈ A ⇒ s ∈ B)

Equality
A = B ⇔ A ⊂ B and B ⊂ A

Union
A ∪ B ≜ {s : s ∈ A or s ∈ B or both}
A ⊂ A ∪ B and B ⊂ A ∪ B
A ⊂ B ⇔ A ∪ B = B

Intersection
A ∩ B ≜ {s : s ∈ A and s ∈ B}
A ∩ B ⊂ A and A ∩ B ⊂ B
A ⊂ B ⇔ A ∩ B = A
Distributive laws
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
and
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)

Complement
Aᶜ ≜ {s : s ∈ S and s ∉ A}
(Aᶜ)ᶜ = A
A ⊂ B ⇒ Bᶜ ⊂ Aᶜ

Relative complement
B − A ≜ {s : s ∈ B and s ∉ A}
B − A = B ∩ Aᶜ

Null set φ
A ∩ Aᶜ = φ
Sᶜ = φ
A ∪ φ = A and A ∩ φ = φ
φ ⊂ A for any A ⊂ S

Disjoint events
The events A and B are disjoint if and only if A ∩ B = φ.

De Morgan’s rules
(A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
(A ∪ B)ᶜ = Aᶜ ∩ Bᶜ

Partitions of S
A1, A2, ···, An where
Ai ∩ Aj = φ for i ≠ j and ∪_{i=1}^{n} Ai = S
3. PROBABILITY

Probability Space
A Probability Space is a triple (S, A, P).
S = sample space
A = σ-algebra on S
P = probability measure
σ-Algebra
A is a nonempty class of subsets of S such that
(i) A ∈ A ⇒ Aᶜ ∈ A
(ii) A, B ∈ A ⇒ A ∪ B ∈ A
(iii) A1, A2, A3, ··· ∈ A ⇒ ∪_{i=1}^{∞} Ai ∈ A
Probability
Probability is a set function P : A → R which satisfies the following axioms:
(i) P[A] ≥ 0
(ii) P[S] = 1
(iii) Ai ∩ Aj = φ, i ≠ j ⇒ P[∪_{i=1}^{∞} Ai] = Σ_{i=1}^{∞} P[Ai] (countably additive)
Elementary properties of probability
P[Aᶜ] = 1 − P[A]
P[φ] = 0
P[A] ≤ 1
P[B − A] = P[B] − P[A ∩ B]
A ⊂ B ⇒ P[B − A] = P[B] − P[A]
A ⊂ B ⇒ P[A] ≤ P[B]
P[A ∪ B] = P[A] + P[B] − P[A ∩ B]
P[A ∪ B] ≤ P[A] + P[B]
Joint probability
If the sample space S is partitioned by the collection of events A1, A2, ···, Am, then
P[B] = P[B ∩ S] = P[B ∩ (∪_{j=1}^{m} Aj)] = P[∪_{j=1}^{m} (B ∩ Aj)]
= Σ_{j=1}^{m} P[B ∩ Aj]
= Σ_{j=1}^{m} P[B|Aj] P[Aj]
Conditional probability
P[B|A] ≜ P[A ∩ B] / P[A]
so long as P[A] > 0.
Bayes’ rule
Let B be an arbitrary event in a sample space S. Suppose that the events A1, A2, ···, Am
partition S and that P[Ai] > 0 for all i. Then
P[Ai|B] = P[Ai ∩ B] / P[B] = P[B|Ai] P[Ai] / Σ_{j=1}^{m} P[B|Aj] P[Aj]
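As a quick illustration of the total-probability and Bayes computations above, here is a minimal numerical sketch in Python (the priors and likelihoods are made-up example numbers, not from the text):

```python
# Bayes' rule over a partition A1, A2, A3 of S (example numbers, assumed).
prior = [0.5, 0.3, 0.2]        # P[Aj]
likelihood = [0.1, 0.4, 0.8]   # P[B|Aj]

# Total probability: P[B] = sum_j P[B|Aj] P[Aj]
p_b = sum(l * p for l, p in zip(likelihood, prior))

# Posterior: P[Ai|B] = P[B|Ai] P[Ai] / P[B]
posterior = [l * p / p_b for l, p in zip(likelihood, prior)]
print(p_b)        # 0.33
print(posterior)  # [0.1515..., 0.3636..., 0.4848...]  (sums to 1)
```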
Independent events
The events A and B are said to be statistically independent if
P[A ∩ B] = P[A] P[B]
The events A1, A2, ···, An are said to be mutually independent if and only if the relations
P[Ai ∩ Aj] = P[Ai] P[Aj]
P[Ai ∩ Aj ∩ Ak] = P[Ai] P[Aj] P[Ak]
···
P[A1 ∩ A2 ∩ ··· ∩ An] = P[A1] P[A2] ··· P[An]
hold for all combinations of the indices such that 1 ≤ i < j < k < ··· ≤ n.
Independent experiments
Suppose that we are concerned with the outcomes of n different experiments E1, E2, ···, En. Suppose further that the sample space Sk of the kth of these n experiments is partitioned by the mk events A_{k,ik}, ik = 1, 2, ···, mk. The n given experiments are then said to be statistically independent if the equation
P[A_{1,i1} ∩ A_{2,i2} ∩ ··· ∩ A_{n,in}] = P[A_{1,i1}] P[A_{2,i2}] ··· P[A_{n,in}]
holds for every possible set of n integers i1, i2, ···, in, where the index ik ranges from 1 to mk.
4. RANDOM VARIABLES
Set-indicator function
IA(s) = 1 if s ∈ A, and 0 if s ∉ A
Inverse image
X⁻¹(A) = {s ∈ S | X(s) ∈ A}

A-measurable
A map X : S → R is A-measurable if
{s | X(s) < a} ∈ A, for all a ∈ R
Random variable
A random variable X is an A-measurable function from the sample space S to R.
Induced probability
P[X ∈ A] ≜ P[X⁻¹(A)] = P[{s ∈ S | X(s) ∈ A}]
P[X < a] ≜ P[{s ∈ S | X(s) < a}]

Range
RX = range of X = {a ∈ R | a = X(s) for some s ∈ S}
Probability distribution function
FX(x) ≜ P[X ≤ x] = P[{s ∈ S | X(s) ≤ x}]

Properties of probability distribution function
FX(+∞) = 1 and FX(−∞) = 0
b > a ⇒ FX(b) ≥ FX(a) (monotone non-decreasing)
FX(a) = FX(a+0) = lim_{ε→0⁺} FX(a + ε) (right continuous)
FX(a−0) + P[X = a] = FX(a)
Decomposition of distribution functions
FX(x) = DX(x) + CX(x)
where DX is a step function and hence may be expressed as
DX(x) = Σ_i P[X = xi] U(x − xi)
and where CX is continuous everywhere.
Interval probability
P[X ∈ (a, b]] ≜ P[a < X ≤ b] = FX(b) − FX(a)

Probability density
fX(x) ≜ dFX(x)/dx

Properties of probability density function
fX(x) ≥ 0
FX(x) = ∫_{−∞}^{x} fX(ξ) dξ
∫_{−∞}^{∞} fX(ξ) dξ = 1

Calculation of probability
P[X ∈ A] = ∫_A fX(x) dx
Uniform density function
fX(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise

Exponential density function
fX(x) = a e^{−ax} for x ≥ 0, and 0 otherwise (a > 0)

Normal density function
fX(x) = (1/√(2π)) e^{−x²/2}

Rayleigh density function
fX(x) = (x/b) e^{−x²/2b} for x ≥ 0, and 0 otherwise (b > 0)

Cauchy density function
fX(x) = (a/π) · 1/(a² + x²)  (a > 0)
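These densities can be sampled by inverting the distribution function; a minimal sketch (assuming Python with numpy) for the exponential case, where FX(x) = 1 − e^{−ax} inverts to X = −ln(1 − U)/a with U uniform on (0, 1):

```python
import numpy as np

rng = np.random.default_rng(0)
a = 2.0                       # exponential parameter (example value)
u = rng.uniform(size=100_000)
x = -np.log(1.0 - u) / a      # inverse of F(x) = 1 - exp(-a*x)

# Sample moments approach E[X] = 1/a and var(X) = 1/a**2.
print(x.mean(), x.var())      # ~0.5, ~0.25
```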
5. RANDOM VECTORS

Random vector
X(s) = (X1(s), X2(s), ···, Xn(s))ᵀ
where Xi(s), i = 1, ···, n, is a random variable defined on S. Thus, a random vector is a finite family of random variables.
Joint-probability distribution function
FX,Y(x, y) ≜ P[X ≤ x, Y ≤ y] ≜ P[{s ∈ S | X(s) ≤ x and Y(s) ≤ y}]

Properties of joint-probability distribution function
FX,Y(−∞, y) = 0, FX,Y(x, −∞) = 0
FX,Y(+∞, +∞) = 1
FX,Y(x, +∞) = FX(x), FX,Y(+∞, y) = FY(y) (marginal distributions)
Joint-probability density
fX,Y(x, y) ≜ ∂²FX,Y(x, y)/∂x∂y

Properties of joint-probability density function
fX,Y(x, y) ≥ 0
∫_{−∞}^{∞} ∫_{−∞}^{∞} fX,Y(x, y) dx dy = 1
FX(x) = ∫_{−∞}^{∞} ∫_{−∞}^{x} fX,Y(ξ, η) dξ dη
fX(x) = ∫_{−∞}^{∞} fX,Y(x, η) dη
FY(y) = ∫_{−∞}^{y} ∫_{−∞}^{∞} fX,Y(ξ, η) dξ dη
fY(y) = ∫_{−∞}^{∞} fX,Y(ξ, y) dξ
Two-dimensional normal (or gaussian) density
fX,Y(x, y) = (1/(2π√(1 − ρ²))) exp[−(x² − 2ρxy + y²)/(2(1 − ρ²))]  (|ρ| < 1)

Probability calculation
P[(X, Y) ∈ A] = ∫∫_A fX,Y(x, y) dx dy

Ex:
fX,Y(x, y) = (1/8)(x + y) for 0 ≤ x, y ≤ 2, and 0 otherwise
fX(x) = ∫_{−∞}^{∞} fX,Y(x, y) dy = ∫_{0}^{2} (1/8)(x + y) dy = (1/4)(x + 1) for 0 ≤ x ≤ 2, and 0 otherwise
P[|X − Y| > 1] = 2 ∫_{1}^{2} ∫_{0}^{x−1} (1/8)(x + y) dy dx = 1/4
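A Monte Carlo check of this worked example (a sketch assuming numpy; (X, Y) is drawn from fX,Y by rejection sampling on the square [0, 2]², where the density is bounded by 1/2):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
x = rng.uniform(0, 2, n)
y = rng.uniform(0, 2, n)

# Accept with probability f(x, y) / max f = ((x + y)/8) / (1/2).
keep = rng.uniform(size=n) < ((x + y) / 8.0) / 0.5
x, y = x[keep], y[keep]

print(np.mean(np.abs(x - y) > 1))   # close to 1/4
```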
Conditional-probability distribution function
FX(x|Y ∈ B) ≜ P[X ≤ x | Y ∈ B] = P[X ≤ x, Y ∈ B] / P[Y ∈ B]
whenever P[Y ∈ B] > 0.
FX(−∞|Y ∈ B) = 0
FX(+∞|Y ∈ B) = 1

Conditional-probability density
fX(x|Y ∈ B) ≜ dFX(x|Y ∈ B)/dx
fX(x|Y ∈ B) ≥ 0
∫_{−∞}^{+∞} fX(x|Y ∈ B) dx = 1
FX(x|Y ∈ B) = ∫_{−∞}^{x} fX(ξ|Y ∈ B) dξ
P[X ∈ A|Y ∈ B] = ∫_A fX(ξ|Y ∈ B) dξ
Point conditioning conditional-probability distribution function
FX|Y(x|y) ≜ FX(x|Y = y) ≅ [∫_{y}^{y+dy} ∫_{−∞}^{x} fX,Y(ξ, η) dξ dη] / [fY(y) dy]
= [dy ∫_{−∞}^{x} fX,Y(ξ, y) dξ] / [fY(y) dy]
= [∫_{−∞}^{x} fX,Y(ξ, y) dξ] / fY(y)

Point conditioning conditional-probability density
fX|Y(x|y) ≜ dFX|Y(x|y)/dx = fX,Y(x, y) / fY(y)
P[X ∈ A|Y = y] = ∫_A fX|Y(ξ|y) dξ
P[X ∈ A] = ∫_{−∞}^{+∞} P[X ∈ A|Y = y] fY(y) dy

Independent random variables
The random variables X and Y are statistically independent if
FX,Y(x, y) = FX(x) FY(y)
or
fX,Y(x, y) = fX(x) fY(y)
X and Y independent ⇔ fX|Y(x|y) = fX(x)
6. Functions of random variables

X -- random variable with FX(x)
Y ≜ g(X)
{s ∈ S | g(X(s)) ≤ a} ∈ A, for all a ∈ R
Y -- random variable with FY(y)
FY(y) ≜ P[{x | g(x) ≤ y}]
fY(y) = dFY(y)/dy = ?
g is increasing ⇒ FY(y) = P[Y ≤ y] = P[X ≤ g⁻¹(y)] = FX(g⁻¹(y))
fY(y) = dFY(y)/dy = dFX(g⁻¹(y))/dy = fX(h(y)) · dh/dy
where h(y) ≜ g⁻¹(y)

g is decreasing ⇒ FY(y) = P[Y ≤ y] = P[X ≥ g⁻¹(y)] = 1 − FX(g⁻¹(y))
fY(y) = dFY(y)/dy = −dFX(g⁻¹(y))/dy = −fX(h(y)) · dh/dy
g is one-to-one ⇒ fY(y) = fX(h(y)) |dh/dy|

Ex: Y = sin X
fX(x) = 1/π for −π/2 < x < π/2, and 0 o/w
⇒ fY(y) = fX(sin⁻¹ y) |d(sin⁻¹ y)/dy| = fX(sin⁻¹ y) · 1/√(1 − y²)
fX(sin⁻¹ y) = 1/π for −1 < y < 1, and 0 o/w
⇒ fY(y) = 1/(π√(1 − y²)) for −1 < y < 1, and 0 o/w
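A quick histogram check of this result (a sketch assuming numpy): sample X uniformly on (−π/2, π/2) and compare the empirical density of Y = sin X with 1/(π√(1 − y²)):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-np.pi / 2, np.pi / 2, 1_000_000)  # fX = 1/pi on (-pi/2, pi/2)
y = np.sin(x)

hist, edges = np.histogram(y, bins=50, range=(-0.99, 0.99), density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
theory = 1.0 / (np.pi * np.sqrt(1.0 - mid**2))
print(np.max(np.abs(hist - theory)))   # small (sampling/binning error)
```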
g is NOT one-to-one ⇒ fY(y) = Σ_j fX(g_j⁻¹(y)) |d g_j⁻¹(y)/dy|

Ex: Y = |sin X|
fX(x) = 1/π for −π/2 < x < π/2, and 0 o/w
fY(y) = fX(sin⁻¹ y) |d(sin⁻¹ y)/dy| + fX(−sin⁻¹ y) |d(−sin⁻¹ y)/dy| = (fX(sin⁻¹ y) + fX(−sin⁻¹ y)) · 1/√(1 − y²)
fX(sin⁻¹ y) = 1/π for 0 < y < 1 (0 o/w), and fX(−sin⁻¹ y) = 1/π for 0 < y < 1 (0 o/w)
⇒ fY(y) = 2/(π√(1 − y²)) for 0 < y < 1, and 0 o/w
Z ≜ X + Y
FZ(z) = P[X + Y ≤ z] = ∫_{−∞}^{+∞} [∫_{−∞}^{z−x} fX,Y(x, y) dy] dx
fZ(z) = d/dz ∫_{−∞}^{+∞} [∫_{−∞}^{z−x} fX,Y(x, y) dy] dx
= ∫_{−∞}^{+∞} [d/dz ∫_{−∞}^{z−x} fX,Y(x, y) dy] dx
= ∫_{−∞}^{+∞} fX,Y(x, z − x) dx
X and Y independent ⇒ fZ(z) = ∫_{−∞}^{+∞} fX(x) fY(z − x) dx (convolution)
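The convolution formula can be checked numerically; a sketch (assuming numpy) for two independent uniform densities on [0, 1], whose sum has the triangular density on [0, 2]:

```python
import numpy as np

dx = 0.001
x = np.arange(0.0, 1.0, dx)
f_x = np.ones_like(x)              # uniform density on [0, 1)
f_y = np.ones_like(x)

# Discretized convolution integral: fZ(z) = sum over x of fX(x) fY(z - x) dx
f_z = np.convolve(f_x, f_y) * dx
z = np.arange(len(f_z)) * dx

# fZ is the triangle: z on [0, 1], 2 - z on [1, 2].
print(np.allclose(f_z, np.where(z <= 1, z, 2 - z), atol=0.01))  # True
```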
Z ≜ XY
FZ(z) = P[XY ≤ z] = ∫_{−∞}^{0} [∫_{z/x}^{∞} fX,Y(x, y) dy] dx + ∫_{0}^{+∞} [∫_{−∞}^{z/x} fX,Y(x, y) dy] dx
fZ(z) = ∫_{−∞}^{0} (−1/x) fX,Y(x, z/x) dx + ∫_{0}^{+∞} (1/x) fX,Y(x, z/x) dx = ∫_{−∞}^{+∞} (1/|x|) fX,Y(x, z/x) dx
X and Y independent ⇒ fZ(z) = ∫_{−∞}^{+∞} (1/|x|) fX(x) fY(z/x) dx
X = (X1, X2)ᵀ and Y = (Y1, Y2)ᵀ
Y ≜ g(X) (one-to-one)
fY(y) = fX(h(y)) · |det J(y)|, where X = h(Y) and J(y) is the Jacobian matrix
J(y) = [ ∂h1/∂y1  ∂h1/∂y2
         ∂h2/∂y1  ∂h2/∂y2 ]
7. Statistical averages

Statistical average or Expectation
E[X] ≜ Σ_k xk P[X = xk]
E[X] ≜ ∫_{−∞}^{∞} x fX(x) dx
E[g(X)] ≜ Σ_k g(xk) P[X = xk]
E[g(X)] ≜ ∫_{−∞}^{∞} g(x) fX(x) dx
Random vectors
X = (X1, X2, ···, Xn)ᵀ
E[X] ≜ (E[X1], E[X2], ···, E[Xn])ᵀ
E[g(X)] = Σ_{k1} Σ_{k2} ··· Σ_{kn} g(x_{k1}, x_{k2}, ···, x_{kn}) P[X1 = x_{k1}, X2 = x_{k2}, ···, Xn = x_{kn}]
E[g(X)] = ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} g(x1, x2, ···, xn) f_{X1,X2,···,Xn}(x1, x2, ···, xn) dx1 dx2 ··· dxn
General properties of the Expectation
E[IA(X)] = P [X ∈ A]
where IA is the set indicator of the event A ⊂ R.
E[aX] = aE[X], for any real a
E[a1X1 + a2X2] = a1E[X1] + a2E[X2], for any real a1 and a2
E[AX] = AE[X], for any real matrix A
|E[X]| ≤ E[|X|]
X(s) ≥ 0 for every s ∈ S ⇒ E[X] ≥ 0
X1(s) ≥ X2(s) for every s ∈ S ⇒ E[X1] ≥ E[X2]
X1 and X2 independent ⇒ E[X1X2] = E[X1]E[X2]

kth Moments
kth moment of X ≜ E[Xᵏ]

Variance
ΣX ≜ E[(X − E[X])²] = E[X²] − E[X]²
ΣX ≜ E[(X − E[X])(X − E[X])ᵀ] = E[XXᵀ] − E[X]E[X]ᵀ

Covariance
ΣXY ≜ E[(X − E[X])(Y − E[Y])ᵀ] = E[XYᵀ] − E[X]E[Y]ᵀ

Uncorrelated random variables
X and Y uncorrelated ⇔ ΣXY = 0 ⇔ E[XYᵀ] = E[X]E[Y]ᵀ

Orthogonal random variables
X and Y orthogonal ⇔ E[XYᵀ] = 0
Properties of the variance and the covariance
ΣXᵀ = ΣX (symmetric)
bᵀ ΣX b ≥ 0, for all b ∈ Rⁿ (positive semidefinite)
Σ_{AX+b} = A ΣX Aᵀ, for any real A and b
E[(X − c)²] ≥ ΣX, for any real c
X1, X2, X3 pairwise uncorrelated ⇒ Σ_{X1+X2+X3} = Σ_{X1} + Σ_{X2} + Σ_{X3}
Σ_{YX} = Σ_{XY}ᵀ
Σ_{AX+BY,Z} = A Σ_{X,Z} + B Σ_{Y,Z}, for any real A and B
Σ_{AX+BY} = Σ_{AX+BY,AX+BY} = A Σ_{X,AX+BY} + B Σ_{Y,AX+BY}
= A(Σ_{AX+BY,X})ᵀ + B(Σ_{AX+BY,Y})ᵀ = A(A Σ_{X,X} + B Σ_{Y,X})ᵀ + B(A Σ_{X,Y} + B Σ_{Y,Y})ᵀ
= A ΣX Aᵀ + A Σ_{X,Y} Bᵀ + B Σ_{Y,X} Aᵀ + B ΣY Bᵀ
X, Y uncorrelated ⇒ Σ_{AX+BY} = A ΣX Aᵀ + B ΣY Bᵀ
Bernoulli random variables
P[X = 1] = p and P[X = 0] = 1 − p
E[X] = p and E[Xᵏ] = p, k = 1, 2, 3, ···
ΣX = p(1 − p)

Binomial random variables
P[Y = k] = (n choose k) pᵏ qⁿ⁻ᵏ, k = 0, 1, 2, ···, n
where q = 1 − p.
E[Y] = np, ΣY = npq

Poisson random variables
P[X = k] = e^{−M} Mᵏ/k!, k = 0, 1, 2, ···
E[X] = M = ΣX
Uniform random variables
Let X be uniformly distributed over the interval [a, b]. Then
E[X] = (a + b)/2
E[Xᵏ] = (bᵏ + bᵏ⁻¹a + ··· + baᵏ⁻¹ + aᵏ)/(k + 1)

Exponential random variables
fT(t) = a e^{−at} for t ≥ 0, and 0 for t < 0
E[T] = 1/a
E[Tᵏ] = k!/aᵏ

Gaussian or normal random variables
fX(x) = (1/(√(2π)σ)) exp[−(x − m)²/(2σ²)]
E[X] = m, ΣX = σ²
f_{X1,X2}(x1, x2) = (1/(2πσ1σ2√(1 − ρ²))) exp[−(((x1−m1)/σ1)² − 2ρ((x1−m1)/σ1)((x2−m2)/σ2) + ((x2−m2)/σ2)²)/(2(1 − ρ²))]
E[Xi] = mi, Σ_{Xi} = σi², Σ_{X1,X2} = ρσ1σ2

X ∈ Rⁿ:
fX(x) = (1/((2π)^{n/2}|ΣX|^{1/2})) exp[−(1/2)(x − mX)ᵀ ΣX⁻¹ (x − mX)]

Rayleigh random variables
fR(r) = (r/b) e^{−r²/2b} for r ≥ 0, and 0 otherwise
E[R] = √(bπ/2), E[R²] = 2b
Chebyshev inequality
Let X be a r.v. with E[|X|ʳ] < ∞, for some r > 0. Then
P[|X| ≥ ε] ≤ E[|X|ʳ]/εʳ, for any ε > 0.
PF:
Y ≜ 0 if |X| < ε, and εʳ if |X| ≥ ε
E[Y] = 0 · P[Y = 0] + εʳ P[Y = εʳ] = εʳ P[|X| ≥ ε]
Y ≤ |X|ʳ ⇒ E[Y] ≤ E[|X|ʳ] ⇒ P[|X| ≥ ε] = E[Y]/εʳ ≤ E[|X|ʳ]/εʳ

Special case: X = Z − E[Z] and r = 2
P[|Z − E[Z]| ≥ ε] ≤ ΣZ/ε², for any ε > 0
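An empirical look at the special case r = 2 (a sketch assuming numpy, with Z standard normal so that ΣZ = 1):

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.normal(size=1_000_000)    # E[Z] = 0, var(Z) = 1

for eps in (1.0, 2.0, 3.0):
    empirical = np.mean(np.abs(z) >= eps)
    bound = 1.0 / eps**2          # Chebyshev: P[|Z| >= eps] <= var(Z)/eps^2
    print(eps, empirical, bound)  # empirical never exceeds the bound
```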
Cauchy-Schwarz inequality
Let the real random variables X and Y have finite second moments. Then
E[XY]² ≤ E[X²] E[Y²]
PF: For any real λ,
0 ≤ E[(λX + Y)²] = λ² E[X²] + 2λ E[XY] + E[Y²]
Since this quadratic in λ is nonnegative for all λ, its discriminant must satisfy 4E[XY]² − 4E[X²]E[Y²] ≤ 0, which gives the result.
Conditional Expectation
E[X|Y = y] ≜ Σ_j xj P[X = xj | Y = y]
E[X|Y = y] ≜ ∫_{−∞}^{∞} x fX|Y(x|y) dx
E[g(X)|Y = y] ≜ ∫_{−∞}^{∞} g(x) fX|Y(x|y) dx = h(y)
⇒ E[g(X)|Y] ≜ h(Y)
Σ_{X|Y} ≜ E[(X − E[X|Y])(X − E[X|Y])ᵀ | Y] = E[XXᵀ | Y] − E[X|Y] E[X|Y]ᵀ
Properties of the Conditional Expectation
E[IA(Y)|X = x] = P[Y ∈ A|X = x]
E[g(Y) E[h(X)|Y]] = E[g(Y) h(X)]
PF:
∫_{−∞}^{∞} g(y) E[h(X)|Y = y] fY(y) dy = ∫_{−∞}^{∞} g(y) [∫_{−∞}^{∞} h(x) fX|Y(x|y) dx] fY(y) dy
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(y) h(x) fX|Y(x|y) fY(y) dx dy = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(y) h(x) fX,Y(x, y) dx dy

Special case: g(Y) = 1, h(X) = X
E[E[X|Y]] = E[X]
E[AX|Y] = A E[X|Y]
E[X + Z|Y] = E[X|Y] + E[Z|Y]
E[g(Y)X|Y = y] = g(y) E[X|Y = y]
E[g(Y)X|Y] = g(Y) E[X|Y]
X and Y independent ⇒ E[h(X)|Y = y] = E[h(X)]
8. ESTIMATION, SAMPLING, AND PREDICTION

X & Y – jointly distributed
X – to be estimated
Y – observed

Question: Given the value Y = y, what is the “best” estimate x̂ of the value of X, that is, the x̂ that minimizes, over all x̂,
E[‖X − x̂‖² | Y = y] = E[(X − x̂)ᵀ(X − x̂) | Y = y]

Theorem: x̂ = E[X|Y = y], and the minimum value of the mean-squared error is
E[‖X − x̂‖² | Y = y] = E[(X − x̂)ᵀ(X − x̂) | Y = y]
= E[tr (X − x̂)(X − x̂)ᵀ | Y = y]
= tr E[(X − x̂)(X − x̂)ᵀ | Y = y] = tr Σ_{X|Y=y}
Proof.
E[(X − x̂)ᵀ(X − x̂) | Y = y] = E[XᵀX − x̂ᵀX − Xᵀx̂ + x̂ᵀx̂ | Y = y]
= E[XᵀX | Y = y] − x̂ᵀE[X | Y = y] − E[Xᵀ | Y = y] x̂ + x̂ᵀx̂
= E[XᵀX | Y = y] − 2x̂ᵀE[X | Y = y] + x̂ᵀx̂ + ‖E[X | Y = y]‖² − ‖E[X | Y = y]‖²
= ‖x̂ − E[X | Y = y]‖² + E[XᵀX | Y = y] − ‖E[X | Y = y]‖²
Remark: For an n × n matrix A = [aij], trace(A) = tr A ≜ Σ_{i=1}^{n} aii
(a) A is a scalar ⇒ tr A = A
(b) tr(AB) = tr(BA)
(c) tr(A + B) = tr A + tr B
Terminology:
(a) For any value y of Y, the “Best Estimate” is x̂ = E[X|Y = y].
(b) Let y vary. X̂ = E[X|Y] is the “Best Estimator”. Thus, the “Best Estimator” is a random variable.

Theorem: The estimator of X in terms of Y that minimizes E[‖X − g(Y)‖²] over all functions g is X̂ = E[X|Y].

Proof. See I.B. Rhodes, “A Tutorial Introduction to Estimation and Filtering,” IEEE Transactions on Automatic Control, Vol. 16, No. 6, 1971.
Properties of Best Estimator:
(a) Linear: E[AX + BZ + C|Y] = A E[X|Y] + B E[Z|Y] + C
(b) Unbiased: E[X̂] = E[E[X|Y]] = E[X]
(c) Projection Theorem: the error X − X̂ ≜ X̃ is orthogonal to the r.v. g(Y) for any scalar function g, i.e.,
E[g(Y) X̃ᵀ] = 0
PF:
E[g(Y) X̃ᵀ] = E[E[g(Y) X̃ᵀ | Y]] = E[g(Y)(E[Xᵀ|Y] − E[X̂ᵀ|Y])] = E[g(Y)(E[Xᵀ|Y] − E[Xᵀ|Y])] = 0
Definition:
(a) X & Y are L²-orthogonal if E[XᵀY] = 0 (denoted X ⊥ Y).
(Reminder: X & Y are orthogonal if E[XYᵀ] = 0; note E[XᵀY] = tr E[XYᵀ].)
(b) Let M be a subspace of 𝒳 (e.g., M = all n-vector valued functions f(Y)).
M⊥ ≜ {X ∈ 𝒳 | X ⊥ Y for all Y ∈ M}

Projection Theorem:
Let M be a subspace of 𝒳. Then there exists a unique pair of maps P : 𝒳 → M and Q : 𝒳 → M⊥ such that X = PX + QX, for all X ∈ 𝒳.
Also:
(a) X ∈ M ⇒ PX = X and QX = 0
X ∈ M⊥ ⇒ PX = 0 and QX = X
(b) For all X ∈ 𝒳, ‖X − PX‖ = min_{X̄ ∈ M} ‖X − X̄‖
i.e., the projection of X on M gives minimum error over all points in M.
(c) ‖X‖² = ‖PX‖² + ‖QX‖²
(d) P & Q are linear.
Problem: Find the best linear estimator X* = A*Y + b* that minimizes E[‖X − AY − b‖²] over all n × m matrices A and n × 1 vectors b.

Sol.) First, assume that X & Y have zero mean. Let
M = {all random vectors of the form AY + b}.
By the Projection Theorem, X − A*Y − b* ⊥ M.
That is, for all A & b,
E[(AY + b)ᵀ(X − A*Y − b*)] = tr E[(X − A*Y − b*)(AY + b)ᵀ] = tr[ΣXY Aᵀ − A*ΣY Aᵀ − b*bᵀ] = tr[(ΣXY − A*ΣY)Aᵀ] − bᵀb* = 0
Thus
A* = ΣXY ΣY⁻¹ and b* = 0
which implies that X* = ΣXY ΣY⁻¹ Y.
Assume non-zero mean. Then,
(X − mX)* = ΣXY ΣY⁻¹ (Y − mY)
Thus
X* = mX + ΣXY ΣY⁻¹ (Y − mY)
Basic Properties of Best Linear Estimator:
(a) Unbiased: E[X*] = E[X] = mX
(b) Let X̃ = X − X*. Then the error covariance is
Σ_X̃ = E[(X − X*)(X − X*)ᵀ] = ΣX − ΣXY ΣY⁻¹ ΣYX

Remark:
(a) If uncorrelated, the best linear estimator is
X* = mX and Σ_X̃ = ΣX (∵ ΣXY = 0)
(b) If independent,
the best linear estimator is X* = mX (∵ independent ⇒ uncorrelated)
and the best estimator is X̂ = mX.
Example:
fXY(x, y) = 3 for 0 ≤ y ≤ 1, 0 ≤ x ≤ y², and 0 otherwise
Estimate X by (a) a constant, (b) a linear estimator, and (c) a nonlinear estimator.

Sol)
fX(x) = 3(1 − √x) for 0 ≤ x ≤ 1, and 0 otherwise
(a)
E[X] = ∫_{−∞}^{∞} x fX(x) dx = ∫_{0}^{1} 3x(1 − √x) dx = [(3/2)x² − (6/5)x^{5/2}]₀¹ = 3/2 − 6/5 = 3/10
Error var. = E[(X − mX)²] = 37/700 ≅ 0.0529
(b)
fY(y) = 3y² for 0 ≤ y ≤ 1, and 0 otherwise
mY = 3/4, var(Y) = 3/80, cov(X, Y) = 1/40
X* = 3/10 + (1/40)/(3/80) · (Y − 3/4) = (2/3)Y − 1/5
Error var. = var(X) − cov(X, Y)²/var(Y) ≅ 0.0362
(c)
fX|Y(x|y) = 1/y² for 0 ≤ x ≤ y² ≤ 1, and 0 otherwise
E[X|Y = y] = ∫_{0}^{y²} x (1/y²) dx = (1/2)y²
X̂ = E[X|Y] = (1/2)Y²
Error var. = E[(X − X̂)²] ≅ 0.0357
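A Monte Carlo replay of the three estimators (a sketch assuming numpy; since fXY is the constant 3 on the region 0 ≤ x ≤ y² ≤ 1, rejection sampling from the unit square gives samples from it):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2_000_000
x = rng.uniform(0, 1, n)
y = rng.uniform(0, 1, n)
keep = x <= y**2                 # fXY = 3 (constant) on this region
x, y = x[keep], y[keep]

# Mean-squared errors of the constant, linear, and nonlinear estimators.
print(np.mean((x - 3/10)**2))            # ~0.0529
print(np.mean((x - (2/3*y - 1/5))**2))   # ~0.0362
print(np.mean((x - 0.5*y**2)**2))        # ~0.0357
```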
Matrix Inversion Lemma:
(P⁻¹ + HᵀR⁻¹H)⁻¹ = P − PHᵀ(HPHᵀ + R)⁻¹HP
(A + XᵀY)⁻¹ = A⁻¹ − A⁻¹Xᵀ(I + YA⁻¹Xᵀ)⁻¹YA⁻¹
PF: exercise.
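A quick numerical check of the first identity (a sketch assuming numpy, with arbitrary positive-definite P and R):

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 4, 3
A = rng.normal(size=(n, n)); P = A @ A.T + n * np.eye(n)   # P > 0
B = rng.normal(size=(m, m)); R = B @ B.T + m * np.eye(m)   # R > 0
H = rng.normal(size=(m, n))

lhs = np.linalg.inv(np.linalg.inv(P) + H.T @ np.linalg.inv(R) @ H)
rhs = P - P @ H.T @ np.linalg.inv(H @ P @ H.T + R) @ H @ P
print(np.allclose(lhs, rhs))   # True
```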
Best Linear Min. Var. Estimator of X given Y:
X* = X*_{|Y} = E*[X|Y] = mX + ΣXY ΣY⁻¹ (Y − mY)
More Properties of Best Linear Estimator:
(1) It depends only on 1st and 2nd moments.
(2) X & Y jointly Gaussian ⇒ E*[X|Y] = E[X|Y].
(3) E*[X|Y] is linear in its 1st argument.
(4) Assume that Y & Z are uncorrelated. Also, let X, Y, & Z have zero mean.
(a) E*[X|Y, Z] = E*[X|Y] + E*[X|Z]
Let X̃_{|Y} ≜ X − E*[X|Y] and X̃_{|Y,Z} ≜ X − E*[X|Y, Z]. Then,
Σ_{X̃|Y} = ΣX − ΣXY ΣY⁻¹ ΣYX
Σ_{X̃|Y,Z} = ΣX − ΣXY ΣY⁻¹ ΣYX − ΣXZ ΣZ⁻¹ ΣZX
(b) E*[X|Y, Z] = E*[X|Y] + E*[X̃_{|Y} | Z]
Σ_{X̃|Y,Z} = Σ_{X̃|Y} − Σ_{X̃|Y, Z} ΣZ⁻¹ Σ_{Z, X̃|Y}
(5) Let X, Y, & Z have zero mean.
E*[X|Y, Z] = E*[X|Y, Z̃_{|Y}]
= E*[X|Y] + E*[X|Z̃_{|Y}] (by 4(a))
= E*[X|Y] + E*[X̃_{|Y} | Z̃_{|Y}] (by 4(b))
Σ_{X̃|Y,Z} = Σ_{X̃|Y} − Σ_{X̃|Y, Z̃|Y} Σ_{Z̃|Y}⁻¹ Σ_{Z̃|Y, X̃|Y}
Z̃_{|Y} = innovation in Z w.r.t. Y
(6) X, Y1, ···, Yk+1 zero mean. Denote E*[X|Y1, ···, Yk+1] ≜ X*_{|k+1}
X*_{|k+1} = X*_{|k} + E*[X̃_{|k} | Ỹ_{k+1|k}]
where X̃_{|k} = X − X*_{|k} and
Ỹ_{k+1|k} = Yk+1 − E*[Yk+1|Y1, ···, Yk]
is the innovation in Yk+1 w.r.t. Y1, ···, Yk.
(7) Y1, Y2, Y3, ···, Yk+1 are linearly related to Y1, Ỹ_{2|1}, Ỹ_{3|2}, ···, Ỹ_{k+1|k} (“Gram–Schmidt orthogonalization”).
Sample Mean
Estimator of E[X] = m:
m̂n ≜ (1/n) Σ_{i=1}^{n} Xi
E[m̂n] = m
Xi, i = 1, 2, 3, ···, n independent ⇒ var(m̂n) = (1/n) var(X)
Assume independence. Then
lim_{n→∞} P[|m̂n − m| ≥ ε] = 0 (the weak law of large numbers)
Relative frequency
Suppose that we sample a random variable, say X, n times and determine for each sample whether or not some given event A occurs. Let nA denote the number of occurrences of A. The random variable nA/n characterizing the relative frequency of occurrence of the event A has the statistical properties
E[nA/n] = p and var(nA/n) = p(1 − p)/n
where
p ≜ P[X ∈ A]
P[|nA/n − p| ≥ ε] ≤ 1/(4nε²)
lim_{n→∞} P[|nA/n − p| ≥ ε] = 0 (Bernoulli theorem)
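A simulation sketch of the Bernoulli theorem (assuming numpy): the relative frequency nA/n settles toward p as n grows.

```python
import numpy as np

rng = np.random.default_rng(6)
p = 0.3                                 # p = P[X in A] (example value)
hits = rng.uniform(size=100_000) < p    # indicators of the event A

for n in (100, 1_000, 10_000, 100_000):
    print(n, hits[:n].mean())           # relative frequency nA/n -> p
```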
9. Random Processes

Random Process
An indexed family of random variables, {Xt, t ∈ T}, where T denotes the set of possible values of the index t.
If T is a countably infinite set, then the process is called a discrete-parameter random process; if T is a continuum, then the process is called a continuous-parameter random process.

Bernoulli process
A random process {Xn, n = 1, 2, 3, ···} in which the random variables Xn are Bernoulli random variables, for example, where
P[Xn = 1] = p and P[Xn = 0] = 1 − p
and where the Xn are statistically independent random variables.
E[Xn] = p
var(Xn) = pq = p(1 − p)
Binomial counting process
A random process {Yn, n = 1, 2, 3, ···} with
Yn ≜ Σ_{i=1}^{n} Xi
where {Xi, i = 1, 2, 3, ···} is an independent Bernoulli r.p.
P[Yn = k] = (n choose k) pᵏ(1 − p)ⁿ⁻ᵏ, for k = 0, 1, 2, ···, n
E[Yn] = np
var(Yn) = npq
cov(Ym, Yn) = pq min(m, n)
var(Ym − Yn) = |m − n| pq
Sine wave process
Xt ≜ V sin(Ωt + Φ), t ∈ R
where V, Ω, and Φ are r.v.’s.

Stationarity (strict sense)
A random process {Xt, t ∈ T} is stationary (in the strict sense) if and only if all of the finite-dimensional probability distribution functions are invariant under shifts of the time origin.

Mean function
mX(t) ≜ E[Xt]
Autocorrelation function
RX(t1, t2) ≜ E[Xt1 Xt2]

Covariance function
KX(t1, t2) ≜ cov(Xt1, Xt2) = RX(t1, t2) − mX(t1) mX(t2)

Cross-correlation function
RXY(t1, t2) ≜ E[Xt1 Yt2]

Cross-covariance function
KXY(t1, t2) ≜ cov(Xt1, Yt2) = RXY(t1, t2) − mX(t1) mY(t2)
Stationary random processes
Let {Xt, −∞ < t < +∞} be a strictly stationary real random process. It then follows that
mX(t) = E[Xt] = E[X0] = const
RX(t, t − τ) = RX(0, −τ)
We generally write in this case
E[Xt] = E[X] = mX
RX(t, t − τ) = RX(τ), t ∈ R
RX(−τ) = RX(τ)
|RX(τ)| ≤ RX(0)

Wide sense stationarity (wss)
Let {Xt, −∞ < t < +∞} be a real random process such that
E[Xt] = E[X0], t ∈ R
RX(t, t − τ) = RX(0, 0 − τ), t ∈ R, τ ∈ R
Then the given random process is said to be stationary in the wide sense.

Jointly wide sense stationary random processes
We say that the random processes {Xt, −∞ < t < +∞} and {Yt, −∞ < t < +∞} are jointly wss if each is wss and
RXY(t, t − τ) = RXY(0, −τ), t ∈ R, τ ∈ R
Sample mean
Consider the wide-sense stationary random process {Xt, −∞ < t < +∞} whose second moment is finite. Suppose that we sample that process at the n time instants t1, t2, ···, tn. The estimator
m̂n ≜ (1/n) Σ_{i=1}^{n} Xi
where Xi ≜ X_{ti}
is called the sample mean.
E[m̂n] = mX
var(m̂n) = (1/n²) Σ_{i=1}^{n} Σ_{j=1}^{n} KX(ti, tj)
Special cases are:
Xi pairwise uncorrelated ⇒ var(m̂n) = KX(0, 0)/n = σ²/n
⇒ lim_{n→∞} P[|m̂n − m| > ε] = 0 (the weak law of large numbers)
Xi highly correlated ⇒ var(m̂n) ≅ σ²
Periodic sampling
Let the wss random process {Xt, −∞ < t < +∞} be sampled periodically throughout the interval 0 ≤ t ≤ T in such a way that there are n sampling instants equally spaced throughout that interval (the last at t = T). The variance of the sample mean is given in this case by the formula
var(m̂n) = σ²/n + (2/n) Σ_{k=1}^{n−1} (1 − k/n) KX(k Δt)
where Δt ≜ T/n. It therefore follows that
lim_{n→∞} var(m̂n) = (2/T) ∫_{0}^{T} (1 − τ/T) KX(τ) dτ
if we pass to the limit n → ∞ while keeping T fixed.
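A numerical illustration of how correlation inflates var(m̂n) (a sketch assuming numpy, using an exponential covariance KX(τ) = σ²e^{−|τ|/τc} as an example model, which is not from the text):

```python
import numpy as np

sigma2, tau_c, T, n = 1.0, 1.0, 10.0, 100
t = np.linspace(T / n, T, n)     # n equally spaced sampling instants, last at T

# var(m_n) = (1/n^2) sum_i sum_j K_X(t_i, t_j)
K = sigma2 * np.exp(-np.abs(t[:, None] - t[None, :]) / tau_c)
print(K.sum() / n**2)            # far above the uncorrelated value sigma2/n = 0.01
```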
10. LINEAR TRANSFORMATIONS

n-dimensional case
Suppose that the m-dimensional real random vector
Y = (Y1, Y2, ···, Ym)
is generated from the n-dimensional real random vector
X = (X1, X2, ···, Xn)
by the transformation g, that is,
Y = g(X)
We say that g is a linear transformation if and only if it satisfies the relation
g(aW + bZ) = a g(W) + b g(Z), ∀a, b ∈ R
Y1 = g11 X1 + g12 X2 + ··· + g1n Xn
Y2 = g21 X1 + g22 X2 + ··· + g2n Xn
···
Ym = gm1 X1 + gm2 X2 + ··· + gmn Xn
Yi = Σ_{j=1}^{n} gij Xj, i = 1, 2, ···, m
E[Yi] = Σ_{j=1}^{n} gij E[Xj]
cov(Yi, Yk) = Σ_{j=1}^{n} Σ_{r=1}^{n} gij gkr cov(Xj, Xr)

Matrix formulation
X = (X1, X2)ᵀ, Y = (Y1, Y2, Y3)ᵀ
G = [ g11 g12
      g21 g22
      g31 g32 ]
Y = GX
E[Y] = G E[X], ΣY = G ΣX Gᵀ
Time averages
{Xt, −∞ < t < +∞} -- r.p.
Yt ≜ (1/T) ∫_{t−T}^{t} Xτ dτ
E[Yt] = (1/T) ∫_{t−T}^{t} E[Xτ] dτ

Output autocorrelation function
RY(t1, t2) = E[Yt1 Yt2] = E[(1/T) ∫_{t1−T}^{t1} Xα1 dα1 · (1/T) ∫_{t2−T}^{t2} Xα2 dα2] = E[(1/T²) ∫_{t1−T}^{t1} ∫_{t2−T}^{t2} Xα1 Xα2 dα1 dα2]
= (1/T²) ∫_{t1−T}^{t1} ∫_{t2−T}^{t2} RX(α1, α2) dα1 dα2 = (1/T²) ∫_{0}^{T} ∫_{0}^{T} RX(τ1 + t1 − T, τ2 + t2 − T) dτ1 dτ2

{Xt, −∞ < t < +∞} wss ⇒
E[Yt] = mX ≜ E[Xt]
RY(t1, t2) = (1/T²) ∫_{0}^{T} ∫_{0}^{T} RX(t1 − t2 + τ1 − τ2) dτ1 dτ2
RY(t, t) = (1/T²) ∫_{0}^{T} ∫_{0}^{T} RX(τ1 − τ2) dτ1 dτ2 = (2/T²) ∫_{0}^{T} ∫_{α1}^{T} RX(α1) dα2 dα1
= (2/T²) ∫_{0}^{T} (T − α1) RX(α1) dα1 = (2/T) ∫_{0}^{T} (1 − τ/T) RX(τ) dτ

var(Yt) = RY(t, t) − mX² = (2/T) ∫_{0}^{T} (1 − τ/T)[RX(τ) − mX²] dτ = (2/T) ∫_{0}^{T} (1 − τ/T) KX(τ) dτ

∫_{−∞}^{∞} |KX(τ)| dτ < C ⇒ var(Yt) < C/T
Weighting functions
Time-invariant linear system:
y(t) = ∫_{−∞}^{+∞} h(τ) x(t − τ) dτ
h(t) -- system weighting function

Output moments
Xt ~ random input
Yt = ∫_{−∞}^{+∞} h(τ) X_{t−τ} dτ
E[Yt] = ∫_{−∞}^{+∞} h(τ) E[X_{t−τ}] dτ
KY(t1, t2) = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} h(τ1) h(τ2) KX(t1 − τ1, t2 − τ2) dτ1 dτ2

Xt ~ wss:
E[Yt] = mX ∫_{−∞}^{+∞} h(τ) dτ
KY(τ) = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} h(τ1) h(τ2) KX(τ − τ1 + τ2) dτ1 dτ2
RYX(τ) = ∫_{−∞}^{+∞} h(t′) RX(τ − t′) dt′

System correlation function
Rh(τ) ≜ ∫_{−∞}^{+∞} h(t) h(t − τ) dt
RY(τ) = ∫_{−∞}^{+∞} Rh(t′) RX(τ − t′) dt′
var(Yt) = ∫_{−∞}^{+∞} Rh(t′) KX(t′) dt′
11. SPECTRAL ANALYSIS

Fourier transforms
X(f) ≜ ∫_{−∞}^{+∞} x(t) e^{−i2πft} dt
x(t) = ∫_{−∞}^{+∞} X(f) e^{i2πft} df

System functions
h(t) -- the weighting function of a stable, linear, time-invariant system.
H(f) ≜ ∫_{−∞}^{+∞} h(τ) e^{−i2πfτ} dτ (the system function)
h(τ) = ∫_{−∞}^{+∞} H(f) e^{i2πfτ} df
y(t) = ∫_{−∞}^{+∞} h(τ) x(t − τ) dτ ⇒ Y(f) = X(f) H(f)
Spectral density
SX(f) ≜ ∫_{−∞}^{+∞} RX(τ) e^{−i2πfτ} dτ
RX(τ) = ∫_{−∞}^{+∞} SX(f) e^{i2πfτ} df
SX(0) = ∫_{−∞}^{+∞} RX(τ) dτ
E[Xt²] = RX(0) = ∫_{−∞}^{+∞} SX(f) df
SX(f) ≥ 0, for all f
Xt real ⇒ SX(−f) = SX(f)

Spectral analysis of linear systems
Xt ~ wss input, Yt ~ wss output:
SY(f) = |H(f)|² SX(f)
E[Yt²] = ∫_{−∞}^{+∞} |H(f)|² SX(f) df
If H has the value unity over a narrow band of width Δf centered about a frequency f1, then
E[Yt²] ≅ 2 SX(f1) Δf

Cross-spectral density
SXY(f) ≜ ∫_{−∞}^{+∞} RXY(τ) e^{−i2πfτ} dτ
RXY(τ) = ∫_{−∞}^{+∞} SXY(f) e^{i2πfτ} df
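A discrete-time sketch of the relation SY = |H|²SX (assuming numpy; h is an arbitrary example FIR weighting function, and the input is unit-variance white noise so that SX ≡ 1):

```python
import numpy as np

rng = np.random.default_rng(7)
h = np.array([0.25, 0.5, 0.25])          # example weighting function
nfft, nblocks = 256, 400

S_y = np.zeros(nfft)
for _ in range(nblocks):
    x = rng.normal(size=nfft + len(h) - 1)   # white noise: S_X(f) = 1
    y = np.convolve(x, h, mode="valid")      # y_t = sum_k h_k x_{t-k}
    S_y += np.abs(np.fft.fft(y, nfft))**2 / nfft
S_y /= nblocks                                # averaged periodogram

H = np.fft.fft(h, nfft)
print(np.max(np.abs(S_y - np.abs(H)**2)))    # small: S_Y is near |H|^2 S_X
```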
12. SUMS OF INDEPENDENT RANDOM VARIABLES

Independent-increment process
The real random process {Yt, t ≥ 0} is said to be an independent-increment process if for every set of time instants
0 < t1 < t2 < ··· < tn
the increments
(Yt1 − Y0), (Yt2 − Yt1), ···, (Y_{tn} − Y_{t_{n−1}})
are mutually independent random variables and Y0 ≜ 0. Then
Y_{tn} = Σ_{i=1}^{n} Xi
where
Xi ≜ Y_{ti} − Y_{t_{i−1}}, i = 1, 2, ···, n

Independent-increment process with stationary increments
F_{Y_{t2+τ} − Y_{t1+τ}}(x) = F_{Y_{t2} − Y_{t1}}(x), τ ∈ R
E[Y_{t2+t1} − Y_{t1}] = E[Y_{t2} − Y0] = E[Y_{t2}] ⇒ E[Y_{t2+t1}] = E[Y_{t1}] + E[Y_{t2}]
E[Yt] = mt
where m ≜ E[Yt]|_{t=1}
RY(t2 + t1, t1) = E[Y_{t2+t1} Y_{t1}] = E[(Y_{t2+t1} − Y_{t1} + Y_{t1}) Y_{t1}] = E[Y_{t2+t1} − Y_{t1}] E[Y_{t1}] + E[Y_{t1}²]
= m² t2 t1 + E[Y_{t1}²]
E[(Y_{t2} − Y0)²] = E[(Y_{t2+t1} − Y_{t1})²] = RY(t2 + t1, t2 + t1) − 2RY(t2 + t1, t1) + RY(t1, t1)
= RY(t2 + t1, t2 + t1) − 2m² t2 t1 − RY(t1, t1)
KY(t2 + t1, t2 + t1) = RY(t2 + t1, t2 + t1) − m²(t2 + t1)²
= RY(t2, t2) + 2m² t2 t1 + RY(t1, t1) − m² t2² − m² t1² − 2m² t2 t1
= KY(t2, t2) + KY(t1, t1)
var(Yt) = KY(t, t) = σ² t
where σ² ≜ var(Yt)|_{t=1}
t2 ≥ t1 ⇒ RY(t2, t1) = E[Y_{t2} Y_{t1}] = E[(Y_{t2} − Y_{t1} + Y_{t1}) Y_{t1}] = E[Y_{t2} − Y_{t1}] E[Y_{t1}] + E[Y_{t1}²]
= m² t2 t1 + σ² t1
t2 ≥ t1 ⇒ KY(t2, t1) = RY(t2, t1) − m² t2 t1 = σ² t1
KY(t2, t1) = σ² min(t2, t1)
t2 ≥ t1 ⇒ var(Y_{t2} − Y_{t1}) = KY(t2, t2) + KY(t1, t1) − 2KY(t2, t1) = σ² t2 + σ² t1 − 2σ² t1 = σ²(t2 − t1)
var(Y_{t2} − Y_{t1}) = σ² |t2 − t1|
Characteristic function
φX(v) ≜ E[e^{ivX}] = ∫_{−∞}^{∞} fX(x) e^{ivx} dx (Fourier transform)
fX(x) = (1/2π) ∫_{−∞}^{∞} φX(v) e^{−ivx} dv
φX(v) = Σ_k P[X = xk] e^{ivxk}
|φX(v)| ≤ φX(0) = 1
Sums of independent random variables
Yn ≜ Σ_{i=1}^{n} Xi
where X1, X2, ···, Xn are mutually independent.
φ_{Yn}(v) = E[e^{iv Σ_{i=1}^{n} Xi}] = Π_{i=1}^{n} E[e^{ivXi}] = Π_{i=1}^{n} φ_{Xi}(v)
f_{Yn}(y) = f_{X1}(y) ∗ f_{X2}(y) ∗ ··· ∗ f_{Xn}(y)

Linear functions
Y ≜ aX + b, a, b ∈ R
⇒ φY(v) = E[e^{iv(aX+b)}] = e^{ivb} φX(av)
Gaussian random variables
Let X be a gaussian random variable.
φX(v) = exp(iv E[X] − v²σX²/2)
Let Y be a sum of n mutually independent gaussian random variables Xk.
φY(v) = exp(ivm − v²σ²/2)
where
m ≜ Σ_{k=1}^{n} mk and σ² ≜ Σ_{k=1}^{n} σk²
and where mk and σk² are the mean and variance, respectively, of Xk.

Cauchy random variables
fX(x) = 1/(π(1 + x²))
φX(v) = e^{−|v|}
Chi-squared random variables
fX(x) = x^{(n−2)/2} e^{−x/2} / (2^{n/2} Γ(n/2)) for x ≥ 0, and 0 for x < 0
where n is a positive integer.
φX(v) = (1 − i2v)^{−n/2}

Poisson random variables
P[X = k] = λᵏ e^{−λ}/k!, for k = 0, 1, 2, ···
φX(v) = exp[λ(e^{iv} − 1)]

Moment-generating property
E[Xᵏ] = (−i)ᵏ φX⁽ᵏ⁾(0)
φX(v) = Σ_{k=0}^{∞} E[Xᵏ](iv)ᵏ/k!
Joint characteristic functions
φ_{X1,X2}(v1, v2) = E[exp(i Σ_{k=1}^{2} vk Xk)]
φX(v) = E[exp(i vᵀX)]
where v ≜ (v1, v2)ᵀ and X ≜ (X1, X2)ᵀ
|φX(v)| ≤ φX(0) = 1

X ~ n × 1:
fX(x) = f_{X1,X2,···,Xn}(x1, ···, xn)
φ_{X1,···,Xn}(v1, ···, vn) = ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} exp(i Σ_{k=1}^{n} vk xk) f_{X1,···,Xn}(x1, ···, xn) dx1 ··· dxn
f_{X1,···,Xn}(x1, ···, xn) = (1/(2π)ⁿ) ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} exp(−i Σ_{k=1}^{n} vk xk) φ_{X1,···,Xn}(v1, ···, vn) dv1 ··· dvn
65
Independent random variables
Xk’s are mutually independent ⇔ φX1,X2,··· ,Xn(v1, v2, · · · , vn) =n∏
k=1
φXk(vk)
Moment-generating properties
E[Xm1 Xk
2 ] = (−i)m+k ∂m+kφX1,X2(v1, v2)∂vm
1 ∂vk2
∣∣∣∣v1=v2=0
φX1,X2(v1, v2) =∞∑
m=0
∞∑
k=0
E[Xm1 Xk
2 ](iv1)m
m!(iv2)k
k!
Independent-increment processes
Let Yt, t ≥ 0 be a real random process with stationary and independent increments and letY0 = 0. If, given the time instants
0 = t0 < t1 < t2 < · · · < tn
Xk , Ytk− Ytk−1 , for k = 1, 2, · · · , n
Then
Y_{tn} = Σ_{k=1}^{n} Xk
φ_{Y_{tn}}(v) = Π_{k=1}^{n} φ_{Xk}(v)
φ_{Y_{t1},Y_{t2},···,Y_{tn}}(v1, v2, ···, vn) = Π_{k=1}^{n} φ_{Xk}(Σ_{j=k}^{n} vj)
Probability generating function
Let X be a discrete random variable with nonnegative integer possible values.
ψX(z) ≜ E[z^X] = Σ_k P[X = k] zᵏ
E[X] = ψX′(1)
E[X(X − 1)···(X − n + 1)] = ψX⁽ⁿ⁾(1)
Joint-probability generating functions
Each of the Xk is a nonnegative, integer-valued random variable.
ψX(z) ≜ E[z1^{X1} z2^{X2} ··· zn^{Xn}] = E[Π_{k=1}^{n} zk^{Xk}]
Xk’s mutually independent ⇔ ψX(z) = Π_{k=1}^{n} ψ_{Xk}(zk)
13. The Poisson process

Poisson process
Let {Nt, 0 ≤ t < +∞} be a counting random process such that:
a. Nt assumes only nonnegative integer values and N0 ≜ 0,
b. the process has stationary and independent increments,
c. P[N_{t+Δt} − Nt = 1] = λΔt + o(Δt)  (λ > 0),
d. P[N_{t+Δt} − Nt > 1] = o(Δt),
where
lim_{Δt→0} o(Δt)/Δt = 0
It then follows that
P[N_{t+Δt} − Nt = 0] = 1 − λΔt + o(Δt)
and that
P[Nt = k] = e^{−λt}(λt)ᵏ/k!, k = 0, 1, 2, ···
That is, Nt is a Poisson random variable. The counting process {Nt, 0 ≤ t < ∞} is then called a Poisson counting process.
E[Nt] = λt
var(Nt) = λt
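A simulation sketch (assuming numpy), building Nt from exponential interarrival times and checking E[Nt] = var(Nt) = λt:

```python
import numpy as np

rng = np.random.default_rng(8)
lam, t, trials = 2.0, 5.0, 100_000

# Arrival times T_k = Z_1 + ... + Z_k from exponential interarrivals;
# N_t counts how many fall in (0, t]. (40 draws suffice since lam*t = 10.)
z = rng.exponential(1.0 / lam, size=(trials, 40))
n_t = np.sum(np.cumsum(z, axis=1) <= t, axis=1)

print(n_t.mean(), n_t.var())   # both close to lam * t = 10
```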
Arrival times
Let Tk be the random variable which describes the arrival time of the kth event counted by the counting random process {Nt, 0 ≤ t < ∞}. Then
F_{Tk}(t) = 1 − F_{Nt}(k − 1)
If {Nt, 0 ≤ t < ∞} is a Poisson counting process, then
F_{Tk}(t) = 1 − e^{−λt} Σ_{j=0}^{k−1} (λt)ʲ/j! for t ≥ 0, and 0 for t < 0
f_{Tk}(t) = λe^{−λt} (λt)^{k−1}/(k−1)! for t ≥ 0, and 0 for t < 0
that is, Tk has an Erlang probability density. In this case:
E[Tk] = k/λ
var(Tk) = k/λ²
φ_{Tk}(v) = 1/(1 − iv/λ)ᵏ
Interarrival times
Let {Nt, 0 ≤ t < +∞} be a counting process with the arrival times Tk, k = 1, 2, 3, ···. The durations
Z1 ≜ T1
Zk ≜ Tk − T_{k−1}, k = 2, 3, 4, ···
are called the interarrival times of the counting process. We then have
F_{Zk}(τ) = 1 − P[N_{t_{k−1}+τ} − N_{t_{k−1}} = 0]
If the counting process has stationary increments, then, for all k,
F_{Zk}(τ) = 1 − P[Nτ = 0]
Further, if the given counting process is Poisson, then
F_{Zk}(τ) = 1 − e^{−λτ} for τ ≥ 0, and 0 for τ < 0
f_{Zk}(τ) = λe^{−λτ} for τ ≥ 0, and 0 for τ < 0
In this case
E[Zk] = 1/λ, k = 1, 2, 3, ···
In any case
E[Tk] = k E[Zk]
Renewal counting processes
Let {Nt, 0 ≤ t < ∞} be a counting process. If the interarrival times of this counting process are mutually independent random variables, all with the same probability distribution function, then the given process is called a renewal counting process. The renewal function m(t) of a renewal counting process is the expected value of that process; that is,
m(t) ≜ E[Nt]
and its derivative
λ(t) ≜ dm(t)/dt
is called the renewal intensity of the process. It then follows that
m(t) = Σ_{k=1}^{∞} F_{Tk}(t)
and, if the various derivatives exist,
λ(t) = Σ_{k=1}^{∞} f_{Tk}(t)
On defining Λ(v) to be the Fourier transform of the renewal intensity, that is,
Λ(v) ≜ ∫_{−∞}^{+∞} λ(t) e^{ivt} dt
it then follows that
Λ(v) = φZ(v)/(1 − φZ(v))
where φZ is the common characteristic function of the interarrival times.

Unordered arrival times
Let {Nt, 0 ≤ t < ∞} be a Poisson counting process and suppose that Nt = k; that is, suppose that k events occur by time t. The unordered arrival times U1, U2, ···, Uk of those k events are then mutually independent random variables, each of which is uniformly distributed over the interval (0, t]:
f_{Ui}(ui|Nt = k) = 1/t for 0 < ui ≤ t, and 0 otherwise
for all i = 1, 2, ···, k.
Filtered Poisson processes
Let {Nt, 0 ≤ t < ∞} be a Poisson counting process. The random process {Xt, 0 ≤ t < ∞} in which
Xt ≜ Σ_{j=1}^{Nt} h(t − Uj)
where an event which occurs at time uj generates an outcome h(t − uj) at time t and where the random variables Uj are the unordered arrival times of the events which occur during the interval (0, t], is called a filtered Poisson process. The mean of a filtered Poisson process is
E[Xt] = λ ∫_{0}^{t} h(u) du
the variance is
var(Xt) = λ ∫_{0}^{t} h(u)² du
and the characteristic function of the random variable Xt is
φ_{Xt}(v) = exp[λ ∫_{0}^{t} (e^{ivh(u)} − 1) du]
Random partitioning
Let {Nt, 0 ≤ t < ∞} be a Poisson counting process and let {Xt, 0 ≤ t < ∞} be the corresponding filtered Poisson process in which
Xt ≜ Σ_{j=1}^{Nt} h(t − Uj)
We say that the random process {Zt, 0 ≤ t < ∞} is a randomly partitioned filtered Poisson random process if
Zt ≜ Σ_{j=1}^{Nt} Yj h(t − Uj)
where the partitioning random variables Yj are mutually independent random variables which are independent of the unordered arrival times Uj, and where each of the Yj has the same Bernoulli probability distribution
P[Yj = 1] = p and P[Yj = 0] = q ≜ 1 − p
where 0 < p < 1. In this case,
E[Zt] = pλ ∫_{0}^{t} h(u) du = p E[Xt]
The characteristic function of the randomly partitioned random variable Zt is
φ_{Zt}(v) = exp[pλ ∫_{0}^{t} (e^{ivh(u)} − 1) du]
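A simulation sketch of random partitioning (assuming numpy): keeping each arrival of a rate-λ Poisson process independently with probability p leaves a count with Poisson mean and variance pλt.

```python
import numpy as np

rng = np.random.default_rng(9)
lam, p, t, trials = 3.0, 0.4, 10.0, 50_000

counts = rng.poisson(lam * t, size=trials)   # N_t for each trial
kept = rng.binomial(counts, p)               # arrivals with Y_j = 1

print(kept.mean(), kept.var())   # both close to p * lam * t = 12
```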
14. Gaussian random processes

Gaussian random vectors
Y = (Y1, Y2, ···, Ym)
φY(v) = exp(i mYᵀ v − (1/2) vᵀ ΣY v)
fY(y) = exp[−(1/2)(y − mY)ᵀ ΣY⁻¹ (y − mY)] / ((2π)^{m/2} |ΣY|^{1/2})

Gaussian random processes
The real random process {Yt, t ∈ T} is said to be a gaussian random process if for every finite set of time instants tj ∈ T, the corresponding random variables Y_{tj} are jointly gaussian random variables.
Narrowband random processes
The random process {Xt, −∞ < t < ∞} is said to be a narrowband random process if it has a zero mean, is stationary in the wide sense, and if its spectral density SX differs from zero only in some narrow band of width Δf centered about some frequency f0 where
f0 >> Δf
A narrowband random process may be represented in terms of an envelope random process {Vt, −∞ < t < ∞} and a phase random process {Φt, −∞ < t < ∞} by using the relation
Xt = Vt cos(ω0 t + Φt)
where ω0 = 2πf0. Alternatively, a narrowband random process may also be represented in terms of cosine and sine component random processes {Xct, −∞ < t < ∞} and {Xst, −∞ < t < ∞}, respectively, by using the relation
Xt = Xct cos ω0t − Xst sin ω0t
The relations between these two representations are given by the formulas
Xct = Vt cos Φt and Xst = Vt sin Φt
which have the inverses
Vt = √(Xct² + Xst²) and Φt = tan⁻¹(Xst/Xct)
The random variables Xct, Xst, X_{c(t+τ)}, and X_{s(t+τ)} have the covariance matrix
R(τ) = [ RX(0)    0        Rc(τ)    Rcs(τ)
         0        RX(0)   −Rcs(τ)   Rc(τ)
         Rc(τ)   −Rcs(τ)   RX(0)    0
         Rcs(τ)   Rc(τ)    0        RX(0) ]
where
Rc(τ) = 2 ∫_{0}^{+∞} SX(f) cos[2π(f − f0)τ] df
and
Rcs(τ) = 2 ∫_{0}^{+∞} SX(f) sin[2π(f − f0)τ] df
Narrowband gaussian processes
The cosine- and sine-component random variables Xct and Xst of a gaussian narrowband random process are independent random variables with zero means, each with a variance equal to RX(0), and a joint-probability density
f_{Xct,Xst}(x, y) = exp[−(x² + y²)/(2RX(0))] / (2πRX(0))
The envelope and phase random variables Vt and Φt of a gaussian narrowband random process are also independent random variables. The envelope has the Rayleigh probability density
f_{Vt}(v) = (v/RX(0)) exp[−v²/(2RX(0))] for v ≥ 0, and 0 otherwise
and the phase is uniformly distributed over [0, 2π]:
f_{Φt}(φ) = 1/(2π) for 0 ≤ φ ≤ 2π, and 0 otherwise