
Page 1

Lecture 1: Monday August 23, 2021

Lecture

• Review syllabus and class logistics
• Intro & motivation
• Probability review

Page 2

Class Website
https://barry.ece.gatech.edu/6601/

Page 3

Progression

In a nutshell:

• probability theory
• random variables X, Y
• random vectors X = [X1, X2, …, Xn]T
• random sequences Xk
• random processes X(t)

Page 4

Topics
• Review of Probability (Chapters 1-3)
  axioms, conditional probability, Bayes theorem
• Random Variables (Chapters 4 and 5)
  the cdf, pmf (discrete), and pdf (continuous)
  expectation and moments, the mgf
• Pairs of Random Variables (Chapter 6)
  joint, marginal, and conditional distributions
  independence and correlation
  law of total expectation
• Random Vectors (Chapter 6)
  jointly Gaussian random vectors
  conditional pdfs for a Gaussian random vector
  minimum mean-square error (MMSE) prediction
• Limit Theorems (Chapter 7)
  the Central Limit Theorem

Page 5

• Discrete-Time Random Sequences (Chapters 9 and 11)
  stationarity, ergodicity
  autocorrelation and power spectral density
  spectral factorization and innovations
  linear prediction
• Continuous-Time Random Processes (Chapters 10 and 11)
• The Poisson Process
  discrete-time Bernoulli process: Binomial, Geometric, and NegBin
  continuous-time Poisson point process: Poisson, Exp, and Erlang
• Kalman Filters (Chapter 13)
• Markov Chains (Chapters 15 and 16)

Page 6

Examples
• Thermal noise
• Wiener process
• Bernoulli
• Poisson counting

[Figures: thermal-noise histogram; a Bernoulli bit sequence … 00101001000000001011010011011010001100110011010001 …; a Poisson counting process N(t) versus t]

Page 7

• Discrete-Time Random Telegraph
• Random Telegraph
• Random PAM

[Figures: sample paths versus t]

Page 8

2 Birds in a Cage

Given that one is male, what is the probability that the other is male?

MM, MF, FM, FF are equally likely; given that at least one is male, FF is ruled out

⇒ P(MM) = 1/3
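The counting argument can be verified by enumerating the equally likely outcomes (a Python sketch):

```python
from itertools import product

outcomes = list(product("MF", repeat=2))          # MM, MF, FM, FF, equally likely
given = [o for o in outcomes if "M" in o]         # condition: at least one male
both_male = [o for o in given if o == ("M", "M")]

print(len(both_male), "/", len(given))   # 1 / 3
```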

Page 9

Why we need this class

The information age: information = –log(probability)

Data analysis, ML, AI

To design

• communication systems
• radar, GPS: minimize P(miss) given P(false alarm)?
• queueing systems
  airport security, check-out counters
  average wait time; #servers so that waiting time is ≤ t0
• forecasting and prediction
  weather, financial markets, speech, video (e.g., for compression)

Page 10

Sets

A set is a collection of objects called elements.

Examples:

F = {apple, banana}
B = {0, 1, 2, …}
Z = {…, –3, –2, –1, 0, 1, 2, 3, …}
R+ = {x : x > 0}

A subset B of a set A is itself a set, all of whose elements are in A.

Notation: B ⊂ A

Equality: A = B iff A ⊂ B and B ⊂ A

In our discussion, all sets will be subsets of a universal set, S

A set with n = 6 elements (a die) has 2^n = 2^6 = 64 subsets: ø, {1}, {2}, …, {2,3,4,5,6}, …, S

Page 11

Set Operations

Sum (Union): C = A ∪ B
The set of elements that are in A, or in B, or both

Product (Intersection): C = A ∩ B
The set of elements that are common to A and B

[Venn diagrams in S: A ∪ B shaded; A ∩ B shaded]

Page 12

Disjoint Sets

Two sets A and B are disjoint or mutually exclusive when they have no elements in common.

Equivalent condition: A ∩ B = ø

[Venn diagram: non-overlapping A and B in S]

Page 13

Set Complement

The complement Ac consists of the elements of S not in A.

Properties:
øc = S
Sc = ø
(Ac)c = A
A ∩ Ac = ø
A ∪ Ac = S

[Venn diagram: A and its complement Ac in S]

Page 14

Properties of Set Operations
• Commutative: A ∪ B = B ∪ A
• Associative: A ∪ (B ∪ C) = (A ∪ B) ∪ C
• Distributive: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
• Double complement: (Ac)c = A
• Mutual exclusion: A ∩ Ac = ø
• Inclusion: A ∩ S = A
• DeMorgan:
  (A ∪ B)c = Ac ∩ Bc
  (A ∩ B)c = Ac ∪ Bc

Page 15

Example
Let S = {1, 2, 3, 4, 5, 6} with subsets A = {2, 4, 6} and B = {1, 2, 3, 4}.

Find (A ∩ Bc) ∪ B =________________ ?

(A ∩ Bc) ∪ B = {6} ∪ {1, 2, 3, 4} = {1, 2, 3, 4, 6}
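The same computation in Python, using set operators (a sketch; `&`, `|`, and `-` play the roles of ∩, ∪, and complement relative to S):

```python
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
B = {1, 2, 3, 4}

Bc = S - B                  # complement of B relative to S
result = (A & Bc) | B       # (A ∩ Bc) ∪ B
print(sorted(result))       # [1, 2, 3, 4, 6]
```

DeMorgan's laws can be spot-checked the same way, e.g. `S - (A | B) == (S - A) & (S - B)`.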

Page 16

Experiment
Three components:

• procedure (e.g., flip a coin)
• observable (e.g., which side lands up)
• model (e.g., heads and tails equally likely)

Key property:

• outcome uncertain

Example of different observables for the same physical act of flipping a coin:

• observe heads or tails
• count bounces
• measure settling time

In the context of a random experiment, we define:

• the sample space as the set of possible outcomes
• an event as any subset of the sample space
• an elementary event as any singleton subset; i.e., a single outcome

Page 17

Pop Quiz
Flip a pair of coins; what is the sample space?

It depends on the observable!

Three options:

• order matters ⇒ S1 = {TT, TH, HT, HH}
• order does not matter ⇒ S2 = {both tails, both heads, one of each}
• distance between coins ⇒ S3 = {d : d > 0}

Page 18

Relative Frequency Approach
Assume a finite sample space S = {A, B, C, D}.
After N trials, let N_A = #times that A occurs.

Define the "probability of A" by P(A) = lim_{N→∞} N_A / N

[Plot: batting average versus number of at bats (0 to 1000); the relative frequency settles toward P(hit)]
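The limiting relative frequency can be illustrated by simulation; the hit probability 0.3 below is a made-up value for the sketch:

```python
import random

random.seed(0)
P_HIT = 0.3   # hypothetical "true" probability (an assumption for this sketch)

def relative_frequency(n: int) -> float:
    """Estimate P(hit) as N_A / N over n simulated trials."""
    return sum(random.random() < P_HIT for _ in range(n)) / n

for n in (10, 100, 1000, 100_000):
    print(n, relative_frequency(n))   # estimates settle toward 0.3 as n grows
```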

Page 19

Properties
• 0 ≤ N_A ≤ N ⇒ 0 ≤ P(A) ≤ 1

• N_A + N_B + ⋯ + N_D = N ⇒ P(A) + P(B) + ⋯ + P(D) = 1

• A impossible ⇒ P(A) = 0

• A certain ⇒ P(A) = 1

Page 20

Limitations of Relative-Freq Approach
How to handle uncountable sample spaces?

Example experiment: flip a coin, measure settling time ⇒ S = {t : t > 0}
After repeated trials:

• T1 = 0.1235124234245423554345346351013293127825897523124376
• T2 = 3.0021235543453463510132931278258975234245421243769001
• T3 = 17.1235293127825897523124376

• T4 = 0.5

• etc.... Now what?

Adopt the axiomatic approach to probability

• based on set theory

Page 21

The 3 Axioms of Probability
With respect to a random experiment, define

• a sample space S = set of possible outcomes
• any measurable subset of S defines an event

Example:

ø = the null event = the impossible event = "nothing happened"

S = the certain event = "something happened"

The probability P(A) of an event A is a number that satisfies 3 axioms:

(1) P(A) ≥ 0

(2) P(S) = 1

(3) If A ∩ B = ø then P(A ∪ B) = P(A) + P(B)

Useful consequences (corollaries, proved next):
(4) P(A) ≤ 1
(5) P(ø) = 0

(6) P(A ∪ B) = P(A) + P(B) – P(A ∩ B)

Page 22

Corollaries
Corollary 4: P(A) ≤ 1

Why? By (2) and (3), 1 = P(S) = P(A ∪ Ac) = P(A) + P(Ac), and by (1), P(Ac) ≥ 0,

so P(A) = 1 – P(Ac) ≤ 1.

Corollary 5: P(ø) = 0

Why? By (3), P(S) = P(S ∪ ø) = P(S) + P(ø), so P(ø) = 0.

The axiom only requires that probability be nonnegative.

The corollary ensures that it is also no bigger than unity.

Page 23

Intuition
For finite sample spaces, the axioms are met by assigning a probability value to each possible outcome, a number between 0 and 1, such that they sum to unity.

Page 24

How to Interpret Combined Events
Suppose A, B, C are subsets of S.

Interpretation of:

P(A ) = probability that A occurs

P(A ∪ B) = probability that A or B occurs

P(A ∩ B) = probability that A and B occur

Page 25

Corollary 6
Corollary 6: P(A ∪ B) = P(A) + P(B) – P(A ∩ B)

Proof? Decompose each event into disjoint pieces:

P(A) = P(A ∩ Bc) + P(A ∩ B)

P(B) = P(B ∩ Ac) + P(A ∩ B)

Add these two equations:

P(A) + P(B) = P(A ∩ Bc) + P(B ∩ Ac) + P(A ∩ B) + P(A ∩ B)

            = P(A ∪ B) + P(A ∩ B),  Q.E.D.

[Venn diagram: S partitioned into the disjoint pieces A ∩ Bc, A ∩ B, and B ∩ Ac]

Page 26

RemarkWhile it is true that A = ø implies that P(A) = 0,

the converse is false: P(A) = 0 does not imply A = ø

Example: settling time after tossing a coin; let A = {0.1 seconds exactly}

Page 27

Generalize to 3 Events
P(A ∪ B ∪ C) = P(A) + P(B) + P(C)
             – P(A ∩ B) – P(A ∩ C) – P(B ∩ C)
             + P(A ∩ B ∩ C)

[Venn diagram: three overlapping sets A, B, C in S]

Page 28

Methods for Calculating Probabilities
Any solution to a problem of the form "Find the probability that …" will likely use one of four methods:

(1) Counting method (for finite uniform distributions): P(A) = (size of A)/(size of S)

(2) Multiplication rule (chain rule)

(3) Law of total probability (divide-and-conquer)

(4) Combinations of the above three (e.g., using Bayes rule)

Page 29

Example: Prob of 2 Pair, Neither Faced?
Suppose five cards are drawn at random from a standard deck. Find the probability that there are two pair, neither of which is a face card. [Assume that the 5th card has a different rank than the others (otherwise it would be a "full house," not "2 pair"). Assume that an ace is not a face card.]

There are |S| = (52 choose 5) possible hands, all equally likely

⇒ we can use the counting method ⇒ P(A) = |A|/|S|

The question becomes: what is |A|? In other words, how many distinct hands are there that have two pair, neither of which is a face card?

There are (10 choose 2) ways to specify the ranks of the two pairs.

There are (4 choose 2) ways to specify the suits of the smaller pair.

There are (4 choose 2) ways to specify the suits of the larger pair.

The fifth card can be any of the 48 – 4 = 44 cards that remain with a different rank.

⇒ P(A) = (10 choose 2)(4 choose 2)(4 choose 2)(44) / (52 choose 5) = 71280/2598960 ≈ 2.7%
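The count can be checked with `math.comb` (a sketch of the counting method above):

```python
from fractions import Fraction
from math import comb

# |A|: ranks of the two pairs, suits of each pair, then the fifth card.
size_A = comb(10, 2) * comb(4, 2) * comb(4, 2) * 44
size_S = comb(52, 5)

p = Fraction(size_A, size_S)
print(size_A, size_S, round(float(p), 4))   # 71280 2598960 0.0274
```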

Page 30

Example: Roll 2 Dice
If the 2 dice are distinguishable ⇒ 36 possible outcomes

S1 = {(1,1), (2,1), (3,1), …, (6,1), (1,2), (2,2), …, (6,2), …, (5,6), (6,6)}

If not ⇒ 21 possible outcomes

S2 = {(1,1), (2,1), (3,1), …, (6,1), (2,2), …, (6,5), (6,6)}

Which sample space is preferable?

The 1st one, because it ensures that all outcomes are equally likely

⇒ computing the probability of an event reduces to a counting exercise.

Page 31

Example: Roll 2 Dice
Let A = event that the bigger is double the smaller.

Let B = event that the bigger is the smaller + 1.

Find P(A ∪ B) = "probability that either A, or B, or both occur"

Approach 1: A ∪ B has 14 equiprobable elements ⇒ P(A ∪ B) = 14/36 = 7/18

Approach 2: P(A ∪ B) = P(A) + P(B) – P(A ∩ B)

                     = P(A) + P(B) – P((1,2) or (2,1))

                     = 6/36 + 10/36 – 2/36 = same answer.

[Grid of the 36 outcomes (first die, second die); A = {(1,2),(2,1),(2,4),(4,2),(3,6),(6,3)} and the band B are marked]
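Both approaches can be verified by enumerating the 36 outcomes (a Python sketch):

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))    # 36 equally likely rolls

A = {r for r in outcomes if max(r) == 2 * min(r)}  # bigger is double the smaller
B = {r for r in outcomes if max(r) == min(r) + 1}  # bigger is the smaller + 1

p_union = Fraction(len(A | B), 36)                 # Approach 1: count the union
p_ie = (Fraction(len(A), 36) + Fraction(len(B), 36)
        - Fraction(len(A & B), 36))                # Approach 2: inclusion-exclusion
print(p_union, p_ie)   # 7/18 7/18
```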

Page 32

Conditional Probability
In the previous example: if we somehow knew that B occurred, this would change the probability of A.

This new probability, that takes into account the knowledge that B has occurred, is called the conditional probability.

• Notation: P(A|B)

• It is the new probability of A, conditioned on the event B.

When P(B) > 0, the conditional probability of an event A, given B, is

P(A|B) = P(A ∩ B)/P(B)

Page 33

Can Verify: Conditional Probability Satisfies Axioms
When B is taken as the new sample space:

(1) P(A|B) ≥ 0

(2) P(B|B) = 1

(3) If A ∩ C = ø then P(A ∪ C|B) = P(A|B) + P(C|B)

Page 34

Special Cases

(a) A and B are disjoint ⇒ P(A|B) = 0.

Examples:

• P(head| tail) = 0.

• P(odd|even) = 0.

(b) What happens when P(B ) = 0?

Don’t we have to worry about dividing by zero?

No. P(B ) = 0 means that B is impossible.So P(A|B) doesn’t even make sense.

Page 35

Back to Example: Roll 2 Dice
Let A = event that the bigger is double the smaller.

Let B = event that the bigger is the smaller + 1.

• If B is known to occur, then B is the new sample space.
  - All events outside of B become impossible
  - probabilities outside of B are set to zero
• B becomes the new sample space
  - probabilities inside of B sum to 1

Each of the 10 outcomes in B is equally likely

⇒ P(A|B) = P(A ∩ B)/P(B) = (2/36)/(10/36) = 1/5

[Grid of the 36 outcomes (1st die, 2nd die) with A and B marked]
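Treating B as the new sample space, the same enumeration gives the conditional probability (a sketch):

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))

A = {r for r in outcomes if max(r) == 2 * min(r)}  # bigger is double the smaller
B = {r for r in outcomes if max(r) == min(r) + 1}  # bigger is the smaller + 1

# P(A|B) = P(A ∩ B)/P(B): only the 10 outcomes in B remain possible.
p_a_given_b = Fraction(len(A & B), len(B))
print(p_a_given_b)   # 1/5
```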

Page 36

Conditioning Sometimes Increases, Sometimes Decreases

• A and B disjoint (P(A ∩ B) = 0) ⇒ P(A|B) = P(A ∩ B)/P(B) = 0/P(B) = 0

• B ⊂ A (when P(B – A) = 0) ⇒ P(A|B) = P(A ∩ B)/P(B) = P(B)/P(B) = 1

• A ⊂ B ⇒ P(A|B) = P(A ∩ B)/P(B) = P(A)/P(B) ≥ P(A)

• In general, P(A|B) could be bigger than P(A) or could be smaller than P(A)

[Venn diagrams in S illustrating each case]

Page 37

Ex: Roll 2 Dice, Find P(odd|small)
Roll two fair dice, and observe the sum.

Let A be the "odd" event; i.e., A = {3, 5, 7, 9, 11}.

Let B be the "small" event; i.e., B = {2, 3}.

Then P(A|B) = ?

[Grid of the 36 outcomes (first die, second die)]

Page 38

Ex: Roll 2 Dice, Find P(odd|small)
Roll two fair dice, and observe the sum.

Let A be the "odd" event; i.e., A = {3, 5, 7, 9, 11}.

Let B be the "small" event; i.e., B = {2, 3}.

Then P(A|B) = P(A ∩ B)/P(B) = P(3)/P(B) = (2/36)/(3/36) = 2/3.

(not 1/2 as you might first guess)

[Grid of the 36 outcomes (first die, second die)]
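A quick enumeration confirms the 2/3 (a Python sketch):

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))

A = [r for r in rolls if (r[0] + r[1]) % 2 == 1]   # sum is odd
B = [r for r in rolls if r[0] + r[1] <= 3]         # sum is "small": 2 or 3
AB = [r for r in A if r in B]                      # sum is exactly 3

print(Fraction(len(AB), len(B)))   # 2/3
```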

Page 39

Chain Rule
Solving P(B|A) = P(A ∩ B)/P(A) for the numerator

⇒ P(A ∩ B) = P(A)P(B|A).

• This is the chain rule (or the multiplication rule).
• Use it to compute the probability that both events A and B occur.

Example: Draw 2 cards from a deck; the probability both are red cards is

P(first red and second red) = P(first red)P(second red | first red)

                            = (26/52)(25/51)
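With exact fractions, the chain-rule computation is (a sketch):

```python
from fractions import Fraction

# P(both red) = P(first red) * P(second red | first red)
p_both_red = Fraction(26, 52) * Fraction(25, 51)
print(p_both_red)   # 25/102
```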

Page 40

Chain Rule = Multiplication Rule
To calculate the probability that 5 events A1, …, A5 happen simultaneously:

P(A1A2A3A4A5) = P(A1)P(A2A3A4A5|A1)

              = P(A1)P(A2|A1)P(A3A4A5|A1A2)

              = P(A1)P(A2|A1)P(A3|A1A2)P(A4A5|A1A2A3)

              = P(A1)P(A2|A1)P(A3|A1A2)P(A4|A1A2A3)P(A5|A1A2A3A4)

Describe tree interpretation.

Page 41

Example
Example: 10 coins in a bag: 2 Red, 3 Blue, 5 Green.

Draw 2 coins without replacement. Find the probability of (first Green, then Blue).

[Probability tree: first draw R (0.2), B (0.3), G (0.5); second-draw branches are out of 9, e.g., after a Green the branches are R 2/9, B 3/9, G 4/9]

P(first Green, then Blue) = (0.5)(3/9) = 1/6
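The highlighted branch of the tree, as exact fractions (a sketch):

```python
from fractions import Fraction

# First draw: P(G) = 5/10; second draw, one Green removed: P(B) = 3/9.
p_green_then_blue = Fraction(5, 10) * Fraction(3, 9)
print(p_green_then_blue)   # 1/6
```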

Page 42

Birthday Problem: A Classic Chain Rule Example

Assume 365 days (ignore leap years), all equally likely.

How big must a class be to ensure P(at least one match) > 1/2?

Experiment:

• Arrange n students in random order
• Let Ai be the event that the i-th birthday does not match a previous one

P(at least one match) = 1 – P(no match)

 = 1 – P(A1 ∩ A2 ∩ A3 ∩ A4 ∩ ⋯ ∩ An)

 = 1 – P(A1)P(A2|A1)P(A3|A1,A2)P(A4|A1,A2,A3) ⋯ P(An|A1,…,An–1)

 = 1 – (1)(364/365)(363/365)(362/365) ⋯ ((365 – n + 1)/365)

Page 43

[Plot: probability of at least one match versus class size n (0 to 50); the curve crosses 1/2 at n = 23]
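The curve can be reproduced numerically; searching for the first n with P(match) > 1/2 (a sketch):

```python
def p_match(n: int) -> float:
    """P(at least one shared birthday among n people), 365 equally likely days."""
    p_no_match = 1.0
    for i in range(n):
        p_no_match *= (365 - i) / 365   # chain rule: i-th person avoids i used days
    return 1.0 - p_no_match

n = 1
while p_match(n) <= 0.5:
    n += 1
print(n, round(p_match(n), 3))   # 23 0.507
```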

Page 44

Law of Total Probability
{B1, …, Bn} is a partition iff (1) the Bi are mutually disjoint, and (2) ∪iBi = S.

Then P(A) = P(∪i(A ∩ Bi))

          = Σi P(A ∩ Bi)

⇒ "Law of Total Probability"

P(A) = Σi P(A|Bi)P(Bi)

Special case: P(A) = P(A|B)P(B) + P(A|Bc)P(Bc)

[Venn diagram: A intersecting the partition cells Bi]

Page 45

Law of Total Probability
"Divide and Conquer"

Given any partition B1, …, Bn,

P(A) = P(A|B1)P(B1) + P(A|B2)P(B2) + ⋯ + P(A|Bn)P(Bn).

Example: 10 coins in a bag:
2 Red coins with P(H|R) = 0.5
3 Blue coins with P(H|B) = 0.6
5 Green coins with P(H|G) = 0.7

Pick one at random

⇒ P(H) = P(H|R)P(R) + P(H|B)P(B) + P(H|G)P(G)

       = (0.5)(0.2) + (0.6)(0.3) + (0.7)(0.5)

       = 0.1 + 0.18 + 0.35 = 0.63
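The divide-and-conquer sum, spelled out (a sketch):

```python
priors = {"R": 0.2, "B": 0.3, "G": 0.5}    # P(color) for a coin picked at random
p_heads = {"R": 0.5, "B": 0.6, "G": 0.7}   # P(H | color)

p_h = sum(p_heads[c] * priors[c] for c in priors)   # law of total probability
print(round(p_h, 2))   # 0.63
```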

Page 46

Cheater Meter
Every test is scanned by a cheater meter.

P(flash|cheated) = 0.9.

P(flash|not cheated) = 0.2.

P(cheater) = 0.01.

Suppose a randomly chosen test flashes; how likely is it a cheat?

Let F = event that the test causes the light to flash.

Let C = event that the tester cheated.

We want P(C|F) = P(F ∩ C)/P(F)

 = P(F|C)P(C) / [P(F|C)P(C) + P(F|Cc)P(Cc)]

 = (0.9)(0.01) / [(0.9)(0.01) + (0.2)(0.99)]

 ≈ 0.043.

This is "Bayes Rule" or "Bayes Theorem."
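The Bayes computation as code (a sketch):

```python
p_flash_given_cheat = 0.9    # P(F | C)
p_flash_given_honest = 0.2   # P(F | not C)
p_cheat = 0.01               # P(C)

# Bayes rule: P(C|F) = P(F|C)P(C) / [P(F|C)P(C) + P(F|C^c)P(C^c)]
num = p_flash_given_cheat * p_cheat
p_c_given_f = num / (num + p_flash_given_honest * (1 - p_cheat))
print(round(p_c_given_f, 3))   # 0.043
```

Note that even though the meter is fairly accurate, most flashes are false alarms because cheaters are rare.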

Page 47

Bayes Theorem Summary

For any partition {A1, A2, …, An},

P(Ai|B) = P(B|Ai)P(Ai) / [P(B|A1)P(A1) + ⋯ + P(B|An)P(An)].

Bayes theorem turns the conditional probabilities around.

Page 48

Coins in a Bag
Example: 10 coins in a bag:
2 Red coins with P(H|R) = 0.5
3 Blue coins with P(H|B) = 0.6
5 Green coins with P(H|G) = 0.7

Find: P(G|H) = ?

Solution:

P(G|H) = P(H ∩ G)/P(H)

 = P(H|G)P(G) / [P(H|R)P(R) + P(H|B)P(B) + P(H|G)P(G)]

 = (0.7)(0.5) / [(0.5)(0.2) + (0.6)(0.3) + (0.7)(0.5)]

 = 0.35 / (0.1 + 0.18 + 0.35)

 = 0.35/0.63 ≈ 0.556.
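Recomputing with the stated numbers (a sketch; the denominator is the P(H) = 0.63 found by total probability, so P(G|H) = 0.35/0.63 ≈ 0.556):

```python
priors = {"R": 0.2, "B": 0.3, "G": 0.5}    # P(color)
p_heads = {"R": 0.5, "B": 0.6, "G": 0.7}   # P(H | color)

p_h = sum(p_heads[c] * priors[c] for c in priors)   # total probability: P(H)
posterior_g = p_heads["G"] * priors["G"] / p_h      # Bayes: P(G | H)
print(round(p_h, 2), round(posterior_g, 3))   # 0.63 0.556
```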