53
Probability basics DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall17 Carlos Fernandez-Granda

Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Probability basics

DS GA 1002 Probability and Statistics for Data Sciencehttp://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall17

Carlos Fernandez-Granda

Page 2: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Probability spaces

Conditional probability

Independence

Page 3: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

General approach

Probabilistic modeling

1. Model phenomenon of interest as an experiment with several (possiblyinfinite) mutually exclusive outcomes

2. Group these outcomes in sets called events

3. Assign probabilities to the different events

Page 4: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Probability space

A probability space is a triple (Ω,F ,P) consisting of

I A sample space Ω, which contains all possible outcomes of theexperiment

I A set of events F , which must be a σ-algebra

I A probability measure P that assigns probabilities to the events in F

Page 5: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Sample space

Sample spaces can be

I Discrete: coin toss, score of a basketball game, number of people thatshow up at a party . . .

I Continuous: intervals of R or Rn used to model time, position,temperature, . . .

Page 6: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

σ-algebra

A σ-algebra F is a collection of sets in Ω such that

1. If a set S ∈ F then Sc ∈ F

2. If the sets S1,S2 ∈ F , then S1 ∪ S2 ∈ FAlso infinite sequences; if S1,S2, . . . ∈ F then ∪∞i=1Si ∈ F

3. Ω ∈ F

Page 7: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Basketball game

I Cleveland Cavaliers are playing the Golden State Warriors

I Sample space

Ω := Cavs 1−Warriors 0,Cavs 0−Warriors 1, . . . ,Cavs 101−Warriors 97, . . ..

I Several possible σ-algebras

I If we want high granularity we can choose the power set of scores

I If we only care who wins

F := Cavs win,Warriors win,Cavs or Warriors win, ∅

Page 8: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Probability measure

Function over the sets in F such that

1. P (S) ≥ 0 for any event S ∈ F

2. If S1,S2 ∈ F are disjoint then

P (S1 ∪ S2) = P (S1) + P (S2)

Also countably infinite sequences of disjoint sets: S1,S2, . . . ∈ F

P(limn→∞

∪ni=1Si

)= lim

n→∞

n∑i=1

P (Si )

3. P (Ω) = 1

Page 9: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Properties of a probability measure

I P (∅) = 0

I If A ⊆ B then P (A) ≤ P (B)

I P (A ∪ B) = P (A) + P (B)− P (A ∩ B)

I To simplify notation, we set

P (A,B,C ) := P (A ∩ B ∩ C )

Page 10: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Important

I Probability measure only assigns probabilities to events in theσ-algebra

I Simpler σ-algebras can make our life easy

P (Cavs win) =12

P (Warriors win) =12

P (Cavs or Warriors win) = 1

P (∅) = 0

Page 11: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Probability spaces

Conditional probability

Independence

Page 12: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Definition

The conditional probability of an event S ′ ∈ F given S is

P(S ′|S

):=

P (S ′ ∩ S)

P (S)

P (·|S) is a valid probability measure

To simplify notation, we set

P (S |A,B,C ) := P (S |A ∩ B ∩ C )

Page 13: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Example: Flights and rain

Probabilistic model for late arrivals at an airport

Ω = late and rain, late and no rain,on time and rain, on time and no rain

F = power set of Ω,

P (late, no rain) =220, P (on time, no rain) =

1420,

P (late, rain) =320, P (on time, rain) =

120

Page 14: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Example: Flights and rain

P (late|rain)

=P (late, rain)

P (rain)

=P (late, rain)

P (late, rain) + P (on time, rain)

=34

Page 15: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Example: Flights and rain

P (late|rain) =P (late, rain)

P (rain)

=P (late, rain)

P (late, rain) + P (on time, rain)

=34

Page 16: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Example: Flights and rain

P (late|rain) =P (late, rain)

P (rain)

=P (late, rain)

P (late, rain) + P (on time, rain)

=34

Page 17: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Example: Flights and rain

P (late|rain) =P (late, rain)

P (rain)

=P (late, rain)

P (late, rain) + P (on time, rain)

=34

Page 18: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Chain rule

For any pair of events A and B

P (A ∩ B) = P (A) P (B|A) = P (B) P (A|B)

For any sequence of events S1,S2, S3, . . .

P (∩iSi ) = P (S1) P (S2|S1) P (S3|S1 ∩ S2) . . .

=∏i

P(Si | ∩i−1

j=1 Sj

)

Page 19: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Law of Total Probability

If A1,A2, . . . ∈ F is a partition of Ω

I Ai and Aj are disjoint if i 6= j

I Ω = ∪iAi

For any set S ∈ F

P (S) =∑i

P (S ∩ Ai ) =∑i

P (Ai ) P (S |Ai )

Page 20: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Example: Flights and rain (continued)

P (rain) = 0.2P (late|rain) = 0.75

P (late|no rain) = 0.125

P (late)

= P (rain) P (late|rain) + P (no rain) P (late|no rain)

= 0.2 · 0.75 + 0.8 · 0.125 = 0.25

Page 21: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Example: Flights and rain (continued)

P (rain) = 0.2P (late|rain) = 0.75

P (late|no rain) = 0.125

P (late) = P (rain) P (late|rain) + P (no rain) P (late|no rain)

= 0.2 · 0.75 + 0.8 · 0.125 = 0.25

Page 22: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Example: Flights and rain (continued)

P (rain) = 0.2P (late|rain) = 0.75

P (late|no rain) = 0.125

P (late) = P (rain) P (late|rain) + P (no rain) P (late|no rain)

= 0.2 · 0.75 + 0.8 · 0.125 = 0.25

Page 23: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Important!

P (A|B) 6= P (B|A)

Page 24: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Bayes’ Rule

Let A1,A2, . . . ∈ F be a partition of Ω

For any set S ∈ F

P (Ai |S) =P (Ai ) P (S |Ai )∑j P (S |Aj) P (Aj)

Page 25: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Example: Flights and rain (continued)

P (rain) = 0.2P (late|rain) = 0.75P (late|no rain) = 0.125

P (rain|late) ?

Page 26: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Example: Flights and rain (continued)

P (rain|late)

=P (rain, late)

P (late)

=P (late|rain) P (rain)

P (late|rain) P (rain) + P (late|no rain) P (no rain)

=0.75 · 0.2

0.75 · 0.2 + 0.125 · 0.8= 0.6

Page 27: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Example: Flights and rain (continued)

P (rain|late) =P (rain, late)

P (late)

=P (late|rain) P (rain)

P (late|rain) P (rain) + P (late|no rain) P (no rain)

=0.75 · 0.2

0.75 · 0.2 + 0.125 · 0.8= 0.6

Page 28: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Example: Flights and rain (continued)

P (rain|late) =P (rain, late)

P (late)

=P (late|rain) P (rain)

P (late|rain) P (rain) + P (late|no rain) P (no rain)

=0.75 · 0.2

0.75 · 0.2 + 0.125 · 0.8= 0.6

Page 29: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Example: Flights and rain (continued)

P (rain|late) =P (rain, late)

P (late)

=P (late|rain) P (rain)

P (late|rain) P (rain) + P (late|no rain) P (no rain)

=0.75 · 0.2

0.75 · 0.2 + 0.125 · 0.8= 0.6

Page 30: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Probability spaces

Conditional probability

Independence

Page 31: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Definition

Two sets A,B are independent if

P (A|B) = P (A)

or equivalently

P (A ∩ B) = P (A) P (B)

Page 32: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Conditional independence

A,B are conditionally independent given C if

P (A|B,C ) = P (A|C )

where P (A|B,C ) := P (A|B ∩ C ), or equivalently

P (A ∩ B|C ) = P (A|C ) P (B|C )

Page 33: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Conditional independence does not imply independence

Probabilistic model for taxi availability, flight delay and weather

P (rain) = 0.2P (late|rain) = 0.75P (late|no rain) = 0.125P (taxi|rain) = 0.1P (taxi|no rain) = 0.6

Given rain and no rain, late and taxi are conditionally independent

Are they also independent? P (taxi) = P (taxi|late)?

Page 34: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Conditional independence does not imply independence

P (taxi)

= P (taxi|rain) P (rain) + P (taxi|no rain) P (no rain)

= 0.1 · 0.2 + 0.6 · 0.8 = 0.5

P (taxi|late) =P (taxi, late)

P (late)

=P (taxi, late, rain) + P (taxi, late, no rain)

P (late)

=P (t|l, r) P (l|r) P (r) + P (t|l, no r) P (l|no r) P (no r)

P (l)

=P (taxi|r) P (late|r) P (r) + P (taxi|no r) P (late|no r) P (no r)

P (late)

=0.1 · 0.75 · 0.2 + 0.6 · 0.125 · 0.8

0.25= 0.3

They are not independent

Page 35: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Conditional independence does not imply independence

P (taxi) = P (taxi|rain) P (rain) + P (taxi|no rain) P (no rain)

= 0.1 · 0.2 + 0.6 · 0.8 = 0.5

P (taxi|late) =P (taxi, late)

P (late)

=P (taxi, late, rain) + P (taxi, late, no rain)

P (late)

=P (t|l, r) P (l|r) P (r) + P (t|l, no r) P (l|no r) P (no r)

P (l)

=P (taxi|r) P (late|r) P (r) + P (taxi|no r) P (late|no r) P (no r)

P (late)

=0.1 · 0.75 · 0.2 + 0.6 · 0.125 · 0.8

0.25= 0.3

They are not independent

Page 36: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Conditional independence does not imply independence

P (taxi) = P (taxi|rain) P (rain) + P (taxi|no rain) P (no rain)

= 0.1 · 0.2 + 0.6 · 0.8 = 0.5

P (taxi|late) =P (taxi, late)

P (late)

=P (taxi, late, rain) + P (taxi, late, no rain)

P (late)

=P (t|l, r) P (l|r) P (r) + P (t|l, no r) P (l|no r) P (no r)

P (l)

=P (taxi|r) P (late|r) P (r) + P (taxi|no r) P (late|no r) P (no r)

P (late)

=0.1 · 0.75 · 0.2 + 0.6 · 0.125 · 0.8

0.25= 0.3

They are not independent

Page 37: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Conditional independence does not imply independence

P (taxi) = P (taxi|rain) P (rain) + P (taxi|no rain) P (no rain)

= 0.1 · 0.2 + 0.6 · 0.8 = 0.5

P (taxi|late)

=P (taxi, late)

P (late)

=P (taxi, late, rain) + P (taxi, late, no rain)

P (late)

=P (t|l, r) P (l|r) P (r) + P (t|l, no r) P (l|no r) P (no r)

P (l)

=P (taxi|r) P (late|r) P (r) + P (taxi|no r) P (late|no r) P (no r)

P (late)

=0.1 · 0.75 · 0.2 + 0.6 · 0.125 · 0.8

0.25= 0.3

They are not independent

Page 38: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Conditional independence does not imply independence

P (taxi) = P (taxi|rain) P (rain) + P (taxi|no rain) P (no rain)

= 0.1 · 0.2 + 0.6 · 0.8 = 0.5

P (taxi|late) =P (taxi, late)

P (late)

=P (taxi, late, rain) + P (taxi, late, no rain)

P (late)

=P (t|l, r) P (l|r) P (r) + P (t|l, no r) P (l|no r) P (no r)

P (l)

=P (taxi|r) P (late|r) P (r) + P (taxi|no r) P (late|no r) P (no r)

P (late)

=0.1 · 0.75 · 0.2 + 0.6 · 0.125 · 0.8

0.25= 0.3

They are not independent

Page 39: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Conditional independence does not imply independence

P (taxi) = P (taxi|rain) P (rain) + P (taxi|no rain) P (no rain)

= 0.1 · 0.2 + 0.6 · 0.8 = 0.5

P (taxi|late) =P (taxi, late)

P (late)

=P (taxi, late, rain) + P (taxi, late, no rain)

P (late)

=P (t|l, r) P (l|r) P (r) + P (t|l, no r) P (l|no r) P (no r)

P (l)

=P (taxi|r) P (late|r) P (r) + P (taxi|no r) P (late|no r) P (no r)

P (late)

=0.1 · 0.75 · 0.2 + 0.6 · 0.125 · 0.8

0.25= 0.3

They are not independent

Page 40: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Conditional independence does not imply independence

P (taxi) = P (taxi|rain) P (rain) + P (taxi|no rain) P (no rain)

= 0.1 · 0.2 + 0.6 · 0.8 = 0.5

P (taxi|late) =P (taxi, late)

P (late)

=P (taxi, late, rain) + P (taxi, late, no rain)

P (late)

=P (t|l, r) P (l|r) P (r) + P (t|l, no r) P (l|no r) P (no r)

P (l)

=P (taxi|r) P (late|r) P (r) + P (taxi|no r) P (late|no r) P (no r)

P (late)

=0.1 · 0.75 · 0.2 + 0.6 · 0.125 · 0.8

0.25= 0.3

They are not independent

Page 41: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Conditional independence does not imply independence

P (taxi) = P (taxi|rain) P (rain) + P (taxi|no rain) P (no rain)

= 0.1 · 0.2 + 0.6 · 0.8 = 0.5

P (taxi|late) =P (taxi, late)

P (late)

=P (taxi, late, rain) + P (taxi, late, no rain)

P (late)

=P (t|l, r) P (l|r) P (r) + P (t|l, no r) P (l|no r) P (no r)

P (l)

=P (taxi|r) P (late|r) P (r) + P (taxi|no r) P (late|no r) P (no r)

P (late)

=0.1 · 0.75 · 0.2 + 0.6 · 0.125 · 0.8

0.25= 0.3

They are not independent

Page 42: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Conditional independence does not imply independence

P (taxi) = P (taxi|rain) P (rain) + P (taxi|no rain) P (no rain)

= 0.1 · 0.2 + 0.6 · 0.8 = 0.5

P (taxi|late) =P (taxi, late)

P (late)

=P (taxi, late, rain) + P (taxi, late, no rain)

P (late)

=P (t|l, r) P (l|r) P (r) + P (t|l, no r) P (l|no r) P (no r)

P (l)

=P (taxi|r) P (late|r) P (r) + P (taxi|no r) P (late|no r) P (no r)

P (late)

=0.1 · 0.75 · 0.2 + 0.6 · 0.125 · 0.8

0.25= 0.3

They are not independent

Page 43: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Conditional independence does not imply independence

P (taxi) = P (taxi|rain) P (rain) + P (taxi|no rain) P (no rain)

= 0.1 · 0.2 + 0.6 · 0.8 = 0.5

P (taxi|late) =P (taxi, late)

P (late)

=P (taxi, late, rain) + P (taxi, late, no rain)

P (late)

=P (t|l, r) P (l|r) P (r) + P (t|l, no r) P (l|no r) P (no r)

P (l)

=P (taxi|r) P (late|r) P (r) + P (taxi|no r) P (late|no r) P (no r)

P (late)

=0.1 · 0.75 · 0.2 + 0.6 · 0.125 · 0.8

0.25= 0.3

They are not independent

Page 44: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Independence does not imply conditional independence

Probabilistic model for mechanical problems, weather and delays

P (rain) = 0.2P (late|rain) = 0.75P (late|no rain) = 0.125P (problem) = 0.1P (late|problem) = 0.7P (late|no problem) = 0.2P (late|no rain, problem) = 0.5

problem and no rain are independent

Are they also conditionally independent given late?P (problem|late, no rain) = P (problem|late) ?

Page 45: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Independence does not imply conditional independence

P (problem|late)

=P (late, problem)

P (late)

=P (late|p) P (p)

P (late|p) P (p) + P (late|no p) P (no p)

=0.7 · 0.1

0.7 · 0.1 + 0.2 · 0.9= 0.28

P (problem|late, no rain) =P (late, no rain, problem)

P (late, no rain)

=P (late|no rain, p) P (no rain|p) P (p)

P (late|no rain) P (no rain)

=P (late|no rain, problem) P (no rain) P (problem)

P (late|no rain) P (no rain)

=0.5 · 0.10.125

= 0.4

Page 46: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Independence does not imply conditional independence

P (problem|late) =P (late, problem)

P (late)

=P (late|p) P (p)

P (late|p) P (p) + P (late|no p) P (no p)

=0.7 · 0.1

0.7 · 0.1 + 0.2 · 0.9= 0.28

P (problem|late, no rain) =P (late, no rain, problem)

P (late, no rain)

=P (late|no rain, p) P (no rain|p) P (p)

P (late|no rain) P (no rain)

=P (late|no rain, problem) P (no rain) P (problem)

P (late|no rain) P (no rain)

=0.5 · 0.10.125

= 0.4

Page 47: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Independence does not imply conditional independence

P (problem|late) =P (late, problem)

P (late)

=P (late|p) P (p)

P (late|p) P (p) + P (late|no p) P (no p)

=0.7 · 0.1

0.7 · 0.1 + 0.2 · 0.9= 0.28

P (problem|late, no rain) =P (late, no rain, problem)

P (late, no rain)

=P (late|no rain, p) P (no rain|p) P (p)

P (late|no rain) P (no rain)

=P (late|no rain, problem) P (no rain) P (problem)

P (late|no rain) P (no rain)

=0.5 · 0.10.125

= 0.4

Page 48: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Independence does not imply conditional independence

P (problem|late) =P (late, problem)

P (late)

=P (late|p) P (p)

P (late|p) P (p) + P (late|no p) P (no p)

=0.7 · 0.1

0.7 · 0.1 + 0.2 · 0.9= 0.28

P (problem|late, no rain) =P (late, no rain, problem)

P (late, no rain)

=P (late|no rain, p) P (no rain|p) P (p)

P (late|no rain) P (no rain)

=P (late|no rain, problem) P (no rain) P (problem)

P (late|no rain) P (no rain)

=0.5 · 0.10.125

= 0.4

Page 49: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Independence does not imply conditional independence

P (problem|late) =P (late, problem)

P (late)

=P (late|p) P (p)

P (late|p) P (p) + P (late|no p) P (no p)

=0.7 · 0.1

0.7 · 0.1 + 0.2 · 0.9= 0.28

P (problem|late, no rain)

=P (late, no rain, problem)

P (late, no rain)

=P (late|no rain, p) P (no rain|p) P (p)

P (late|no rain) P (no rain)

=P (late|no rain, problem) P (no rain) P (problem)

P (late|no rain) P (no rain)

=0.5 · 0.10.125

= 0.4

Page 50: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Independence does not imply conditional independence

P (problem|late) =P (late, problem)

P (late)

=P (late|p) P (p)

P (late|p) P (p) + P (late|no p) P (no p)

=0.7 · 0.1

0.7 · 0.1 + 0.2 · 0.9= 0.28

P (problem|late, no rain) =P (late, no rain, problem)

P (late, no rain)

=P (late|no rain, p) P (no rain|p) P (p)

P (late|no rain) P (no rain)

=P (late|no rain, problem) P (no rain) P (problem)

P (late|no rain) P (no rain)

=0.5 · 0.10.125

= 0.4

Page 51: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Independence does not imply conditional independence

P (problem|late) =P (late, problem)

P (late)

=P (late|p) P (p)

P (late|p) P (p) + P (late|no p) P (no p)

=0.7 · 0.1

0.7 · 0.1 + 0.2 · 0.9= 0.28

P (problem|late, no rain) =P (late, no rain, problem)

P (late, no rain)

=P (late|no rain, p) P (no rain|p) P (p)

P (late|no rain) P (no rain)

=P (late|no rain, problem) P (no rain) P (problem)

P (late|no rain) P (no rain)

=0.5 · 0.10.125

= 0.4

Page 52: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Independence does not imply conditional independence

P (problem|late) =P (late, problem)

P (late)

=P (late|p) P (p)

P (late|p) P (p) + P (late|no p) P (no p)

=0.7 · 0.1

0.7 · 0.1 + 0.2 · 0.9= 0.28

P (problem|late, no rain) =P (late, no rain, problem)

P (late, no rain)

=P (late|no rain, p) P (no rain|p) P (p)

P (late|no rain) P (no rain)

=P (late|no rain, problem) P (no rain) P (problem)

P (late|no rain) P (no rain)

=0.5 · 0.10.125

= 0.4

Page 53: Probability basics - Home Page | NYU Courantcfgranda/pages/DSGA1002... · Probability basics DS GA 1002 Probability and Statistics for Data Science cfgranda/pages/DSGA1002_fall17

Independence does not imply conditional independence

P (problem|late) =P (late, problem)

P (late)

=P (late|p) P (p)

P (late|p) P (p) + P (late|no p) P (no p)

=0.7 · 0.1

0.7 · 0.1 + 0.2 · 0.9= 0.28

P (problem|late, no rain) =P (late, no rain, problem)

P (late, no rain)

=P (late|no rain, p) P (no rain|p) P (p)

P (late|no rain) P (no rain)

=P (late|no rain, problem) P (no rain) P (problem)

P (late|no rain) P (no rain)

=0.5 · 0.10.125

= 0.4