
This week and next - Northwestern University (xrm.phys.northwestern.edu/~jacobsen/phy103w2013/l9.pdf)


This week and next

• Today, Feb. 5
  • Statistics and epidemiology
  • Discussion of paper due tomorrow
  • The atom

• Thursday, Feb. 7
  • The atom and the nucleus

• Tuesday, Feb. 12 (CJ in London)
  • In-class exam

• Thursday, Feb. 14 (CJ in London)
  • Guest speaker: PhD student David Syndacker, on energy storage research and on the Initiative for Sustainability and Energy at Northwestern (isen.northwestern.edu) and the Northwestern Energy and Sustainability Consortium (nesc.northwestern.edu)

US Households

• Single family dwellings: 81 million (72%)

• Multifamily dwellings: 25 million (22%)

• Mobile homes: 6.7 million (6%)

US residential electricity use, in quads/year (1 quad = 1.06×10^18 Joules)

• Space heating: 0.28
• Space cooling: 0.87
• Water heating: 0.43
• Refrigeration: 0.37
• Cooking: 0.10
• Clothes dryers: 0.19
• Freezers: 0.08
• Lighting: 0.72
• Clothes washers: 0.03
• Dishwashers: 0.09
• TVs and set-top boxes: 0.33
• Computers and networks: 0.17
• Furnace pumps and fans: 0.14
• Other: 0.89

US natural gas use, quads/year (residential)

• Space heating: 3.40
• Water heating: 1.33
• Cooking: 0.22
• Clothes dryers: 0.05

• Fuel oil:
  • 0.55 space heating
  • 0.11 water heating

• Propane:
  • 0.26 space heating
  • 0.08 water heating
  • 0.03 cooking
  • 0.14 other

US total residential energy use (ignoring electrical losses), in quads/year

• Space heating: 4.97
• Space cooling: 0.87
• Water heating: 1.96
• Refrigerators & freezers: 0.45
• Cooking: 0.35
• Clothes dryers: 0.24
• Lighting: 0.72
• Clothes washers: 0.03
• Dishwashers: 0.09
• TVs & boxes: 0.33
• Computers & networks: 0.17
• Furnace fans & pumps: 0.14
• Other: 1.03

Total: 11.36

Electrical losses: 10.17
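As a quick check on the arithmetic, the end-use values for total residential energy use can be summed and converted to joules. This is only a minimal sketch: the dictionary copies the numbers from the slide, and the names `end_use_quads`, `total_quads`, and `share` are illustrative rather than anything from the lecture. The listed values sum to 11.35 quads/year, which agrees with the quoted total of 11.36 to within rounding.

```python
# End-use values in quads/year, copied from the slide on total residential
# energy use (electrical losses are not included).
end_use_quads = {
    "Space heating": 4.97,
    "Space cooling": 0.87,
    "Water heating": 1.96,
    "Refrigerators & freezers": 0.45,
    "Cooking": 0.35,
    "Clothes dryers": 0.24,
    "Lighting": 0.72,
    "Clothes washers": 0.03,
    "Dishwashers": 0.09,
    "TVs & boxes": 0.33,
    "Computers & networks": 0.17,
    "Furnace fans & pumps": 0.14,
    "Other": 1.03,
}

JOULES_PER_QUAD = 1.06e18  # 1 quad = 1.06 exajoules


def total_quads(values):
    """Sum all end uses, in quads/year."""
    return sum(values.values())


def share(values, name):
    """Fraction of the summed total taken by one end use."""
    return values[name] / total_quads(values)


total = total_quads(end_use_quads)
print(f"Sum of end uses: {total:.2f} quads/year")
print(f"In joules: {total * JOULES_PER_QUAD:.3g} J/year")
print(f"Space heating share: {share(end_use_quads, 'Space heating'):.1%}")
```

Space heating alone accounts for a bit over 40% of the summed total.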

Last year’s first exam

PHY 103 “Physics for Future Presidents” exam 1, April 24, 2012. You are allowed to use a calculator, and to have one page of notes (front and back side).

Some notes: a progression of exponents is kilo, mega, giga, tera, peta, exa. The Watt is the SI unit for power.

1. National statistics suggest that out of 500,000 professors, 28,850 suffer from yakoholism (for which the only effective treatment is regular attendance at On And On Anon meetings). A certain fine midwestern university has 1,890 faculty members, and 131 cases of yakoholism. Is this higher or lower than the value expected from national statistics? What are the chances that this is a statistical fluke, rather than a real effect caused by overexposure to The Color Purple? (The book, of course).

Answer: The expected rate is best estimated from the large national statistics:

\[
\frac{2.885 \times 10^{4}\ \text{cases}}{5.0 \times 10^{5}\ \text{people}} \cdot (1890\ \text{people}) = 109\ \text{cases}.
\]

About two-thirds of the time (really 68% of the time) you’ll find numbers of people within \( \sigma = \sqrt{109} = 10.4 \) of 109, or between 109-10=99 and 109+10=119. It’s only about 5% of the time that you’ll find values more than \( 2\sigma \) away from 109, that is below 109-2·10=89 or above 109+2·10=129 (about 2.3% of the time on each side). Thus there’s only about a 2% chance that the university has this many yakoholics due only to statistical chance; there’s probably something about the color purple that is truly associated with a decrease in brain function.

Last year’s first exam (continued)

2. In 2009, total US energy use amounted to 94.6 quads (remember that 1 quad=1.06 exajoules). If this were all to have been supplied by buying electricity at a cost of $0.086 per kWh, how much would it have cost?

Answer: Let’s first convert the cost to dollars per Joule, realizing that a Watt is a Joule per second:

\[
\left( \frac{8.6 \times 10^{-2}\ \$}{10^{3}\ \text{Joules} \cdot \text{hours}/\text{seconds}} \right) \cdot \left( \frac{1\ \text{hour}}{3.6 \times 10^{3}\ \text{seconds}} \right) = 2.4 \times 10^{-8}\ \$/\text{Joule}.
\]

If all of this energy had been supplied by buying electricity, the cost would have been

\[
(94.6\ \text{quads}) \cdot \left( \frac{1.06 \times 10^{18}\ \text{Joules}}{\text{quad}} \right) \cdot \left( \frac{2.4 \times 10^{-8}\ \$}{\text{Joule}} \right) = 2.4 \times 10^{12}\ \$
\]

or 2.4 trillion dollars.

3. Address one of the following questions, based on your notes. Include both quantitative information, and discussion.

a. What forms of energy supply can be ultimately traced back to the sun? Considering direct use of solar energy today, describe the fraction of USA needs that could potentially be supplied by solar energy, and discuss the challenges in making this happen.

b. What are the advantages of liquid chemical fuels over other energy “storage media”? What are the challenges in their long-term supply, outlook for reserves, and what might you predict for price and availability in the future?
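The unit conversion in the answer to question 2 can be checked with a few lines of Python. This is a minimal sketch using only the numbers quoted in the exam; the function and constant names are illustrative.

```python
JOULES_PER_KWH = 1.0e3 * 3.6e3  # 1 kW = 1e3 Joules/second, 1 hour = 3.6e3 seconds
JOULES_PER_QUAD = 1.06e18       # 1 quad = 1.06 exajoules


def dollars_per_joule(price_per_kwh):
    """Convert an electricity price in $/kWh to $/Joule."""
    return price_per_kwh / JOULES_PER_KWH


def total_cost_dollars(energy_quads, price_per_kwh):
    """Cost of buying energy_quads of energy at the given $/kWh price."""
    return energy_quads * JOULES_PER_QUAD * dollars_per_joule(price_per_kwh)


price = dollars_per_joule(0.086)
cost = total_cost_dollars(94.6, 0.086)
print(f"Price: {price:.1e} $/Joule")            # about 2.4e-08
print(f"Cost:  {cost / 1e12:.1f} trillion $")   # about 2.4
```

Keeping the conversion factors as named constants makes it easy to redo the estimate for a different price or a different year's energy use.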

This year’s first exam (Tuesday, Feb. 12, 4 pm)

• Similar, but different.

• One or two things where you might have to do some simple calculations.

• One or two short essay questions, where you should show a command of key facts and a thoughtful perspective conveyed in a precise discussion. Long opinion statements are less useful than perspectives based on your reflections on information discussed thus far.

• You can bring a single page of notes you have prepared in advance (info on front and back is fine).

• Bring a pen

• Bring your sharp wit and insight.

• Blue books will be provided.

• No cell phone use allowed during the exam.

Lies, damned lies, and statistics

• Origin of the quote? Unclear; popularized by Mark Twain.

• We want to understand ways to represent, and misrepresent, numerical information.

Coloring data by how we plot it

• Consumer price index on a linear scale

[Figure: CPI (linear axis, 0 to 250) versus year, 1920 to 2000]

Coloring data by how we plot it

• Consumer price index on a logarithmic scale

[Figure: CPI (logarithmic axis, 10 to 200) versus year, 1920 to 2000]

Compared to what?

• Oil prices

[Figure: oil price in $/barrel (0 to 100) versus year, 1850 to 2020, in dollars of the day]

Compared to what?

• Oil prices

[Figure: oil price in $/barrel (0 to 120) versus year, 1850 to 2020, in 2010 dollars]

Suppressing the zero

• Oil prices

[Figure: oil price in $/barrel versus year, 1970 to 2010, with the vertical axis running from 10 to 60 rather than starting at 0]

Should we see what we believe?

The constellation Andromeda

Finding patterns

100x100 pixel image, 500 stars per image (in random positions), random brightness per star

Poisson and Gaussian statistics

• Poisson distribution: if the average value over many, many tests is n̄, what’s the probability P of seeing n on one particular test?

\[
P(n, \bar{n}) = \frac{\bar{n}^{\,n}\, e^{-\bar{n}}}{n!}
\]

• Gaussian distribution (“bell curve”): same meaning for n, n̄, and P, but valid only for large n̄:

\[
P(n, \bar{n}) \approx \frac{1}{\sqrt{2\pi\bar{n}}} \exp\left[ -\frac{(n-\bar{n})^{2}}{2\bar{n}} \right]
\]

Comparing Poisson (exact) and Gaussian

• The Gaussian approximation works well for even very small values of n̄

[Figure: six panels of P(n, n̄) versus n, for n̄ = 1, 2, 5, 10, 20, and 50, comparing the exact Poisson result with the approximate Gaussian]
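The comparison between the exact Poisson result and the Gaussian approximation can be reproduced numerically. The sketch below assumes the standard textbook forms of the two distributions, with the Gaussian variance set equal to the mean; the function names `poisson` and `gaussian` are illustrative.

```python
import math


def poisson(n, mean):
    """Exact Poisson probability of n events when the average is mean."""
    # Work with logarithms so that large n does not overflow.
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def gaussian(n, mean):
    """Gaussian approximation with variance equal to the mean."""
    return math.exp(-((n - mean) ** 2) / (2.0 * mean)) / math.sqrt(2.0 * math.pi * mean)


for mean in (1, 2, 5, 10, 20, 50):
    n_values = range(0, 3 * mean + 1)
    worst = max(abs(poisson(n, mean) - gaussian(n, mean)) for n in n_values)
    peak = max(poisson(n, mean) for n in n_values)
    print(f"mean={mean:2d}: largest difference {worst:.4f}, peak probability {peak:.4f}")
```

Comparing the largest difference with the peak probability for each mean gives a rough sense of how quickly the approximation improves as the mean grows.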

Properties of Gaussian distributions

• The square root of the variance is called sigma, or σ. For counting statistics, σ is the square root of the mean number of events, or σ=√n̄.

[Figure: probability versus n (0 to 20) for n̄=10, where n̄-σ≈7 and n̄+σ≈13. 68% of events fall between these limits, 16% of events fall below, and 16% of events fall above.]

The error function erf(x)

• What fraction of events are more than x⋅σ away from the mean? This is described by something called the error function erf(x): for a Gaussian distribution, the fraction is 1-erf(x/√2).

[Figure: erf(x) versus x from 0 to 2.5, and 1-erf(x) versus x from 0 to 5 on a logarithmic axis from 10^-10 to 10^0]

x                1        2        3
1-erf(x/√2)      31.7%    4.55%    0.27%
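For a Gaussian distribution, the fraction of events more than x⋅σ from the mean can be evaluated with the standard-library `math.erfc`, the complement of the error function. A minimal sketch follows; the argument is divided by √2 because erf is defined for a Gaussian whose variance is 1/2 rather than 1, and the function names are illustrative.

```python
import math


def fraction_outside(x):
    """Fraction of events more than x sigma from the mean, on either side."""
    return math.erfc(x / math.sqrt(2.0))


def fraction_above(x):
    """Fraction of events more than x sigma above the mean (one side only)."""
    return 0.5 * fraction_outside(x)


for x in (1, 2, 3):
    print(f"x={x}: outside {fraction_outside(x):.2%}, above only {fraction_above(x):.2%}")
```

The one-sided version is the relevant one when asking how often a count comes out too high by chance, as in the disease-cluster examples.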

Statistical error, and systematic error

• Statistical error: usually given as 1σ. If you survey 1000 “yes-no” voters, and they are evenly split, then the statistical error on each answer is √1000≈31.6.
  • The fractional error is 31.6/1000, or 3.16/100, or about 3.2%.
  • Two thirds of the time (68%, actually), you’ll get within 3% of the same number of responses.
  • Why √1000 rather than √500, the square root of the number giving each answer? Because there is a strict correlation between yes and no voters. However, this is a subtlety that the president need not care about (and the statistical errors are dominated by the smaller number, so for diseases we only use the number of people with the disease rather than the number without), so for our purposes we’ll just use the number in one result, or √500≈22 in this case.

• Systematic error: non-random sampling, flaws in methodology, etc.
  • If you poll 1000 drum circle hangers-on at Occupy Michigan Avenue on their choice for President, you may have 3% statistical error but you will get a biased result.
  • Ditto with sampling 1000 Rush Limbaugh listeners.
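The numbers in the polling example can be worked out directly. The sketch below computes the simple counting errors quoted on the slide and, for comparison only, the textbook binomial standard deviation √(N·p·(1-p)), which is one way to account for the strict yes/no correlation. The binomial formula is not on the slide, and the function names are illustrative.

```python
import math


def counting_error(count):
    """Simple counting-statistics error: the square root of the count."""
    return math.sqrt(count)


def binomial_error(sample_size, fraction_yes):
    """Textbook binomial standard deviation of the number of yes answers."""
    return math.sqrt(sample_size * fraction_yes * (1.0 - fraction_yes))


sample_size = 1000
yes_votes = 500

print(f"sqrt(1000) = {counting_error(sample_size):.1f} "
      f"({counting_error(sample_size) / sample_size:.1%} of the sample)")
print(f"sqrt(500)  = {counting_error(yes_votes):.1f} "
      f"({counting_error(yes_votes) / sample_size:.1%} of the sample)")
print(f"binomial   = {binomial_error(sample_size, 0.5):.1f} "
      f"({binomial_error(sample_size, 0.5) / sample_size:.1%} of the sample)")
```

All three estimates are a few percent of the sample, which is the level of precision that matters for the argument here.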

Bayesian statistics

• One can follow chains of probability. Let’s say that failure 1 in an airplane happens 1% of the time, failure 2 happens 1% of the time, and you need both failures to occur to crash and die: 0.01·0.01=0.0001, or 0.01%, or 1 chance in 10,000.

• But let’s say that there is a 1% chance of failure 1, but a 1% chance for each of 10 different second failure modes. You now have a 0.01% chance of dying by one particular dual-failure sequence, but there are 10 different types of these sequences so your net chance of dying is 10·0.01%=0.1%.

• Safe design practices try to anticipate and correct all possible failure modes, but it is difficult to predict all possible failure/error sequences.

• Actual airplane crashes usually happen because of a sequence of multiple mistakes, all of which are unlikely to be exactly reproduced even though some other sequence might happen.
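The chain-of-probability arithmetic above can be written out as a short calculation. This sketch assumes that the failures are independent of one another, which the slide implies but does not state; the last line also shows the exact chance of the first failure combined with at least one of the ten second failures, which is slightly below the simple sum. All names are illustrative.

```python
def chance_of_sequence(*step_probabilities):
    """Chance that every step in a chain of independent failures occurs."""
    chance = 1.0
    for probability in step_probabilities:
        chance *= probability
    return chance


p_first = 0.01          # chance of failure 1
p_second = 0.01         # chance of any one second failure mode
n_second_modes = 10     # number of different second failure modes

single_sequence = chance_of_sequence(p_first, p_second)
summed = n_second_modes * single_sequence
# Exact version: failure 1 AND at least one of the second failure modes.
exact = p_first * (1.0 - (1.0 - p_second) ** n_second_modes)

print(f"One particular sequence: {single_sequence:.4%}")
print(f"Sum over {n_second_modes} sequences: {summed:.4%}")
print(f"Exact, allowing for overlap: {exact:.4%}")
```

For small probabilities the simple sum is an excellent approximation, which is why adding up the sequences works in the example.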


Airport scanners

• Airport scanners: radiation dose of about 0.02 µSieverts.

• Radiation: about 5 deaths per 10 billion people per year per µSievert.

• You take 50 flights and get 1 µSievert per year for 20 years. Your personal chance of dying early due to the extra radiation is

\[
\left( \frac{5}{10^{10}\ \mu\text{Sievert}} \right) \cdot \left( 1\ \frac{\mu\text{Sievert}}{\text{year}} \right) \cdot (20\ \text{years}) = 10^{-8},
\]

or one chance in 100 million.

• But there are 650 million passengers enplaned per year in the USA (www.transtats.bts.gov):
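The personal-risk estimate can be reproduced from the three numbers on the slide. This is a minimal sketch that assumes, as the slide does, that the risk scales linearly with the accumulated dose; the constant and function names are illustrative.

```python
DOSE_PER_SCAN_USV = 0.02                    # microsieverts per airport scan
DEATHS_PER_PERSON_PER_USV = 5.0 / 1.0e10    # 5 deaths per 10 billion people per microsievert


def yearly_dose_usv(flights_per_year):
    """Dose from scanners in one year, in microsieverts."""
    return flights_per_year * DOSE_PER_SCAN_USV


def lifetime_risk(flights_per_year, years):
    """Chance of dying early from the accumulated scanner dose."""
    total_dose = yearly_dose_usv(flights_per_year) * years
    return total_dose * DEATHS_PER_PERSON_PER_USV


risk = lifetime_risk(flights_per_year=50, years=20)
print(f"Yearly dose: {yearly_dose_usv(50):.2f} microsieverts")
print(f"Personal risk: {risk:.1e}, or about one chance in {1.0 / risk:,.0f}")
```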

Illusory superiority: a systematic error

Garrison Keillor’s Lake Wobegon, where all the women are strong, all the men are good looking, and all the children are above average.

[Figure: values shown are 87.5%, 77.1%, 92.7%, and 68.7%]


Poisson statistics and “clusters”

• Are there certain towns which have an environmental problem causing an increase in certain diseases?

[Figure: a 10x10 grid of places with an average of 5 cases per place, shown as a map of # per place (0 to 12) and as a histogram of # places with # cases (0 to 20) versus # cases (1 to 12, and more)]

• Half of the towns will have a greater-than-average disease incidence!

• About 16% will have an incidence above 1σ!

• Only about 2.3% will have an incidence above 2σ.

• This all assumes statistical variations only.
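For the 10x10 grid of places with an average of 5 cases each, the chance of a place exceeding the average, or exceeding it by 1σ or 2σ, can be computed exactly from the Poisson distribution. This is a sketch under the assumption of purely statistical (Poisson) variations with σ=√5; because counts are whole numbers and the mean is small, the exact fractions differ somewhat from the round Gaussian values. Function names are illustrative.

```python
import math


def poisson(n, mean):
    """Exact Poisson probability of n cases when the average is mean."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def fraction_above(threshold, mean):
    """Probability that a Poisson count is strictly greater than threshold."""
    at_or_below = sum(poisson(n, mean) for n in range(0, math.floor(threshold) + 1))
    return 1.0 - at_or_below


mean = 5.0
sigma = math.sqrt(mean)
places = 100

for label, threshold in (("the average", mean),
                         ("average + 1 sigma", mean + sigma),
                         ("average + 2 sigma", mean + 2.0 * sigma)):
    fraction = fraction_above(threshold, mean)
    print(f"Above {label} ({threshold:.1f} cases): {fraction:.1%}, "
          f"or about {fraction * places:.0f} of {places} places")
```

Even with no environmental cause at all, a map of 100 places is expected to contain a handful that look like clusters.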

Disease clusters?

√67=8

90-67=23

23/8=2.9

Air Canada pilots

Correlated and uncorrelated events

• If a family has two boys, what’s their chance of having their next child be a girl?

• 50%! For uncorrelated events, you “start the clock over” each time you “roll the dice”.

Small n statistics: summary

• From a large survey, you get an incidence rate with lower fractional error, like 28,850 yakoholics per 500,000 professors.

• Apply that to your small sample: (28850/500000)*1890 professors or n̄=109 expected yakoholics.

• For this expected number of 109, we would expect to have a standard deviation σ=√n̄ or √109=10.4.

• We expect that 68% of the time we’ll find a result between n̄-σ (or 109-10.4=99), and n̄+σ (or 109+10.4=119).

• We expect that 95% of the time we’ll find a result between n̄-2σ (or 109-20.8=88), and n̄+2σ (or 109+20.8=130).

• Now you actually found in this sample of 1890 professors that there were 131 yakoholics. That’s outside the 95% range, or more than 2σ from the expected value n̄, so this higher rate of yakoholics is likely to be “real” rather than just a statistical fluctuation (meaning the rate is truly higher than expected even with small n statistics accounted for).
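The steps in this summary can be sketched as a short calculation. It uses the Gaussian approximation to counting statistics with σ=√n̄ and reports the one-sided chance of a fluctuation at least as large as the one observed; the function names are illustrative and not from the lecture.

```python
import math


def expected_cases(national_cases, national_population, sample_size):
    """Scale the national incidence rate to a smaller sample."""
    return national_cases / national_population * sample_size


def sigmas_from_expected(observed, expected):
    """Number of standard deviations between observed and expected counts."""
    return (observed - expected) / math.sqrt(expected)


def chance_of_at_least(z):
    """Gaussian chance of a fluctuation z or more sigma above the mean."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


expected = expected_cases(28850, 500000, 1890)
sigma = math.sqrt(expected)
z = sigmas_from_expected(131, expected)
chance = chance_of_at_least(z)

print(f"Expected cases: {expected:.1f} +/- {sigma:.1f}")
print(f"Observed 131 cases is {z:.2f} sigma above expected")
print(f"Chance of this or more by statistical fluctuation alone: {chance:.1%}")
```

The same three functions apply to any small-sample comparison of this kind, such as the observed and expected case counts in a suspected disease cluster.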