STA258H5 - University of Torontonosedal/sta258/sta258-lec01-02.pdf · instructions given by the...

Preview:

Citation preview

STA258H5

Al Nosedaland Alison Weir

Winter 2017

Al Nosedal and Alison Weir STA258H5 Winter 2017 1 / 88

INTRODUCING R AND ASSESSING NORMALITY

Al Nosedal and Alison Weir STA258H5 Winter 2017 2 / 88

History of R

S: language for data analysis developed at Bell Labs circa 1976

Licensed by AT&T/Lucent to Insightful Corp. Product name: S-plus.

R: initially written and released as an open source software by RossIhaka and Robert Gentleman at U Auckland during 90s (R plays onname S)

Since 1997: international R-core team 1̃5 people & 1000s of codewriters and statisticians happy to share their libraries! AWESOME!

Al Nosedal and Alison Weir STA258H5 Winter 2017 3 / 88

But it’s open source...

Doesn’t that mean it’s lousy? NO!!

Provides hundreds of thousands of functions and algorithms

Lets users fix bugs and add functionality

It is CUTTING EDGE, statistically and every other way.

Ensures that researchers around the world - not just ones in richcountries - are the co-owners of the software tools needed to carry outresearch

Most of R is written in? R! This makes it quite easy to see whatfunctions are actually doing.

Al Nosedal and Alison Weir STA258H5 Winter 2017 4 / 88

What exactly is R?

R is used for data manipulation, statistics, and graphics.It is made of:

operators (+− < − ?) for calculations on vectors, arrays and matrices

a huge collection of functions

facilities for making unlimited types quality graphs

user contributed packages (sets of related functions)

the ability to interface with procedures written in C, C+, orFORTRAN and to write additional primitives.

Al Nosedal and Alison Weir STA258H5 Winter 2017 5 / 88

Advantages of R

Fast and free.

State of the art statistically

Only MATLAB has better graphics.

Lots of other programs use R (like Google).

Active user community

Excellent for simulation, programming, analysis ?

Forces you to think about your analysis.

Easy interface with database storage software

Al Nosedal and Alison Weir STA258H5 Winter 2017 6 / 88

Disadvantages of R

Not user friendly, minimal GUI.

Steep initial learning curve

No commercial support

Easy to make mistakes and not know.

Data prep and cleaning can be messy

Al Nosedal and Alison Weir STA258H5 Winter 2017 7 / 88

The R GUI

Al Nosedal and Alison Weir STA258H5 Winter 2017 8 / 88

Openning a script in R

Al Nosedal and Alison Weir STA258H5 Winter 2017 9 / 88

Homework

This course uses R. R is an open-source computing package which hasseen a huge growth in popularity in the last few years. R can bedownloaded from https://cran.r-project.org

Please, download R and bring your laptop next time.

Al Nosedal and Alison Weir STA258H5 Winter 2017 10 / 88

What is RStudio?

RStudio is a relatively new editor specially targeted at R. RStudio iscross-platform, free and open-source software.

Al Nosedal and Alison Weir STA258H5 Winter 2017 11 / 88

More homework: Obtaining RStudio

Just go to:

http://www.rstudio.com

download the corresponding file, execute it locally and follow theinstructions given by the installer.

Al Nosedal and Alison Weir STA258H5 Winter 2017 12 / 88

R Studio

Al Nosedal and Alison Weir STA258H5 Winter 2017 13 / 88

Getting your data into R Studio

1 Save data file as a something .csv file.

2 Click on Import Dataset in top right window

Al Nosedal and Alison Weir STA258H5 Winter 2017 14 / 88

Working in R Studio

Enter your commands in the console windowArithmetic Operations

2+3;

## [1] 5

3-2;

## [1] 1

3*2;

## [1] 6

Al Nosedal and Alison Weir STA258H5 Winter 2017 15 / 88

Working in R Studio

Enter your commands in the console windowArithmetic Operations

3/2;

## [1] 1.5

3^2;

## [1] 9

Al Nosedal and Alison Weir STA258H5 Winter 2017 16 / 88

Working in R Studio

Enter your commands in the console windowAssignment

To assign a value to a variable use < −Example:

a<-19;

Al Nosedal and Alison Weir STA258H5 Winter 2017 17 / 88

How to use help in R?

If you know which function you want help with simply use help. Example:

help(histogram);

Al Nosedal and Alison Weir STA258H5 Winter 2017 18 / 88

For now: R-Fiddle

R-Fiddle is a programming environment for R available online. It allows usto encode and to run a program written in R. The tool is available at thisURL: http://www.r-fiddle.org

Al Nosedal and Alison Weir STA258H5 Winter 2017 19 / 88

Old Faithful Geyser Data

DescriptionWaiting time between eruptions and the duration of the eruption for theOld Faithful geyser in Yellowstone National Park, Wyoming, USA.(A data frame with 272 observations on 2 variables.)

Al Nosedal and Alison Weir STA258H5 Winter 2017 20 / 88

R Fiddle

Al Nosedal and Alison Weir STA258H5 Winter 2017 21 / 88

R Fiddle

Al Nosedal and Alison Weir STA258H5 Winter 2017 22 / 88

R Fiddle

Al Nosedal and Alison Weir STA258H5 Winter 2017 23 / 88

R Fiddle

Al Nosedal and Alison Weir STA258H5 Winter 2017 24 / 88

Histogram

## Basic plot.

hist(faithful$eruptions);

Al Nosedal and Alison Weir STA258H5 Winter 2017 25 / 88

Histogram

Histogram of faithful$eruptions

faithful$eruptions

Fre

quen

cy

2 3 4 5

020

4060

Al Nosedal and Alison Weir STA258H5 Winter 2017 26 / 88

Histogram (with title)

## Nicer plot.

hist(faithful$eruptions,

main="Duration of Old Faithful Eruptions (min)");

Al Nosedal and Alison Weir STA258H5 Winter 2017 27 / 88

Histogram (with title)

Duration of Old Faithful Eruptions (min)

faithful$eruptions

Fre

quen

cy

2 3 4 5

020

4060

Al Nosedal and Alison Weir STA258H5 Winter 2017 28 / 88

Histogram

## Add axes labels and color.

hist(faithful$eruptions,

main="Duration of Old Faithful Eruptions (min)",

xlab="duration",ylab="count", col="blue");

Al Nosedal and Alison Weir STA258H5 Winter 2017 29 / 88

Histogram

Duration of Old Faithful Eruptions (min)

duration

coun

t

2 3 4 5

020

4060

Al Nosedal and Alison Weir STA258H5 Winter 2017 30 / 88

Boxplot

## Basic plot.

boxplot(faithful$eruptions);

Al Nosedal and Alison Weir STA258H5 Winter 2017 31 / 88

Boxplot

1.5

2.5

3.5

4.5

Al Nosedal and Alison Weir STA258H5 Winter 2017 32 / 88

Boxplot

## Add axes labels and color.

boxplot(faithful$eruptions,

main="Duration of Old Faithful Eruptions (min)",

xlab="duration",ylab="count", col="blue");

Al Nosedal and Alison Weir STA258H5 Winter 2017 33 / 88

Boxplot

1.5

2.5

3.5

4.5

Duration of Old Faithful Eruptions (min)

duration

coun

t

Al Nosedal and Alison Weir STA258H5 Winter 2017 34 / 88

Scatterplot

## Basic plot.

plot(faithful$eruptions,faithful$waiting);

Al Nosedal and Alison Weir STA258H5 Winter 2017 35 / 88

Scatterplot

●●

●●

●● ●

●●

●●

●●

● ●

●●

●●

●●

●●

●●

●●

● ●

● ●

●●

●●

●●●● ●

●●

●●

●●

● ●

● ●

1.5 2.5 3.5 4.5

5070

90

faithful$eruptions

faith

ful$

wai

ting

Al Nosedal and Alison Weir STA258H5 Winter 2017 36 / 88

Scatterplot

## Nicer plot.

plot(faithful$eruptions,faithful$waiting,

main="Eruption Duration vs Waiting Times (mins)",

xlab="duration",ylab="waiting time",

pch=19, col="blue");

Al Nosedal and Alison Weir STA258H5 Winter 2017 37 / 88

Scatterplot

●●

●●

●● ●

●●

●●

●●

● ●

●●

●●

●●

●●

●●

●●

● ●

● ●

●●

●●

●● ●● ●

●●

●●

●●

● ●

● ●

1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0

5070

90Eruption Duration vs Waiting Times (mins)

duration

wai

ting

time

Al Nosedal and Alison Weir STA258H5 Winter 2017 38 / 88

Making a panel of graphs

If you want more than one graph in a panel.

par(mfrow=c(nrow,ncol ) )

# where nrow= number of rows

# and ncol=number of columns;

Al Nosedal and Alison Weir STA258H5 Winter 2017 39 / 88

Panel of graphs

par(mfrow=c(1,2) )

hist(faithful$eruptions,

main="Duration (min)",

xlab="duration",ylab="count", col="blue");

hist(faithful$waiting,

main="Waiting (min)",

xlab="waiting time",ylab="count", col="red");

Al Nosedal and Alison Weir STA258H5 Winter 2017 40 / 88

Panel of graphs

Duration (min)

duration

coun

t

2 3 4 5

020

4060

Waiting (min)

waiting time

coun

t

40 70 100

010

3050

Al Nosedal and Alison Weir STA258H5 Winter 2017 41 / 88

Panel of graphs

prototype<-rnorm(1000,mean=0,sd=1);

par(mfrow=c(2,2) )

hist(prototype,

main="prototype",

col="orange");

hist(faithful$eruptions,

main="Duration (min)",

xlab="duration",ylab="count", col="blue");

hist(faithful$waiting,

main="Waiting (min)",

xlab="waiting time",ylab="count", col="red");

Al Nosedal and Alison Weir STA258H5 Winter 2017 42 / 88

Panel of graphs

prototype

prototype

Fre

quen

cy

−2 0 2 4

0

Duration (min)

duration

coun

t

2 3 4 5

0

Waiting (min)

waiting time

coun

t

40 60 80 100

0

Al Nosedal and Alison Weir STA258H5 Winter 2017 43 / 88

Problem

How much do people with a bachelor’s degree (but no higher degree)earn? Here are the incomes of 15 such people, chosen at random by theCensus Bureau in March 2002 and asked how much they earned in 2001.Most people reported their incomes to the nearest thousand dollars, so wehave rounded their responses to thousands of dollars: 110 25 50 50 55 3035 30 4 32 50 30 32 74 60.How could we find the ”typical” income for people with a bachelor’sdegree (but no higher degree)?

Al Nosedal and Alison Weir STA258H5 Winter 2017 44 / 88

Measuring center: the median

The median M is the midpoint of a distribution, the number such thathalf the observations are smaller and the other half are larger. To find themedian of the distribution:Arrange all observations in order of size, from smallest to largest.If the number of observations n is odd, the median M is the centerobservation in the ordered list. Find the location of the median bycounting n+1

2 observations up from the bottom of the list.If the number of observations n is even, the median M is the mean of thetwo center observations in the ordered list. Find the location of themedian by counting n+1

2 observations up from the bottom of the list.

Al Nosedal and Alison Weir STA258H5 Winter 2017 45 / 88

Income Problem (Median)

We know that if we want to find the median, M, we have to order ourobservations from smallest to largest: 4 25 30 30 30 32 32 35 50 50 50 5560 74 110. Let’s find the location of Mlocation of M = n+1

2 = 15+12 = 8

Therefore, M = x8 = 35 (x8= 8th observation on our ordered list).

Al Nosedal and Alison Weir STA258H5 Winter 2017 46 / 88

The quartiles Q1 and Q3

To calculate the quartiles:Arrange the observations in increasing order and locate the median M inthe ordered list of observations.The first quartile Q1 is the median of the observations whose position inthe ordered list is to the left of the location of the overall median.The third quartile Q3 is the median of the observations whose position inthe ordered list is to the right of the location of the overall median.

Al Nosedal and Alison Weir STA258H5 Winter 2017 47 / 88

Income Problem (Q1)

Data:4 25 30 30 30 32 32 35 50 50 50 55 60 74 110.From previous work, we know that M = x8 = 35.This implies that the first half of our data has n1 = 7 observations. Let usfind the location of Q1:location of Q1 = n1+1

2 = 7+12 = 4.

This means that Q1 = x4 = 30.

Al Nosedal and Alison Weir STA258H5 Winter 2017 48 / 88

Income Problem (Q3)

Data:4 25 30 30 30 32 32 35 50 50 50 55 60 74 110.From previous work, we know that M = x8 = 35.This implies that the first half of our data has n2 = 7 observations. Let usfind the location of Q3:location of Q3 = n2+1

2 = 7+12 = 4.

This means that Q3 = 55.

Al Nosedal and Alison Weir STA258H5 Winter 2017 49 / 88

Five-number summary

The five-number summary of a distribution consists of the smallestobservation, the first quartile, the median, the third quartile, and thelargest observation, written in order from smallest to largest. In symbols,the five-number summary ismin Q1 M Q3 MAX .

Al Nosedal and Alison Weir STA258H5 Winter 2017 50 / 88

Income Problem (five-number summary)

Data: 4 25 30 30 30 32 32 35 50 50 50 55 60 74 110. The five-numbersummary for our income problem is given by:4 30 35 55 110

Al Nosedal and Alison Weir STA258H5 Winter 2017 51 / 88

R Code

# Step 1. Entering Data;

income=c(4,25,30,30,30,32,32,35,50,50,50,55,60,74,110);

Al Nosedal and Alison Weir STA258H5 Winter 2017 52 / 88

R Code

# Step 2. Finding five-number summary;

fivenum(income);

Al Nosedal and Alison Weir STA258H5 Winter 2017 53 / 88

R Code

## [1] 4.0 30.0 35.0 52.5 110.0

Note. Sometimes, R will give you a slightly different five-number summary.

Al Nosedal and Alison Weir STA258H5 Winter 2017 54 / 88

Box plot

A boxplot is a graph of the five-number summary.A central box spans the quartiles Q1 and Q3.A line in the box marks the median M.Lines extended from the box out to the smallest and largest observations.

Al Nosedal and Alison Weir STA258H5 Winter 2017 55 / 88

Boxplot

par(mfrow=c(1,3) )

boxplot(prototype,

main="prototype",

col="orange");

boxplot(faithful$eruptions,

main="eruption duration ",

col="blue");

boxplot(faithful$waiting,

main="time between eruptions",

col="red");

Al Nosedal and Alison Weir STA258H5 Winter 2017 56 / 88

Boxplot

●●

−2

02

4

prototype

1.5

2.0

2.5

3.0

3.5

4.0

4.5

5.0

eruption duration

5060

7080

90

time between eruptions

Al Nosedal and Alison Weir STA258H5 Winter 2017 57 / 88

Boxplot

●●●

●●

−2

02

4

prototype

Should be:

Symmetric

≈ 5% outliers

Tails ≈ 1.5 IQR

Al Nosedal and Alison Weir STA258H5 Winter 2017 58 / 88

The 68-95-99.7 rule

In the Normal distribution with mean µ and standard deviation σ:Approximately 68% of the observations fall within σ of the mean µ.Approximately 95% of the observations fall within 2σ of µ.Approximately 99.7% of the observations fall within 3σ of µ.Note. The 68-95-99.7 rule is also know as the empirical rule.

Al Nosedal and Alison Weir STA258H5 Winter 2017 59 / 88

Example N(µ = 0, σ = 1)

−3 −2 −1 0 1 2 3

0.0

0.1

0.2

0.3

0.4

Roughly 68%

Al Nosedal and Alison Weir STA258H5 Winter 2017 60 / 88

Example N(µ = 0, σ = 1)

−3 −2 −1 0 1 2 3

0.0

0.1

0.2

0.3

0.4

Roughly 95%

Al Nosedal and Alison Weir STA258H5 Winter 2017 61 / 88

R Code

meanP<-mean(prototype);

sdP<-sqrt(var(prototype));

lower<-meanP-sdP;

upper<-meanP+sdP;

N<-length(prototype);

100*length(prototype[ lower<prototype & prototype<upper])/N;

## [1] 69.8

Al Nosedal and Alison Weir STA258H5 Winter 2017 62 / 88

R Code

meanP<-mean(prototype);

sdP<-sqrt(var(prototype));

lower<-meanP-2*sdP;

upper<-meanP+2*sdP;

N<-length(prototype);

100*length(prototype[ lower<prototype & prototype<upper])/N;

## [1] 94.9

Al Nosedal and Alison Weir STA258H5 Winter 2017 63 / 88

R Code

meanP<-mean(prototype);

sdP<-sqrt(var(prototype));

lower<-meanP-3*sdP;

upper<-meanP+3*sdP;

N<-length(prototype);

100*length(prototype[ lower<prototype & prototype<upper])/N;

## [1] 99.5

Al Nosedal and Alison Weir STA258H5 Winter 2017 64 / 88

R Code

eruptions<-faithful$eruptions;

meanE<-mean(eruptions);

sdE<-sqrt(var(eruptions));

lower<-meanE-sdE;

upper<-meanE+sdE;

N<-length(eruptions);

100*length(eruptions[ lower<eruptions & eruptions<upper])/N;

## [1] 55.14706

Al Nosedal and Alison Weir STA258H5 Winter 2017 65 / 88

R Code

eruptions<-faithful$eruptions;

meanE<-mean(eruptions);

sdE<-sqrt(var(eruptions));

lower<-meanE-2*sdE;

upper<-meanE+2*sdE;

N<-length(eruptions);

100*length(eruptions[ lower<eruptions & eruptions<upper])/N;

## [1] 100

Al Nosedal and Alison Weir STA258H5 Winter 2017 66 / 88

R Code

eruptions<-faithful$eruptions;

meanE<-mean(eruptions);

sdE<-sqrt(var(eruptions));

lower<-meanE-3*sdE;

upper<-meanE+3*sdE;

N<-length(eruptions);

100*length(eruptions[ lower<eruptions & eruptions<upper])/N;

## [1] 100

Al Nosedal and Alison Weir STA258H5 Winter 2017 67 / 88

Homework?

Modify R Code given above and see what happens with waiting time.

Al Nosedal and Alison Weir STA258H5 Winter 2017 68 / 88

Q-Q Plot (Example)

A sample of n = 10 observations gives the values in the following table:

Ordered Observations Probability levels Standard Normalx(j) (j − 1/2)/n Quantiles q(j)-1 0.05 -1.645

-0.10 0.15 -1.0360.16 0.25 -0.6740.41 0.35 -0.3850.62 0.45 -0.1250.80 0.55 0.1251.26 0.65 0.3851.54 0.75 0.6741.71 0.85 1.0362.30 0.95 1.645

Here, for example, P[Z ≤ 0.385] =∫ 0.385−∞

1√2πe−z

2/2dz = 0.65.

Al Nosedal and Alison Weir STA258H5 Winter 2017 69 / 88

Q-Q Plot (Example)

Let us now construct the Q-Q plot and comment on its appearance. TheQ-Q plot for the foregoing data, which is a plot of the ordered data x(j)against the normal quantiles is shown below. The pairs of points(q(j), x(j)) lie very nearly along a straight line, and we would not reject thenotion that these data are Normally distributed-particularly with a samplesize as small as n = 10.

Al Nosedal and Alison Weir STA258H5 Winter 2017 70 / 88

R Code

## Ordered observations;

obs<-c(-1,-0.1,0.16,0.41,0.62,0.80,1.26,1.54,1.71,2.30);

n<-length(obs);

## Corresponding probability values;

prob.levels<-(seq(1:n)-0.5)/n;

## Standard Normal Quantiles;

norm.quantiles<-qnorm(prob.levels);

Al Nosedal and Alison Weir STA258H5 Winter 2017 71 / 88

R Code

## Q-Q plot;

plot(norm.quantiles,obs,

xlab=expression(q[(j)]),

ylab=expression(x[(j)]),

main="Ours",col="blue",pch=19);

## Q-Q plot (using R function);

qqnorm(obs,col="blue",pch=19);

Al Nosedal and Alison Weir STA258H5 Winter 2017 72 / 88

R Code

●●

●●

●●

−1.5 −0.5 0.0 0.5 1.0 1.5

−1.

00.

01.

02.

0Ours

q(j)

x (j)

Al Nosedal and Alison Weir STA258H5 Winter 2017 73 / 88

R Code

## Q-Q plot (using R function);

qqnorm(obs,col="blue",pch=19);

Al Nosedal and Alison Weir STA258H5 Winter 2017 74 / 88

R Code

●●

●●

●●

−1.5 −1.0 −0.5 0.0 0.5 1.0 1.5

−1.

00.

01.

02.

0Normal Q−Q Plot

Theoretical Quantiles

Sam

ple

Qua

ntile

s

Al Nosedal and Alison Weir STA258H5 Winter 2017 75 / 88

Q-Q plots

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

● ●

● ●●●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●●

●●

●●

●●

●●

●●●

●●

● ●

●●

●●

●●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●

●●●

●●

●●●

●●●

●●

●●

●●

●●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●●

●●

●●●

●●●

●●

●●●●

●●

●●

●●

●●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●●

●●●

●●

●●

●●●

●●

●●

●●

●●●

●●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●●

●●●

●●

●●●

●●

●●●

●●

●●●

●●●

●●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●●

●●

●●

●●

●●●●

●●●●

●●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

−3 −2 −1 0 1 2 3

−3

−1

12

3prototype

Theoretical Quantiles

Sam

ple

Qua

ntile

s

Al Nosedal and Alison Weir STA258H5 Winter 2017 76 / 88

Q-Q plots

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

−3 −2 −1 0 1 2 3

1.5

2.5

3.5

4.5

Duration (min)

duration

coun

t

Al Nosedal and Alison Weir STA258H5 Winter 2017 77 / 88

Q-Q plots

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●●

●●

●●

●●

●●

●●

−3 −2 −1 0 1 2 3

5070

90Waiting (min)

waiting time

coun

t

Al Nosedal and Alison Weir STA258H5 Winter 2017 78 / 88

If plots are OK → data could come from a Normal distribution . . .but it could come from some other distribution. So good plots don’tprove data came from a Normal distribution

If plots are not OK → data probably does not come from a Normaldistribution, we can’t assume data is from Normal population

Al Nosedal and Alison Weir STA258H5 Winter 2017 79 / 88

Panel of graphs

n=30, N(0,1)

y1

Fre

quen

cy

−3 −1 1 2

0

n=30, N(0,1)

y2

Fre

quen

cy

−3 −1 1 3

0

n=30, N(0,1)

y3

Fre

quen

cy

−3 −1 1 2

0

n=30, N(0,1)

y4

Fre

quen

cy

−2 0 1 2

06

Al Nosedal and Alison Weir STA258H5 Winter 2017 80 / 88

Q-Q plots

●●

●●

●●●

−2 0 2

−2

−1

01

2n=30, N(0,1)

Theoretical Quantiles

Sam

ple

Qua

ntile

s

●●

●●

●●

−2 0 2

−2

−1

01

2

n=30, N(0,1)

Theoretical Quantiles

Sam

ple

Qua

ntile

s

Al Nosedal and Alison Weir STA258H5 Winter 2017 81 / 88

Q-Q plots

●●

●●

●●

●●

●●

●●

−2 0 2

−2

−1

01

n=30, N(0,1)

Theoretical Quantiles

Sam

ple

Qua

ntile

s

● ●

●●

●●

●●

●●

−2 0 2

−2

−1

01

2

n=30, N(0,1)

Theoretical Quantiles

Sam

ple

Qua

ntile

s

Al Nosedal and Alison Weir STA258H5 Winter 2017 82 / 88

Panel of graphs

n=70, N(0,1)

y1

Fre

quen

cy

−2 0 1 2 3

0

n=70, N(0,1)

y2

Fre

quen

cy

−4 −2 0 2

0

n=70, N(0,1)

y3

Fre

quen

cy

−3 −1 1 2

0

n=70, N(0,1)

y4

Fre

quen

cy

−2 0 1 2

0

Al Nosedal and Alison Weir STA258H5 Winter 2017 83 / 88

QQ plots

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

−2 0 2

−1

01

23

n=70, N(0,1)

Theoretical Quantiles

Sam

ple

Qua

ntile

s

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

−2 0 2

−3

−1

12

n=70, N(0,1)

Theoretical Quantiles

Sam

ple

Qua

ntile

s

Al Nosedal and Alison Weir STA258H5 Winter 2017 84 / 88

QQ plots

●●

●●

●●●

●●●

●●

●●

●●

●●●

●●

●●●●

−2 0 2

−2

01

2n=70, N(0,1)

Theoretical Quantiles

Sam

ple

Qua

ntile

s

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

−2 0 2

−1

01

2

n=70, N(0,1)

Theoretical Quantiles

Sam

ple

Qua

ntile

s

Al Nosedal and Alison Weir STA258H5 Winter 2017 85 / 88

Panel of graphs

n=120, N(0,1)

y1

Fre

quen

cy

−2 0 1 2

0

n=120, N(0,1)

y2

Fre

quen

cy

−4 −2 0 2

0

n=120, N(0,1)

y3

Fre

quen

cy

−3 −1 1 2

0

n=120, N(0,1)

y4

Fre

quen

cy

−4 −2 0 2

0

Al Nosedal and Alison Weir STA258H5 Winter 2017 86 / 88

QQ plots

●●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

−2 0 2

−2

−1

01

n=120, N(0,1)

Theoretical Quantiles

Sam

ple

Qua

ntile

s

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●●

●●

●●●

−2 0 2

−4

−2

01

2

n=120, N(0,1)

Theoretical Quantiles

Sam

ple

Qua

ntile

s

Al Nosedal and Alison Weir STA258H5 Winter 2017 87 / 88

QQ plots

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

−2 0 2

−3

−1

01

2n=120, N(0,1)

Theoretical Quantiles

Sam

ple

Qua

ntile

s

●●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

−2 0 2

−3

−1

12

n=120, N(0,1)

Theoretical Quantiles

Sam

ple

Qua

ntile

s

Al Nosedal and Alison Weir STA258H5 Winter 2017 88 / 88

Recommended