Exploring Data with Statistical Graphics in R

Duncan Murdoch
Department of Statistical and Actuarial Sciences
University of Western Ontario

November 23, 2011

1 Statistical Graphics

2 About R
    R History
    What is R today?

3 Graphics in R
    Overview
    Examples

4 The rgl package

Example

- In STAT 1024, I use multiple choice exams. I use multiple versions of the test to make it harder to cheat.
- The software our department uses (developed by others) prints tables of correct answers for each version of the test; we can look at those to recognize problem questions.
- Doing it graphically is better...
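As a rough illustration of the idea, the sketch below tabulates answer choices by test version and draws one bar plot per version. The responses are simulated purely for illustration (the real exam data are not part of these slides); the number of students and the answer probabilities are made-up assumptions, and only the version labels come from the figure.

```r
# Simulated responses only: a hypothetical stand-in for real exam data.
set.seed(1)
versions <- c("350", "574", "790", "889")   # row labels seen in the figure
choices  <- c("A", "B", "C", "D", "E")
n <- 200                                     # hypothetical number of students

resp <- data.frame(
  version = factor(sample(versions, n, replace = TRUE), levels = versions),
  answer  = factor(sample(choices, n, replace = TRUE,
                          prob = c(0.05, 0.80, 0.05, 0.05, 0.05)),
                   levels = choices)
)

# Cross-tabulate, then convert each row (version) to proportions.
counts <- table(resp$version, resp$answer)
props  <- prop.table(counts, margin = 1)

# One small bar plot per version; a question that behaves differently
# in one version stands out at a glance.
op <- par(mfrow = c(2, 2))
for (v in versions)
  barplot(props[v, ], ylim = c(0, 1), main = paste("Version", v),
          ylab = "Proportion")
par(op)
```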

[Figure: one panel per question, showing the answers chosen (A to E) in rows labelled 350, 574, 790 and 889. Panels: Q23 (96% correct), Q21 (92% correct), Q4 (89% correct), Q3 (89% correct), Q5 (88% correct), Q14 (84% correct).]

[Figure: one panel per question, showing the answers chosen (A to E) in rows labelled 350, 574, 790 and 889. Panels: Q25 (40% correct), Q28 (39% correct), Q9 (38% correct), Q29 (33% correct), Q13 (33% correct), Q16 (15% correct).]

What is R?

- R is a software environment for statistical computing and graphics.
- It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.
- It is free (in both senses).
- Its web site is http://www.r-project.org.

First there was S...

S began as a project at Bell Laboratories in 1976, involving John Chambers, Rick Becker, Doug Dunn, Paul Tukey, and Graham Wilkinson.

We wanted users to be able to begin in an interactive environment, where they did not consciously think of themselves as programming. Then as their needs became clearer and their sophistication increased, they should be able to slide gradually into programming, when the language and system aspects would become more important.

— John Chambers, in Stages in the Evolution of S

The Birth of R

Around 1992, Ross Ihaka and Robert Gentleman at the University of Auckland needed statistical software for a teaching lab. S-PLUS (a commercial implementation of S) didn’t run on MacOS; they decided to write their own. Ross was very impressed with Scheme (a Lisp dialect); they based the initial code on the design of Scheme interpreters.

To make the interpreter useful, we had to add data structures to support statistical work and to choose a user interface. We wanted a command driven interface and, since we were both very familiar with S, it seemed natural to use an S-like syntax.

— Ross Ihaka, in R: Past and Future History

R is a Moving Target

- R has two “new feature” releases and approximately two “patch” releases per year.
- Since October 2004, there have been about 1800 NEWS entries about changes to R. Most of these are “tweaking”: making the code more consistent, bug-free, efficient, etc.
- We try to balance back-compatibility with progress: most old code will still work, but not all.

Contributed Packages!

- There are about 3350 contributed packages on CRAN (http://cran.r-project.org), 1100 packages in Bioconductor (http://www.bioconductor.org), and many other packages in private use.
- Publishing an R package has become an accepted (almost expected) way to disseminate statistical research.

Several Different Systems

R has two underlying graphics systems:
- Classic S graphics is an “ink on paper” model: functions add things to the current graphics device, possibly covering what was there, but only erasing the whole frame at once.
- Paul Murrell’s grid graphics is based on a hierarchical model of a scene, with transformations possible at each level. It is much more flexible than classic graphics, allowing objects to be changed, but is not itself interactive.
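The difference between the two models can be seen in a few lines. This is only a minimal sketch: the object name "disc" and the colours are arbitrary choices made for illustration.

```r
library(grid)

# Classic graphics: each call adds "ink" to the current device.
# The second layer paints over the first; nothing already drawn can be changed.
plot(cars$speed, cars$dist)
points(cars$speed, cars$dist, pch = 16, col = "red")

# grid: drawn objects (grobs) are kept on a display list and can be
# looked up by name and modified afterwards.
grid.newpage()
grid.circle(x = 0.5, y = 0.5, r = 0.2, name = "disc",
            gp = gpar(fill = "grey"))
grid.edit("disc", gp = gpar(fill = "blue"))   # edits the existing circle
```

The edit is made from code rather than with the mouse, which is the sense in which grid allows objects to be changed without being interactive.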

High Level Packages

- Classic high level graphics are supported by the graphics package.
- Deepayan Sarkar’s lattice package implements a version of Cleveland’s trellis graphics using grid.
- Hadley Wickham’s ggplot2 package implements a version of Leland Wilkinson’s Grammar of Graphics using grid.
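For comparison with the classic calls used in the rest of the talk, here is a small lattice sketch using the built-in cars data. It is an illustration only; lattice is used because it normally ships with R, whereas ggplot2 has to be installed separately.

```r
library(lattice)

# lattice functions build and return a "trellis" object rather than
# drawing directly; the object is drawn when it is printed.
p <- xyplot(dist ~ speed, data = cars, main = "Main title")
print(p)
```

At the console the object prints automatically; inside a script or function an explicit print() is needed to see the plot.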

Other Systems

- rggobi is an interface to the GGobi package for interactive graphics.
- rgl is a package that provides rotatable 3-D graphics, with some classic S-like plotting functions.
- scatterplot3d does static 3-D graphics within the classic graphics engine.
- iplots does interactive, linked graphics using Java.

Learning about Graphics

I won’t cover all the graphics systems today, only classic graphics and rgl.

- For more depth I highly recommend Paul Murrell’s book R Graphics to learn either classic S graphics or grid.
- Deepayan Sarkar’s Lattice: Multivariate Data Visualization with R describes that system.
- Other books on R are listed at http://www.r-project.org/doc/bib/R-books.html.

Scatterplots

> plot(cars$speed, cars$dist, col="blue",
+      main="Main title", xlim = c(0, 30))

[Figure: scatterplot titled "Main title"; x axis cars$speed (0 to 30), y axis cars$dist (0 to 120).]

Bar plots

> barplot(VADeaths, beside=TRUE, legend=TRUE,
+         ylab="Deaths per 1000", ylim=c(0, 90))

[Figure: grouped bar plot with groups Rural Male, Rural Female, Urban Male, Urban Female; legend 50-54, 55-59, 60-64, 65-69, 70-74; y axis "Deaths per 1000" (0 to 80).]
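Because classic graphics works by adding to the current plot, it can help to know that barplot() returns the horizontal midpoints of the bars. The sketch below uses them to write each value above its bar; the offset of 3 units and the text size are arbitrary choices for illustration.

```r
# barplot() invisibly returns the bar midpoints; with beside = TRUE
# this is a matrix with the same shape as the data.
mids <- barplot(VADeaths, beside = TRUE, legend = TRUE,
                ylab = "Deaths per 1000", ylim = c(0, 90))

# Add "ink" on top of the existing plot: one label per bar.
text(mids, VADeaths + 3, labels = round(VADeaths), cex = 0.6)
```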

Dot charts

> dotchart(VADeaths, xlab="Deaths per 1000",
+          xlim=c(0, 75))

[Figure: dot chart with groups Rural Male, Rural Female, Urban Male, Urban Female, each listing the age groups 50-54 to 70-74; x axis "Deaths per 1000" (0 to 60).]

Histograms

> x <- rnorm(10000)
> hist(x, breaks="Scott")

[Figure: "Histogram of x"; x axis x (-4 to 4), y axis Frequency (0 to 800).]
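To see what breaks="Scott" asks for, the sketch below computes Scott's bin width by hand and compares the implied number of bins with nclass.scott(). The constant 3.5 is the one used by nclass.scott() as far as I know; the seed is arbitrary and only makes the run reproducible.

```r
set.seed(42)                     # arbitrary seed, for reproducibility only
x <- rnorm(10000)

# Scott's rule: bin width proportional to sd * n^(-1/3).
h        <- 3.5 * sd(x) * length(x)^(-1/3)
k_manual <- ceiling(diff(range(x)) / h)   # number of bins implied by h
k_scott  <- nclass.scott(x)               # R's built-in version of the rule

# hist() treats this number as a suggestion and rounds to "pretty"
# breakpoints, so the final bin count can differ.
hh <- hist(x, breaks = "Scott", plot = FALSE)
c(manual = k_manual, nclass.scott = k_scott, used = length(hh$counts))
```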

QQ Plots

> x <- rt(1000, 3)
> qqnorm(x); qqplot(rnorm(1000), x)

[Figure, two panels: "Normal Q-Q Plot" with x axis Theoretical Quantiles (-3 to 3) and y axis Sample Quantiles (-15 to 15); a second Q-Q plot with x axis rnorm(1000) (-3 to 3) and y axis x (-15 to 15).]
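The normal Q-Q plot can be rebuilt by hand, which shows what it plots: the sorted sample against normal quantiles at evenly spaced probabilities. This is a sketch of the idea using ppoints(), not a replacement for qqnorm(); the seed is arbitrary.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
x <- rt(1000, 3)                  # heavy-tailed sample, as on the slide

# What qqnorm() computes, without drawing it.
qq <- qqnorm(x, plot.it = FALSE)

# The same thing by hand: normal quantiles at the plotting positions,
# paired with the sorted data.
theo <- qnorm(ppoints(length(x)))
plot(theo, sort(x),
     xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")
qqline(x)   # reference line through the quartiles
```

Points that curve away from the reference line at both ends are the signature of the heavy tails of the t distribution with 3 degrees of freedom.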

Box Plots

> boxplot(Sepal.Length ~ Species, data=iris)
> # More on the formula interface later!

[Figure: box plots for setosa, versicolor and virginica; y axis from 4.5 to 8.0.]
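A short sketch of what the formula does here: the left side is the variable summarised and the right side defines the groups. The numbers behind the boxes can be inspected with plot = FALSE and checked against a direct group-wise calculation.

```r
# Compute the box plot statistics without drawing anything.
bp <- boxplot(Sepal.Length ~ Species, data = iris, plot = FALSE)

# Row 3 of bp$stats holds the medians, one column per group.
med <- tapply(iris$Sepal.Length, iris$Species, median)
rbind(from_boxplot = bp$stats[3, ], from_tapply = as.numeric(med))

# The same formula style works with other classic functions.
plot(dist ~ speed, data = cars)
```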

Mosaic Plots

> mosaicplot(~ Sex + Age + Survived,
+            data = Titanic, shade = TRUE)

[Figure: mosaic plot titled "Titanic"; x axis Sex (Male, Female), y axis Age (Child, Adult), tiles split by survival (No, Yes); legend "Standardized Residuals" with bins <-4, -4:-2, -2:0, 0:2, 2:4, >4.]
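The shading is driven by residuals from a log-linear model. The sketch below assumes the default is the mutual independence model with Pearson residuals, and computes those residuals directly; it illustrates the idea under that assumption and does not reproduce the internals of mosaicplot().

```r
# Collapse the 4-way Titanic table over Class, keeping Sex, Age, Survived.
tab <- margin.table(Titanic, c(2, 3, 4))

# Fit the mutual independence model (each factor enters on its own).
fit <- loglin(tab, margin = list(1, 2, 3), fit = TRUE, print = FALSE)

# Pearson residuals: (observed - expected) / sqrt(expected).
res <- (tab - fit$fit) / sqrt(fit$fit)
round(res, 1)
```

Cells with large positive residuals have more people than independence would predict, and large negative residuals mean fewer.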

Maps

> library(mapdata)
> map("worldHires",
+     projection = "azequalarea",
+     orientation = c(43, -81.25, 0),
+     col = sample(terrain.colors(100)),
+     fill = TRUE)

The maps package is by Ray Brownrigg, based on S code by Rick Becker and Allan Wilks.

Correlation Plots—plotcorr() in ellipse package

[Figure: correlation plot of the variables wt, cyl, disp, hp, carb, qsec, gear, am, vs, drat and mpg, labelled in that order on both margins.]
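The slide gives no code for this plot. The variable labels match the built-in mtcars data ordered by correlation with mpg, so a plausible sketch is shown below. This is an assumption about how the figure was made, not the code from the talk, and the ellipse package must be installed separately.

```r
# Correlation matrix of mtcars, reordered by each variable's
# correlation with mpg (the first column).
corr.mtcars <- cor(mtcars)
ord <- order(corr.mtcars[1, ])
xc  <- corr.mtcars[ord, ord]

# ellipse is a contributed package, so only plot if it is available.
if (requireNamespace("ellipse", quietly = TRUE)) {
  ellipse::plotcorr(xc)
}
```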

The rgl package

Daniel Adler and I wrote this over the last few years. It’s a wrapper for the OpenGL graphics rendering library:

- Can display solid 3D objects, with real-time rotation, hidden surface removal, etc.
- High level functions corresponding to classic S 2D graphics: plot3d(), etc.
- Low level functions for flashy computer graphics effects.

> x <- sort(rnorm(1000)); y <- rnorm(1000)
> z <- rnorm(1000) + atan2(x,y)
> plot3d(x,y,z, col=rainbow(1000), type="s")

Now some demos...

Summary

- Graphical displays often give new insights into data.
- R provides a very flexible and rich system for producing graphics.
