STAT 408 - Week 7: Advanced R Graphics and ggplot2
Montana State University (math.montana.edu/ahoegh/teaching/stat408/lecture...)

Outline: Advanced R Graphics; ggplot2

February 22, 2018

Advanced R Graphics

NCAA Basketball data

We will use data from the NCAA basketball tournament from 2011 to 2016.

library(dplyr)  # provides filter(), group_by(), summarise(), and %>%
url <- 'http://www.math.montana.edu/ahoegh/teaching/stat408/datasets/TourneyDetailedResults.csv'
hoops <- read.csv(url, stringsAsFactors = FALSE)
hoops.2011 <- filter(hoops, Season >= 2011)  # keep the 2011 season onward
head(hoops.2011)

##   Season Daynum Wteam Wscore Lteam Lscore Wloc Numot Wfgm Wfga Wfgm3 Wfga3
## 1   2011    134  1155     70  1412     52    N     0   26   50     4    13
## 2   2011    134  1421     81  1114     77    N     1   27   54     4    12
## 3   2011    135  1427     70  1106     61    N     0   23   54     4    16
## 4   2011    135  1433     59  1425     46    N     0   20   59     9    24
## 5   2011    136  1139     60  1330     58    N     0   22   54     7    26
## 6   2011    136  1140     74  1459     66    N     0   24   61     6    22
##   Wftm Wfta Wor Wdr Wast Wto Wstl Wblk Wpf Lfgm Lfga Lfgm3 Lfga3 Lftm Lfta
## 1   14   16   4  25   17  12   10    2  12   18   48    12    24    4    7
## 2   23   28   6  29   17  15    6    1  25   24   56     9    29   20   26
## 3   20   30  10  30   14  13    4    0  14   22   62     7    26   10   12
## 4   10   15  17  23   11   9    5    3  24   15   38     1     9   15   25
## 5    9   11  18  14   11  15    8    1  21   16   45     5    15   21   27
## 6   20   24  10  29   14  10    9    1  22   22   56     4    19   18   24
##   Lor Ldr Last Lto Lstl Lblk Lpf
## 1   7  22   10  19    4    3  10
## 2   7  26   19  16    6    2  24
## 3  11  28   13  12    6    3  22
## 4   5  26    4  15    2    6  20
## 5  13  16    8  15    9    2  17
## 6   7  29   12  13    4    4  20

Compute annual averages

points <- hoops.2011 %>%
  group_by(Season) %>%
  summarise(Win.Points = mean(Wscore),
            Lose.Points = mean(Lscore),
            Win.3Pt = mean(Wfgm3),
            Lose.3pt = mean(Lfgm3))
head(points)

## # A tibble: 6 x 5
##   Season Win.Points Lose.Points  Win.3Pt Lose.3pt
##    <int>      <dbl>       <dbl>    <dbl>    <dbl>
## 1   2011   73.19403    61.91045 7.179104 5.955224
## 2   2012   71.40299    61.52239 5.970149 5.925373
## 3   2013   72.20896    59.34328 6.701493 5.313433
## 4   2014   73.85075    62.92537 6.194030 5.462687
## 5   2015   72.92537    62.62687 6.343284 6.223881
## 6   2016   78.34328    65.35821 7.179104 6.522388
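The same group-then-average step can be tried without downloading anything. The sketch below is only an illustration: the data frame toy and its scores are made up, and base R's aggregate() is swapped in for the dplyr pipeline so the snippet has no package dependencies.

```r
# Made-up scores for two seasons; illustrative values, not tournament data.
toy <- data.frame(
  Season = c(2011, 2011, 2012, 2012),
  Wscore = c(70, 80, 60, 90),
  Lscore = c(50, 60, 55, 65)
)

# Average the winning and losing scores within each season.
toy.points <- aggregate(cbind(Wscore, Lscore) ~ Season, data = toy, FUN = mean)
names(toy.points) <- c('Season', 'Win.Points', 'Lose.Points')
toy.points
```

The result has one row per season, which is the shape the plotting code on the following slides expects.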

Plot types

[Figure: Average Points Scored by Losing Teams in NCAA Tournament; x-axis: Season (2011-2016), y-axis: Average Points Scored (0-60)]

Plot types

plot(points$Lose.Points~points$Season, ylim=c(0,max(points$Lose.Points)),
     ylab='Average Points Scored', xlab='Season',
     main='Average Points Scored by Losing Teams \n in NCAA Tournament',
     cex.lab=.8, cex.main = .8)

Plot types

[Figure: Average Points Scored by Losing Teams in NCAA Tournament; x-axis: Season (2011-2016), y-axis: Average Points Scored (0-60)]

Plot types

plot(points$Lose.Points~points$Season, ylim=c(0,max(points$Lose.Points)),
     ylab='Average Points Scored', xlab='Season', type='h', lwd=3,
     main='Average Points Scored by Losing Teams \n in NCAA Tournament',
     cex.lab=.8, cex.main = .8)

Lines

[Figure: Average Points Scored by Losing Teams in NCAA Tournament; x-axis: Season (2011-2016), y-axis: Average Points Scored (0-60)]

Lines

plot(points$Lose.Points~points$Season, ylim=c(0,max(points$Lose.Points)),
     ylab='Average Points Scored', xlab='Season', type='n',
     main='Average Points Scored by Losing Teams \n in NCAA Tournament',
     cex.lab=.8, cex.main = .8)
lines(points$Lose.Points~points$Season, col='red', lwd=1)
# ylim is based on Lose.Points, so the Win.Points line falls above the plotting region
lines(points$Win.Points~points$Season, col='blue', lwd=1, lty=2)

Lines and Legends

[Figure: Average Points Scored in NCAA Tournament; x-axis: Season (2011-2016), y-axis: Average Points Scored (0-80); legend: Winning Team, Losing Team]

Lines and Legends

plot(points$Lose.Points~points$Season, ylim=c(0,max(points$Win.Points)),
     ylab='Average Points Scored', xlab='Season', type='n',
     main='Average Points Scored in NCAA Tournament',
     cex.lab=.8, cex.main = .8)
lines(points$Lose.Points~points$Season, col='red', lwd=1)
lines(points$Win.Points~points$Season, col='blue', lwd=1, lty=2)
legend('bottomright', legend=c('Winning Team','Losing Team'), col=c('blue','red'),
       lwd=c(1,1), lty=c(2,1), cex = .7)
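A minimal, self-contained sketch of the lines-plus-legend pattern, assuming annual averages rounded from the tibble output shown earlier rather than the full data set. The vector names season, win, and lose are placeholders introduced here, and pdf(NULL) is used only so the snippet also runs in a non-interactive session.

```r
# Annual averages rounded to whole points (illustrative stand-ins).
season <- 2011:2016
win    <- c(73, 71, 72, 74, 73, 78)
lose   <- c(62, 62, 59, 63, 63, 65)

pdf(NULL)  # null graphics device: nothing is written to disk
# type = 'n' sets up the axes without drawing; ylim has to cover both series.
plot(lose ~ season, type = 'n', ylim = c(0, max(win, lose)),
     xlab = 'Season', ylab = 'Average Points Scored')
lines(lose ~ season, col = 'red', lty = 1)
lines(win ~ season, col = 'blue', lty = 2)
# Legend entries are matched by position, so col and lty follow the order of legend.
leg <- legend('bottomright', legend = c('Winning Team', 'Losing Team'),
              col = c('blue', 'red'), lty = c(2, 1), cex = .7)
dev.off()
```

The legend does not inspect what was drawn, so a mismatch between the legend vectors and the lines() calls goes unnoticed unless the orders are kept in sync by hand.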

Points

[Figure: Average Points Scored in NCAA Tournament; x-axis: Season (2011-2016), y-axis: Average Points Scored (0-80); legend: Average Winning Team, Average Losing Team, Individual Winning Team]

Points

plot(points$Lose.Points~points$Season, ylim=c(0,max(points$Win.Points)),
     ylab='Average Points Scored', xlab='Season', type='l', lwd=1, col='red',
     main='Average Points Scored in NCAA Tournament',
     cex.lab=.8, cex.main = .8)
lines(points$Win.Points~points$Season, col='blue', lwd=1, lty=2)
points(hoops.2011$Wscore~hoops.2011$Season, pch=16, col=rgb(0,0,.5,.1))
legend('bottomright',
       legend=c('Average Winning Team','Average Losing Team', 'Individual Winning Team'),
       col=c('blue','red',rgb(0,0,.5)),
       lwd=c(1,1,NA), lty=c(2,1,NA), pch=c(NA,NA,16), cex=.7)

Annotation

[Figure: Average Points Scored in NCAA Tournament; x-axis: Season (2011-2016), y-axis: Average Points Scored (0-80); annotation: Shot Clock Shortened; legend: Average Winning Team, Average Losing Team, Individual Winning Team]

Annotation

plot(points$Lose.Points~points$Season, ylim=c(0,max(points$Win.Points)),
     ylab='Average Points Scored', xlab='Season', type='l', lwd=1, col='red',
     main='Average Points Scored in NCAA Tournament',
     cex.lab=.8, cex.main = .8)
lines(points$Win.Points~points$Season, col='blue', lwd=1, lty=2)
points(hoops.2011$Wscore~hoops.2011$Season, pch=16, col=rgb(0,0,.5,.1))
text(2015, 40, 'Shot Clock Shortened', cex=.6)
arrows(x0=2015, y0=42, x1=2016, y1=70, length=0.1, lwd=2)
legend('bottomright',
       legend=c('Average Winning Team','Average Losing Team', 'Individual Winning Team'),
       col=c('blue','red',rgb(0,0,.5)),
       lwd=c(1,1,NA), lty=c(2,1,NA), pch=c(NA,NA,16), cex=.6)

Axes

[Figure: Average Points Scored in NCAA Tournament, drawn without axes; x-axis label: Season, y-axis label: Average Points Scored; legend: Average Winning Team, Average Losing Team]

Axes

plot(points$Lose.Points~points$Season, ylim=c(0,max(points$Win.Points)), axes=F,
     ylab='Average Points Scored', xlab='Season', type='b', pch=16, lwd=1, col='red',
     main='Average Points Scored in NCAA Tournament', cex.main=.75, cex.lab=.75)
lines(points$Win.Points~points$Season, col='blue', lwd=1, lty=2, type='b', pch=17)
legend('bottomright', legend=c('Average Winning Team','Average Losing Team'),
       col=c('blue','red'), lwd=1, lty=c(2,1), pch=c(17,16), cex=.7)

Axes

[Figure: Average Points Scored in NCAA Tournament; x-axis: Season, labelled 10-11 through 15-16; y-axis: Average Points Scored (0-80); legend: Average Winning Team, Average Losing Team]

Axes

plot(points$Lose.Points~points$Season, ylim=c(0,max(points$Win.Points)), axes=F,
     ylab='Average Points Scored', xlab='Season', type='b', pch=16, lwd=1, col='red',
     main='Average Points Scored in NCAA Tournament', cex.main=.75, cex.lab=.75)
lines(points$Win.Points~points$Season, col='blue', lwd=3, lty=2, type='b', pch=17)
legend('bottomleft', legend=c('Average Winning Team','Average Losing Team'),
       col=c('blue','red'), lwd=1, lty=c(2,1), pch=c(17,16), cex=.7)
axis(4)
axis(1, at = 2011:2016, labels=c('10-11','11-12','12-13','13-14','14-15','15-16'))

Axes

[Figure: Average Points Scored in NCAA Tournament; x-axis: Season, labelled 10-11 through 15-16; y-axis: Average Points Scored (0-80); legend: Average Winning Team, Average Losing Team]

Axes

plot(points$Lose.Points~points$Season, ylim=c(0,max(points$Win.Points)), axes=F,
     ylab='Average Points Scored', xlab='Season', type='b', pch=16, lwd=1, col='red',
     main='Average Points Scored in NCAA Tournament', cex.main=.75, cex.lab=.75)
lines(points$Win.Points~points$Season, col='blue', lwd=3, lty=2, type='b', pch=17)
legend('bottomleft', legend=c('Average Winning Team','Average Losing Team'),
       col=c('blue','red'), lwd=1, lty=c(2,1), pch=c(17,16), cex=.7)
axis(4)
axis(1, at = 2011:2016, labels=c('10-11','11-12','12-13','13-14','14-15','15-16'))
box()
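The custom-axis steps can be isolated in a short sketch. It assumes rounded annual averages in place of the real data, and season.labels is a helper name introduced here to build the '10-11' style labels instead of typing them out.

```r
season <- 2011:2016
lose   <- c(62, 62, 59, 63, 63, 65)  # rounded annual averages (stand-in values)

# Build labels such as '10-11' from the last two digits of adjacent years.
season.labels <- paste(substr(season - 1, 3, 4), substr(season, 3, 4), sep = '-')

pdf(NULL)  # null device so the sketch runs non-interactively
plot(lose ~ season, axes = FALSE, type = 'b', pch = 16,
     ylim = c(0, max(lose)), xlab = 'Season', ylab = 'Average Points Scored')
axis(1, at = season, labels = season.labels)  # side 1 = bottom, custom labels
axis(2)                                       # side 2 = left, default ticks
box()                                         # redraw the frame removed by axes = FALSE
dev.off()
```

Sides are numbered 1 (bottom), 2 (left), 3 (top), 4 (right), which is why the slide's axis(4) places the y-axis on the right.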

Superimposed Plots

plot(density(hoops.2011$Wscore), xlab='points', ylab='',
     main='histogram and superimposed density curve \n for points scored by winning team',
     lwd=3, cex.main=.8, cex.lab=.8)
hist(hoops.2011$Wscore, add=T, probability = T)

[Figure: histogram and superimposed density curve for points scored by winning team; x-axis: points (40-100), y-axis: density (0.00-0.04)]
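A sketch of why probability = TRUE matters when superimposing, assuming simulated scores as a stand-in for hoops.2011$Wscore. Here the histogram is drawn first and the curve added with lines(), which is an alternative ordering to the slide's hist(..., add = T).

```r
set.seed(408)
scores <- rnorm(200, mean = 73, sd = 10)  # simulated stand-in for winning scores

pdf(NULL)  # null device so the sketch runs non-interactively
# probability = TRUE puts bar heights on the density scale (total bar area is 1),
# the same scale that density() uses, so the two can share a y-axis.
h <- hist(scores, probability = TRUE, xlab = 'points', ylab = '',
          main = 'histogram and superimposed density curve')
d <- density(scores)
lines(d, lwd = 3)
dev.off()

sum(h$density * diff(h$breaks))  # total area of the bars
```

With the default count scale the bars would be hundreds of times taller than the curve, and the density line would sit flat along the bottom of the plot.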

Expression

plot(density(hoops.2011$Wscore), ylab=expression(beta[2]), xlab='',
     main='Examples with Expression', axes=F, type='n')
box()
text(70, .023, expression(sum(theta[i]^2, i=1, n)), cex=2)

[Figure: Examples with Expression; the y-axis label renders as $\beta_2$ and the text as $\sum_{i=1}^{n} \theta_i^2$]
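A further plotmath sketch, offered as an illustration rather than part of the lecture code: expression() handles fixed symbols, while bquote() can splice the current value of a variable into a label through .( ). The names n and lab are placeholders introduced here.

```r
n <- 10  # hypothetical sample size to splice into the label

# .(n) is replaced by the value of n; the rest is kept as plotmath syntax.
lab <- bquote(bar(x) == sum(x[i], i == 1, .(n)) / .(n))

pdf(NULL)  # null device so the sketch runs non-interactively
plot(0:1, 0:1, type = 'n', axes = FALSE,
     xlab = expression(mu), ylab = expression(sigma^2),
     main = expression(paste('Estimate of ', hat(beta)[1])))
box()
text(.5, .5, lab, cex = 2)
dev.off()
```

Inside plotmath, == draws an equals sign, [ ] a subscript, and ^ a superscript; demo(plotmath) lists the full set of supported constructs.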

R-Markdown Captions

Captions in R Markdown: set the caption with the fig.cap option in the R chunk header, e.g. {r, fig.cap='Write caption here'}.

Figure 1: Write caption here

Exercise: Advanced Plotting

Use the Seattle Housing data set (http://math.montana.edu/ahoegh/teaching/stat408/datasets/SeattleHousing.csv) to create an interesting graphic. Include informative titles and labels, and add an annotation.

Solution: Advanced Plotting

[Figure: Houses Sold in Seattle; histogram with x-axis: Sales Price (million $), ticks at 0, 1, 2.5, 4, 5.5, 7; annotation: Most homes sell for less than one million]

Solution: Advanced Plotting

Seattle.in <- read.csv('http://math.montana.edu/ahoegh/teaching/stat408/datasets/SeattleHousing.csv',
                       stringsAsFactors = F)

hist(Seattle.in$price, prob=T, breaks="FD", ylab='', col='forestgreen',
     xlab='Sales Price (million $)', main='Houses Sold in Seattle', axes=F)
axis(1, at = c(0,500000,1000000,2500000,4000000,5500000,7000000),
     labels = c('0','.5','1','2.5','4','5.5','7'))
arrows(x0=2500000, y0=1.5e-6, x1=1500000, y1=.5e-6, length=0.1, lwd=2)
text(2500000, 1.8e-6, 'Most homes sell for \n less than one million', cex=.8)

ggplot2

ggplot2 Overview

Why ggplot2?

Advantages of ggplot2:

  • consistent underlying grammar of graphics (Wilkinson, 2005)
  • plot specification at a high level of abstraction
  • very flexible
  • theme system for polishing plot appearance

Grammar of Graphics

The basic idea: independently specify plot building blocks and combine them to create just about any kind of graphical display you want.

Building blocks of a graph include:

  • data
  • aesthetic mapping
  • geometric object
  • statistical transformations
  • scales
  • coordinate system
  • position adjustments
  • faceting
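To make the building blocks concrete, here is a hedged sketch that names each one in a single plot. It uses R's built-in mtcars data rather than the tournament data, and the particular choices (jitter, a linear smooth, faceting by am) are arbitrary illustrations.

```r
library(ggplot2)

p <- ggplot(data = mtcars,                                 # data
            aes(x = wt, y = mpg, colour = factor(cyl))) +  # aesthetic mapping
  geom_point(position = 'jitter') +                        # geometric object + position adjustment
  stat_smooth(method = 'lm', se = FALSE) +                 # statistical transformation
  scale_y_continuous(name = 'Miles per gallon') +          # scale
  coord_cartesian(xlim = c(1, 6)) +                        # coordinate system
  facet_wrap(~ am)                                         # faceting
p
```

Each block can be changed independently, for example swapping facet_wrap(~ am) for facet_wrap(~ gear) without touching the other lines.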

ggplot2 VS Base Graphics

Compared to base graphics, ggplot2:

  • is more verbose for simple / canned graphics
  • is less verbose for complex / custom graphics
  • does not have methods (data should always be in a data.frame)
  • uses a different system for adding plot elements

Aesthetic Mapping

Aesthetics are things that you can see. Examples include:

  • position (i.e., on the x and y axes)
  • color (“outside” color)
  • fill (“inside” color)
  • shape (of points)
  • linetype
  • size

Aesthetic mappings are set with the aes() function.
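A small sketch of the difference between mapping an aesthetic to a variable and setting it to a constant, using the built-in mtcars data as a stand-in. The object names mapped and fixed are introduced here for illustration.

```r
library(ggplot2)

# Mapping: colour depends on a variable, so it goes inside aes() and gets a legend.
mapped <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(aes(colour = factor(cyl)))

# Setting: one fixed colour for every point goes outside aes(); no legend is drawn.
fixed <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(colour = 'firebrick4')

names(mapped$layers[[1]]$mapping)    # aesthetics mapped in the layer
fixed$layers[[1]]$aes_params$colour  # constant stored as a layer parameter
```

A common slip is writing aes(colour = 'firebrick4'), which treats the string as a one-level variable and picks a default palette colour instead of the named one.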

Geometric Objects (geom)

Geometric objects are the actual marks we put on a plot. Examples include:

  • points (geom_point)
  • lines (geom_line)
  • boxplot (geom_boxplot)

A plot must have at least one geom; there is no upper limit. You can add a geom to a plot using the + operator.
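The + operator returns a new plot object and leaves the original untouched, which is what lets a base plot be reused with different geoms. A minimal sketch with the built-in mtcars data follows; the object names are placeholders introduced here.

```r
library(ggplot2)

base        <- ggplot(mtcars, aes(wt, mpg))               # no geoms yet
with.points <- base + geom_point()                        # one layer
with.both   <- with.points + geom_smooth(method = 'lm')   # two layers

# Each + adds one layer; earlier objects keep their own layer count.
c(length(base$layers), length(with.points$layers), length(with.both$layers))
```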

Graphical Primitives/ ggplot

library(ggplot2)
graph.a <- ggplot(data = hoops.2011, aes(Lfgm, Wfgm))
graph.a

[Figure: axes only, no data drawn; x-axis: Lfgm (10-30), y-axis: Wfgm (15-40)]

Adding Geoms: geom_point()

graph.a + geom_point()

[Figure: x-axis: Lfgm (10-30), y-axis: Wfgm (15-40)]

Adding Geoms: geom_smooth()

graph.a + geom_point() + geom_smooth(method = 'loess')

[Figure: x-axis: Lfgm (10-30), y-axis: Wfgm (15-40)]

Adding Geoms: geom_rug()

graph.a + geom_point() + geom_smooth(method = 'loess') + geom_rug()

[Figure: x-axis: Lfgm (10-30), y-axis: Wfgm (15-40)]

Adding Geoms: geom_density2d()

graph.a + geom_point() + geom_smooth(method = 'loess') + geom_rug() + geom_density2d()

[Figure: x-axis: Lfgm (10-30), y-axis: Wfgm (15-40)]

Adding Geoms: geom_jitter()

graph.a + geom_rug() + geom_density2d() + geom_jitter()

[Figure: x-axis: Lfgm (10-30), y-axis: Wfgm (15-40)]

Adding Labels: labs()

graph.a + geom_rug() + geom_density2d() + geom_jitter() +
  labs(x='Losing Team Field Goals Made', y = 'Winning Team Field Goals Made')

[Figure: x-axis: Losing Team Field Goals Made (10-30), y-axis: Winning Team Field Goals Made (15-40)]

Scales: xlim() and ylim()

graph.a + geom_rug() + geom_density2d() + geom_jitter() +
  labs(x='Losing Team Field Goals Made', y = 'Winning Team Field Goals Made') +
  xlim(c(0,max(hoops.2011$Wfgm))) + ylim(c(0,max(hoops.2011$Wfgm)))

[Figure: x-axis: Losing Team Field Goals Made (0-40), y-axis: Winning Team Field Goals Made (0-40)]
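One caveat worth sketching: xlim() and ylim() set scale limits, so observations outside them are turned into missing values before anything is drawn, whereas coord_cartesian() only zooms the view. The limits on this slide widen the range, so nothing is lost there; the example below deliberately narrows the range on the built-in mtcars data to show the difference. The object names are placeholders.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

clipped <- base + xlim(c(2, 4))                    # scale limits: out-of-range x becomes NA
zoomed  <- base + coord_cartesian(xlim = c(2, 4))  # zoom only: every point is kept

# Count the points that still have a usable x position in each version.
n.clipped <- sum(!is.na(layer_data(clipped)$x))
n.zoomed  <- sum(!is.na(layer_data(zoomed)$x))
c(clipped = n.clipped, zoomed = n.zoomed)
```

The distinction matters most for layers that compute statistics, such as geom_smooth() or geom_density2d(), because with xlim() they are fitted only to the points that survive.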

Themes

graph.a + geom_point() + theme_bw() +
  labs(x='Losing Team Field Goals Made', y = 'Winning Team Field Goals Made')

[Figure: x-axis: Losing Team Field Goals Made (10-30), y-axis: Winning Team Field Goals Made (15-40)]

More about aes

graph.a + geom_jitter(col = 'firebrick4')

[Figure: x-axis: Lfgm (10-30), y-axis: Wfgm (15-40)]

More about aes

graph.a + geom_jitter(aes(col = as.factor(Season)))

[Figure: x-axis: Lfgm (10-30), y-axis: Wfgm (15-40); colour legend: as.factor(Season), 2011-2016]

More about aes

graph.a + geom_jitter(aes(col = as.factor(Season)), size=3, alpha=.4)

[Figure: x-axis: Lfgm (10-30), y-axis: Wfgm (15-40); colour legend: as.factor(Season), 2011-2016]

More about aes

graph.a + geom_jitter(aes(shape = as.factor(Season), col=Wscore), size=3, alpha=.4)

[Figure: x-axis: Lfgm (10-35), y-axis: Wfgm (15-40); shape legend: season; colour legend: Wscore (60-100)]

Faceting

graph.a + facet_wrap(~Season)

Faceting

graph.a + facet_wrap(~Season) + geom_jitter(alpha=.5, aes(color=Wfgm3))

[Figure: six panels, one per season (2011-2016); x-axis: Lfgm (10-30), y-axis: Wfgm (15-40); colour legend: Wfgm3 (0-16)]

Maps

library(maps)
usa <- map_data("usa")
usa.map <- ggplot() +
  geom_polygon(data = usa, aes(x=long, y = lat, group = group)) +
  coord_fixed(1.3)
usa.map

[Figure: map of the United States; x-axis: long (-120 to -80), y-axis: lat (25 to 50)]

Maps

labs <- data.frame(
  long = c(-111.0429, -105.2211, -80.4139, -91.7857, -121.7405),
  lat = c(45.6770, 39.7555, 37.2296, 43.3033, 38.5449),
  names = c("Bozeman, MT", "Golden, CO", "Blacksburg, VA", 'Decorah, IA', "Davis, CA"),
  stringsAsFactors = FALSE)

usa.map +
  geom_point(data = labs, aes(x = long, y = lat), color = "yellow", size = 4) +
  annotate(geom='text', y=44, x=-111, label='Bozeman,MT', col='white')

[Figure: map of the United States with the five locations marked and the label Bozeman,MT; x-axis: long (-120 to -80), y-axis: lat (25 to 50)]

More Maps: ggmap

library(ggmap)

## Google Maps API Terms of Service: http://developers.google.com/maps/terms.

## Please cite ggmap if you use it: see citation("ggmap") for details.

myMap <- get_map(location = c(lon = -110.8281, lat = 44.4605),
                 source = "google", maptype = "terrain", crop = FALSE, zoom = 7)

## Source : https://maps.googleapis.com/maps/api/staticmap?center=44.4605,-110.8281&zoom=7&size=640x640&scale=2&maptype=terrain&language=en-EN

# plot map
yellowstone.map <- ggmap(myMap)

More Maps: ggmap

yellowstone.map

[Figure: terrain map; x-axis: lon (-114 to -108), y-axis: lat (42 to 46)]

Exercise: ggplot2

Now use ggplot2 to create an interesting graph using the Seattle Housing data set.

Solution: ggplot2

[Figure: Seattle Housing Sales: Price vs. Square Footage Living Space; x-axis: Living Space (square foot), 0-12500; y-axis: Sales Price (million dollars), 0-7; annotation: Housing price depends on zipcode; colour legend: zipcode (98010, 98014, 98024, 98032, 98039, 98070, 98102, 98109, 98148)]

Solution: ggplot2

library(ggplot2)
Seattle.in$zipcode <- as.factor(Seattle.in$zipcode)
graph.a <- ggplot(data = Seattle.in, aes(sqft_living, price))

graph.a + geom_jitter(aes(col = zipcode)) +
  theme(plot.title = element_text(size=20)) +
  geom_smooth(method='loess') +
  ggtitle('Seattle Housing Sales: Price vs. Square Footage Living Space') +
  ylab('Sales Price (million dollars)') +
  xlab('Living Space (square foot)') +
  scale_y_continuous(breaks=c(seq(0,7000000,by=1000000)), labels=as.character(0:7)) +
  annotate('text', 3500, 6000000, label = 'Housing price depends on zipcode') +
  annotate("rect", xmin = 0, xmax = 7250, ymin = 5500000, ymax = 6500000, alpha = .6) +
  geom_segment(aes(x=3500, xend=3500, y=5500000, yend=3000000),
               arrow = arrow(length = unit(0.5, "cm")))

Solution: map

## Source : https://maps.googleapis.com/maps/api/staticmap?center=Seattle&zoom=8&size=640x640&scale=2&maptype=terrain&language=en-EN

## Source : https://maps.googleapis.com/maps/api/geocode/json?address=Seattle

[Figure: Location of Housing Sales in Seattle, WA; x-axis: lon (-124 to -121), y-axis: lat (46.5 to 48.5); size legend: price (2e+06, 4e+06, 6e+06)]

Solution: map

myMap <- get_map(location = 'Seattle', source = "google",
                 maptype = "terrain", crop = FALSE, zoom = 8)

Seattle.map <- ggmap(myMap)
Seattle.map +
  geom_point(data=Seattle.in, aes(x=long, y=lat, size=price), alpha=.25) +
  labs(title = 'Location of Housing Sales in Seattle, WA') +
  theme(plot.title = element_text(size=9))