Visualization Basics: Part I
“The simple graph has brought more information to the data analyst’s mind than any other device.”
-John Tukey
A Grammar of Graphics
A small, concise language to describe the components of a graphic.
We can move beyond named graphics (e.g. scatterplot) and gain a deeper understanding.
www.gapminder.org
mpg dataset
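Before plotting, it can help to look at what the mpg dataset contains. The following is a minimal sketch that assumes the ggplot2 package is installed (mpg ships with it); the choice of columns to preview is only for illustration.

```r
library(ggplot2)  # mpg is bundled with ggplot2

# Size of the dataset: one row per car model, one column per variable
dim(mpg)

# The three variables used most in this section
head(mpg[, c("displ", "hwy", "class")])

# class is categorical; list its distinct values
sort(unique(mpg$class))
```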
Question
Confer with your group.
What relationship do you expect to see between engine size (displ) and mileage (hwy)?
Your Turn #1
Write and run this code in your script to make a graph. Pay strict attention to spelling, capitalization, and parentheses!
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))
1. “Initialize” a plot with ggplot()
2. Add layers with geom_ functions
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))
data: mpg
+ goes before the new line
type of layer: geom_point
aes(): x variable (displ), y variable (hwy)
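The note about putting the + before the new line matters once the call is split across several lines. A minimal sketch, assuming ggplot2 is installed; the object name p is only for illustration.

```r
library(ggplot2)

# The + ends the first line, so R knows the expression continues
p <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

p  # printing the object draws the plot
```

If the + started the second line instead, R would treat the first line as a complete expression and the point layer would not be added to it.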
A Template
ggplot(data = <DATA>) + <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
Mappings
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))
Why do these cars get better gas mileage?
Aesthetics
size shape color
Visual Space (color)    Data Space (class)
red                     2seater
brown                   compact
green                   midsize
aqua                    minivan
blue                    pickup
violet                  subcompact
pink                    suv
Aesthetics
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = class))
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, size = class))
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, shape = class))
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, alpha = class))
aesthetic property
variable to map it to
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = class))
Legend added automatically
Your Turn 2
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))
In your script, edit the code above by adding color, size, alpha, and shape aesthetics to your graph. Experiment.
Do different things happen when you map aesthetics to categorical and numerical variables?
What happens when you add more aesthetics?
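One way to explore the categorical versus numerical question is to map color to one variable of each kind and compare the results. This is a sketch, assuming ggplot2 is installed; the choice of cty as the numerical variable and the names p_cat and p_num are illustrative and do not come from the slides.

```r
library(ggplot2)

# class is categorical: each class gets its own distinct color
p_cat <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))

# cty (city mileage) is numerical: color varies along a continuous gradient
p_num <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = cty))

p_cat
p_num

# Count the distinct colors actually drawn in the categorical plot
length(unique(layer_data(p_cat)$colour))
```

The same comparison can be repeated with size, alpha, and shape in place of color.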
Figure: color, size, and shape, each mapped to a categorical and a numerical variable.
Set vs. Map
How can we make this plot?
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = class))
Inside aes(): maps the aesthetic to a variable
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy), color = "blue")
Outside aes(): sets the aesthetic to a value
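The difference between mapping and setting can also be seen by inspecting the layer objects themselves. A minimal sketch, assuming ggplot2 is installed; the names mapped and fixed are illustrative, and looking at a layer's mapping and aes_params fields is an exploratory trick rather than something the slides use.

```r
library(ggplot2)

# Mapped: color is inside aes(), so it is tied to the class variable
mapped <- geom_point(mapping = aes(x = displ, y = hwy, color = class))

# Set: color is outside aes(), so it is one fixed value for every point
fixed <- geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

names(mapped$mapping)    # includes "colour" (ggplot2 stores the British spelling)
names(fixed$mapping)     # only the x and y mappings
fixed$aes_params$colour  # the fixed value supplied outside aes()
```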
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = class))
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy), color = "blue")
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))
Uh-oh! This should be moved outside aes()
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy), color = class)
Uh-oh! This should be moved inside aes()
Error messages can be weird. In this case, class has another meaning in R (it is also the name of a base R function) that leads to this error.
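The failure can be reproduced safely by catching the error instead of letting it stop the script. This is a sketch, assuming ggplot2 is installed; the names p and result are illustrative, and the exact error text is not shown because it depends on the ggplot2 version.

```r
library(ggplot2)

# color = class sits outside aes(), so R looks for an object named class
# and finds the base function class() rather than the mpg column
is.function(class)

result <- tryCatch(
  {
    p <- ggplot(data = mpg) +
      geom_point(mapping = aes(x = displ, y = hwy), color = class)
    ggplot_build(p)  # building the plot is what printing would trigger
  },
  error = function(e) e
)

# Show the message only if an error was actually caught
if (inherits(result, "error")) message(conditionMessage(result))
```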
Geoms
How are these plots similar?
The x variable, y variable, and data
How are these plots different?
The geometric object (geom). This is the visual object used to represent the data.
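To see the same data and mappings drawn with two different geoms, something like the following could be tried. It is a sketch, assuming ggplot2 is installed; geom_smooth() is my choice for the second geom, since the original figures are not reproduced in the text, and the names p_points and p_smooth are illustrative.

```r
library(ggplot2)

# Same data, same x and y mappings; only the geom differs
p_points <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

p_smooth <- ggplot(data = mpg) +
  geom_smooth(mapping = aes(x = displ, y = hwy))

p_points  # each car drawn as a point
p_smooth  # a smoothed trend line fitted to the same values
```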
geoms
ggplot(data = <DATA>) + <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
geom_ functions
Each geom_ function requires a mapping argument
geom_ function aesthetics
Each geom_ function has aesthetics that can be mapped to variables, or set to values.
Some aesthetics are required. The rest are optional.
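Besides the cheatsheet, the required and optional aesthetics of a geom can be looked up from R itself. A minimal sketch, assuming ggplot2 is installed; GeomPoint is the object ggplot2 exports behind geom_point(), and querying it directly is an exploratory shortcut rather than part of the slides.

```r
library(ggplot2)

# Aesthetics a point layer cannot be drawn without
GeomPoint$required_aes

# Every aesthetic geom_point() understands, required and optional
GeomPoint$aesthetics()
```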
Your Turn 3
With your partner, decide how to replace this scatterplot with one that draws boxplots. Use the cheatsheet. Try your best guess.
ggplot(data = mpg) + geom_point(mapping = aes(x = class, y = hwy))
ggplot(data = mpg) + geom_boxplot(mapping = aes(x = class, y = hwy))
Your Turn 4
With your partner, make the histogram of hwy below. Use the cheatsheet. Hint: do not supply a y variable.
ggplot(data = mpg) + geom_histogram(mapping = aes(x = hwy))
ggplot(data = mpg) + geom_histogram(mapping = aes(x = hwy), binwidth = 2)
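To compare what the two histogram calls compute, the binned data can be pulled out of each plot. This is a sketch, assuming ggplot2 is installed; the object names are illustrative, and layer_data() is used here only as an inspection aid.

```r
library(ggplot2)

# Default binning (ggplot2 prints a message suggesting you pick a binwidth)
p_default <- ggplot(data = mpg) +
  geom_histogram(mapping = aes(x = hwy))

# Bins that are each 2 mpg wide
p_width2 <- ggplot(data = mpg) +
  geom_histogram(mapping = aes(x = hwy), binwidth = 2)

bins_default <- layer_data(p_default)
bins_width2 <- layer_data(p_width2)

# Number of bars in each histogram
nrow(bins_default)
nrow(bins_width2)

# Both histograms count every car exactly once
sum(bins_default$count)
sum(bins_width2$count)
```

Setting binwidth outside aes() follows the same set-versus-map idea as color = "blue": it is a fixed value for the layer, not a variable in the data.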