Visualization Basics, Part I

webspace.ship.edu/lebryant/mat219/pdf/slides-02-vis-basics-I-part1.pdf

“The simple graph has brought more information to the data analyst’s mind than any other device.”

-John Tukey

A Grammar of Graphics

A small, concise language to describe the components of a graphic.

We can move beyond named graphics (e.g. scatterplot) and gain a deeper understanding.

www.gapminder.org

mpg dataset

Question

Confer with your group.

What relationship do you expect to see between engine size (displ) and mileage (hwy)?
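Before guessing, it can help to look at the two variables. The following is a minimal sketch, assuming the ggplot2 package (which ships the mpg data set) is installed:

```r
library(ggplot2)  # provides the mpg data set

# Show the first rows of the two variables named in the question
head(mpg[, c("displ", "hwy")])

# Numerical summaries: displ is engine displacement in litres,
# hwy is highway mileage in miles per gallon
summary(mpg$displ)
summary(mpg$hwy)
```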

Your Turn #1

Write and run this code in your script to make a graph. Pay strict attention to spelling, capitalization, and parentheses!

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))

1. “Initialize” a plot with ggplot()
2. Add layers with geom_ functions

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

Parts of the call: data (mpg); the + goes before the new line; type of layer (geom_point); aes(); x variable (displ); y variable (hwy)

A Template

ggplot(data = <DATA>) + <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
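As an illustration of filling in the template, the sketch below substitutes one value for each placeholder. The choice of cty (city mileage, another column of mpg) and the object name p are assumptions made for this example, not part of the slides:

```r
library(ggplot2)

# <DATA> = mpg, <GEOM_FUNCTION> = geom_point, <MAPPINGS> = x = cty, y = hwy
p <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = cty, y = hwy))

# Printing the stored plot object draws it
print(p)
```

Storing the plot in a variable is optional; running the two-line call on its own draws the same graph.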

Mappings

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))

Why do these cars get better gas mileage?

Aesthetics

size shape color

Visual Space (color)    Data Space (class)
red                     2seater
brown                   compact
green                   midsize
aqua                    minivan
blue                    pickup
violet                  subcompact
pink                    suv

Aesthetics

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = class))

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, size = class))

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, shape = class))

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, alpha = class))

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = class))

In color = class, color is the aesthetic property and class is the variable to map it to.

Legend added automatically

Your Turn #2

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))

In your script, edit the code above by adding color, size, alpha, and shape aesthetics to your graph. Experiment.

Do different things happen when you map aesthetics to categorical and numerical variables?

What happens when you add more aesthetics?

Figure: the color, size, and shape aesthetics, each mapped to a categorical and a numerical variable.
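One way to compare the two cases for the color aesthetic is sketched below. The use of cty as the numerical variable and the object names are assumptions made for this example; layer_data() is a ggplot2 helper that returns the values computed for each point.

```r
library(ggplot2)

# class is categorical (character); cty is numerical (integer)
p_cat <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))
p_num <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = cty))

# Count the distinct colours ggplot2 assigned in each plot
n_cat <- length(unique(layer_data(p_cat)$colour))
n_num <- length(unique(layer_data(p_num)$colour))

n_cat  # one colour per level of class
n_num  # colours taken from a continuous gradient

print(p_cat)
print(p_num)
```

With a categorical variable the legend lists discrete levels; with a numerical variable it is drawn as a continuous color bar.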

Set vs. Map

How can we make this plot?

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = class))

Inside aes(): maps the aesthetic to a variable

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

Outside aes(): sets the aesthetic to a value
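The difference between mapping and setting can also be checked in the computed plot data. This is a minimal sketch; the object names p_map and p_set are illustrative, and layer_data() is used only to inspect the colours ggplot2 assigned.

```r
library(ggplot2)

# Map: color is inside aes(), so it varies with the class variable
p_map <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))

# Set: color is outside aes(), so every point gets the same value
p_set <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

map_colours <- unique(layer_data(p_map)$colour)  # one colour per class
set_colours <- unique(layer_data(p_set)$colour)  # the single value "blue"

map_colours
set_colours
```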

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))

Uh-oh! color = "blue" should be moved outside aes().

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy), color = class)

Uh-oh! color = class should be moved inside aes().

Error messages can be weird. In this case, class has another meaning in R (it is also the name of a base R function), which leads to this error.
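The first mistake runs without an error, which makes it easy to miss. The sketch below, with illustrative object names, inspects what ggplot2 does with a constant placed inside aes(), and confirms the other meaning of class:

```r
library(ggplot2)

# "blue" inside aes() is treated as a variable with one level,
# not as a colour name
p_wrong <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))

# The points get a single colour from the default scale, not blue
wrong_colours <- unique(layer_data(p_wrong)$colour)
wrong_colours

# Outside aes(), the bare name class refers to this base R function
is.function(class)
```

The resulting plot also gains a legend with a single entry labelled "blue", a useful sign that a value was mapped instead of set.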

Geoms

How are these plots similar?

The x variable, y variable, and data

How are these plots different?

The geometric object (geom). This is the visual object used to represent the data.

geoms

ggplot(data = <DATA>) + <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
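To illustrate that only the geom changes, the sketch below draws the same data and mapping with two different geoms. The choice of geom_smooth() as the second geom and the object names are assumptions made for this example:

```r
library(ggplot2)

# Same data and same mapping in both plots
p_points <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
p_smooth <- ggplot(data = mpg) +
  geom_smooth(mapping = aes(x = displ, y = hwy))

# Each geom computes different columns for its visual object:
# the smooth layer adds a confidence band (ymin, ymax)
point_cols <- names(layer_data(p_points))
smooth_cols <- names(layer_data(p_smooth))

print(p_points)
print(p_smooth)
```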

geom_ functions

Each geom_ function requires a mapping argument.

geom_ function aesthetics

Each geom_ function has aesthetics that can be mapped to variables or set to values. Some aesthetics are required. The rest are optional.
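Besides the cheatsheet and the help pages, the required and optional aesthetics of a geom can be looked up from R. The sketch below inspects GeomPoint, the ggplot2 object behind geom_point(); relying on it is an implementation detail and an assumption of this example, not something the slides use.

```r
library(ggplot2)

# Aesthetics that geom_point() must be given
required <- GeomPoint$required_aes
required

# Optional aesthetics, which fall back to defaults when not supplied
optional <- names(GeomPoint$default_aes)
optional
```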

Your Turn #3

With your partner, decide how to replace this scatterplot with one that draws boxplots. Use the cheatsheet. Try your best guess.

ggplot(data = mpg) + geom_point(mapping = aes(x = class, y = hwy))

ggplot(data = mpg) + geom_boxplot(mapping = aes(x = class, y = hwy))

Your Turn #4

With your partner, make the histogram of hwy below. Use the cheatsheet. Hint: do not supply a y variable.

ggplot(data = mpg) + geom_histogram(mapping = aes(x = hwy))

ggplot(data = mpg) + geom_histogram(mapping = aes(x = hwy), binwidth = 2)
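The effect of binwidth can be checked by looking at the bins ggplot2 computes. This is a minimal sketch with illustrative object names; layer_data() returns one row per bin.

```r
library(ggplot2)

p_hist <- ggplot(data = mpg) +
  geom_histogram(mapping = aes(x = hwy), binwidth = 2)

# One row per bin: its left edge, right edge, and number of cars
bins <- layer_data(p_hist)
bins[, c("xmin", "xmax", "count")]

print(p_hist)
```

Every bin spans 2 units of hwy, and the counts add up to the number of rows in mpg.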
