Data Science: Advanced-R Boot Camp
Data Reshaping and Subsetting
Chuck Cartledge, PhD
23 February 2020
Intro. Simple ways Data frame reshaping Hands-on Q & A Conclusion References Files
Table of contents (1 of 1)
1 Intro.
2 Simple ways
3 Data frame reshaping
4 Hands-on
  Looking at “old” data
5 Q & A
6 Conclusion
7 References
8 Files
© Old Dominion University
What are we going to cover?
We’re going to talk about moulding the data we have into data we want.
Look at lots of different ways to modify data
Look at lots of different ways to extract data
Look at lots of different ways to reshape data
Atomic vectors, and lists
Subscripts: positive, negative, ordered, duplicate, logical, named
x
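The slide's own code is not recoverable here, so the following is a minimal sketch of each subscript style named above. The vector `x`, the list `lst`, and their values are illustrative assumptions, not the original example.

```r
# Hypothetical example vector; names and values are illustrative only.
x <- c(a = 2.1, b = 4.2, c = 3.3, d = 5.4)

pos  <- x[c(1, 3)]       # positive: keep elements 1 and 3
neg  <- x[-c(1, 3)]      # negative: drop elements 1 and 3
ord  <- x[order(x)]      # ordered: elements sorted by value
dup  <- x[c(1, 1, 2)]    # duplicate: element 1 is returned twice
lgl  <- x[x > 3]         # logical: keep elements where the test is TRUE
nam  <- x[c("d", "a")]   # named: select by name, in the order requested

# Lists: [ returns a (sub)list, [[ returns the element itself.
lst <- list(n = 1:3, s = "text")
sub_list <- lst["n"]     # a list of length 1
element  <- lst[["n"]]   # the integer vector 1:3
```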
Data frames behave differently
1 When subsetting with a single index, they behave like lists and index the columns, so df[1:2] selects the first two columns.
2 When subsetting with two indices, they behave like matrices, so df[1:3, ] selects the first three rows (and all the columns).
rm(list=ls())
df
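A short sketch of the two behaviours described above. The data frame contents are an assumption made for illustration; only the `df[1:2]` and `df[1:3, ]` forms come from the slide.

```r
# Hypothetical data frame; column names and values are illustrative only.
df <- data.frame(x = 1:4, y = 4:1, z = letters[1:4])

cols <- df[1:2]      # single index: list-style, selects the first two columns
rows <- df[1:3, ]    # two indices: matrix-style, first three rows, all columns

# The two styles combine: a logical row filter with named columns.
both <- df[df$x > 2, c("x", "z")]
```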
Make things go away, or not
Assigning the reserved value NULL to an element/dimension will remove that element/dimension
To store a literal NULL in an element/dimension instead of removing it, enclose the NULL in a list()
x
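The difference between removing an element and storing a NULL can be sketched as follows. The list and data frame here are assumptions made for illustration.

```r
# Hypothetical list; names and values are illustrative only.
x <- list(a = 1, b = 2, c = 3)

x[["b"]] <- NULL       # removes element b entirely
x["c"] <- list(NULL)   # keeps element c, but its value is now NULL

# The same idea removes a column from a data frame.
df <- data.frame(u = 1:3, v = 3:1)
df$v <- NULL           # column v is gone
```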
A lookup table based on abbreviations
Use a look-up table to translate from abbreviation to full text. The look-up table has named entries that correspond exactly with the items to be “looked-up.” The entire column is returned for each matched entry.
rm(list=ls())
x
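One common way to build such a look-up table is a named character vector indexed by the abbreviations. This is a sketch under that assumption; the abbreviations and full names are illustrative and may differ from the original slide.

```r
# Hypothetical abbreviations to translate.
abbrev <- c("m", "f", "u", "f", "f", "m", "m")

# Named entries correspond exactly to the values being looked up.
lookup <- c(m = "Male", f = "Female", u = NA)

# Character subsetting returns one entry per abbreviation, in order.
full <- unname(lookup[abbrev])
```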
cbind() and rbind()
The simplest case is when we have two datasets with either identical columns (both the number of and names) or the same number of rows. With identical columns, rbind() stacks the rows; with the same number of rows, cbind() places the columns side by side.
sport
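A minimal sketch of both cases. The data frames, including the `sport` column, are assumptions made for illustration and are not the slide's original data.

```r
# Two hypothetical data frames with identical column names and types.
scores_a <- data.frame(sport = c("Hockey", "Baseball"), points = c(3, 7))
scores_b <- data.frame(sport = "Football", points = 21)

# Identical columns: rbind() stacks the rows.
all_scores <- rbind(scores_a, scores_b)

# Same number of rows: cbind() places the columns side by side.
venue <- data.frame(city = c("Norfolk", "Richmond", "Hampton"))
combined <- cbind(all_scores, venue)
```

Note that cbind() pairs rows purely by position, so it is only safe when both inputs are already in the same row order.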
Different types of data set joins
Generally there are seven different types:
1 Inner: elements common to both sets
2 Left outer: all elements in left, with matching data from right where it exists
3 Right outer: all elements in right, with matching data from left where it exists
4 Full outer: all elements in right and left, matched where possible
5 Right anti: elements in right with no match in left
6 Left anti: elements in left with no match in right
7 Anti inner: elements in left or right that have no match in the other (full outer without inner)
Image from [1].
Same image.
Image from [1].
Translate SQL joins into R
Generally there are seven different types:
1 Inner
2 Left outer
3 Right outer
4 Full outer
5 Right anti
6 Left anti
7 Anti inner
See code in attached "snippet" file.
Ideas from [2].
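The attached snippet file is not reproduced here. As a stand-in, this sketch shows one way to express the seven join types in base R with merge() and %in%. The data frames `left` and `right` and the key column `id` are illustrative assumptions.

```r
# Hypothetical tables sharing a key column named id.
left  <- data.frame(id = c(1, 2, 3), l = c("a", "b", "c"))
right <- data.frame(id = c(2, 3, 4), r = c("x", "y", "z"))

inner       <- merge(left, right, by = "id")                # 1 Inner
left_outer  <- merge(left, right, by = "id", all.x = TRUE)  # 2 Left outer
right_outer <- merge(left, right, by = "id", all.y = TRUE)  # 3 Right outer
full_outer  <- merge(left, right, by = "id", all = TRUE)    # 4 Full outer

# Anti joins keep rows whose key has no match on the other side.
right_anti <- right[!(right$id %in% left$id), ]             # 5 Right anti
left_anti  <- left[!(left$id %in% right$id), ]              # 6 Left anti

# 7 Anti inner: full outer rows whose key is not in the inner join.
anti_inner <- full_outer[!(full_outer$id %in% inner$id), ]
```

Unmatched rows in the outer joins carry NA in the columns that come from the other table.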
Extracting data from data frame
Lots of different ways:
Direct selection based on logicals
base::subset(x, subset, select, drop = FALSE, ...) uses a logical expression to select rows
base::transform(data, ...) creates new columns or modifies existing ones
plyr::arrange(df, ...) orders the rows of a data frame by its columns
reshape2::melt(data, ...) generic function that calls specifics based on data type
reshape2::dcast(data, formula, fun.aggregate, ...) casts “melted” data into a data frame
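library(plyr)  # provides arrange() and desc()

# Direct selection based on logicals
ChickWeight[(ChickWeight$Diet == 4) & (ChickWeight$Time == 21), ]

# subset(): the same rows, then a row filter with column selection
subset(ChickWeight, Diet == 4 & Time == 21)
subset(airquality, Temp > 80, select = c(Ozone, Temp))
with(airquality, subset(Ozone, Temp > 80))

# transform(): add a column and convert Temp from Fahrenheit to Celsius
transform(airquality, new = -Ozone, Temp = (Temp - 32) / 1.8)

# arrange(): order rows by cyl, then by disp (ascending, then descending)
arrange(mtcars, cyl, disp)
arrange(mtcars, cyl, desc(disp))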
See embedded file.
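The examples above do not exercise melt() or dcast(), so here is a minimal sketch of that round trip on the built-in airquality data. It assumes the reshape2 package is installed and does nothing otherwise; the choice of id columns and of mean as the aggregate is illustrative.

```r
# Assumes reshape2 is installed; the block is skipped when it is not.
if (requireNamespace("reshape2", quietly = TRUE)) {
  # Wide to long: one row per (Month, Day, variable) with its value.
  long <- reshape2::melt(airquality, id.vars = c("Month", "Day"),
                         na.rm = TRUE)

  # Long to wide: one row per Month, one column per measured variable,
  # each cell holding the monthly mean.
  wide <- reshape2::dcast(long, Month ~ variable, fun.aggregate = mean)
}
```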
Looking at “old” data
With the Motor Trend cars dataset:
Write a script that:
Converts the mtcars dataset wt column into pounds
Identifies the most fuel efficient vehicle by transmission type and number of carburetors
Creates a data frame with all the column data for the vehicles identified in the previous requirement
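One possible approach to the three requirements, offered as a sketch and not as the intended solution. It assumes "most fuel efficient" means the highest mpg within each combination of am and carb, and it relies on mtcars recording wt in units of 1000 lbs. The names `mt`, `best_mpg`, and `best` are illustrative.

```r
# wt is recorded in 1000 lbs, so multiply to get pounds.
mt <- transform(mtcars, wt = wt * 1000)

# Keep the vehicle names as a column; merge() drops row names.
mt$model <- rownames(mt)

# Highest mpg for each transmission type (am) and carburetor count (carb).
best_mpg <- aggregate(mpg ~ am + carb, data = mt, FUN = max)

# Join back to recover every column for the matching vehicles.
# Ties within a group produce more than one row for that group.
best <- merge(mt, best_mpg, by = c("am", "carb", "mpg"))
```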
Q & A time.
Q: What was the greatest achievement in taxidermy?
A: The Royal Canadian Mounted Police.
What have we covered?
Looked at different ways to create a data frame, from raw data or other data frames
Looked at different functions that do the same things
Next: String manipulations
References (1 of 1)
[1] deadman87, Sql joins explained (x-post r/sql), https://www.reddit.com/r/programming/comments/1xlqeu/sql_joins_explained_xpost_rsql/, 2014.
[2] Dan Goldstein, How to join (merge) data frames (inner, outer, left, right), https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right, 2009.
Files of interest
1 Code snippets
## First codes
rm(list=ls())
x