D Kelly O'Day Excel & R Worlds Mod 2 - Excel & R Worlds: 1
Learn R Toolkit
Module 2: Moving Between Excel & R Worlds
Learning: Do, See & Hear, Read
PowerPoint must be in View Show mode to see videos and hyperlinks
What We’ll Cover
• Identify Essential Concepts that Excel Users need to understand to learn R
• Establish correspondence between R and Excel terminology
• Show same calculations and plots in Excel and R
  – Start with Excel workbook, basic calculations and simple plot
  – Export the workbook to a CSV file
  – Reproduce calculations and plots in R
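The export-and-reproduce workflow in the last bullet can be sketched in R as follows. This is a minimal sketch, not the module's own script: the column names yr and anom, the values, and the temporary file are all assumptions made so the example is self-contained.

```r
# Hypothetical data standing in for a worksheet exported from Excel as CSV
csv_file <- tempfile(fileext = ".csv")
writeLines(c("yr,anom",
             "2001,0.40",
             "2002,0.46",
             "2003,0.47"), csv_file)

# Read the CSV into a data.frame; the first line supplies the column names
my_data <- read.csv(csv_file)

# Reproduce a basic worksheet calculation (Excel: =AVERAGE(B2:B4))
mean_anom <- mean(my_data$anom)
print(mean_anom)
```

With a real export, csv_file would simply be the path of the file saved from Excel.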
Excel & R Worlds
[Video, 30 minutes: walks through Excel & R features]
Comparison of Excel & R Worlds
                          Excel                      R
Display                   Look at Data               Look at Script
Display
• Excel World: see data right on screen
  – Formulas are not visible; select a cell to see its formula, or press Ctrl + ~ to show all formulas
• R World: see script on screen, comparable to seeing formula view in Excel
  – Options for viewing data:
    • Text editor (Notepad.exe)
    • R functions, for a data.frame named my_data:
      – my_data shows all data
      – head(my_data, n) shows first n rows
      – tail(my_data, n) shows last n rows
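The viewing functions listed above can be tried on a small data.frame. This is a minimal sketch; the columns yr and anom and their values are hypothetical stand-ins for data that would normally be read from a file.

```r
# Hypothetical data.frame; in practice my_data would come from read.csv() or read.table()
my_data <- data.frame(yr = 2001:2010,
                      anom = c(0.40, 0.46, 0.47, 0.45, 0.48,
                               0.43, 0.40, 0.32, 0.44, 0.48))

my_data                        # prints every row
first_rows <- head(my_data, 3) # first 3 rows
last_rows  <- tail(my_data, 3) # last 3 rows

print(first_rows)
print(last_rows)
```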
Comparison of Excel & R Worlds
                          Excel                      R
Display                   Look at Data               Look at Script
User Interface            Point & Click              Command Line
User Interface: Point & Click versus Command Line
• Excel users are very familiar and comfortable with point & click
• To make a chart in Excel 2003, we:
  – Click the chart icon
  – Follow the chart wizard
• By answering wizard prompts, we pass chart parameters to the Excel chart engine
• R’s approach is to have the user specify chart parameters directly in the function call:
  > plot(anom ~ yr, data = my_data, type = "l", col = "red")
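To make the plot() call runnable end to end, it needs a data.frame with yr and anom columns. The sketch below supplies hypothetical values for illustration only; the axis labels and title are optional extras, not part of the original call.

```r
# Hypothetical data: year and anomaly (values are illustrative only)
my_data <- data.frame(yr = 2001:2005,
                      anom = c(0.40, 0.46, 0.47, 0.45, 0.48))

# Formula interface: plot anom (y axis) against yr (x axis) as a red line
plot(anom ~ yr, data = my_data,
     type = "l", col = "red",
     xlab = "Year", ylab = "Anomaly",
     main = "Anomaly by year")
```

Each argument plays the role of one chart wizard answer: type = "l" picks a line chart and col sets the line color.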
Comparison of Excel & R Worlds
                          Excel                      R
Display                   Look at Data               Look at Script
User Interface            Point & Click              Command Line
Data Structure            Workbook
                          Worksheet                  Data.frame (List, Matrix, Array)
                          Column                     Vector
                          Cell                       Vector[I]
Data Structure
Excel World
• Everything is by address: row/col or range name
• Range(B2:B22)
• Sheet1!Range("B2:B22")
• B6
R World
• Everything is an object with a name
• A column of data is called a vector, with a name (default is V1)
• A group of vectors is called a data frame
• data$Glob or data[2]
• data$Glob[5] or data[5,2]
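The R-side addressing forms can be compared directly. This is a minimal sketch assuming a data frame named data whose second column is Glob; the values are hypothetical. One detail worth noting: data$Glob returns a plain vector, while data[2] returns a one-column data frame.

```r
# Hypothetical data frame with columns yr and Glob
data <- data.frame(yr = 2001:2006,
                   Glob = c(0.40, 0.46, 0.47, 0.45, 0.48, 0.43))

col_vec  <- data$Glob    # column as a numeric vector
col_df   <- data[2]      # second column, still a one-column data.frame
col_vec2 <- data[[2]]    # second column as a vector (same as data$Glob)

v1 <- data$Glob[5]       # fifth value of the Glob column
v2 <- data[5, 2]         # row 5, column 2: the same value

print(v1)
print(v2)
```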
Data Structure
                          Excel                                  R
Data group                Worksheet dataset with same # rows     data.frame
Column of data            Column B, B2:B100                      Vector
Data value in range       B3                                     Vector[3]
Dynamic data range        Define Name with =offset()             Vectors are dynamic
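The last row of the table can be illustrated briefly: an R vector grows when values are appended, and functions operate on whatever it currently holds, so no named range has to be redefined. A minimal sketch with hypothetical values:

```r
glob <- c(0.40, 0.46, 0.47)   # hypothetical values
n_before <- length(glob)

glob <- c(glob, 0.45)         # append a value; nothing else needs updating
n_after <- length(glob)

total <- sum(glob)            # sum() uses the vector's current contents
print(c(n_before, n_after))
print(total)
```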
Comparison of Excel & R Worlds
                          Excel                      R
Display                   Look at Data               Look at Script
User Interface            Point & Click              Command Line
Data Structure            Workbook
                          Worksheet                  Data.frame (List, Matrix, Array)
                          Column                     Vector
                          Cell                       Vector[I]
Data Types                numeric                    numeric
                          character                  character
                          logical                    logical
                                                     factor
Data Types
• Excel & R have 3 similar data types
  – Numeric (1, 11, 19.2)
  – Character ("Joe", "Bill")
  – Logical (T, F)
• R also has a factor data type
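The four data types can be inspected with class(). This is a minimal sketch using the example values from the slide, plus a hypothetical factor of energy sources.

```r
x_num <- c(1, 11, 19.2)
x_chr <- c("Joe", "Bill")
x_lgl <- c(TRUE, FALSE)       # T and F are shorthand for TRUE and FALSE
x_fac <- factor(c("fossil", "renewable", "fossil"))  # hypothetical categories

class(x_num)   # "numeric"
class(x_chr)   # "character"
class(x_lgl)   # "logical"
class(x_fac)   # "factor"
levels(x_fac)  # the distinct categories, sorted alphabetically
```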
Factors: What Are They? Why Are They Important?
• Data sets often contain categorical data that subdivides data
  – Gender (Male, Female)
  – Language (English, French, Chinese, etc.)
  – Energy Source (fossil, renewable)
• Character data type does not ensure data integrity (M/F; m/f, male/female)
• R’s factor() function is good for categorical data
  > gender_f <- factor(gender, levels = c("M", "F"))
  – Establishes factor gender_f
  – Sets acceptable values with levels argument to M or F
Why use factor data type?
1. Data integrity: limits data to specified levels
2. Easy to summarize data by factor levels
3. Multivariate plotting: distinguish by factor
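The data integrity point can be seen by running factor() on inconsistently coded entries. This is a minimal sketch; the gender vector below is hypothetical. Entries that do not match a declared level become NA, which makes miscoded values easy to find.

```r
# Hypothetical raw entries with inconsistent coding
gender <- c("M", "F", "m", "F", "male", "M")

# Only "M" and "F" are accepted; anything else becomes NA
gender_f <- factor(gender, levels = c("M", "F"))

print(gender_f)
counts <- table(gender_f, useNA = "ifany")  # counts per level, plus NA
n_bad  <- sum(is.na(gender_f))              # entries that failed the check
print(counts)
print(n_bad)
```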
Comparison of Excel & R Worlds
                          Excel                      R
Display                   Look at Data               Look at Script
User Interface            Point & Click              Command Line
Data Structure            Workbook
                          Worksheet                  Data.frame (List, Matrix, Array)
                          Column                     Vector
                          Cell                       Vector[I]
Data Types                numeric                    numeric
                          character                  character
                          logical                    logical
                                                     factor
Data Summary &            sumif(); sumproduct();     as.factor()
Query by Criteria         {Array formulas};          subset(), tapply()
                          Pivot Table
                          =Index(, Match(,,),)       which()
How to Summarize Data by Factor
[Screenshots: Excel World (callouts: "Menu"; "Array formulas are not user friendly") and R World]
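On the R side, summarizing by factor can be sketched with tapply(), subset() and which(), the functions named in the comparison table. This is a minimal sketch, not the content of the original screenshot; the energy data set and its column names are hypothetical.

```r
# Hypothetical data set: energy use by fuel type and region
energy <- data.frame(
  fuel   = factor(c("fossil", "renewable", "fossil", "renewable", "fossil")),
  region = c("N", "N", "S", "S", "S"),
  use    = c(10, 4, 12, 6, 8)
)

# Sum by factor level (Excel: sumif() or a Pivot Table)
use_by_fuel <- tapply(energy$use, energy$fuel, sum)

# Query rows by criteria (Excel: array formula)
big_fossil <- subset(energy, fuel == "fossil" & use > 9)

# Row positions that meet a criterion (Excel: Match())
idx <- which(energy$use > 9)

print(use_by_fuel)
print(big_fossil)
print(idx)
```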
Comparison of Excel & R Worlds
                          Excel                      R
Display                   Look at Data               Look at Script
User Interface            Point & Click              Command Line
Data Structure            Workbook
                          Worksheet                  Data.frame (List, Matrix, Array)
                          Column                     Vector
                          Cell                       Vector[I]
Data Types                numeric                    numeric
                          character                  character
                          logical                    logical
                                                     factor
Data Summary &            sumif(); sumproduct();     as.factor()
Query by Criteria         {Array formulas};          subset(), tapply()
                          Pivot Table
                          =Index(, Match(,,),)       which()
Missing Values            Inconsistent Handling      Proper Handling
Missing Data
• Excel has serious shortcomings when it comes to handling missing data
  – Blanks can be converted to zero when a blank cell is referenced in a formula
  – Functions like regression cannot handle blank cells in a range
• R addresses missing data directly
  – User specifies the code for missing data in read.table()
  – User specifies how to handle missing data in calculations
Video on next slide shows how Excel & R handle missing data
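Both R steps can be sketched together: declaring the missing-data code when reading, then choosing how calculations treat missing values. This is a minimal sketch; the file contents, the column names, and the -999 code are assumptions made for illustration.

```r
# Hypothetical file in which missing values are coded as -999
txt_file <- tempfile(fileext = ".txt")
writeLines(c("yr anom",
             "2001 0.40",
             "2002 -999",
             "2003 0.47"), txt_file)

# Tell read.table() which code marks missing data; it is stored as NA
my_data <- read.table(txt_file, header = TRUE, na.strings = "-999")

n_missing    <- sum(is.na(my_data$anom))             # count of missing values
mean_default <- mean(my_data$anom)                   # NA: missing value propagates
mean_skip    <- mean(my_data$anom, na.rm = TRUE)     # drop missing values first

print(n_missing)
print(mean_default)
print(mean_skip)
```

The default result of NA is deliberate: R does not silently treat a missing value as zero, so the user has to state how it should be handled.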
[Video: how Excel & R handle missing data]
Comparison of Excel & R Worlds
                          Excel                      R
Display                   Look at Data               Look at Script
User Interface            Point & Click              Command Line
Data Structure            Workbook
                          Worksheet                  Data.frame (List, Matrix, Array)
                          Column                     Vector
                          Cell                       Vector[I]
Data Types                numeric                    numeric
                          character                  character
                          logical                    logical
                                                     factor
Data Summary by Criteria  sumif(); sumproduct();     subset()
                          {Array formulas};
                          Pivot Table
Missing Values            Inconsistent Handling      Proper Handling