mildred-grace-wood
ARGUMENT 'FUN' IS MISSING, WITH NO DEFAULT: AN R WORKSHOP
Outline
• The R “sales pitch”
• R Basics
• Data Management
• Descriptive Statistics in R
• Inferential Statistics in R
• General Linear Model
• Generalized Linear Model
• Hierarchical Linear Modeling
• Latent Variable Modeling
Why Should I Use R?
Free 99
It’s as powerful as SAS and as user friendly as SPSS…really…
You ain’t cool unless you use R
It’s free…seriously
R Basics
• Do not write code directly into the R interface!
• #Comment #StatsAreCool #Rarrrgh
  – Yes, the # lets you add comments to your code
• R is case sensitive
  – A ≠ a
• <- is the assignment operator
  – A <- 3; a <- 4
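The three points above (comments, case sensitivity, assignment) fit in a few lines. This is only a minimal sketch; the name `case_differs` is made up for illustration and is not from the slides.

```r
# Anything after a # is a comment and is ignored by R
A <- 3   # upper-case A
a <- 4   # lower-case a is a separate object

A == a   # FALSE: A and a hold different values

# Hypothetical helper name: TRUE because A and a are not the same object
case_differs <- !identical(A, a)
```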
R Basics
• Creating objects in R
  – Creating a scalar
    • X <- 2
  – Creating a vector
    • X <- c(2,2,4,5)
  – Creating a matrix
    • X <- matrix(c(1,1,2,2,3,3),nrow=2, ncol=3)
    • Y <- matrix(c(1,1,1,1,1,1),nrow=3,ncol=2)
  – Creating a dataframe
    • A <- c(1,2,3,4)
    • B <- c('T','F','T','F')
    • ds <- data.frame(A,B)
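It can help to inspect each object right after creating it. The sketch below reuses the slide's values but gives the scalar and vector their own names (`X_scalar`, `X_vector`, chosen here for illustration) so they do not overwrite each other.

```r
X_scalar <- 2
X_vector <- c(2, 2, 4, 5)

# matrix() fills column by column, so both rows of X are 1 2 3
X <- matrix(c(1, 1, 2, 2, 3, 3), nrow = 2, ncol = 3)

A <- c(1, 2, 3, 4)
B <- c("T", "F", "T", "F")
ds <- data.frame(A, B)

length(X_vector)  # number of elements in the vector
dim(X)            # rows, then columns
str(ds)           # column names and types of the data frame
```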
R Basics
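• Arithmetic
  – 2 + 2; 2 - 2; 2 * 3; 2 / 3
• Boolean operators
  – 2 > 3; 3 < 6; 4 == 4
• Matrix algebra
  – X %*% Y (matrix multiplication)
  – t(X) (transpose)
  – ginv(X) (generalized inverse; requires the MASS package)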
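A short sketch of these operators, using the X and Y matrices from the earlier slide. The names `XY` and `Xt` are introduced here only to hold the results.

```r
2 + 2    # 4
2 > 3    # FALSE
4 == 4   # TRUE

X <- matrix(c(1, 1, 2, 2, 3, 3), nrow = 2, ncol = 3)
Y <- matrix(c(1, 1, 1, 1, 1, 1), nrow = 3, ncol = 2)

# (2 x 3) times (3 x 2) gives a 2 x 2 matrix; each entry is 1 + 2 + 3 = 6
XY <- X %*% Y

# The transpose swaps rows and columns, giving a 3 x 2 matrix
Xt <- t(X)
```

Note that `*` on two matrices multiplies element by element; `%*%` is the matrix product.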
R Basics
• Packages in R
  – Like SPSS modules, but free…
  – Upside: thousands of packages to do just about anything
  – Downside: placing your trust in freeware… which I’m fine with, but some aren’t
• library(MASS)
• ginv(X)
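As a sketch of the package workflow: a package is installed once and then loaded in each session with `library()`. MASS normally ships with R, so the install line is left commented out; the name `X_pinv` is an illustrative choice.

```r
# install.packages("MASS")  # one-time download from CRAN, only if MASS is missing
library(MASS)               # load the package for this session

X <- matrix(c(1, 1, 2, 2, 3, 3), nrow = 2, ncol = 3)

# Generalized (Moore-Penrose) inverse of a non-square matrix
X_pinv <- ginv(X)
dim(X_pinv)                 # 3 rows, 2 columns

# Defining property of the generalized inverse: X %*% X_pinv %*% X equals X
all.equal(X %*% X_pinv %*% X, X)
```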
I’m an import-exporter: Database Management
• Importing from a text file
  – Dataset <- read.table('filelocation.txt')
• Importing from a csv file
  – Dataset <- read.csv('filelocation.csv')
• Foreign package to read SPSS data files (.sav)
  – library(foreign)
  – Dataset <- read.spss('filelocation.sav')
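To try the import functions without a real file, the sketch below writes two tiny files to a temporary location and reads them back. The file contents and the names `tmp_txt`, `tmp_csv`, `from_txt`, and `from_csv` are invented for illustration.

```r
# Create a small space-delimited text file with a header row
tmp_txt <- tempfile(fileext = ".txt")
writeLines(c("id score", "1 3.5", "2 4.0"), tmp_txt)

# read.table() assumes no header unless told otherwise
from_txt <- read.table(tmp_txt, header = TRUE)

# Create a small comma-separated file
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,3.5", "2,4.0"), tmp_csv)

# read.csv() expects a header row by default
from_csv <- read.csv(tmp_csv)

# For SPSS files, read.spss() returns a list unless asked for a data frame:
# library(foreign)
# Dataset <- read.spss('filelocation.sav', to.data.frame = TRUE)
```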
Database Management
• Exporting R dataframes to csv
  – write.csv(dataframe, 'filelocation.csv')
• Exporting R dataframe to text file
  – write.table(dataframe, 'filelocation.txt')
• Variables in a dataframe
  – Adding: ds$C <- c(4,3,2,1)
  – Deleting: ds <- ds[,-3]
  – Referencing: ds$A or ds[,1]
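The export and variable-management steps can be chained together as below. This is a sketch that writes to a temporary file rather than a real path; `out_csv` and `back` are illustrative names.

```r
ds <- data.frame(A = c(1, 2, 3, 4), B = c("T", "F", "T", "F"))

# Adding a variable
ds$C <- c(4, 3, 2, 1)

# Exporting; row.names = FALSE avoids writing an extra column of row numbers
out_csv <- tempfile(fileext = ".csv")
write.csv(ds, out_csv, row.names = FALSE)

# Reading the file back to confirm what was written
back <- read.csv(out_csv)
names(back)

# Deleting the third variable (a negative index drops that column)
ds <- ds[, -3]
names(ds)
```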
Database Management
• Indexing dataframes
  – ds[,2] gives you column 2 of ds
  – ds[1,] gives you row 1 of ds
  – ds[2,2] gives you row 2 column 2 of ds
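A quick sketch of the three indexing forms on the `ds` data frame from earlier. `stringsAsFactors = FALSE` is set explicitly so that column B is plain text in every R version; `col2`, `row1`, and `cell` are illustrative names.

```r
ds <- data.frame(A = c(1, 2, 3, 4),
                 B = c("T", "F", "T", "F"),
                 stringsAsFactors = FALSE)

col2 <- ds[, 2]   # column 2, returned as a plain vector
row1 <- ds[1, ]   # row 1, returned as a one-row data frame
cell <- ds[2, 2]  # the single value in row 2, column 2: "F"
```

The index before the comma selects rows and the index after it selects columns; leaving one side empty means "all of them".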
Descriptive Statistics
• Measures of central tendency
  – Mean: mean(X)
  – Median: median(X)
  – Mode: table(X) (a little roundabout, but oh well)
• Measures of dispersion
  – var(X)
  – sd(X)
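Applied to the vector from the R Basics slides, these functions give the values below. Base R has no built-in function for the statistical mode, so the sketch picks the most frequent value out of `table()`; `counts` and `mode_value` are illustrative names.

```r
X <- c(2, 2, 4, 5)

mean(X)    # 3.25
median(X)  # 3
var(X)     # 2.25 (sample variance, divides by n - 1)
sd(X)      # 1.5

# Mode: tabulate the values, then keep the one(s) with the highest count
counts <- table(X)
mode_value <- as.numeric(names(counts)[as.vector(counts) == max(counts)])
mode_value # 2
```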
Descriptive Statistics
• Measures of covariation
  – cov(X,Y): covariance
  – cor(X,Y): correlation
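A small sketch with two made-up vectors of equal length (lower-case `x` and `y`, to keep them separate from the matrices X and Y used earlier). It also shows that the correlation is the covariance rescaled by the two standard deviations.

```r
# Illustrative data, not from the workshop
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

cov_xy <- cov(x, y)   # 1.5
cor_xy <- cor(x, y)   # Pearson correlation by default

# Correlation is covariance divided by the product of the standard deviations
all.equal(cor_xy, cov_xy / (sd(x) * sd(y)))
```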
Caution!
I will not be talking about any of the theoretical underpinnings as to when or why you should use one statistical method over another.
We’ll just be doing some PnP statistics…
General Linear Model
Read Edwards & Lambert, 2007
[Figure: path diagram relating the variables X, M, Y, and Z]
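The slide itself gives no code for this section, so the following is only a sketch of fitting general linear models with `lm()`. It assumes the diagram treats X as a predictor, M as a mediator, Y as an outcome, and Z as a moderator; that reading, the simulated data, and the names `sim`, `fit_m`, and `fit_y` are all assumptions made for illustration.

```r
set.seed(1)
n <- 200

# Simulated data (purely illustrative)
X <- rnorm(n)
Z <- rnorm(n)
M <- 0.5 * X + 0.3 * X * Z + rnorm(n)
Y <- 0.4 * M + 0.2 * X + rnorm(n)
sim <- data.frame(X, M, Y, Z)

# X * Z expands to X + Z + X:Z (main effects plus interaction)
fit_m <- lm(M ~ X * Z, data = sim)

# Outcome model with both X and M allowed to interact with Z
fit_y <- lm(Y ~ X * Z + M * Z, data = sim)

summary(fit_y)  # coefficients, standard errors, R-squared
coef(fit_m)     # just the estimated coefficients
```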
Generalized Linear Model
• Uses the generalized linear modeling function glm()
• Can handle DVs that are binomial, Poisson, or Gaussian (multinomial DVs need a separate function, e.g. multinom() in the nnet package)
• glm(y ~ x1 + x2, family=binomial, data=LRDS)
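To make the `glm()` call above runnable, the sketch below simulates a stand-in for `LRDS`, which the slides do not define. The data, the coefficient values, and the names `fit`, `odds_ratios`, and `pred` are assumptions for illustration.

```r
set.seed(42)
n <- 300

# Simulated stand-in for the LRDS dataset (binary outcome y, two predictors)
LRDS <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
LRDS$y <- rbinom(n, size = 1,
                 prob = plogis(-0.5 + 1.2 * LRDS$x1 - 0.8 * LRDS$x2))

# Logistic regression: binomial family with its default logit link
fit <- glm(y ~ x1 + x2, family = binomial, data = LRDS)
summary(fit)

# Coefficients are on the log-odds scale; exponentiate for odds ratios
odds_ratios <- exp(coef(fit))

# Predicted probabilities for the observed cases
pred <- predict(fit, type = "response")
```

Swapping `family = binomial` for `family = poisson` or `family = gaussian` fits the other model types with the same formula interface.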
Hierarchical Linear Model
• HLM allows you to look at between- and within-group variation
  – Employees nested within organizations
  – Repeated measures nested within an individual
• Variance Components Analysis
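The slides do not name a package for HLM. As one possibility, the sketch below uses `lme()` from the nlme package (which normally ships with R) to fit an intercept-only model and split the variance into between-organization and within-organization parts. The simulated data and the names `hlm_ds`, `fit0`, `between`, `within`, and `icc` are assumptions for illustration.

```r
library(nlme)

set.seed(7)
n_org <- 20   # organizations
n_emp <- 15   # employees per organization

# Simulated data: each organization shifts its employees' scores up or down
org <- factor(rep(seq_len(n_org), each = n_emp))
org_effect <- rnorm(n_org, sd = 1)[as.integer(org)]
satisfaction <- 5 + org_effect + rnorm(n_org * n_emp, sd = 2)
hlm_ds <- data.frame(org, satisfaction)

# Intercept-only model with a random intercept for each organization
fit0 <- lme(satisfaction ~ 1, random = ~ 1 | org, data = hlm_ds)
VarCorr(fit0)   # variance components table

# Between-group variance (random intercept) and within-group variance (residual)
between <- as.numeric(getVarCov(fit0)[1, 1])
within <- fit0$sigma^2

# Intraclass correlation: share of total variance lying between organizations
icc <- between / (between + within)
```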
Latent Variable Modeling
First we have to set up a measurement model:
[Figure: measurement model in which LV1 is measured by X1 to X4, LV2 by Y1 to Y4, and LV3 by Y5 to Y8]
Latent Variable Modeling
Then we have to set up the structural model:
[Figure: structural model relating the latent variables LV1, LV2, and LV3]
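The slides do not say which package is used for latent variable modeling. One common option is lavaan, whose model syntax is written as a text string. The sketch below assumes lavaan, assumes the indicator assignments implied by the diagrams (X1 to X4 on LV1, Y1 to Y4 on LV2, Y5 to Y8 on LV3), and guesses at the direction of the structural paths, which the extracted slides do not show. `my_data` is a hypothetical data frame, so the fitting line is left commented out.

```r
model <- '
  # Measurement model: each latent variable is defined by its indicators
  LV1 =~ X1 + X2 + X3 + X4
  LV2 =~ Y1 + Y2 + Y3 + Y4
  LV3 =~ Y5 + Y6 + Y7 + Y8

  # Structural model: regressions among the latent variables
  # (path directions here are an assumption, not taken from the slides)
  LV2 ~ LV1
  LV3 ~ LV2
'

# Fitting requires the lavaan package and a data frame with columns X1..X4, Y1..Y8
lavaan_available <- requireNamespace("lavaan", quietly = TRUE)
# fit <- lavaan::sem(model, data = my_data)   # my_data is hypothetical
# summary(fit, fit.measures = TRUE)
```

In lavaan syntax, `=~` reads "is measured by" and `~` reads "is regressed on".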