An introduction to R
Jorge Cimentada and Basilio Moreno
6th of July 2019

RStudio is just a nice piece of software to run R!

A factor is just a way of saying that a variable has unique values! And they can be ordered.

elm <- c("Good", "Bad", "Medium")
(elm_factor <- factor(elm, levels = c("Bad", "Medium", "Good"), ordered = T))
[1] Good   Bad    Medium
Levels: Bad < Medium < Good

This has consequences

table(elm)
elm
   Bad   Good Medium
     1      1      1

table(elm_factor)
elm_factor
   Bad Medium   Good
     1      1      1

An introduction to R · What is R? R is a programming language designed to do data analysis, usually interactive. R is helpful for.. Getting that darn excel/stata file into R (importing)



Page 2

How to install R?

Luckily, you guys have R and RStudio installed, so you don't have to worry about this!

But if you want to install it at home, please follow this guide.

That guide can help you install:
  • R
  • RStudio
  • And swirl, a package in which you can do a bunch of exercises as homework!

Page 3

What is R?

R is a programming language designed to do data analysis, usually interactively.

R is helpful for..
  • Getting that darn Excel/Stata file into R (importing)
  • Turning that very ugly dataset into something to work with (data cleaning)
  • Automating your weekly reports (automating tasks)
  • Analyzing data (modeling)
  • Creating nicely formatted documents (communicating results)
  • Building your own commands to do specific things (functions)
  • Building very creative graphics

Among many things…

Page 4

And so.. what is RStudio?

Page 5

And so.. what is RStudio?

Page 6

Let's get to it then!

R is an interactive language. That means that if you type a number, you will get a number.

# Input
10
[1] 10

# Input
5
[1] 5

Page 7

Introduction to R objects

R is also a calculator

Try typing these operations in R:
  • 5 + 5
  • 10 - 5
  • 10 * 5
  • 20 / 10
  • (10 * 20) - 5 / 2 + 2
  • 2 ^ 3

Before we continue, what type of operations are these?

Answers in next slide!

Page 8

Introduction to R objects

  • Addition
  • Subtraction
  • Multiplication
  • Division
  • A combination of all
  • Exponentiation

Numbers in R are called numerics.

For example:
is.numeric(10)
is.numeric(10 + 20)
is.numeric(10 / 2)
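As a quick check of the answers above, the sketch below runs each operation and stores the result. The variable names (addition, combination and so on) are only illustrative and do not come from the slides.

```r
# Each operation from the exercise, stored under a hypothetical name
addition       <- 5 + 5                  # 10
subtraction    <- 10 - 5                 # 5
multiplication <- 10 * 5                 # 50
division       <- 20 / 10                # 2

# Parentheses and division are evaluated first, then subtraction and
# addition from left to right: 200 - 2.5 + 2
combination    <- (10 * 20) - 5 / 2 + 2  # 199.5

exponentiation <- 2 ^ 3                  # 8

# Every one of these results is a numeric
is.numeric(combination)
```

The combined expression is the one worth trying by hand first, since it depends on the order in which R applies the operators.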

Page 9

Introduction to R objects

Having single numbers, like 10, is not very useful.

We want something similar to a column of a dataset, like age or income.

We can do that with c(), which stands for concatenate.

c(32, 34, 18, 22, 65)
[1] 32 34 18 22 65

Read this expression as: concatenate these numbers into a single object.

Page 10

Introduction to R objects

We can also give it a name, like age.

age <- c(32, 34, 18, 22, 65)

  • Why didn't the result get printed?
  • Where is this age object at?
  • What is formally the age object?
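The questions on this slide can be explored directly in the console. The following is a minimal sketch and not part of the original slides: assigning with <- stores the values without printing them, while typing the name (or wrapping the assignment in parentheses) prints them.

```r
age <- c(32, 34, 18, 22, 65)    # assignment stores the values; nothing is printed

age                             # typing the name prints the stored values
(age <- c(32, 34, 18, 22, 65))  # wrapping an assignment in parentheses also prints it

exists("age")    # TRUE: the object now lives in the workspace
class(age)       # "numeric"
is.vector(age)   # TRUE: formally, age is a numeric vector
length(age)      # 5
```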

Page 11

Introduction to R objects

We just created our first variable! The typical SAS/Excel/Stata column.

In R, these objects are called 'vectors'.

Vectors can have several flavours:
  • Numerics (we just saw one)
  • Logicals
  • Characters
  • Factors

Page 12

Introduction to R objects

Suppose these ages belong to certain people. We can create a character vector with their names.

Following this guideline, create it yourself.
  • Create a character vector with c()
  • Include the names Paul, Maria, Andres, Robert and Alicia inside
  • Wrap every name in quotes like this: "Paul", "Maria", etc…

This will make R understand that input as characters.

Page 13

Introduction to R objects

Answer:

c("Paul", "Maria", "Andres", "Robert", "Alicia")
[1] "Paul"   "Maria"  "Andres" "Robert" "Alicia"

We can also give it a name, like participants.

participants <- c("Paul", "Maria", "Andres", "Robert", "Alicia")

Page 14

Introduction to R objects

Character vectors are filled by strings, like "Paul" or "Maria".

Can we do operations with strings?

"Paul" + "Maria"
Error in "Paul" + "Maria" : non-numeric argument to binary operator

Makes sense.. we can't add any letters.

Alright, we're set. Concatenate the numeric vector age and participants.

Page 15

Introduction to R objects

c(age, participants)
 [1] "32"     "34"     "18"     "22"     "65"     "Paul"   "Maria"
 [8] "Andres" "Robert" "Alicia"

What's the problem with this result?

Page 16

This breaks an R law! We joined a numeric vector and a character vector.

Vectors can ONLY be of one class.

c(1, "one") # forces to character vector
[1] "1"   "one"

c(1, "1") # note that the first one is a numeric, while the second is a character
[1] "1" "1"
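To see what this coercion means in practice, here is a small sketch using the age and participants vectors from the earlier slides. The names mixed and recovered are hypothetical, and as.numeric() is not covered in the slides; it is used here only to show that the ages have become text.

```r
age <- c(32, 34, 18, 22, 65)
participants <- c("Paul", "Maria", "Andres", "Robert", "Alicia")

mixed <- c(age, participants)
class(mixed)    # "character": the numbers were turned into strings

# Converting back only works where the text looks like a number.
# The names become NA (R also emits a warning, silenced here).
recovered <- suppressWarnings(as.numeric(mixed))
recovered
```

The first five elements of recovered are the original ages again, while the five names cannot be turned into numbers and end up as NA.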

Page 17

Introduction to R objects

Now, which of these people has an age above 20?

age > 20
[1]  TRUE  TRUE FALSE  TRUE  TRUE

That's a logical vector.

Contrary to character and numeric vectors, logical vectors can only have three values:
  • TRUE
  • FALSE
  • NA (which stands for "Not available".)
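To turn the logical vector into an actual answer to "which of these people has an age above 20?", one option is to use it inside square brackets. Subsetting with [ ] goes slightly beyond what these slides cover, and the name above_20 is hypothetical; treat this as a sketch.

```r
age <- c(32, 34, 18, 22, 65)
participants <- c("Paul", "Maria", "Andres", "Robert", "Alicia")

above_20 <- age > 20     # hypothetical name for the logical vector
class(above_20)          # "logical"

# Square brackets keep only the positions where the logical vector is TRUE
participants[above_20]   # "Paul" "Maria" "Robert" "Alicia"
```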

Page 18

Introduction to R objects

Logicals can be created manually or with a logical statement.

For example,

c(TRUE, FALSE, TRUE, TRUE)
[1]  TRUE FALSE  TRUE  TRUE

age < 60
[1]  TRUE  TRUE  TRUE  TRUE FALSE

The above expression tests for the logical statement.

  32   34   18   22    65
TRUE TRUE TRUE TRUE FALSE
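Later slides use an object called lgl (it appears in the ls() output and as the age_60 column of the data frame) without showing where it is created. Its values match age < 60, so a plausible reconstruction, offered as an assumption rather than the authors' exact code, is:

```r
age <- c(32, 34, 18, 22, 65)

# Assumed definition: store the result of the logical statement under a name
lgl <- age < 60
lgl    # TRUE TRUE TRUE TRUE FALSE

# NA is the third value a logical vector can hold
is.logical(c(TRUE, NA, FALSE))   # TRUE
```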

Page 19

Introduction to R objects

You can also write T or F as short abbreviations of TRUE and FALSE.

c(T, T, F, T) == c(TRUE, TRUE, FALSE, TRUE)
[1] TRUE TRUE TRUE TRUE

Which is comparing:

TRUE TRUE FALSE TRUE
   T    T     F    T

But behind the scenes, TRUE and T are just a 1 and F and FALSE are just a 0.

What is the result of this?

T + 5
TRUE - 5
FALSE + TRUE
T + T - FALSE
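Working through the four expressions with TRUE as 1 and FALSE as 0 gives the results below. The names results and n_above_20 are hypothetical, and the sum() example is an addition to the slides, included as one common reason this behaviour is useful.

```r
# TRUE counts as 1 and FALSE as 0 in arithmetic
results <- c(
  T + 5,          # 1 + 5     = 6
  TRUE - 5,       # 1 - 5     = -4
  FALSE + TRUE,   # 0 + 1     = 1
  T + T - FALSE   # 1 + 1 - 0 = 2
)
results

# Summing a logical vector therefore counts the TRUE values
age <- c(32, 34, 18, 22, 65)
n_above_20 <- sum(age > 20)   # 4 people are older than 20
```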

Page 20

Introduction to R objects

Now that you know that.. what would be the class of the following vectors?

c(5, TRUE)
c(5, "FALSE")
c(FALSE, TRUE)
c(1, FALSE)

Answers, in order:
  • numeric: TRUE is coerced to 1
  • character: "FALSE" is a string, can't be turned to a number
  • logical: both elements are logical!
  • numeric: FALSE is coerced to 0
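These answers can be confirmed with class(), which reports the class of any object. This is a minimal sketch; the name classes is hypothetical.

```r
# Ask R for the class of each vector from the exercise
classes <- c(
  class(c(5, TRUE)),       # "numeric":   TRUE is coerced to 1
  class(c(5, "FALSE")),    # "character": "FALSE" is a string
  class(c(FALSE, TRUE)),   # "logical":   both elements are logical
  class(c(1, FALSE))       # "numeric":   FALSE is coerced to 0
)
classes
```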

Page 21

Introduction to R objects

What do we know so far?
  • Numeric vectors
  • Character vectors
  • Logical vectors
  • How to assign a name to these vectors
  • Vectors can contain only one class of data

What's missing? Factors

Page 22

Introduction to R objects

Factors are R's way of storing categorical variables.

Categories such as:
  • 'Male' and 'Female' or 'Married' and 'Divorced'
  • 'Good', 'Middle' and 'High'

gender <- c("Male", "Female", "Male", "Male", "Female")

# Can be turned into
gender <- factor(gender)
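To see what changed when gender became a factor, the sketch below inspects it with a few base R functions. These calls are not in the original slides; they are added only to make the "unique values" idea concrete.

```r
gender <- c("Male", "Female", "Male", "Male", "Female")
gender <- factor(gender)

levels(gender)       # "Female" "Male": the unique categories, sorted alphabetically
table(gender)        # counts per category: Female 2, Male 3
as.integer(gender)   # 2 1 2 2 1: each value is stored as the position of its level
```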

Page 23

Introduction to R objects

Page 24

Introduction to R objects

Factors are useful for some specific operations like:
  • Changing order of levels for terms in modelling
  • Changing order of axis labels in plots
  • Among other things..

In many cases you can use characters to do what you would want with factors!
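As an illustration of why the order of levels matters, the sketch below reuses the elm and elm_factor example from the title page. The sort() and comparison calls are additions, not part of the slides.

```r
elm <- c("Good", "Bad", "Medium")
elm_factor <- factor(elm, levels = c("Bad", "Medium", "Good"), ordered = TRUE)

# A character vector is sorted alphabetically
sort(elm)                        # "Bad" "Good" "Medium"

# An ordered factor is sorted by the order of its levels
as.character(sort(elm_factor))   # "Bad" "Medium" "Good"

# Comparisons also follow the level order
elm_factor < "Good"              # FALSE TRUE TRUE
```

The same level order is what table() used in the title-page example, which is why its columns came out as Bad, Medium, Good for the factor.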

Page 25

Introduction to R objects

Now, have you noticed that we've been assigning names to things?

age
[1] 32 34 18 22 65

The name age holds all these elements inside. How do we know where all the variables we've created are?

Let's ask R what objects can be listed from our workspace or environment.

ls()
[1] "age"          "elm"          "elm_factor"   "gender"
[5] "lgl"          "participants"

Page 26

Introduction to R objects

So far, we have a bunch of variables scattered around our workspace. This is usually not the way to go! We want to group similar things in the same place.

A data frame is usually the primary structure of analysis in R

our_df <- data.frame(name = participants, age = age, gender = gender, age_60 = lgl)
our_df
    name age gender age_60
1   Paul  32   Male   TRUE
2  Maria  34 Female   TRUE
3 Andres  18   Male   TRUE
4 Robert  22   Male   TRUE
5 Alicia  65 Female  FALSE
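The data frame above can be rebuilt from scratch and inspected column by column, as in the sketch below. Two things in it are assumptions: lgl is taken to be age < 60 (the slides never show its definition, but the values match), and stringsAsFactors = FALSE is set explicitly because its default differs between R versions before and after 4.0.0. The $ operator for pulling out a column is also not covered in the slides.

```r
participants <- c("Paul", "Maria", "Andres", "Robert", "Alicia")
age    <- c(32, 34, 18, 22, 65)
gender <- factor(c("Male", "Female", "Male", "Male", "Female"))
lgl    <- age < 60   # assumed definition of lgl

our_df <- data.frame(name = participants, age = age, gender = gender,
                     age_60 = lgl, stringsAsFactors = FALSE)

# Each column keeps the class of the vector it came from
sapply(our_df, class)

# A single column is just the original vector
our_df$age
mean(our_df$age)   # 34.2
```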

Page 27

Introduction to R objects

It's important that you understand the thing that defines a data frame. A data frame has rows and columns, more technically called dimensions. Data frames have two dimensions.

dim(our_df)
[1] 5 4

nrow(our_df)
[1] 5

ncol(our_df)
[1] 4
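The relationship between these three functions, and what the two dimensions are used for, can be sketched as follows. The example uses a reduced two-column stand-in for our_df so that it runs on its own; the [row, column] indexing shown at the end is an addition to the slides.

```r
# Reduced stand-in for our_df (only two of its columns), so the block is self-contained
our_df <- data.frame(age = c(32, 34, 18, 22, 65),
                     age_60 = c(TRUE, TRUE, TRUE, TRUE, FALSE))

dim(our_df)                                             # 5 2: rows first, then columns
identical(dim(our_df), c(nrow(our_df), ncol(our_df)))   # TRUE

# A plain vector has no dimensions at all
dim(c(32, 34, 18))                                      # NULL

# The two dimensions are what [row, column] indexing refers to
our_df[2, 1]                                            # 34: second row, first column
```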

Page 28

Introduction to R objects

Data frames are very distinctive because they can hold any type of vector.

Matrices cannot!

Matrices are very similar to data frames.
  • They have the same number of dimensions.
  • You can choose rows/columns in similar ways.

our_matrix <- matrix(1:20, ncol = 4, nrow = 5)
our_matrix
     [,1] [,2] [,3] [,4]
[1,]    1    6   11   16
[2,]    2    7   12   17
[3,]    3    8   13   18
[4,]    4    9   14   19
[5,]    5   10   15   20
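A short sketch of both points: choosing rows and columns of a matrix, and what happens when a matrix is given more than one type of data. The name mixed_matrix and the use of cbind() are illustrative additions, not taken from the slides.

```r
our_matrix <- matrix(1:20, ncol = 4, nrow = 5)

dim(our_matrix)    # 5 4, just like a data frame with 5 rows and 4 columns
our_matrix[2, 3]   # 12: second row, third column
our_matrix[, 1]    # 1 2 3 4 5: the whole first column

# A matrix can only hold one type: mixing numbers and strings
# turns everything into character
mixed_matrix <- cbind(c(32, 34), c("Paul", "Maria"))
typeof(mixed_matrix)   # "character"
```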

Page 29

Introduction to R objects

Finally, we're missing the secret ingredient that differentiates matrices and data frames.

Lists

Page 30

Introduction to R objects

Think of lists as a bag that can store anything.

our_list <- list(names = participants, gender = gender, age = age)
our_list
$names
[1] "Paul"   "Maria"  "Andres" "Robert" "Alicia"

$gender
[1] Male   Female Male   Male   Female
Levels: Female Male

$age
[1] 32 34 18 22 65

This is a bag that has 3 objects.
  • A character
  • A factor
  • A numeric
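To take things back out of the bag, elements of a list can be reached by name. The sketch below rebuilds our_list and uses $ and [[ ]], neither of which is introduced in the slides, so read it as an optional extra.

```r
participants <- c("Paul", "Maria", "Andres", "Robert", "Alicia")
gender <- factor(c("Male", "Female", "Male", "Male", "Female"))
age    <- c(32, 34, 18, 22, 65)

our_list <- list(names = participants, gender = gender, age = age)

length(our_list)          # 3 objects in the bag
names(our_list)           # "names" "gender" "age"

our_list$age              # the numeric vector, unchanged
our_list[["names"]][2]    # "Maria": second element of the character vector

# Each element keeps its own class
sapply(our_list, class)
```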

Page 31

Introduction to R objects

Think outside the box… when I say anything, I mean ANYTHING!

complex_list <- list(df = our_df[1:3, ], matrix = our_df[1:3, ], avg_age = mean(age))
complex_list
$df
    name age gender age_60
1   Paul  32   Male   TRUE
2  Maria  34 Female   TRUE
3 Andres  18   Male   TRUE

$matrix
    name age gender age_60
1   Paul  32   Male   TRUE
2  Maria  34 Female   TRUE
3 Andres  18   Male   TRUE

$avg_age
[1] 34.2

Page 32

Introduction to R objects

To sum up, these are the 4 types of data structures available in R.

Page 33

Introduction to R objects

Now I'm gonna rock your world…

A data frame is a list (because it can have any class) with row and column dimensions.

data.frame(our_list)
   names gender age
1   Paul   Male  32
2  Maria Female  34
3 Andres   Male  18
4 Robert   Male  22
5 Alicia Female  65
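The claim that a data frame is a list can be checked directly. In this sketch the name df_from_list is hypothetical; the checks use is.list(), length() and [[ ]], which behave on a data frame exactly as they do on a list.

```r
our_list <- list(names  = c("Paul", "Maria", "Andres", "Robert", "Alicia"),
                 gender = factor(c("Male", "Female", "Male", "Male", "Female")),
                 age    = c(32, 34, 18, 22, 65))

df_from_list <- data.frame(our_list)   # hypothetical name for the result

is.list(df_from_list)      # TRUE: a data frame is a list underneath
length(df_from_list)       # 3: one list element per column
df_from_list[["age"]]      # list-style access returns the age column

# What the data frame adds on top of the list: row and column dimensions
dim(our_list)              # NULL
dim(df_from_list)          # 5 3
```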

Page 34

To be continued…