
Libro en ingles


Chapter 2 What Type of Data Are We Dealing With?

In This Chapter

- Distinguishing between discrete and continuous variables
- Understanding nominal, ordinal, interval and ratio levels of measurement
- Knowing the difference between independent and dependent variables and covariates

When you conduct a research study in psychology, you normally collect data on a number of variables. A variable is something you measure that can have a different value from person to person or across time, like age, self-esteem and weight. Data is the information that you gather about a variable. For example, if you gather information about the age of a group of people then the list of their ages is your research data. (Not everything that you can measure is a variable, though, as you can read about in the 'Constantly uninteresting' sidebar, later in this chapter.)

The data that you collect on all the variables of interest in a research study is often known as a data set - a collection of information about several variables. A data set often contains information on several different types of variables, and being able to distinguish between these variables is the essential first step in your analysis. No matter how complex your statistical analysis becomes, the first question you always need to address is: what type of variables do I have? Therefore, you can't be confident about conducting statistical analysis unless you understand how to distinguish between variables. This is a basic skill that you must know before attempting anything else in statistics. If you can get a handle on variables, statistics suddenly seems a lot less confusing.

You can classify a variable in psychological research by

- Type: Discrete or continuous
- Level of measurement: Nominal, ordinal, interval or ratio
- Its role in the research study: Independent, dependent or covariate

In this chapter, we discuss each of these ways of classifying a variable.

Understanding Discrete and Continuous Variables

In classifying a variable, you consider whether the variable measures discrete categories or a continuum of scores.


Discrete variables, sometimes called categorical variables, are variables that contain separate and distinct categories. For example, a person's gender is a discrete variable. Normally, gender is described as male or female in research studies. The variable 'gender' therefore has two categories - male and female - so gender is a categorical (discrete) variable.

Imagine that we collect information about the age of a group of people (as part of a research study rather than general nosiness). We could simply ask people to record their age in years on a questionnaire. This is an example of a continuous variable. Age in years is a continuous variable because it's not separated into distinct categories - time proceeds continuously - it has no breaks and you can always place your age along a continuum. Therefore, someone might record her age as 21 years old; another person might record her age as 21.5 years old; another person might record her age as 21.56 years old and so on. The last two people in the example might appear a bit weird, but they've given a valid answer to the question. They've just used a different level of accuracy in placing themselves on the age continuum.

One trick to help you remember the difference between the two types of variables is this: generally, a continuous variable is a variable where fractions are meaningful and a discrete variable is a variable where fractions aren't meaningful, and which can take only specific values.

In the example, when you ask someone her age, she could give you any answer (theoretically, but in reality you won't find many people above 100 years old) in the form of a fraction if she wants - it would still be meaningful. If you ask someone their gender, they're likely to give you one of two possible answers - male or female.

Whether you record a variable as discrete or continuous depends on how you measure it. For example, you can't say that age is a continuous variable without knowing how age has been measured in the context of a research study. If you ask people to record their age and give them the following options: 'less than 25', '25 to 40' and 'older than 40' then you've created a discrete variable. In this case, the person can only choose one of the three possible answers and anything in between these answers (any fraction) doesn't make sense. Therefore, you need to examine how you measured a variable before classifying it as discrete or continuous.
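To see how the way you measure age changes the type of variable, here is a minimal Python sketch. The ages are made-up illustrative values, and the function name age_band is our own invention; only the three answer options come from the example above.

```python
# Hypothetical ages in years (illustrative values, not data from any study).
ages_in_years = [21, 21.5, 21.56, 38, 40, 40.5, 63]


def age_band(age):
    """Return the questionnaire option that a given age falls into."""
    if age < 25:
        return "less than 25"
    if age <= 40:
        return "25 to 40"
    return "older than 40"


# Recorded in years, age is continuous: fractions such as 21.5 are meaningful.
# Recorded with three options, age is discrete: each person lands in exactly
# one category and nothing in between the categories makes sense.
age_bands = [age_band(age) for age in ages_in_years]
print(age_bands)
```

The same people produce a continuous variable in ages_in_years and a discrete variable in age_bands, which is why you need to check how a variable was measured before you classify it.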

Looking at Levels of Measurement


You can classify variables according to their measurement properties. When you record variables on a data sheet, you usually record the values on the variables as numbers, because this can facilitate statistical analysis. However, the numbers can have different measurement properties and this determines what types of analyses you can do with these numbers. The variable's level of measurement is a classification system that tells you what measurement properties the values of a variable have.

The measurement properties that the values in a variable can possess are

- Magnitude
- Equal intervals
- True absolute zero

And these three measurement properties enable you to classify the level of measurement of a variable into one of four types

- Nominal
- Ordinal
- Interval
- Ratio

We describe both the properties and the types in the sections that follow.

Measurement properties

The three measurement properties outlined in the following sections are hierarchical. In other words, you can't have equal intervals unless a variable also has magnitude, and you can't have a true absolute zero point unless a variable also has magnitude and equal intervals.

Magnitude

The property of magnitude means that you can order the values in a variable from highest to lowest. For example, take age as measured using the following categories: 'less than 25', '25 to 40' and 'older than 40'. In your research study, imagine you give a score of 1 on the variable 'age' to people who report being less than 25; you give a score of 2 to anyone who reports being between 25 and 40; and you give a score of 3 to anyone who reports being older than 40. Therefore, your variable 'age' contains three values - 1, 2 or 3. These numbers have the property of magnitude in that you can say that those who obtained a value of 3 are older than those who obtained a value of 2 and they're older than those who obtained a value of 1. In this way, you can order the scores.

Equal intervals

The property of equal intervals means that a unit difference on the measurement scale is the same regardless of where that unit difference occurs on the scale. For example, take the variable temperature. The difference between 10 degrees Celsius and 11 degrees Celsius is 1 degree Celsius (one unit on the scale). Equally, the difference between 11 degrees Celsius and 12 degrees Celsius is also 1 degree Celsius. This one-unit difference is the same and means the same regardless of where on the scale it occurs.


This isn't true for the example of the age variable in the previous section. In this case, the difference between a value of 1 and a value of 2 is 1 (one unit) and the difference between the value of 2 and the value of 3 is also 1. However, these differences aren't equal and, in fact, don't really make sense. Effectively, we're asking: is the difference between 'less than 25' and '25 to 40' the same as the difference between '25 to 40' and 'older than 40'? The question doesn't make sense, which should tell you that this variable does not have the property of equal intervals.

True absolute zero point

The property of a true absolute zero point means that the zero point on the measurement scale is the point where nothing of the variable exists and, therefore, no scores less than zero exist. Take the example of weight measured in kilograms. At 0 kilograms, you consider the thing that you're measuring to have no weight, and there is no weight less than 0 kilograms.

But this isn't true for the example of temperature measured in Celsius (see the previous section). You can have temperatures below zero on this scale (and frequently do). For example, -12 degrees Celsius is a sensible value to report. An example of a temperature scale that does have a true absolute zero point is temperature measured in kelvin, because no temperature lower than zero kelvin exists (at least no temperature lower than this that scientists have been able to record). However, determining a true absolute zero point is rarely something you need to concern yourself with when doing psychological research, so don't worry too much about this measurement property.
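One practical consequence of a true absolute zero, which the text above doesn't spell out, is that ratios of values only make sense on a scale that has one. The following Python sketch illustrates this with the temperature example; the function name celsius_to_kelvin is our own, and the conversion uses the standard offset of 273.15.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to kelvin (0 degrees Celsius = 273.15 K)."""
    return celsius + 273.15


# Equal intervals: a one-unit step is the same size anywhere on the scale.
step_low = 11 - 10
step_high = 12 - 11
print(step_low == step_high)

# Celsius has no true absolute zero, so a ratio of Celsius values misleads:
# 20 degrees Celsius is not 'twice as hot' as 10 degrees Celsius.
ratio_celsius = 20 / 10
ratio_kelvin = celsius_to_kelvin(20) / celsius_to_kelvin(10)
print(ratio_celsius)
print(round(ratio_kelvin, 3))
```

The Celsius ratio comes out as 2, but on the kelvin scale, which does have a true zero, the same two temperatures differ by only a few per cent.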

Types of measurement level

In the following sections, we outline the four levels of measurement. You can always classify a variable into one of these four measurement levels.

Nominal

The nominal level of measurement means that a variable has none of the three measurement properties (see the earlier section 'Measurement properties'). You measure a variable at the nominal level when you're using the numbers in the variable only as labels.

For example, in your data set you might give a participant a score of 1 if he's male or a score of 2 if she's female. These numbers don't have any properties that you'd normally associate with numbers - they're simply shorthand labels. Because these numbers don't have any measurement properties, you can't do any arithmetic with them (add, subtract and so on). In other words, you couldn't add a score of 2 to a score of 1 and get a score of 3. This is equivalent to saying that you'd add a female to a male and get something that's not possible on your variable, because a score of 3 doesn't exist! You also can't order these scores, because the labels are completely arbitrary and it makes just as much sense to give someone a score of 2 if he's male and a score of 1 if she's female as it does to score 1 for male and 2 for female.
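The arbitrariness of nominal labels can be sketched in a few lines of Python. The six codes below are made-up illustrative values, not data from any study, and the variable names are our own.

```python
from collections import Counter

# Hypothetical coding of six participants: 1 = male, 2 = female.
gender_codes = [1, 2, 2, 1, 2, 2]

# An equally valid coding that swaps the labels: 2 = male, 1 = female.
swap = {1: 2, 2: 1}
swapped_codes = [swap[code] for code in gender_codes]

# Counting how many people fall in each category is meaningful,
# and the count of males is the same under either coding.
counts_original = Counter(gender_codes)
counts_swapped = Counter(swapped_codes)
print(counts_original[1], counts_swapped[2])

# Arithmetic on the labels is not meaningful: the 'mean' of the codes
# changes when you swap the labels, although the people haven't changed.
mean_original = sum(gender_codes) / len(gender_codes)
mean_swapped = sum(swapped_codes) / len(swapped_codes)
print(mean_original, mean_swapped)
```

Any result that changes when you relabel the categories, like the 'mean' here, is telling you about the labels rather than about the participants.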


Ordinal

If you measure a variable at the ordinal level then the values on the variable have the measurement property of magnitude only (see the earlier section 'Magnitude'). You measure a variable at the ordinal level when the scores in the variable are ordered ranks.

For example, when measuring age using the categories 'less than 25', '25 to 40' and 'older than 40', you measure age at the ordinal level. Imagine that you give a score of 1 on the variable age to people who report being less than 25; you give a score of 2 to anyone who reports being between 25 and 40; and you give a score of 3 to anyone who reports being older than 40. These numbers (1, 2 and 3) don't tell you how much older (or younger) one person is compared to another, simply that one person is older (or younger) than another. The numbers here are ordered ranks and only tell you whether one score is greater or less than another score.

We're psychologists, not English teachers

People sometimes discuss whether you should use the word data in a plural or singular sense. The word data is the plural of datum, so it is correct to insist that it's used in a plural sense. Therefore, you should say something like 'these data have the property of magnitude' rather than 'this data has the property of magnitude'. However, our aim in this book is to ensure that you understand statistics, not to lecture you about the finer points of the English language. So, leave this type of debate to those who have time to engage in a bit of good old-fashioned pedantry.

In psychological research, you rarely want to distinguish between variables measured at the interval level and variables measured at the ratio level. Therefore, you often refer to variables measured at the interval/ratio level (see the next section). Sometimes you hear the measurement of variables at the interval/ratio level referred to as a scale measurement.

Interval/ratio

If you measure a variable at the interval level of measurement, it has the measurement properties of magnitude and equal intervals; if you measure a variable at the ratio level of measurement, it has the measurement properties of magnitude, equal intervals and a true absolute zero (see the earlier section 'Measurement properties'). Psychologists tend not to worry about identifying a true absolute zero point, and if the units on a variable are at equal intervals then the variable is at the interval/ratio level.

For example, consider the variable weight, measured in kilograms. Weight has the measurement property of equal intervals (and, by necessity, magnitude), and therefore is measured at the interval/ratio level.
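The way the three measurement properties map onto the four levels can be sketched as a small Python function. This is only an illustration under our own naming (level_of_measurement and its three arguments are hypothetical), but the rules it encodes are the hierarchical ones described in 'Measurement properties'.

```python
def level_of_measurement(magnitude, equal_intervals, true_zero):
    """Classify a variable from its three measurement properties."""
    # The properties are hierarchical: each one requires those before it.
    if equal_intervals and not magnitude:
        raise ValueError("equal intervals require magnitude")
    if true_zero and not equal_intervals:
        raise ValueError("a true absolute zero requires equal intervals")
    if true_zero:
        return "ratio"
    if equal_intervals:
        return "interval"
    if magnitude:
        return "ordinal"
    return "nominal"


# Gender coded as 1 or 2: none of the three properties.
print(level_of_measurement(False, False, False))
# Age coded as 1, 2 or 3 for three age bands: magnitude only.
print(level_of_measurement(True, False, False))
# Temperature in degrees Celsius: magnitude and equal intervals.
print(level_of_measurement(True, True, False))
# Weight in kilograms: all three properties.
print(level_of_measurement(True, True, True))
```

In practice you'd often treat the last two results as a single interval/ratio level, because the distinction rarely matters in psychological research.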


Determining the Role of Variables

Research studies in psychology, which involve collecting quantitative data (that's any data that can be counted or rendered as numbers), usually require you to collect and store data on a data sheet about several variables. When it comes to conducting your statistical analyses on this data, you need to know what role each variable played in your research design. Generally speaking, you classify variables in psychological research designs as independent variables, dependent variables or covariates. (And if you got hung up on the phrase 'this data' in this paragraph, check out the 'We're psychologists, not English teachers' sidebar in this chapter.)

The way in which you measure a variable in a research study determines whether it's continuous or discrete (see the section 'Understanding Discrete and Continuous Variables') and whether it's measured at the nominal, ordinal or interval/ratio level (see the section 'Looking at Levels of Measurement'). However, it's the role of the variable in a research design that determines whether it's an independent variable, dependent variable or covariate, which we explain in the following sections. The role of the variable can change from one research study to another.

Independent variables

Independent variables are sometimes referred to as predictor variables. Strictly speaking, an independent variable is a variable that you manipulate so that you can study how the changes in the independent variable influence changes in other variables.

For example, imagine that you're conducting a research study to examine the effect of relaxation therapy in treating anxiety. You design a study where a group of people with anxiety receive a programme of relaxation therapy and another group of people with anxiety receive no additional treatment. In this simple research design the independent variable is the level of intervention, which has two categories (relaxation therapy versus no additional treatment).

In some cases, you refer to variables as independent variables even when you're not directly manipulating them. For example, you might design a research study to examine the effects of spending time in prison on a person's self-esteem. In this case you have two groups of participants - people who are in prison and people who aren't in prison - and you wish to compare the self-esteem levels of these two groups. Here, whether or not you're in prison is the independent variable, but as the researcher you didn't control who went to prison and who didn't - this manipulation of the independent variable just occurs naturally. This type of independent variable is a quasi-independent variable.

Dependent variables

Dependent variables are sometimes referred to as outcome variables or criterion variables. A dependent variable is usually the variable that you expect to change when you manipulate the independent variable. In other words, the dependent variable is the variable that the independent variable affects. Therefore, the dependent variable is so called because its value depends on the value of the independent variable (at least in theory).

In the research design examining the effect of relaxation therapy on levels of anxiety (see the last section), the independent variable is the level of intervention and the dependent variable is the participants' level of anxiety. In this case, you hypothesise that the participants' level of anxiety will depend on whether or not they receive the relaxation therapy. In the second example mentioned in the previous section, the independent variable is whether or not a person is in prison, and the dependent variable is the person's level of self-esteem.

Covariates

A covariate is a broad term used for a variable in a research design that's neither an independent nor a dependent variable. In some designs you use a covariate to take account of other factors that might influence the relationship between the independent and dependent variable.

As an example, take the research design mentioned in the earlier section 'Independent variables', where the study aims to examine the effect of relaxation therapy on anxiety. In this example, you could probably think of lots of variables that aren't part of the intervention that might influence anxiety scores - for example whether the participants are taking any medication for their anxiety, the type of social support they receive at home, and so on. A good research design measures these variables so that you can account for their influence on anxiety scores in the analysis. Within this research design, these variables are covariates.

Covariates can also exist in research designs where no independent or dependent variables exist. For example, imagine you're designing a research study to examine the relationship between confidence and accuracy of eyewitness testimony in court. Basically, you want to conduct a study to determine whether eyewitnesses who are more confident about their testimony are also more accurate in their testimony. In this study you have two variables - confidence and accuracy. However, neither variable is being manipulated and you can't say that the values of one variable are dependent on the other. In fact, we believe that the relationship between the variables could work in either direction: that your confidence might depend on your accuracy or that your accuracy might depend on your confidence. In this case, you can't clearly identify an independent variable and a dependent variable. Both variables are therefore covariates, and are sometimes referred to as correlates.
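One way to see why neither correlate has to play the independent role is to compute a measure of association that treats both variables alike. The Python sketch below uses the Pearson correlation coefficient, which this chapter doesn't itself introduce, so treat the choice as our assumption. The scores are made up purely for illustration and the function name pearson_r is our own.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five eyewitnesses (made up for illustration only).
confidence = [3, 5, 6, 8, 9]
accuracy = [40, 55, 50, 70, 80]

# The coefficient is the same whichever variable you list first,
# so it says nothing about which variable depends on the other.
r_conf_acc = pearson_r(confidence, accuracy)
r_acc_conf = pearson_r(accuracy, confidence)
print(r_conf_acc, r_acc_conf)
```

Because swapping the two variables leaves the result unchanged, an analysis like this fits a design where you can't say which variable depends on the other.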