janis-shields
GZLM
... including GEE
• Generalized Linear Modelling
• A family of significance tests
• ... Something we don’t see mentioned much in articles yet... but will hear more of
• Maybe we should be using it!
• Often we have RQs and RHs that require us to compare groups and/or conditions. E.g.
– Do student attitudes vary depending on year of study and on which of two types of speaking instruction they received?
– Do word length and word frequency affect how well learners remember the word?
– Do speakers differ in their pronunciation of a sound depending on gender and formality of situation?
– Do students trained to use online dictionaries improve in writing more than those who are not?
• The well-known statistical significance tests for these comparisons belong to the GLM family
• General Linear Model
• GLM includes:
– t tests
– ANOVA
– Pearson correlation
– Linear regression
• But GLM is picky... comes with prerequisite requirements
• 1. the DV scale
– Can’t deal with data that is not scores...
– ...that are ‘equal interval’ and
– ...on a supposedly open ended scale
– Counts have to be treated as scores
– Rating scales possibly, but... are they ‘equal interval’?
– Not binary data such as pass/fail or yes/no responses
• 2. Further features of the score data (in the population) e.g.
– Normality of distribution shape of scores
– Similarity of spread of scores in different groups (aka homoscedasticity or homogeneity of variance)
– Similarity of variance of differences between pairs of repeated measures (aka sphericity)
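As a rough illustration of the 'similarity of spread' prerequisite, the sketch below compares the sample variances of two groups. The scores and the helper name `variance_ratio` are invented for illustration; a formal check would normally be something like Levene's test rather than this eyeball ratio.

```python
from statistics import variance

# Hypothetical scores for two groups (invented for illustration only)
group_a = [52, 55, 58, 60, 61, 63, 66]
group_b = [30, 45, 58, 60, 72, 85, 95]


def variance_ratio(x, y):
    """Larger sample variance divided by the smaller one (1.0 = identical spread)."""
    vx, vy = variance(x), variance(y)
    return max(vx, vy) / min(vx, vy)


print(f"variance A = {variance(group_a):.1f}")
print(f"variance B = {variance(group_b):.1f}")
print(f"ratio      = {variance_ratio(group_a, group_b):.1f}")
```

A ratio far above 1 suggests that the groups differ in spread, which is the situation the homogeneity of variance prerequisite rules out.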
• Previous ways of dealing with data that fails the prerequisites
– Use GLM anyway, claiming it is ‘robust’ even when prerequisites are missing
  • Or just use GLM and don’t check/mention the problems
– For normality, transform the data to be more ‘normal’ in shape (Example)
  • But results are then hard to talk about
– Use an alternative test (nonparametric, weaker)
  • But such tests are only available for simple comparisons
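To make the transformation option concrete, here is a minimal sketch with invented, right-skewed data. It assumes a log transform (one common choice, not necessarily the one in the original example) and a simple population skewness measure. It also shows why results become hard to talk about: the mean of the logged data back-transforms to a geometric mean, not the ordinary mean.

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric shape)."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])


# Hypothetical response times in ms (invented), with a long right tail
times = [410, 430, 450, 470, 500, 520, 560, 640, 900, 1800]
logged = [math.log(t) for t in times]

print(f"skew of raw data    = {skewness(times):.2f}")
print(f"skew of logged data = {skewness(logged):.2f}")

# The log-scale mean, back-transformed, is a geometric mean of the raw data
print(f"ordinary mean           = {mean(times):.0f}")
print(f"back-transformed mean   = {math.exp(mean(logged)):.0f}")
```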
• Since the 80s, but only recently available in popular packages like SPSS...
• GZLM, including GEE, covers most of the ground of the GLM family, and more, and deals with most of the problems
• Generalized Linear Model itself for comparing groups only (GZLM)
• An extension of GZLM called Generalized Estimating Equations (GEE) for comparing repeated measures (and groups if necessary)
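For reference, the idea behind a generalized linear model can be written compactly. This is the standard textbook formulation rather than anything taken from the slides, and the symbols ($g$, $\mu_i$, $\beta$) are the usual ones, introduced here as assumptions:

```latex
g(\mu_i) = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip},
\qquad \mu_i = E(Y_i)
```

Here $Y_i$ is the DV, the $x_{ij}$ are the EVs, and $g$ is the link function. With a normal distribution and the identity link $g(\mu) = \mu$ this reduces to the ordinary GLM; counts are typically handled with a Poisson distribution and the log link $g(\mu) = \log \mu$, and binary data with a binomial distribution and the logit link $g(\mu) = \log\frac{\mu}{1-\mu}$. GEE keeps the same equation for the mean but adds a working correlation structure for the repeated measures within each person.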
• An example comparing groups
• We see the issue of choosing the right analysis for the distribution shape
• Marin’s data, DV6
– DV: Six point rating scale response for how often learners use vocab strategies
– EV: Two genders
– EV: Five years of study in university
• An example with repeated measures
• We see how to turn the data into ‘long’ form which GEE requires
• Issariya’s data
– DV: Percent correct scores for learning vocab wordlists
– EV: Pretest versus posttest
– EV: Experimental group (with vocab learning strategy instruction) and control group (with extra practice)
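The 'long' form mentioned above means one row per person per occasion, instead of one row per person with a column for each occasion. The sketch below shows the reshaping in plain Python. The data values, the column names and the helper `to_long` are all invented for illustration; in SPSS itself the same job is done through the Restructure option on the Data menu.

```python
# Hypothetical wide-form data: one row per learner (all values invented)
wide = [
    {"id": 1, "group": "experimental", "pretest": 40, "posttest": 65},
    {"id": 2, "group": "control", "pretest": 42, "posttest": 50},
]


def to_long(rows, id_key, between_keys, within_keys, within_name, dv_name):
    """Return one row per learner per occasion (long form)."""
    long_rows = []
    for row in rows:
        for occasion in within_keys:
            new_row = {id_key: row[id_key]}
            # Between-groups EVs are simply repeated on every row for that learner
            for key in between_keys:
                new_row[key] = row[key]
            # The repeated-measures EV becomes a column naming the occasion
            new_row[within_name] = occasion
            # The DV column holds the score for that occasion
            new_row[dv_name] = row[occasion]
            long_rows.append(new_row)
    return long_rows


long_form = to_long(wide, "id", ["group"], ["pretest", "posttest"], "time", "score")
for r in long_form:
    print(r)
```

Each learner now contributes two rows, linked by the shared `id`, which is how the analysis knows which scores are repeated measures from the same person.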
• An example with Poisson distribution
• We see computational limitations
• Nushoor’s data
– DV: Counts of how often people used types of modifier expression with requests
– EV: Types of modifier
– EV: Four groups (2 NS, 2 NNS)
– EVs: Types of request situation in terms of social variables such as power, and seriousness
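To give a feel for what a Poisson analysis of counts estimates, here is a deliberately simplified sketch with only one two-level EV (NS versus NNS) and invented counts. With a single two-group EV, a Poisson model with a log link boils down to the ratio of the two mean counts, with an approximate Wald interval on the log scale. The function name `poisson_rate_ratio` is hypothetical, and the sketch ignores the other EVs and any repeated measures that a real analysis of data like this would need to handle.

```python
import math

# Hypothetical counts of one modifier type per speaker (invented)
ns_counts = [3, 5, 2, 4, 6, 4]
nns_counts = [1, 2, 0, 3, 1, 2]


def poisson_rate_ratio(counts_a, counts_b, z=1.96):
    """Rate ratio (a relative to b) with an approximate 95% Wald interval."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    # Log of the ratio of mean counts: the group coefficient under a log link
    log_rr = math.log((total_a / len(counts_a)) / (total_b / len(counts_b)))
    # Standard error of the log rate ratio for two Poisson totals
    se = math.sqrt(1 / total_a + 1 / total_b)
    return math.exp(log_rr), math.exp(log_rr - z * se), math.exp(log_rr + z * se)


rr, low, high = poisson_rate_ratio(ns_counts, nns_counts)
print(f"rate ratio = {rr:.2f}, approx. 95% CI {low:.2f} to {high:.2f}")
```

An interval that excludes 1 suggests the two groups differ in how often they use the modifier, under the assumptions just listed.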
• An example with binary data
• Vineeta’s r data
– DV: Numbers of r produced versus other variants
– EVs: Various features of the word
– EVs: Various features of the people
– EV: Formality of situation
• This analysis is, I think, more or less equivalent to what traditional Varbrul analysis does.... BUT
– The output is in a different form
– In fact this sort of analysis is not really statistically acceptable anyway (see http://www.scottishhistorysociety.org/media/media_200043_en.pdf)
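To make the binary case concrete, the sketch below computes what a binary (logit) analysis with a single two-level EV would estimate: the odds ratio for producing r in formal versus casual situations. The token counts and the helper `log_odds` are invented for illustration. Note that it pools all tokens as if they were independent observations, which is exactly the weakness noted above.

```python
import math

# Hypothetical token counts (invented): (r produced, other variant)
formal = (60, 40)
casual = (35, 65)


def log_odds(hits, misses):
    """Log of the odds of producing r rather than another variant."""
    return math.log(hits / misses)


# With one two-level EV, the logit coefficient is the difference in log odds
coefficient = log_odds(*formal) - log_odds(*casual)
odds_ratio = math.exp(coefficient)

print(f"proportion of r, formal = {formal[0] / sum(formal):.2f}")
print(f"proportion of r, casual = {casual[0] / sum(casual):.2f}")
print(f"odds ratio (formal vs casual) = {odds_ratio:.2f}")
```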
• To analyse data like Vineeta’s properly we need either the latest version of Varbrul called Rbrul, or GZLM Mixed... the latest bit of GZLM added to SPSS
• Watch this space....