GZLM... including GEE. Generalized Linear Modelling A family of significance tests... Something we don’t see mentioned much in articles yet... but will

GZLM

... including GEE

• Generalized Linear Modelling

• A family of significance tests

• ... Something we don’t see mentioned much in articles yet... but will hear more of

• Maybe we should be using it!

• Often we have RQs and RHs that require us to compare groups and/or conditions. E.g.

– Do student attitudes vary depending on year of study and on which of two types of speaking instruction they received?

– Do word length and word frequency affect how well learners remember a word?

– Do speakers differ in their pronunciation of a sound depending on gender and formality of situation?

– Do students trained to use online dictionaries improve in writing more than those who are not?

• Well known statistical significance tests for these comparisons are the GLM family

• General Linear Model

• GLM includes:

– t tests

– ANOVA

– Pearson correlation

– Linear regression

• But GLM is picky... comes with prerequisite requirements

• 1. the DV scale

– Can’t deal with data that is not scores...

– ...that are ‘equal interval’ and

– ...on a supposedly open ended scale

– Counts have to be treated as scores

– Rating scales possibly, but... are they ‘equal interval’

– Not binary data such as pass/fail or yes/no responses

• 2. Further features of the score data (in the population) e.g.

– Normality of distribution shape of scores

– Similarity of spread of scores in different groups (aka homoscedasticity or homogeneity of variance)

– Similarity of variance of differences between pairs of repeated measures (aka sphericity)
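
A rough way to eyeball the 'similarity of spread' prerequisite is to compare the sample variances of the groups. The sketch below is only an informal screening check, not a formal test such as Levene's; the scores and the group names are invented for illustration.

```python
from statistics import mean, variance

# Hypothetical scores for two groups (invented purely for illustration).
scores = {
    "group_a": [12, 15, 14, 10, 13, 16],
    "group_b": [5, 22, 9, 30, 14, 2],
}

def variance_ratio(groups):
    """Return the largest sample variance divided by the smallest."""
    variances = [variance(values) for values in groups.values()]
    return max(variances) / min(variances)

for name, values in scores.items():
    print(name, "mean =", round(mean(values), 2),
          "variance =", round(variance(values), 2))

ratio = variance_ratio(scores)
print("variance ratio =", round(ratio, 2))
```

The two made-up groups have almost the same mean but very different spread, which is the kind of data where the homogeneity of variance prerequisite is in doubt.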

• Previous ways of dealing with data that fails the prerequisites

– Use GLM anyway, claiming it is ‘robust’ even when prerequisites are missing

• Or just use GLM and don’t check/mention the problems

– For normality, transform the data to be more ‘normal’ in shape (Example)

• But results are then hard to talk about

– Use an alternative test (nonparametric, weaker)

• But such tests are only available for simple comparisons
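
One reason transformed results are hard to talk about can be seen with a log transformation, used here as one common choice (the slides do not say which transformation is meant). Averaging the logs and transforming back gives the geometric mean, which is not the ordinary mean of the original scores. The scores below are invented for illustration.

```python
import math
from statistics import mean

# Hypothetical positively skewed scores (invented for illustration).
raw_scores = [1, 2, 2, 3, 4, 8, 20]

# Log-transform to pull in the long right tail.
log_scores = [math.log(x) for x in raw_scores]

arithmetic_mean = mean(raw_scores)

# Back-transforming the mean of the logs gives the geometric mean,
# which is a different quantity from the ordinary mean of the raw scores.
back_transformed_mean = math.exp(mean(log_scores))

print("mean of raw scores:", round(arithmetic_mean, 2))
print("back-transformed mean of log scores:", round(back_transformed_mean, 2))
```

Any group difference found on the log scale is therefore a statement about ratios of geometric means, not about differences in average score.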

• Since the 80s, but only recently available in popular packages like SPSS...

• GZLM, including GEE, covers most of the ground of the GLM family, and more, and deals with most of the problems

• Generalized Linear Model itself for comparing groups only (GZLM)

• An extension of GZLM called Generalized Estimating Equations (GEE) for comparing repeated measures (and groups if necessary)
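
For reference, the general shape of a generalized linear model can be written as below. This is the standard textbook formulation, not something taken from these slides, and the symbols are introduced here only for illustration: \mu_i is the expected value of the DV for case i, the x terms are the EVs, and g is the 'link function'.

```latex
g(\mu_i) = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip},
\qquad \mu_i = E(y_i)
```

With a normal distribution and the identity link g(\mu) = \mu this is the ordinary GLM. Counts are usually handled with a Poisson distribution and log link, and binary data with a binomial distribution and logit link. GEE keeps the same model for the mean but adds a 'working correlation' structure for repeated measures on the same person.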

• An example comparing groups

• We see the issue of choosing the right analysis for the distribution shape

• Marin’s data, DV6

– DV: Six point rating scale response for how often learners use vocab strategies

– EV: Two genders

– EV: Five years of study in university

• An example with repeated measures

• We see how to turn the data into ‘long’ form which GEE requires

• Issariya’s data

– DV: Percent correct scores for learning vocab wordlists

– EV: Pretest versus posttest

– EV: Experimental group (with vocab learning strategy instruction) and control group (with extra practice)
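
The 'long' form simply means one row per measurement rather than one row per person. A minimal sketch of the reshaping in plain Python follows; the column names and scores are invented for illustration and are not Issariya's actual data, and in SPSS the same step would be done through the package's own data restructuring facilities.

```python
# Hypothetical 'wide' data: one row per learner, one column per test occasion.
wide_rows = [
    {"id": 1, "group": "experimental", "pretest": 40, "posttest": 65},
    {"id": 2, "group": "experimental", "pretest": 55, "posttest": 70},
    {"id": 3, "group": "control", "pretest": 45, "posttest": 50},
]

def wide_to_long(rows, id_vars, value_vars, var_name, value_name):
    """Stack the repeated-measure columns so each row holds one measurement."""
    long_rows = []
    for row in rows:
        for column in value_vars:
            long_row = {key: row[key] for key in id_vars}
            long_row[var_name] = column          # which occasion this row is
            long_row[value_name] = row[column]   # the score on that occasion
            long_rows.append(long_row)
    return long_rows

long_rows = wide_to_long(
    wide_rows,
    id_vars=["id", "group"],
    value_vars=["pretest", "posttest"],
    var_name="time",
    value_name="percent_correct",
)

for row in long_rows:
    print(row)
```

Each learner now appears on two rows, with the id column marking which rows belong to the same person. That is the information a repeated measures analysis needs in order to treat those rows as related.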

• An example with Poisson distribution

• We see computational limitations

• Nushoor’s data

– DV: Counts of how often people used types of modifier expression with requests

– EV: Types of modifier

– EV: Four groups (2 NS, 2 NNS)

– EVs: Types of request situation in terms of social variables such as power, and seriousness
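
To give a feel for what a Poisson analysis of counts estimates, here is a much reduced sketch with a single two-level EV. In that simple case the fitted means of a Poisson model with log link are just the group means, so the coefficients can be worked out by hand. The counts and the two group labels are invented, and a real analysis with several EVs needs a statistics package.

```python
import math
from statistics import mean

# Hypothetical counts of one modifier type per speaker (invented data).
counts = {
    "NS": [3, 5, 4, 6, 2],
    "NNS": [1, 2, 0, 3, 1],
}

def poisson_log_pmf(k, rate):
    """Log probability of observing count k under a Poisson distribution."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)

# With one two-level predictor and a log link, the maximum likelihood
# fitted means are simply the observed group means.
rate_ns = mean(counts["NS"])
rate_nns = mean(counts["NNS"])

intercept = math.log(rate_ns)                 # log mean count, reference group (NS)
group_effect = math.log(rate_nns / rate_ns)   # log rate ratio, NNS versus NS
rate_ratio = math.exp(group_effect)

# Log likelihood of the data under the fitted group rates.
log_likelihood = sum(
    poisson_log_pmf(k, rate)
    for group, rate in (("NS", rate_ns), ("NNS", rate_nns))
    for k in counts[group]
)

print("intercept (log scale):", round(intercept, 3))
print("group effect (log scale):", round(group_effect, 3))
print("rate ratio NNS/NS:", round(rate_ratio, 3))
print("log likelihood:", round(log_likelihood, 3))
```

The coefficients are on the log scale, so they are usually exponentiated and reported as rate ratios: here, how many times as often one group uses the modifier compared with the other.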

• An example with binary data

• Vineeta’s r data

– DV: Numbers of r produced versus other variants

– EVs: Various features of the word

– EVs: Various features of the people

– EV: Formality of situation
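
For binary data the model works on the log odds (logit) of one outcome versus the other. A reduced sketch with a single two-level EV (formality) shows what the coefficients mean; the token counts are invented for illustration and are not Vineeta's data.

```python
import math

# Hypothetical token counts (invented): r produced versus other variants,
# split by formality of situation.
tokens = {
    "formal": {"r": 30, "other": 20},
    "informal": {"r": 15, "other": 35},
}

def log_odds(r_count, other_count):
    """Log odds (logit) of producing r rather than another variant."""
    return math.log(r_count / other_count)

def inverse_logit(x):
    """Convert log odds back to a probability."""
    return 1 / (1 + math.exp(-x))

# With one two-level predictor, the logistic model's estimates are the
# reference-level log odds and the difference in log odds between levels.
intercept = log_odds(tokens["informal"]["r"], tokens["informal"]["other"])
formality_effect = log_odds(tokens["formal"]["r"], tokens["formal"]["other"]) - intercept

odds_ratio = math.exp(formality_effect)
p_formal = inverse_logit(intercept + formality_effect)

print("odds ratio, formal versus informal:", round(odds_ratio, 2))
print("predicted probability of r in formal situations:", round(p_formal, 2))
```

Note that this sketch treats every token as independent, even though many tokens come from the same speaker.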

• This analysis is I think more or less equivalent to what traditional Varbrul analysis does.... BUT

– The output is in a different form

– In fact this sort of analysis is not really statistically acceptable anyway (see http://www.scottishhistorysociety.org/media/media_200043_en.pdf)

• To analyse data like Vineeta’s properly we need either the latest version of Varbrul called Rbrul, or GZLM Mixed... the latest bit of GZLM added to SPSS

• Watch this space....
