47
There’s two ways to say it: Modeling nonprestige there’s BRIAN RIORDAN Abstract A variety of linguistic, processing, and social factors have been shown to be associated with concord in existential there constructions (e. g., Meechan and Foley 1994; Tagliamonte1998). Using data from the Michi- gan Corpus of Academic Spoken English (MICASE), I assess the predic- tivity of various factors for the appearance of nonprestige there’s, including the novel factor of discourse type. Two binary logistic regression models are developed to model the production of existentials. In both models, highly predictive factors included the type of determiner that appeared in the post- copular NP, the age of the speaker, and crucially, the type of discourse monologic, interactive, or mixed in which speakers were engaged. A “mixed-effect” (Baayen, Davidson et al., submitted) logistic regression model provided a better fit to the data than the standard logistic regression model used in previous studies, highlighting the importance of idiolectal variation in the production of existentials. Implications for syntactic theo- ries of the existential construction are discussed. Keywords: existential there; syntactic variation; agreement; concord; lo- gistic regression; mixed effects models. 1. Introduction “Singular agreement”, “nonagreement” (Schütze 1999), or “noncon- cord” (Martinez Insua and Palacios Martinez 2003) in existential there constructions is one of the most widely attested and studied nonstandard morphosyntactic features in English. Britain and Sudbury (2002) counted no less than 30 studies that either focus on or document the existence of this feature; it has been reported in English varieties in the United Kingdom, Canada, the United States, Australia, New Zealand, Corpus Linguistics and Linguistic Theory 32 (2007), 233279 1613-7027/07/00030233 DOI 10.1515/CLLT.2007.013 Walter de Gruyter Brought to you by | Johns Hopkins University Authenticated | 134.208.103.160 Download Date | 4/9/14 6:59 PM

There's two ways to say it: Modeling nonprestige there's

  • Upload
    brian

  • View
    216

  • Download
    1

Embed Size (px)

Citation preview

Page 1: There's two ways to say it: Modeling nonprestige there's

There’s two ways to say it:Modeling nonprestige there’s

BRIAN RIORDAN

Abstract

A variety of linguistic, processing, and social factors have been shown tobe associated with concord in existential there constructions (e. g.,Meechan and Foley 1994; Tagliamonte 1998). Using data from the Michi-gan Corpus of Academic Spoken English (MICASE), I assess the predic-tivity of various factors for the appearance of nonprestige there’s, includingthe novel factor of discourse type. Two binary logistic regression models aredeveloped to model the production of existentials. In both models, highlypredictive factors included the type of determiner that appeared in the post-copular NP, the age of the speaker, and crucially, the type of discourse �monologic, interactive, or mixed � in which speakers were engaged. A“mixed-effect” (Baayen, Davidson et al., submitted) logistic regressionmodel provided a better fit to the data than the standard logistic regressionmodel used in previous studies, highlighting the importance of idiolectalvariation in the production of existentials. Implications for syntactic theo-ries of the existential construction are discussed.

Keywords: existential there; syntactic variation; agreement; concord; lo-gistic regression; mixed effects models.

1. Introduction

“Singular agreement”, “nonagreement” (Schütze 1999), or “noncon-cord” (Martinez Insua and Palacios Martinez 2003) in existential thereconstructions is one of the most widely attested and studied nonstandardmorphosyntactic features in English. Britain and Sudbury (2002)counted no less than 30 studies that either focus on or document theexistence of this feature; it has been reported in English varieties in theUnited Kingdom, Canada, the United States, Australia, New Zealand,

Corpus Linguistics and Linguistic Theory 3�2 (2007), 233�279 1613-7027/07/0003�0233DOI 10.1515/CLLT.2007.013 � Walter de Gruyter

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 2: There's two ways to say it: Modeling nonprestige there's

234 B. Riordan

and elsewhere. Indeed, nonconcord in existentials may be a universalfeature of spoken English.

Studies of the distribution of concord in existentials have rarely showncategoricity. Some studies have come close: Eisikovits (1991) reported97.7 % nonconcord in the present tense and 88.9 % in the past tense in astudy of teenagers in Sydney, Australia. More often, however, the per-centages have been much lower: Tagliamonte (1998), studying only pasttense variation in York, UK, reported a figure of 62 %, and Meechanand Foley (1994), reporting on older speakers of Ottawa, Canada, found71 % nonconcord across both tenses (see Table 1).

This predominance of the singular copula in existentials � in particu-lar the contracted form there’s � has led some to speculate that singularagreement is actually the more basic pattern, or that a form like there’sis stored in the mental lexicon as an unanalyzed unit. Cheshire (1999)argues for the latter, characterizing there’s as a formula that serves animportant conversational function: taking the floor and quickly intro-ducing and leading listeners to a new referent. Cheshire speculates thatit is only when turns are fixed, as in monologic discourse, that speakershave the time to plan and adjust their speech to prescriptive rules ofgrammar, where the preference for concord in existentials presumablycomes from (Cheshire 1999).

Nevertheless, theoretical accounts of agreement in existentials havetraditionally taken plural agreement to be the more basic pattern, andthe one deserving of explanation (e. g., Chomsky 1995, Hazout 2004,Moro 2006). Recently, however, a number of theoretical accounts havebegun to emerge that attempt to reckon with variation in agreementpatterns in existentials (Meechan and Foley 1994; Schütze 1999; Henry2002, 2005; Pietsch 2005; Rupp 2005; Parrot 2007).

Since concord variability in existentials has drawn the attention mainlyof sociolinguistic researchers, most corpora used for quantitative re-search have been based on small-scale sociolinguistic interview projectswith members of particular social networks or within isolated speechcommunities. This has been the case particularly in the United States,where quantitative reports exist for several communities in the southernUnited States (e. g., in Appalachia and the Ozarks, see Christian, Wol-fram et al. 1988). As a result, there has been a focus on nonstandardvarieties, and the number of existential examples analyzed in any onestudy has been relatively small (Table 1). Recently, however, variation inexistential there concord has begun to attract the attention of researchersoutside sociolinguistics (e. g., Martinez Insua and Palacios Martinez 2003and Crawford 2005). These researchers have investigated the degree ofnonconcord in existentials in much larger corpora, and have comparednonconcord in spoken registers with written registers.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 3: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 235T

able

1.C

hara

cter

isti

csof

rece

ntqu

anti

tati

vest

udie

sof

conc

ord

inex

iste

ntia

ls(N

>20

0)

Lin

guis

tic

Var

iabl

eC

orpo

raN

ames

Cor

pora

Type

sN

umbe

r%

Non

conc

ord

Mul

ti-of

Cas

esin

exis

tent

ials

vari

ate

Ana

lysi

s?

Mee

chan

and

Fol

ey(1

994)

ther

e�

BE

�pl

ural

1.A

fric

anN

ova

Scot

ian

Spok

en(I

nfor

mal

)23

571

%(1

67/2

35)a

Yes

post

copu

lar

NP

Eng

lish

Con

trol

Gro

upC

orpu

s2.

Lin

guis

tics

Dep

artm

ent

Arc

hive

sof

Spok

enL

angu

age

Mat

eria

lsTa

glia

mon

te(1

998)

ther

e�

past

BE

Yor

kE

nglis

hC

orpu

sSp

oken

(Inf

orm

al)

310

(pas

tonl

y)62

%(1

93/3

10)

Yes

Bri

tain

and

Sudb

ury

ther

e�

BE

�pl

ural

Wel

lingt

onC

orpu

sof

Spok

enSp

oken

(Inf

orm

al)

538

Una

vaila

ble

Yes

(200

2):N

ewZ

eala

ndpo

stco

pula

rN

PN

ewZ

eala

ndE

nglis

hE

nglis

hB

rita

inan

dSu

dbur

yth

ere

�B

E�

plur

alF

alkl

and

Isla

ndda

taSp

oken

(Inf

orm

al)

538

Una

vaila

ble

Yes

(200

2):F

alkl

and

Isla

ndpo

stco

pula

rN

PE

nglis

hB

rita

in(2

002)

ther

e�

past

BE

Fen

sda

taan

dor

alhi

stor

ySp

oken

(Inf

orm

al)

426

44%

(186

/426

)N

ob

arch

ive

Mar

tinez

Insu

aan

dth

ere

�B

EB

ritis

hN

atio

nalC

orpu

sSp

oken

(For

mal

1447

13%

No

Pal

acio

sM

arti

nez

(200

3)an

dIn

form

al)

Hay

and

Schr

eier

(200

4)th

ere

�B

E�

plur

al1.

Mob

ileU

nitA

rchi

veSp

oken

(Inf

orm

al)

1028

58%

(599

/102

8)Y

espo

stco

pula

rN

P2.

Inte

rmed

iate

Arc

hive

3.C

ante

rbur

yC

orpu

sC

raw

ford

(200

5)th

ere

�B

E�

plur

alL

ongm

anSp

oken

Cor

pus

Spok

en(I

nfor

mal

)23

0550

%N

opo

stco

pula

rN

Pth

ere

�B

E�

plur

alT

OE

FL

2000

Spok

enan

dW

ritt

enSp

oken

(For

mal

)14

8145

%po

stco

pula

rN

PA

cade

mic

Lan

guag

eC

orpu

sP

iets

ch(2

005)

ther

e�

pres

/pas

tBE

Nor

ther

nIr

elan

dT

rans

crib

edSp

oken

(Inf

orm

al)

2072

c-s

Yes

Cor

pus

ofSp

eech

68%

(517

/758

)-r 13

%(1

69/1

314)

Rio

rdan

(200

7)th

ere

�pr

esB

E�

Mic

higa

nC

orpu

sof

Spok

enSp

oken

(For

mal

1520

40%

Yes

plur

alpo

stco

pula

rN

PA

cade

mic

Eng

lish

and

Info

rmal

)

aM

eech

anan

dF

oley

repo

rta

figu

reof

72%

nonc

onco

rdou

tof2

46to

kens

,but

they

subs

eque

ntly

redu

ceth

esi

zeof

the

corp

usto

235

toke

ns.

71%

isth

eco

rrec

tper

cent

age

ofno

ncon

cord

inth

ere

duce

dco

rpus

base

don

thei

rT

able

5.b

Bri

tain

’sst

udy

does

notp

rese

nta

sepa

rate

mul

tiva

riat

ean

alys

isof

the

dist

ribu

tion

ofex

iste

ntia

lthe

rese

nten

ces.

cF

rom

Pie

tsch

(200

5)T

able

5.13

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 4: There's two ways to say it: Modeling nonprestige there's

236 B. Riordan

Many factors, linguistic and extralinguistic, have been proposed toinfluence the rate at which speakers produce nonconcord in existentials.One idea, which has received some support from several quantitativestudies, is that greater education is associated with higher levels of con-cord. Meechan and Foley (1994) assert this idea most forcefully as theyattempt to explain their quantitative results indicating a significant asso-ciation between level of education (less than or greater than 11 years offormal education) and concord in existentials:

… concord in existentials may be linked to grammatical rules encoun-tered during the later stages of formal education. There was an overallincreased use of concord associated with attendance at high schooland exposure to more advanced rules of grammar. We assert that theseprescriptive rules of grammar taught in high school have influencedthe use of concord in both these sets of speakers and serve to obscurethe fact that nonconcord is the norm. This also provides an explana-tion for the fact that concord is assumed in most structural analyses,as the authors of these analyses are all highly educated. (Meechan andFoley 1994: 82)

Meechan and Foley thus claim a crucial role for formal education, inparticular “exposure to advanced rules of grammar”, in bringing abouta shift from nonconcord to concord in existentials. However, Meechanand Foley do not make explicit any predictions about the nature of sucha shift.

Crawford (2005) argues against such a conclusion based on his com-parison of the percentage of nonconcord in existentials in the LongmanSpoken Corpus and the TOEFL 2000 Spoken and Written AcademicLanguage Corpus (T2K-SWAL). The Longman Spoken Corpus is a 2.5million word corpus of informal conversation representing all levels ofeducation, while the T2K-SWAL corpus is primarily composed of aca-demic lectures. In the context of a plural postcopular NP, the percentageof nonconcord in the Longman corpus was 50 %, while that in the T2K-SWAL corpus was 45 %. Assuming that the level of education of speak-ers in the T2K-SWAL corpus was higher than that of speakers in theLongman corpus, Crawford concludes that there is little difference inexistential concord behavior based on level of education.1

If the reasoning of Cheshire (1999) is correct, we might expect the typeof discourse � monologic versus conversational, and thus formal versusinformal � to affect the percentage of nonconcord in existentials. Spe-cifically, we might expect the percentage of nonconcord in informalcontexts to be high, as in previous sociolinguistic reports, when com-pared with more monologic speech, where speakers might tend to more

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 5: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 237

closely follow standard prescriptions on agreement. Crawford (2005),however, argues that his results do not support a controlling effect forformality, since the T2K-SWAL corpus evidenced a level of nonconcordvery similar to that of the informal conversation in the Longman corpus.However, it is unclear whether the lecture-based T2K-SWAL corpus ismonolithically formal; some lectures may have included a degree of in-teractivity, which may have affected perceptions of the discourse typein which participants were engaged and thereby changed their concordpreferences in existentials.

The present study investigates in more detail the effects of level ofeducation and discourse type on nonconcord production. Like Crawford(2005) and Martinez Insua and Palacios Martinez (2003), I use a largecorpus to gain a “wide-angle” view of the phenomenon. Like many pre-vious sociolinguistic approaches (see Table 1), I also attempt to modelthe production of concord in existentials using multivariate statisticalmodels.

The corpus used in this study is the Michigan Corpus of AcademicSpoken English (MICASE) (Simpson, Briggs et al. 2002), a 1.7 millionword corpus of speech in “academic contexts” recorded at the Universityof Michigan. I use this corpus for two reasons. First, each transcript iscoded for discourse type � monologic, interactive, or mixed � allowinga comparison of nonconcord behavior in each discourse type. Otherstudies have attempted to discern the effect of discourse (e. g., Meechanand Foley 1994), but evidence of its effect has been elusive, despite stronganecdotal evidence that informal and formal speech differ greatly in thefrequency of nonconcord. Second, using a corpus devoted to “academicspeech” controls for the level of education variable: since the vast major-ity of speakers in this corpus are students or faculty of the University ofMichigan, they are all by definition high school-educated and above.This allows us to test the hypothesis that speakers with this level ofeducation behave similarly in their production of concord in existentials.

This study focuses on concord variation in existentials in the presenttense. Each of Meechan and Foley (1994), Martinez Insua and PalaciosMartinez (2003), and Crawford (2005) either focused on or had largecomponents of data on existentials from standard varieties, and eachfound that the present tense is the locus of concord variability. Neverthe-less, variability in the present tense has received less attention in theliterature, especially with respect to United States varieties.

By developing a multivariate statistical model of the production ofconcord in existentials, we are able to investigate the relative magnitudeof the effects of each of the variables hypothesized to influence concordpatterns. However, the effect of variation between individual speakershas not received significant attention in previous sociolinguistic studies.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 6: There's two ways to say it: Modeling nonprestige there's

238 B. Riordan

Individual variation has been hypothesized to affect the base rate ofproduction of a particular linguistic variant (Bayley 2002), but this vari-ability has not been explicitly incorporated into the statistical model.New multivariate modeling techniques � in particular, “mixed-effectmodeling” (Baayen 2007; Baayen, Davidson et al. submitted) � are ableto assess the variance explained in the dependent variable by variationbetween individuals. In this study, I employ a mixed-effect model todevelop a separate multivariate model of existential concord variationto account for inter-speaker variation.

In sum, this study has three goals. First, I seek to complement previ-ous research on concord variation in existentials in world Englishes byproviding data on a near-standard variety of United States English. Sec-ond, this study aims to test claims made in the literature about the roleof age, level of education, and type of discourse on concord patterns inexistentials, regardless of variety. Third, I attempt to explicitly model therole of individual speaker variation in accounting for concord patterns.

This paper is organized as follows. In Section 2 I review all the majorlinguistic, processing, discourse, and social factors that have been shownto be correlated with nonconcord in existentials. In Section 3 I discussthe corpus data and how it was coded. I introduce two new factors thatwere coded for � discourse type and disfluencies � that have not yetbeen explored in the literature. In Section 4 I present a series of distribu-tional analyses of nonconcord in the data by considering its associationwith each of the factors presented in Section 3. In Section 5 I considerall the effects discussed in Section 4 simultaneously by using them aspredictors in a logistic regression model of the data. Section 6 developsan alternative model of the data using mixed effects modeling, at-tempting to account for the variance introduced by individual speakers’concord patterns. Section 7 summarizes and concludes the paper.

2. Factors

Do speakers in MICASE show variable concord patterns in existentials?Data from 332 speakers make up the 1520 utterances that comprise thecurrent corpus of existentials. 195 of these speakers contribute 2 or moreutterances. Of these speakers, 120 speakers (61.5 %) show variable con-cord. When we consider the 150 speakers that contribute 3 or moreutterances, concord is variable in 106 (70.7 %) speakers’ utterances.Thus, for most speakers, concord production is variable. Following pre-vious sociolinguistic research, I hypothesize that this variability is influ-enced by various linguistic and extralinguistic factors.

In this section I review the literature on factors thought to have aneffect on the frequency of nonconcord in existential there constructions,

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 7: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 239

Table 2. Factors explored in quantitative multivariate analyses of copula variation inexistentials. Blank cells represent factors that were not coded in a particularstudy.

Mee

chan

and

Fol

ey(1

994)

Tag

liam

onte

(199

8)

Bri

tain

and

Sudb

ury

(200

2):

NZ

E

Bri

tain

and

Sudb

ury

(200

2):

FIE

Hay

and

Schr

eier

(200

4)

Pie

tsch

(200

5):

NIT

CS

Rio

rdan

(200

7):

Mix

ed-e

ffec

tsm

odel

Linguistic Clause Structure sig.Clitic / Full Copula sig. sig.Determiner Type sig. sig. sig. sig. sig.Determiner Type / sig.DistanceInversion n.s. n.s.Lexical Noun n.s.NP Type n.s. n.s.Plural -s n.s. n.s. n.s.Polarity sig. n.s. n.s. n.s. sig. n.s.Quantifier Type sig. n.s.Region sig.Region / Age sig.Region / Age / Sex sig.Small Clauses n.s.Specificity n.s.Tense n.s. sig. sig. sig.Tense / Polarity sig.

Processing Distance n.s. sig. sig. sig. n.s.Disfluency n.s.

Social Age n.s. n.s. n.s. sig. sig.Education sig. sig. sig.Employment sig.RankingEthnicity n.s.Gender n.s. n.s. n.s. n.s. sig. n.s.Social Status sig.Urban/Rural n.s. sig.Age / Gender sig. sig.Gender / Ethnicity sig.

Discourse Topic of n.s.ConversationPrimary Discourse sig.ModeAge / PrimaryDiscourse Mode

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 8: There's two ways to say it: Modeling nonprestige there's

240 B. Riordan

focusing on the results from recent quantitative investigations. For themost part, I limit my discussion to factors that have been found to havea statistically significant correlation with nonconcord in these studies.These factors fall into four categories: linguistic, processing, social, anddiscourse.

Tables 1 and 2 summarize the major characteristics and findings,respectively, of quantitative studies of that specifically dealt with existen-tial concord variation. For the most part, these studies have focused onconcord variation with plural postcopular NPs2. Sample size in thesestudies varied widely from a low of 235 in Meechan and Foley (1994) to2305 in Crawford (2005). Most of these studies involved multivariateanalyses using Goldvarb (Sankoff, Tagliamonte et al. 2005). As a result,all tests of significance for different factors are reported from Goldvarbresults3.

2.1. Linguistic factors

Since Meechan and Foley (1994), all quantitative sociolinguistic studiesof concord variation in existentials have explored whether the type ofdeterminer in the postcopular NP has some effect, resulting in differentialrates of concord. Following a proposal by Milsark (1977), Meechan andFoley coded the determiner in the postcopular NP as either “weak” or“strong”. “Strong” determiners included definite determiners, demon-stratives, and possessives, as well as universal quantifiers. “Weak” deter-miners encompassed all other determiners, including the indefinite articleand cardinal numerals (Meechan and Foley 1994). Tagliamonte (1998)elaborated slightly on this taxonomy; she provided the following exam-ples (1998):

(1) (a) Strong(i) There was these concerts. [definite]

(b) Weak(i) There was only about four blokes. [numeric](ii) There was no roads. [no](iii) There was lots and lots of pubs … [partitive](iv) There was quite a few [other quantifier](v) There was odd things going on. [adjective](vi) There was pictures as big as that blackboard. [bare]

Although Tagliamonte did not elaborate on the definition of “partitive”,this category would seem to include expressions such as a lot and acouple. These determiners are treated as “‘a’ quantifiers” in Britain andSudbury (2002) and Hay and Schreier (2004). The “other quantifier” is

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 9: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 241

Table 3. Frequency of nonconcord in existentials by postcopular NP determiner type

Meechan and Foley no > number/weak/strong(1994): CanadaTagliamonte (1998): partitive > no > definite > numeric > other quantifier >York, U.K. undeterminedBritain and Sudbury no > definite > numeric > bare > quantifiers > adjective(2002): Falkland IslandsBritain and Sudbury no > numeric > definite > bare > quantifiers > adjective(2002): New ZealandHay and Schreier numeric > no > definite > bare > quantifiers > adjective(2004): New Zealand

also not defined in detail. Previous studies do not discuss their codingprocedures for categorizing tokens into each category.

Many of the quantitative studies listed in Table 2 have found dif-ferential rates of concord by postcopular NP determiner type. Table 3summarizes these findings (the leftmost category shows the greatest non-concord). Britain and Sudbury have gone so far as to call these “con-straint hierarchies” (2002) for each of the varieties. It should be pointedout that none of these studies tested the proportions of nonconcord foreach determiner type for statistical significance vis a vis the other types.

It can bee seen from Table 3 that no and numeric tend to be associatedwith more nonconcord, while other quantifiers, adjectives, and bare NPstend to be associated with less.

Some research on nonstandard English dialects has indicated an asso-ciation between polarity and nonconcord (e. g., Schilling-Estes and Wol-fram 1994; Britain 2002): affirmative contexts are associated with singu-lar copula forms, while negative contexts are associated with pluralcopula forms. Tagliamonte (1998) found strong evidence for this in exis-tentials in York plural past tense contexts: was was almost four times aslikely to appear in affirmative contexts (66 %) as in negative ones (17 %)(1998: 162). However, other studies have found no significant effect(Britain and Sudbury 2002; Martinez Insua and Palacios Martinez 2003;Hay and Schreier 2004).

Meechan and Foley (1994) hypothesized that the presence of a plural-s marker on the postcopular NP might trigger agreement on the copula.However, they found no significant effect. Britain and Sudbury (2002)noticed a slight trend in this direction, but it was not significant either.

Some sociolinguistic researchers have included the “contractedness”of the copula � whether the copula occurs in full or cliticized form �as a factor influencing concord patterns in existentials. For example,Meechan and Foley (1994) found that contractedness of the copula was

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 10: There's two ways to say it: Modeling nonprestige there's

242 B. Riordan

the strongest predictor of singular agreement in their sample. Hay andSchreier (2004) coded for an interaction between tense and copula, andthis factor was a significant predictor of concord variability. On the otherhand, other researchers have not coded contractedness as a factor (Brit-ain and Sudbury 2002; Tagliamonte 1998). Henry (2005) expresses doubtthat contraction is the source of nonconcord in existentials, instead pro-posing that it is associated with a grammar for informal contexts, whichin turn produces nonconcord.

The form of the copula is not used as a predictor in this investigationfor two reasons. First, investigating the effects of the different formsof the copula on agreement requires a fine-grained phonetic coding ofutterances. MICASE transcription is not phonetic, and there is likelyvariability in the coding of copular forms in existentials since phoneticdetail was not a priority in corpus construction. Second, using contract-edness of the copula as a factor amounts to a theoretical claim thatcontractedness is an influence on concord in the production of existen-tials. That is, it would be assumed that speakers use an abstract intentionto contract the copula as part of their selection of the copula form. Whilethis is possible, it precludes the possibility of speakers selecting there’s asan unanalyzed lexical unit, since it assumes that contraction will occurfor a copula after the copula is selected as a separate unit.

2.2. Processing factors

The processing factor that has been most examined as potentially havinga relationship with concord variation in existentials is the distance be-tween the copula and the head noun of the postcopular NP. Researchershave hypothesized that distance between these elements may influenceagreement processing, so that greater distance or more intervening mate-rial may result in more instances of nonconcord. Meechan and Foley(1994) coded their data for intervening material “between the verb andthe postcopular NP” (1994): they used a binary classification that codedfor “presence” or “absence” of such intervening material. However, theyfound no difference in the frequency of nonconcord when interveningmaterial was present versus when it was not. Tagliamonte (1998) used adifferent method for counting distance: she counted “how many wordsintervened between [the copula] and the subject noun”, resulting in sixcategories, “adjacent” through “5�” (1998). Her results were significant,and the percentages of nonconcord increased smoothly as the number ofintervening words increased. Britain and Sudbury (2002) used a similarmethod. Their results indicated a significant association between greaterdistance and nonconcord for both New Zealand English and Falkland

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 11: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 243

Island English.4, 5 Pietsch (2005) employed a binary classification of exis-tentials into cases where the copula was adjacent to the NP head andwhere there was intervening material, and found a greater percentage ofnonconcord in adjacent sentences. However, not all studies have revealeda significant effect of distance (Martinez Insua and Palacios Martinez2003; Hay and Schreier 2004).

Martinez Insua and Palacios Martinez (2003) introduced the idea thatthe length of the predicate following the postcopular NP might affect theform of the copula: longer, heavier sequences might be associated withgreater nonconcord. They created a binary classification of existentials:“minimal” existentials were defined as those with no extension followingthe postcopular NP; other existentials fell into a second group that in-cluded “locative or adverbial elements, relative clauses, participial -ingclauses, and to- infinitival clauses.” Their results supported their hypoth-esis: of 192 spoken existentials with nonconcord, 69 % fell into the lattercategory (Martinez Insua and Palacios Martinez 2003).

2.3. Discourse factors

Meechan and Foley (1994) attempted to gauge the effect of style onconcord in existentials by coding for topic of conversation. They hypoth-esized, for example, that conversations relating to school might havebrought about more concord in informants. However, no effect of topicwas found.

2.4. Social factors

In addition to linguistic, processing, and discourse factors, researchershave explored whether social factors such as age, gender, level of educa-tion, etc. have some effect on the concord patterns of existentials. Asdiscussed in the introduction, level of education has been linked to fre-quency of nonconcord. Meechan and Foley (1994) coded for whether ornot speakers had completed 11 years of education. They found that thisdistinction was the greatest predictor of concord in existentials for theirdata: speakers with less than 11 years of education produced significantlymore nonconcord. Quantitative evidence from other studies, however,has not been so clear-cut. Tagliamonte’s study also coded for education,but drew the line at 16 years of education. Although not significant,there was a clear difference between men and women who had graduatedfrom college: women favored nonconcord at a rate of 70 % while menonly used it 26 % of the time. These figures, however, were based on avery small sample (57 and 19 sentences, respectively) (1998). Britain andSudbury (2002) mention in passing that level of education significantly

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 12: There's two ways to say it: Modeling nonprestige there's

244 B. Riordan

predicted concord patterns in their New Zealand data, but they do notelaborate (2002).

Quantitative studies have also looked at age and gender effects. Ineach of the studies that have coded for these factors, a significant in-teraction has been found. In Tagliamonte’s data, education as a factorwas not fully significant, but when cross-tabulated with gender, a clearerpattern emerged: younger women (but not men) used more nonconcordin existentials (1998). Tagliamonte even argued that women might be inthe process of making singular agreement categorical. Britain and Sud-bury found that in both New Zealand English and Falkland Island Eng-lish younger speakers, both men and women, produced more noncon-cord; there was less of a distinction between genders in their data (2002).Hay and Schreier specifically coded an age-by-gender interaction. Theydiscovered an interesting trend in their data on New Zealand English,which extends from the mid 1800s to the present: there was lessening ofnonconcord until the late 19th century/early 20th century, then noncon-cord gradually became more frequent throughout most of the 20th cen-tury, with a large gap in concord preferences between genders aroundthe late 19th century (2004). Finally, Pietsch (2005) reported that olderspeakers in Northern Ireland tended to produce more nonstandardforms, while women favored standard agreement patterns.

2.5. Summary

While the finding of high frequencies of nonconcord in existentials iscommon to all quantitative research on the topic, findings with regard tothe factors that affect concord variation have not always been consistent.Among linguistic factors, the strongest and most consistent effects havebeen different concord patterns based on the type of determiner in thepostcopular NP, and among social factors, age and gender. Of course,the differences in the significant factors in each study can be attributedto differences in the varieties under investigation, but sample size anddata collection methodology have also likely played a role.

3. Corpus and data coding

3.1. Corpus description

The Michigan Corpus of Academic Spoken English (MICASE) is a 1.7million word corpus of speech recorded in various settings at the Univer-sity of Michigan from 1997 to 2001. The University of Michigan is alarge public university in the Midwestern United States with approxi-mately 37,000 students. The corpus includes speech of faculty, students,and staff at the university (The English Language Institute 2003).

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 13: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 245

The focus of the MICASE corpus is “academic speech”, which is de-fined simply as “speech which occurs in academic settings”. The speechevents recorded include lectures, colloquia, lab sessions, discussion sec-tions, office hours, study groups, interviews, as well as a few serviceencounters. Thus the corpus contains both monologic and interactivespeech, and is not limited to “scholarly discussion” (The English Lan-guage Institute 2003).

3.2. Data extraction and coding

This study used the online version of the corpus, which is availablethrough a web interface6. I searched for all possible versions of declara-tive existential there sentences, including there’s, there is, there’re, thereare, as well as there isn’t and there aren’t.7, 8 The annotation of mostfactors of interest was done by hand.

Table 4 lists the categories of factors and each factor, along with exam-ple sentences from the corpus for the linguistic and processing factors.In the following sections I discuss the coding for many of these factorsin more detail.

3.2.1. Type of determiner

All previous studies of concord in existentials that have explored theeffect of the type of determiner on concord have followed the determinertaxonomy of Milsark (1977). However, in practice, this taxonomy provesnot to be exhaustive, and leaves many cases in doubt. For this reason Iused instead the determiner taxonomy of Huddleston and Pullum (2002).

In Huddleston and Pullum’s system, a determiner is a “function in thestructure of the NP” while a determinative is “a category of words…whose distinctive syntactic property concerns their association with thedeterminer function” (2002). Table 5 lists the determinatives, theirnames, and their status as definite or indefinite. In coding the data, sen-tences with definite determinatives listed in Table 5 were coded as defi-nite. Cardinal numerals and no were coded separately, while all othersentences with determinatives in Table 5 were coded as indefinite. Forthese categories, coding in general matched the coding of previousstudies.

As discussed in 2.1. above, Tagliamonte (1998) coded for a “partitive”determiner category, which contained lots and lots and other similar ex-pressions. Britain and Sudbury (2002) do not code for this category, butrecognize an ” ‘a’ quantifier” category, containing expressions such as alot. This study instead follows Huddleston and Pullum’s classification(2002) of all of these expressions as “non-count quantificational nouns”

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 14: There's two ways to say it: Modeling nonprestige there's

246 B. Riordan

Table 4. Factors coded for, factor levels, and examples.

Linguistic

Type of determiner DefiniteAnd then of course there’s the issue of how many graduatestudents there are … (MTG999ST015:S1)

Cardinal NumeralIn the same way that there’s two kinds of human sensibil-ity, two species of it, there’s twelve kinds of human under-standing … (DIS475MU012:S1)

Non-Count Quantificational NounIt’s the receptor that confers specificity in the system, andthere’s a lot of different serotonin receptors.(LEL500SU088:S1)

NoSo as you can see these two, reciprocity and redistribu-tion, are not in keeping with the capitalist mode. I meanthere’s no prices involved … (LEL115JU090:S1)

Other IndefiniteSo, that’s how I was taught. and there’s some things thatI learned like really really well that way and there’s somethings I didn’t learn. (COL999MG053:S4)

No DeterminerUm, but, you know there’s different people making it up,this time so, they might have a different style.(OFC175JU145:S1)

Polarity PositiveBut at the same time there’s a constant struggle betweenthe church and the state with Louis … (DIS315JU101:S5)

NegativeWell there’s nothing that says that, we have an Americanpolitical culture that has the same values beliefs, andassumptions forever and ever and ever right?(DIS495JU119:S1)

Plural �s PresentRight cuz there’s so many sublets out there that are reallycheap. (MTG999ST015:S10)

AbsentSo, you need to take note of the stop words. And there’sonly nine to memorize. (LES335JG065:S1)

Processing

Distance between copula 0and head noun of And like I said, there’s nothing in here like the folk thingpostcopular NP that like sticks out as something that must be smashed.

(ADV105SU068:S2)

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 15: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 247

Table 4. (continued)

1I mean there’s always standards in terms of like I said …(STP450SG128:S7)

2And, part of it could be that there’s a higher percentage ofnon-native English speakers in, um engineering …(LAB500SU089:S1)

3�There’s a little bit of grade inflation at U-of-M so mostpeople sorta you know B-minus C-plus is kinda average.(ADV700JU047:S2)

Postcopular sequence MinimalThat was another six hundred million dollar, merger.There’s just a lot of money. (LES405JG078:S1)

ExtendedUh, footnoting there’s, still some huge confusion going onabout footnoting. (SEM300MU100:S1)

Disfluency between Presentcopula and head noun Have any of you guys ever heard that, about after a black-of postcopular NP out there’s, a rise in births? (LAB500SU089:S1)

AbsentAnd the second is that there is actually a model, withinIslamic thought … (LES315SU129:S1)

Discourse

Primary Discourse Mode MonologMixedInteractivePanel

Social

Age 17�2324�3031�5051�

Academic Role UndergraduateGraduateFacultyStaff

Gender MaleFemale

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 16: There's two ways to say it: Modeling nonprestige there's

248 B. Riordan

Table 5. The basic determiners of Huddleston and Pullum (2002)

Definite determinatives

the Definite Articlethis (these), that (those) Demonstrative determinativeswe (us), you Personal determinativesall, both Universal determinatives

Indefinite determinatives

a Indefinite Articleeach, every Distributive determinativessome, any Existential determinativesone, two, three … Cardinal numeralseither, neither Disjunctive determinativesno (none) Negative determinativeanother Alternative-additive determinativea few, a little, several, … Positive paucal determinativesmany, much, few, little, … (inflect for Degree determinativesgrade, e. g., much, more, most)enough, sufficient Sufficiency determinatives

Table 6. Types of non-count quantificational nouns (from Huddleston and Pullum2002)

Number-transparent quantificational nouns1 lot, plenty2 lots, bags, heaps, loads, oodles, stacks3 remainder, rest4 number, couple

Non-count quantificational nouns selecting a singular obliquedeal, smidgen, bit, amount, quantity

Non-count quantificational nouns selecting a plural obliquedozens, scores, tens, hundreds, thousands, millions, billions, zillions, …

and codes them as such. Huddleston and Pullum reserve this term forexpressions in which quantification is expressed by a head noun that isnon-count and that takes an of prepositional phrase as a complement,as in a lot of students. Non-count quantificational nouns may be number-transparent, select a singular oblique, or select a plural oblique. Exam-ples are listed in Table 6. Number-transparent non-count quantifica-tional nouns allow the number of the element in the oblique position todetermine the number of the NP. Each of the four types listed in Table6 have different restrictions on modifiers and type of oblique (see Hud-dleston and Pullum 2002 for details).

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 17: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 249

In addition to these categories, a bare category is maintained for caseswhere there is no modification of the NP head. Cases where only adjec-tives (i. e., no determiners) occur in the NP are coded as no determiner9.

Huddleston and Pullum consider cases such as (2) to be “false defi-nites” (2002):

(2) … so it’s like technically, blacks have the rights but there is like allthese other loopholes that they throw in … (STP095SU139:S7)

These here is definite in form, but indefinite in meaning, and so is codedas indefinite rather than definite.

There are many cases in the data where the head noun is also a deter-miner, as in And there’s only nine to memorize. Huddleston and Pullumanalyze this as a “fused determiner-head” (2002) construction: “the headis combined with a dependent function that in ordinary NPs is adjacentto the head”. In these cases, I considered the fused-determiner head asthe head, and did not code it as a determiner.

3.2.2. Distance between copula and head noun of postcopular NP

This distance was coded as the number of words between the copulaand the head noun of the postcopular NP. In the case of non-countquantificational nouns, a and the noun were coded as one word each,except for a lot, which was coded as a single word. Hesitations and otherdisfluencies were not coded as words (see below).

3.2.3. Determiner category and distance

Note that the no determiner category is the only category that can havezero distance. Obviously, if a determiner is present in the postcopularNP, the head noun cannot be adjacent to the copula. Previous studieshave not considered this structural zero problem.

As a result, it was necessary to construct a factor that combined thedeterminer type and distance categories. However, it is not clear that afine-grained factor that includes a level for each determiner type crossedwith distance is necessary. In the logistic regression models describedbelow, such a factor would inflate the number of model parameters whileonly providing a slight boost in model fit (see footnote 13). Therefore,the distance variable was split into a two-level factor: distances of one-word were coded as “minimal”, while distances of two or more wordswere coded as “extended”. This coding was then crossed with the deter-miner category coding to produce a factor with no structural zeroes. Thelevels of this factor are presented in Table 7.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 18: There's two ways to say it: Modeling nonprestige there's

250 B. Riordan

Table 7. Coding of the structural interaction between determiner type and distance toNP head. “

√” indicates a factor level

Determiner Type Distance

Minimal Extended

No determiner√

�Indefinite

√ √

Cardinal Numeral√ √

Non-count Quantificational Noun√ √

No√ √

Definite√ √

Other (includes adjectives)√ √

3.2.4. Postcopular sequence

The length of the postcopular sequence was coded following MartinezInsua and Palacios Martinez (2003) (see Section 2.2).

3.2.5. Disfluencies between copula and head noun of postcopular NP

Jaeger (2005), examining the relation between optional that in relativizedclauses and disfluencies, found that speakers produced more that whenexperiencing disfluencies or anticipating disfluencies in their speech, pos-sibly in order to signal such difficulties to addressees. In a similar vein,it may be possible that disfluencies have an effect on concord in existen-tials. I hypothesize that production difficulties may cause speakers to“slow down” such that they take prescriptive norms on agreement moreinto account. For example, this could lead speakers to change the natureof the postcopular NP such that it agrees with the copula already pro-duced.

To test this, the presence versus absence of disfluencies between thecopula and the noun of the postcopular NP was coded. Disfluenciesincluded pauses of various durations (each is coded separately in MI-CASE), fillers such as um, and restarts.

3.2.6. Age

Age is split into four categories in MICASE: 17�23, 24�30, 31�50, and51 and over. The same coding was used in this study.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 19: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 251

3.2.7. Academic role

MICASE classifies speakers into 10 academic roles. These roles and theirrecodings for this study are listed in Table 8.

Table 8. Recodings of MICASE academic roles

MICASE Academic Role Recoding

Junior Undergraduate UndergraduateSenior UndergraduateJunior Graduate GraduateSenior GraduateJunior Faculty FacultySenior FacultyResearcherPostdoctoral FellowStaff StaffVisitor/Other Recoded based on context

3.2.8. Primary discourse mode

MICASE categorizes all speech events according to their primary dis-course mode. These roles and their descriptions are listed in Table 910.

Table 9. Primary discourse modes in MICASE

Monologic One speaker monopolizes the floor, sometimes with a question andanswer period.

Panel Several consecutive monologs usually followed by multi-speakerinteractions.

Interactive Interactional discourse involving two or more speakers.Mixed No one discourse is predominant.

4. Results: Distributional analyses

In this section I present a series of analyses of the association of factorsfrom Section 3 with concord in existentials in the MICASE corpus.Where motivated by previous research, I also present evidence for in-teractions between factors, and the association of these interactionswith concord.

As in most previous studies, I focus on the subset of the data thatcontain plural postcopular NPs, since this is where the vast majority ofconcord variation occurs, especially in standard varieties. This comprises

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 20: There's two ways to say it: Modeling nonprestige there's

252 B. Riordan

1520 tokens. I model the probability of the singular versus the pluralcopula. In other words, there is, there’s, and there isn’t are opposed tothere are, there’re, and there aren’t (see Table 10).

4.1. Overall concord

Overall, nonconcord in existentials with plural postcopular NPs in MI-CASE reaches 39.7 % (604/1520 tokens). This figure is similar to Craw-ford (2005)’s results for the T2K-SWAL corpus (45 %), but lower thanother sociolinguistic studies (see Table 1)11.

Table 10 presents the frequencies of the various types of existentialconstructions in the plural postcopular NP context, along with their con-cord status.

Table 10. Frequencies of existential types in plural postcopular NP contexts

Nonconcord Concord

Existential Type there are 0 645there’re 0 250there aren’t 0 21there is 11 0there’s 591 0there isn’t 2 0

Total 604 916

4.2. Type of determiner

In terms of number of tokens, the no determiner category was the highest(606). The no and definite categories had the fewest tokens (56 and 30,respectively), numbering almost an order of magnitude less than theother categories, other indefinite (339), cardinal numeral (CN) (244), andnon-count quantificational nouns (NCQN) (245). Figure 1 presents thepercentages of nonconcord by category.

By percentages, we have this ordering in terms of the category thatco-occurs most with non-concord:

definite > cardinal numeral > NCQN > indefinite > no > no deter-miner

The underlined categories have 100 or more total tokens. Among these,cardinal numerals and NCQNs show the highest percentages of noncon-cord. This is a similar but more extreme result than in previous studies,

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 21: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 253

Figure 1. Percent nonconcord by determiner type (CN � cardinal numeral; NCQN �non-count quantificational noun)

where the category numeric was always one of the top three categoriesby percentage, with the exception of Tagliamonte (1998). The NCQNcategory is not directly comparable to previous studies (see Section3.2.1.), where these sentences would have been part of the quantifierscategory. The exception again is Tagliamonte’s study, where the partitivecategory containing these items was explicitly coded. In her study thiscategory had the highest percentage of nonconcord (see Table 3).

The number of tokens for definite and no is low relative to the othercategories, so their nonconcord percentages are not likely to be reliable.In addition, because the number of observations for these categories issignificantly less than for the other categories, slight changes in numberswould cause their percentages to change more than the other categories.With the exception of Hay and Schreier (2004), other studies have onlyreported percentages of nonconcord by category, so it is impossible totell if a similar situation existed in their data. Hay and Schreier’s corpusalso contained fewer instances of definite and no, though not as few ashere. There is also the issue of what the authors of each study coded asdefinite, as discussed in Section 3.2.1.

While determiner category and concord are not unrelated (χ2 �22.9; df � 5; p < 0.0004; N � 1520), they are only weakly associated(Cramer’s V � 0.123). Nevertheless, the different percentages of noncon-

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 22: There's two ways to say it: Modeling nonprestige there's

254 B. Riordan

Table 11. Pairwise comparisons of proportions of nonconcord by determiner type usingthe Marascuilo Procedure

Comparison Absolute Difference Critical Range Significant?Between Proportions

No determiner vs. indefinite 0.038 0.087 NoNo determiner vs. cardinal 0.1 0.100 YesnumeralNo determiner vs. NCQN 0.101 0.099 YesIndefinite vs. cardinal 0.122 0.113 YesnumeralIndefinite vs. NCQN 0.063 0.112 NoCardinal numeral vs. NCQN 0.059 0.121 No

cord with each determiner type indicate that speakers have somewhatdifferent concord preferences depending on the determiner. Leaving outno and definite because of their small total token size, a chi-square testfor independence among proportions of nonconcord is significant: χ2 �20.49; df � 3; p < 0.0002; Cramer’s V � 0.119; N � 1434. However, thisdoes not tell us what individual proportions are significantly different.To test these pairwise differences, I used the Marascuilo Procedure fortesting differences among multiple proportions. The result is presentedin Table 11. Of the six comparisons tested, only three reached the conser-vative criterion of significance of the Marascuilo Procedure. The cardinalnumeral and NCQN sentences are significantly different from no deter-miner sentences, and cardinal numeral sentences are also different fromindefinite determiner sentences. This is evidence, then, that existentialconstructions with cardinal numerals and NCQNs really are associatedwith higher proportions of nonconcord.

4.3. Polarity

Table 12 presents a crosstabulation of polarity and concord for existen-tials with plural postcopular NPs. The proportions of nonconcord ap-pear very similar for both positive and negative sentences, and this isconfirmed by a chi-square test of independence: χ2 � 0.0014; df �1; p� 0.9702; Cramer’s V � 0.00985; N � 1520). In these data, then, thereis no association between polarity and concord. This result is consistentwith most of the previous quantitative studies of concord in existentials.It appears that polarity plays a significant role in concord behavior inexistentials only for some varieties of English, such as the York varietyinvestigated by Tagliamonte (1998).

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 23: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 255

Table 12. Polarity and concord

Polarity Nonconcord Concord

N % N %

Negative 40 38.5 64 61.5Positive 564 39.8 582 60.2

Table 13. Postcopular sequence length and concord

Postcopular Nonconcord Concordsequence length

N % N %

Minimal 199 48.3 213 51.7Extended 405 36.6 703 63.5

4.4. Length of the predicate following the postcopular NP

Martinez Insua and Palacios Martinez (2003) presented evidence thatlong predicates following postcopular NPs in existentials were morelikely to show nonconcord. Based on the results in Table 13, it appearsthat an opposite trend is in effect in this corpus. Length of postcopularNP is significantly associated with nonconcord (χ2 � 16.5; df � 1; p <0.001; Cramer’s V � 0.107; N � 1520). The odds ratio is 0.607, indicat-ing that minimal predicates are more likely to appear with nonconcordthan extended predicates.

Since Martinez Insua and Palacios Martinez do not report similar sta-tistical tests for the association they found between extended predicatesand nonconcord, we cannot make any comparisons regarding thestrength of the association. Determining the nature of this associationbetween minimal predicates and nonconcord will require further analysisthat takes into account the type of the NP, including its determiner.

4.5. Plural -s

Table 14 presents a crosstabulation of existentials with postcopular NPsmarked with plural -s and concord. From this it appears that the pres-ence of plural -s is associated with fewer instances of nonconcord. Anassociation exists, but it is not significant (χ2 � 3.7006; df � 1; p �0.054; Cramer’s V � 0.05079; N � 1520). The odds ratio is 1.139, indi-cating that plural -s is slightly less associated with nonconcord than exis-tentials not containing plural -s. Britain and Sudbury (2002) also re-

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 24: There's two ways to say it: Modeling nonprestige there's

256 B. Riordan

Table 14. Plural -s and concord

Plural -s Nonconcord Concord

N % N %

Presence 500 38.7 793 68.3Absence 104 45.8 123 54.2

ported a similar trend, although their results were not significant, either.Thus, evidence for overt number marking on the postcopular NP trigger-ing number agreement on the copula is weak.

4.6. Distance between copula and head noun of postcopular NP

Previous studies have reported an effect on concord such that the greaterdistance (measured in words) between the copula and the head noun ofthe postcopular NP, the more likely nonconcord is. Table 15 shows theresults for MICASE. The amount of nonconcord by percentage shows aclear increasing trend from adjacent (0) to distant (3�). However, a chi-squared test is not significant: χ2 � 7.141; df � 1; p � 0.0675; Cramer’sV � 0.07056; N � 1520. Thus, while intriguing, the linearly increasingeffect could be artifactual. These results are similar to Hay and Schreier(2004), who also found a consistent but insignificant effect.

Table 15. Distance (in words) and concord

Distance (words) Nonconcord Concord

N % N %

0 103 34.2 198 65.81 198 38.5 317 61.52 157 42.7 211 57.33� 146 43.5 190 56.5

4.7. Disfluencies between copula and head noun of postcopular NP

In Section 3.2.4 I hypothesized that disfluencies in speech might havesome effect on concord in existential constructions. Table 16 shows theproportions of concord when disfluencies are absent versus present.Nonconcord is 40.5 % when there are no disfluencies, and 33 % when adisfluency is present. Thus, nonconcord is appears more likely in existen-tials without disfluencies. However, this association does not rise to signi-

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 25: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 257

Table 16. Disfluency and concord

Disfluency Nonconcord Concord

N % N %

Presence 62 33 126 67Absence 542 40.7 790 59.3

ficance: χ2 � 3.77; df � 1; p � 0.0519; Cramer’s V � 0.0498; odds ratio� 1.39. This result is consistent with the hypothesis that speakers may“slow down” when encountering processing difficulties such thatprescriptive norms on agreement may have more of a chance to be con-sidered, but the effect is marginal.

4.8. Age

Age has often been examined for its possible association with concordbehavior in existentials. As can be clearly seen from Figure 2, there is astrong relationship between age and production of nonconcord: theolder the age group, the less nonconcord is produced. Statistical testsconfirm this, showing a moderately strong association: χ2 � 107.1; df �

Figure 2. Percentage of nonconcord by age group

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 26: There's two ways to say it: Modeling nonprestige there's

258 B. Riordan

3; p < 0.0001; Cramer’s V � 0.265; N � 1520. The effect seen hereis similar to that observed in previous studies, with younger speakersproducing relatively more nonconcord.

4.9. Academic role

There is a similarly strong relationship between academic role and con-cord: χ2 � 84.96; df � 3; p < 0.0001; Cramer’s V � 0.236; N � 1520.This is not surprising, given the makeup of the academic setting: under-graduates are the youngest group, followed by graduate students, thenfaculty. The number of existential tokens for staff is small, so their non-concord percentage is less reliable than that of the other groups.

4.10. Age and academic role

There is an extremely strong association between age and academic role:χ2 � 2139.5; df � 9; p <.0001; Cramer’s V � 0.685; N � 1520. For thisreason, only age is considered in the logistic regression models presentedin the next two sections.

4.11. Gender

The distribution of tokens in the subset of existentials with plural post-copular NPs according to gender is not equal: tokens contributed byfemales account for 58.4 % (888/1520).

The tabulation of gender and concord appears in Table 17. Althoughthere appears to be a slight association between female speakers andnonconcord, this trend is not significant: χ2 � 2.77; df � 3; p < 0.096;Cramer’s V � 0.0366. This lack of a significant individual effect of gen-der is consistent with previous studies.

Table 17. Gender and concord

Gender Nonconcord Concord

N % N %

Female 369 41.6 519 51.4Male 235 37.2 397 62.8

4.12. Age and gender

If we consider the relationship between age, gender, and concord, wecan discern an apparent interaction. Do both genders show the same

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 27: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 259

decreasing trend for nonconcord as age increases? They do, but theirtrajectories are different. Within males, there is a clear difference betweenthe younger ages, 17�23 and 24�30, and the older ages, 31�50 and51�. Within females, if a drop-off in nonconcord exists it is between thetwo older age groups, 31�50 and 51�. We test this difference in gendersby testing the difference in the association between gender and concordin the 31�50 age group, where the rate of concord appears to be higherfor the female group. The difference between genders among this sub-group is significant: χ2 � 14.5; df � 3; p � 0.0001; Cramer’s V � 0.16;N � 1520. Thus, as in previous studies, there is an apparent interactioneffect between age and gender with respect to concord in existentials.Here, younger female faculty are more likely to produce nonprestigethere’s than their male counterparts.

4.13. Discourse type

In MICASE, more existentials are produced in the interactive (546) andmonologic contexts (533) than in the mixed (313) and panel contexts(128). The data show a significant decreasing trend for nonconcord aswe move from interactive to mixed to monologic contexts (Figure 3): χ2

� 91.9; df � 3; p � < 0.0001; Cramer’s V � 0.246; N � 1520. (The

Figure 3. Percentage of nonconcord by discourse type

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 28: There's two ways to say it: Modeling nonprestige there's

260 B. Riordan

sample size for the panel category is small, and the results difficult tointerpret.) Thus the interactive context is very favorable for nonconcord,while monologic contexts are not.

4.14. Age and discourse type

Given the peculiarities of the academic setting, we would expect age anddiscourse mode to be highly related: students, who tend to be the youn-gest group, have the most opportunity to speak in interactive settings,while of course it is the faculty’s job to speak in monologic settings.

In Figure 4 we see the effect of age and primary discourse type onconcord in existentials. Within each age group, concord in existentials issignificantly different across discourse types. The most interesting effectis for the 31�50 age group, which is largely made up of the tokens offaculty. Concord for this group clearly increases from interactive tomixed to monologic contexts.

Thus we have found a significant association between primary dis-course type and concord, mediated by an interaction with age. Facultytend to be the most sensitive to changes in discourse context, producingmore nonconcord in existentials as the setting becomes more interactive.This suggests that concord in existentials is associated in speaker’s mindswith formality and perhaps social status.

Figure 4. Age, discourse type, and concord

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 29: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 261

4.15. Summary of distributional analyses

The analyses in this section have identified a number of factors, bothlinguistic and extralinguistic, that appear to have an effect on the pro-duction of nonconcord in existentials. We saw an effect of type of deter-miner on nonconcord, with postcopular NPs containing cardinal nu-merals and non-count quantificational nouns significantly associatedwith greater levels of nonconcord. There was an effect of the length ofthe predicate on concord, such that shorter predicates were more associ-ated with nonconcord, contradicting a previous finding in the literature.There were strong effects of social and discourse factors on concord:increasing age and formality of discourse context were associated withlower levels of nonconcord.

5. Modeling the production of concord in existentials

In the previous section I considered a series of analyses of the effects ofindividual factors on concord in existentials. In this section I develop alogistic regression model of the production of concord, in which all ofthese factors are considered simultaneously. As in previous studies, mygoal is to discern the relative strengths of factors, controlling for thestatistical effect of each one.

In logistic regression models, one or more independent variables areused to predict a dichotomous dependent variable. In past sociolinguisticstudies of existentials, linguistic and extralinguistic variables have beenused to try to predict singular versus plural agreement, operationalizedas an abstract dichotomous variable. This study focuses on variation inconcord in the present tense with plural postcopular NPs, so that con-cord takes the forms there’re, there are, and there aren’t, and nonconcord,the forms there’s, there is, and there isn’t (see Table 10).

Notice that implicit in this investigation is the premise that all thevariables that are considered combine to influence the production of theexistential variant, there’s/there is or there’re/there are. In the case of someof the linguistic and processing variables, this means that speakers mustregister the influence of these variables before the form of the copula isactually produced. This is the case, for example, with the plural -s andpostcopular NP length factors. Hence, speakers must be assumed tomaintain a substantial ability to “look ahead” in producing their utter-ances, which influences their choice of agreement pattern.

5.1. Collinearity

One of the assumptions of logistic regression models is that independentvariables are not highly correlated. When one independent variable is a

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 30: There's two ways to say it: Modeling nonprestige there's

262 B. Riordan

function of another, the variables are collinear. Collinearity in logisticregression models can cause the coefficients in the regression equationto become unreliable (Garson n.d.). The linear relationship of the inde-pendent variables was checked by examining their Variance InflationFactors12. As expected, age and academic role did share a linear relation-ship (age VIF � 3.70; academic role VIF � 3.48), but none of the othervariables did. For this reason, academic role was excluded from the groupof predictor variables in each of the logistic regression models.

5.2. Fixed effects model

In order to determine which factors significantly predict nonconcord inexistentials in MICASE, I constructed a binary logistic regression modelwith each of the factors in Table 18 (each discussed in Sections 2 and 3)as predictors13. I will refer to this model as the “fixed effects” modelbecause the levels of the predictors are not assumed to be samples from alarger population, but are exhaustively coded (in contrast with “randomeffects”; see Section 6). As before, the data for the model consisted ofthe 1520 existential tokens from MICASE with a plural postcopular NP.The model was fit using the R lrm function.

Among the predictors in the model, not all significantly contributedto predicting concord behavior. Therefore a process of model simplifica-tion was employed to find the minimally adequate model for the MI-CASE data (Crawley 2005). Such a model incorporates the linguistic andextralinguistic constraints that are most predictive of concord behavior.

Stepwise procedures for model simplification, used in previous studies,suffer from a limited search space and order-of-deletion/addition effects(Faraway 2004; Crawley 2005). Instead, I use the model simplificationprocess outlined in Crawley (2005: Ch. 7). First, the higher order interac-tion terms were examined. The explanatory power of interaction factorswas tested by comparing a maximal model including all main effect andinteraction factors (“ME�I” in Table 18) with models excluding the indi-vidual interaction factors, using likelihood ratio tests (LRTs)14. Asshown in Table 18, both the interaction between age and gender and theinteraction between age and discourse type factors significantly contrib-uted to the model’s fit.

Next, the non-interaction factors in the model were tested by compar-ing a model including only these factors (that is, excluding the age/gen-der and age/discourse type interactions; “ME” in Table 18) to modelswithout individual factors, using LRTs. The strongest predictors wereage and discourse type, followed by postcopular sequence length anddeterminer type/distance. Four individual factors were not significantpredictors in the model: polarity, plural -s, disfluencies, and gender.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 31: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 263

Table 18. Significance tests for factors in the fixed effects logistic regression model.(ME � model with all main effects; ME�I � model with all main effectsand all interactions)

Factor Maximal model LRT χ2 p-value Included infor comparison minimal model?

1 Determiner Type / ME 29.99 < 0.01√

Distance2 Polarity ME 0.44 � 0.53 Plural -s ME 2.36 � 0.124 Disfluency ME 2.68 � 0.15 Postcopular ME 8.29 < 0.01

sequence length6 Age ME 45.4 < 0.00001

7 Gender ME 1.54 � 0.21√

8 Discourse Type ME 40.3 < 0.00001√

9 Age / Gender ME�I 13.4 < 0.01√

10 Age / Discourse ME�I 32.6 < 0.001√

Type

The last column in Table 18 lists the factors that are retained in theminimally adequate fixed effect model. Note that gender is retained as afactor; this is because interaction terms are not interpretable as suchwhen terms for variables making up the interaction are not also includedin the model (Nichols 1995). The factors that are retained are ones thatthe distributional analyses in Section 4 found to have significant associa-tions with the appearance of nonconcord in the data. The marginal asso-ciations from Section 4, plural -s and disfluencies, were not retainedbecause they do not show sufficient explanatory power when the statisti-cal effects of the other independent variables are controlled for.

To what extent is this model able to predict concord behavior in thedata? One important test of the model’s fit is its ability to predict con-cord/nonconcord in individual cases. Table 19 shows that the model’soverall classification accuracy was 67.5 % � a modest improvement overa baseline model, which achieves 60.4 % by guessing concord in all cases.

Another measure of fit is Somer’s Dxy, which measures the rank corre-lation between predicted probabilities and observed responses, andranges between 0 and 1 (perfect prediction). For this model Somer’s Dxy

was .45. This is somewhat lower than the Somer’s Dxy value of .6 that isassociated with significant predictive capacity (Baayen 2007).

Finally, the loglikelihood score measures how likely the observed re-sponses are to be predicted from the observed values of the independentvariables (Garson n.d.), and is sometimes reported in sociolinguisticstudies that use logistic regression. The loglikelihood score for the mini-mal model was �902.74.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 32: There's two ways to say it: Modeling nonprestige there's

264 B. Riordan

Table 19. Fixed effects model classification accuracy.

Predicted

Nonconcord Concord % CorrectlyClassified

Observed Nonconcord 303 301 50 %Concord 193 723 79 %

Overall 67.5 %

5.3. Discussion

Of the ten factors hypothesized from previous studies and from the dis-tributional analyses of Section 4 to have an effect on the appearance ofnonconcord, six were shown to be statistically significant predictors ofconcord behavior in existentials when all factors are considered simulta-neously: age, discourse type, postcopular sequence length, determinertype/distance, and the interactions between age and discourse type andage and gender. The model supports the findings from the distribu-tional analyses.

Based on this sample of existentials, we may conclude that polarity,plural -s, and disfluencies play little or no role in determining concordbehavior among this group of speakers, in the contexts recorded in MI-CASE. Plural -s and disfluencies showed a tenuous correlation with con-cord patterns in the distributional analyses but were not retained in thefinal model because their effects were minimal when considered togetherwith the other factors. More corpus research is needed to delineate therole of these factors, if any, in concord behavior in existentials.

How well does the resulting minimal model, incorporating the sixfactors that significantly predicted concord behavior, compare in its fitto the MICASE speakers’ data to previous logistic regression modelsof concord behavior proposed by sociolinguistic researchers for othervarieties of English? Unfortunately, most previous studies have not re-ported a measure of model fit. One exception is Hay and Schreier (2004),who report a loglikelihood score for their model of �540.8. While thecurrent model’s fit cannot be directly compared with that of Hay andSchreier, owing to the different data sets and different models applied tothese data sets, nevertheless the loglikelihood score of the current modelindicates a much lower degree of model fit. Two major factors likelyaccount for this result. First, Hay and Schreier included contraction ofthe copula as a predictor in their model, while it was excluded fromconsideration in the present study (see Section 2.1). Second, the speakersin MICASE do not form a coherent speech community as traditionally

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 33: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 265

defined. Differences between individuals’ patterns of concord in existen-tials may result in a much greater variance in concord behavior than iscommonly found in samples from the more homogeneous speech com-munities that have been documented by sociolinguistics researchers. Forthis reason, in the next section I attempt to estimate the effect of individ-ual speaker differences on the model, and I explore the degree to whichincorporating such information into a model can add to our understand-ing of factors that influence concord behavior.

6. Modeling idiolectal variation in existentials

The “fixed effect” model of concord production in existentials that wasproposed in Section 5 can only successfully predict concord versus non-concord in just over two-thirds of observed cases. I showed that addingfurther predictors does not significantly improve the predictive capacityof the model. Because so many cases are not accounted for by the model,the role of individual variation in concord behavior requires closer ex-amination. In this section I estimate the effect of individual differenceson the parameters of the fixed-effect model. I then develop a “mixed-effect” logistic regression model that takes account of individual speak-ers’ concord patterns. This model is significantly better able to predictconcord behavior, and forces us to modify our conclusions about whatfactors influence concord behavior in this variety of English.

One of the main assumptions of logistic regression models is indepen-dence of sampling. That is, it is assumed that no dependencies existbetween observations. Here, this means that speakers should not con-tribute multiple observations to the data (Garson n.d.). The fixed-effectmodel violates this assumption, because many individual speakers arethe source more than one token in the data. In fact, the distribution ofindividuals’ contributions to the data is highly skewed:

204 speakers contribute 1�3 utterances64 speakers contribute 4�7 utterances34 speakers contribute 8�12 utterances17 speakers contribute 13�18 utterances10 speakers contribute 19�25 utterances3 speakers contribute 25 or more utterances

The presence of non-independent observations, and the varying numberof observations contributed by individual speakers, will tend to bias theparameters estimated by the logistic regression model-fitting procedure.

One way to correct for this bias in parameter estimates is to use resam-pling (Harrell 2001). Multiple copies of the data can be created by sam-

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 34: There's two ways to say it: Modeling nonprestige there's

266 B. Riordan

Table 20. Corrected odds ratio estimates for the fixed-effect model. (Only odds ratiosfor parameters that were significantly different from their respective referencecategories in the fixed-effect model are shown)

Original model Corrected 95 % C.I. Sig.?odds ratio odds ratio

Age � 24�30 1.82 1.85 0.95�3.59 xAge � 51� 2.52 2.79 0.84�12.61 xDiscourse � Panel 4.37 4.43 1.96�12.01

Gender � Male 0.60 0.62 0.33�1.13 xDeterminer Type/Distance � 0.40 0.39 0.25�0.57

Cardinal Numeral, ExtendedDeterminer Type/Distance � 0.61 0.59 0.40�0.86

Indefinite, MinimumDeterminer Type/Distance � 0.53 0.51 0.34�0.75

NCQN, ExtendedDeterminer Type/Distance � 0.50 0.48 0.30�0.74

NCQN, MinimumPostcopular NP length � 0.74 0.72 0.59�0.91

ExtendedAge � 24�30 * Discourse � 0.15 0.14 0.03�0.59

PanelAge � 31�50 * Discourse � 0.12 0.12 0.03�0.58

PanelAge � 31�50 * Gender � Male 3.10 3.12 1.39�6.65

pling from the existing data with replacement. The logistic regressionmodel can be refitted to each sample, so that confidence intervals forthe parameter estimates of the model can be derived. To quantify thedegree to which inter-speaker variation affects the parameter estimates,we can resample from the speakers (Bresnan, Cueni et al. 2005; Baayen2007). That is, we sample from the speakers in the corpus with replace-ment, then use the utterances contributed by these speakers as the corpusof observations to which the model is fitted. This provides a way ofquantifying the degree to which different samples of speakers would af-fect the parameter estimates of the model.

The speakers and the existential data were sampled 300 times withreplacement, and the fixed-effect model was refit to each sample. Table20 displays the odds ratios for the parameters from the original modelalong with their corresponding corrected odds ratios derived from theaverage parameters estimated over the course of resampling. An oddsratio greater than 1 indicates that the presence of that category increasesconcord relative to when the reference category is present (and viceversa), while an odds ratio of 1 indicates that the presence of the cat-egory makes no difference relative to the concord behavior when the

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 35: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 267

reference category is present. For example, the original odds ratio forthe Age 51� category is 2.52, indicating that concord is 2.5 times morelikely in the existentials of these speakers relative to the reference cat-egory, Age 17�24. For the most part, the corrected odds ratios are closeto the original estimates.15

The 95 % confidence intervals for the corrected odds ratios provide anindication of the variability attributable to inter-speaker variation. Onlycategories whose odds ratio confidence intervals do not include 1 can beconcluded to be significantly different from the reference category whenspeaker variation is taken into account. The confidence intervals for theextralinguistic categories such as age, discourse, and gender are verywide. In fact, the confidence intervals for the odds ratios for the age andgender categories both include 1, meaning that the parameters for thesecategories in the model cannot be concluded to be significant whenspeaker variation is considered. Thus, while speakers largely share theeffects of linguistic factors on their production of existentials, the effectof different samples of speakers with respect to extralinguistic variablesis large enough to significantly bias the parameter estimates in the model.

Resampling from speakers can give us an idea of how well the modelaccounts for the data in the face of inter-speaker variation, and allowsthe model to be corrected for overfitting the data. “Mixed-effect” models(Baayen 2007; Baayen, Davidson et al. submitted) can also estimate therole played by differences between speakers, but they offer the advantageof being able to incorporate speaker variation into the model itself.

In mixed effect models, speakers in the sample can be treated as ran-dom samples from a population of speakers. An additional parameter(called a “random effect”) is introduced into the model that allows foradjustments to the model’s fixed effects, conditioned on the data contrib-uted by particular speakers16. By incorporating speaker variation di-rectly into the model in this way, we can estimate the overall effect ofsuch variation relative to the other factors (“fixed effects”). This in turnallows a broadening of the model’s predictive power, and a better estima-tion of the factors that are in play in influencing linguistic behavior.

To account for inter-speaker variation, I introduce by-speaker adjust-ments to the intercept of the model. In essence, these adjustments changethe model’s baseline tendency to predict concord versus nonconcord ona speaker-by-speaker basis.

Using a mixed-effect model to incorporate speaker variation maychange the correlation of other factors with the dependent variable. Inparticular, the speaker random effect may account for some of the vari-ance that currently is accounted for by fixed effects in the model. Forthis reason, I repeat the process of model simplification from Section 5,

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 36: There's two ways to say it: Modeling nonprestige there's

268 B. Riordan

in order to arrive at the minimally adequate model wherein speakers aremodeled as random effects.

Unfortunately, the existing data is not sufficient to adequately esti-mate parameters for the interaction factors from the fixed-effect model(age/discourse type and age/gender) in the mixed effect-model17. Hence,these factors cannot be tested for significance in the model simplificationprocess. However, the effects of these factors will be tested indirectly: ifthe mixed effect model provides a better model fit than the final fixedeffect model (which included the interactions factors), then we can con-clude that these factors would likely not have significantly improved thefit of the mixed effect model.

Another difference in the model simplification process with the mixedeffect model is the way fixed effect terms are tested for significance.Likelihood ratio tests of fixed effects involving a maximal model and amodel excluding a factor of interest are not recommended for mixedeffects models (Bates 2006). Therefore, following Bates’ suggestion, foreach fixed effect factor, I compare a model including that term and thespeaker random effect with a baseline model that includes only thespeaker random effect.

Table 21 mirrors Table 18, showing the factors that were tested in thesimplification process and their significance. The last column indicateswhether each factor was included in the final mixed effect model. Thefixed effects factors that are retained in the final model are the same asin the fixed effect model (with the exception of the interaction terms):determiner type/distance, length of postcopular sequence, age, and dis-course type. (Gender is not retained because there is no age/gender in-teraction term.) Furthermore, the relative contributions of these factorsto the model’s predictive capacity are virtually unchanged from the fixedeffect model.

As measures of the model’s fit to the data, we again consider classifica-tion accuracy, Somer’s Dxy, and loglikelihood score. As is shown in Table22, the ability of the model to predict concord versus nonconcord inindividual cases, 80.1 %, is a dramatic increase over the fixed effectsmodel (67.5 %), and is far better than the baseline model (60.4 %).

The model’s Somer’s Dxy reaches .722, now well within the range ofsignificant predictive capacity. The model’s loglikelihood score is �877.6. In sum, the inclusion of the random effect for speakers substan-tially improves the model’s fit.

The mixed effect model’s coefficients and corresponding odds ratiosare shown in Table 23. The p-values shown indicate the significance ofthe contrast between categories their respective reference categorieswithin a factor. The reference category for the determiner type/distancefactor is bare. The odds ratios for each of the levels within the determiner

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 37: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 269

Table 21. Significance tests for factors in the mixed effects logistic regression model.(SRE � model with only the speaker random effect)

Factor Model for LRT χ2 p-value Included incomparison minimal model?

1 Determiner Type / SRE 25.66 < 0.05√

Distance2 Polarity SRE .648 � 0.423 Plural -s SRE 2.91 � 0.094 Disfluency SRE 2.88 � 0.095 Postcopular SRE 11.5 < 0.001

sequence length6 Age SRE 51.9 < 0.0000000001

7 Gender SRE .39 � 0.538 Discourse Type SRE 35.6 < 0.0000001

Table 22. Mixed effects model classification accuracy.

Predicted

Nonconcord Concord % CorrectlyClassified

Observed Nonconcord 417 187 69 %Concord 116 800 87 %

Overall 80.1 %

category/distance factor are all below one: all categories within thisfactor are associated with nonconcord relative to the bare category. Thelowest odds ratio occurs with the cardinal numeral-by-extended cat-egory, indicating that a significant proportion of nonconcord occurshere. The significant contrasts of the cardinal numeral and non-countquantificational noun categories with the bare category buttress our ear-lier argument (Section 4.2) that cardinal numerals and non-count quanti-ficational nouns are significantly associated with nonconcord in existen-tials.

6.3. Discussion

Employing a mixed-effect model � and specifically by including speaker-by-speaker adjustments for possible idiolectal differences � greatly im-proved our ability to model concord behavior in existentials. This sug-gests that the role of inter-speaker variation in concord patterns in exis-tentials among similar groups of speakers in similar contexts is likely tobe great as well.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 38: There's two ways to say it: Modeling nonprestige there's

270 B. Riordan

Table 23. Mixed effect model coefficients and odds ratios

Estimate Odds p-valueratio

(Intercept) �0.16 0.85Age 24�30 0.44 1.56Age 31�50 0.80 2.23 < 0.01Age 51� 1.79 6.01 < 0.001Discourse Type � Mixed 0.48 1.61Discourse Type � Monolog 1.06 2.88 < 0.001Discourse Type � Panel 0.28 1.32Postcopular sequence � Minimum �0.41 0.66 < 0.01Det Type/Dis � CN, Extended �0.97 0.38 < 0.001Det Type/Dis � CN, Minimum �0.48 0.62Det Type/Dis � Definite, Extended �0.79 0.45Det Type/Dis � Definite, Minimum �0.23 0.80Det Type/Dis � Indefinite, Extended �0.25 0.78Det Type/Dis � Indefinite, Minimum �0.66 0.52 < 0.01Det Type/Dis � No, Extended �0.22 0.80Det Type/Dis � No, Minimum �0.08 0.92Det Type/Dis � None, Extended �0.34 0.71Det Type/Dis � None, Minimum �0.05 0.95Det Type/Dis � NCQN, Extended �0.71 0.49 < 0.01Det Type/Dis � NCQN, Minimum �0.83 0.44 < 0.01

A model such as the fixed effect model from the previous section,which ignored such idiolectal variation, failed to fit the data as wellas the current model. Indeed, even when well-motivated higher-orderinteraction terms are included, as was the case in the fixed effect model,much less variance is explained than when speakers are treated as ran-dom effects. Importantly, it seems that the interaction terms were imper-fect ways of accounting for substantial underlying variation betweenspeakers.

7. Conclusion

This paper presented a large-scale corpus study of present-tense concordvariation in existential there constructions in American English speechfrom academic settings. By controlling for level of education, we wereable to illuminate the wide variation in the use of nonstandard there’s inthis social strata. Using distributional analyses and logistic regressionmodeling, we found strong evidence that age and discourse play majorroles in shaping concord choice. To a lesser extent, we saw that thestructure of the postcopular NP � in particular the type of determinerit contains � also plays a role in whether speakers use nonstandard

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 39: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 271

there’s. By focusing on the fit of the logistic regression models to thedata, and employing a more sensitive statistical model than in previoussociolinguistic studies, we identified the existence of substantial variationbetween individual speakers in concord behavior. This allowed a clearerappraisal of the factors that are significantly associated with concordpatterns.

Because of the unique social and discourse makeup of academic set-tings, we should be cautious about making claims about other linguisticvarieties. However, the overall concord pattern in the corpus analyzedhere was very similar to that found in a previous corpus study thatcompared academic lectures with casual conversation (Crawford 2005),hinting that the findings here are likely replicable in other corpora ofstandard English, possibly without monologic discourse components.

Given the large effect of age on concord patterns found in this study, itis tempting to make the inference that concord in existentials in standardAmerican English is in the process of a shift from plural agreement tosingular agreement. However, we should be cautious about engaging inan apparent-time analysis of the current findings because of the unusualnature of the participants in an academic setting, as noted above. Firmersupport for the possibility of an age-related shift would be provided bycorpus research of the kind conducted by Hay and Schreier (2004) forNew Zealand English, comparing the results from this corpus with afuture corpus of American academic speech.

A number of proposals for accounting for variability in existentialshave appealed to the various functions that singular agreement mightserve. For example, we saw that, in general, the more interactive thesetting, the more nonconcord speakers produced. Cheshire has arguedthat in running conversations there are discourse pressures toward asingle, invariant existential form (Cheshire 1999) � what Breivik calleda “presentative signal” (Breivik 1990). The result is large numbers ofnonstandard there’s.

At the same time, the results point to plural agreement serving a socialfunction. Speakers � especially teachers and professors, who often findthemselves in formal settings and with a certain contextual social stat-ure � are undoubtedly aware of the norms for concord used in writing,and may possibly have ideas about “correct grammar”, leading to thestandard are in contexts with a plural postcopular NP. Although it islikely that, as Crawford (2005) argues, the cognitive demands of onlinespeech processing especially in formal speech contexts will lead to in-stances where nonconcord does not retreat from conversational levels, itmay be that a confluence of discourse context, age, and perceived socialstature is crucial for the production of standard are in existentials inspoken contexts at a relatively high level.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 40: There's two ways to say it: Modeling nonprestige there's

272 B. Riordan

This function of plural agreement in existentials may also have impli-cations for the students’ concord behavior. Eisikovits (1991) found evi-dence that it was only in the mid to late teenage years that speakersbegan to change their concord patterns, possibly as they began to orienttoward perceived norms of speech beyond their immediate speech com-munities. The time when these speakers transition to university life or towork may be a crucial time for the settling of concord patterns in indi-viduals’ speech, or the development of a stylistic alternative to singularagreement based on standard norms.

Nonprestige there’s may also be explained by functional advantages inprocessing. This study found higher rates of nonconcord with certaindeterminer types in the postcopular NP. This suggests that speakers areaware of the collocational statistics of there’s and words in the beginningof postcopular NPs. Thus, concord patterns may be determined by theshape of NPs because these statistics provide biases in online processing.

A processing explanation also underlies the proposal of several re-searchers (e. g., Cheshire 1999; Crawford 2005) that when speakers pro-duce nonstandard there’s, it may be the case that they are not in factcomputing agreement, but are choosing between lexified presentative sig-nals. That is, speakers may not be choosing to say there, followed by aform of the copula, followed by the rest of the sentence. Instead, thechoice could be between stored forms there’s or there’re. The finding inthis study that cardinal numerals and non-count quantificational nounsare more associated with there’s can be taken as evidence for the lexifica-tion of the there�BE form. Thus forms such as there’s a lot … may bebetter viewed as quasi-collocations, each separately stored by speakersfor reasons of processing efficiency.

However, the latter analysis is controversial because it reduces or elim-inates the role of syntax in what many (e. g., Schütze 1999) believe is afundamentally syntactic phenomenon. As a result, a number of propos-als now exist for integrating concord variation in existentials � or con-cord variation more generally � into existing syntactic theories. In whatfollows, I briefly sketch several of these proposals.

The approach adopted here of using logistic regression to model con-cord in existentials has traditionally been closely linked to the variablerule approach to linguistic variation pioneered by Labov and his col-leagues (Bayley 2002). Within a variable rule approach, the logistic re-gression model essentially reflects the grammar of a speech community.Given a certain context � that is, a constellation of linguistic and extral-inguistic factors � a rule producing one of a set of linguistic variantsapplies with a certain probability. These probabilities are estimated bythe regression procedure based on the frequencies of the linguistic vari-ants in the data. In terms of syntactic architecture, this implies that any

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 41: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 273

number of factors � linguistic, processing, social � will directly influ-ence the production of a syntactic variant; these factors form part ofspeakers’ grammars. Although individual variation is usually not consid-ered in this approach, it can be modeled as the base rate at which agiven rule applies (Bayley 2002). The adjustments to the intercepts forindividual speakers in the mixed effect model presented here reflect thesame intuition.

Although most work in the Principles-and-Parameters/Minimalist tra-dition has not countenanced variability in existentials (e. g., Chomsky1995, Hazout 2004, Moro 2006), nor syntactic variation in general, someproposals within this framework have begun to emerge.18

Schütze (1999) marshals considerable evidence to show that both sin-gular and plural agreement in existentials are best treated as grammaticalin English. He attacks the idea that plural agreement is a “grammaticalvirus” (Sobin 1997) that is layered imperfectly on a basic singular agree-ment pattern. Rather, Schütze argues that both concord and nonconcordpatterns represent options for speakers, not only in existentials but withany non-NP syntactic subjects. However, the roles of nonlinguisticfactors and idiolectal variation are left unspecified.

Based on her findings regarding the distribution of nonstandard -s ina Midlands variety of British English, Rupp (2005) argues for a universalconfiguration of functional projections that obtain in existentials, butnot other sentence types. Her proposal allows her to account for a pref-erence among her informants for nonstandard -s with no, and a dispref-erence with interrogatives. Rupp focuses on reconciling generativetheory with these findings and does not address the role of other linguis-tic or extralinguistic factors or inter-speaker variation.19

Although Adger and Smith (2005) do not address concord variationin existentials, their work represents the most explicit attempt to accountfor syntactic variation in the Minimalist framework to date. The crux oftheir proposal is that syntactic variants are represented in the lexicon bydifferences in “uninterpretable” syntactic feature specifications. After avariant is selected, the uninterpretable feature is erased during the courseof the syntactic derivation, but its presence to begin with has conse-quences for the “spellout” of a variant. Adger and Smith are clear thatthey envision only a single grammatical system for each speaker, and incontrast to other approaches, speakers’ “syntactic engines” are modular,shielded from the influence of non-syntactic factors. Syntactic variation,then, can be traced to the properties of lexical items. This approach isable to account for both extralinguistic influences and individual differ-ences, at least for cases involving simple alternations.20

An important question is whether concord variation in existentialsinvolves such a simple alternation. For modeling purposes, concord vari-

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 42: There's two ways to say it: Modeling nonprestige there's

274 B. Riordan

ation was construed as such in the current investigation. However,Henry (2005), based on speakers’ grammaticality judgments within asociolinguistic interview setting, argues that variation in individualgrammars with regard to existentials must include the types of factorsidentified in this and previous sociolinguistic studies, such as type of NPdeterminer, distance of the NP head from the copula, etc. Henry explic-itly rejects any Minimalist approach as far too restrictive to account forthe range of grammars for existentials that she and previous researchershave uncovered. In earlier work within the Principles-and-Parametersframework, Henry has advocated an approach to syntactic variationwhere optionality figures in the “core” of speakers’ grammars throughthe existence of a constrained set of parametric options, to whichstrengths are attached based on the statistical makeup of the input. Con-trary to most generative proposals, she endorses the idea of variable rulesand the role of frequency in the input as crucial to capturing patterns inacquisitional data on existentials (2002).

Pietsch (2005) adopts the theoretical framework of ConstructionGrammar to explain concord variation � including concord variationin existentials � in local varieties of Northern British English. In thisframework, speakers’ grammars are conceived of as comprised of hierar-chical construction “schemas”, with low-level constructions tied to fre-quencies in the input and higher level constructions gradually abstractedfrom low-level schemas over the course of acquisition. Pietsch proposesthat varying realizations of concord are the result of the presence ofmultiple low-level schemas attached to a general high-level subject-verbconstruction. Further, variants may compete with each other during pro-duction based on real-time linguistic and extralinguistic factors, resultingin utterance-to-utterance variability. Pietsch argues that a framework ofthis type can account for variability stemming from the range of linguis-tic and extralinguistic factors that have been identified, as well as vari-ability in individual speakers’ grammars, at any level of abstraction.

To summarize, the consistent observation of concord variability inexistentials has spurred the development of theoretical accounts that fo-cus on ways to incorporate optionality into speakers’ grammars. Exist-ing accounts differ widely in the specificity with which they deal with theinfluence of factors associated with concord variability, in their concep-tions of the relation between lexicon and grammar, and in how individ-ual variation is captured, if at all. Based on the results of this and previ-ous sociolinguistic and corpus studies, the yardstick for a successful ac-count of concord variability in existentials must be the degree to whichthe influence of a variety of factors, as well as idiolectal variation, arespecifically incorporated into the theory.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 43: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 275

Acknowledgements

I would like to thank John Paolillo, Mike Gasser, Brian Jose, and VolyaKapatsinski for comments and encouragement during the developmentof this project. I would also like to extend my appreciation to Stefan Th.Gries, Bill Crawford, and two anonymous reviewers for their valuablefeedback. Preliminary versions of this work were presented at the 2006American Association of Applied Corpus Linguistics Conference in Flag-staff, Arizona, and at the 2006 New Ways of Analyzing Variation(NWAV) conference in Columbus, Ohio. This work was supported byNational Institute for Child Health and Human Development traininggrant T32 HD07475.

Bionote

Brian Riordan is a Postdoctoral Fellow in the Department of Psycholog-ical and Brain Sciences, Indiana University. He received his Ph.D. inLinguistics and Cognitive Science from Indiana University in 2007.

Notes

1. However, two-thirds of speakers in the Longman corpus are college-educated orabove. Thus, concord norms in the two corpora might be expected to be rathersimilar. To investigate the level of education variable, it would be fairer to com-pare different subsets of the Longman corpus, e. g., high-school educated withcollege-educated with those with higher degrees.

2. One important exception is Tagliamonte (1998), who examines only variation inthe past tense. This difference in sample may have contributed to differences inthe effects of various factors on concord that were found.

3. No statistical tests were performed by Martinez Insua and Palacios Martinez(2003) or Crawford (2005).

4. The New Zealand English result was not statistically tested because of categoricalnonconcord in Britain and Sudbury’s “3 or more” category.

5. It should be noted that although these two studies report a significant associationof distance with rate of nonconcord, they only report percentages of nonconcordfor each distance category, not the numbers of tokens on which these percentageswere based.

6. Available at: http://quod.lib.umich.edu/m/micase/. The MICASE web interfacewas revised in June, 2007. This study used the previous version of the search inter-face.

7. Note that there is not and there are not are subsumed under there is and there aresearches, respectively.

8. The html source files of the resulting search pages were parsed in two stages: firstusing Web2Text software (Burke 1999) and next using a script. The output waspartially annotated with information on several factors using a separate script.

9. Adjectives can potentially occur with all types of determiners, but there is notheoretical reason to code for such cases in addition to the categories coded here.This coding is consistent with previous studies.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 44: There's two ways to say it: Modeling nonprestige there's

276 B. Riordan

10. As this article went to press, a finer-grained recoding of Primary Discourse Mode,called “Interactivity Rating”, was introduced via the MICASE web interface. Thiscoding eliminates the Panel category, and includes five categories: Highly Inter-active, Highly Monologic, Mostly Interactive, Mostly Monologic, Mixed.

11. Martinez Insua and Palacios Martinez (2003) do not report results for the pluralpostcopular NP context separately.

12. Using the vif function in the R “faraway” package.13. The difference in model fit using a fine-grained coding of determiner type and

distance versus the dichotomized coding presented in the text was examined (usingthe R glm function). The fine-grained coding included a term for each determinertype and word distance (zero, one, two, and three� words). A maximal logisticregression model using both the fine-grained coding and the dichotomized codingwas fit to the data. A likelihood ratio test (see footnote 13) of the maximal modelwith a model not including the dichotomized factor (but including the fine-grainedfactor) did not decrease the model fit, indicating that the fine-grained factor ac-counted for all of the variance accounted for by the dichotomized factor. A likeli-hood ratio test of the maximal model with a model not including the fine-grainedfactor (but including the dichotomized factor) did result in a decrease in modelfit, but this decrease was not significant (χ2 (5) � 2.57, p � 0.77). Therefore, thedichotomized factor can be employed without a significant decrease in overallmodel predictive ability.

14. The likelihood-ratio test uses the change in -2log-likelihood of the model when aterm is removed to test the null hypothesis that the coefficient for that term iszero (Norusis 2004).

15. Differences between speakers are modeled jointly by a random variable with amean of zero and a standard deviation that is estimated from the data (Baayen2007).

16. The age/discourse type interaction factor has 16 levels. Observations for a few ofthese levels are contributed by individual speakers, such that there is completeseparation of the dependent variable by speaker. In order to adequately estimatethe variance of the random effect for speaker as well as the effects of other factors,multiple observations are required for each level of each factor. This is especiallyimportant in fitting a generalized linear mixed model with a binary dependentvariable, since each observation in the data only contributes a bit of informationabout its distribution (Bates, personal communication).

17. For overviews of generative proposals regarding plural agreement in existentials,see Rupp (2005) and Moro (2006).

18. Rupp’s claims could not be addressed in the current study because her study usedgrammaticality judgments of sentence types that did not occur in MICASE.

19. A related approach is sketched in Parrott (2007) for existentials.

References

Adger, David. and Jennifer Smith2005 Variation and the Minimalist Programme. In Cornips, L., and K. Corri-

gan, Syntax and Variation: Reconciling the Biological and the Social. Am-sterdam/ Philadelphia: John Benjamins, 149�178.

Baayen, R. Harald2007 Analyzing Linguistic Data: A Practical Introduction to Statistics. Cam-

bridge: Cambridge University Press.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 45: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 277

Baayen, R. Harald, Douglas J. Davidson, Douglas BatesSubmitted Mixed effects modeling with crossed random effects for subjects and

items.Bates, Douglas

2006 Conservative ANOVA tables in lmer. Retrieved 16 June, 2007, fromhttp://wiki.r-project.org/rwiki/doku.php?id�guides:lmer-tests.

Bayley, Robert2002 The quantitative paradigm. In Chambers, J. K., P. Trudgill and N. Schil-

ling-Estes, The Handbook of Language Variation and Change. Oxford:Blackwell, 117�141.

Breivik, Leiv E.1990 Existential There: A Synchronic and Diachronic Study. Oslo: Novus Press.

Bresnan, Joan, Anna Cueni, Tatiana Nikitina, R. Harald Baayen2005 Predicting the dative alternation. In Boume, G., I. Kraemer and J.

Zwarts, Cognitive Foundations of Interpretation. Amsterdam: RoyalNetherlands Academy of Science, 69�94.

Britain, David2002 Diffusion, levelling, simplification and reallocation in past tense BE in

the English Fens. Journal of Sociolinguistics 6(1), 16�43.Britain, David and Andrea Sudbury

2002 There’s sheep and there’s penguins: convergence, “drift” and “slant” inNew Zealand and Falkland Island English. In Jones, M. C., and E. Esch,Language Change: The Interplay of Internal, External and Extra-linguisticFactors. Berlin/New York: Mouton de Gruyter, 209�240.

Burke, Damien1999 Web2Text (Version 1.6) [computer software], http://www.jetman.dircon.

co.uk/software/web2text.html.Cheshire, Jenny

1999 Spoken standard English. In Bex, T., and R. J. Watts, Standard English:The Widening Debate. New York: Routledge, 129�148.

Chomsky, Noam1995 The Minimalist Program. Cambridge, MA: MIT Press.

Christian, Donna, Walt Wolfram, Nanjo Dube1988 Variation and Change in Geographically Isolated Communities. Tusca-

loosa: American Dialect Society.Crawford, William

2005 Verb agreement and disagreement: A corpus investigation of concordvariation in There � be constructions. Journal of English Linguistics33(1), 35�61.

Crawley, Michael. J.2005 Statistics: An Introduction Using R. Hoboken, NJ: John Wiley and Sons.

Eisikovits, Edina1991 Variation in subject-verb agreement in Inner Sydney English. In Chesh-

ire, J., English Around the World. Cambridge: Cambridge UniversityPress, 235�255.

Faraway, Julian2004 Linear Models with R. Boca Raton: Chapman and Hall/CRC.

Garson, G. Davidn.d. Logistic Regression, from StatNotes: Topics in Multivariate Analysis. Re-

trieved 25 May, 2007, from http://www2.chass.ncsu.edu/garson/pa765/statnote.htm.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 46: There's two ways to say it: Modeling nonprestige there's

278 B. Riordan

Harrell, Frank E.2001 Regression Modeling Strategies: With Applications to Linear Models, Lo-

gistic Regression, and Survival Analysis. New York: Springer.Hay, Jennifer and Daniel Schreier

2004 Reversing the trajectory of language change: Subject-verb agreementwith be in New Zealand English. Language Variation and Change 16,209�235.

Hazout, Ian2004 The Syntax of Existential Constructions. Linguistic Inquiry 35(3), 393�

430.Henry, Allison

2002 Variation and syntactic theory. In Chambers, J. K., P. Trudgill and N.Schilling-Estes, The Handbook of Language Variation and Change. Ox-ford: Blackwell, 267�282.

Henry, Allison2005 Idiolectal variation and syntactic theory. In Cornips, L., and K. Corri-

gan, Syntax and Variation: Reconciling the Biological and the Social. Am-sterdam/ Philadelphia: John Benjamins, 109�122.

Huddleston, Rodney and Geoffrey K. Pullum2002 The Cambridge Grammar of the English Language. Cambridge: Cam-

bridge University Press.Jaeger, T. Florian

2005 Optional that indicates production difficulty: Evidence from disfluencies.Proceedings of DiSS’05, Disfluency in Spontaneous Speech Workshop.Aix-en-Provence, France.

Martinez Insua, Ana E. and Ignacio M. Palacios Martinez2003 A corpus-based approach to non-concord in present day English existen-

tial there-constructions. English Studies 3, 262�283.Meechan, Marjory and Michele Foley

1994 On resolving disagreement: Linguistic theory and variation � there’sbridges. Language Variation and Change 6(1), 63�86.

Milsark, George1977 Towards an explanation of certain peculiarities of the existential con-

struction in English. Linguistic Analysis 3, 1�31.Moro, Andrea

2006 Existential sentences and expletive there. In Everaert, M., and H. vanRiemsdijk, The Blackwell Companion to Syntax, volume 2. Oxford:Blackwell Publishing, 210�236.

Nichols, David1995 Interactions among dichotomous predictors in regression. Retrieved 27

June, 2006, from ftp://ftp.spss.com/pub/spss/statistics/nichols/articles/.Norusis, Marija. J.

2004 SPSS 13.0 Statistical Procedures Companion. Upper Saddle River, NJ:Prentice Hall, Inc.

Parrott, Jeffrey. K.2007 Distributed Morphology Mechanisms of Labovian Variation in Morphosyn-

tax. Department of Linguistics, Georgetown University.Pietsch, Lukas

2005 Variable Grammars: Verbal Agreement in Northern Dialects of English.Tübingen: Niemeyer.

Rupp, Laura2005 Constraints on nonstandard -s in expletive there sentences: a generative-

variationist perspective. English Language and Linguistics 9(2), 255�288.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM

Page 47: There's two ways to say it: Modeling nonprestige there's

Modeling nonprestige there’s 279

Sankoff, David, Sali Tagliamonte, Eric Smith2005 Goldvarb X: A variable rule application for Macintosh and Windows

[computer software]. Toronto, Department of Linguistics, University ofToronto.

Schilling-Estes, Natalie, and Walt Wolfram1994 Convergent explanation and alternative regularization patterns: Were/

weren’t leveling in a vernacular English variety. Language Variation andChange 6(3), 273�302.

Schütze, Carson T.1999 English expletive constructions are not infected. Linguistic Inquiry 30(3),

467�484.Simpson, Rita C., Sarah L. Briggs, Janine Ovens, John M. Swales

2002 The Michigan Corpus of Academic Spoken English. Ann Arbor, MI: TheRegents of the University of Michigan.

Sobin, Nicholas1997 Agreement, default rules, and grammatical viruses. Linguistic Inquiry 28,

318�343.SPSS Inc.

2005 SPSS 13.0 for Windows [computer software]. Chicago, IL: SPSS, Inc.Tagliamonte, Sali

1998 Was/were variation across the generations: View from the city of York.Language Variation and Change 10(2), 153�192.

The English Language Institute2003 MICASE Manual: The Michigan Corpus of Academic Spoken English.

Edited by R. C. Simpson, D. Y. W. Lee and S. Leicher. Ann Arbor: TheRegents of the University of Michigan.

Brought to you by | Johns Hopkins UniversityAuthenticated | 134.208.103.160Download Date | 4/9/14 6:59 PM