
Natural Language and Linguistic Theory manuscript No.(will be inserted by the editor)

Feature sharing in agreement

Dag Trygve Truslew Haug · Tatiana Nikitina


Abstract This article discusses the mechanism of feature sharing in the analysis of agreement across theories. We argue that there are agreement phenomena that require an agreement mechanism which is both symmetric and feature sharing. Our main argument relies on a Latin nominalized clause construction which has until now remained ill understood. We show that this construction requires a feature sharing and symmetrical approach to agreement. We also show that phenomena in Tsez and in Algonquian that have so far been described in terms of long distance agreement lend themselves to a treatment in terms of feature sharing, and we look at the consequences for the theory of agreement. We show that there are also cases of agreement which resist a feature-sharing treatment. This means that we cannot pin down a single agree mechanism. Some agreement phenomena require feature sharing, others do not, and yet others are incompatible with feature sharing.

Keywords Agreement · Feature sharing · Long distance agreement · Latin

1 Introduction

Although deceptively simple on the surface, agreement has in recent years proven to be a complex phenomenon and a rich source for linguistic theorizing. In its canonical form, agreement involves a set of features being realized in two different positions. However, the features are only ‘real’ (in the sense of being either inherent or syntactically or semantically interpretable) in one of these positions, called the controller, while they are redundant in the other position, called the target, cf. (1) from Latin.1

(1) rosa         spinosa           floruit
    rose:NOM;F   thorny:NOM;F;SG   bloomed:PST;3SG
    ‘The thorny rose bloomed.’

The noun rosa is feminine because this is an inherent property of the noun, nominative because it heads an NP whose grammatical function demands a nominative (syntactic interpretability), and singular because it denotes a single object (semantic interpretability). The adjective spinosa carries the same features, but in this case they are neither inherent in the lexeme nor interpreted. So rosa is the controller and spinosa the target.

1 To make the examples easier to read, information that can be expressed in the translation of the word is not repeated in the glosses, e.g. there is no gloss for number on nouns. The following glosses are used in the paper: 1 - first person; 3 - third person; I-IV - noun class I-IV; ABL - ablative; ABS - absolutive; ACC - accusative; CAUS - causative; COMP - complementizer; CONJ - conjunct; DAT - dative; DEF - definite; DIR - direct; EMPH - emphatic; ERG - ergative; F - feminine; GEN - genitive; IC - initial change; IMPF - imperfect; INCL - inclusive; INF - infinitive; LOC - locative; M - masculine; N - neuter; NEG - negation; NMLZ - nominalization; NOM - nominative; OBJ - object; OBL - oblique; PASS - passive; PFV - perfective; PL - plural; PRES - present; PRF - perfect; PST - past; PTCP - participle; REFL - reflexive; SBJV - subjunctive; SECOBJ - second object; SG - singular; SUBJ - subject; TA - transitive animate; TI - transitive inanimate; TRANS - transitive.

Dag Trygve Truslew Haug
Department of Philosophy, Classics, History of Arts and Ideas
PO Box 1020 Blindern
0315 Oslo
Norway
E-mail: [email protected]

Tatiana Nikitina
UMR 8135 LLACAN, CNRS
7 rue Guy Moquet
94801 Villejuif
France
E-mail: [email protected]

                      Asymmetric                           Symmetric
No feature sharing    Standard Minimalism                  Standard LFG
Feature sharing       e.g. Frampton and Gutmann (2000);    e.g. Kathol (1999);
                      Pesetsky and Torrego (2007)          Ackema and Neeleman (2013)

Table 1 Analytical choices in theories of agreement

Such examples motivate the simple and powerful idea that agreement is just feature matching: the redundant exponents of features on the target must match those on the controller. This leads to an asymmetric view of agreement, since the status of the two sets of features is fundamentally different. In particular, since the target features must match those of the controller, the controller cannot be underspecified for features that are present on the target. However, the target can be underspecified, since there is no matching requirement in the other direction.

Not all instances of agreement lend themselves easily to an asymmetric account. For example, Pollard and Sag (1994, 64) point to agreement with null (‘prodrop’) arguments, giving the Polish examples in (2), where the verb agrees with a null subject.2

(2) kochałem        kochałes          kochał
    I(masc) loved   you(masc) loved   he loved
    kochałam        kochałas          kochała
    I(fem) loved    you(fem) loved    she loved

To maintain an asymmetric view of agreement, we are essentially forced to assume that the examples in (2) involve a multiplicity of phonetically null pronominals, one for each distinct form of the verb. However, if we abandon asymmetry, we can have underspecified controllers. Hence, we can simply assume that there is a single null argument, which is unspecified for gender and person. In the general case, the target and the controller in a symmetric approach cospecify information about a single syntactic entity, and in null argument structures it just so happens that all the information comes from the target.

Symmetry is one dimension, then, along which theories of agreement may differ. Traditionally, derivational theories of syntax assumed asymmetric agreement and non-derivational theories symmetric agreement, but recently Ackema and Neeleman (2013) have argued for symmetric agreement within an otherwise derivational, minimalist approach. We return to symmetry in more detail in section 1.1.

Another dimension along which theories of agreement may differ is whether they assume what we will call feature sharing: are the features involved in an agreement configuration available in the syntactic loci of both the controller and the target, or only in that of the controller? Returning to (1), it is clear that both rosa and spinosa are morphologically singular and equally clear that only rosa is semantically singular, since spinosa denotes a property and hence semantic number does not apply at all. The question, then, is whether the syntax goes with the morphology or with the semantics. Is there a syntactic feature NUMBER sg in the syntactic locus of spinosa or not?

The standard answer is no. On the asymmetric, “matching” view of agreement, spinosa is usually taken to have a non-interpretable NUMBER feature which gets deleted during the derivation. On the symmetric view of agreement found in LFG, it is normally assumed that spinosa contributes a feature directly to its head, so again there is no feature sharing. But some linguists, working within both symmetric and asymmetric theories, have assumed feature sharing. This is true for example of Kathol (1999), who works in LFG, but also of Pesetsky and Torrego (2007), who uphold the standard minimalist distinction between interpretable controller features and uninterpretable target features, but develop a view which dissociates interpretability and valuation: when Agree matches a pair of uninterpretable and interpretable features, the result is that the features are valued in both the controller and the target position. In other words, the features in the target position will be valued, but uninterpretable. This yields an asymmetric theory with feature sharing. Similar views are found in e.g. Frampton and Gutmann (2000); Legate (2005); Bobaljik (2008). We discuss feature sharing in more detail in section 1.2.

We conclude that the question of the best architecture for a theory of agreement is both non-trivial and of cross-theoretical interest. Table 1 sums up the analytical choices that some selected theories of agreement make.

In this paper we argue that there are phenomena that require an agreement mechanism which is both symmetric and feature sharing. Our main argument relies on a Latin nominalized clause construction which has until now remained ill understood: this construction is presented in section 3, and its special agreement properties are analyzed in section 4. We will also show that phenomena that have so far been described in terms of long distance agreement (see e.g. Polinsky, 2003; Boeckx, 2009) lend themselves to a treatment in terms of feature sharing (section 5), although the converse is not true: current theories of long distance agreement cannot deal with the Latin data. Section 6 looks at the consequences for the theory of agreement. We show that there are also cases of agreement which resist a feature-sharing treatment. This means that we cannot pin down a single agree mechanism. Some agreement phenomena require feature sharing, others do not, and yet others are incompatible with feature sharing. Neither the feature sharing nor the non-feature sharing theory of agreement is more expressive than the other: they are needed for different phenomena. This does not seem to be the case with the distinction between symmetric and asymmetric agreement. Asymmetric theories are mostly justified by metatheoretical concerns, such as restrictiveness of the theory; but this means that if a single phenomenon can be shown to require symmetry, as the Latin data in fact does, the justification for asymmetry disappears. We do not want to rule out that empirical justification for asymmetric agreement could eventually be found, but in its absence we tentatively conclude that only symmetric agreement is required, and hence that only the two types of agreement defined by the absence or presence of feature sharing are needed.

2 A similar point was made by Barlow (1992).

1.1 The symmetry of agreement

While (2) shows that an asymmetric theory of agreement has to adopt an uneconomical analysis of agreement with a null argument, there is at least a way out by positing multiple null pronouns. But there are other cases which seem to be incompatible with an asymmetric approach. Consider so-called ‘unagreement’ in Spanish, as in (3) (= Ackema and Neeleman 2013, ex. 27a, originally from Harmer and Norton 1957, 270). Our discussion here will be cursory, as the goal in this section is to illustrate the theoretical and empirical issues that are at stake in the discussion on symmetry, not to argue that the Spanish data forces a symmetric analysis. For more details we refer to Ackema and Neeleman (2013) and the references there.

(3) ¡Qué   desgraciad-as      somos    las        mujer-es!
    how    unfortunate-F.PL   be:1PL   DEF;F;PL   women(F.)-PL
    ‘How unfortunate we women are!’

Clearly, the person features on the verb somos and its subject mujeres do not match: if we take mujeres to be specified as third person, there is a direct contradiction, and if we take mujeres to lack a person feature (e.g. by underspecification or by defining the third person as the absence of a person feature), then the verb (the target) appears to specify a person feature that is not present in the NP (the controller). To deal with such facts, an asymmetric theory of agreement will either have to assume a hidden subject with the appropriate features (to which the overt ‘subject’ is in apposition or anaphorically linked), or use hidden features. In this particular case the latter solution would involve the unlikely assumption that there is a noun mujeres that carries a first person feature.3 The first solution seems more promising, but Ackema and Neeleman (2013, p. 22) argue that the distribution of such unagreeing subjects matches that of regular subjects (in particular, they need not be clause-peripheral).

A symmetric theory seems better placed to deal with these facts, since target features have the same status as controller features. In (3), then, the controller mujeres is simply lexically underspecified for person and hence compatible with any person value on the target verb, whose features are just as much bearers of information (rather than being matching, ‘superfluous’ features) as controller features are.4 We can illustrate the situation as in (4), continuing with the simplification that the third person is represented as the absence of a person feature.5

(4) target [PERSON 1, NUMBER pl]  ⋢, ⋣  [GENDER f, NUMBER pl] controller

We can relate such feature structures by subsumption. A feature structure f subsumes (symbolically f ⊑ g, negated f ⋢ g) another structure g iff f is at least as general (contains the same or less information) as g. Subsumption is a partial order, so there are four possible situations. Either the controller and the target are not comparable (neither subsumes the other), as in (4), or they both subsume each other (they are identical), as in (5), or the target subsumes the controller (6), or the controller subsumes the target (7).

(5) target [PERSON 1, NUMBER pl]  ⊑, ⊒  [PERSON 1, NUMBER pl] controller

(6) target [NUMBER pl]  ⊑  [GENDER f, NUMBER pl] controller

(7) target [PERSON 1, NUMBER pl]  ⊒  [NUMBER pl] controller

3 Alternatively, a reviewer suggests that las could be an underspecified default spellout of a first person transitive D element. For our purposes here, we do not need to dwell on such responses to the analysis in Ackema and Neeleman (2013), as the issue is orthogonal to our discussion.

4 As a reviewer notes, this analysis predicts that mujeres could also occur with a second person verb and this prediction is borne out.

5 See e.g. Dalrymple and Kaplan (2000) for a more sophisticated treatment using set-valued features where the third person is the empty set.

The notion of (a)symmetry in theories of agreement relates to the status of the target features: in asymmetric theories target features are not independent but must be licensed by a corresponding feature on the controller. Asymmetric theories therefore come built in with the strong claim that universally, the agreement features of the target must subsume the agreement features of the controller, i.e. only (5) and (6) are allowed by universal grammar, and the situations in (4) and (7) do not occur. The strength of this claim obviously depends on how surface exceptions such as (3), which seems to instantiate (4), are dealt with. We will see that the Latin case we discuss in this article is particularly challenging for asymmetric theories.
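The four subsumption situations can be made concrete with a small sketch. It assumes, purely for illustration, that agreement feature structures are flat attribute-value pairs encoded as Python dictionaries; the function names `subsumes`, `compare` and `licensed_asymmetrically` are our own and do not come from any of the frameworks discussed.

```python
from typing import Dict

# Assumption: a flat feature structure is a dict from attribute to atomic value.
FeatureStructure = Dict[str, str]


def subsumes(f: FeatureStructure, g: FeatureStructure) -> bool:
    """True iff f is at least as general as g, i.e. every pair in f is also in g."""
    return all(attr in g and g[attr] == val for attr, val in f.items())


def compare(target: FeatureStructure, controller: FeatureStructure) -> str:
    """Classify a target/controller pair into one of the four situations."""
    t_sub_c = subsumes(target, controller)
    c_sub_t = subsumes(controller, target)
    if t_sub_c and c_sub_t:
        return "identical"                     # situation (5)
    if t_sub_c:
        return "target subsumes controller"    # situation (6)
    if c_sub_t:
        return "controller subsumes target"    # situation (7)
    return "incomparable"                      # situation (4)


def licensed_asymmetrically(target: FeatureStructure,
                            controller: FeatureStructure) -> bool:
    """The asymmetric claim: the target's features must subsume the controller's."""
    return subsumes(target, controller)


# Spanish 'unagreement' as in (3)/(4): the verb is the target, the NP the controller.
somos = {"PERSON": "1", "NUMBER": "pl"}
mujeres = {"GENDER": "f", "NUMBER": "pl"}
print(compare(somos, mujeres))                  # incomparable
print(licensed_asymmetrically(somos, mujeres))  # False
```

On this encoding, only the pairs classified as identical or as target-subsumes-controller pass `licensed_asymmetrically`, which mirrors the claim that asymmetric theories admit (5) and (6) but not (4) and (7).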

Let us now see more concretely how agreement works in an asymmetric theory.6 We illustrate the standard minimalist model in (8), which gives a simplified tree for (3).

(8) V  u:φ [PERSON 1, NUMBER pl]     NP  φ [GENDER f, NUMBER pl]

Like other features, agreement (φ-) features come in two types: interpretable and uninterpretable ones. The latter are prefixed with u: in (8) and occur on the agreement target. Features that are uninterpretable must be deleted before they reach Logical Form (LF), and this deletion occurs via checking (matching) against an interpretable counterpart. Thus, if the agreement target has features not present on the controller, they cannot be checked and will remain at LF, causing the derivation to crash. This will be the case in (8), where the uninterpretable PERSON feature does not have an interpretable counterpart. This is why ‘unagreement’ is problematic on an asymmetric approach.

As we saw above, one way out would be to assume that the NP in (3) is not itself the subject, but is in apposition to a subject first person plural pronoun.7 In this structure, shown on the left side of (9), target and controller features match, and the target features can be checked by the controller and removed, to yield the structure on the right side of (9), which is the input to semantic interpretation.

(9) V  u:φ [PERSON 1, NUMBER pl]     NP  φ [PERSON 1, NUMBER pl]   ⇒   V     NP  φ [PERSON 1, NUMBER pl]
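The checking-and-deletion step just described can be sketched as follows. This is a deliberately simplified illustration, not an implementation of any particular Minimalist proposal: feature bundles are assumed to be flat dictionaries, and the names `DerivationCrash` and `agree_and_delete` are hypothetical.

```python
from typing import Dict

FeatureBundle = Dict[str, str]  # assumption: flat attribute-value pairs


class DerivationCrash(Exception):
    """Raised when an uninterpretable feature would survive to LF."""


def agree_and_delete(target_uphi: FeatureBundle,
                     controller_phi: FeatureBundle) -> Dict[str, FeatureBundle]:
    """Check each uninterpretable target feature against the controller.

    A feature is checked only if the controller carries the same attribute
    with the same value. If all are checked, they are deleted from the
    target; otherwise the derivation crashes.
    """
    unchecked = {attr: val for attr, val in target_uphi.items()
                 if controller_phi.get(attr) != val}
    if unchecked:
        raise DerivationCrash(f"unchecked uninterpretable features: {unchecked}")
    # Output structure: the target keeps no agreement features,
    # the controller keeps its interpretable ones.
    return {"V": {}, "NP": dict(controller_phi)}


# The configuration in (9): a first person plural controller.
output = agree_and_delete({"PERSON": "1", "NUMBER": "pl"},
                          {"PERSON": "1", "NUMBER": "pl"})
print(output["V"], output["NP"])

# The configuration in (8): PERSON on the target has no counterpart.
try:
    agree_and_delete({"PERSON": "1", "NUMBER": "pl"},
                     {"GENDER": "f", "NUMBER": "pl"})
except DerivationCrash as crash:
    print("crash:", crash)
```

Note that in the converging case the output for V is empty: the verb's agreement features are gone once checking has applied, which is the property that the discussion of feature sharing turns on.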

Contrast this with the symmetric approach, as illustrated with LFG’s standard agreement theory. The idea here is that agreement is co-specification of a single syntactic entity. When a verb agrees with its subject, it directly specifies features of its subject. For example, a simplified feature structure for the verb somos in (3) might be as in (10).

(10) g [ “BE”
         TENSE  pres
         SUBJ   h [ AGR [ PERSON 1
                          NUMBER pl ] ] ]

6 In derivational theories, especially when making use of covert or string-vacuous movement, the question will also arise whether agreement is upwards or downwards. We do not need to take a position in this debate here (see e.g. Zeijlstra (2012) and Preminger (2013)), as our goal is simply to illustrate the general workings of the theory.

7 As we also saw above, this solution is criticized in Ackema and Neeleman (2013). We present it here for expository purposes only.

The feature structures are labelled: g is the verb’s feature structure and h that of its subject. Notice that LFG makes use of recursive feature structures, so the subject’s feature structure is the value of the SUBJ attribute inside the verb’s feature structure. Inside SUBJ in turn, the feature structure AGR bundles the agreement features. Thus, the structure in (10) makes the verb (partially) specify its subject’s features, in particular in this case, the values of PERSON and NUMBER.

A simplified feature structure of the subject is given in (11).

(11) i [ “WOMAN”
         AGR  [ NUMBER pl
                GENDER f ] ]

Other principles of the grammar ensure that i is identified as the subject of g, i.e. i = h. i and h must therefore unify, and since there are no conflicting features in the two structures, we get the well-formed feature structure in (12).

(12) g [ “BE”
         TENSE  pres
         SUBJ   h,i [ “WOMAN”
                      AGR  [ PERSON 1
                             NUMBER pl
                             GENDER f ] ] ]

The AGR features of the subject have various origins: PERSON 1 is contributed by the verb, GENDER f by the noun, and NUMBER pl by both the verb and the noun. Agreement results from the unification of two AGR attributes via the solving of an equation: the verb specifies information about h AGR and the subject specifies information about i AGR, so when the grammatical function assignment tells us that h = i, we know that h AGR and i AGR are just two different names for the same feature structure. Importantly, therefore, the resulting feature structure only has a single AGR attribute, in the syntactic locus of the subject: there is no AGR in the outer, verbal feature structure g.
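The co-specification in (10)-(12) can be illustrated with a minimal unification sketch. It is not an LFG implementation: feature structures are assumed to be nested Python dictionaries with atomic string values, the attribute name `PRED` for the semantic form (“BE”, “WOMAN”) is our assumption, and `unify`, `attach_subject` and `UnificationFailure` are hypothetical names.

```python
class UnificationFailure(Exception):
    """Raised when two structures assign conflicting values to an attribute."""


def unify(f, g):
    """Unify two feature structures (nested dicts with atomic string values)."""
    if isinstance(f, dict) and isinstance(g, dict):
        result = dict(f)
        for attr, val in g.items():
            # Shared attributes are unified recursively; new ones are added.
            result[attr] = unify(result[attr], val) if attr in result else val
        return result
    if f == g:
        return f
    raise UnificationFailure(f"{f!r} conflicts with {g!r}")


def attach_subject(verb_fs, subject_fs):
    """Identify the noun's structure with the value of the verb's SUBJ (i = h)."""
    result = dict(verb_fs)
    result["SUBJ"] = unify(verb_fs.get("SUBJ", {}), subject_fs)
    return result


# (10): the verb partially specifies its subject's AGR.
VERB = {"PRED": "BE", "TENSE": "pres",
        "SUBJ": {"AGR": {"PERSON": "1", "NUMBER": "pl"}}}
# (11): the noun specifies its own AGR.
NOUN = {"PRED": "WOMAN", "AGR": {"NUMBER": "pl", "GENDER": "f"}}

clause = attach_subject(VERB, NOUN)
print(clause["SUBJ"]["AGR"])   # PERSON, NUMBER and GENDER pooled in one AGR
print("AGR" in clause)         # False: no AGR in the outer, verbal structure
```

As in (12), the pooled agreement features end up in a single AGR inside SUBJ, and a noun specified as, say, NUMBER sg would make `unify` fail rather than yield a structure.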

We have illustrated the asymmetric theory with Minimalism, and the symmetric theory with LFG, but the symmetry of agreement cuts across theories. For example, LFG has a mechanism of constraining equations, which unlike other equations merely checks for the presence of a feature, in much the same way as uninterpretable features in Minimalism. Although we are not aware of any attempts within LFG to reduce all agreement to constraining equations, specific phenomena have been treated in terms of such equations (see e.g. Andrews (1982) on morphological blocking, Dalrymple and Kaplan (2000) on feature indeterminacy and Wechsler (2011) on mixed agreement). And within Minimalism, Ackema and Neeleman (2013) have argued for a symmetric theory of agreement. In other words, the distinction between symmetric and asymmetric agreement is of cross-theoretical relevance.

1.2 Feature sharing

In addition to the question of symmetry, agreement theories differ in where in the syntax the agreement features are located. There are two possible options: either we assume that agreement features are present in the syntactic loci of both the controller and the target (‘feature sharing’), just as they are morphologically expressed in both positions; or the features are only present in the syntactic locus of the controller, despite their morphological realization in both positions. The latter view is standard in both LFG and Minimalism, so the theories are similar on this parameter.8 The motivation is primarily semantic: the features are only represented in the locus where they are interpreted. For example, although the first person feature in (3) is overtly represented on the target (the verb), its interpretation is that the denotation of the controller (the subject NP) is a set including the speaker.

But analyses in terms of feature sharing have also been proposed, sometimes within an otherwise asymmetric model (Frampton and Gutmann, 2000; Legate, 2005; Pesetsky and Torrego, 2007; Bobaljik, 2008), sometimes in a symmetric setting (Wechsler and Zlatic, 2003; Ackema and Neeleman, 2013). One way to think about the motivation for this is as an interface problem. When two items agree in a feature, this means they both bear a morphological exponent of the same value for that feature. In that sense, there is symmetry in the morphology, and that is the basis for observing agreement in the first place. By contrast, there is no doubt that semantically, agreement features are licensed (i.e. are either inherent or interpreted) on the controller only.9 This is equally crucial to the notion of agreement: if two lexical items are both semantically plural, or both inherently feminine, then we would not say that they agree with each other. Jointly, the morphological symmetry and the semantic asymmetry are necessary and sufficient definitional properties of agreement. They are also necessary and sufficient conditions for distinguishing targets and controllers.

This leaves open what happens at the syntactic level. The standard answer is that syntax pairs with semantics, i.e. the agreement features are only realized in the syntactic locus of the controller, not in that of the target. This is not to say that target features play no role at all: as we saw above, they do serve to ‘filter’ syntactic structures. In derivational approaches, they rule out structures where the uninterpretable features cannot be matched by the agreement mechanism; and in LFG, they rule out feature structures where unification fails. However, neither the V in the output structure of (9) nor the verbal feature structure g in (12) contains the agreement features. Put another way, syntax goes with semantics: the verb is morphologically first person plural, but it is neither syntactically nor semantically a first person plural. Instead, the first person plural features are syntactically represented in the locus of the noun phrase, which is also where they belong semantically, since they indicate that the reference of the noun phrase is plural and includes the speaker; they do not change the interpretation of the verb as such.

This is not a forced conclusion. It is possible to argue that syntax goes with the morphology, i.e. that features are present in the syntactic projection of words where they have morphological exponence, even when they are not licensed semantically. In other words, the mismatch is not between morphology on one side and syntax-semantics on the other, but between morphology-syntax on one side and semantics on the other. We will refer to this view as syntactic feature-sharing. (13) and (14) show what the resulting syntactic structures could look like in Minimalism and in LFG, replacing (9) and (12), respectively. (These are just possible instantiations of a feature sharing analysis, which we will explore in more detail in section 4.3.)

(13) V  u:φ [PERSON 1, NUMBER pl]     NP  φ [PERSON 1, NUMBER pl]   ⇒   V  φ [PERSON 1, NUMBER pl]  NP

(14) f [ “BE”
         TENSE  pres
         AGR    [ ]─────────────────────────┐
         SUBJ   g,h [ “WOMAN”               │
                      AGR  [ PERSON 1, NUMBER pl, GENDER f ] ] ]

8 Similar enough for our purposes, that is. There are two main differences: First, the relevant notion of syntactic locus is different in the two theories. As we just saw, Minimalism typically represents agreement features in the tree structure, whereas LFG situates them in a feature structure, an attribute-value matrix which in graph-theoretical terms is a directed, possibly cyclic graph, not a tree. Second, one could argue that agreement features are not entirely absent from the target locus in standard Minimalism, since they are present as uninterpretable features, as shown in the lefthand side of (9). However, in the end result of the derivation (the righthand side of (9)) they disappear, and hence they cannot act as controllers in another agreement relation at the same time.

9 The situation with case is different from that of number and gender, since case has no semantics. However, case features are typically still interpreted on the controller only in the sense that they specify the controller’s function, not that of the target.


The line in (14) indicates that the two instances of AGR share the same feature structure value. This type of feature sharing agreement is sometimes assumed in HPSG. For example, Sag et al. (2003, 238) assume the Specifier-Head Agreement Constraint in (15), which ensures that all inflecting lexemes agree with their specifier.

(15) infl-lxm : [ SYN [ HEAD [ AGR [1] ]
                        VAL  [ SPR ⟨ [ AGR [1] ] ⟩ ] ] ]

The index 1 indicates structure sharing (just like the line in the LFG representation), the presence of the same value in two different positions in the attribute-value graph. So this constraint enforces structure sharing of the agreement features (AGR) between any inflecting head and its specifier (SPR), leading to feature sharing in our terms.
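Structure sharing in this sense is token identity of a value, not mere equality of two copies. A rough sketch of the difference, assuming nested Python dictionaries as a stand-in for attribute-value graphs (the helper `make_infl_lexeme` and the sample feature values are hypothetical and only mimic the shape of (15)):

```python
import copy


def make_infl_lexeme(agr):
    """Build a structure shaped like (15): HEAD|AGR and SPR|AGR are one object."""
    return {"SYN": {"HEAD": {"AGR": agr},
                    "VAL": {"SPR": [{"AGR": agr}]}}}


lexeme = make_infl_lexeme({"NUMBER": "pl"})
head_agr = lexeme["SYN"]["HEAD"]["AGR"]
spr_agr = lexeme["SYN"]["VAL"]["SPR"][0]["AGR"]

# Information contributed through the specifier is automatically
# available on the head, because both paths lead to the same node.
spr_agr["GENDER"] = "f"
print(head_agr is spr_agr)   # True
print(head_agr)              # {'NUMBER': 'pl', 'GENDER': 'f'}

# Contrast: two equal but distinct copies do not stay in step.
unshared = copy.deepcopy(lexeme)
copy_head = unshared["SYN"]["HEAD"]["AGR"] = dict(head_agr)
unshared["SYN"]["VAL"]["SPR"][0]["AGR"]["PERSON"] = "3"
print("PERSON" in copy_head)  # False
```

Because the head itself carries the shared value, it can in turn be inspected by a further agreement relation, which is the property of feature sharing that matters for the argument here.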

Not all work in HPSG assumes feature sharing. Pollard and Sag (1994, 82) posit the alternative structure in (16) for a 3.sg. verb in English.

(16) [ CATEGORY [ HEAD   [ VFORM fin ]
                  SUBCAT ⟨ NP[nom] [1] [3rd, sing] ⟩ ]
       CONTENT  [ RELATION walk
                  WALKER   [1] ] ]

Here too, there is structure sharing, but not feature sharing in our sense. The verb imposes agreement by constraining the first (and only) element on its SUBCAT list to be 3rd person singular and that element is structure shared with the value of the WALKER feature. This way we get a symmetric approach with cospecification of the agreement features. But there is no sense in which the verb itself “has” the features 3rd person singular in the syntax, and the verb could not be the controller of agreement in these features. (It may seem strange to assume that the verb would be an agreement controller, but as we will see, this is what happens in Latin.)

In fact there is a strand of work in HPSG (see e.g. Wechsler and Zlatic, 2003) which assumes that both these agreement mechanisms are available and that feature sharing (15) appears in so-called CONCORD agreement (prototypically CASE, GENDER, NUMBER agreement inside NPs), while standard symmetry (16) is typical of INDEX agreement (prototypically PERSON, NUMBER, CASE agreement in predicate-argument structures). We return to this distinction in section 6.2.

The motivations for assuming feature sharing differ between authors. For some, such as Ackema and Neeleman (2013), which we will discuss in more detail in section 4.3, it appears to be mainly a byproduct of their way of ensuring symmetry. For Legate (2005), it is related to her view of phases in the syntactic derivation. In Pesetsky and Torrego (2007), whose feature sharing theory is not symmetric, feature sharing is mainly a way of passing information from lexical to functional categories, e.g. from the finite verb in vP to TnsP or between a relative phrase in spec,CP and the head C. This is a peculiarity of Minimalism’s Agree and not what we generally associate with agreement phenomena in a theory-neutral perspective. From a more empirical point of view, the most comprehensive defense of a feature-sharing approach is found in Kathol (1999).

One of his arguments is particularly relevant here, namely the observation that a non-feature sharing approach cannot explain why there often is a close morphological relationship between the form of the selector category and the category selected. Consider for example the Latin case in (17) (= Kathol, 1999, ex. 13).

(17) illarum          duarum         bonarum         feminarum
     those:GEN;F;PL   two:GEN;F;PL   good:GEN;F;PL   women:GEN;F;PL
     ‘of those two good women’

Clearly, we can simplify the morphology-syntax interface considerably if we assume that the affix -arum contributes the features genitive, feminine, plural to whatever stem it attaches to. As Kathol observes, such morphological correspondences are less common in predicate-argument agreement than in NP-internal agreement, but they do exist, cf. (18) (= Kathol 1999, ex. 14, originally from Welmers 1973, 171) from Swahili.

(18) a. Kikapu   kikubwa   kimoja   kilianguka.
        basket   large     one      fell
        ‘One large basket fell.’
     b. Vikapu    vikubwa   vitatu   vilianguka.
        baskets   large     three    fell
        ‘Three large baskets fell.’

Kathol’s argument is conceptual rather than empirical in nature, since we can get the right predictions if we just complicate our theory of how affixes contribute features to the stem they attach to.10 Nevertheless, it is interesting to observe that Latin participle-subject agreement, which we will present in sections 2–3, displays exactly the same morphological pattern.

So there are several ways of justifying feature sharing. But as far as we are aware, only Legate (2005) has made use of the property of feature sharing which is crucial to our analysis of Latin participles, namely that since the agreement features will be available in the locus of the target, the target can act as a controller of further agreement processes in these same features, yielding ‘cyclic agreement’.11 This, as we will see, is the key to understanding the Latin dominant participle construction and moreover, it allows for a feature sharing treatment of so-called long distance agreement in general without violating syntactic locality, a concept to which we now turn.

1.3 Syntactic locality and multiple agreement

Syntactic locality has long been an important concept in generative grammar. It goes back at least to Chomsky (1965), who formulated a strict locality condition on subcategorization, which was further refined by Kajita (1968). As pointed out by Sag (2010), the locality of subcategorization gives us a non-trivial prediction that there cannot be a verb evorp which is like prove except it imposes the non-local constraint that its complement clause be transitive:

(19) a. Lee evorped that someone bought the car.
     b. *Lee evorped that someone died.
     c. *Lee evorped that someone ran into the room.

Locality considerations are relevant for many linguistic phenomena; see Sag (2010) for an overview. Most syntactic theories assume some locality principle. Although the exact implementation will differ between frameworks, they all seek to restrict the application of syntactic processes, including agreement, to local domains, and to analyze apparent non-local processes as (sequences of) local processes.

What is a local domain? In a phrase structure grammar, the strictest definition of a local domain will restrict syntactic processes to configurations such as head-specifier, head-complement and head-adjunct. Similar notions may be given dependency-based definitions in dependency grammars and in LFG’s feature structures. For our purposes, however, a more comprehensive and less theory-dependent notion is desirable. We adopt what Polinsky and Potsdam (2001, 609) dub ‘the clause-mate assumption’ in (20).

(20) The controller and the target are in the same clause at some level of syntactic representation

Following Chomsky (2000), a major conceptual argument in favor of cyclicity/locality is that it allows a major reduction in computational complexity, in the sense that it limits the search space that the linguistic mechanisms have to consider. The smaller we assume local domains to be, the stronger this argument becomes, for the more we limit the search space.

Given that there are clear instances of verbs agreeing with oblique arguments, the clause-mate assumption is the strongest locality constraint on agreement that we can plausibly entertain. As such, it is conceptually desirable. However, much modern work in Minimalism assumes that the clause-mate assumption is too strong. Instead, it assumes that syntactic locality (including in agreement) follows the Phase-Impenetrability Constraint (Chomsky, 2000, 108):

(21) In phase α with head H, the domain of H is not accessible to operations outside α, only H and its edge [i.e. any specifiers of H, our comment] are accessible to such operations.

(21) is more liberal than the clause-mate assumption in that it makes an exception for the edge/specifiers of H. Polinsky (2003) argues empirically that we need this exception in order to account for apparent long-distance agreement phenomena. More radically, Bošković (2007) argues that Agree is not subject to the Phase-Impenetrability Condition at all and that agreement (but not movement) can look into phases, i.e. there are no locality constraints on agreement.

10 A reviewer objects that it is not the case that ‘morphological identity of exponents is a crucial factor for feature sharing, and that lack of such identity interferes with feature sharing’. We agree, but this is not the force of Kathol’s argument. Rather, the point is that whenever there is formal identity of exponents, feature sharing theories can assume that the exponents are in fact the same. A non-feature sharing theory, on the other hand, will have to conclude that the two surface-identical exponents are in fact different and only one of them expresses an interpretable feature (in Minimalist terms), or contributes information directly about the word it attaches to (in LFG terms).

11 The use of cycles of feature-sharing Agree to pass information up the tree in Pesetsky and Torrego (2007) is similar, but as we noted above this is a theory-internal use of Agree in Minimalism, not connected with what is usually understood as agreement.


Both Polinsky and Bošković base their arguments on so-called long distance agreement. (22) (= Polinsky and Potsdam, 2001, ex 48a) shows an example from Tsez (Tsezic, Northeast Caucasian).

(22) eni-r       [už-a    magalu         b-ac’-ru-ìi]           b-iy-xo
     mother-DAT  boy-ERG  bread:III;ABS  III-eat-PST;PTCP-NMLZ  III-know-PRES
     ‘The mother knows the boy ate the bread.’

In Tsez, verbs regularly agree with their absolutive argument in noun class (glossed with Roman numerals). But in (22), we see that the matrix verb b-iy-xo ‘know’ bears a noun class III feature that apparently comes from the noun magalu inside the complement clause. In other words, the matrix verb agrees, not with its own absolutive argument, but with the absolutive of its complement clause. Hence the term ‘long-distance agreement’.

Polinsky (2003) and Bošković (2007) both conclude that such agreement is truly long-distance in that it violates the clause-mate assumption in (20). But Polinsky (2003) argues that the weaker Phase-Impenetrability Condition (21) is not violated: magalu ‘bread’ undergoes topicalization to the edge of the complement clause, which means that it is on the edge of that phase and hence available for agreement with the matrix verb. Note that this topicalization must be covert, for magalu in (22) is not overtly at the edge of the complement clause, but rather preceded by the ergative argument už-a. Bošković (2007) argues against such covert topicalization. He concludes that Tsez long distance agreement is even more radical in that it violates not only the clause-mate assumption but also the Phase-Impenetrability Condition. In fact, he claims there are no locality constraints on agreement, only intervention effects: the verb must agree with the closest eligible controller. Since only absolutives are eligible in Tsez, we get the long distance agreement in (22).

We will return to the analysis of Tsez long distance agreement in section 5. But let us observe already at this stage that the agreement in (22) seems to be ‘mediated’ via agreement on the verb of the complement clause, b-ac’-ru-ìi, which is also marked for noun class III. This suggests a feature sharing analysis: the embedded verb agrees with its absolutive argument and thereby shares its noun class feature, which in turn is passed up to the matrix verb, i.e., we have cycles of local agreement relations rather than proper long distance agreement. We will explore this analysis in more detail in section 5. For now, we note that such an analysis, if feasible, would account well for the data without violating syntactic locality.
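The information flow behind this idea can be made concrete with a minimal Python sketch. It is an illustration under simplifying assumptions, not the analysis developed in section 5: the names Bundle, Node and share are hypothetical, agreement is modelled as merging two feature bundles into one shared object, and the complement clause is represented by its head, the embedded verb.

```python
class Bundle:
    """A feature bundle that several syntactic positions may share."""

    def __init__(self, **features):
        self.features = dict(features)
        self.forward = None  # set once this bundle is merged into another

    def resolve(self):
        """Follow forwarding links to the bundle that is currently live."""
        bundle = self
        while bundle.forward is not None:
            bundle = bundle.forward
        return bundle


class Node:
    """A word; a clause is represented by its head (an assumption)."""

    def __init__(self, form, **features):
        self.form = form
        self._bundle = Bundle(**features)

    @property
    def agr(self):
        return self._bundle.resolve().features


def share(a, b):
    """Local agreement as feature sharing between two clause-mates:
    unify their bundles and let both positions point at the result."""
    left, right = a._bundle.resolve(), b._bundle.resolve()
    if left is right:
        return
    for name, value in right.features.items():
        if left.features.get(name, value) != value:
            raise ValueError(f"feature clash on {name}: {a.form} vs {b.form}")
    left.features.update(right.features)
    right.forward = left


# Example (22), with the verb forms replaced by simple labels.
bread = Node("magalu", CLASS="III")
embedded_verb = Node("eat")   # head of the complement clause
matrix_verb = Node("know")

share(embedded_verb, bread)        # local: verb and its absolutive argument
share(matrix_verb, embedded_verb)  # local: matrix verb and its complement's head

print(matrix_verb.agr)               # {'CLASS': 'III'}
print(matrix_verb.agr is bread.agr)  # True: one bundle, three positions
```

Neither call to share relates the matrix verb directly to the noun inside the complement clause. The class III value reaches the matrix verb only because the two local steps end up sharing a single bundle.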

If so, we have a powerful, conceptual argument in favor of maintaining locality. All varieties of generative syntax hold that a strong locality constraint is a desirable principle because it reduces computational complexity. If some, or even most, syntacticians working within Minimalism now think the strongest possible locality constraint – the clause-mate assumption – does not apply to agreement, it is precisely because of the long distance agreement data. If this data can be explained through feature sharing operating purely locally, we should on conceptual grounds – the reduction of computational complexity – prefer such an account to one that uses long-distance agreement.

But the issue is not purely conceptual. We will see in section 4 that a feature sharing, local notion of agreement is needed for the Latin data and in section 5 that such an account will generalize to previously reported data on long distance agreement, while the non-local analyses that have been devised for the latter cases cannot conversely be generalized to the Latin data. In sum, the local analysis is not only conceptually attractive but also empirically more successful.

2 Latin agreement

The specific agreement phenomenon that is the focus of our article occurs in so-called dominant participle constructions. To motivate our analysis of these, we first briefly survey the different usages of Latin participles and the agreement facts in these contexts.

Latin has three different participles. There is a future participle whose distribution in Classical Latin is limited to periphrastic forms. We will ignore this form here. The two other participles are the present (active) and the perfect (passive) participles. These are similar to the English participles in -ing and -ed.

The present active and the perfect passive participle both have a variety of uses illustrated in (23)-(29): attributive (23), nominalized (24), subject predicative (25), object predicative (26), periphrasis (27), free predicative (28) and ablative absolute (29). In these examples the participle is bold-faced and its agreeing NP (if present) is in italics. Notice that the attributive (23) and the free predicative (28) are generally identical;12 it is a matter of textual interpretation which analysis is correct for a given example.

(23) rosa        florens            pulchra             est
     rose:NOM;F  blooming:NOM;F;SG  beautiful:NOM;F;SG  is
     ‘The blooming rose is beautiful.’ (attributive)

12 As far as we can tell from the written text, that is. But it is likely that attributive participles, unlike free predicates, formed constituents with their nouns. This constituency could have been marked prosodically, but such evidence is of course no longer available to us.

(24) medici         leviter  aegrotantes           leniter  curant
     doctors:NOM;M  lightly  being.ill:ACC;M/F;PL  mildly   cure:PRES;3P
     ‘Doctors cure the lightly ill mildly.’ (nominalized, Cic. de Off 1.83)

(25) rosa        florens                est
     rose:NOM;F  blooming:NOM;M/F/N;SG  is
     ‘The rose is blooming.’ (subject complement)

(26) vidi  puerum     currentem
     saw   boy:ACC;M  running:ACC;M/F;SG
     ‘I saw the boy running.’ (object complement)

(27) puer       amatus          est
     boy:NOM;M  loved:NOM;M;SG  is
     ‘He was/has been loved.’ (periphrastic perfect)

(28) rosa        florens                pulchra             est
     rose:NOM;F  blooming:NOM;M/F/N;SG  beautiful:NOM;F;SG  is
     ‘A rose is beautiful when it blooms.’ (free predicative)

(29) his                pugnantibus            illum    in  equum
     them:ABL;M/F/N;PL  fighting:ABL;M/F/N;PL  him:ACC  in  horse:ACC;M
     quidam         ex    suis                  intulit
     someone:NOM;M  from  his.own:ABL;M/F/N;PL  mount:PRF.3S
     ‘while they were fighting, one from his [attendants] mounted him on a horse.’ (absolute construction, Caes. Gal. 6.30)

In all cases except nominalizations, the clause contains an NP that agrees in CASE, NUMBER and GENDER with the participle. Moreover, Haug and Nikitina (2012) argue for an analysis in which the agreeing NP is always the subject of the participle. This analysis is obvious in the periphrastic case (27), and in the subject complement case (25) it follows on a standard analysis of the copula as either a raising verb or an auxiliary. Similarly, in the ablative absolute (29), although the exact structure of the construction and its relation to the matrix clause may be disputed, it is clear that the NP his must be the subject of pugnantibus.

In the object complement case (26), the subjecthood of the NP follows automatically from an analysis in terms of exceptional case marking (ECM: the NP is assigned case by the matrix verb but is structurally the subject of the participle) or raising to object (the NP is the thematic subject of the participle but ‘raises’ to the matrix clause and receives case there). If instead we analyze (26) as object control, it is not clear whether puerum or a controlled PRO is the subject of the participle. But Latin in fact offers good evidence in favor of identity theories of control, where the controller has two theta roles (Cecchetto and Oniga, 2004). This can be implemented as in the movement theory of control (Boeckx et al., 2010) or via LFG’s functional control mechanism (Bresnan, 1982), but the result is the same: puerum would be the subject of the participle, even on a control analysis. The same holds for the free predicative use in (28).

In attributive structures such as (23) it is less obvious that the NP is the participle’s subject. Indeed it is sometimes assumed that attributive elements have no subjects at all. However, reflexives governed by attributive or nominalized participles can be bound by the participle’s semantic subject as in (30)–(31).

(30) testificor       autem  rursum  omni                homini
     testify:PRES;1S  but    again   every:DAT;M/F/N;SG  man:DAT;M
     circumcidenti              se        quoniam
     circumcising:DAT;M/F/N;SG  REFL;ACC  that
     ‘I declare to every man who circumcises himself that . . .’ (Vulgate, Gal. 5:3)13

(31) notavi          etiam  in  porticu      gregem       cursorum
     observe:PRF;1S  also   in  gallery:DAT  troop:ACC;M  runners:GEN;M
     cum   magistro   se        exercentem
     with  coach:ABL  REFL;ACC  practicing:ACC;M/F;SG
     ‘I also noticed a troop of runners practicing in the gallery with their coach.’14 (Petronius, Satyricon 29)

13 This example is a translation from Greek, but the Greek original has a different structure without a reflexive pronoun, showing that the binding is real Latin.

14 It is crucial here that the Latin verb noto, unlike its English counterpart notice, is only constructed with an NP object, not with a (small) clause.


Latin reflexives must be bound by subjects unless they are used as logophors, which is not the case here. Matrix subjects can bind into participle clauses, but this is clearly not what is happening in (30)–(31). We must conclude that these participles do have subject positions. The subject in (30) must be either the NP omni homini, or a position controlled by it. Again, the interpretation of such a structure depends on the theory of control that one adopts, but on identity theories of control, which are supported by the Latin data, the conclusion must again be that the NP is the subject of the participle. A similar analysis can be given for “nominalized” participles, implying that they are null head modifiers (Devine and Stephens, 2000, 228-246).

It follows that the agreement facts that we see in (23)-(29) can all be stated in very simple terms as in (32).

(32) Participles agree in NUMBER, GENDER and CASE with their subject.

Moreover, the principle in (32) is just a general instance of predicate-subject agreement in Latin, which can be stated as in (33).

(33) If a predicate bears morphological exponents of PERSON, NUMBER, GENDER and/or CASE, then there is obligatory agreement in these features between the predicate and its subject.

This principle holds for participles, and also for adjectives in primary and secondary predication (agreeing in NUMBER, GENDER and CASE), and for finite verbs (which agree in PERSON and NUMBER). Agreement in PERSON is often (see e.g. Wechsler and Zlatić, 2003) taken as indicative of a different kind of agreement (INDEX-agreement) from agreement in CASE (CONCORD-agreement), with agreement in GENDER and NUMBER being possible in both types. We will return to this distinction in section 6.2, but note that the principle in (33) covers both types of agreement.
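The content of (33) can be spelled out as a small Python check. This is only a sketch under stated assumptions: the function name agrees and the dictionary encoding are hypothetical, feature values are sets so that syncretic forms such as M/F/N can be left underspecified, and the NUMBER and PERSON values given to the nouns are supplied for illustration.

```python
AGREEMENT_FEATURES = ("PERSON", "NUMBER", "GENDER", "CASE")


def agrees(predicate, subject):
    """Check principle (33) for one predicate-subject pair.

    Each argument maps a feature name to the set of values its form allows.
    A feature that the predicate does not expose is simply not checked; a
    feature missing on the subject is treated as unrestricted (an assumption).
    """
    for feature in AGREEMENT_FEATURES:
        if feature in predicate and feature in subject:
            if not predicate[feature] & subject[feature]:
                return False
    return True


rosa = {"CASE": {"nom"}, "GENDER": {"f"}, "NUMBER": {"sg"}, "PERSON": {3}}
medici = {"CASE": {"nom"}, "GENDER": {"m"}, "NUMBER": {"pl"}, "PERSON": {3}}

# Participles expose CASE, GENDER and NUMBER; the finite verb PERSON and NUMBER.
florens = {"CASE": {"nom"}, "GENDER": {"m", "f", "n"}, "NUMBER": {"sg"}}
amatus = {"CASE": {"nom"}, "GENDER": {"m"}, "NUMBER": {"sg"}}
curant = {"PERSON": {3}, "NUMBER": {"pl"}}

print(agrees(florens, rosa))   # True
print(agrees(amatus, rosa))    # False: gender mismatch
print(agrees(curant, medici))  # True
print(agrees(curant, rosa))    # False: number mismatch
```

The same function covers participles and finite verbs because it checks only the features a predicate actually bears, which is the point of stating (33) over morphological exponents.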

3 The dominant participle construction

There is one participle construction that we have not yet considered, but which displays very interesting agreement properties. This is the so-called dominant participle construction, also known as the ‘ab urbe condita’ construction after the famous instance used in the Roman dating system.15 This construction is a nexus of a participle and a noun that can appear in all nominal contexts, but with clausal semantics, as shown in (34). (The participle is bold-faced and its agreeing NP in italics, as in examples (23)–(29).)

(34) ab    urbe           condita
     from  city:ABL;F;SG  founded:ABL;F;SG
     ‘from the city’s founding’

As indicated in the translation, this construction has a very close analogue in the English nominal gerund construction. We will argue that the structure is essentially the same – that of a nominalized small clause – but that the Latin agreement system obscures this deeper similarity.

The structure and history of the dominant participle construction are discussed extensively in Nikitina and Haug (2015). We refer the reader to that paper for a detailed account, including previous scholarship. Here we only survey the most important facts.

3.1 A mixed-category analysis of dominant participles

As far as we are aware, all previous analyses of the dominant participle construction assume that the noun is the syntactic head and the participle is an attributive or sometimes predicative adjunct. The main argument quoted in favor of this analysis seems to be that the participle agrees with the noun. However, we have seen that agreement is not restricted to attributive constructions but is general across all uses of the participle. In fact there are at least three reasons to consider the participle the head of the dominant construction:

First, the dominant construction is commonly attested with a pronoun in the nominal slot, as in (35), where the subject is the relative pronoun quibus.

(35) Quibus              latis                 gloriabatur
     which:ABL;M/F/N;PL  carried:ABL;M/F/N;PL  glory:IMPF;PASS;3S
     ‘[the laws] in the passing of which he gloried.’ (Cic. Phil. 1.10)

15 The Roman calendar counted the years from the foundation of Rome in (allegedly) 753 BCE.


Pronouns cannot normally be modified in Latin, so this construction cannot be attributive.

Second, the meaning of the construction is clause-like, and (36) allows for a number of clausal paraphrases (37), as noted by Pinkster (1990, 133):16

(36) occisus          dictator        Caesar    aliis             pessimum
     killed:NOM;M;SG  dictator:NOM;M  C.:NOM;M  others:DAT;M/F/N  worst:NOM;N;SG
     aliis             pulcherrimum             facinus     videretur
     others:DAT;M/F/N  most.beautiful:NOM;N;SG  deed:NOM;N  perceive:IMPF;SBJV;PASS;3SG
     ‘the slaying of Dictator Caesar seemed to some the worst, and to others, the most glorious deed.’ (Tac. Ann. 1.8)

(37) a. quod  dictator        occisus          erat        facinus     videbatur
        that  dictator:NOM;M  killed:NOM;M;SG  be:IMPF;3S  deed:NOM;N  perceive:IMPF;PASS;3S
        ‘That the dictator had been killed seemed the most glorious deed.’

     b. dictatorem      occisum            esse         facinus     videbatur
        dictator:ACC;M  killed:ACC;M/N;SG  be:PRES;INF  deed:NOM;N  perceive:IMPF;PASS;3S
        ‘That the dictator had been killed seemed the most glorious deed.’

The semantics of dictator occisus is propositional, i.e. it denotes the proposition that there was an event in which Caesar was killed. This makes it different from constructions such as ‘the young Isaac Newton’ or ‘a more resolute Roosevelt’, which are often taken as referring to a stage or a manifestation of the head noun (von Heusinger and Wespel, 2006). In a sentence like The dead Caesar frightened everyone, the dead Caesar could be argued to refer to Caesar’s manifestation as dead. On an analysis where stages and manifestations are inherent in the semantics of nouns it would then be possible to preserve the noun’s status as the semantic (and syntactic) head. But in (36) (and its paraphrases in (37)), the reference is clearly to a proposition, which cannot plausibly be inherent in the nominal semantics. If it were, we would expect the noun alone to be as good a subject as the noun plus participle combination, but it is not, as shown in (38).

(38) #dictator       facinus     videbatur
     dictator:NOM;M  deed:NOM;N  perceive:IMPF;PASS;3S
     #‘The dictator seemed a beautiful deed.’

Finally, while the participle is not omissible, the noun can be left out if the verb is impersonal, as in (39).

(39) in  libris     Sibyllinis     propter        crebrius         eo        anno
     in  books:ABL  Sibylline:ABL  on.account.of  more.frequently  that:ABL  year:ABL
     de    caelo    lapidatum                 inspectis
     from  sky:ABL  rained.stones:ACC;M/N;SG  examined:ABL;M/F/N;PL
     ‘. . . in the Sibylline books, which were consulted on account of the fact that it rained stones more frequently from the sky that year.’ (Liv. 29.10)

The participle lapidatum is from the impersonal verb lapidare ‘to rain stones’. Consequently, no noun occurs and the dominant construction consists of the participle alone.

It is also possible to leave out the NP when it is easily recoverable in the context, as in (40).

(40) (For if no one had passed this way since I went indoors, the casket would be lying here. Why say “here?” It’s lost, I guess; it’s done for. It’s all over with unhappy and unlucky me! It’s nowhere, and nowhere am I.)
     perdita        perdidit     me
     lost:NOM;F;SG  lose:PRF;3S  me:ACC
     ‘Its being lost proved my loss.’ (Plautus, Cist. 686)

If the participle is the head and the NP its subject, then this is just normal null anaphora (prodrop) of an easily recoverable subject. Similar examples are found in later Latin too.

16 The two variants in (37) differ in that the first uses a finite complement clause introduced by the complementizer quod, whereas the second uses a nonfinite accusative with infinitive structure (literally, ‘For the dictator to have been killed seemed the most glorious deed’). Both are rendered most naturally in English with a that-clause.


But while there is evidence that the participle is the head of the construction, it is also clear that the external syntax of the construction is nominal, as laid out in detail in Haug and Nikitina (2012); Nikitina and Haug (2015). For example, the dominant construction can be coordinated with NPs, and it can occur in all nominal positions, including subject, object (of verb and preposition) and adnominal genitive.

These facts motivate an analysis of the construction as an NP whose ultimate head is the participle V, which takes the embedded NP as its subject. To keep the two occurrences of NP separate we will use NPc and NPe (mnemonic for their having respectively clausal and entity-type semantics). There could be several intermediate projections between V and NPc, but schematically we can represent the structure as in (41).

(41)        NPc
         /   |   \
      NPe  . . .  V (head)

(41) implies that the dominant participle is a mixed category construction, where the verbal head ultimately projects a noun phrase. The English gerund is a prototypical example of this, e.g. on the analysis of Pullum (1991) illustrated in (42). (We have added the subscript c and e to clarify the correspondence to (41).)

(42) [NPc [NPe[POSS+] your] [VP[VFORM:ptcp] [V[VFORM:ptcp] breaking] [NP [Det the] [N’ [N record]]]]]

There is a large literature on mixed categories, which we cannot do justice to here. The fundamental question is what licenses the category mismatch. On Pullum’s account it is the morphology (i.e. the [VFORM:ptcp] feature); in the approach of Bresnan (2001, 100-1, 291-292) the featural decomposition of syntactic categories does the work. In non-lexicalist treatments the V head typically moves to an abstract nominal head to join the nominalization suffix, schematically as in (43) (which ignores the position of NPe, perhaps in spec,NPc), cf. Bresnan (1997).

(43) NPc
       VP
         . . .
         V
           t
       N
         stem
         suffix

A non-lexicalist, movement-based account may be problematic for Latin (Lapointe, 1999), but we ignore the details here. For more on mixed categories we refer to Nikitina (2008). A more detailed syntactic analysis of the dominant participle construction is given in Haug and Nikitina (2012); Nikitina and Haug (2015).

3.2 Agreement in dominant participles

Whatever the exact analysis of the dominant participle construction, any structure compatible with the schema in (41) will have the same implications for how agreement and information flow through the structure. All linguistic frameworks incorporate a mechanism for passing features from heads to their maximal projections, so all the information at V is in principle available at the position of NPc. Symmetric theories of agreement also provide a mechanism for passing information from the agreeing head V to its subject NPe. But no theory without feature sharing provides a mechanism for passing information in the other direction, from the controller to the target, i.e. in our case from the subject NPe to its agreeing head V. This is the “wrong direction” for information passing via agreement, since in theories without feature sharing (whether asymmetric or symmetric) the information is collected in the locus of the controller (NPe) only, not that of the target (V). And yet the Latin data requires the information to be available at V so that it can in turn flow up to NPc. Consider (44).

(44) ne    eum      Lentulus  et   Cethegus  . . .  deprehensi         terrerent
     lest  him:ACC  L.:NOM;M  and  C.:NOM;M         captured:NOM;M;PL  frighten:IMPF;SUBJ;3PL
     ‘lest the capture of Lentulus and Cethegus should frighten him.’ (Sall., Cat 48.4)

The coordinated subject NPe Lentulus et Cethegus induces plural agreement on the dominant participle deprehensi.17 Together they form the nominalized clause Lentulus et Cethegus . . . deprehensi which is the subject of the matrix verb terrerent. As a subject, the clause induces agreement on its predicate, terrerent, in this case third person and plural number.

Crucially, this is syntactic agreement, not semantic. One could imagine that Lentulus et Cethegus deprehensi meant ‘the captures of Lentulus and Cethegus’, which would be semantically plural and trigger semantic agreement on the verb terrerent. But there are numerous difficulties with this.

First, observe that if a dominant participle is morphologically singular but semantically plural, it does not trigger plural agreement.

(45) ea             res             saepe  temptata            . . .  tardabat
     this:NOM;F;SG  thing:NOM;F;SG  often  attempted:NOM;F;SG         delay:IMPF;3S
     ‘This thing often being tried delayed (his undertakings)’ (Caes. Civ. 1.26.2)

In (45), it is likely that it is the totality of the repeated trying which causes the delay, so the interpretation is collective, or ‘pluractional’. But such a reading is not enough to force a singular in cases where the NPe is plural:

(46) HS     xl        milia               in singulos iudices      distributa
     forty  thousand  sesterces:NOM;N;PL  to each judge:ACC;M;PL  distributed:NOM;PL;N
     eum numerum        sententiarum    conficere    debebant
     this number:ACC;M  votes:GEN;F;PL  make.up:INF  ought:IMPF;3P
     ‘Giving forty thousand sesterces to each judge ought to make up that number of votes’ (Cicero, Pro Cluentio 74)

This suggests that it is the syntactic NUMBER feature of the dominant construction as exposed morphologically on its compound phrases, which governs agreement in these cases, not the semantics. Moreover, in the material collected in Heick (1936) there is no instance of a mismatch between the NUMBER of the dominant participle construction and the agreeing number morphology on the matrix verb in cases where the dominant participle is a subject. This is unexpected on a semantic agreement theory.

It is in fact unlikely that dominant participle constructions like Lentulus et Cethegus deprehensi are semantically plural at all. Latin participles are consistently predicates – unlike e.g. English gerunds they do not have a second existence as event nominals. The semantic result of combining a predicate and its subject is a single proposition, no matter whether the subject is singular or plural. And in fact, when other propositions occur in agreement-triggering environments in Latin, they consistently trigger singular agreement, as in (47).

(47) discordias            versari    esset                   necesse
     dissensions:ACC;F;PL  exist:INF  would.be:IMPF;SBJV;3S  necessary:NOM;N;SG
     ‘that dissensions exist would be necessary’ (Cic. Att. 2.1.6)

We conclude that there is every reason to believe that, although the dominant construction in (44) is syntactically plural and triggers plural agreement, it is semantically a proposition and not in any sense a semantic plural.

Consider now (48)–(49).

17 Notice that, at least on LFG assumptions, the resolution of the coordinate NUMBER value to plural is entirely internal to the coordination and therefore orthogonal to the question of how agreement should be modelled. Single conjunct agreement does not seem to be attested with dominant participles, but could in principle be captured in exactly the same way as other instances of single conjunct predicate-argument agreement.


(48) unus       annus       additus         labori        tuo
     one:NOM;M  year:NOM;M  added:NOM;M;SG  labour:DAT;M  your:DAT;M/N;SG
     multorum         annorum        laetitiam  nobis   . . .  adferret
     many:GEN;M/N;PL  year:GEN;M;PL  joy:ACC    us:DAT         bring:SBJV;IMPF;3S
     ‘Adding one year to your labour would add many years of joy to us.’ (Cic., Q. fr. 1.1.3)

(49) adiecit     decus           natus          eo anno             divus            Augustus
     add:PRF;3S  glory:ACC;N;SG  born:NOM;M;SG  that year:ABL;M;SG  divine:NOM;M;SG  Augustus:NOM;M
     ‘The birth of the divine Augustus that year added glory [to Cicero’s consulate]’ (Vell. Paterc. 2.36.1)

In both these examples, as well as in (45) above, we see that the noun and the participle in a dominant participle construction agree with each other, but are also both available to further agreement processes at each end of the chain. In the following we focus on (49), but the reasoning applies to (45), (48) and numerous other examples that can be found in the Latin data.

In (49), the phrase natus Augustus, literally ‘Augustus born’, i.e. ‘the birth of Augustus’, forms a dominant participle construction and its two parts agree in CASE, NUMBER and GENDER. The crucial point is that at each end of this agreement chain, the features are available to further agreement processes: hence, Augustus and its adjective divus agree in CASE, NUMBER and GENDER, while natus (as head of the matrix subject constituent) and the matrix verb adiecit agree in NUMBER (and PERSON). For this to be possible, it is clear that the CASE, NUMBER and GENDER on natus and Augustus must be real (in some theory-specific sense) at each end of the chain. This is the challenge that a theory of agreement must overcome in order to account for the Latin data, and as we will see, all non-feature sharing theories of agreement, asymmetric and symmetric, local and non-local, fail to account for this.

4 Formalizing dominant participle agreement

4.1 Local agreement without feature sharing fails

Let us first consider a traditional derivational analysis. It could look like (50), where we have assumed for concreteness that the clause in the dominant participle construction projects to AspP (given that Latin participles have aspect), which is embedded inside an NPc with an abstract head N (to which the verb would move covertly on an analysis like (43)). For now we ignore CASE, since it is often treated differently from standard φ-features in derivational frameworks, and the standard features are enough to prove our present point; we will return to CASE below.

(50) TP
       VP
         adiecit  u:φ [NUM=s]
         decus
       NPc
         AspP
           VP
             natus  u:φ [GEND=m, NUM=s]
           NPe
             N
               Augustus  φ [GEND=m, NUM=s]
             AP
               divus  u:φ [GEND=m, NUM=s]
         N  φ:?

In this configuration, standard assumptions would make it possible for agreement to match the uninterpretable features on the lower VP with the interpretable features on the lower NPe, ensuring correct subject-predicate agreement and making sure that the uninterpretable features are deleted before they reach Logical Form. Moreover, since the lower NPe has interpretable φ-features, it can be the controller of several agreement processes, including one with its adjective divus. However, there is no way for the lower NPe to pass its features up to the higher NPc, which would be necessary for the construction as a whole to correctly induce agreement with its verb.

Alternatively, we could assume that it is the higher NPc that is the bearer of the interpretable φ-features, and the features on the lower NPe and the V are uninterpretable. In this case, we correctly allow external agreement processes (i.e. the main predicate adiecit) to target these features, but we fail to allow agreement between the lower NPe and its adjective, since the lower NPe has no interpretable φ-features. To achieve this, we would need to posit that the lower NPe has both an interpretable and an uninterpretable set of features, which would blur the asymmetric nature of agreement. There also would not seem to be a way of preventing NPe from agreeing with itself, as it were, if it carries matching sets of both interpretable and uninterpretable φ-features.

We conclude that a standard derivational approach is unable to account for the Latin data. In fact, if we uphold the distinction between interpretable and uninterpretable features, and assume there is only one set of the former features, then it seems we cannot explain the data without violating syntactic locality. And we will see below that even non-local analyses break down once we consider CASE.

Standard LFG agreement theory does not fare better. It yields the analysis in (51) for (49).

(51) [ “ADD GLORY”
       SUBJ [ “BE BORN”
              AGR  [ CASE nom, NUMBER s ]
              SUBJ [ “AUGUSTUS”
                     AGR [ NUMBER s, GENDER m, CASE nom ]
                     ADJ [ “DIVINE” ] ] ] ]

There is an AGR attribute in the feature structure of the embedded subject Augustus, which is available for agreement with the adjunct divine and the predicate be born. But we fail to predict that the lower subject’s AGR is available for agreement with the matrix predicate add glory, since it is too deep in the structure: the matrix verb can only agree with its subject, not with its subject’s subject. Therefore, there is a NUMBER feature in the higher subject’s AGR, but this is falsely predicted to be independent of the NUMBER feature in the lower subject’s AGR. We also fail to predict that the matrix predicate assigns case to the construction, again because the relevant CASE feature is too deep in the structure. Instead, we get two independent CASE values, which are wrongly predicted to be able to diverge.

We see that the standard agreement theories of both derivational approaches and of LFG have the same problem. The absence of feature sharing makes it impossible to pass the agreement features up the structure and make them available to subsequent agreement relations, assuming we maintain the locality of agreement.
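The divergence problem can be illustrated with a short Python sketch. It rests on simplifying assumptions and is not an implementation of LFG: f-structures are encoded as nested dictionaries, the PRED key and the helper names f_structure, assign_case and embedded_case are hypothetical, and feature sharing is modelled as two positions holding one and the same dictionary object.

```python
def f_structure(outer_agr, inner_agr):
    """Build a structure shaped like (51) from two AGR bundles."""
    return {
        "PRED": "ADD GLORY",
        "SUBJ": {
            "PRED": "BE BORN",
            "AGR": outer_agr,
            "SUBJ": {"PRED": "AUGUSTUS", "AGR": inner_agr},
        },
    }


# No feature sharing: two independent bundles that merely happen to match.
separate = f_structure(
    outer_agr={"CASE": "nom", "NUMBER": "s"},
    inner_agr={"CASE": "nom", "NUMBER": "s", "GENDER": "m"},
)

# Feature sharing: one bundle, reachable from both positions.
bundle = {"CASE": "nom", "NUMBER": "s", "GENDER": "m"}
shared = f_structure(outer_agr=bundle, inner_agr=bundle)


def assign_case(fs, case):
    """Locality: the matrix predicate only reaches its own subject's AGR."""
    fs["SUBJ"]["AGR"]["CASE"] = case


def embedded_case(fs):
    return fs["SUBJ"]["SUBJ"]["AGR"]["CASE"]


# Suppose the matrix context required ablative instead of nominative.
assign_case(separate, "abl")
assign_case(shared, "abl")

print(embedded_case(separate))  # nom: the two CASE values have diverged
print(embedded_case(shared))    # abl: the value covaries, as in the Latin data
```

In both structures the matrix predicate touches only the AGR of its own subject, so locality is respected. Only the shared bundle makes the embedded subject's case follow the case of the construction as a whole.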

4.2 Long distance agreement without feature sharing fails

On the other hand, it may seem that we can get the facts right if we give up locality, along the lines that were discussed in section 1.3. For example, we could allow the matrix verb adiecit to agree with the subject NPe inside the nominalized clause, as ‘long distance agreement’. This could be implemented either with non-standard agreement equations in LFG or via a revised Agree mechanism in Minimalism (following e.g. Polinsky and Potsdam, 2001) – in fact, agreement in non-standard configurations was one of the main motivations for introducing the operation Agree in Minimalism.

However, while this approach will work for number and gender, it will not do for case. We already mentioned that (51) wrongly contains two independent case features. Similarly, in (50), standard assumptions about case assignment would lead us to expect that the higher verb adiecit can only assign nominative case to NPc, not to the embedded NPe. Yet recall that the case marking on NPe in (50) depends on the function of the nominalized clause as a whole within the matrix structure. In (50) we get nominative because that is the case that the verb adiecit assigns to its subject; if the function of the nominalized clause requires another case, this is reflected on NPe (as well as on the participle), as illustrated in (34)–(35).

We will see below that in a feature sharing system we can handle this by having adiecit assign case to NPc and letting agreement propagate the case feature through the structure. On the other hand, without feature sharing there is no obvious way to propagate a case value on NPc down the structure. For that we would need long distance case assignment directly to NPe.

The usual mechanism for long distance case assignment is Exceptional Case Marking (ECM): adiecit would case-mark its subject’s subject just like English believe case-marks its complement’s subject on an ECM analysis. However, ECM involves case-marking without thematic role assignment. Therefore, on standard Minimalist assumptions (Pesetsky and Torrego, 2011, 63), ECM can only involve structural case, not quirky or inherent case. This is borne out by Icelandic ECM data, but crucially the Latin dominant participle construction behaves differently: it can appear in any case, including a semantically motivated adjunct case, or inherent case, such as the ablative case assigned by the verb gloriabatur ‘glory in’ in example (35). For that reason, an ECM analysis is not viable and we must conclude that NPc gets its case from agreement with NPe; but this is incompatible with the non-feature sharing approach of Polinsky and Potsdam (2001).

4.3 A symmetric feature sharing analysis

Feature sharing theories fare much better with the Latin data. We first illustrate this with the symmetric theory developed in a Minimalist setting by Ackema and Neeleman (2013).18 This approach borrows ideas from autosegmental phonology and assumes that φ-features (represented as trees rather than attribute-value matrices but that is irrelevant here) are generated independently on verbs and nouns and represented on a different tier from their corresponding lexical items.19 Returning to (3), this could start out as in (52).20

(52) [NP φ ]   . . .   [V φ ]
         |                |
      GEND f          PERSON 1
      NUM pl          NUM pl

They further assume that φ-features unify in agreement, thus giving a symmetric account. Moreover, the unified φ-features are associated with both syntactic loci, so we also have feature sharing. The result is as in (53).

(53)          GEND f
              NUM pl
              PERSON 1
             /        \
      [NP φ ]  . . .  [V φ ]

At this stage of the derivation, then, the set of φ-features is available both to the DP and the V. However, Ackema and Neeleman assume that this structure can be affected by ‘dissociation’, which basically deletes a link between a φ-node and a feature bundle. From (53), dissociation could produce the structure in (54).

(54) [NP φ ]   . . .   [V φ ]
         |
      GEND f
      NUM pl
      PERSON 1

(54) is the input to Logical Form (LF), so the features are only interpreted on the NP. Note that the application of ‘dissociation’ is optional and can be repeated, so we could derive various other representations from (53), e.g. by deleting the link between the NP and the feature bundle, or both links. However, these structures would be ill-formed because of the principle of φ-feature licensing given in (55) (= Ackema and Neeleman, 2013, ex. 10), which essentially constrains the application of dissociation.

(55) At LF, each φ-feature F must be licensed in each position L with which it is associated. F is licensed in L iff (i) F is inherent in L’s lexical specification, or (ii) F receives a semantic interpretation in L.

This deletion happens in the LF branch of the grammar in the minimalist Y-model, so it is outside ‘core syntax’. In the core syntax itself, then, the agreement features are tied to all the lexical items that carry overt agreement features. This means that the model can directly account for the Latin dominant participle construction. We simply generalize the approach by having the subject NPe, its AP, the dominant participle predicate, the outer NPc and the matrix predicate all share the same set of agreement features, as in (56). Again, we ignore CASE because it is normally treated differently in Minimalism.

18 As already mentioned, there are several other variants of agreement with feature sharing in the minimalist literature. We cannot go through them all here, and choose to focus on Ackema and Neeleman (2013) as a recent and well worked-out, representative theory.

19 In LFG too, it has been proposed to model agreement in a separate structure (Falk, 2006).

20 We have replaced their DP with NP. Nothing hinges on this.

(56)                 GEND m
                     NUM s
         (one bundle, linked to every φ-node below)

     [NPc φ [NPe φ [AP φ ] ] [V φ ] ]  . . .  [V φ ]

The same effect can be achieved in other symmetric theories of agreement if they are augmented with feature sharing. In LFG, for example, we get an identical account if we think of the φ-nodes in the upper tier as AGR attributes, and the lower tier as the value of those attributes. This gives an analysis where we enforce agreement not by having a single AGR attribute as in (51), but by having multiple AGR attributes that take the same value through structure sharing.

Let us see how this works in more detail. If Haug and Nikitina (2012) are correct in arguing that in the functional structure, heads are subjects of their modifiers, the generalization in (33) can be captured via the equation in (57) inherent in the lexical entry of any predicate.21

(57) (↑ AGR) = (↑ SUBJ AGR)

(57) ensures that there is a single value for the two different AGR attributes: technically this is ensured by unification, meaning that information flows in both directions between the two positions. This is opposed to the traditional LFG approach to agreement, which has a single AGR attribute (no feature sharing), whose value can be specified by both the controller and target (symmetry).
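The effect of an equation like (57) can be mimicked computationally. The following Python sketch is only an illustration under our own assumptions, not part of the LFG formalism: the class `Agr` and the function `unify` are hypothetical names, and the feature values are borrowed from the Latin example. It shows that after unification the two AGR attributes have one token-identical value, so that information specified on either side becomes visible on both.

```python
class Agr:
    """A feature bundle; bundles that have been unified share one root."""

    def __init__(self, **features):
        self._parent = self              # union-find style parent pointer
        self._features = dict(features)

    def root(self):
        node = self
        while node._parent is not node:
            node = node._parent
        return node

    @property
    def features(self):
        return self.root()._features


def unify(a, b):
    """Equate two bundles, as in (↑ AGR) = (↑ SUBJ AGR); raise on a clash."""
    ra, rb = a.root(), b.root()
    if ra is rb:
        return ra
    # Check every feature before changing anything.
    for attr, value in rb._features.items():
        if ra._features.get(attr, value) != value:
            raise ValueError(f"clash on {attr}")
    ra._features.update(rb._features)
    rb._parent = ra                      # rb now resolves to the shared value
    return ra


# Hypothetical lexical contributions for 'Augustus' and 'natus'.
subj_agr = Agr(GEND="m", NUM="s")        # features specified on the noun
pred_agr = Agr(NUM="s")                  # number marked on the predicate

unify(pred_agr, subj_agr)                # the effect of equation (57)

# Information flows from the subject to the predicate ...
print(pred_agr.features)
# ... and a feature added via the predicate is visible on the subject.
unify(pred_agr, Agr(CASE="nom"))
print(subj_agr.features)
```

The union-find pointers are a design choice of this sketch: they keep earlier equations intact when a bundle that is already shared is unified with a further one, which is what a chain of agreement relations requires.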

(58) gives the constituent structure of (49). For concreteness we have chosen a flat structure, where each clause is an S node dominating the verb and all of its arguments. Nothing hinges on this, for in LFG the relevant agreement relations are established based on function, as in (57), rather than constituent structure.

(58) S
       NPc [CASE=nom]
         S (↑ = ↓)
           NPe
             N (↑ = ↓): Augustus [GEND=m, NUM=s, CASE=nom]
             AP: divus [GEND=m, NUM=s, CASE=nom]
           Vp (↑ = ↓): natus [GEND=m, NUM=s, CASE=nom]
       NP: decus
       Vm (↑ = ↓): adiecit [NUM=s]

The terminal nodes are marked with their features: these are part of the lexical entries of the relevant forms and bold-faced where they are inherent. NPc is also marked with a CASE=nom feature that it gets from the verb adiecit, which assigns nominative case to its subject. This feature is also bold-faced, since NPc is the position that gets case from the matrix verb. Headedness relations are marked with the standard LFG notation ↑ = ↓, which means that features flow between a head daughter and its mother node. Recall that on our analysis of the dominant participle construction, it is a mixed category, concretely an NP whose head is an S.

21 Nothing hinges on the analysis in Haug and Nikitina (2012): it would be possible to dissociate adjunct structures from ordinary subject-predicate structures and treat them with an equation (↑ AGR) = (ADJ ∈ ↑) AGR. This would be a structure-specific rule associated with adjunction structures rather than with particular lexical entries.

As we have seen, the relevant agreement configurations in Latin are predicate–subject and modifier–head (which we tentatively reduce to predicate–subject following the analysis in Haug and Nikitina (2012)), i.e. the AdjP agrees with NPe, NPe agrees with Vp; and NPc agrees with Vm. Under standard LFG assumptions about agreement, the features in an agreement configuration are only syntactically present on subjects (in predicate–subject agreement). Therefore, the features on natus are only registered in the feature structure of its subject NPe and not available to be passed up the head chain to NPc. This means we get the two independent feature bundles in (51).

If on the other hand we revise our assumptions about agreement and assume feature sharing, as in (57), the features on natus – although agreeing with NPe – are still present in the Vp position and can be passed up the chain to NPc. The features are represented syntactically in every locus where they appear on lexical items. The agreement operations link these feature bundles so that instead of the two unconnected AGR attributes we see in (51), we get four connected AGR attributes, as in (59).

(59) [ “ADD GLORY”
       AGR   [ ]
       SUBJ  [ “BE BORN”
               AGR   [ ]
               SUBJ  [ “AUGUSTUS”
                       AGR   [ GEND m, NUM s, CASE nom ]
                       ADJ   { [ “DIVINE”
                                 AGR   [ ] ] } ] ] ]

In (59), the shared value of AGR is only represented in the embedded subject position, but this is purely graphical: the interpretation of (59) is that the values of the different AGR attributes are token-identical. Similar accounts of this type of agreement can be devised within other symmetric theories, e.g. in HPSG, following the theory of Kathol (1999).

We have seen that the Latin case requires feature sharing. Does it also require symmetry? Once we consider where the agreement features originate, it becomes clear that the answer is yes. As indicated by the bold-facing in (58), the GENDER and NUMBER features are controlled by NPe, while the CASE feature is controlled by NPc, since this is the position that is assigned case by the verb. If agreement is asymmetric, we cannot capture this in a single agreement operation, even if we assume feature sharing. We would have to assume two agreement processes: one in which V agrees with its subject NPe in GENDER and NUMBER; and another, non-standard process whereby the subject NPe agrees in CASE with its verb. These two processes would therefore go in opposite directions, making it impossible to define constraints on the directionality of Agree and thereby also undermining asymmetry. By contrast, a feature sharing and symmetric approach directly accounts for the Latin data.
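The point about symmetry can be made concrete with a small sketch. The Python fragment below is an illustration under our own assumptions rather than an implementation of any particular framework: the helper `contribute` and the source labels are hypothetical, and the four loci are modelled simply as dictionaries that point to one and the same bundle. Each feature is stored together with the position that supplied it, so that we can see GENDER and NUMBER entering the bundle from below and CASE entering it from above.

```python
def contribute(bundle, source, **features):
    """Add features to a bundle, recording which position supplied each one."""
    for attr, value in features.items():
        if attr in bundle and bundle[attr][0] != value:
            raise ValueError(f"{source} clashes with {bundle[attr][1]} on {attr}")
    for attr, value in features.items():
        bundle.setdefault(attr, (value, source))


agr = {}  # a single token: every AGR attribute below points to this object

divine = {"PRED": "DIVINE", "AGR": agr}
augustus = {"PRED": "AUGUSTUS", "AGR": agr, "ADJ": [divine]}    # NPe
be_born = {"PRED": "BE BORN", "AGR": agr, "SUBJ": augustus}     # NPc
add_glory = {"PRED": "ADD GLORY", "AGR": agr, "SUBJ": be_born}  # matrix clause

# GENDER and NUMBER originate low in the structure, on NPe ...
contribute(augustus["AGR"], "NPe", GEND="m", NUM="s")
# ... while CASE originates high, on NPc, which the matrix verb case-marks.
contribute(be_born["AGR"], "NPc", CASE="nom")

for locus in (add_glory, be_born, augustus, divine):
    print(locus["PRED"], locus["AGR"])
```

Because there is only one bundle, no direction of agreement has to be stipulated: either contribution could be made first, and a conflicting value from any of the four positions would be rejected.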

Finally, it is worth noting that although it is the special properties of the dominant participle construction that require a symmetric, feature sharing approach to agreement in Latin, this approach (both on the version in Ackema and Neeleman (2013) and the LFG version) generalizes directly to all other types of agreement, i.e. we can (but need not, see section 6.1) analyze all Latin agreement as involving feature sharing, as we have done in (59) – we do not have to assume that dominant participles require a special agreement mechanism.22 “Worst case” agreement generalizes to the normal case. This is likely to be important in the acquisition of a feature sharing agreement system: while nothing in normal agreement requires that the co-specified feature structure is available in both syntactic positions, there is also nothing that restricts the co-specified feature structure to a single position.

22 Note however that finite subordinate clauses in subject position behave differently from dominant participles. We return briefly to this in section 6.1.


5 Preserving the locality of agreement

We have seen that the Latin dominant participle construction requires feature sharing agreement. Now observe that the configuration that let us establish this is similar to the configurations which have often been analyzed as long-distance agreement: the participle is the target of one agreement process with an argument inside its own clause, but at the same time it is the head of a clause that is in an argument position and controls agreement with a structurally higher verb. This gives the impression that the higher verb agrees with the subject of the embedded clause, i.e. there is long distance agreement. But crucially the agreement is mediated by the intermediate verb (the participle), so there is no violation of locality if we assume feature sharing. We also showed that a long distance agreement analysis does not work for Latin because it cannot account for the case facts. We will now see that the converse is not true: the feature sharing analysis we developed for Latin will carry over to other reported cases of long distance agreement.

We cannot here hope to cover the whole issue of long distance agreement, which has engendered extensive theoretical discussion, in particular within Minimalism (see e.g. Koopman, 2006; Boeckx, 2008, 2009, and many others), nor can we deal with all the reported cases of long distance agreement. Instead, in section 5.1, we focus on the three different types of long distance agreement identified by Polinsky (2003). As Polinsky shows, only one of these, which she dubs ‘clause-periphery agreement’, is actually problematic for standard locality assumptions. Accordingly, in section 5.2, we discuss the best-documented cases of clause-periphery agreement (found in Tsez, Polinsky and Potsdam 2001, and in some Algonquian languages, Bruening 2001; Branigan and MacKenzie 2002) and show that, if we assume feature sharing in agreement, the problem simply ceases to exist. (A related point is made in Legate 2005.) Finally, in section 5.3 we show that although these instances of apparent cross-clausal agreement provide the best evidence for feature sharing agreement, there are also some purely clause-internal agreement phenomena that lend themselves naturally to a feature-sharing account.

5.1 Types of long distance agreement

According to Polinsky (2003), (apparent) long distance agreement comes in three types: agreement through mediation, agreement through argument sharing and clause-periphery agreement.

In mediated agreement, the controller is represented by an unpronounced coindexed ‘proxy’ in the clause containing a target, as represented schematically in (60) (= Polinsky, 2003, ex 8).

(60) [IP Subject V+Agr_i NP_i [CP/IP . . . NP_i . . . ] ]

In this structure, which is found in several Algonquian languages (Polinsky, 2003, 284), the apparent long-distance agreement is actually local. (61) shows an example from Blackfoot (= Polinsky 2003, ex (10b)).

(61) nitsíksstatawa            noxkówa      máxka’po’takssi
     nit-wikixtatwaa-wa        [n-oxko-wa   m-áxk-a’po’takixsi]
     1.SUBJ-want;TRANS-3.OBJ   1-son-3      3.SUBJ-might-work
     ‘I want my son to work.’

(61) has the proxy structure seen in (60), as shown in (62) (= Polinsky 2003, ex (10b)).

(62) pro-1SG want pro-3SG_i [my.son-3SG_i work ]

That is, the matrix verb does not directly agree with the downstairs subject, but rather with a coindexed ‘proxy’ which has argument status in the higher clause. Polinsky (2003, 285–288) surveys the arguments for this analysis, involving binding facts, indirect referential relationships, split antecedence and more. There is, then, no violation of locality here.

The second type of apparent long distance agreement, agreement through argument sharing, can come about when the controller of the apparent long distance agreement appears in the clause containing the target at some level of syntactic representation. There are two common processes that can give rise to this: raising and clause union (the formation of a monoclausal ‘complex predicate’). Neither of these processes is relevant for the Latin dominant participles, but some of the examples discussed in the literature on Hindi/Urdu (Butt, 1995; Bhatt, 2005; Butt, 2014) are superficially very similar. Consider (63).

(63) a. Ram-ne    [roṭii     khaa-nii]    chaah-ii
        Ram-ERG   bread:F    eat-INF;F    want-PFV;F;SG
        ‘Ram wanted to eat bread’
     b. Ram-ne    [roṭii     khaa-naa]    chaah-aa
        Ram-ERG   bread:F    eat-INF;M    want-PFV;M;SG
        ‘Ram wanted to eat bread’

In (63-a), the embedded object roṭii, the infinitive khaa-nii and the matrix verb chaah-ii all agree in feminine gender. This agreement is optional, as shown in (63-b). The agreement pattern in (63-a) is explained in Butt (1995, 2014) as a series of local agreement relations: the infinitive agrees with its object, and the finite verb in turn agrees with the infinitive. Although Butt does not spell this out, this analysis obviously entails a feature sharing approach to agreement, since the infinitive is valued for gender because of agreement with its object. However, Butt (1995) and Bhatt (2005) both agree that the infinitive and the matrix verb in (63) form a monoclausal structure, and so the agreement of the “embedded” object and the matrix predicate is in fact entirely local.

What makes the Hindi/Urdu case look like the Latin one is the fact that the intermediate verbal head – the infinitive in Hindi/Urdu and the participle in Latin – bears morphological exponents of the agreement features. However, in Hindi/Urdu, the infinitive’s agreement is parasitic: the infinitive cannot agree with its object unless the matrix verb also agrees (although there is dialectal variation, see Bhatt, 2005, 785). In Latin, by contrast, the participle and its subject can and must agree under all circumstances, providing independent evidence for a local agreement process inside the dominant participle construction.

We now turn to clause-periphery agreement, which according to Polinsky (2003) is the only case that actually violates the clause-mate assumption. These examples are also much closer to the Latin case. Clause-periphery agreement is illustrated by the Tsez example (22) that was discussed in section 1.3, repeated here as (64-a). The matrix verb agrees in noun class (glossed with Roman numerals) with the absolutive argument of an embedded clause that is itself in an absolutive position. (64-b) (= Polinsky and Potsdam, 2001, ex 47a) shows that this kind of agreement is optional – the sentential complement can also trigger class IV (abstract nominal) agreement. Notice that the structure of these examples is very similar to what we proposed for Latin. The complement clause is in fact a participle that has been nominalized, just like in our analysis of the Latin dominant participles, with the difference that in Tsez, the nominalization is signalled by the morphology, whereas in Latin it is purely syntactic. The same complementation strategy is found in the related language Hinuq, where it can also trigger long distance agreement (Forker, 2011, 558–561, 574–585).

(64) a. eni-r        [už-a      magalu          b-ac’-ru-ɬi]            b-iy-xo
        mother-DAT   boy-ERG    bread:III;ABS   III-eat-PST;PTCP-NMLZ   III-know-PRES
        ‘The mother knows the boy ate the bread.’
     b. eni-r        [už-a      magalu          b-ac’-ru-ɬi]            r-iy-xo
        mother-DAT   boy-ERG    bread:III;ABS   III-eat-PST;PTCP-NMLZ   IV-know-PRES
        ‘The mother knows the boy ate the bread.’

Polinsky and Potsdam (2001, 610) argue that the alternation in (64) is governed by the topicality of the embedded absolutive: long distance agreement (64-a) is only possible when the absolutive is a topic; otherwise the complement clause triggers class IV (abstract nominal) agreement (64-b). They further argue that the topic undergoes covert movement to a left-peripheral position (TopP) in the clause. Finally, they propose that locality in agreement is determined by government, which is looser than the clause-mate assumption and yields the same predictions as the Phase-Impenetrability Condition: a head governs (and hence, following Polinsky and Potsdam (2001, 627), can agree with) its specifier, its complement, elements adjoined to its complement, and the specifier of its complement. The latter type of agreement, Polinsky and Potsdam (2001) argue, is instantiated in Tsez: whenever there is no CP above TopP, the verb can agree with an absolutive in TopP. The structure is shown in (65).

(65) [IP . . . V+Agr_i [TopicP NP_i [IP . . . t_i . . . ] ] ]

Similar structures (though allowing agreement with both topics and foci) have been proposed for the Algonquian languages Passamaquoddy (Bruening, 2001) and Innu-Aimûn (Branigan and MacKenzie, 2002). (Note that Bruening argues that Passamaquoddy has both clause-periphery agreement and ‘proxy agreement’ as discussed above, in different configurations.) We now turn to a closer discussion of the data from Tsez, Passamaquoddy and Innu-Aimûn, which on the analysis in (65) clearly violate standard locality assumptions. We will see that we can maintain strict locality (i.e. the clause-mate assumption) if we assume feature-sharing agreement.

5.2 Clause-periphery agreement as local, feature-sharing agreement

In Tsez, as we just saw, long distance agreement is contingent on topicality, and the same holds in Innu-Aimûn, as we will see. In Passamaquoddy, on the other hand, there is evidence that foci can also trigger long distance agreement (Bruening, 2001, 282–283). In none of these languages, however, is the long distance agreement controller necessarily surface peripheral in the embedded clause. For example, we saw in (64-a) that the controller is in situ in the embedded clause. In derivational accounts this is typically captured as covert topicalization. This resort to covert movement may seem suspicious and, although Polinsky and Potsdam (2001, 629–633) provide theory-internal justification, Bošković (2003, 2007) argues that it is problematic on Minimalist assumptions. Also, it is often hard to identify topics in the absence of overt syntactic (or morphological) marking, so that the analysis runs the risk of circularity. Nevertheless, to stay close to the interpretation of the data in Polinsky and Potsdam (2001), Bruening (2001) and Branigan and MacKenzie (2002), we will assume that this description is basically correct and that the overarching generalization is that long distance agreement requires the controller argument to be in an operator position. In LFG terms this means that the agreement controller is functionally identified with an operator function even when it appears in situ. To capture the Passamaquoddy data we make use of the generalized operator function UDF (Alsina, 2008; Asudeh, 2012), which is not associated with any specific discourse function, rather than the more commonly used TOPIC function.23

Assuming, then, that agreement in Tsez involves feature sharing when the controller is topical, the agreement facts follow directly without any need to tinker with syntactic locality. In (64-a), then, the embedded clause will actually take on the class III feature of its topic and hence induce class III agreement on the matrix verb. In LFG terms we get the following structure:

(66) [ PRED  ‘know 〈SUBJ, OBJ〉’
       SUBJ  [ “MOTHER” ]
       OBJ   [ PRED  ‘eat 〈SUBJ, OBJ〉’
               AGR   [ CLASS III ]
               UDF   [ ...  AGR [ ] ]
               SUBJ  [ “BOY” ]
               OBJ   [ “BREAD”
                       AGR [ CLASS III ] ] ] ]

The embedded object bread is functionally identified with the embedded clause’s operator function UDF, which in turn agrees in a feature-sharing way with its local predicate eat. In this way, the CLASS III feature gets passed up to the clausal level where in turn it agrees with the governing predicate know.
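The logic of this analysis can be sketched in a few lines of Python. This is a toy model under our own assumptions, not an implementation of LFG: the function names are hypothetical, f-structures are plain dictionaries, functional identification and feature sharing are both modelled as two keys pointing to the same object, and the class IV value for a non-topical complement is simply stipulated as a default. The prefixes b- and r- are taken from the glosses in (64).

```python
def embedded_clause(absolutive_is_topic):
    """Build a toy f-structure for 'the boy ate the bread'."""
    bread = {"PRED": "BREAD", "AGR": {"CLASS": "III"}}
    clause = {
        "PRED": "eat<SUBJ,OBJ>",
        "SUBJ": {"PRED": "BOY"},
        "OBJ": bread,
    }
    if absolutive_is_topic:
        clause["UDF"] = bread            # functional identification with UDF
        clause["AGR"] = bread["AGR"]     # feature sharing: same token
    else:
        clause["AGR"] = {"CLASS": "IV"}  # assumed default for a nominalized clause
    return clause


def matrix_verb(complement):
    """The matrix verb agrees strictly locally, with its own OBJ."""
    prefixes = {"III": "b-", "IV": "r-"}
    return prefixes[complement["AGR"]["CLASS"]] + "iy-xo"


print(matrix_verb(embedded_clause(absolutive_is_topic=True)))   # b-iy-xo
print(matrix_verb(embedded_clause(absolutive_is_topic=False)))  # r-iy-xo
```

Note that `matrix_verb` never looks inside the complement clause: it only reads the AGR value of its own object, so the clause-mate assumption is respected in both cases.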

Observe also that the morphological facts are similar to the Swahili case (18) brought up by Kathol (1999) in defense of feature sharing agreement: in the upper part of the agreement chain, both the matrix and the embedded verb bear the same morphological exponent b, indicating noun class III. Unlike the Swahili case, the same marker is not present on the lower element magalu, but this is not surprising since the relevant agreement feature here is the noun class, which is obviously inherent in the noun.

A similar analysis of the Tsez facts (though couched in different, derivational terms) is in fact briefly discussed and rejected by Polinsky and Potsdam (2001, fn. 17), based on evidence from pronominalization. The embedded clause retains its class IV specification in a situation of sentential anaphora:

(67) a. enir         [už-a      magalu          b-ac’-ru-ɬi]            b-iy-xo
        mother:DAT   boy:ERG    bread:III.ABS   III:eat:PST;PTCP.NMLZ   III:know:PRES
        ‘The mother knows the boy ate the bread.’
     b. neɬa      [ža     r-igu/*b-igu       yoɬ-ƛin]   eƛis
        she:ERG   this    IV:good/III:good   is:COMP    said
        ‘She says it (=that the boy ate the bread) is good.’

23 A reviewer points out it is unclear what exactly is the connection between operator positions and long distance agreement. This is true, but it is equally problematic for a functional and a configurational approach to operators: the configurational approach has the advantage that the operator position is closer in tree-geometric terms to the agreement target, but the disadvantage that the controller does not in fact always appear in this position. The functional approach to operators has the advantage of not predicting contrary to the surface facts that the controller must be in the periphery of the embedded clause, but the corresponding disadvantage that the controller is not closer to its target. So far feature sharing agreement has not been particularly well studied empirically. In the light of the evidence for interaction between information structure and (object) agreement amassed by Dalrymple and Nikolaeva (2011) we would not find it surprising if some of these interactions involve feature sharing.


(67-b) shows that the pronominal reference to the embedded clause from (67-a) has class IV, even if the matrix clause in (67-a) agrees in class III. Polinsky and Potsdam conclude that the embedded clause in (67-a) remains class IV and the matrix verb agrees with the embedded absolutive topic, not the embedded clause. However, there is no reason to expect all features to survive in pronominalization, and in any case, it is well known that pronoun-antecedent agreement tends towards semantic resolution (see for example Corbett, 1979). So this argument is invalid,24 and we think it is better to analyze Tsez in terms of feature sharing agreement, which does not require a locality violation.

The same analysis can be extended to reported instances of long distance agreement in Algonquian, although we also need to capture the fact that the Algonquian languages have an inverse agreement system where the verb agrees with two arguments. Consider (68) (= Bruening 2001, ex 679b, Bruening 2009, ex (22)) from Passamaquoddy.

(68) Ma=te      n-wewitaham-a-wiy-ik    mahtoqehsuw-ok   tama    al          n-toli-putoma-n-ok       kcihku-k
     NEG=EMPH   1-remember-DIR-NEG-3P   rabbit-3P        where   UNCERTAIN   1-there-lose-SECOBJ-3P   forest-LOC
     ‘I don’t remember where in the forest I lost the rabbits’

In (68), the matrix verb ‘remember’ agrees with its own first person argument as well as with a downstairs third person plural argument. (Notice that the two agreement slots are not directly associated with specific grammatical functions – that is the job of the DIR morpheme. For our purposes we can ignore this aspect of inverse agreement systems and simply take the verb form as a whole as specifying features of its subject and object, although it is not glossed that way. When the verb form is marked as direct, the first agreement slot indicates subject features and the second slot object features.) The downstairs argument has been fronted to the position before the wh-word of the embedded clause, but Bruening argues that it is still part of the lower clause. Therefore, the agreement between this argument and the higher verb is apparently cross-clausal.

In Passamaquoddy like in Tsez, the controller of the apparent long distance agreement need not be fronted in the surface structure but can remain in situ. It does need to have some special discourse function (Bruening, 2001, 282–283), but unlike in Tsez, it does not have to be a topic. In (69) (shortened from Bruening 2001, ex (697a)), the controller is the wh-word in an embedded question, which is presumably a focus.

(69) N-kosiciy-a     wen   elomi-ya-t
     1-know.TA-DIR   who   IC.away-go-3CONJ
     ‘I know who left’

The matrix verb is marked as transitive animate in agreement with the wh-word wen. This is one of the important arguments against a ‘proxy agreement’ analysis, since that would require a structure that could be paraphrased ‘I know him_i [who_i left]?’, which has no sensible interpretation.

For our purposes it is important to note that like in Tsez the fronted argument also induces agreement on its local verb. Therefore, the feature-sharing analysis works for Passamaquoddy too. Let us illustrate this with a sketch analysis of (68). First, the agreement morphology on the embedded verb specifies that it takes a first person subject and a third person plural object. In LFG terms, its f-structure looks like (70).

(70) [ PRED  ‘lose 〈SUBJ, OBJ〉’
       SUBJ  [ AGR [ PERSON 1 ] ]
       OBJ   [ AGR [ PERSON 3
                     NUMBER pl ] ] ]

As in Tsez, we analyze agreement with an argument in the higher operator position as feature sharing. Schematically we then get the structure in (71) for the embedded clause. (We ignore all the material other than the verb and the two arguments it agrees with. Presumably the wh-word is in a distinct operator position, as in Bruening’s analysis.)

24 The same conclusion is reached by Koopman (2006, 174) and Boeckx (2009, 15).


(71) [ PRED  ‘lose 〈SUBJ, OBJ〉’
       AGR   [ PERSON 3
               NUMBER pl ]
       UDF   [ ...  AGR [ ] ]
       SUBJ  [ “I”
               AGR [ PERSON 1 ] ]
       OBJ   [ “RABBITS”
               AGR [ PERSON 3
                     NUMBER pl ] ] ]

Thus, feature sharing passes up the agreement features from the operator UDF to the verb which heads the complement clause; therefore the matrix verb’s agreement in these features is entirely local. The same analysis can be extended to the other Algonquian language that has been claimed to have optional long distance agreement, Innu-Aimûn (Montagnais), as seen in (72) (= Branigan and MacKenzie 2002, ex (4)).

(72) a. Ni-tshissît-en     kâ-uîtshi-shk      Pûn    utâuia.
        1-remember-TI      PST-helped-3/2PL   Paul   father
        ‘I remember that Paul’s father helped you.’
     b. Ni-tshissît-âtin   kâ-uîtshi-shk      Pûn    utâuia.
        1-remember-1/2PL   PST-helped-3/2PL   Paul   father
        ‘I remember that Paul’s father helped you.’

In (72-a), the matrix verb is marked as having a first person prefix and is marked as transitive inanimate (TI), reflecting either agreement with the complement clause or some kind of default agreement. The verb of the complement clause is marked as taking a third person subject and a second person plural object. In (72-b) the matrix verb is in the transitive animate class and agrees with the downstairs second plural object (as well as its first person subject), but the agreement in the complement clause stays the same. Therefore such structures can be analyzed as in (71).

The crucial point, then, is that the argument that participates in long distance agreement also participates in local agreement within its own clause. If the local agreement process is feature sharing, the features will be passed up to the verb of the embedded clause, where they can participate in local agreement with the matrix clause, giving the impression of long distance agreement. Nevertheless, it must be admitted that the feature sharing is not as transparently encoded in Algonquian as is the case in Tsez: we do not see the same morphological exponent realized twice in the agreement chain; instead, a complex inverse agreement system is at work, where both the higher and the lower verb agree with two arguments. The two agreement relations are to some extent fused in the morphological system, but at the syntactic level, the fact remains that there is one argument that both the higher and the lower verbs interact with. In section 6.2 we suggest that the difference between the type of agreement that we see in Latin and Tsez versus what we see in Algonquian may reflect different historical origins.

In conclusion, we have strong empirical evidence for the clause-mate assumption: while the extant long distance agreement accounts cannot account for the Latin data presented here, as we saw in section 4.2, the feature sharing, strictly local analysis proposed here for the Latin data does generalize to previously reported data on long distance agreement.

5.3 Clause-internal feature sharing

Finally, we would like to note that although long distance agreement provides the most compelling case for feature sharing, it may also be found in local agreement. A case in point is Archi (Lezgic, Northeast Caucasian). In this language, verbs that can agree always agree with their absolutive argument. This is shown in (73).25 Agreement is in gender class, which is glossed with Roman numerals I–IV as in the Tsez examples above.

25 The Archi examples come from the website of the project From competing theories to fieldwork, http://fahs-wiki.soh.surrey.ac.uk/groups/fromcompetingtheoriestofieldworkarchi/. See also Chumakina and Corbett (2008).



(73) to-w-mi-s                    Ajša                   d-ak:u
     that.one-I.SG-OBL.SG-DAT     Aisha.(II)[SG.ABS.]    II.SG-see.PFV
     ‘He has seen Aisha (female).’

More surprisingly, other elements in the clause, both arguments and adjuncts, show agreement if they have a morphological slot for agreement. Interestingly, even some adverbs and one postposition have such a slot. (74) shows an agreeing ergative argument (boldfaced).

(74) nena‹b›u               hanžugur   Qummar              b-a‹r›ca-r?
     ‹III.SG›1PL.INCL.ERG   how        life(III)[ABS.SG]   III.SG-‹IMPF›carry.out-IMPF
     ‘. . . how (should) we spend our life?’

Clearly it is possible to analyze this as the ergative argument nena‹b›u simply agreeing with the absolutive argument Qummar. However, there is no syntactic dependency between the ergative and the absolutive argument, so this is a puzzling type of agreement. A better approach may be to assume (following Kibrik 2003, 563–564 but recasting the analysis in our terms; see also Kibrik 1994, 349) that the predicate-absolutive agreement is feature sharing so that the clausal head actually comes to bear a syntactic gender III feature by virtue of agreement with the absolutive. The ergative (and other agreeing elements) can then be analyzed as agreeing with their clausal head. Although this is still an unusual type of agreement since the verb is the controller rather than the target, it would be a less surprising configuration, since there is at least a syntactic dependency between the ergative and its clausal head.
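The two-step analysis of (74) can be illustrated with a minimal Python sketch. It is an informal illustration under simplifying assumptions, not an implementation of any formal framework: the class `Node` and the functions `share_features` and `agree_with` are hypothetical names introduced here, and feature bundles are modelled as plain dictionaries. Feature sharing is modelled as token identity of the bundle, while ordinary agreement merely reads values off a controller.

```python
class Node:
    """A syntactic element with its own features and an optional agreement slot."""
    def __init__(self, label, agr=None):
        self.label = label
        self.agr = agr if agr is not None else {}  # features the node itself bears
        self.slot = None                           # values spelled out in its agreement slot


def share_features(controller, head):
    """Feature sharing: the head comes to bear the controller's bundle (same object)."""
    head.agr = controller.agr


def agree_with(target, controller, features):
    """Ordinary agreement: the target's slot copies values found on the controller."""
    target.slot = {f: controller.agr[f] for f in features}


# The elements of example (74); the feature values follow the glosses.
absolutive = Node("Qummar", {"GENDER": "III", "NUMBER": "SG"})
verb = Node("b-a‹r›ca-r")
ergative = Node("nena‹b›u", {"PERSON": "1", "NUMBER": "PL"})

# Step 1: predicate-absolutive agreement is feature sharing.
share_features(absolutive, verb)
# Step 2: the ergative agrees locally with its clausal head, not with the absolutive.
agree_with(ergative, verb, ["GENDER", "NUMBER"])
```

After both steps, `verb.agr` is the same object as `absolutive.agr`, and the ergative's slot holds gender III singular while its own first person plural features are untouched. Each step relates an element only to its clausal head.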

A related phenomenon is found in Warlpiri (Ngarrka, Pama-Nyungan). As discussed by Simpson (1991, 202–214), adverbials in this language agree in case with the event participants they ‘modify’: for example, a manner adjunct modifies the event, but may also have a special relationship to the subject and therefore agree with it in (ergative or absolutive) case. Similarly, location adjuncts may be understood as attributing location to an event participant, and this can induce case agreement. Finally, even time adjuncts may have case agreement, as in (75) (= Simpson, 1991, p. 208, ex (189)).

(75) Jalangu-rlu   ka-lu-jana            puluku    turnu-ma-ni   yapa-ngku.
     today-ERG     PRES-3P.SUBJ-3P.OBJ   bullock   muster-CAUS   man-ERG
     ‘The people are mustering the cattle today.’

It is optional for the time adjunct to agree in case, but if it does agree, it can only agree with the subject and not with other event participants. However, it is hard to see how time adjuncts can bear a special relationship to the subject. Simpson (1991, 209) provides a tentative analysis which would amount to feature sharing in our terminology: “the subject’s CASE provides a case-feature for the event”. This in turn makes the case available for further agreement processes and the adverbial, then, agrees with the event (i.e. its governing verb) rather than the subject.

6 Consequences

6.1 Types of agreement

We have seen that the Latin data requires symmetric agreement, because NPe controls GENDER and NUMBER while V/NPc controls CASE, so that agreement must work in both directions. And it also requires feature sharing, since all three features are available for further agreement at both NPe and NPc. In sum, the Latin dominant construction requires an approach with both symmetry and feature sharing, either along the lines of Ackema and Neeleman (2013), or the LFG theory outlined above. Moreover, the feature sharing approach enables us to deal with reported cases of long distance agreement and hence avoid violations of syntactic locality.

But we cannot conclude, as seems to be implied in Ackema and Neeleman (2013) (and other work arguing for feature sharing in agreement), that agreement always involves feature sharing.26 While the Latin dominant participle construction shows that feature sharing is needed in some cases, it does not show that it is needed in all cases. In fact, it is not clear that feature sharing is general inside Latin itself. For example, an alternative to the analysis in (59) could assume that agreement involves feature sharing only in the dominant participle case, while other instances of predicate-argument and head-modifier agreement are symmetric in the ordinary way. This yields (76).

26 Their only explicit claim is that agreement is symmetric, but their implementation of symmetry also incorporates feature sharing in the core syntax.


(76) [ “ADD GLORY”
       SUBJ [ “BE BORN”
              AGR  [ ]
              SUBJ [ “AUGUSTUS”
                     AGR
                     ADJ [ “DIVINE” ] ] ] ]

(76) is a simpler structure than (59). On the other hand, (76) implies that the intuitive surface generalization in (33) does not actually correspond to a unified theoretical treatment of agreement in Latin. As far as we can tell, there is no empirical evidence in Latin that would let us decide between the two in a clear way.27 It is true that in Latin, finite CPs in subject position behave differently from dominant participles and always induce third singular masculine agreement irrespective of the properties of the subject, just like in English (77).

(77) That I was elected is/*am surprising.

However, we can assume that the complementizer introduces a separate layer of structure so that the verb features do not percolate upwards. Nevertheless, it is possible that a detailed investigation will show that both feature sharing and ordinary agreement are needed for Latin. We leave this matter here.

Although it is possible that all Latin agreement is feature sharing, there is clearly evidence from other languages of agreement that cannot be feature sharing. For example, there are languages where predicates agree with more than one argument, as we saw for Passamaquoddy and Innu-Aimûn above, and as illustrated by (78) from Ostyak (= Dalrymple and Nikolaeva, 2011, 142, ex 2d).

(78) (ma)   tam     kalaŋ-ət      we:l-sə-l-am
     I      these   reindeer-PL   kill-PAST-PL.OBJ-1SG.SUBJ
     ‘I killed these reindeer.’

The predicate agrees with both its subject and its object, which have incompatible NUMBER features. Hence, if there was feature sharing in both these agreement processes, the verb would have an inconsistent feature bundle: feature sharing means that the agreement features are represented in the syntactic locus of the verb itself, so we cannot appeal to distinct agreement slots. (Recall that feature sharing means that the verb takes on the features of its agreeing dependents and is able to act as a controller of those features on another target.)

It is still possible that one of the two agreement processes in (78) uses feature sharing, however, so that the verb takes on the features of, say, the subject, while it agrees “normally” (without feature sharing) with the object.
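The inconsistency argument for (78) can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions: the function `unify` is a hypothetical, much reduced stand-in for unification over flat feature bundles, the helper names are our own, and the feature values are read off the glosses in (78).

```python
def unify(a, b):
    """Merge two feature bundles; fail if they assign different values to a feature."""
    merged = dict(a)
    for feat, val in b.items():
        if feat in merged and merged[feat] != val:
            raise ValueError(f"inconsistent {feat}: {merged[feat]} vs {val}")
        merged[feat] = val
    return merged


subject = {"PERSON": "1", "NUMBER": "SG"}  # (ma) 'I'
obj = {"NUMBER": "PL"}                     # 'these reindeer'


def share_both():
    """Feature sharing in both relations: one bundle on the verb, so NUMBER clashes."""
    return unify(unify({}, subject), obj)


def share_subject_only():
    """Sharing with the subject only; the object is registered in a separate slot."""
    verb_agr = unify({}, subject)
    object_slot = {"NUMBER": obj["NUMBER"]}  # ordinary agreement, not part of verb_agr
    return verb_agr, object_slot
```

Calling `share_both()` raises a `ValueError` because SG and PL cannot both be the verb's NUMBER value, whereas `share_subject_only()` succeeds, which corresponds to the mixed option just described.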

Another example comes from Portuguese inflected infinitives, as in (79) (= Raposo, 1987, ex 27a).

(79) [Eles   aprovarem         a     proposta]   será         difícil.
     they    approve-INF.3PL   the   proposal    be.FUT.3SG   difficult
     ‘For them to approve the proposal will be difficult.’

The infinitive agrees in PERSON and NUMBER with its subject, but induces singular agreement on the matrix verb, so it cannot itself be PLURAL, as feature sharing would predict.

It is not the case, then, that a feature sharing agreement theory is strictly more expressive (less restrictive) than an ordinary symmetric theory. There are phenomena that can be dealt with within a symmetric theory but not in a feature sharing one (at least not without ancillary hypotheses).

This means that we cannot pin down a single agree mechanism. Some agreement phenomena require feature sharing, others do not, and yet others are incompatible with feature sharing. Along the other axis of theoretical variation, symmetry, it is clear that some phenomena require symmetry and others do not; but it is an open question whether there are cases which are incompatible with symmetry. The answer may well be no, for the arguments in favor of asymmetry typically revolve around methodological considerations and the desire to limit the expressivity of the formal theory of grammar. Laudable as this goal is in principle, it makes for arguments for asymmetry that fall apart once there is a single case that clearly requires the formal theory to be more expressive. It is therefore possible that the symmetry/asymmetry distinction, unlike feature sharing, classifies theories of agreement rather than agreement phenomena. But at this stage, we leave open whether an empirical argument in favour of asymmetric agreement can be made for some agreement phenomena.

27 A reviewer suggests that NPs with possessive pronouns (i) show that agreement cannot always be feature sharing.

(i) filius         noster
    son.NOM;M;SG   our[1.PL]:NOM;M;SG
    ‘our son’

The features PERSON 1 and NUMBER pl are INDEX features related to the reference of noster ‘our’. But there is no agreement in INDEX features in (i), so the question of feature sharing does not arise. What (i) shows is simply that INDEX and CONCORD features can diverge.

Agreement type   Typical domain   Typical features
CONCORD          NP               CASE, GENDER, NUMBER
INDEX            clause           PERSON, NUMBER, GENDER

Table 2 Types of agreement

While it is possible within e.g. Minimalism to devise several agreement mechanisms, we take the diversity of agreement phenomena to speak against an architecture where the agreement mechanisms are theoretical primitives (like Minimalism’s Agree), because this leads to a multiplication of these primitives. The LFG architecture, by contrast, handles agreement by mechanisms that also appear elsewhere in the theory. Feature sharing is a general mechanism for handling cases where a syntactic entity is present in more than one position in the structure, including topicalization, raising and – as we have seen here – agreement. Standard symmetric agreement is handled by functional descriptions, which is the general mechanism for lexical items to provide information about themselves and their environment (subject to locality considerations): they are also used e.g. for case assignment, subcategorization and binding. LFG also has a mechanism that could be used for asymmetric agreement, if that should turn out to be necessary: these would be handled by constraining equations, a general mechanism for handling feature checking in LFG. In sum, the more abstract nature of LFG’s primitives, which correspond closely to mathematical operations, makes it easier to generalize across grammatical phenomena.28
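The contrast between functional descriptions and constraining equations can be sketched informally in Python. This is a simplified illustration, not LFG's actual solution algorithm: the functions `define` and `constrain` are hypothetical names, f-structures are reduced to flat dictionaries, and the constraining check is simply run after the defining statements, whereas LFG evaluates constraining equations against the minimal solution of all defining equations.

```python
def define(fstruct, attr, value):
    """Defining statement: contributes a value, or fails if a different one is present."""
    if fstruct.setdefault(attr, value) != value:
        raise ValueError(f"clash on {attr}: {fstruct[attr]} vs {value}")


def constrain(fstruct, attr, value):
    """Constraining statement: only checks that the value is already there."""
    return fstruct.get(attr) == value


# Symmetric agreement: controller and target both contribute NUMBER to the same
# f-structure, so the order of the two statements does not matter.
subj = {}
define(subj, "NUMBER", "SG")  # contributed by the target (e.g. verbal inflection)
define(subj, "NUMBER", "SG")  # contributed by the controller (the noun itself)

# Asymmetric agreement: the target merely checks a value it cannot supply itself.
bare = {}
target_alone_ok = constrain(bare, "NUMBER", "SG")          # no controller has supplied it
target_with_controller_ok = constrain(subj, "NUMBER", "SG")
```

In this sketch `target_alone_ok` is false and `target_with_controller_ok` is true: a constraining statement never adds information, which is what would make it a candidate mechanism for asymmetric agreement.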

6.2 The origins of feature sharing and long distance agreement

Across frameworks, there is a widely shared assumption that there are theoretically interesting differences between predicate-argument agreement, and agreement inside the noun phrase. In traditional terms, a distinction is often drawn between concord (inside the noun phrase) and agreement (in predicate-argument structures), although these terms are not consistently used (Corbett, 2006, 5–7). Similar distinctions are drawn in Minimalism (Carstens, 2000), HPSG (Wechsler and Zlatic, 2003) and LFG (Dalrymple and King, 2004). Here we follow the HPSG/LFG tradition and distinguish between CONCORD and INDEX, which are prototypically (but not exclusively) associated with agreement inside the noun phrase and in the clause (predicate-argument). The two types of agreement also prototypically involve different features, possibly for historical reasons which we will only sketch here; see Wechsler (2011, section 5) for a more thorough treatment. Table 2 sums up the differences between CONCORD and INDEX agreement.

HPSG develops an interesting view of INDEX agreement. INDEX features are thought of as properties of discourse referents, partially specifying their referential index. This is why they include PERSON, which may identify the referent, but not CASE, which has no connection with reference. There is a historical explanation for this: predicate-argument agreement typically originates in pronouns via a path from full coreferent pronoun, to cliticization to incorporation (Givón, 1976; Bynon, 1992; Corbett, 1995). Hence, coindexation is part of predicate-argument agreement from the outset.

CONCORD agreement has a different origin, most likely in incorporated nominal classifiers (Greenberg, 1978; Corbett, 1991, 2006; Grinevald and Seifart, 2004), so there is no coindexation involved. We therefore predict that CONCORD agreement can involve feature sharing without leading to coreference. The stronger hypothesis would be that CONCORD always involves feature sharing. This view is implicit in much HPSG work (e.g. Wechsler and Zlatic, 2003) and also in more typologically oriented work such as that of Corbett (2006, 133–137) who argues that NP-internal agreement in case and definiteness results from these features being imposed on the noun phrase as a whole.

It may not be possible to decide whether CONCORD is always feature sharing. For crucially, as long as we have CONCORD in its prototypical domain – the NP – we are unlikely to see visible effects of feature sharing, such as long-distance agreement, as in Latin and Tsez, or apparent agreement between coarguments in a clause, as in Archi and Warlpiri. On the other hand, we do observe the frequent identity of the morphological exponents on targets and controllers, which Kathol (1999) used as a conceptual argument for feature sharing.

28 HPSG is similar in this respect, so these observations hold for that framework too.


The effects of feature sharing, then, only become visible if CONCORD agreement appears in non-prototypical environments, such as predicate-argument structures, as in the Latin dominant participle structure. This gives rise to a new question, however: Why do we find CONCORD agreement in predicate-argument structures at all? The Latin case is not unique: Wechsler and Zlatic (2003, 84) argue that secondary (but not primary) predicates in Serbo-Croat agree in CONCORD. (Conversely, Dalrymple and King (2004, 83–87) argue that there are instances of noun-determiner agreement in INDEX.) If the hypotheses on the origin of CONCORD and INDEX agreement are correct, there must be a mechanism by which CONCORD agreement can spread outside its original, NP-internal domain.

The history of the dominant participle construction offers one possible mechanism, namely headedness reversal. As argued in Nikitina and Haug (2015), the dominant participle construction ultimately arose as a result of reanalysis of headedness relations within a temporal expression. Expressions in which a noun with a temporal meaning was originally modified by a participle became reanalyzed as a clausal adjunct in which the participle agrees with its subject:

(80) a. attributive construction “[with] winter [which was] ending”
        [NP [AP [A ending]] [NP [N winter]]]
     ⇒
     b. clausal adjunct “[with] winter ending”
        [S [V ending] [NP [N winter]]]

The construction’s origin is also what sets it apart from an otherwise similar construction, the English nominal gerund (e.g. my going away). The English nominal gerund is also semantically clausal and has the distribution of an NP. However, the verbal element of the nominal gerund is originally a verbal noun rather than a verbal adjective like the Latin participle. For this reason the subject-predicate relation is established via ordinary case assignment, at first the expected adnominal genitive/possessive case, then accusative (me going away).

To the extent that long-distance agreement (and more generally, feature sharing agreement outside the NP) originates in such headedness reversals of an original NP CONCORD structure, this gives us a number of predictions about long distance agreement, which are borne out by the Latin and Tsez data, but contradicted by Passamaquoddy and Innu-Aimûn.

First, there should be no long distance agreement in PERSON, both because PERSON is a prototypical INDEX feature and because PERSON is typically not found in NP-internal agreement (Stassen, 1997).

Second, more generally, we expect long distance agreement to involve the same features as in noun-modifier agreement in any given language. Prototypically, these are CASE, NUMBER, GENDER (also known as noun class in Tsez and other languages). Again, we don’t expect to find long distance agreement in PERSON, since PERSON is typically absent from noun-modifier agreement.

Third, we expect to see the same morphological exponents on both the higher and the lower verb, in line with the observation of Kathol (1999) for NP-internal agreement. Note that we do not necessarily see this exponent on the lower NP which ‘sets off’ the long distance agreement: if the relevant feature is inherent to that NP, it may not have a morphological exponent at all.

Being based on diachrony, these are expectations rather than absolute predictions. First, subsequent evolution could disturb the pattern. Second, there may be alternative origins for long distance agreement. The Latin and Tsez constructions are structurally very similar and wear their nominal origin on their sleeves in that the lower predicate is even synchronically a nominalized form of a participle. The Algonquian constructions discussed in section 5.1 are in this respect very different. The lower verb is finite with no nominal properties and the features relevant to long distance agreement are PERSON and NUMBER. This may point to a different origin of the phenomenon. As we saw in section 5.1, several Algonquian languages have proxy agreement structures that only look like long distance agreement. Bruening (2001, chapter 5) even argues that Passamaquoddy has both proxy agreement and real long distance agreement in the form of clause-periphery agreement. It is tempting, then, to conclude that Passamaquoddy and Innu-Aimûn clause-periphery agreement arose from proxy agreement structures through the loss of the invisible proxy in the matrix clause, leading to a structure that can be captured synchronically by the same mechanism (feature sharing) as the Latin and Tsez data.


7 Conclusions

In the introduction, we set out the analytic options that we have in dealing with agreement. This yielded a four-way typology with two cross-cutting features, symmetry and feature sharing. Throughout the paper, we have populated the symmetric slots of this typology: symmetric, feature sharing agreement is needed to deal with the Latin case, but – we argued – is also a more general phenomenon, and yields a better explanation of so-called long distance agreement than analyses which postulate violations of syntactic locality. So the locality of agreement is vindicated. However, there are also instances of agreement without feature sharing, as illustrated in (78)–(79). It follows that agreement is not a uniform phenomenon, which suggests it should not be a primitive of syntactic theory. We also saw that the Latin construction originated in headedness reversal. If this is also true for other cases of long distance agreement, it yields some interesting predictions about its behaviour.

Finally, we would like to stress the implications for syntactic locality, in particular in agreement relations. As discussed in section 1.3, locality is a long-standing desideratum of formal syntactic theories, but one that scholars have recently tended to give up because of apparent long distance agreement in languages such as Tsez, Passamaquoddy and Innu-Aimûn. Instead they have assumed various non-local agreement mechanisms. In this paper, we have argued that the cyclic feature sharing approach to agreement is not only conceptually superior to long distance agreement, but also empirically more successful since it can account for the Latin dominant participle construction, which is incompatible with current long distance agreement theories, while also generalizing to the previously reported data.

Acknowledgements We thank Oleg Belyaev, Grev Corbett, Mary Dalrymple, Michael Daniel, Marius Jøhndal, Louisa Sadler, associate editor Ad Neeleman and NLLT reviewers for valuable feedback on the research reported in this paper.

References

Ackema, Peter, and Ad Neeleman. 2013. Subset controllers in agreement relations. Morphology 23 (2): 291–323.
Alsina, Alex. 2008. A theory of structure-sharing. In Proceedings of LFG08, eds. Miriam Butt and Tracy Holloway King, 5–25. Stanford: CSLI Publications.
Andrews, Avery. 1982. Long distance agreement in Modern Icelandic. In The Nature of Syntactic Representation, eds. Pauline Jacobson and Geoffrey Pullum, 1–33. Dordrecht: D. Reidel.
Asudeh, Ash. 2012. The Logic of Pronominal Resumption. Oxford: Oxford University Press.
Barlow, Michael. 1992. A Situated Theory of Agreement. New York: Garland.
Bhatt, Rajesh. 2005. Long distance agreement in Hindi-Urdu. Natural Language & Linguistic Theory 23 (4): 757–807.
Bobaljik, Jonathan. 2008. Where’s phi? Agreement as a post-syntactic operation. In Phi-theory: Phi-features across Interfaces and Modules, eds. Daniel Harbour, David Adger, and Susana Béjar, 295–328. Oxford: Oxford University Press.
Boeckx, Cedric. 2008. Aspects of the Syntax of Agreement. London: Routledge.
Boeckx, Cedric. 2009. On long-distance Agree. Iberia: An International Journal of Theoretical Linguistics 1: 1–32.
Boeckx, Cedric, Norbert Hornstein, and Jairo Nunes. 2010. Control as Movement. Cambridge: Cambridge University Press.
Boškovic, Željko. 2003. Agree, phases, and intervention effects. Linguistic Analysis 33 (1–2): 54–96.
Boškovic, Željko. 2007. On the locality and motivation of Move and Agree: An even more minimal theory. Linguistic Inquiry 38 (4): 589–644.
Branigan, Phil, and Marguerite MacKenzie. 2002. Altruism, A-movement, and object agreement in Innu-aimûn. Linguistic Inquiry 33 (3): 385–407.
Bresnan, Joan. 1982. Control and complementation. In The Mental Representation of Grammatical Relations, ed. Joan Bresnan, 282–390. Cambridge, MA: MIT Press.
Bresnan, Joan. 1997. Mixed categories as head sharing constructions. In Proceedings of LFG97, eds. Miriam Butt and Tracy Holloway King. Stanford: CSLI Publications.
Bresnan, Joan. 2001. Lexical-Functional Syntax. Oxford: Blackwell.
Bruening, Benjamin. 2001. Syntax at the edge: Cross-clausal phenomena and the syntax of Passamaquoddy. PhD diss, Massachusetts Institute of Technology.
Bruening, Benjamin. 2009. Algonquian languages have a-movement and a-agreement. Linguistic Inquiry 40 (3): 427–445.
Butt, Miriam. 1995. The Structure of Complex Predicates in Urdu. Stanford: CSLI Publications.
Butt, Miriam. 2014. Control vs. complex predication: Identifying non-finite complements. Natural Language & Linguistic Theory 32 (1): 165–190.
Bynon, Theodora. 1992. Pronominal attrition, clitic doubling and typological change. Folia Linguistica Historica 13: 27–63.
Carstens, Vicki. 2000. Concord in Minimalist theory. Linguistic Inquiry 31 (2): 319–355.
Cecchetto, Carlo, and Renato Oniga. 2004. A challenge to null case theory. Linguistic Inquiry 35 (1): 141–149.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, Noam. 2000. Minimalist inquiries: The framework. In Step by Step. Essays on Minimalist Syntax in Honor of Howard Lasnik, eds. Roger Martin, David Michaels, and Juan Uriagereka, 89–155. Cambridge, MA: MIT Press.
Chumakina, Marina, and Greville Corbett. 2008. Archi: The challenge of an extreme agreement system. In Fonetika i nefonetika. K 70-letiju Sandro V. Kodzasova, ed. Aleksandr Arxipov et al., 184–194. Moscow: Jasyki slavjanskix kul’tur.
Corbett, Greville. 1979. The agreement hierarchy. Journal of Linguistics 15 (2): 203–224.
Corbett, Greville. 1991. Gender. Cambridge: Cambridge University Press.
Corbett, Greville. 1995. Agreement (research into syntactic change). In Syntax: An International Handbook of Contemporary Research, eds. Joachim Jacobs, Arnim von Stechow, Wolfgang Sternefeld, and Theo Venneman, Vol. II, 1235–1244. Berlin: de Gruyter.
Corbett, Greville. 2006. Agreement. Cambridge: Cambridge University Press.
Dalrymple, Mary, and Ronald M Kaplan. 2000. Feature indeterminacy and feature resolution. Language 76 (4): 759–798.
Dalrymple, Mary, and Tracy Holloway King. 2004. Determiner agreement and noun conjunction. Journal of Linguistics 40 (1): 69–104.
Dalrymple, Mary, and Irina Nikolaeva. 2011. Objects and Information Structure. Cambridge: Cambridge University Press.
Devine, A. M., and L. D. Stephens. 2000. Discontinuous Syntax – Hyperbaton in Greek. Oxford: Oxford University Press.
Falk, Yehuda. 2006. On the representation of case and agreement. In Proceedings of LFG06, eds. Miriam Butt and Tracy Holloway King. Stanford: CSLI Publications.
Forker, Diana. 2011. A Grammar of Hinuq. PhD diss, Universität Leipzig.
Frampton, Jon, and Sam Gutmann. 2000. Agreement is feature sharing. Ms. Northeastern University.
Givón, Talmy. 1976. Topic, pronoun and grammatical agreement. In Subject and Topic, ed. Charles N. Li, 149–188. New York: Academic Press.
Greenberg, Joseph H. 1978. How does a language acquire gender markers?, eds. J. Greenberg, C. A. Ferguson, and E. A. Moravcsik, 47–82. Stanford: Stanford University Press.
Grinevald, Colette, and Frank Seifart. 2004. Noun classes in African and Amazonian languages: Towards a comparison. Linguistic Typology 8: 34–48.
Harmer, Lewis, and Frederick John Norton. 1957. A Manual of Modern Spanish. London: University Tutorial Press.
Haug, Dag Trygve Truslew, and Tanya Nikitina. 2012. The many cases of non-finite subjects: The challenge of "dominant" participles. In Proceedings of LFG12, eds. Miriam Butt and Tracy Holloway King, 292–311. CSLI Publications.
Heick, Otto William. 1936. The ab urbe condita Construction in Latin. PhD diss, University of Nebraska.
Kajita, Masaru. 1968. A Generative-Transformational Study of Semi-Auxiliaries in Present-Day American English. Tokyo: Sanseido.
Kathol, Andreas. 1999. Agreement and the syntax-morphology interface in HPSG. In Studies in Contemporary Phrase Structure Grammar, eds. Robert Levine and Georgia Green, 223–274. Cambridge: Cambridge University Press.
Kibrik, Aleksandr E. 1994. Archi. In The Indigenous Languages of the Caucasus. vol. 4 North-East Caucasian languages, part 2, ed. Rieks Smeet, 297–366. Delmar, NY: Caravan Books.
Kibrik, Aleksandr E. 2003. Konstanty i peremennye jazyka. Sankt-Peterburg: Aleteia.
Koopman, Hilda. 2006. Agreement configurations. In defense of "spec head". In Agreement Systems, ed. Cedric Boeckx, 159–199. Amsterdam: John Benjamins.
Lapointe, Steven G. 1999. Dual lexical categories vs. phrasal conversion in the analysis of gerund phrases. In Umop 24: Papers from the 25th anniversary, eds. Paul de Lacy and Anita Nowak, 157–189. Amherst: University of Massachusetts Graduate Linguistic Student Association.
Legate, Julie Anne. 2005. Phases and cyclic agreement. MIT Working Papers in Linguistics 49: 147–156.
Nikitina, Tatiana. 2008. The mixing of syntactic properties and language change. PhD diss, Stanford.
Nikitina, Tatiana, and Dag Trygve Truslew Haug. 2015. Syntactic nominalization in Latin: a case of non-canonical subject agreement. Transactions of the Philological Society 113 (1): 1–26.
Pesetsky, David, and Esther Torrego. 2007. The syntax of valuation and the interpretability of features. In Phrasal and Clausal Architecture: Syntactic Derivation and Interpretation, eds. Simin Karimi, Visa Samiian, and Wendy K. Wilkins, 262–294. Amsterdam: Benjamins.
Pesetsky, David, and Esther Torrego. 2011. Case. In The Oxford Handbook of Linguistic Minimalism, ed. Cedric Boeckx, 52–72. Oxford: Oxford University Press.
Pinkster, Harm. 1990. Latin Syntax and Semantics. London: Routledge.
Polinsky, Maria. 2003. Non-canonical agreement is canonical. Transactions of the Philological Society 101 (2): 279–312.
Polinsky, Maria, and Eric Potsdam. 2001. Long-distance agreement and topic in Tsez. Natural Language & Linguistic Theory 19 (3): 583–646.
Pollard, Carl, and Ivan Sag. 1994. Head-driven Phrase Structure Grammar. Chicago: Chicago University Press.
Preminger, Omer. 2013. That’s not how you agree: A reply to Zeijlstra. The Linguistic Review 30 (3): 491–500.
Pullum, Geoffrey. 1991. English nominal gerund phrases as noun phrases with verb phrase heads. Linguistics 29 (5): 763–799.
Raposo, Eduardo. 1987. Case Theory and Infl-to-Comp: The Inflected Infinitive in European Portuguese. Linguistic Inquiry 18 (1): 85–109.
Sag, Ivan. 2010. Feature geometry and predictions of locality. In Features. Perspectives on a Key Notion in Linguistics, eds. Greville Corbett and Anna Kibort, 236–271. Oxford: Oxford University Press.
Sag, Ivan A., Thomas Wasow, and Emily M. Bender. 2003. Syntactic Theory. A Formal Introduction. Stanford: CSLI Publications.
Simpson, Jane. 1991. Warlpiri Morpho-Syntax: A Lexicalist Approach. Dordrecht: Kluwer.
Stassen, Leon. 1997. Intransitive predication. Oxford: Oxford University Press.
von Heusinger, Klaus, and Johannes Wespel. 2006. Indefinite proper names and quantification over manifestations. In Proceedings of Sinn und Bedeutung 11, ed. E. Puig-Waldmüller, 332–345. Barcelona: Universitat Pompeu Fabra.
Wechsler, Stephen. 2011. Mixed agreement, the person feature, and the index/concord distinction. Natural Language & Linguistic Theory 29 (4): 999–1021.
Wechsler, Stephen, and Larisa Zlatic. 2003. The Many Faces of Agreement. Stanford: CSLI Publications.
Welmers, William E. 1973. African Language Structures. Berkeley: University of California Press.
Zeijlstra, Hedde. 2012. There is only one way to agree. The Linguistic Review 29 (3): 491–539.
