January 24-25, 2003. Workshop on Markedness and the Lexicon. On the Priority of Markedness. Paul Smolensky, Cognitive Science Department, Johns Hopkins University

January 24-25, 2003

Workshop on Markedness and the Lexicon


On the Priority of Markedness

Paul Smolensky

Cognitive Science Department

Johns Hopkins University


Markedness Rules

Markedness is prior to lexical frequency

• Developmentally

• Explanatorily
– Markedness determines possible inventories (e.g., of lexical items)
– Markedness determines relative frequency of structures

Have few solid results; mostly suggestive evidence, empirical and theoretical


Developmental Priority

• Look to see whether young infants are sensitive to markedness before they’ve had sufficient relevant experience

• Before 6 months, infants have not shown sensitivity to language-particular phonotactics


Experimental Exploration of the Initial State


Talk Outline




Markedness determines relative frequency of structures


Markedness and Inventories

• Insert: SHarC Theorem

• Insert: Lango


Inherent Typology

• Method applicable to related African languages, where the same markedness constraints govern the inventory (Archangeli & Pulleyblank ’94), but with different interactions: different rankings and active conjunctions

• Part of a larger typology including a range of vowel harmony systems


Summary

• OT builds formal grammars directly from markedness: MARK … with FAITH

• Inventories consistent with markedness relations are formally the result of OT … with local conjunction: TLC[Φ], SHarC theorem

• Even highly complex patterns can be explained purely with simple markedness constraints: all complexity is in constraints’ interaction through ranking and conjunction: Lango ATR harmony


Talk Outline




Markedness determines relative frequency of structures [???]


Markedness Frequency

• How are markedness and frequency to be theoretically related?

• Markedness theory must predict frequency distributions
– Frequencies are the data to be explained

• The question is not
– why does John say X more frequently than Y?, but
– why does John’s speech community say X more frequently than Y?

• How, within generative grammar?

• Consider an extreme (but important) distribution in cross-linguistic typology


A Generativist Paradox

• UG must not generate unattested languages

• What counts as unattested?

• “The overwhelming generalization is U; the proposed UG0 is right because all systems it generates satisfy U” (celebrates: X not generated)

• “This UG generates the somewhat odd system X (violates U) … but this is actually a triumph because it so happens that the actual (but obscure) language L is odd like X” (celebrates: X is generated)

Inconsistent!


The Generativist Paradox

• That is, how to explain generalizations of the form “Overwhelmingly across languages, U is true, but in rare cases it is violated: (an ‘exception’) X”

• Generative grammar has only two options:
– Generate only U-systems: strictly prohibits X, or
– Generate both U and not-U systems: allows X

• Neither explains the generalization


The Generativist Paradox

• A proposed UG0 entails a universal U: T ≻ K

• UG0 thus predicts
– if a language allows K it must also allow T
– errors must be directed K → T

• Suppose this is overwhelmingly true, but rarely:
– a language X’s inventory includes K but not T
– there are errors T → K

• UG0-impossible!
– Is this evidence for or against UG0?
– Must UG0 be weakened to allow languages with K ≻ T ?


Approaches to the Paradox

• UG is not responsible for X; not core
– Linguists’ judgment determines the core data
– Good approach


• UG is not responsible for X; not core

• UG generates X and is not responsible for its rarity
– Derives from extra-grammatical factors



• UG is not responsible for X; not core

• UG generates X and is not responsible for its rarity


• UG generates X and derives its rarity
– qualitatively or
– quantitatively

I have no idea

Well, maybe three ideas …

How, within a generative theory —

OT?


Graded Generability in OT

Idea 1: Ranking Restrictiveness

Rare systems are those produced by only a highly restricted set of rankings

• Parallel to within-language variation in OT

Grammar + Ø


• Consider first within-language variation
– a language has a range of rankings
– for a given input, the probability of an output is the combined probability of all the rankings for which it is optimal
  • Rankings: equal probability (Anttila)
  • Rankings: “Gaussian probability” (Boersma)
– works surprisingly well
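The equal-probability case can be sketched in a few lines. This is only an illustration: the constraints A, B, C, the two candidates, and their violation counts are hypothetical, and every total ranking is assumed to be equally likely.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical violation counts for two candidate outputs of a single input.
VIOLATIONS = {
    "cand1": {"A": 1, "B": 0, "C": 0},
    "cand2": {"A": 0, "B": 1, "C": 1},
}
CONSTRAINTS = ("A", "B", "C")


def optimal(ranking):
    """Return the candidate whose violation profile, read from the
    highest-ranked constraint down, is lexicographically smallest."""
    return min(VIOLATIONS, key=lambda cand: tuple(VIOLATIONS[cand][c] for c in ranking))


def output_probabilities(rankings):
    """Sum the probability of every ranking under which each candidate wins,
    assuming all rankings are equally probable."""
    rankings = list(rankings)
    weight = Fraction(1, len(rankings))
    probs = {cand: Fraction(0) for cand in VIOLATIONS}
    for ranking in rankings:
        probs[optimal(ranking)] += weight
    return probs


PROBS = output_probabilities(permutations(CONSTRAINTS))
print(PROBS)
```

Here cand1 wins whenever B or C is ranked highest, so it is optimal under 4 of the 6 rankings and cand2 under the remaining 2.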



• Consider first within-language variation
– a language has a range of rankings
– for a given input, the probability of an output is the combined probability of all the rankings for which it is optimal

• Can this work for cross-linguistic variation?
– I haven’t a clue

• Well, maybe three clues


Clue 1: CV Theory

Distribution of Basic Syllable Languages (chart values)

• CV: 47
• CV(C): 20
• (C)V(C): 13
• (C)V: 20

• Encouraging or discouraging???
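The chart's figures can be reproduced by counting rankings, under assumptions that are not stated on the slide: the five constraints of Basic CV Syllable Theory (named here ONS, NOCODA, PARSE, FILL_ONS, FILL_NUC), with onsets required when ONS dominates PARSE or FILL_ONS, codas forbidden when NOCODA dominates PARSE or FILL_NUC, and all 120 rankings equally likely. A minimal sketch:

```python
from collections import Counter
from itertools import permutations

# Assumed constraint set; the names are this sketch's own labels.
CONSTRAINTS = ("ONS", "NOCODA", "PARSE", "FILL_ONS", "FILL_NUC")


def syllable_type(ranking):
    """Classify a total ranking as CV, CV(C), (C)V or (C)V(C)."""
    rank = {c: i for i, c in enumerate(ranking)}  # smaller index = higher ranked
    onset_required = rank["ONS"] < rank["PARSE"] or rank["ONS"] < rank["FILL_ONS"]
    coda_forbidden = rank["NOCODA"] < rank["PARSE"] or rank["NOCODA"] < rank["FILL_NUC"]
    onset = "C" if onset_required else "(C)"
    coda = "" if coda_forbidden else "(C)"
    return onset + "V" + coda


COUNTS = Counter(syllable_type(r) for r in permutations(CONSTRAINTS))
TOTAL = sum(COUNTS.values())
PERCENT = {kind: round(100 * n / TOTAL) for kind, n in COUNTS.items()}
print(COUNTS, PERCENT)
```

Under these assumptions the counts are 56, 24, 24 and 16 out of 120 for CV, CV(C), (C)V and (C)V(C), which round to 47, 20, 20 and 13.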


Clue 2: Constraint Sensitivity

The probabilistic interpretation would provide additional empirical constraints on OT theories:

• Markedness of low-front-round (IPA ɶ):
① *[+fr, +lo, +rd], or
② *[+fr, +rd], *[+lo, +rd], *[+fr, +lo] ?

• Faithfulness constraints: F[fr], F[rd], F[lo]

• Probability of ɶ in the inventory:
① 25%
② 7%

Empirical probability informs constraint discovery
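The two percentages can be checked by brute force. The sketch below assumes that every ranking is equally likely, that the candidate set for the fully faithful low-front-round input consists of all eight combinations of the three feature values, and that a vowel is "in the inventory" when the faithful candidate is optimal. The helper names are this sketch's own.

```python
from itertools import permutations, product

FEATURES = ("fr", "lo", "rd")
TARGET = {"fr": True, "lo": True, "rd": True}  # low front round vowel


def faith(feature):
    """F[feature]: one violation if the output changes the input's value."""
    return lambda inp, out: int(inp[feature] != out[feature])


def mark(*features):
    """*[+f1, +f2, ...]: one violation if the output has all listed features."""
    return lambda inp, out: int(all(out[f] for f in features))


FAITH = {f"F[{f}]": faith(f) for f in FEATURES}
THEORY_1 = {**FAITH, "*[+fr,+lo,+rd]": mark("fr", "lo", "rd")}
THEORY_2 = {
    **FAITH,
    "*[+fr,+rd]": mark("fr", "rd"),
    "*[+lo,+rd]": mark("lo", "rd"),
    "*[+fr,+lo]": mark("fr", "lo"),
}
CANDIDATES = [dict(zip(FEATURES, values)) for values in product((True, False), repeat=3)]


def in_inventory(constraints, ranking):
    """True if the faithful candidate is optimal under this ranking."""
    def profile(out):
        return tuple(constraints[c](TARGET, out) for c in ranking)
    return min(CANDIDATES, key=profile) == TARGET


def proportion(constraints):
    """Fraction of total rankings that admit the vowel."""
    rankings = list(permutations(constraints))
    return sum(in_inventory(constraints, r) for r in rankings) / len(rankings)


P1 = proportion(THEORY_1)
P2 = proportion(THEORY_2)
print(P1, P2)
```

Under these assumptions, theory ① admits the vowel in 6 of 24 rankings (25%) and theory ② in 48 of 720 (about 7%), which agrees with the figures on the slide.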


Clue 3: BO(WO)nW and &D

• In Basic Inventory Theory with Local Conjunction, the proportion of rankings yielding a BO(WO)nW inventory is

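[Equation garbled in extraction; the surviving fragments show a ratio with a factorial, (…)!, in the denominator and an asymptotic approximation (~) in n]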

• Even when many conjunctions are present, the likelihood that they matter becomes vanishingly small as n (the order of conjunction) increases



Idea 2: Learnability

Rarer grammars are less robustly learnable

Grammar + general learning theory

???



Idea 3: Connectionist substrate

As with Ranking Restrictiveness, start with language-internal variation

Given an input I, a rare output O is one that is rarely found by the search process

Grammar + general processing theory



• Problem identified by Matt Goldrick

• Aphasic errors are predominantly k → t, but t → k also occurs, rarely

• Exceptional behavior w.r.t. markedness

• How is this possible if *dor ≫ *cor in UG? Under no possible ranking can t → k

• Must we allow violations of *dor ≫ *cor ?

• Alternative approach via processing theory

• Crucial: global vs. local optimization


OT ⇒ pr[I→O] via Connectionism

• Candidate A: realized as an activation pattern a (distributed; or local to a unit)

• Harmony of A: H(a), numerical measure of consistency between a and the connection weights W

• Grammar: W

• Discrete symbolic candidate space embedded in a continuous state space

• Search: Probability of A: pr_T(a) ∝ e^{H(a)/T}
– During search, T → 0
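The effect of lowering T can be seen by normalizing e^{H(a)/T} over a small set of patterns. The sketch below is an illustration only: it treats the distribution as an equilibrium distribution over a finite list of patterns, and the two harmony values are hypothetical.

```python
import math


def boltzmann(harmonies, temperature):
    """Normalize exp(H(a)/T) over the listed patterns.
    Subtracting the maximum harmony first avoids overflow at small T."""
    top = max(harmonies.values())
    weights = {a: math.exp((h - top) / temperature) for a, h in harmonies.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Hypothetical harmonies: "t" is the higher-Harmony pattern, "k" the lower one.
HARMONIES = {"t": 0.0, "k": -1.0}

for T in (2.0, 0.5, 0.1):
    print(T, boltzmann(HARMONIES, T))
```

At high T the two patterns are nearly equiprobable; as T approaches 0 the probability mass concentrates on the pattern with maximal Harmony.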


Harmony Maxima

• Patterns realizing optimal symbolic structures are global Harmony maxima

• Patterns realizing suboptimal symbolic structures are local Harmony maxima

• Search should find the global optimum

• Search will find a local optimum

• Example: Simple local network for doing ITBerber syllabification


BrbrNet

[Figure: BrbrNet, a local network with syllable (σ) nodes over Ons and Nuc positions and C and V units. Segment units a, i, r, n, z, s, d, t carry weights Wa … Wt and are ordered on a sonority scale from 8 (a) down to 1 (t). The connection weights appear to be W_ONSET = 2^8 and W_k = 2^k − 1 for k = 8, …, 1.]


BrbrNet’s Local Harmony Maxima

An output pattern in BrbrNet is a local Harmony maximum if and only if it realizes a sequence of legal Berber syllables (i.e., an output of Gen)

That is, every activation value is 0 or 1, and the sequence of values is that realizing a sequence of substrings taken from the inventory {CV, CVC, #V, #VC}, where C denotes 0, V denotes 1 and # denotes a word edge
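The string condition stated above can be checked mechanically. The sketch below tests only that condition, not Harmony itself; it assumes the onsetless shapes #V and #VC occur only at the start of the sequence (which is how # is read here), and the function name is hypothetical.

```python
import re

# C = 0, V = 1. An optional word-initial V or VC, then any number of CV / CVC.
LEGAL = re.compile(r"(10?)?(010?)*")


def is_legal(activations):
    """True if the 0/1 sequence can be cut into substrings from
    {CV, CVC, #V, #VC}, with #V and #VC allowed only word-initially."""
    bits = "".join(str(v) for v in activations)
    return bool(bits) and LEGAL.fullmatch(bits) is not None


print(is_legal([0, 1, 0, 1]))     # CV.CV
print(is_legal([1, 0, 0, 1]))     # #VC.CV
print(is_legal([1, 1]))           # second V has no onset
print(is_legal([0, 1, 0, 0, 1]))  # CVC.CV
```

A sequence such as 1 1 fails because a nucleus without an onset is only available at the word edge.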


Competence, Performance

• So how can t → k ?
– t a global max, k a local max
– now we can get k when we should get t

• Distinguish Search Dynamics (‘performance’) from Harmony Landscape (‘competence’)
– the universals in the Harmony Landscape require that, absent performance errors, we must have k → t
– an imperfect Search Dynamics allows t → k

• The huge ‘general case/exception’ contrast
– t’s output derives from UG
– k’s output derives from performance error
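The contrast between landscape and search can be illustrated with a toy example. The sketch below is not the network from the talk: it uses a hypothetical one-dimensional Harmony function with a global maximum near x = +1 and a lower local maximum near x = −1, and plain gradient ascent stands in for an imperfect, purely local search.

```python
def harmony(x):
    """Hypothetical landscape: global maximum near +1, local maximum near -1."""
    return -(x * x - 1.0) ** 2 + 0.3 * x


def gradient(x):
    """Derivative of harmony(x)."""
    return -4.0 * x * (x * x - 1.0) + 0.3


def local_search(x, step=0.05, iterations=2000):
    """Climb uphill from x; the search stops at whichever maximum is nearest."""
    for _ in range(iterations):
        x += step * gradient(x)
    return x


X_GLOBAL = local_search(0.5)   # starts in the basin of the global maximum
X_LOCAL = local_search(-0.5)   # starts in the basin of the local maximum
print(X_GLOBAL, harmony(X_GLOBAL))
print(X_LOCAL, harmony(X_LOCAL))
```

The landscape is the same in both runs; only the starting point of the search differs, and that alone decides whether the outcome is the global or the local maximum.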


Summary

• Exceptions to markedness universals may potentially be modeled as performance errors: the unmarked (optimal) elements are global Harmony maxima, but local search can end up with marked elements which are local maxima

• Applicable potentially to sporadic, unsystematic exceptions in I → O mapping

• Extensible to systematic exceptions in I → O or to exceptional grammars???


Markedness Rules

Markedness determines possible inventories (with local conjunction)

Markedness determines relative frequency of structures --- ???
