
Formal Typology: Explanation in Optimality Theory

Paul Smolensky
Cognitive Science Department, Johns Hopkins University

with: Géraldine Legendre, Donald Mathis, Melanie Soderstrom, Alan Prince, Suzanne Stevenson, Peter Jusczyk†


The Harmonic Mind: From neural computation to optimality-theoretic grammar
Paul Smolensky & Géraldine Legendre
Blackwell 2002 (??)
• Develop the Integrated Connectionist/Symbolic (ICS) Cognitive Architecture
• Apply to the theory of grammar

Chomsky 1988

“1. What is the system of knowledge?
2. How does this system of knowledge arise in the mind/brain?
3. How is this knowledge put to use?
4. What are the physical mechanisms that serve as the material basis for this system of knowledge and for the use of this knowledge?” (p. 3)

Responsibilities of Grammatical Theory

Chomsky’s “Big 4” questions concerning knowledge of grammar:
① Structure
② Acquisition
③ Processing
④ Neuro-genetics
(Diagram: OT addresses ①; the nativist hypothesis links ① and ②.)

Not new to Chomsky or generative grammar …

Jakobson’s Program

Linguistic theory is not just for theoretical linguists

The same principles that explain formal cross-linguistic and language-internal distributional patterns can also explain
• Acquisition
• Processing
• Neurological breakdown

Jakobson’s Program

Markedness enables a Grand Unified Theory for the cognitive science of language: Avoid α
① Structure: inventories lack α; alternations eliminate α
② Acquisition: α is acquired late
③ Processing: α is processed poorly
④ Neural: brain damage most easily disrupts α

Talk Plan

① Structure

② Acquisition

③ Processing

④ Neuro-genetics

(For each topic: formal result(s) giving the OT explanation; a conceptual “question,” answered by the explanatory goal OT achieves; empirical insights; Jakobson’s program.)

Responsibilities of Grammatical Theory

Chomsky’s “Big 4” questions concerning knowledge of grammar
① Structure: the structure of UG, captured in a general formalism for grammars and their variation (OT)
⇒ Possible strong version – Explanatory Goal ①, Inherent Typology:
Analysis of phenomenon Φ in language L ⇒ universal typology of phenomenon Φ
(② Acquisition, ③ Processing, ④ Neuro-genetics to come)

From Markedness to OT

Formalizing markedness ⋯ OT
• Markedness constraints
• Faithfulness constraints
• Competition
• Strict domination
• Strong universality & Richness of the Base

Structure: Formal Result

Formalizing Markedness: Two Problems

Goal: Change the epiphenomenal explanatory status of markedness
• Markedness “explains grammars (e.g., rules)”: informal commentary about grammar
vs.
• Markedness IS grammar: markedness-grammars formally determine languages

Structure: Formal Result

Formalizing Markedness: Two Problems

Problem 1: Multidimensional integration

Each dimension of linguistic structure independently has its own marked pole, but how do these dimensions combine?

Turns out to be related to another fundamental problem:

Structure: Formal Result

Formalizing Markedness: Two Problems

“α is marked” ⇝ “Avoid α”. But when and how does “avoidance” happen?
Problem 2: Pervasive variability in “avoidance”
• Inventories: If [θ] is absent in French “because it is marked,” how can it be present in English “despite being marked”?
  ¿Does the grammar of every language simply turn on or off “No α” = *α, a markedness constraint? OT: a more subtle version that also solves:
• Alternations: If in environment E, α → β “because α is more marked than β,” how do we explain that (in another language) in E, α ↛ β “even though” α is more marked than β?

Structure: Formal Result

Formalizing Markedness
Most crudely: Why aren’t marked elements always avoided?
Something must oppose markedness forces.
Markedness cannot be the sole basis of a formal grammatical theory: it is only one half of the complete story.

Structure: Formal Result

The Great Dialectic
Phonological representations serve two masters:
• Lexical interface (the Lexicon): /underlying form/; ‘be this invariant form’ ⇒ FAITHFULNESS
• Phonetic interface (Phonetics): [surface form]; often ‘minimize effort (motoric & cognitive)’, ‘maximize discriminability’ ⇒ MARKEDNESS
Locked in eternal conflict

Structure: Formal Result

The Core Constraints of Con
MARKEDNESS: *α (“minimize effort; maximize distinctiveness”)
• “Constraint *α ∈ Con” ⇔ α meets empirical criteria for ‘marked’
• Freedom? Empirically constrained by universal patterns
FAITHFULNESS (“be this invariant form”):
• /input/ → [output] is the identity map, i.e.,
• elements /x/ and [x] are in one-to-one correspondence and identical (McCarthy & Prince ’95)
• Constraints: MAX(x), DEP(x), IDENT(x), …
• Essentially determined by the elements {x} of the representation
• Freedom? Representations, as always: empirically constrained to allow statement of markedness constraints
¿ “In OT you can invent any constraint you want” ?

Structure: Formal Result

Conflict
Dialectic: MARK vs. FAITH conflict
• Why aren’t marked elements always avoided? Because sometimes MARK is over-ruled by FAITH
• Why aren’t words always pronounced in their invariant, lexical form? Because sometimes FAITH is over-ruled by MARK
C1 over-rules (dominates) C2: C1 ≫ C2
Whether M gets violated (whether marked elements fail to ‘be avoided’) varies by
• Language (in some, M ≫ F; in others, F ≫ M)
• Context (in some, M ≫ F2; in others, F1 ≫ M)

Structure: Formal Result

Conflict
Dialectic: MARK vs. FAITH conflict
Whether M gets violated (whether marked elements fail to ‘be avoided’) varies by
• Language (in some, M ≫ F; in others, F ≫ M)
• Context (in some, M ≫ F2; in others, F1 ≫ M)
Why is there cross-linguistic variation?
• The Phonetic ~ Lexical (MARK ~ FAITH) dialectic gets resolved differently
• Typology by re-ranking: Factorial Typology
{possible human languages} ≈ {rankings of Con}
(n constraints give n! rankings; many are equivalent)
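The factorial typology computation can be made concrete with a short sketch. A minimal Python illustration (the candidate set, the constraints NOCODA, ONSET, MAX, DEP, and their violation counts are toy assumptions of mine, not an analysis from the talk): enumerate all n! rankings of Con, find each ranking's optimal output under strict domination, and group rankings by the language they generate.

    from itertools import permutations
    from math import factorial

    # Toy candidate outputs for the input /pat/, with hypothetical violation counts.
    CANDIDATES = {
        "pat":  {"NOCODA": 1, "ONSET": 0, "MAX": 0, "DEP": 0},  # faithful parse
        "pa":   {"NOCODA": 0, "ONSET": 0, "MAX": 1, "DEP": 0},  # coda deleted
        "pata": {"NOCODA": 0, "ONSET": 0, "MAX": 0, "DEP": 1},  # vowel epenthesized
    }
    CON = ["NOCODA", "ONSET", "MAX", "DEP"]

    def optimal(candidates, ranking):
        """H-Eval under strict domination: violation profiles ordered by the ranking
        compare lexicographically; the smallest profile is the most harmonic."""
        return min(candidates, key=lambda c: tuple(candidates[c][k] for k in ranking))

    # Factorial typology: one predicted output pattern per ranking; n constraints
    # give n! rankings, but many rankings are equivalent.
    typology = {}
    for ranking in permutations(CON):
        typology.setdefault(optimal(CANDIDATES, ranking), []).append(ranking)

    for output, rankings in sorted(typology.items()):
        print(f"/pat/ -> [{output}]: {len(rankings)} of {factorial(len(CON))} rankings")

Re-ranking yields exactly three 'languages' here (faithful codas, coda deletion, vowel epenthesis), illustrating how many rankings collapse onto few typological types.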

Structure: Formal Result

Formalizing Markedness

Problem 2: ‘Avoidance of the marked’ is pervasively variable; exactly where does marked material appear?
• Solution: constraint ranking (MARK ranked w.r.t. FAITH)
Will now see this also solves:
Problem 1: Multidimensional markedness
• Solution: a single constraint ranking for all constraints in a given language

Structure: Formal Result

Formalizing Markedness
Markedness is multidimensional
• Each dimension has its universally marked pole
• How do dimensions combine? (M1, *M2) vs. (*M1, M2)
  CV́C.CV (STRESSHEAVY, *MAINSTRESSRIGHT) vs. CVC.CV́
• Integrate via a common markedness currency: Harmony
  Numerical: *M1 = 3.2; *M2 = 2.8
  Symbolic: *M1 absolutely worse than *M2 (see below)
OT:
• For a given language, there is a single constraint ranking for all constraints
• Strict domination hierarchy: markedness on higher-ranked constraints can never be compensated for by unmarkedness on lower-ranked ones

Structure: Formal Result

Competition for Optimality
Given an input, an OT grammar does not provide a procedure for how to construct the output, but rather a description of the output: the structure that best-satisfies the constraint ranking.
Best-satisfies is a comparative criterion: outputs compete, and the grammar identifies the winner, the optimal (grammatical, highest-Harmony) output for that input.

Structure: Formal Result

Harmonic Competition: Numerical Harmony

Stress is on the initial heavy syllable iff the number of light syllables n obeys
n · w(MAINSTRESSRIGHT) < w(STRESSHEAVY), i.e., n < w(STRESSHEAVY) / w(MAINSTRESSRIGHT)
Pathological grammars: the winner depends on the count n. “Grammars can’t count.”

Candidates          STRESSHEAVY   MAINSTRESSRIGHT   Harmony
a. σ́H σ … σ σ                      * * … * (n)       −n · w(MAINSTRESSRIGHT)
b. σH σ … σ σ́        *                               −w(STRESSHEAVY)

Under strict domination, STRESSHEAVY ≫ MAINSTRESSRIGHT, any number of violations of the lower-ranked constraint is tolerated:

Candidates          STRESSHEAVY   MAINSTRESSRIGHT
a. ☞ σ́H σ … σ σ                    * * … * (n)
b.   σH σ … σ σ́      *!

Structure: Formal Result

Harmonic Competition: Symbolic Harmony (Strict domination)
• STRESSHEAVY ≫ MAINSTRESSRIGHT: stress the initial heavy syllable
• MAINSTRESSRIGHT ≫ STRESSHEAVY: stress the final syllable

Candidates          MAINSTRESSRIGHT   STRESSHEAVY
a.   σ́H σ … σ σ      * * … *! (n)
b. ☞ σH σ … σ σ́                        *

Strict domination: “Grammars can’t count.”
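To make the contrast concrete, here is a tiny numerical illustration (the weights 3.2 and 1.0 are arbitrary assumptions, not values from the talk) of why weighted Harmony “counts” while strict domination does not:

    # Candidate a stresses the initial heavy syllable, violating MAINSTRESSRIGHT n times;
    # candidate b stresses the final syllable, violating STRESSHEAVY once.
    W_STRESSHEAVY, W_MAINSTRESSRIGHT = 3.2, 1.0   # hypothetical weights

    def weighted_winner(n):
        h_a = -n * W_MAINSTRESSRIGHT      # numerical Harmony of candidate a
        h_b = -W_STRESSHEAVY              # numerical Harmony of candidate b
        return "a (initial heavy)" if h_a > h_b else "b (final)"

    def strict_winner(n):
        # Strict domination, STRESSHEAVY >> MAINSTRESSRIGHT: compare violation
        # profiles lexicographically; the count n can never tip the balance.
        return "a (initial heavy)" if (0, n) < (1, 0) else "b (final)"

    for n in (1, 2, 3, 4, 10):
        print(n, weighted_winner(n), strict_winner(n))
    # The weighted winner flips from a to b once n exceeds 3.2; the strict-domination
    # winner is a for every n.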

Structure: Formal Result

OT: ‘Formal’ definition
Gen: specifies candidate outputs for any given input
Con: the constraint set
A grammar: a hierarchical ranking of Con
H-Eval: given two candidates and a ranking, a formal definition, employing strict domination, of which has higher Harmony (which better-satisfies the ranking)
I → O mapping: I → the maximal-Harmony candidate[s] in Gen(I)
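A sketch of H-Eval as pairwise comparison under strict domination (my own toy encoding of candidates as dicts of violation counts, not an implementation from the talk). Ordering violation counts by the ranking and comparing them lexicographically is exactly strict domination: the highest-ranked constraint on which two candidates differ decides, regardless of anything lower.

    def h_eval(cand1, cand2, ranking):
        """Return whichever candidate better-satisfies the ranking (higher Harmony)."""
        for constraint in ranking:
            v1 = cand1["violations"].get(constraint, 0)
            v2 = cand2["violations"].get(constraint, 0)
            if v1 != v2:                      # first point of difference decides
                return cand1 if v1 < v2 else cand2
        return cand1                          # equal on every constraint: equally harmonic

    a = {"form": "initial heavy stressed", "violations": {"MAINSTRESSRIGHT": 3}}
    b = {"form": "final stressed",         "violations": {"STRESSHEAVY": 1}}

    print(h_eval(a, b, ["STRESSHEAVY", "MAINSTRESSRIGHT"])["form"])  # initial heavy stressed
    print(h_eval(a, b, ["MAINSTRESSRIGHT", "STRESSHEAVY"])["form"])  # final stressed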

Structure: Formal Result

Richness of the Base
Universality: all systematic cross-linguistic variation arises from differences in constraint ranking.
Therefore:
• Con is universal; H-Eval is universal
• Gen is universal, including the space of possible inputs as well as possible outputs
i.e.: No systematic cross-linguistic variation is due to differences in inputs
e.g.: Languages with no surface codas cannot get this property from limitations on the lexicon (e.g., a morpheme-structure constraint *C]Wd), but rather from the ranking
i.e.: The grammar must have the property that even if there were C-final inputs, there would still be no surface codas

Aside

Richness of the Base is a principle for inducing a grammar (generalizing) from a set of grammatical items

It can be justified by the central principle of John Goldsmith’s presentation: Maximize the probability of the data

Structure: Conceptual “Question”

Explanatory Power
“OT is as unexplanatory as extrinsically-ordered rule-theory”
“Stipulating ranking ~ stipulating ordering”
Actually, OT achieves Explanatory Goal ①, Inherent Typology: in the analysis of phenomenon Φ in one language is inherent a typology of Φ in all languages

Structure: Explanatory Goal

Inherent Typology

Structure: Conceptual “Question”

Analytic Restrictiveness
“You can make up any constraint you want in OT”
Actually, in OT, positing a constraint C in the analysis of a language L necessarily has a huge number of empirically falsifiable implications (one consequence of Inherent Typology)
E.g., two pervasive patterns generated by ‘C ∈ Con’

Structure: Explanatory Goal

Robust Falsifiability

Structure: Explanatory Goal

Consequences of ‘C ∈ Con’ – I: The Subordination Pattern
E.g., C = NOCODA
Recall:
• If ‘No codas’ is in UG, why do codas ever appear?
• Conflict
  – with faithfulness constraints
  – with other markedness constraints (other dimensions of markedness)
Cross-linguistic variation: codas are less and less restricted as NOCODA is subordinated to more and more conflicting constraints (i.e., dimensions of markedness)

Structure: Empirical Application

Subordination Pattern: Codas
• No codas at all (NOCODA undominated)
• Codas only in stressed syllables (STRESS-TO-WEIGHT ≫ NOCODA)
• … + geminate codas (MAXμ ≫ NOCODA)
• Codas unrestricted, except prohibited inter-vocalically [~V.CV~] (MAX ≫ NOCODA)

Structure: Conceptual “Question”

Multiplicity of Constraints
For the second pervasive pattern generated by ‘C ∈ Con’:
“Any framework which leads to the morass of constraints found in OT analyses in phonology cannot possibly be explanatorily adequate.”
Actually, OT interaction-via-domination replaces many rules by fewer constraints

Structure: Explanatory Goal

Factorial Interaction

Structure: Explanatory Goal

Consequences of ‘C ∈ Con’ – II: Factorial Interaction
‘Factorial interaction’: with varying interaction (re-ranking), n simple modular constraints correspond to
• a multiplicity of rules (many more than n)
• complex, non-modular rules
• rules + representational/notational tricks
• rules + constraints
E.g., C = NOCODA

Structure: Empirical Application

Factorial Interaction: Codas
Consider Con: {MAX} ↪ {MAX, DEP}
The number of constraints increases by 1; the number of corresponding rules doubles, as the set of ‘repairs’ now includes epenthesis as well as deletion:
NOCODA ≫ MAX ~ C → Ø / —σ]
↪ NOCODA ≫ DEP ~ Ø → V / Cσ]—
ONSET ≫ MAX ~ V → Ø / [σ—
↪ ONSET ≫ DEP ~ Ø → C / [σ—V

Structure: Empirical Application

Factorial Interaction: Codas
MARKEDNESS ≫ FAITHFULNESS

                      MARKEDNESS
FAITHFULNESS          NOCODA               ONSET
MAX                   C → Ø / —σ]          V → Ø / [σ—
DEP                   Ø → V / Cσ]—         Ø → C / [σ—V

In general, the number of comparable rules increases much faster than the number of constraints

Structure: Explanatory Goal

Consequences of ‘C ∈ Con’ – II: Factorial Interaction
‘Factorial interaction’: with varying interaction (re-ranking), n simple modular constraints correspond to
• a multiplicity of rules (many more than n)
• complex, non-modular rules
• rules + representational/notational tricks
• rules + constraints
E.g., C = NOCODA

Structure: Empirical Application

Factorial Interaction: Codas
STRESS-TO-WEIGHT ≫ NOCODA
• Codas only in stressed syllables
• C → Ø / —σ̆] : a segmental rule sensitive to foot structure [‘non-modular rules’]
ANCHOR-R ≫ NOCODA
• Codas only word-finally
• C → Ø / —σ] plus final-C extrametricality [‘representational trick’]
MAXμ ≫ NOCODA
• Only geminate codas: /Cμ/
• C → Ø / —σ] plus Hayes’ exclusivity of association [‘notational trick’]

Structure: Empirical Application

Factorial Interaction
STRESS-TO-WEIGHT ≫ NOCODA: codas only in stressed syllables
• STRESS-TO-WEIGHT ≫ *Cμ: geminates only after stressed V
• μ → Ø / —σ̆]
ANCHOR-R ≫ NOCODA: codas only word-finally
• ANCHOR-R ≫ *[+voi, −son]: obstruent devoicing except word-finally
• [+voi] → [−voi] / [—, −son] plus ?? to block word-finally
MAXμ ≫ NOCODA: only geminate codas; /Cμ/
• MAXμ ≫ WEIGHT-TO-STRESS: geminates are the only codas in unstressed syllables
• C → Ø / —σ̆] plus exclusivity of association

Structure: Jakobson’s Program
Markedness + Faithfulness = Harmony
In summary, Jakobson’s key insight concerning linguistic structure: the central organizing principle of grammar is Minimize Markedness.
OT formalizes this as Maximize Harmony.
OT formalizes Markedness via violable constraints.
OT adds the crucial notion of Faithfulness, the other (lexical) half of the phonological dialectic.
OT Harmony combines Markedness with Faithfulness; their conflict is adjudicated via ranking.
Ranking unifies multiple dimensions of markedness.

Structure: Summary

OT achieves the explanatory goals of
• Changing the epiphenomenal status of markedness in grammatical theory: markedness is now in grammar, not about grammar
• A strongly universalist formalism exhibiting Inherent Typology
• Robust falsifiability

Responsibilities of Grammatical Theory

Chomsky’s “Big 4” questions concerning knowledge of grammar:
① Structure
② Acquisition
③ Processing
④ Neuro-genetics
(Diagram: OT addresses ①; the nativist hypothesis links ① and ②.)
⇒ Possible strong version – Explanatory Goal ②, General Learning Theory:
Substantive structure (①) of a UG module governing phenomenon Φ ⇒ acquisition theory (initial state, learning algorithm) for phenomenon Φ

Acquisition: Formal Result I

Learning Theory
Learning algorithm
• Provably correct and efficient (when part of a general decomposition of the grammar-learning problem)
• Sources: Tesar 1995 et seq.; Tesar & Smolensky 1993, …, 2000
  ** See for how to exploit the analogy to ‘weighted OT’ (Goldsmith, today)
• Constraint Demotion: If you hear A when you expected to hear E, increase the Harmony of A above that of E by minimally demoting each constraint violated by A below a constraint violated by E

/in+possible/

Candidates              FAITH   MARK (NPA)
☹ ☞ E: inpossible                  *
     A: impossible        *

If you hear A when you expected to hear E, increase the Harmony of A above that of E by minimally demoting each constraint violated by A below a constraint violated by E:

Candidates              MARK (NPA)   FAITH
     E: inpossible          *
☺ ☞ A: impossible                      *

Correctly handles the difficult case: multiple violations in E

Acquisition: Formal Result I

Constraint Demotion Algorithm
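A minimal sketch of error-driven Constraint Demotion in Python (one simple variant of the Tesar & Smolensky procedure; the stratified-ranking data structure and the binary violation sets are my own simplifications):

    def constraint_demotion(strata, winner_viols, loser_viols):
        """Demote every constraint violated by the winner A (the heard form) that is
        ranked at or above the highest-ranked constraint violated only by the loser E
        (the form the current grammar expected), placing it one stratum below.

        strata: list of sets of constraint names, highest-ranked stratum first.
        winner_viols / loser_viols: sets of violated constraints (binary, for simplicity)."""
        loser_marks = loser_viols - winner_viols
        target = next(i for i, s in enumerate(strata) if s & loser_marks)
        new = [set(s) for s in strata] + [set()]          # spare stratum at the bottom
        demoted = set()
        for i in range(target + 1):                       # strata at or above the target
            demoted |= new[i] & winner_viols
            new[i] -= winner_viols
        new[target + 1] |= demoted
        return [s for s in new if s]

    # The /in+possible/ example: the learner starts with FAITH >> MARK(NPA), hears
    # A = [impossible] where its grammar expected E = [inpossible].
    ranking = [{"FAITH"}, {"MARK(NPA)"}]
    ranking = constraint_demotion(ranking, winner_viols={"FAITH"}, loser_viols={"MARK(NPA)"})
    print(ranking)   # [{'MARK(NPA)'}, {'FAITH'}] -- now the heard form A is optimal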

Acquisition: Conceptual “Question”

Large Grammar Space
“Huge number of grammars”; “OT is too unrestrictive”

Acquisition: Explanatory Goal

General Learning Theory
Actually, OT achieves Explanatory Goal ②, General Learning Theory: a theory-general, UG-informed learning algorithm, provably correct and efficient (under strong assumptions)

Acquisition: Formal Result II

Learnability & the Initial State
M ≫ F is learnable with /in+possible/ → [impossible]
• ‘not’ = in- except when followed by …
• “exception that proves the rule”: M = NPA
M ≫ F is not learnable from data if there are no ‘exceptions’ (alternations) of this sort, e.g., if there are no affixes and all underlying morphemes have [mp]: M and F never conflict, so there is no evidence for their ranking
Thus the initial state ℌ0 must have M ≫ F

Acquisition: Empirical Application

Initial State: Experimental Test
Collaborators: Peter Jusczyk, Theresa Allocco (Elliott Moreton, Karen Arnold)
Here, only a thumbnail sketch (more in the OT Workshop Thursday)

Acquisition: Empirical Application

Initial State: Experimental Test
Linking hypothesis: more harmonic phonological stimuli ⇒ longer listening time
More harmonic:
• M ≻ *M, when equal on F
• F ≻ *F, when equal on M
• When one must choose one or the other, it is more harmonic to satisfy M: M ≫ F
M = Nasal Place Assimilation (NPA)
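The predictions can be spelled out with a toy scoring of the stimulus types (the violation assignments below are my own illustrative encoding of the slide's logic, not the experimental materials' actual analysis):

    # Score each ...X...Y...XY-type stimulus for violations of M = NPA (nasal place
    # assimilation) and of Faithfulness to the nasal heard in the first part.
    STIMULI = {
        "um...ber...umber": {"NPA": 0, "FAITH": 0},   # assimilated and faithful
        "un...ber...umber": {"NPA": 0, "FAITH": 1},   # assimilated, unfaithful to 'un'
        "un...ber...unber": {"NPA": 1, "FAITH": 0},   # faithful but unassimilated
    }
    RANKING = ["NPA", "FAITH"]     # the hypothesized initial state: M >> F

    def profile(name):
        return tuple(STIMULI[name][c] for c in RANKING)

    # Smaller violation profile (lexicographically, under the ranking) = higher Harmony.
    for name in sorted(STIMULI, key=profile):
        print(name, profile(name))
    # Predicted Harmony order: um...ber...umber > un...ber...umber > un...ber...unber,
    # hence, by the linking hypothesis, the same order of listening times.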

[Bar chart, Faithfulness comparison: 4.5 months (NPA), listening time (sec). Higher Harmony um…ber…umber: 15.36 s; lower Harmony um…ber…iŋgu: 12.31 s; p = .006 (11/16).]
Acquisition: Empirical Application

[Bar chart, Markedness comparison: 4.5 months (NPA), listening time (sec). Higher Harmony um…ber…umber: 15.23 s; lower Harmony un…ber…unber: 12.73 s; p = .044 (11/16).]
Acquisition: Empirical Application

[Bar chart, critical M ≫ F comparison: 4.5 months (NPA), listening time (sec). un…ber…umber (satisfies Markedness, violates Faithfulness) vs. un…ber…unber (satisfies Faithfulness, violates Markedness): which has higher Harmony? ???]
Acquisition: Empirical Application

[Bar chart, M ≫ F comparison: 4.5 months (NPA), listening time (sec). Higher Harmony un…ber…umber: 16.75 s; lower Harmony un…ber…unber: 14.01 s; p = .001 (12/16).]
Acquisition: Empirical Application

Acquisition: Jakobson’s Program
Markedness = Distance from the Initial State
X is universally more marked than Y ~ in addition to the constraints M1, M2, …, Mk violated by Y, X also violates markedness constraints Mk+1, …, Mn
Y will be acquired (become admitted into the child’s inventory) after M1, M2, …, Mk are all demoted below the relevant faithfulness constraints
These demotions are all necessary for X to be acquired, and additional demotions of Mk+1, …, Mn are also required ~
X will require more time to be acquired

Responsibilities of Grammatical Theory

Chomsky’s “Big 4” questions concerning knowledge of grammar:
① Structure
② Acquisition
③ Processing
④ Neuro-genetics
(Diagram: OT addresses ①; the nativist hypothesis links ① and ②.)
⇒ Possible strong version – Explanatory Goal ③, General Processing Theory:
Substantive structure (①) of a UG module governing phenomenon Φ ⇒ processing theory (e.g., a parsing algorithm) for phenomenon Φ

Processing: Formal Results

Context-Free Parsing Algorithm
Theorem (Tesar 1994, 1995b, a, 1996). Suppose
• Gen parses a string of input symbols into structures specified via a context-free grammar
• Con constraints meet a tree-locality condition and penalize empty structure
Then a given dynamic programming algorithm is
• Left-to-right
• General (any such Gen, Con)
• Guaranteed to find the optimal outputs
• As efficient as parsers for conventional context-free grammars.
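Tesar's theorem concerns a context-free Gen with local constraints; the following is only a toy left-to-right dynamic-programming sketch in the same spirit, with an assumed miniature Gen (syllables of shape V, CV, VC, CVC) and the ranking ONSET ≫ NOCODA. Violation vectors are combined additively over syllables and compared lexicographically, so the minimization respects strict domination.

    from functools import lru_cache

    RANKING = ("ONSET", "NOCODA")                 # ONSET >> NOCODA
    SYLLABLE_SHAPES = ("V", "CV", "VC", "CVC")    # toy Gen: assumed syllable shapes

    def violations(shape):
        """Violation vector of one syllable, ordered by RANKING."""
        return (0 if shape.startswith("C") else 1,     # ONSET: syllable lacks an onset
                1 if shape.endswith("C") else 0)       # NOCODA: syllable has a coda

    def parse(string):
        """Optimal syllabification of a C/V string by memoized dynamic programming."""
        INF = (float("inf"),) * len(RANKING)

        @lru_cache(maxsize=None)
        def best(i):
            if i == len(string):
                return (0,) * len(RANKING), []
            options = [(INF, None)]                    # dead end: no parse from here
            for shape in SYLLABLE_SHAPES:
                if string[i:i + len(shape)] == shape:
                    tail_v, tail_parse = best(i + len(shape))
                    if tail_parse is not None:
                        v = tuple(a + b for a, b in zip(violations(shape), tail_v))
                        options.append((v, [shape] + tail_parse))
            return min(options)                        # lexicographic = strict domination
        return best(0)

    print(parse("CVCVC"))   # ((0, 1), ['CV', 'CVC'])
    print(parse("VCVC"))    # ((1, 1), ['V', 'CVC'])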

Processing: Formal Results

Finite-State Parsing Algorithm
Theorem (Ellison 1994). Suppose
• Gen(I) is representable as a (non-deterministic) finite-state transducer (particular to I) mapping the input string to a set of output candidates
• Con constraints are reducible to multiply-violable binary constraints, each representable as a finite-state transducer mapping an output candidate to a sequence of violation marks
Then composing the Gen(I) and rank-sequenced constraint transducers yields a transducer that
• Directly maps I to its optimal outputs
• Can be efficiently pruned by dynamic programming

Processing: Formal Results

Complexity of Violable Constraints
Theorem (Frank and Satta 1998). Suppose
• Gen is representable as a (non-deterministic) finite-state transducer mapping an input string to a set of output candidates
• Con: the set of structures incurring n violations of each constraint is generable by a finite-state machine, and n can be finitely bounded for each constraint
Then the mapping from inputs to optimal outputs has the complexity of a finite-state transducer.
Theorem (Hiller 1996, Smolensky 1997). If n is unbounded, there are (extremely simple) OT grammars with greater computational complexity.

Processing: Conceptual “Question”

Processing (Symbolic): Theory
“Infinite candidate set ⇒ uncomputable”

Actually, achieves Explanatory Goal ③ (computational)

Processing: Conceptual “Question”

Processing (Symbolic): Theory
⇒ Explanatory Goal ③, General Processing Theory:
Substantive structure (①) of a UG module governing phenomenon Φ ⇒ processing theory (e.g., a parsing algorithm) for phenomenon Φ

Processing: Empirical Application

Sentence Processing
Because an OT grammar assigns a parse to any input, no additional principles (e.g., ‘parsing heuristics’) are needed for parsing the initial, incomplete segment of a sentence.
Linking hypothesis: processing difficulty arises when previously established structure needs to be abandoned in the face of further input.

Processing: Empirical Application

PP Attachment
The servant of the actress who… (Cuetos & Mitchell 88)
[Assuming who is ambiguous for Case.]
[Tree: [NP [NP the servant] [PP of [NP the actress [+gen]]]] … who]
Candidate analyses:
• who [+nom], attached high (to the servant): violates *NOM, LOCALITY2
• who [+nom], attached low (to the actress): violates *NOM, AGRCASE
• who [+gen], attached low: violates *GEN
Constraints:
• LOCALITY: If XP c-commands YP, then XP precedes YP.
• AGRCASE: A relative pronoun must agree in Case with the modified NP.
• *CASE: *GEN ≫ *DAT ≫ *ACC ≫ *NOM (universal)

Processing: Empirical Application

PP Attachment
The servant of the actress who… (Cuetos & Mitchell 88)
• If *GEN, AGRCASE ≫ LOCALITY2, then who [+nom] attaches high
• If LOCALITY2 ≫ *GEN or AGRCASE, then who attaches low (as [+gen] or [+nom])
[Same tree and candidate violations as above: high attachment of who [+nom] violates *NOM, LOCALITY2; low attachment of who [+nom] violates *NOM, AGRCASE; low attachment of who [+gen] violates *GEN.]
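The two-way typology can be checked mechanically. A small sketch (the violation assignments follow the slide; their encoding as Python dicts is mine):

    # Three candidate analyses of "the servant of the actress who...", with the
    # constraint violations listed on the slide.
    CANDIDATES = {
        "who[+nom], attach high": {"*GEN": 0, "AGRCASE": 0, "LOCALITY2": 1, "*NOM": 1},
        "who[+nom], attach low":  {"*GEN": 0, "AGRCASE": 1, "LOCALITY2": 0, "*NOM": 1},
        "who[+gen], attach low":  {"*GEN": 1, "AGRCASE": 0, "LOCALITY2": 0, "*NOM": 0},
    }

    def winner(ranking):
        """Optimal candidate under strict domination of the given ranking."""
        return min(CANDIDATES, key=lambda c: tuple(CANDIDATES[c][k] for k in ranking))

    print(winner(["*GEN", "AGRCASE", "LOCALITY2", "*NOM"]))   # attach high
    print(winner(["LOCALITY2", "*GEN", "AGRCASE", "*NOM"]))   # attach low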

Processing: Empirical Application

PP Attachment
Preliminary result: a cross-linguistic typology of PP attachment patterns (across differences in case and embedding depth)
Empirically promising, but not perfect
Unclear yet how the rankings determining parsing preferences relate to rankings in the pure ‘competence grammar’

Processing: Jakobson’s Program

Processing and Markedness
Phonological analogy: incrementally parse C…V…C…
• /C/ → [C (parsed as an onset)
• /CV/ → [CV]
• /CVC/ → [CV][C (the final C again parsed as an onset)
Now ‘expect’ a V; if we get it, no ‘reanalysis’
• But if we get a C instead, reanalysis ⇒ difficulty:
• /CVCC/ → [CVC][C
Processing marked material (a coda C) creates difficulty because it is initially analyzed as unmarked (as an onset)

Processing: Conceptual “Question”

Processing (Symbolic): Theory
“OT is not psychologically plausible”
Actually, OT achieves Explanatory Goal ③ (empirical perspective): a competence theory automatically entails an empirically fruitful performance (processing) theory


Responsibilities of Grammatical Theory

Chomsky’s “Big 4” questions concerning knowledge of grammar:
① Structure
② Acquisition
③ Processing
④ Neuro-genetics
(Diagram: OT addresses ①; the nativist hypothesis links ① and ②.)
⇒ Possible strong version – Explanatory Goal ④, General Biological Realization:
Substantive structure (①) of a UG module M ⇒ neural network instantiating M (nativism: with genetic encoding)

Neuro-genetics: Formal Results

Neural Representations (Gen)
Activation patterns: cat and its constituents
[Figure: unit activation levels (area = activation level) realizing the structure [σ k [æ t]] and its constituents k/r0, æ/r01, t/r11, σ/rε]
A structure is a set of filler/role bindings {fi/ri}, realized as the activation pattern Σi fi ⊗ ri

OT & Connectionism

OT derives from the numerical formalism, itself derived from connectionist Harmony maximization, of
• Harmonic Grammar (Legendre, Miyata, & Smolensky, 1990)

Neuro-genetics: Formal Results

Neural Constraints (Con)
NOCODA: A syllable has no coda
[Figure: the weight matrix W implementing NOCODA; the pattern a[σ k [æ t]] realizing [σ k [æ t]] incurs a * violation]
H(a[σ k [æ t]]) = –sNOCODA < 0
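How a violable constraint becomes a Harmony penalty in a network can be sketched numerically (this toy unit set and weight matrix are assumptions of mine, not the architecture presented in the talk): the constraint contributes a negative term to H(a) = aᵀWa whenever the units realizing a coda and its syllable are co-active.

    import numpy as np

    units = ["syllable", "onset_C", "nucleus_V", "coda_C"]
    s_NOCODA = 2.0

    # Symmetric weight matrix: one negative connection between "syllable" and "coda_C",
    # so co-activating them (with 0/1 activations) lowers Harmony by s_NOCODA.
    W = np.zeros((4, 4))
    i, j = units.index("syllable"), units.index("coda_C")
    W[i, j] = W[j, i] = -s_NOCODA / 2.0

    def harmony(a):
        return a @ W @ a                       # H(a) = a^T W a

    cv  = np.array([1.0, 1.0, 1.0, 0.0])       # [σ k æ]   : no coda
    cvc = np.array([1.0, 1.0, 1.0, 1.0])       # [σ k æ t] : coda present
    print(harmony(cv))    # 0.0
    print(harmony(cvc))   # -2.0 = -s_NOCODA < 0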

Neuro-genetics: Formal Results

UGenome for CV Theory
The game: take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device
¿ Proteins ⇝ universal grammatical principles ?
Case study: Basic CV Syllable Theory
Introduce an ‘abstract genome’ notion parallel to (and encoding) the ‘abstract neural network’
Collaborators: Melanie Soderstrom, Donald Mathis

Neuro-genetics: Formal Results

Network Architecture
/C1 C2/ → [C1 V C2]
[Figure: network units for the input /C1 C2/ and the output [C1 V C2]]

Neuro-genetics: Formal Results

PARSE
[Figure: the PARSE subnetwork over C and V input, output, and correspondence units; connection pattern labeled 1 and 3 in the diagram]
All connection coefficients are +2

Neuro-genetics: Formal Results

ONSET
[Figure: the ONSET subnetwork over C and V units]
All connection coefficients are 1

Neuro-genetics: Formal Results

Connectivity geometry
Assume 3-d grid geometry
[Figure: C and V units arranged on a 3-d grid, with directions labeled ‘E’, ‘N’, and ‘back’]

Neuro-genetics: Formal Results

Constraint: PARSE
[Figure: the PARSE subnetwork laid out on the 3-d grid]
Input units grow south and connect
Output units grow east and connect
Correspondence units grow north & west and connect with input & output units

Neuro-genetics: Formal Results

Connectivity Genome
Contributions from ONSET and PARSE:
[Table: for each source unit type (CI, VI, CO, VO, CC, VC, x0), its projections, each given as a (Direction, Extent, Target) triple; the projections listed are S L CC, S L VC, E L CC, E L VC, N&S S VO, N S x0, N L CI, W L CO, N L VI, W L VO, S S VO]
Key: Direction = N(orth), S(outh), E(ast), W(est), F(ront), B(ack); Extent = L(ong), S(hort); Target = a unit type
Unit types: Input: CI, VI; Output: CO, VO, x(0); Correspondence: VC, CC

Neuro-genetics: Formal Results

Processing
[Equations: protein expression levels [P1] ∝ s1 (the constraint strength), receptor quantities R1, R2, and the resulting weight matrix W = [wij]]

Neuro-genetics: Formal Results

Learning
[Equations: when the relevant pre- and post-synaptic units are simultaneously active, products of the proteins P, K, L, G adjust the corresponding connection, during phase P+; reversed during P−]

Neuro-genetics: Formal Results

Learning Behavior
A simplified system can be solved analytically
The learning algorithm turns out to ≈ adjusting each constraint strength si by the number of violations of constraint i (computed over the phases P±)

Conclusion

OT is enabling progress on several explanatory goals for linguistic theory

• Inherent typology
• General learning theory
• General processing theory
• General biological realization
Often, OT formalizes Jakobson’s program

Thank you for your attention