48
uofclogo Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion Grammatical Correlates of Cross-linguistic Frequency: Quantity-Insensitive Stress Max Bane [email protected] Department of Linguistics / Chicago Language Modeling Lab University of Chicago http://clml.uchicago.edu January 23, 2009 Council on Advanced Studies Workshop on Language, Cognition, and Computation

Grammatical Correlates of Cross-linguistic Frequency ...clml.uchicago.edu/~max/typfreq_cas2009_slides.pdfGrammatical Correlates of Cross-linguistic Frequency: Quantity-Insensitive

  • Upload
    vokhue

  • View
    236

  • Download
    1

Embed Size (px)

Citation preview

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Grammatical Correlates of Cross-linguisticFrequency: Quantity-Insensitive Stress

Max [email protected]

Department of Linguistics / Chicago Language Modeling LabUniversity of Chicago

http://clml.uchicago.edu

January 23, 2009Council on Advanced Studies

Workshop on Language, Cognition, and Computation

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

A characteristic property of Language

The distribution of linguistic properties is very uneventypologically.Examples

Sound inventories921 distinct phonemes found in a sample of 451 languages;average language uses only about 30 (Maddieson 1984).Some sounds extremely common (≈universal): [m], [k];others extremely rare: [K], [œ]

Stress patterns26 distinct QI stress patterns in a sample of 306 languages(Heinz 2007).But over 60% of the languages use one of just 3 patterns.

Morphosyntactic, semantic properties

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Previous typological research

The goal of most typological studies is to construct atheory or model that predicts:

As many attested patterns/languages as possible, andAs few unattested patterns/languages as possible

The “inclusion-exclusion” criterion. (cf. precision/recall)Few attempt the additional (harder) task of predicting:

The typological frequencies of attested patterns

(Though see, e.g., Liljencrants & Lindblom 1972, de Boer2000, and others)

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

This talk

Focus on the typology of quantity-insensitive (QI) stresssystems, as collected by Bailey (1995), Gordon (2002),and combined by Heinz (2007).Question: how do the typological predictions of threedifferent models of QI stress relate to the set of attestedsystems and their cross-linguistic frequencies?

Answer: attestedness and frequency are correlated withthree things:

the n-gram entropy of a stress pattern,its “confusability” with other predicted patterns (for at leastone model),the number of parameter settings that specify it in themodels.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

This talk

Focus on the typology of quantity-insensitive (QI) stresssystems, as collected by Bailey (1995), Gordon (2002),and combined by Heinz (2007).Question: how do the typological predictions of threedifferent models of QI stress relate to the set of attestedsystems and their cross-linguistic frequencies?Answer: attestedness and frequency are correlated withthree things:

the n-gram entropy of a stress pattern,its “confusability” with other predicted patterns (for at leastone model),the number of parameter settings that specify it in themodels.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Assumptions, definitions

Stress patternAny “culminative” accentual system; there is one mostprominent accentual unit.Any given unit may bear primary or secondary stress;exactly one primary stressed unit per prosodic word.

I assume that the stress-bearing unit is the syllable.Quantity-insensitive stress pattern

The assignment of stresses to a word’s syllables dependsonly on the number of syllables, not on the contents of thesyllables.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Examples

Notation: σ = unstressed syllable, σ́ = primary stressedsyllable, σ̀ = secondary stressed syllable.Penultimate primary stress (Mohawk, Albanian, . . . )

2 syl. word: σ́σ3 syl. word: σσ́σ4 syl. word: σσσ́σ5 syl. word: σσσσ́σ. . .

Even-numbered from right, leftmost primary (Malakmalak)2 syl. word: σ́σ3 syl. word: σσ́σ4 syl. word: σ́σσ̀σ5 syl. word: σσ́σσ̀σ6 syl. word: σ́σσ̀σσ̀σ. . .

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

The stress typology

Heinz’s (2007) dissertation:Combines two previous typologies by Bailey (1995) andGordon (2002), collected from other studies and primarysource grammars.Samples a total of 422 languages with stress, 306 of whichhave quantity-insensitive systems.Sample chosen to balance representation of majorlanguage stocks.Caveats:

Only “dominant” stress patterns represented.Lexical expcetions, subregularities not included.Some may contribute multiple stress patterns. E.g., Lenakelnouns vs Lenakel verbs — counts as two “languages.”

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

The stress typology

Between the 306 languages with QI stress, there are 26 distinctstress patterns, distributed as follows:

L076 L118 L004 L132 L110 L044 L008 L022 L143 L065 L054 L095 L040 L077 L071 L033 L113 L037 LXX1 L047 L042 L089 LXX2 LXX3 L082 L084

0.00

0.05

0.10

0.15

0.20

0.25

Frequencies of Attested Stress Patterns

Stress Pattern

Fre

quen

cy

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

The stress typology

A very skewed distribution (power law? Gauss-Newtonregression to Zipf’s law: R2 = .809,p < 0.001).The top three most common patterns, togetherrepresenting a majority of languages surveyed:

Fixed final stress (24.2% of systems)Fixed initial stress (22.5% of systems)Fixed penultimate stress (19.6% of systems)

Of N = 306 sampled languages, n1 = 13 have patternsattested only once.Good-Turing estimate (Good 1953):

We should reserve about n1/N = 4.3% ofprobability/frequency-mass for unseen patterns.⇒ a fairly exhaustive sample.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Modeling QI stress: Overview

Three models of QI stressGordon’s (2002) OT modelA dynamic linear modelA weighted (“harmonic grammar”) version of Gordon’smodel

All three overgenerate and undergenerate“Overgeneration” = predicting unattested patterns“Undergeneration” = not predicting attested patternsJust worrying about overgeneration today

Within the predictions of each model:What separates the attested patterns from the unattested?What separates the cross-linguistically frequent from theinfrequent?

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Gordon’s (2002) model

Gordon presents a model of QI stress that aims to includeas many attested languages, and exclude as manyunattested, as possible.Optimality Theoretic.

12 constraints (plus 1 “meta” constraint) on the assignmentof stress to syllables.A grammar is a ranking of the constraintsFor each word length (n = 2, . . . ,8 syllables), choose thestress assignment (∈ {σ, σ́, σ̀}n) that best satisfies thehighest ranked constraints.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Constraint(s) Penalizes. . .ALIGNEDGE each edge of the word with no stressALIGN(x1, L/R) each stressed syllable for each other syllable be-

tween it and the left/right edgeALIGN(x2, L/R) each primary stressed syllable for each sec-

ondary stressed syllable between it and theleft/right edge

NONFINALITY the last syllable if it is stressed*LAPSE each adjacent pair of unstressed syllables*CLASH each adjacent pair of stressed syllables*EXTLAPSE each occurrence of three consecutive unstressed

syllables*LAPSELEFT/RIGHT the left/right-most syllable if more than one un-

stressed syllable is between it and the left/rightedge

*EXTLAPSERIGHT the right-most syllable if more than two un-stressed syllables are between it and the rightedge.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Gordon’s (2002) model

Meta constraint:Only one of ALIGN(x2, L), ALIGN(x2, R) is active at once.⇒ two sub-models: one without ALIGN(x2, L), one withoutALIGN(x2, R)Reason: preserves an apparent universal property of stresssystems: secondary stress always appears to one side (leftor right) of primary stress, without vascillating back andforth across word lengths.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Gordon’s (2002) model

Typological predictionsEach sub-model allows 11! possible grammars.2 · 11! = 79,833,600Multiple constraint-rankings may, on the surface, specify thesame QI stress pattern.Each sub-model gives some number of possible patterns.Typological predictions = union of sub-models’ possiblepatterns.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Gordon’s (2002) model

The 2 · 11! possible constraint-rankings (grammars)correspond to only 152 distinct QI stress patterns (lookingat words up to 8 syllables long).

Computed by finite-state methods (Riggle 2004, et seq).

24 of the 26 attested patterns are predicted by Gordon’smodel.

Undergenerates by 26− 24 = 2 patterns (those of Bhojpuriand Içuã Tupi)Overgenerates by 152− 24 = 128 patterns.

Has the tightest inclusion/exclusion fit of the three models.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Gordon’s (2002) model

The 2 · 11! possible constraint-rankings (grammars)correspond to only 152 distinct QI stress patterns (lookingat words up to 8 syllables long).

Computed by finite-state methods (Riggle 2004, et seq).24 of the 26 attested patterns are predicted by Gordon’smodel.

Undergenerates by 26− 24 = 2 patterns (those of Bhojpuriand Içuã Tupi)Overgenerates by 152− 24 = 128 patterns.

Has the tightest inclusion/exclusion fit of the three models.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

A dynamic linear model

Dynamic linear models (DLMs; also “dynamiccomputational networks,” DCNs). Goldsmith & Larson(1990, 1993).Treat stress assignment as a continuous “wave-like”phenomenon.“Connectionist” model

Each syllable corresponds to a node in a linear network.A continuous quantity of “activation” “flows” from node tonode, according to linear inhibition/amplificationparameters, until equilibrium is reached.At equilibrium, nodes with more activation than neighborscorrespond to stressed syllables.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

A dynamic linear model

! ! "#$%!&!'(!))!

'(!*+%!,-..#/.%0!!1#2+!345*!+#,!#!4%5$+/'6!*'!%5*+%6!,57%!'(!5*8!%92%:*!('6!*+%!(56,*!#47!.#,*!

345*,8!;+52+!+#<%!'4.-!#!65$+*!4%5$+/'68!#47!.%(*!4%5$+/'68!6%,:%2*5<%.-0!!=%5$+/'654$!

345*,!#6%!2'44%2*%7!/-!*;'!;%5$+*%78!756%2*%7!%7$%,8!'4%!.%(*;#678!'4%!65$+*;#670!!>+%!

;%5$+*!'(!#..!.%(*;#67!%7$%,!5,!#!?'7%.!:#6#?%*%68!*+%!.%(*!54+5/5*5'4!!8!#47!,5?5.#6.-8!*+%!

65$+*;#67!%7$%,!2#66-!*+%!?'7%.@,!65$+*!54+5/5*5'4!"0!!1#2+!345*!/%$54,!;5*+!#!:#6*523.#6!

#2*5<#*5'4!<#.3%8!A4';4!#,!5*,!!"#$%"&'(&)!#!*&#!+",!!>+%!4%*;'6A!5,!5..3,*6#*%7!/%.';B!

!

CD5$36%!73%!*'!E'.7,?5*+8!)FF&G!

! H%*!3,!6%(%6!*'!*+%!#2*5<#*5'4!'(!*+%!!#-!345*!#*!*5?%!#!#,! #

!. 0!!I.*5?#*%.-!;%!;5..!634!

*+%!4%*;'6A!*+6'3$+!#!,%65%,!'(!54+5/5*'6-!5*%6#*5'4,8!'6!*5?%J,*%:,8!%#2+!'(!;+52+!;5..!

#7K3,*!*+%!#2*5<#*5'4!<#.3%,!'(!*+%!345*,8!+%42%!'36!4%%7!*'!6%(%6!*'!*+%!#2*5<#*5'4!#*!#!

:#6*523.#6!*5?%0!!>+%!545*5#.!54*%64#.!#2*5<#*5'4!'(!%#2+!345*!5,!*+%!,3?!'(!*;'!<#.3%,B!#!

:',5*5'4#.!#2*5<#*5'4!!/ 8!;+52+!5,!#!?'7%.!:#6#?%*%6!,:%25(-54$!*+%!#2*5<#*5'4!54+%6%4*!*'!

*+%!345*@,!:',5*5'4!54!*+%!4%*;'6A8!#47!#!/5#,!#2*5<#*5'4!08!#!,%2'47!?'7%.!:#6#?%*%6!

#77%7!345('6?.-!*'!*+%!54*%64#.!#2*5<#*5'4!'(!#..!345*,!54!*+%!4%*;'6A0!!L4!'*+%6!;'67,B!

uti = activation of node i at time t .

Parameters:α, β = real-valued left/right inhibition constantsP1,P−1,P−2 = starting activations of initial, final,penultimate syllablesS = boolean whether left- or rightmost stressed syllable isprimary

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

A dynamic linear model

Network evolution:

ut+1i = u1

i + αuti+1 + βut

i−1

Prince (1993) shows conditions for convergence and givesa closed-form solution for it.Typological predictions

Can be computed approximately by discrete sampling ofparameter space.Parameters vary freely with granularity 0.2, α and β mustbe inhibitory (∈ [0,−1]).

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

A dynamic linear model

P1 = P−1 = P−2 = 1.0, colored regions represent distinctstress patterns.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

A dynamic linear model

ResultsGenerates 1,470 distinct QI stress systems14 attestedUndergenerates by 12, overgenerates by 1,456.

CaveatsPreliminary implementationOther parameterizations, granularities possibleCurrently cannot generate stress clash!

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

A weighted constraints model

“Harmonic Grammar” (HG; Legendre et al 1990, Goldsmith1993, Smolensky & Legendre 2006)

Constraint-based like OT, but with constraint weightingsrather than strict rankings.Allows “additive” effects

E.g., many violations of low-weighted constraint canoverwhelm action of higher-weighted constraint.

ParameterizationFor k constraints, a grammar ~w is a k -vector (∈ Rk

+) ofweights.Each candidate output (∈ {σ, σ́, σ̀}n) gives a vector ~v ofconstraint violations (∈ Zk

+).Optimal candidate minimizes ~w · ~v .

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

A weighted constraints model

As a starting point, we’ll consider an HG model using thesame constraints as Gordon’s (2002) OT model.

Same meta-constraint that only one of the ALIGN(x2, L/R)constraints be active at once.

Typological predictionsExact computation using finite-state methods (Bane &Riggle in preparation).36,846 distinct stress systems possible (!)25 of them attestedUndergenerates by 1 (Biangai), overgenerates by 36,821.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

n-gram Entropy

MotivationHypothesized principle of least effort or simplicityPredicts that cross-linguistically frequent patterns should besimpler (according to some metric) than rare onesPatterns predicted by a model should be more likely to beattested if they are less complex

We find evidence consistent with both predictions for QIstress

According to at least one information theoretic definition ofcomplexity

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

n-gram Entropy

Compute conditional transition probabilities between thesymbols (σ, σ̀, σ́,#) of each word in a stress pattern (2–8syllables), given the previous n − 1 symbolsUse these to calculate the Shannon entropy of the pattern

The number of bits necessary to effeciently describe thepattern, according to the n-gram probability model

Patterns where it’s easy to predict next symbol based onprevious n − 1 symbols: low entropy = less “complex”

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Attestation: Bigram Entropy

Attested Unattested

25

35

45

55

Bigram Entropy of OT Typology

Bits/symbol

Attested Unattested

25

35

45

55

Bigram Entropy of DLM Typology

Bits/symbol

Attested Unattested

3040

5060

Bigram Entropy of HG Typology

Bits/symbol

Significant (p < 0.05): DLM, HG.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Attestation: Trigram Entropy

Attested Unattested

25

30

35

Trigram Entropy of OT Typology

Bits/symbol

Attested Unattested

20

25

30

35

40

Trigram Entropy of DLM Typology

Bits/symbol

Attested Unattested

2025

3035

4045

50

Trigram Entropy of HG Typology

Bits/symbol

Significant (p < 0.05): OT, DLM, HG.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Attestation: 4-gram Entropy

Attested Unattested

20

21

22

23

24

4-gram Entropy of OT Typology

Bits/symbol

Attested Unattested

20

22

24

26

28

30

4-gram Entropy of DLM Typology

Bits/symbol

Attested Unattested

2025

3035

4-gram Entropy of HG Typology

Bits/symbol

Significant (p < 0.05): DLM, HG.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Cross-linguistic frequency: Trigram Entropy

25 30 35

0.00

0.10

0.20

Trigram Entropy vs Frequency

bits/symbol

Pat

tern

Typ

olog

ical

Fre

quen

cy

High Frequency Low Frequency

2530

35

Trigram Entropy

bits/symbol

Typological frequency of attestation:High frequency (above median) patterns have significantlylower trigram entropy than low frequency (below median)patterns. U = 51.5, p = 0.0428

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Confusability

Some patterns are very similar to each other, orconfusable

Must observe very long forms (i.e., many syllables) in orderto distinguish them from each other

Example: Albanian and Malakmalak2 syl. word: σ́σ vs σ́σ3 syl. word: σσ́σ vs σσ́σ4 syl. word: σσσ́σ vs σ́σσ̀σ5 syl. word: σσσσ́σ vs σσ́σσ̀σ6 syl. word: σσσσσ́σ vs σ́σσ̀σσ̀σIdentical stress assignments for two and three syllablewordsMust see words of 4+ syllables to tell them apart

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Confusability Vectors

Potential factor in learnability of stress patterns. Patternsthat are easily identified at short word-lengths from amongthe competing possibilities may be more faithfully acquiredby learners, and thus more “typologically stable”:

more likely to be attested, more frequently attested, or both

Test: construct confusability vector for each predictedpattern in each model.Ex: Albanian’s (fixed penultimate primary) stress pattern:

〈101,39,10,0,0,0,0〉 (in Gordon’s OT model)Confusable with 101 other patterns at 2 syllables, with 39 at3 syllables, with 10 at 4 syllables, with none at 5+ syllables

Confusability sum: sum all the numbers in the vectorConfusability pivot: number of syllables at which pattern isuniquely identifiable

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Confusability Pivots

Attested Unattested

34

56

78

OT Typology Pivots

Sylla

ble

s for

Uniq

ue Identification

Attested Unattested

45

67

8

DLM Typology Pivots

Sylla

ble

s for

Uniq

ue Identification

Attested Unattested

45

67

8

HG Typology Pivots

Syl

labl

es fo

r Uni

que

Iden

tific

atio

n Difference is significantonly in OT(U = 1005.5, p < 0.001)

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Confusability Sums

Confusability sums are not significantly different forattested vs unattested patterns.But, within attested patterns, frequency of attestationsignificantly correlated with linear combination ofconfusability sums and pivots (p < 0.05, R2 = 0.271).

Only among Gordon’s (OT) predicted languages.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Parameter-volume (p-volume)

Within each model, each predicted pattern can beassociated with the number of model parameter settingsthat produce it.For OT:

The number of constraint rankings that generate thepattern.Efficiently computable by an algorithm worked out by Riggle(2008).

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Parameter-volume (p-volume)

For DLMs:Volume of the region in the {α, β,P1,P−1,P−2,S}-spacethat produces the pattern.Approximated by discrete search of the space.

For HG:Volume of the region in the constraint weighting space (Rk

+)that produces the pattern.Can be computed exactly, but NP-hard. . . still in progress (need better computer!)

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

p-volumes of Gordon’s OT predicted languages

L118 L004 L071 L080 L140 L052 L129 L115 L068 L055 L127 L143 L133 L058 L070 L002 L042 L107 L142 L090 L150 L152 L040 L137 L067 L109

0.00

0.05

0.10

0.15

R−Volumes of Predicted Stress Patterns(152 patterns)

Language

R−

Vol

ume

Pro

port

ion

Majority of the parameter volume is concentrated in a fewstress patterns.Power law, similar to cross-linguistic frequency (nearlyZipfian, p < 0.001, R2 = 0.968 according toGauss-Newton)

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

What can p-volume predict?

The p-volumes of the stress patterns generated byGordon’s OT model prove to correlate well with:

Which predicted patterns are actually attested, andThe frequencies of the attested patterns.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

OT p-volume correlates with attestation

L118 L004 L071 L080 L140 L052 L129 L115 L068 L055 L127 L143 L133 L058 L070 L002 L042 L107 L142 L090 L150 L152 L040 L137 L067 L109

0.00

0.05

0.10

0.15

R−Volumes of Predicted Stress Patterns(152 patterns)

Language

R−

Vol

ume

Pro

port

ion

L118 L004 L071 L080 L140 L052 L129 L115 L068 L055 L127 L143 L133 L058 L070 L002 L042 L107 L142 L090 L150 L152 L040 L137 L067 L109

0.00

0.05

0.10

0.15

R−Volumes of Predicted Stress Patterns(152 patterns; attested are blue)

Language

R−

Vol

ume

Pro

port

ion

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

OT p-volume correlates with attestation

!

!

!

!!

!

!!!

!

!!

!!

Attested Unattested

0.0

e+0

04

.0e+

06

8.0

e+0

61

.2e+

07

r!volume

Co

nsi

sten

t R

ank

ing

s

Attested Unattested

1011

1213

1415

16

log(r−volume)

Nat

ural

Log

of C

onsi

sten

t Ran

king

s

The p-volumes of attested stress patterns are significantlygreater than those of the unattested.Mann-Whitney U test: U = 2113.5, ρ = 71.2%, p < 0.01

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

OT p-volume correlates with frequency

●● ●●

●●

●● ●

0.00 0.05 0.10 0.15

0.00

0.05

0.10

0.15

0.20

0.25

R−Volume vs Frequency of Attested Patterns

R−Volume Proportion

Fre

quen

cy

● ●●

●●

●●

−2 −1 0 1 2

−0.

050.

000.

050.

100.

15

Normal Q−Q Plot:Residuals of Regression of Frequency on R−Vol

Theoretical QuantilesS

ampl

e Q

uant

iles

Linear regression: Frequency ∝ r -volumeR2 = 0.712,p < 0.001But overly sensitive to outliers (Q-Q plot nonlinearity,significant Cook’s distance).

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

OT p-volume correlates with frequency

11 12 13 14 15 16

0.00

0.05

0.10

0.15

0.20

log(r−volume) vs Frequency

log(r−volume)

Pat

tern

Fre

quen

cy

●●

● ●

●● ●

Linear RegressionExponential Regression

Frequency is more robustly predicted by a nonlinearfunction of the logarithm of the p-volume.Exponential regression: R2 = 0.704, p < 0.001

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

DLM p-volume correlates with attestationTop 250 Parameter Volumes of DLM-Generated Stress Systems

Para

mete

r V

olu

me

02000

4000

6000

8000

10000

12000

14000

Attested Unattested

05000

10000

15000

DLM Parameter Volume

Par

amet

er S

ettin

gs

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Discussion

Surprising, since these correlates depend entirely on theset of languages generated by each model.p-volume findings might suggest learning model inwhich. . .

Learners randomly select a grammar consistent with theevidence they’ve seen so far.

⇒ learners will tend to choose languages with frequencyproportional to p-volume

Or where learners select from the languages consistentwith their observations according to p-volume.

In progress: predicting the typological frequencies withiterated learning simulations.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Discussion

In probabilistic and information-theoretic theories, we oftenneed to quantify the prior probability of a grammar.Often not obvious

How is one constraint-ordering, or one vector in Rk+ more

probable than another?p-volume results suggest an answer:

P(g) ∝ pvol(L(g))

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Discussion

A different perspective on typological modelsOne usually wants overgeneration to be as small aspossible (inclusion-exclusion)And as non-“pathological” as possible. . . i.e., similar to whatis attested.

But maybe large amounts of overgeneration are not aproblem

as long as unattested languages are systematically,detectably different from attested.i.e., as long as overgeneration is pathological(systematically, detectably).especially if those same systematic differences also relateto cross-linguistic frequency.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Discussion

A different perspective on typological modelsOne usually wants overgeneration to be as small aspossible (inclusion-exclusion)And as non-“pathological” as possible. . . i.e., similar to whatis attested.

But maybe large amounts of overgeneration are not aproblem

as long as unattested languages are systematically,detectably different from attested.i.e., as long as overgeneration is pathological(systematically, detectably).especially if those same systematic differences also relateto cross-linguistic frequency.

uofclogo

Introduction QI Stress The Models Predicting Attestation & Frequency Conclusion

Thanks

Special thanks to Jeffrey Heinz (stress data available athttp://www.ling.udel.edu/heinz/diss/index.html),

Jason Riggle, Alan Yu, Morgan Sonderegger, and theSIGMORPHON anonymous reviewers.