Slide 1
A Bayesian Model for Discovering Typological Implications
Hal Daumé III, School of Computing, University of Utah
Lyle Campbell, Department of Linguistics, University of Utah
Slide 2
A “Typological What?!”
English: I eat dinner in restaurants.
French: je mange le dîner dans les restaurants (“I eat the dinner in the restaurants”)
Japanese: boku-wa bangohan-o resutoran-ni taberu (“I-topic dinner-obj restaurants-in eat”)
Hindi: main raat ka khaana restra mein khaata hoon (“I night-of-meal restaurants in eat am”)
Verb-Object (VO) vs. Object-Verb (OV)
Prepositional (PreP) vs. Postpositional (PostP)
VO ⊃ PreP
PostP ⊃ OV
Slide 3
The Typologist's Life
        PreP   PostP
VO       16       0
OV        3      11

(Greenberg, 1963) – Based on 30 diversely sampled languages
Now, repeat for lots of feature pairs
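To make the comparison concrete, here is a minimal Python sketch that tallies the 2×2 table above and compares P(PreP | VO) with the base rate P(PreP). The counts come from the slide, but the assignment of 16/0/3/11 to cells and all variable names are assumptions for illustration.

```python
# Greenberg's 30-language sample as shown above (cell assignment assumed).
counts = {
    ("VO", "PreP"): 16, ("VO", "PostP"): 0,
    ("OV", "PreP"): 3,  ("OV", "PostP"): 11,
}

total = sum(counts.values())
n_vo = counts[("VO", "PreP")] + counts[("VO", "PostP")]
n_prep = counts[("VO", "PreP")] + counts[("OV", "PreP")]

p_prep = n_prep / total                          # base rate of prepositions
p_prep_given_vo = counts[("VO", "PreP")] / n_vo  # evidence for VO -> PreP

print(f"P(PreP)      = {p_prep:.2f}")            # ~0.63
print(f"P(PreP | VO) = {p_prep_given_vo:.2f}")   # 1.00
```

An implication is only interesting when the conditional probability clearly exceeds the base rate, which is exactly the difficulty raised on the next slide.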
Slide 4
Difficulties with Typical Approach
A ⊃ B (99%) uninteresting when ∅ ⊃ B (99%), i.e., when B already holds 99% of the time regardless of A
Search process tedious
Sampling problem when many languages are considered
Process is inherently noisy
Slide 5
A Typological Database
2150 languages, 35 language families, 275 language genera
139 features, 11 feature categories
Sparsely sampled: 85% missing data
Slides 9–11
An Initial Model
Consider two features → a 2×N matrix
First, generate the first column with prior probability π1
Next, decide whether the implication holds
Finally, generate the second column:
  with probability π2 if feature 1 is not “+” or if the implication doesn't hold
  forced to be “+” otherwise
[Example 2×N matrix of VO and PreP values; entries are +, –, or ? (unknown)]
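A minimal generative sketch of this initial model for one feature pair, assuming Bernoulli draws; the parameter values and function names are illustrative, not the paper's.

```python
import random

def generate_pair(pi1, pi2, implication_holds):
    """Generate (f1, f2) for one language under the initial model."""
    f1 = "+" if random.random() < pi1 else "-"
    if implication_holds and f1 == "+":
        f2 = "+"                                   # implication forces feature 2 on
    else:
        f2 = "+" if random.random() < pi2 else "-"
    return f1, f2

def generate_matrix(n_languages, pi1=0.6, pi2=0.5, p_implication=0.5):
    """First decide whether the implication holds, then draw the 2 x N matrix column by column."""
    m = random.random() < p_implication
    return m, [generate_pair(pi1, pi2, m) for _ in range(n_languages)]

m, data = generate_matrix(10)
print("implication holds:", m)
print(data)
```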
Problems:
Cannot handle noisy data
Doesn't address the sampling problem
[Graphical model: π1 generates f1; π2 and the implication variable m generate f2]
Slide 12
Fixing the Noise Problem
Assume language-specific noise
Model remains unchanged, except a new variable causes “f” to be flipped
[Graphical model: as before, plus a noise rate ε and per-language noise variables e1, e2 that can flip f1, f2]
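A sketch of the noise extension: each underlying feature value is passed through a simple flip channel with rate ε before being observed. The function name and rate below are illustrative.

```python
import random

def observe(f, epsilon):
    """Return the observed value: the true feature f, flipped with probability epsilon."""
    if random.random() < epsilon:
        return "-" if f == "+" else "+"
    return f

# The observed matrix is the true (f1, f2) values passed through the noise channel.
true_f1, true_f2 = "+", "+"
print(observe(true_f1, 0.05), observe(true_f2, 0.05))
```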
Slide 13
Fixing the Sampling Problem
Hierarchical Bayes prior...
[Hierarchical graphical model: a global implication variable m0 at the root, with family- and genus-level variables (mIE, mGer, mRom, mAus, mOce, ...) below it, each governing the per-language f1, f2 and noise variables e1, e2 (with parameters π1, π2, ε) as before]
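One way to sketch such a hierarchical prior is to draw a per-family probability that the implication holds, centred on a global value m0. The Beta parameterization below is only an illustrative stand-in for the construction used in the paper, and the family/genus levels are flattened into a single layer for brevity.

```python
import random

def sample_family_probs(m0, families, concentration=10.0):
    """Draw each family's implication probability from a Beta centred on the global m0."""
    return {
        fam: random.betavariate(concentration * m0, concentration * (1.0 - m0))
        for fam in families
    }

family_m = sample_family_probs(
    0.7, ["Indo-European", "Germanic", "Romance", "Austronesian", "Oceanic"]
)
for fam, p in family_m.items():
    print(f"{fam:15s} m = {p:.2f}")
```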
Slide 14
Inference
Binomials get Beta priors
m ~ Uniform
ε ~ Beta with 5% mean (0–10% with 50% probability)
Everything else gets uniform priors
Inference by Gibbs sampling
Plus a rejection sampler subroutine
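As a toy illustration of the Gibbs-sampling idea (not the paper's exact sampler, and omitting the noise and hierarchy), the sketch below alternates conjugate Beta updates for π1, π2 with resampling the implication variable m for a single, fully observed feature pair.

```python
import random

def gibbs_step(data, state, prior_m=0.5):
    """One sweep for the noise-free two-feature model; data is a list of ('+'/'-', '+'/'-') pairs."""
    # pi1 | data: conjugate Beta(1,1) prior on the first column.
    n1_plus = sum(f1 == "+" for f1, _ in data)
    state["pi1"] = random.betavariate(1 + n1_plus, 1 + len(data) - n1_plus)

    # m | pi2, data: impossible if there is any counterexample (f1=+, f2=-).
    if any(f1 == "+" and f2 == "-" for f1, f2 in data):
        state["m"] = False
    else:
        lik_true = lik_false = 1.0
        for f1, f2 in data:
            p2 = state["pi2"] if f2 == "+" else 1 - state["pi2"]
            lik_false *= p2
            lik_true *= 1.0 if f1 == "+" else p2   # f2 is forced when m holds and f1=+
        post_true = prior_m * lik_true
        state["m"] = random.random() < post_true / (post_true + (1 - prior_m) * lik_false)

    # pi2 | m, data: only languages whose second feature was freely drawn count.
    free = [f2 for f1, f2 in data if not (state["m"] and f1 == "+")]
    n2_plus = sum(f2 == "+" for f2 in free)
    state["pi2"] = random.betavariate(1 + n2_plus, 1 + len(free) - n2_plus)
    return state

data = [("+", "+")] * 16 + [("-", "+")] * 3 + [("-", "-")] * 11
state = {"m": False, "pi1": 0.5, "pi2": 0.5}
for _ in range(200):
    state = gibbs_step(data, state)
print(state)
```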
Slide 15
Three Models
Flat – All languages independent
LingHier – Linguistic (family/genus) hierarchy
DistHier – Obtained by clustering languages by geographic position
Slide 16
Automatically Extracting Implications
Search only over pairs with:
  250 languages for which both features are known
  15 languages for which both hold simultaneously
  When f1 is true, f2 is true with >50% probability
Reduces space from 19,000 to 3442
Sort by probability that m is true
Evaluate:
  Compare the models' restorative accuracy against each other
  Compare against well-known implications
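A sketch of that filtering step, assuming the database is held as a dict mapping each feature to a per-language value of '+', '-', or None for missing; the thresholds come from the slide, but the data layout and names are assumptions, and only one direction (f1 ⊃ f2) is checked here.

```python
def candidate_pairs(db, min_known=250, min_both=15):
    """Return feature pairs worth scoring: enough overlap, enough co-occurrence,
    and f2 true more than half the time when f1 is true."""
    feats = sorted(db)
    pairs = []
    for i, f1 in enumerate(feats):
        for f2 in feats[i + 1:]:
            langs = [l for l in db[f1]
                     if l in db[f2] and db[f1][l] is not None and db[f2][l] is not None]
            if len(langs) < min_known:
                continue
            both = sum(db[f1][l] == "+" and db[f2][l] == "+" for l in langs)
            f1_true = sum(db[f1][l] == "+" for l in langs)
            if both < min_both or f1_true == 0 or both / f1_true <= 0.5:
                continue
            pairs.append((f1, f2))
    return pairs
```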
Slide 18
Top Implications – LingHier
[Table of top discovered implications with their explanations, including Postpositions ⊃ Gen-N (Greenberg #2a), OV ⊃ Postpositions (Greenberg #4), OV ⊃ Gen-N (Greenberg #4 + #2a), Adj-N ⊃ Dem-N (Greenberg #18), VO ⊃ PreP (Greenberg #3), and Labial-velars ⊃ No uvulars (see paper). Other rows involve word-order features (Dem-N, Num-N, Noun-RelC, N-Adj, N-Gen, SV, subordinator and negative-word position), morphological features (prefixing/suffixing, tense suffixes, plural prefixes, pronominal possessive affixes, obligatory subject pronouns, little affixation), and phonological features (many vowels, high+mid front vowels, no fricatives, no tones), with explanations drawn from Greenberg universals and their converses (e.g. #2b, #3, #27b), Hawkins XVI (for postpositional languages), Lehmann's operator-operand principle, appeals to economy, “see paper”, or “???” where no established explanation is known.]
Slide 19
Discussion
Model for automatically discovering implications
Accounts for noise and sampling problem
Different hierarchical models are quantitatively different
Discovered implications correlate with known ones
Many worthy of further exploration: http://hal3.name/WALS
Thanks! Questions?