19
Bayesian Typolog Slide 1 Hal Daumé III ([email protected] A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University of Utah [email protected] Lyle Campbell Department of Linguistics University of Utah [email protected]

Bayesian Typology Slide 1 Hal Daumé III ([email protected]) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Embed Size (px)

Citation preview

Page 1: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 1

Hal Daumé III ([email protected])

A Bayesian Model for DiscoveringTypological Implications

Hal Daumé IIISchool of Computing

University of Utah

[email protected]

Lyle CampbellDepartment of Linguistics

University of Utah

[email protected]

Page 2: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 2

Hal Daumé III ([email protected])

A “Typological What?!”

English:I eat dinner in restaurants.

French:je mange le diner dans les restaurantsI eat the dinner in the restaurants

Japanese:boku-wa bangohan -o resutoran -ni taberuI -topic dinner -obj restaurants -in eat

Hindi:main raat ka khaana restra mein khaata hoonI night-of-meal restaurants in eat am

Verb-Object (VO)

Object-Verb (OV)

Prepositional (PreP)

Postpositional (PostP)

VO ⊃ PrePPostP ⊃ OV

Page 3: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 3

Hal Daumé III ([email protected])

The Typologist's Life

PreP PostPVOOV

Now, repeat for lots of feature pairs

(Greenberg, 1963) – Based on 30 diversely

sampled languages

16 0 3 11

Page 4: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 4

Hal Daumé III ([email protected])

Difficulties with Typical Approach

A ⊃ B (99%) uninteresting when ∅ ⊃ B (99%)

Search process tedious

Sampling problem whenmany languages considered

Process is inherently noisy

Page 5: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 5

Hal Daumé III ([email protected])

A Typological Database

2150 Languages 35 language families 275 language geni

139 Features 11 feature categories

Sparsely sampled 85% missing data

Page 6: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 6

Hal Daumé III ([email protected])

Typological Map: VO

Page 7: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 7

Hal Daumé III ([email protected])

Typological Map: PreP

Page 8: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 8

Hal Daumé III ([email protected])

Typological Map: VO and PreP

Page 9: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 9

Hal Daumé III ([email protected])

Consider two features --> 2xN matrix

First, generate first column withprior probability π1

Next, decide if the implication holds

Finally, generate the second column: With probability π2 if feature 1 is not “+”

or if the implication doesn't hold Forced to be “+” otherwise

An Initial Model

VO PreP++-++?+??+-+?+-

+?+-+++--?-+-++

Page 10: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 10

Hal Daumé III ([email protected])

Consider two features --> 2xN matrix

First, generate first column withprior probability π1

Next, decide if the implication holds

Finally, generate the second column: With probability π2 if feature 1 is not “+”

or if the implication doesn't hold Forced to be “+” otherwise

An Initial Model

VO PreP++-++?+??+-+?+-

+?+-+++--?-+-++

Problems: Cannot handle noisy data Doesn't address sampling problem

Page 11: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 11

Hal Daumé III ([email protected])

Consider two features --> 2xN matrix

First, generate first column withprior probability π1

Next, decide if the implication holds

Finally, generate the second column: With probability π2 if feature 1 is not “+”

or if the implication doesn't hold Forced to be “+” otherwise

An Initial Model

VO PreP++-++?+??+-+?+-

+?+-+++--?-+-++

Problems: Cannot handle noisy data Doesn't address sampling problem

m

π2

π1

f2

f1

Page 12: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 12

Hal Daumé III ([email protected])

Fixing the Noise Problem Assume language-specific noise

Model remains unchanged, excepta new variable causes “f” to be flipped

m π2

π1

f2

f1

e1

ε e2

Page 13: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 13

Hal Daumé III ([email protected])

Fixing the Sampling Problem

Hierarchical Bayes prior...

f2

f1

e1

ε e2

f2

f1

e1

ε e2

f2

f1

e1

ε e2

. . .

m0

mIE

mGer

mRom

mAus

mOce

m π2

π1

f2

f1

e1

ε e2

Page 14: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 14

Hal Daumé III ([email protected])

Inference

Binomials get Beta priors m ~ Uniform ~ Beta with 5% mean, 0-10% with 50%

probability

Everything else gets uniform priors

Inference by Gibbs sampling Plus a rejection sampler subroutine

Page 15: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 15

Hal Daumé III ([email protected])

Three Models

Flat – All languages independent

LingHier – Typological Hierarchy

DistHier – Obtained by clustering positionally

Page 16: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 16

Hal Daumé III ([email protected])

Automatically Extracting Implications

Search only over pairs with: 250 languages for which both features are

known 15 languages for which both hold simultaneously When f1 is true, f2 is true with >50% probability

Reduces space from 19,000 to 3442

Sort by probability that m is true

Evaluate: Compare restorative accuracy versus each other Compare against well-known implications

Page 17: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 17

Hal Daumé III ([email protected])

Restoration Accuracy by Model

Page 18: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 18

Hal Daumé III ([email protected])

Top Implications – LingHierPostpositions Gen-N Greenberg #2a OV Greenberg #4 OV Gen-N Greenberg #4 + Greenberg \#2a Gen-Noun Greenberg #2a (converse)

OV Greenberg #2b (converse) SV Gen-N ??? Adj-N Greenberg #18 Suffixing Clear explanation VO

Appeal to economy Dem-NVO Greenberg #3 (converse)

Adj-N Dem-N Greenberg #18 Noun-AdjSV ??? VO Greenberg #3

Prefixing Greenberg #27b N-Adj ???

Labial-velars No uvulars See paperNegative word See paperStrong prefixing VO

Suffixing ??? Final Sub. Word

Many vowels See paperPlural prefix N-Gen ??? No fricatives No tones ???

See paperDem-N

PostP

PostPPostPositions

Num-NTense Suf.Noun-RelC Lehmann

Intr. verb No question prt.Num-N Hawkins XVI (for postpositional languages) PreP

PostP Lehmann PostPPreP

Init. Subord. PreP Operator-operand principle (Lehmann) PreP

Little affixation

No pron poss afxLehmann

Subord. SuffixPostP Operator-operand principle (Lehmann)

High+Mid F.V.s

Oblig. subj. pron No pron poss afxTense Suf. Operator-operand principle (Lehmann)

Page 19: Bayesian Typology Slide 1 Hal Daumé III (me@hal3.name) A Bayesian Model for Discovering Typological Implications Hal Daumé III School of Computing University

Bayesian TypologySlide 19

Hal Daumé III ([email protected])

Discussion

Model for automatically discovering implications

Accounts for noise and sampling problem

Different hierarchical modelsquantitatively different

Discovered implicationscorrelated with known ones

Many worthy of further exploration:http://hal3.name/WALS

?Thanks! Questions?