Fine-Grained Soft Semantic Constraints Yuval Marton University of Maryland http://umiacs.umd.edu/~ymarton/pub/ibm/Hybrid Knowledge-CorpusBasedSem-IBM_090728.ppt


Fine-Grained Soft Semantic Constraints

Yuval Marton, University of Maryland

http://umiacs.umd.edu/~ymarton/pub/ibm/Hybrid Knowledge-CorpusBasedSem-IBM_090728.ppt


Why Care?

Tell’em apart:

In spite of similar contexts

These, too:

In spite of same form


Road map

• Brief overview of doctoral work

• Hybrid knowledge / corpus-based semantic similarity methods
– Pure and hybrid methods
– Hard and soft constraints
– Fine-grained
– Named-entities


Dissertation Theme

• Hybrid Knowledge/Corpus-Based Statistical NLP Models Using Fine-Grained Soft Syntactic and Semantic Constraints
– Soft Constraints
– Fine-Grained
– Syntactic (parsing)
– Semantic (“concepts”, paraphrases)

• Evaluated in
– Word-pair semantic similarity ranking, and
– Statistical Machine Translation (SMT)


Soft Constraints

• Hard constraints
– {0, 1}; in/out
– Decrease search space
– “Structural zeroes”
– Theory-driven
– Faster, slimmer

• Soft constraints
– [0, 1]; fuzzy
– Only bias the model
– Data-driven: let patterns emerge

[Figure: two diagrams of the universe, one under hard constraints and one under soft constraints]
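The contrast between the two kinds of constraint can be sketched in a few lines. This is only an illustration under assumed names and made-up scores (`hard_constraint`, `soft_constraint`, the candidates and the bonus weight are all hypothetical, not taken from the talk): a hard constraint removes candidates from the search space, while a soft constraint only adds a bonus to the score, so an unrewarded candidate can still come out on top.

```python
def hard_constraint(base_scores, allowed):
    """Hard constraint: candidates outside `allowed` are removed outright."""
    return {c: s for c, s in base_scores.items() if c in allowed}


def soft_constraint(base_scores, preferred, weight):
    """Soft constraint: preferred candidates get a bonus; nothing is removed."""
    return {c: s + (weight if c in preferred else 0.0)
            for c, s in base_scores.items()}


# Hypothetical candidates with made-up base scores.
base = {"a": 1.0, "b": 0.8, "c": 2.5}
preferred = {"a", "b"}

hard = hard_constraint(base, preferred)              # "c" is no longer reachable
soft = soft_constraint(base, preferred, weight=0.5)  # "c" is kept, just unrewarded

best_hard = max(hard, key=hard.get)
best_soft = max(soft, key=soft.get)
```

With these numbers the hard version can only pick among the allowed candidates, whereas the soft version still lets the data-driven favourite win despite receiving no reward.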


Fine-grained

• Granularity is a big deal
– Soft syntactic constraints in SMT
  • Chiang 2005 vs. Marton and Resnik 2008
  • Neg results → pos results
– Soft semantic constraints in word-pair similarity ranking
  • Mohammad and Hirst 2006 vs. Marton, Mohammad and Resnik 2009
  • Pos results → better results


Soft Syntactic Constraints

• X → X1 speech ||| X1 espiche
– What should be the span of X1?

• Chiang’s 2005 constituency feature
– Reward rule’s score if rule’s source-side matches a constituent span
– Constituency-incompatible emergent patterns can still ‘win’ (in spite of no reward)
– Good idea, but negative result

• But what if…


Rule granularity

• Chiang: Single weight for all constituents (parse tags)

• … But what if we can assign a separate feature and weight for each constituent?

• E.g., NP-only: (NP= )

• Or VP-only: (VP= )
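The difference between one shared weight and one weight per constituent label can be sketched as follows. This is a rough illustration, not the decoder's actual feature code: the function names, the base score and the weight values are made up, and only the feature names "NP=" and "VP=" come from the slide.

```python
def coarse_score(base, matched_label, weight):
    """Single shared weight: fires when the source side matches any constituent."""
    return base + (weight if matched_label is not None else 0.0)


def fine_score(base, matched_label, weights):
    """One feature and weight per constituent label, e.g. "NP=" or "VP="."""
    if matched_label is None:
        return base
    return base + weights.get(matched_label + "=", 0.0)


# Hypothetical tuned weights: matching an NP is rewarded, matching a VP is not.
weights = {"NP=": 0.7, "VP=": -0.2}

np_rule = fine_score(1.0, "NP", weights)
vp_rule = fine_score(1.0, "VP", weights)
no_match = fine_score(1.0, None, weights)
```

With a single weight, NP and VP matches would be rewarded identically; with separate weights, tuning can learn that some labels help and others hurt.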


Fine-grained

• Granularity is a big deal

– Soft syntactic constraints in SMT
  • Chiang 2005 vs. Marton and Resnik 2008
  • Neg results → pos results
– Soft semantic constraints in word-pair similarity ranking
  • Mohammad and Hirst 2006 vs. Marton, Mohammad and Resnik 2009
  • Pos results → better results


Word-pair similarity ranking

• Give each word pair a similarity score
– Rooster – voyage
– Coast – shore
• Noun-noun (Rubenstein & Goodenough, 1965)
• Verb-verb (Resnik & Diab, 2000)
• Result: list of pairs ordered by similarity
• Spearman rank correlation
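As a reminder of how the evaluation metric works, here is a small self-contained sketch of Spearman rank correlation (Pearson correlation computed on ranks, with tied values sharing their mean rank). The two score lists are invented for illustration; they are not the human ratings or system outputs from the cited datasets.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Made-up gold and system scores for four word pairs.
gold = [0.1, 3.6, 2.5, 1.0]
system = [0.05, 0.9, 0.4, 0.5]
rho = spearman(gold, system)
```

Only the ordering of the pairs matters, so a measure can be on any scale as long as it ranks the pairs the way the human judgments do.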


Similarity measures

• Distributional profiles (DP)
– Which words did I occur next to?
• Context vectors
• Similar vectors → similar meaning


Bank (pure word-based)

Bank


Bank (pure concept-based)

[Figure: concept “Financial Institution” groups Bank, Teller, Money; concept “Water” groups River, Bank, Water]

– Compare closest senses
– Bank_river = water ??


Bank (Hybrid Model)

[Figure: separate profiles for Bank_River and Bank_Fin.Inst]


Fine-grained

• Granularity is a big deal

– Soft syntactic constraints in SMT
  • Chiang 2005 vs. Marton and Resnik 2008
  • Neg results → pos results
– Soft semantic constraints in word-pair similarity ranking
  • Mohammad and Hirst 2006 vs. Marton, Mohammad and Resnik 2009
  • Pos results → better results


Unified Model

• Soft constraints in a log-linear model
– Syntactic
– Semantic
– …

• $\sum_i \lambda_i h_i(x)$

• Constraints = Add more terms to the sum


Road map

• Brief overview of doctoral work

• Hybrid knowledge / corpus-based semantic similarity methods
– Pure and hybrid methods
– Hard and soft constraints
– Fine-grained
– Named-entities


Distributional profiles (DPs)

• Distributional Hypothesis (Harris 1940; Firth 1957)

• First order vs. second order (vector representation)

• Strength of association
– Counts, PMI, TF/IDF-based, log-likelihood ratios, …

• Vector similarity (cosine, L1, L2, …)

Word x word

              Bush   Obama
President     .93    .96
Democrat      .13    .89
Republican    .88    .15
White-house   .76    .91
…             .45    .74
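To make the vector comparison concrete, the sketch below computes the cosine between the two columns of the word x word table, treating each column as a sparse vector. It uses only the four named rows (the "…" row is left out), and the function name `cosine` and the dictionary layout are choices made for this illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Strength-of-association values copied from the table's named rows.
bush = {"President": .93, "Democrat": .13, "Republican": .88, "White-house": .76}
obama = {"President": .96, "Democrat": .89, "Republican": .15, "White-house": .91}

similarity = cosine(bush, obama)
```

The shared collocates (President, White-house) pull the two profiles together, while the party terms pull them apart, so the similarity is high but well short of 1.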


Taxonomies and Groupings

• WordNet
– Synsets
– Relations (“is-a”)
– Arc distance
– The tennis problem
• UMLS
• Thesaurus
– Flat
– Coarse
– Implicit relations, potentially non-classical

[Figure: is-a hierarchy. Postdoc is-a Academic job is-a job; CEO is-a Industry job is-a job]


Hybrid measures

• WordNet
– Resnik’s method (info content)
– Lin and others

• Thesaurus concept-based
– Mohammad and Hirst (coarse-grained)
– Distance b/w most similar senses
– Pro: Semantic relatedness (non-classical relations); resource-poor languages and domains
– Con: Small thesaurus → low applicability; Bank_river = water ??


Concept-Word DPs

• Concept-word collocation matrix

• Aggregate collocation info of words under concept

• Potentially iterative process

• Clean-up

Concept x word

          Fin.Inst   Water
bank      .97        .85
teller    .88        .07
money     .94        .15
water     .32        .91
…         .45        .74
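The aggregation step can be sketched as summing the co-occurrence counts of every word listed under a concept. This is a minimal illustration under assumptions: the thesaurus mapping and the counts are invented, the helper name `concept_profiles` is hypothetical, and the iterative refinement and clean-up steps mentioned above are not modelled.

```python
from collections import Counter


def concept_profiles(word_counts, concept_members):
    """Build one collocation profile per concept by summing its members' counts."""
    profiles = {}
    for concept, members in concept_members.items():
        total = Counter()
        for word in members:
            total.update(word_counts.get(word, {}))
        profiles[concept] = total
    return profiles


# Hypothetical word-level co-occurrence counts and thesaurus categories.
word_counts = {
    "bank": {"money": 5, "water": 3},
    "teller": {"money": 4},
    "river": {"water": 6},
}
concept_members = {
    "Fin.Inst": ["bank", "teller"],
    "Water": ["bank", "river"],
}

profiles = concept_profiles(word_counts, concept_members)
```

Because an ambiguous word such as "bank" is listed under both concepts, its counts leak into both profiles, which is one reason a clean-up pass is useful.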


Use concept-based DPs to bias word-based DPs

[Figure: the word-based DP of “Bank” combined with (+) the concept-based DPs (Financial Institution: Bank, Teller, Money; Water: River, Bank, Water) yields (=) separate profiles for Bank_River and Bank_Fin.Inst]

– Compare closest senses
– Bank_river = water ??


Fine-grained soft constraints

• DPWS: distributional profile of word senses

• Use concept-based DPs to bias word-based DPs
– Hybrid-filtered
– Hybrid-proportional


Hybrid-filtered

Collocate   Fin.Inst concept DP   Water concept DP   bank DP   bank_river DPWS
bank        .97                   .85                .76       .76
teller      .88                   .07                .54       .54
money       .94                   .15                .68       .68
water       .00                   .91                .62       .00
…           .45                   .74                .25       .25

Filter out collocates in word DP if they do not appear in the concept DP
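The filtering rule can be sketched as a one-line dictionary operation. The function name `hybrid_filtered` is hypothetical and the "…" row is omitted; the numbers are the bank DP and Fin.Inst concept DP values from the table. With that concept DP as the filter, "water" is zeroed and the other collocates keep their word-DP values.

```python
def hybrid_filtered(word_dp, concept_dp):
    """Keep a collocate's word-DP value only if it occurs in the concept DP."""
    return {
        collocate: (value if concept_dp.get(collocate, 0.0) > 0 else 0.0)
        for collocate, value in word_dp.items()
    }


# Values copied from the table's named rows.
bank_dp = {"bank": .76, "teller": .54, "money": .68, "water": .62}
fin_inst_dp = {"bank": .97, "teller": .88, "money": .94, "water": .00}

sense_dp = hybrid_filtered(bank_dp, fin_inst_dp)
```

The surviving values are taken unchanged from the word's own profile, so the result stays specific to the word rather than to the whole concept.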


Hybrid-proportional

Collocate   Fin.Inst concept DP   Water concept DP   bank DP   bank_river DPWS
bank        .97                   .85                .76       .33
teller      .88                   .07                .54       .05
money       .94                   .15                .68       .08
water       .00                   .91                .62       .00
…           .45                   .74                .25       .15

Only discount collocate’s value in word DP in proportion to the ratio of its count in current concept DP relative to all concept DPs of the target word
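One way to read the proportional rule is sketched below: each collocate's word-DP value is multiplied by the share of that collocate's mass that falls in the chosen sense's concept DP, relative to all concept DPs of the target word. This is an interpretation, not the exact formula from the talk; the function name, the toy counts and the use of plain sums are assumptions, and the sketch does not attempt to reproduce the illustrative figures in the table.

```python
def hybrid_proportional(word_dp, concept_dps, sense):
    """Discount each word-DP value by the sense's share across all concept DPs."""
    target = concept_dps[sense]
    result = {}
    for collocate, value in word_dp.items():
        total = sum(dp.get(collocate, 0.0) for dp in concept_dps.values())
        share = target.get(collocate, 0.0) / total if total > 0 else 0.0
        result[collocate] = value * share
    return result


# Toy numbers, chosen so the shares are easy to check by hand.
word_dp = {"money": 1.0, "water": 0.5}
concept_dps = {
    "Fin.Inst": {"money": 3, "water": 1},
    "Water": {"money": 1, "water": 3},
}

fin_sense = hybrid_proportional(word_dp, concept_dps, "Fin.Inst")
river_sense = hybrid_proportional(word_dp, concept_dps, "Water")
```

Unlike the filtered variant, nothing is dropped as long as the collocate occurs in the sense's concept DP at all; under this reading the sense profiles of a word add back up to its word DP.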


WSD with DPWS

• Each sense of each word has a unique profile
– Bank_fin.inst ≠ Bank_river ≠ water!

• Pro:
– Not aggregated: unlike concept DPs
– Non/less smearing: unlike word DPs that smear all senses in a single profile


Results


Evaluation

• Word-pair similarity ranking
– Spearman rank correlation

• Paraphrasing in SMT
– BLEU, TER, METEOR, …


Comparison

• WordNet results

• LSA results


Challenges

• Antonyms (black – white)

• “Hyperonyms” (vehicle – car)

• Co-hypernyms / co-taxonyms


Conclusion

• Hybrid Knowledge/Corpus-Based Statistical NLP Models Using Fine-Grained Soft Constraints
– Soft Constraints
– Fine-Grained
– Semantic (“concepts”)
– Semantic relatedness, resource-poor setting, special domains

[Figure: recap of the soft-constraint universe diagram and the Bank_River / Bank_Fin.Inst profiles]


Thank you!

Questions?

[email protected]

Advisors: Philip Resnik & Amy Weinberg

Department of Linguistics and CLIP Lab


Paraphrase generation

For some target OOV phrase Phr:

• Build distributional profile DP_Phr

• Gather contexts of Phr

• Gather paraphrase candidates

• Score / Rank candidates

• Output K-best candidates


Distributional Profiles

• Example of collocational distributional profile (DP) for word “cord”:

• Sliding window (+/- 6 tokens)

• SoA: conditional probability (CP), mutual info (PMI), log-likelihood ratios (LLR), …

• Using LLR

Collocate   Co-occurrence count   Strength-of-Association (SoA)
Hanging     8                     12.20
Ventral     6                     18.44
Trousers    14                    62.19
…           …                     …
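Collecting a collocational profile with a sliding window might look like the sketch below. It is a simplified illustration: the function name `build_dp` is hypothetical, the example sentence is one of the "cord" contexts shown later in the deck, and raw co-occurrence counts stand in for the strength of association (the LLR scoring used in the talk is not implemented here).

```python
from collections import Counter


def build_dp(tokens, target, window=6):
    """Count tokens occurring within `window` positions of each `target` token."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the target occurrence itself
                counts[tokens[j]] += 1
    return counts


tokens = "a knotted cord that runs out from a reel".split()
dp_wide = build_dp(tokens, "cord", window=6)
dp_narrow = build_dp(tokens, "cord", window=1)
```

A strength-of-association measure would then replace each raw count with a score that corrects for how frequent the collocate is overall.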


DP Similarity

• Each DP is represented as a vector

• Use any vector similarity

• Using cosine: cos(DP_cord, DP_rope)

• Example: estimating similarity between “cord” and “rope”:

Collocate   SoA with “cord”   SoA with “rope”
Hanging     12.20             10.43
Ventral     18.44             4.97
Trousers    62.19             31.82


Gather contexts

• Gather all contexts L _ R for “cord”:

• Length of context: start small, increase if too frequent

Left context (L)           _      Right context (R)
A full                     cord   is a large amount of wood.
History of the             Cord   810 and 812
a soft tufted              cord   used in embroidery
a knotted                  cord   that runs out from a reel
the                        cord   of his electric razor.
living well after spinal   cord   injury or disease
…                          cord   …


Gather paraphrase candidates

• What else appears b/w L _ R ?

Left context (L)   _                                             Right context (R)
A full             wave analysis is required since it            is a large amount of electromagnetic
History of the     world since his death in                      810
a soft tufted      soft tufted cord of silk, cotton, or worsted  used in embroidery
a knotted          rope                                          that runs out
the                cable                                         of his electric razor.
spinal             accessory nerve                               injury
…                  …                                             …


Score / Rank candidates

• Measure distributional similarity of target (“cord”) with each candidate:

Candidate                  Score
rope                       cos(DP_cord, DP_rope) = .83
cable                      cos(DP_cord, DP_cable) = .79
accessory nerve            cos(DP_cord, DP_accessory nerve) = .46
world since his death in   cos(DP_cord, DP_world since his death in) = .03
…                          …


Output k-best candidates

• K = 20

• Limit span between L _ R to 10 tokens

• Use best candidates to augment phrase table
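The whole procedure (build a profile, gather contexts, gather candidates, score, output k-best) can be sketched end to end on a toy corpus. This is a simplified illustration, not the system described in the talk: all function names are hypothetical, contexts are a fixed single token on each side rather than growing adaptively, and raw counts with cosine replace LLR-weighted profiles.

```python
import math
from collections import Counter


def profile(sentences, phrase, window=6):
    """Bag of tokens within `window` positions of each occurrence of `phrase`."""
    n, dp = len(phrase), Counter()
    for s in sentences:
        for i in range(len(s) - n + 1):
            if s[i:i + n] == phrase:
                dp.update(s[max(0, i - window):i] + s[i + n:i + n + window])
    return dp


def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def contexts(sentences, phrase, width=1):
    """Collect (left, right) context pairs L _ R around the phrase."""
    n, found = len(phrase), set()
    for s in sentences:
        for i in range(width, len(s) - n - width + 1):
            if s[i:i + n] == phrase:
                found.add((tuple(s[i - width:i]), tuple(s[i + n:i + n + width])))
    return found


def candidates(sentences, ctxs, phrase, width=1, max_span=10):
    """Collect other spans (up to `max_span` tokens) found between a known L _ R."""
    found = set()
    for s in sentences:
        for i in range(width, len(s)):
            left = tuple(s[i - width:i])
            for j in range(i + 1, min(i + max_span, len(s) - width) + 1):
                right = tuple(s[j:j + width])
                if (left, right) in ctxs and s[i:j] != phrase:
                    found.add(tuple(s[i:j]))
    return found


def paraphrases(sentences, phrase, k=20):
    """Rank candidates by profile similarity to the phrase; return the k best."""
    target_dp = profile(sentences, phrase)
    ctxs = contexts(sentences, phrase)
    scored = [(cosine(target_dp, profile(sentences, list(c))), c)
              for c in candidates(sentences, ctxs, phrase)]
    return sorted(scored, reverse=True)[:k]


corpus = [
    "a knotted cord that runs out".split(),
    "a knotted rope that runs out".split(),
    "the cord of his razor".split(),
    "the cable of his razor".split(),
]
ranked = paraphrases(corpus, ["cord"], k=20)
```

On this toy corpus both "rope" and "cable" are found between contexts shared with "cord", and "rope" ranks first because its profile overlaps more with the target's. Any other similarity measure could be plugged in where `cosine` is used.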


Some real examples (unigrams)


Some real examples (ngrams)


English to Chinese

• 29k line subset created to emulate low density language setting


Spanish to English


Comparison with Pivoting


Comparison with Pivoting

• Pivoting is subject to translational “shift”
– Due to double translation step

• Pivoting suffers from having function words as top candidates
– Perhaps by-product of their alignment “promiscuity”

• Monolingual paraphrases suffer from having antonyms as top candidates


Monolingually-Derived Paraphrases: Advantages

• Significant gains in SMT results for small sets

• Good for resource-poor languages

• Not relying on bitexts (a limited resource)

• Larger monolingual paraphrase training set yields better paraphrases

• General: Can plug in any similarity measure


Challenges

• Quality: distributional paraphrases suffer from high-ranking antonyms and co-hypernyms

• Smaller gains than the pivoting technique (Callison-Burch et al., 2006), but can scale up

• How to benefit from POS and syntactic info, e.g., Callison-Burch (2008)

• How to benefit from semantic info / WSD, e.g., Marton, Mohammad & Resnik 2009; Erk & Pado 2008

• Scaling: need to explore whether gains hold on bigger SMT sets before exhausting the capacity to handle a huge monolingual set


Thank you!

• Questions