Fine-Grained Soft Semantic Constraints

Yuval Marton, University of Maryland

http://umiacs.umd.edu/~ymarton/pub/umanch/Hybrid Knowledge-CorpusBasedSem-Manchester_090614.ppt


Page 2: Fine-Grained  Soft  Semantic  Constraints

Yuval Marton, U Manchester talk 2

Why Care?

Tell’em apart:

These, too:


FOX

• FOX =

• FOX = FOrkhead/winged-heliX replicator gene


Road map

• Brief overview of doctoral work

• Hybrid knowledge / corpus-based semantic similarity methods

– Pure and hybrid methods

– Hard and soft constraints

– Fine-grained

– Named-entities


Dissertation Theme

• Hybrid Knowledge/Corpus-Based Statistical NLP Models Using Fine-Grained Soft Syntactic and Semantic Constraints

– Soft Constraints

– Fine-Grained

– Syntactic (parsing)

– Semantic (“concepts”, paraphrases)

• Evaluated in

– Word-pair similarity ranking and

– Statistical Machine Translation (SMT)


Soft Constraints

• Hard constraints

– {0, 1}: in/out

– Decrease search space

– “Structural zeroes”

– Theory-driven

– Faster, slimmer

• Soft constraints

– [0, 1]: fuzzy

– Only bias the model

– Data-driven: let patterns emerge

[Figure: two diagrams of the candidate universe (“Univ.”), one labeled “Hard” and one labeled “Soft”]
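The contrast between the two kinds of constraint can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the talk's implementation: the candidate list, the boolean "satisfies the constraint" flag, and the weight values are all hypothetical.

```python
# Hypothetical candidates: (name, base model score, satisfies the constraint?)
candidates = [
    ("rule_a", 2.0, True),
    ("rule_b", 2.5, False),
    ("rule_c", 1.0, True),
]


def best_hard(cands):
    """Hard constraint: violators are removed from the search space."""
    allowed = [c for c in cands if c[2]]
    return max(allowed, key=lambda c: c[1])[0]


def best_soft(cands, weight):
    """Soft constraint: satisfying candidates get a bonus, nobody is removed."""
    return max(cands, key=lambda c: c[1] + (weight if c[2] else 0.0))[0]


hard_choice = best_hard(candidates)
soft_small = best_soft(candidates, weight=0.2)  # bias too weak to change the winner
soft_large = best_soft(candidates, weight=1.0)  # bias strong enough to flip it
print(hard_choice, soft_small, soft_large)
```

With a small weight the violating candidate can still win on the strength of its base score, which is the sense in which a soft constraint only biases the model.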


Fine-grained

• Granularity is a big deal

– Soft syntactic constraints in SMT

• Chiang 2005 vs. Marton and Resnik 2008

• Neg results → pos results

– Soft semantic constraints in word-pair similarity ranking

• Mohammad and Hirst 2006 vs. Marton, Mohammad and Resnik 2009

• Pos results → better results


Soft Syntactic Constraints

• X → X_1 speech ||| X_1 espiche

– What should be the span of X_1?

• Chiang’s 2005 constituency feature

– Reward rule’s score if rule’s source-side matches a constituent span

– Constituency-incompatible emergent patterns can still ‘win’ (in spite of no reward)

– Good idea -- Neg-result

• But what if…


Rule granularity

• Chiang: Single weight for all constituents (parse tags)

• … But what if we can assign a separate feature and weight for each constituent?

• E.g., NP-only: (NP= )

• Or VP-only: (VP= )
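The difference between one shared constituency feature and one feature per constituent label can be sketched as follows. This is a minimal sketch under assumptions: the parse spans, the function names, and the weights are hypothetical, and only the feature name pattern (e.g. `NP=`) is taken from the slide.

```python
# Hypothetical parse of a source sentence: (start, end, label) constituent spans.
parse_spans = [(0, 2, "NP"), (2, 5, "VP"), (3, 5, "NP"), (0, 5, "S")]


def coarse_feature(span, parse):
    """One shared feature: fires if the span matches any constituent."""
    return {"constituent": 1.0} if any((s, e) == span for s, e, _ in parse) else {}


def fine_features(span, parse):
    """One feature per label, e.g. 'NP=' fires only when the span is exactly an NP."""
    return {f"{label}=": 1.0 for s, e, label in parse if (s, e) == span}


def score(features, weights):
    """Weighted sum of the features that fired; unknown features weigh 0."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


# Illustrative weights only; in practice they would be tuned on held-out data.
fine_weights = {"NP=": 0.5, "VP=": -0.3}
np_score = score(fine_features((0, 2), parse_spans), fine_weights)
vp_score = score(fine_features((2, 5), parse_spans), fine_weights)
no_match = fine_features((1, 3), parse_spans)  # span crosses constituents
```

With separate weights, matching an NP can be rewarded while matching a VP is penalized, which a single shared weight cannot express.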


Fine-grained

• Granularity is a big deal

– Soft syntactic constraints in SMT

• Chiang 2005 vs. Marton and Resnik 2008

• Neg results → pos results

– Soft semantic constraints in word-pair similarity ranking

• Mohammad and Hirst 2006 vs. Marton, Mohammad and Resnik 2009

• Pos results → better results


Word-pair similarity ranking

• Give each word pair a similarity score

– Rooster – voyage

– Coast – shore

• Noun-noun (Rubenstein & Goodenough, 1965)

• Verb-verb (Resnik & Diab, 2000)

• Result: list of pairs ordered by similarity

• Spearman rank correlation
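The evaluation step can be sketched in plain Python: rank the word pairs by human score and by system score, then correlate the two rankings. The scores below are hypothetical and are not the published ratings; the helper names are illustrative.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for four word pairs.
human = [0.1, 3.7, 2.0, 3.0]
system = [0.05, 0.90, 0.40, 0.30]
rho = spearman(human, system)
```

Only the ordering matters here: the two score lists are on different scales, yet the correlation is high because they rank most pairs the same way.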


Similarity measures

• Distributional profiles (DP)

– Which words did I occur next to?

• Context vectors

• Similar vectors → similar meaning
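A word-based distributional profile can be sketched as a bag of neighbouring words, compared with cosine similarity. The toy corpus, the window size, and the function names below are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def distributional_profile(target, sentences, window=2):
    """Count the words occurring within `window` tokens of each `target` token."""
    profile = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token == target:
                neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
                profile.update(neighbours)
    return profile


def cosine(p, q):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(p[w] * q[w] for w in p if w in q)
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


# Toy corpus, invented for illustration.
corpus = [
    "the coast was rocky and wet",
    "the shore was rocky and cold",
    "the rooster crowed at dawn",
]
coast = distributional_profile("coast", corpus)
shore = distributional_profile("shore", corpus)
rooster = distributional_profile("rooster", corpus)
```

In this toy corpus `cosine(coast, shore)` is higher than `cosine(coast, rooster)`, because coast and shore share all of their neighbours.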


Bank (pure word-based)

Bank


Bank (pure concept-based)

[Figure: two concepts. Financial Institution: Bank, Teller, Money. Water: River, Bank, Water.]

– Compare closest senses

– Bank_river = water ??


Bank (Hybrid Model)

[Figure: two sense nodes, Bank_River and Bank_Fin.Inst]


Fine-grained

• Granularity is a big deal

– Soft syntactic constraints in SMT

• Chiang 2005 vs. Marton and Resnik 2008

• Neg results → pos results

– Soft semantic constraints in word-pair similarity ranking

• Mohammad and Hirst 2006 vs. Marton, Mohammad and Resnik 2009

• Pos results → better results


Unified Model

• Soft constraints in a log-linear model

– Syntactic

– Semantic

– …

• $\sum_i \lambda_i h_i(x)$

• Add more terms to the sum
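A log-linear model scores a candidate x as a weighted sum of feature functions, sum_i lambda_i * h_i(x), so a new soft constraint is one more feature with its own weight. The sketch below assumes hypothetical feature names, weights, and a dict-based candidate; none of these come from the talk.

```python
def log_linear_score(x, features, weights):
    """Return sum_i lambda_i * h_i(x) for named feature functions h_i."""
    return sum(weights[name] * h(x) for name, h in features.items())


# Hypothetical feature functions over a candidate represented as a dict.
features = {
    "translation_logprob": lambda x: x["logprob"],
    "syntactic_NP=": lambda x: 1.0 if x["matches_np"] else 0.0,
}
weights = {"translation_logprob": 1.0, "syntactic_NP=": 0.5}

candidate = {"logprob": -2.0, "matches_np": True, "sense_similarity": 0.8}
before = log_linear_score(candidate, features, weights)

# Adding a soft semantic constraint is just one more term in the sum.
features["semantic_similarity"] = lambda x: x["sense_similarity"]
weights["semantic_similarity"] = 2.0
after = log_linear_score(candidate, features, weights)
```

Nothing else in the scoring function changes when the new term is added, which is what makes the model a convenient home for both syntactic and semantic soft constraints.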


Road map

• Brief overview of doctoral work

• Hybrid knowledge / corpus-based semantic similarity methods

– Pure and hybrid methods

– Hard and soft constraints

– Fine-grained

– Named-entities


Distributional profiles (DPs)

• DPW: word-based distributional profile

– First order

– Distributional Hypothesis (Harris 1940; Firth 1957)

– Second order (vector representation)

– Strength of association

• Counts, PMI, TF/IDF-based, log-likelihood ratios …

– Vector similarity (cosine, L1, L2, ...)

word × word    Bush   Obama
President      .93    .96
Democrat       .13    .89
Republican     .88    .15
Whitehouse     .76    .91
…              .45    .74
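Of the strength-of-association measures listed, pointwise mutual information (PMI) is easy to sketch: it compares how often two words co-occur with how often they would co-occur by chance. The counts below are hypothetical and the function name is illustrative.

```python
from math import log2


def pmi(pair_count, x_count, y_count, total):
    """Pointwise mutual information: log2( P(x, y) / (P(x) * P(y)) )."""
    return log2((pair_count / total) / ((x_count / total) * (y_count / total)))


# Hypothetical counts from a corpus of 1000 co-occurrence events.
total = 1000
strong = pmi(pair_count=20, x_count=50, y_count=40, total=total)  # 10x above chance
chance = pmi(pair_count=2, x_count=50, y_count=40, total=total)   # exactly at chance
```

A pair that co-occurs ten times more often than chance gets a PMI of log2(10), while a pair at chance level gets a PMI of about zero.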


Taxonomies and Groupings

• WordNet

– Synsets

– Relations (“is-a”)

– Arc distance

• UMLS

• Thesaurus

– Flat

– Coarse

– Bank_river = water ??

[Figure: is-a taxonomy. Postdoc is-a Academic job is-a job; CEO is-a Industry job is-a job.]
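Arc distance in a taxonomy is the number of is-a links on the shortest path between two nodes. The sketch below assumes the slide's diagram is the tree Postdoc is-a Academic job is-a job and CEO is-a Industry job is-a job; that reading of the figure, and the function name, are assumptions.

```python
from collections import deque

# Is-a links read off the slide's diagram (child -> parent); an assumption.
IS_A = {
    "Academic job": "job",
    "Industry job": "job",
    "Postdoc": "Academic job",
    "CEO": "Industry job",
}


def arc_distance(a, b, is_a=IS_A):
    """Number of is-a arcs on the shortest path between two nodes, or None."""
    neighbours = {}
    for child, parent in is_a.items():
        neighbours.setdefault(child, set()).add(parent)
        neighbours.setdefault(parent, set()).add(child)
    queue, seen = deque([(a, 0)]), {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two nodes are not connected
```

Under this reading, `arc_distance("Postdoc", "CEO")` is 4, since the path goes up to "job" and back down.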


Hybrid measures

• WordNet

– Resnik’s method (info content)

– Lin and others

• Thesaurus concept-based

– Mohammad and Hirst (coarse-grained)

– Word may be listed under several concepts

– Distance b/w most similar senses

– Pro: Resource-poor languages and domains

– Con: Small thesaurus → low applicability

– WCCM: Financial instit. ~ academic instit.

– Bank_river = water ??


WCCM: Concept-Word matrix

• WCCM: word-concept collocation matrix

• DPC: concept-based distributional profile

• Potentially iterative process

• Clean-up

concept × word   Fin.Inst   Water
bank             .97        .85
teller           .88        .07
money            .94        .15
water            .32        .91
…                .45        .74
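A first pass at a word-concept collocation matrix can be sketched by counting, for every word, how often it occurs near any word listed under each thesaurus concept. The mini-thesaurus, the corpus, the window size, and the function name are hypothetical, and the sketch covers only a single pass, not the iterative refinement or clean-up.

```python
from collections import defaultdict

# Hypothetical mini-thesaurus: concept -> words listed under it.
THESAURUS = {
    "Fin.Inst": {"bank", "teller", "money"},
    "Water": {"bank", "river", "water"},
}


def build_wccm(sentences, thesaurus, window=2):
    """Count, for each word, how often it occurs near any word listed under a concept."""
    wccm = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            for neighbour in context:
                for concept, members in thesaurus.items():
                    if neighbour in members:
                        wccm[word][concept] += 1
    return wccm


corpus = ["the teller counted money", "the river water rose"]
wccm = build_wccm(corpus, THESAURUS)
```

An ambiguous neighbour such as "bank" would add a count to both concepts in this pass, which is one reason a later clean-up or a further iteration may be useful.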


Use concept-based DPCs to bias word-based DPWs

[Figure: the word-based profile of Bank, plus the concept-based profiles (Financial Institution: Bank, Teller, Money; Water: River, Bank, Water), equals the sense-specific profiles Bank_River and Bank_Fin.Inst]

– Compare closest senses

– Bank_river = water ??


Fine-grained soft constraints

• DPWS: distributional profile of word senses

• Use concept-based DPCs to bias word-based DPWs

– Hybrid-filtered

– Hybrid-proportional


Hybrid-filtered

          Fin.Inst DPC   Water DPC   bank DPW   bank_river DPWS
bank      .97            .85         .76        .76
teller    .88            .07         .54        .54
money     .94            .15         .68        .68
water     .00            .91         .62        .00
…         .45            .74         .25        .25

Filter out collocates in DPW, if not appearing in DPC
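The filtering rule can be sketched directly. The sketch assumes that a collocate with value 0 in the concept profile counts as "not appearing"; the function name is illustrative, and the numbers are copied from the slide's table with the "…" row omitted.

```python
def hybrid_filtered(dpw, dpc):
    """Zero out every collocate of the word profile that is absent from the concept profile."""
    return {w: (v if dpc.get(w, 0.0) > 0.0 else 0.0) for w, v in dpw.items()}


# Values copied from the slide's table.
fin_inst_dpc = {"bank": 0.97, "teller": 0.88, "money": 0.94, "water": 0.00}
bank_dpw = {"bank": 0.76, "teller": 0.54, "money": 0.68, "water": 0.62}

bank_given_fin_inst = hybrid_filtered(bank_dpw, fin_inst_dpc)
```

With these numbers, filtering the DPW of bank against the Fin.Inst DPC zeroes out "water" and leaves the other values unchanged, which is consistent with the values in the table's last column.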


Hybrid-proportional

          Fin.Inst DPC   Water DPC   bank DPW   bank_river DPWS
bank      .97            .85         .76        .33
teller    .88            .07         .54        .05
money     .94            .15         .68        .08
water     .00            .91         .62        .00
…         .45            .74         .25        .15

Only discount collocate’s value in DPW, in proportion to the ratio of its count in current DPC relative to all DPCs of the target word
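The proportional rule can be sketched as scaling each collocate by its share in the chosen concept among all concepts of the target word. The rule is stated in terms of counts, while the slide's table shows association scores, so this sketch uses hypothetical counts and does not try to reproduce the table's last column; the function name is also an assumption.

```python
def hybrid_proportional(dpw, dpcs, sense):
    """Scale each collocate by its share in the chosen concept among all concepts of the word."""
    result = {}
    for collocate, value in dpw.items():
        total = sum(dpc.get(collocate, 0.0) for dpc in dpcs.values())
        share = dpcs[sense].get(collocate, 0.0) / total if total else 0.0
        result[collocate] = value * share
    return result


# Hypothetical collocate counts per concept of "bank" (not the slide's values).
dpcs = {
    "Fin.Inst": {"money": 9, "teller": 8, "water": 1},
    "Water": {"money": 1, "water": 9},
}
bank_dpw = {"money": 0.5, "teller": 0.4, "water": 0.6}
bank_river = hybrid_proportional(bank_dpw, dpcs, "Water")
```

Unlike the filtered variant, a collocate seen mostly with the other sense is discounted rather than removed: "money" keeps a tenth of its value here, while "teller", never seen with Water, drops to zero.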


WSD with DPWS

• Each sense of each word has a unique profile

– Bank_fin.inst ≠ Bank_river ≠ water!

• Pro:

– Not aggregated, whereas DPC profiles are

– Non/less smearing: DPW profiles smear all senses in a single profile
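With one profile per word sense, two words can be compared through their closest pair of senses, which also picks a sense for each word. The sense profiles and numbers below are invented, and the function names are assumptions.

```python
from math import sqrt


def cosine(p, q):
    """Cosine similarity between two sparse profiles stored as dicts."""
    dot = sum(v * q.get(w, 0.0) for w, v in p.items())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def closest_sense_similarity(senses_a, senses_b):
    """Return (similarity, sense of a, sense of b) for the closest pair of senses."""
    return max(
        (cosine(pa, pb), sa, sb)
        for sa, pa in senses_a.items()
        for sb, pb in senses_b.items()
    )


# Hypothetical sense profiles (DPWS); the numbers are invented.
bank = {
    "Fin.Inst": {"money": 0.9, "teller": 0.8},
    "Water": {"water": 0.9, "boat": 0.5},
}
river = {"Water": {"water": 0.8, "boat": 0.6}}
score, bank_sense, river_sense = closest_sense_similarity(bank, river)
```

Here the pair bank/river is matched through the Water sense of bank, while the Fin.Inst sense shares no collocates with river and contributes nothing.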


Results


evaluation

• Word-pair similarity ranking

– Spearman rank correlation

• Paraphrasing in SMT

– BLEU, TER, METEOR, ...


comparison

• WordNet results

• LSA results


Challenges

• Antonyms (black – white)

• “Hyperonyms” (vehicle – car)

• Co-hypernyms / co-taxonyms


Named Entities

• Challenges:

– Bush – Obama

• Potentially helpful:

– H2O – Water

– FOX – “forkhead/winged-helix replicator”

– FOXP2 – SPCH1

• “SPCH1” turned out to be a member of the FOX (forkhead/winged-helix replicator genes) family, of which several other genes are known all across the animal world. It was then labeled FOXP2, that being its current, and more conventional, name.


Biomedical/Chemical WSD

• Explore hybrid methods to create DPWS

– FOX_gene, FOX_animal

• Requires a lexical resource

– UMLS or other resources

• Useful for smaller training sets!


conclusion

• Hybrid Knowledge/Corpus-Based Statistical NLP Models Using Fine-Grained Soft Constraints

– Soft Constraints

– Fine-Grained

– Semantic (“concepts”)

– resource-poor setting, special domains

[Figure: the “Univ.” / “Soft” diagram and the Bank_River / Bank_Fin.Inst sense nodes from earlier slides]


Thank you!

Questions?

[email protected]

Advisors: Philip Resnik & Amy Weinberg

Department of Linguistics and CLIP Lab


Fine-grained semantic

• Word-based:

– Bank: river, money, water, teller, …

• “Concept”-based:

– River: water, bank, boat, …

– Financial institution: bank, money, teller, …

– Humans compare closest senses

– Bank_river = water ??

• Hybrid:

– Bank_river: more strongly associated with water

– Bank_fin.inst: more strongly associated with money


SMT

• Statistical Machine Translation

– What translational units to use?

– Syntactic constituents, re-ordering

– “es gibt”

• Paraphrases

– Pivoting vs. bitext-free paraphrasing

– Typically monolingual

– Translation = bilingual / cross-domain paraphrasing

– Can be evaluated in SMT