
SemEval 2014 Task 3: Cross-Level Semantic Similarity

MultiJEDI ERC 259234

Semantic Similarity

Mostly focused on similar types of lexical items

What if we have different types of inputs?

CLSS: Cross-Level Semantic Similarity

A new type of similarity task

CLSS: Comparison Types

Paragraph to Sentence

Sentence to Phrase

Phrase to Word

Word to Sense

Task Data

Training set and test set: 4000 pairs in total

Task Data

A wide range of domains and text styles

[Figure: breakdown of the pairs by domain and text style, shown here for the word-to-sense pairs]

Rating Scale

Crafting an idealized similarity distribution

[Figure: pairs are constructed across the full 0-4 rating scale; each pair links an item on the larger side (e.g. a paragraph) to one on the smaller side (e.g. a sentence)]

Test and Training Data IAA

[Figure: Krippendorff's α for each comparison type (Paragraph-Sentence, Sentence-Phrase, Phrase-Word, Word-Sense), comparing Training (all), Training (unadjudicated), Test (all), and Test (unadjudicated)]

The annotation procedure produces a balanced rating distribution
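As a rough illustration of how agreement figures like these are computed, here is a minimal sketch using the third-party Python `krippendorff` package; the ratings below are invented for illustration and are not the task's actual annotations.

```python
# A minimal sketch of computing inter-annotator agreement with
# Krippendorff's alpha, assuming the third-party `krippendorff`
# package (pip install krippendorff).
import numpy as np
import krippendorff

# Rows = annotators, columns = rated pairs; np.nan marks a pair
# that an annotator did not rate. Values are illustrative.
ratings = np.array([
    [4, 3, 2, 0, 1, np.nan],
    [4, 2, 2, 1, 1, 3],
    [3, 3, 2, 0, np.nan, 3],
])

# The 0-4 ratings are ordered, so an interval (or ordinal)
# distance metric is a natural choice here.
alpha = krippendorff.alpha(reliability_data=ratings,
                           level_of_measurement="interval")
print(f"Krippendorff's alpha: {alpha:.3f}")
```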

Experimental Setup

Example pairs:

The quick brown fox ↔ The brown fox was quick

The quick brown fox ↔ The brown foxes were quick

Baselines: Longest Common Substring (LCS) and Greedy String Tiling (GST)

Evaluation Measure: Pearson correlation with the gold ratings
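To make the LCS baseline concrete, here is a minimal sketch of a normalized longest-common-substring similarity, scored against gold ratings with Pearson correlation in the spirit of the evaluation; the gold values below are invented for illustration, not task data.

```python
# A minimal sketch of a longest-common-substring (LCS) baseline:
# similarity is the longest common substring length, normalized by
# the length of the longer input.
from scipy.stats import pearsonr

def lcs_length(a: str, b: str) -> int:
    """Length of the longest common substring of a and b (O(|a|*|b|) DP)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best

def lcs_similarity(a: str, b: str) -> float:
    """Normalized LCS similarity in [0, 1]."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))

pairs = [("The quick brown fox", "The brown fox was quick"),
         ("The quick brown fox", "The brown foxes were quick"),
         ("cat", "dog")]
system = [lcs_similarity(x, y) for x, y in pairs]

# Scoring: Pearson correlation between system scores and gold
# ratings (toy 0-4 gold values shown here).
gold = [3.0, 2.0, 0.0]
r, _ = pearsonr(system, gold)
print(f"Pearson r = {r:.3f}")
```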

Number of Participants

[Figure: number of participating systems per comparison type (Paragraph-Sentence, Sentence-Phrase, Phrase-Word, Word-Sense)]

Top 5 Systems and Baselines

[Figure: score distributions on the 0-4 scale for Meerkat Mafia pw*, SimCompass run1, ECNU run1, UNAL-NLP run2, SemantiKLUE run1, the GST and LCS baselines, and the gold standard, broken down by comparison type (paragraph-sentence, sentence-phrase, phrase-word, word-sense)]

Where do the baselines stand?

[Figure: correlation (axis 0 to 3) for the same five systems and the GST and LCS baselines, broken down by comparison type]
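For reference, here is a compact, unoptimized sketch of Greedy String Tiling (Wise, 1996), the algorithm behind the GST baseline; the minimum match length and the Dice-style normalization are illustrative choices, not necessarily the task's exact configuration.

```python
# A minimal Greedy String Tiling sketch: repeatedly mark the longest
# unmarked common token run ("tile") until no remaining tile reaches
# the minimum match length, then score the covered fraction.
def gst_similarity(a, b, min_match=2):
    """Dice-style GST similarity between token lists a and b."""
    marked_a = [False] * len(a)
    marked_b = [False] * len(b)
    tiled = 0
    while True:
        # One scan pass: find all longest unmarked common runs.
        best, matches = 0, []
        for i in range(len(a)):
            for j in range(len(b)):
                k = 0
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k > best:
                    best, matches = k, [(i, j)]
                elif k == best and k > 0:
                    matches.append((i, j))
        if best < min_match:
            break  # no remaining tile is long enough
        for i, j in matches:
            # Skip candidates invalidated by tiles marked this pass.
            if any(marked_a[i:i + best]) or any(marked_b[j:j + best]):
                continue
            for t in range(best):
                marked_a[i + t] = marked_b[j + t] = True
            tiled += best
    return 2 * tiled / (len(a) + len(b)) if a and b else 0.0

s1 = "the quick brown fox".split()
s2 = "the brown fox was quick".split()
print(f"GST similarity: {gst_similarity(s1, s2):.3f}")  # 0.444
```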

Correlation per Genre

[Figures: per-genre correlations for the paragraph-to-sentence and phrase-to-word comparison types]

What makes the task difficult?

Handling OOV words and novel usages

Dealing with social media text

CLSS: Cross-Level Semantic Similarity

Similarity of different types of lexical items

High-quality dataset: 4000 pairs for four comparison types

38 systems from 19 teams

Thank you!

David Jurgens

Mohammad Taher Pilehvar

Roberto Navigli

MultiJEDI ERC 259234
