103
Jose Camacho Collados AI Wales, Cardiff, 6 February 2020 Word, Sense and Contextualized Embeddings: Vector Representations of Meaning in NLP 1

AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

Jose Camacho Collados

AI Wales, Cardiff, 6 February 2020

Word, Sense and Contextualized Embeddings: Vector Representations of Meaning in NLP

1

Page 2: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

About me

➢ 5-year BSc in Mathematics and 2-year Masters in Natural Language

Processing (NLP): Spain-France-UK

➢ Research engineer (2013-2014): ATILF-CNRS (France)

➢ PhD in Computer Science (2014-2018): Università La Sapienza (Italy)

+ Google (Google AI Doctoral Fellowship in NLP)

➢ Postdoc (2018) and Lecturer (2019-*): Cardiff University, computer

science

2

Page 3: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

At Cardiff University

Research in Natural Language Processing (NLP):

➢ Semantics - Embeddings

➢ Multilinguality

➢ Applications

3

Page 4: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

At Cardiff University

Research in Natural Language Processing (NLP):

➢ Semantics - Embeddings

➢ Multilinguality

➢ Applications

4

Page 5: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

At Cardiff University

Research in Natural Language Processing (NLP):

➢ Semantics - Embeddings

➢ Multilinguality

➢ Applications

Teaching:

Artificial Intelligence and Data Science MSc.

-> “Applied Machine Learning”, with applications in NLP and

Computer Vision.

5

Page 6: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

Outline

6

❖ Background➢ Vector Space Models (word embeddings)

➢ Lexical resources

❖ Sense representations

➢ Knowledge-based: NASARI, SW2V

➢ Contextualized: ELMo, BERT

❖ Applications

Page 7: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

Word vector space models

7

Words are represented as vectors: semantically similar words

are close in the vector space

Page 8: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

Neural networks for learning word vector representations from text corpora -> word embeddings

8

Page 9: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

Word embeddings: How to learn them

9

... London is the capital of UK …

... Last night I travelled from Cardiff to London.

.

.

.

London

[0.25, 0.32, -0.1 …. 0.1]

Page 10: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

Why word embeddings?Embedded vector representations:

• are compact and fast to compute

• are geared towards general use

• preserve important relational information between

words (actually, meanings):

10

Page 11: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

• Syntactic parsing (Weiss et al. 2015)

• Named Entity Recognition (Guo et al. 2014)

• Question Answering (Bordes et al. 2014)

• Machine Translation (Zou et al. 2013)

• Sentiment Analysis (Socher et al. 2013)

… and many more!

11

Applications for word representations

Page 12: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

AI goal: language understanding

12

Page 13: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

• Word representations cannot capture ambiguity. For instance, bank

13

Limitations of word embeddings

Page 14: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

14

Problem 1: word representations cannot capture ambiguity

Page 15: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

07/07/2016 15

Problem 1: word representations cannot capture ambiguity

Page 16: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

16

Problem 1: word representations cannot capture ambiguity

Page 17: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

17

Word representations and the triangular inequality

Example from Neelakantan et al (2014)

plant

pollen refinery

Page 18: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

18

Example from Neelakantan et al (2014)

plant1

pollen refinery

plant2

Word representations and the triangular inequality

Page 19: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

• They cannot capture ambiguity. For instance, bank

-> They neglect rare senses and infrequent words

• Word representations do not exploit knowledge from existing lexical resources.

19

Limitations of word representations

Page 20: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

Motivation: Model senses instead of only words

He withdrew money from the bank.

Page 21: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

Motivation: Model senses instead of only words

...

...

He withdrew money from the bank.

bank#1

bank#2

Page 22: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

Motivation: Model senses instead of only words

...

...

He withdrew money from the bank.

bank#1

bank#2

Page 23: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

23

a Novel Approach to a Semantically-Aware

Representations of Items

http://lcl.uniroma1.it/nasari/

Page 24: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

24

Key goal: obtain sense representations

Page 25: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

25

Key goal: obtain sense representations

We want to create a separate representation for each entry of a given word

Page 26: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

WordNet

Idea

26

+Encyclopedic knowledge Lexicographic knowledge

Page 27: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

WordNet

Idea

27

+Encyclopedic knowledge Lexicographic knowledge

+Information from text corpora

Page 28: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

28

WordNet

Page 29: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

WordNetMain unit: synset (concept)

electronic devicetelevision, telly,

television set, tv, tube, tv set, idiot box, boob tube,

goggle box

the middle of the day

Noon, twelve noon, high noon, midday, noonday, noontide

29

synset

word sense

Page 30: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

the branch of biology that

studies plantsbotany

WordNet semantic relations

((botany) a living organism lacking

the power of locomotion

plant, flora, plant life

a living thing that has (or can

develop) the ability to act or function independently

organism, being

any of a variety of plants grown indoors

for decorative purposes

houseplant

a protective covering that

is part of aplant

hood, capHypernymy

(is-a)

DomainHyponymy (has-kind)

Meronymy(part of)

30

Page 31: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

Knowledge-based Representations (WordNet)

X. Chen, Z. Liu, M. Sun: A Unified Model for Word Sense Representation and Disambiguation (EMNLP 2014)

S. Rothe and H. Schutze: AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes (ACL 2015)

Faruqui, M., Dodge, J., Jauhar, S. K., Dyer, C., Hovy, E., & Smith, N. A. Retrofitting Word Vectors to Semantic Lexicons (NAACL 2015)*

S. K. Jauhar, C. Dyer, E. Hovy: Ontologically Grounded Multi-sense Representation Learning for Semantic Vector Space Models (NAACL 2015)

M. T. Pilehvar and N. Collier, De-Conflated Semantic Representations (EMNLP 2016)

31

Page 32: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

32

Wikipedia

Page 33: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

33

WikipediaHigh coverage of named entities and

specialized concepts from different domains

Page 34: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

34

Wikipedia hyperlinks

Page 35: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

35

Wikipedia hyperlinks

Page 36: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

Thanks to an automatic mapping algorithm, BabelNet

integrates Wikipedia and WordNet, among other

resources (Wiktionary, OmegaWiki, WikiData…).

Key feature: Multilinguality (271 languages)

36

Page 37: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

37

BabelNet (https://babelnet.org/ )

Concept

Entity

Page 38: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

It follows the same structure of WordNet:

synsets are the main units

38

BabelNet

Page 39: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

In this case, synsets are multilingual

39

BabelNet

Page 40: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

40

NASARI(Camacho-Collados et al., AIJ 2016)

Goal

Build vector representations for multilingual BabelNet

synsets.

How?

We exploit Wikipedia semantic network and WordNet

taxonomy to construct a subcorpus (contextual

information) for any given BabelNet synset.

Page 41: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

41

Process of obtaining contextual information for a BabelNet synset exploiting BabelNet taxonomy and Wikipedia as a semantic network

Pipeline

Page 42: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

Human-interpretable dimensions

plant (living organism)

organism#1

table#3

tree#1leaf#14

soil#2carpet#2

food#2garden#2

dictionary#3

refinery#1

42

Page 43: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

43

Embedded vector representation (low-dimensional)Closest senses

Page 44: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

44

A word is the surface form of a sense: we can exploit this intrinsic relationship for jointly training word and sense embeddings.

SW2V: Senses and Words to Vectors (Mancini and Camacho-Collados et al., CoNLL 2017)

Page 45: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

45

SW2V: Senses and Words to Vectors (Mancini and Camacho-Collados et al., CoNLL 2017)

A word is the surface form of a sense: we can exploit this intrinsic relationship for jointly training word and sense embeddings.

How?

Updating the representation of the word and its associated senses interchangeably.

Page 46: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

46

SW2V: IdeaGiven as input a corpus and a semantic network:

1. Use a semantic network to link to each word its associated senses in context.

He withdrew money from the bank.

Page 47: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

47

Given as input a corpus and a semantic network:

1. Use a semantic network to link to each word its associated senses in context.

He withdrew money from the bank.

SW2V: Idea

Page 48: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

48

Given as input a corpus and a semantic network:

1. Use a semantic network to link to each word its associated senses in context.

2. Use a neural network where the update of word and sense embeddings is linked, exploiting virtual connections.

SW2V: Idea

Page 49: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

49

Given as input a corpus and a semantic network:

1. Use a semantic network to link to each word its associated senses in context.

2. Use a neural network where the update of word and sense embeddings is linked, exploiting virtual connections.

He bank

money

withdrew thefrom

SW2V: Idea

Page 50: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

50

Given as input a corpus and a semantic network:

1. Use a semantic network to link to each word its associated senses in context.

2. Use a neural network where the update of word and sense embeddings is linked, exploiting virtual connections.

He bank

money

withdrew thefrom

error

SW2V: Idea

Page 51: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

51

Given as input a corpus and a semantic network:

1. Use a semantic network to link to each word its associated senses in context.

2. Use a neural network where the update of word and sense embeddings is linked, exploiting virtual connections.

He bank

money

withdrew thefrom

error

SW2V: Idea

Page 52: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

52

Given as input a corpus and a semantic network:

1. Use a semantic network to link to each word its associated senses in context.

2. Use a neural network where the update of word and sense embeddings is linked, exploiting virtual connections.

He bank

money

withdrew thefrom

error

SW2V: Idea

Page 53: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

53

Given as input a corpus and a semantic network:

1. Use a semantic network to link to each word its associated senses in context.

2. Use a neural network where the update of word and sense embeddings is linked, exploiting virtual connections.

He bank

money

withdrew thefrom

error

SW2V: Idea

Page 54: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

54

Given as input a corpus and a semantic network:

1. Use a semantic network to link to each word its associated senses in context.

2. Use a neural network where the update of word and sense embeddings is linked, exploiting virtual connections.

In this way it is possible to learn word and sense/synset embeddings jointly on a single training.

SW2V: Idea

Page 55: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

55

Full architecture of W2V (Mikolov et al., 2013)

Words and associated senses used both as input and output.

E=-log(p(wt|Wt))

Page 56: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

56

Words and associated senses used both as input and output.

Full architecture of SW2V

E=-log(p(wt|Wt,St)) - ∑s∈St log(p(s|Wt,St))

Page 57: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

57

Word and senses connectivity: example 1

Ten closest word and sense embeddings to the sense company (military unit)

Page 58: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

58

Word and senses connectivity: example 2

Ten closest word and sense embeddings to the sense school (group of fish)

Page 59: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

59

- Taxonomy Learning (Espinosa-Anke et al., EMNLP 2016)

- Open Information Extraction (Delli Bovi et al. EMNLP 2015).

- Lexical entailment (Nickel & Kiela, NIPS 2017)

- Word Sense Disambiguation (Tripodi & Navigli, EMNLP 2019)

- Sentiment analysis (Flekova & Gurevych, ACL 2016)

- Lexical substitution (Cocos et al., SENSE 2017)

- Computer vision (Young et al., ICRA 2017)

- Text classification (Sinoara et al., KB-Systems 2019)

Applications of knowledge-based sense representations

...

Page 60: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

60

❖ Domain labeling/adaptation

❖ Word Sense Disambiguation

❖ Downstream NLP applications (e.g. text classification)

Applications (in this talk)

Page 61: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

61

Annotate each concept/entity with its corresponding

domain of knowledge.

To this end, we use the Wikipedia featured articles page,

which includes 34 domains and a number of Wikipedia

pages associated with each domain (Biology, Geography,

Mathematics, Music, etc. ).

Domain labeling(Camacho-Collados and Navigli, EACL 2017)

Page 62: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

Wikipedia featured articles

Domain labeling

Page 63: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language ProcessingCamacho-Collados, Espinosa-Anke, Pilehvar

63

How to associate a concept with a domain?

1. Learn a NASARI vector for the concatenation of all Wikipedia pages

associated with a given domain.

2. Exploit the semantic similarity between knowledge-based vectors

and graph properties of the lexical resources.

Domain labeling

Page 64: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

64

BabelDomains(Camacho-Collados and Navigli, EACL 2017)

As a result:

Unified resource with information about domains of knowledge

BabelDomains available for BabelNet, Wikipedia and WordNet available at

http://lcl.uniroma1.it/babeldomains

Already integrated into BabelNet (online interface and API)

Page 65: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

65

BabelDomains

Physics and astronomy

Computing

Media

Page 66: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

66

Kobe, which is one of Japan's largest cities, [...]

?

Word Sense Disambiguation

Page 67: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

67

Kobe, which is one of Japan's largest cities, [...]

X

Word Sense Disambiguation

Page 68: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

68

Kobe, which is one of Japan's largest cities, [...]

Word Sense Disambiguation

Page 69: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

69

Word Sense Disambiguation (Camacho-Collados et al., AIJ 2016)

Basic idea

Select the sense which is semantically closer to the

semantic representation of the whole document

(global context).

Page 70: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

Word Sense Disambiguation on textual definitions

(Camacho-Collados et al., LREC 2016; LREV 2019)

Combination of a graph-based disambiguation system (Babelfy) with NASARI to disambiguate the concepts and named entities of over 35M definitions in 256 languages.

Sense-annotated corpus freely available at

http://lcl.uniroma1.it/disambiguated-glosses/

70

Page 71: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

Context-rich WSD

Interchanging the positions of the king and a rook.

castling (chess)

Page 72: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

Context-rich WSD

Castling is a move in the game of chess involving a player’s king and either of the player's original rooks.

A move in which the king moves two squares towards a rook, and the rook moves to the other side of the king.

Interchanging the positions of the king and a rook.

castling (chess)

Page 73: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

Context-rich WSD

Interchanging the positions of the king and a rook.

Castling is a move in the game of chess involving a player’s king and either of the player's original rooks.

Manœuvre du jeu d'échecs

Spielzug im Schach, bei dem König und Turm einer Farbe bewegt werdenEl enroque es un movimiento especial

en el juego de ajedrez que involucra al rey y a una de las torres del jugador.

A move in which the king moves two squares towards a rook, and the rook moves to the other side of the king.

Rošáda je zvláštní tah v šachu, při kterém táhne zároveň král a věž.

Rok İngilizce'de kaleye rook denmektedir.

Rokade er et spesialtrekk i sjakk.

Το ροκέ είναι μια ειδική κίνηση στο σκάκι που συμμετέχουν ο βασιλιάς και ένας από τους δυο πύργους.

castling (chess)

Page 74: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

Context-rich WSD

74

Interchanging the positions of the king and a rook.

Castling is a move in the game of chess involving a player’s king and either of the player's original rooks.

Manœuvre du jeu d'échecs

Spielzug im Schach, bei dem König und Turm einer Farbe bewegt werdenEl enroque es un movimiento especial

en el juego de ajedrez que involucra al rey y a una de las torres del jugador.

A move in which the king moves two squares towards a rook, and the rook moves to the other side of the king.

Rošáda je zvláštní tah v šachu, při kterém táhne zároveň král a věž.

Rok İngilizce'de kaleye rook denmektedir.

Rokade er et spesialtrekk i sjakk.

Το ροκέ είναι μια ειδική κίνηση στο σκάκι που συμμετέχουν ο βασιλιάς και ένας από τους δυο πύργους.

castling (chess)

Page 75: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

75

Towards a seamless integration of senses in downstream NLP applications

(Pilehvar et al., ACL 2017)

Question:

What if we apply WSD and inject sense embeddings to a

standard neural classifier?

Page 76: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

76

Tasks: Topic categorization and sentiment analysis (polarity detection)

Topic categorization: Given a text, assign it a topic (e.g. politics, sports, etc.).

Sentiment analysis: Predict the sentiment of the sentence/review as either positive or negative.

Page 77: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

77

Classification model

Standard CNN classifier

inspired by Kim (2014)

Page 78: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

78

Classification model

Standard CNN classifier

inspired by Kim (2014)

Inject senses embeddings as input!

Page 79: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

79

Sense-based vs. word-based: Conclusions

Sense-based better than word-based... when the input text is large enough

Page 80: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

80

Sense-based vs. word-based:

Sense-based better than word-based... when the input text is large enough:

Page 81: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

81

Contextualized word embeddings ELMo BERT

Peters et al. (NAACL 2018)

Devlin et al. (NAACL 2019)

Page 82: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

82

Contextualized word embeddings ELMo BERT

Peters et al. (NAACL 2018)

Devlin et al. (NAACL 2019)

Based on LSTMs

Based onTransformers

Page 83: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

83

Contextualized word embeddings ELMo BERT

Peters et al. (NAACL 2018)

Devlin et al. (NAACL 2019)

Based on LSTMs

Based onTransformers

More successful nowadays

Page 84: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

84

Contextualized word embeddingsELMo/BERT

Page 85: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

85

Play with Transformers Text generation

https://transformer.huggingface.co

Page 86: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

86

Contextualized word embeddingsELMo/BERT

As word embeddings, learned by leveraging language models on massive amounts of text corpora.

Page 87: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

87

Contextualized word embeddingsELMo/BERT

As word embeddings, learned by leveraging language models on massive amounts of text corpora.

New: each word vector depends on the context. It is dynamic.

Page 88: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

88

Contextualized word embeddingsELMo/BERT

As word embeddings, learned by leveraging language models on massive amounts of text corpora.

New: each word vector depends on the context. It is dynamic.

Important improvements in many NLP tasks.

Page 89: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

89

Contextualized word embeddingsELMo/BERT (examples)

He withdrew money from the bank.

The bank remained closed yesterday.

We found a nice spot by the bank of the river.

Page 90: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

90

Contextualized word embeddingsELMo/BERT (examples)

He withdrew money from the bank.

The bank remained closed yesterday.

We found a nice spot by the bank of the river.

0.25, 0.32, -0.1 ….

0.22, 0.30, -0.08 ….

-0.8, 0.01, 0.3 ….

Page 91: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

91

Contextualized word embeddingsELMo/BERT (examples)

He withdrew money from the bank.

The bank remained closed yesterday.

We found a nice spot by the bank of the river.

0.25, 0.32, -0.1 ….

0.22, 0.30, -0.08 ….

-0.8, 0.01, 0.3 ….

Similar vectors

Page 92: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

92

Contextualized word embeddingsin applications

Page 93: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

How well do these models capture “meaning”?

(Bouraoui et al. AAAI 2020)

93

Page 94: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

94

How well do these models capture “meaning”?

(Bouraoui et al. AAAI 2020)

Page 95: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

95

How well do these models capture “meaning”?

(Bouraoui et al. AAAI 2020)

Page 96: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

96

How well do these models capture “meaning”?

Good enough for many applications.

Page 97: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

97

How well do these models capture “meaning”?

Good enough for many applications.

GLUELanguage understanding

benchmark

Human baselines!

Page 98: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

98

How well do these models capture “meaning”?

Good enough for many applications.

Room for improvement. For example, in SuperGLUE:

➢ Winograd Schema Challenge: BERT ~65% vs Humans ~95%

➢ Word-in-Context (WiC) Challenge: BERT ~68% vs Humans ~80%

Page 99: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

99

How well do these models capture “meaning”?

Good enough for many applications.

Room for improvement. For example, in SuperGLUE:

➢ Winograd Schema Challenge: BERT ~65% vs Humans ~95%

➢ Word-in-Context (WiC) Challenge: BERT ~68% vs Humans ~80%

requires commonsense reasoning

requires abstracting the notion of sense

Page 100: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

100

Word-in-Context (WiC) Challenge(Pilehvar and Camacho-Collados, NAACL 2019)

Task: Identify the most suitable meaning of a word in contextFramed as binary classification (True/False)

Examples:

There's a lot of trash on the bed of the river I keep a glass of water next to my bed when I sleep

He cashed a check at the bankThe bank is on the corner of Nassau and Witherspoon

WiC competition open in CodaLab: https://competitions.codalab.org/competitions/20010

True

False

Page 101: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

101

WiC Challenge

But still...

BERT is everywhere!

Page 102: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

102

For more information on meaning representations (embeddings):

❖ ACL 2016 Tutorial on “Semantic representations of word senses and concepts”:http://josecamachocollados.com/slides/Slides_ACL16Tutorial_SemanticRepresentation.pdf

❖ EACL 2017 workshop on “Sense, Concept and Entity Representations and their Applications”: https://sites.google.com/site/senseworkshop2017/

❖ NAACL 2018 Tutorial on “Interplay between lexical resources and NLP”: https://bitbucket.org/luisespinosa/lr-nlp/

❖ Blog post on “Word, Sense and Contextualized Embeddings”: https://medium.com/@josecamachocollados/how-to-represent-meaning-in-natural-language-processing-word-sense-and-contextualized-embeddings-bbe31bdab84a

❖ “From Word to Sense Embeddings: A Survey on Vector Representations of Meaning” (JAIR, Dec 2018): https://www.jair.org/index.php/jair/article/view/11259

Page 103: AI Wales, Cardiff, 6 February 2020 Word, Sense and ...josecamachocollados.com/slides/Presentation_AIWales...About me 5-year BSc in Mathematics and 2-year Masters in Natural Language

103

Thank you!

Questions please!

[email protected]@CamachoCollados

josecamachocollados.com