CS460/626 : Natural Language Processing/Speech, NLP and the Web

(Lecture 37– Semantics; Universal Networking Language)

Pushpak Bhattacharyya, CSE Dept., IIT Bombay

12th April, 2011

Semantics: wikipedia

•Semantics (from Greek sēmantiká, neuter plural of sēmantikós) is the study of meaning.

•It typically focuses on the relation between signifiers, such as words, phrases, signs and symbols, and what they stand for, their denotata.

Computational Semantics: wikipedia

•Computational semantics is the study of how to automate the process of constructing and reasoning with meaning representations of natural language expressions.

•Some traditional topics of interest are: construction of meaning representations, semantic underspecification, anaphora resolution, presupposition projection, and quantifier scope resolution.

•Methods employed usually draw from formal semantics or statistical semantics.

•Computational semantics has points of contact with the areas of lexical semantics (word sense disambiguation and semantic role labeling), discourse semantics, knowledge representation and automated reasoning (in particular, automated theorem proving).

•Since 1999 there has been an ACL special interest group on computational semantics, SIGSEM.

A hurdle: signifier-denotata dichotomy

• Divide between a word and what it stands for
• “red” is NOT red in colour
• “red wine”, “red rose”, “he is in the red” denote very different senses of the word
• Translation into another language reveals this difference

A Perspective

[Figure: semantics shown alongside the other levels of language processing: morphology, lexicon, syntax, pragmatics and discourse]

Our tryst with semantics:

Universal Networking Language (UNL)

Motivation

• Extraction of semantics, i.e., deep meaning, is important for many applications.
  • Machine Translation, Meaning-based IR, CLIR
• Robust, scalable & efficient methods of knowledge extraction required
• Machine Translation and Cross Lingual IR: the need of the hour for crossing the language barrier

Interlingua: a vehicle for machine translation

[Figure: Interlingua (UNL) at the centre, linked to English, Hindi, French and Chinese by analysis and generation arrows]

UNL: a United Nations project

• Started in 1996
• 10 year program
• 15 research groups across continents
• First goal: generators
• Next goal: analysers (needs solving various ambiguity problems)
• Current active language groups
  • UNL_French (GETA-CLIPS, IMAG)
  • UNL_English+Hindi
  • UNL_Italian (Univ. of Pisa)
  • UNL_Portuguese (Univ. of Sao Paulo, Brazil)
  • UNL_Russian (Institute of Linguistics, Moscow)
  • UNL_Spanish (UPM, Madrid)

World-wide Universal Networking Language (UNL) Project

[Figure: UNL at the hub, linked to English, Russian, Marathi, Japanese, Hindi, Spanish and other languages]

• Language independent meaning representation.

The UNL MT System: an Overview


NLP@IITB


Foundations and Applications

• UNL Foundations
  • Semantic Relations
  • Universal Words
  • Attributes
  • How to write UNL expressions
• UNL Applications
  • Machine Translation: Rule based and Statistical
  • Search
  • Text Entailment
  • Sentiment Analysis

Language Processing & Understanding

• Information Extraction: Part of Speech tagging, Named Entity Recognition, Shallow Parsing, Summarization
• IR: Cross Lingual Search, Crawling, Indexing, Multilingual Relevance Feedback
• Machine Learning: Semantic Role labeling, Sentiment Analysis, Text Entailment (web 2.0 applications), using graphical models, support vector machines, neural networks
• Machine Translation: Statistical, Interlingua Based, English to Indian languages, Indian languages to Indian languages, Indowordnet

Resources: http://www.cfilt.iitb.ac.in
Publications: http://www.cse.iitb.ac.in/~pb

Linguistics is the eye and computation the body

UNL represents knowledge: John eats rice with a spoon

[Figure: UNL graph of the sentence, with labels pointing out the semantic relations, attributes and universal words]

Repository of 42 semantic relations and 84 attribute labels

Sentence embeddings

Deepa claimed that she had composed a poem.

[UNL]
agt(claim.@entry.@past, Deepa)
obj(claim.@entry.@past, :01)
agt:01(compose.@past.@entry.@complete, she)
obj:01(compose.@past.@entry.@complete, poem.@indef)
[\UNL]
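The embedded clause in this example lives in its own scope, :01, which the main clause refers to as the object of "claim". A minimal Python sketch of reading such a listing and grouping the relations by scope follows. The helper names (`parse_relation`, `split_node`, `group_by_scope`, `entry_word`), the "main" key for the unnumbered scope and the regular expression are all assumptions made for illustration; they handle only the flat notation shown on this slide and are not part of any UNL tool.

```python
import re
from collections import defaultdict

# Assumed pattern for one relation line: label, optional ":NN" scope id,
# then two comma-separated arguments. It covers only the flat notation
# shown above (no commas inside the first argument).
REL = re.compile(r"^([a-z]+)(?::(\d+))?\((.+?),\s*(.+)\)$")

def parse_relation(line):
    """Split one relation line into label, scope id and the two arguments."""
    match = REL.match(line.strip())
    if match is None:
        raise ValueError(f"not a UNL relation: {line!r}")
    label, scope, left, right = match.groups()
    return {"label": label, "scope": scope or "main", "left": left, "right": right}

def split_node(node):
    """Separate a node such as 'claim.@entry.@past' into word and attributes."""
    word, *attrs = node.split(".@")
    return word, ["@" + a for a in attrs]

def group_by_scope(lines):
    """Collect the parsed relations under their scope id."""
    scopes = defaultdict(list)
    for line in lines:
        rel = parse_relation(line)
        scopes[rel["scope"]].append(rel)
    return dict(scopes)

def entry_word(relations):
    """Return the word carrying @entry in a list of relations, if any."""
    for rel in relations:
        for node in (rel["left"], rel["right"]):
            word, attrs = split_node(node)
            if "@entry" in attrs:
                return word
    return None

UNL_LINES = [
    "agt(claim.@entry.@past, Deepa)",
    "obj(claim.@entry.@past, :01)",
    "agt:01(compose.@past.@entry.@complete, she)",
    "obj:01(compose.@past.@entry.@complete, poem.@indef)",
]

scopes = group_by_scope(UNL_LINES)
for scope_id, relations in scopes.items():
    print(scope_id, entry_word(relations), [r["label"] for r in relations])
```

Each scope has its own entry node: "claim" for the main clause and "compose" for the embedded one.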


Constituents of Universal Networking Language

• Universal Words (UWs)
• Relations
• Attributes
• Knowledge Base

UNL Graph

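He forwarded the mail to the minister.

[Figure: UNL graph with entry node forward(icl>send) marked @entry and @past, joined by agt, obj and gol arcs to he(icl>person), mail(icl>collection) and minister(icl>person); the noun nodes for mail and minister carry @def]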

UNL Expression

agt(forward(icl>send).@entry.@past, he(icl>person))
obj(forward(icl>send).@entry.@past, mail(icl>collection).@def)
gol(forward(icl>send).@entry.@past, minister(icl>person).@def)
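A UNL expression is a textual listing of the arcs of the UNL graph. The following Python sketch builds the graph for "He forwarded the mail to the minister." and writes it back out as relation lines, taking the mail as the object (obj) and the minister as the goal (gol). The `Node` and `UNLGraph` classes are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    uw: str             # universal word, e.g. "forward(icl>send)"
    attrs: tuple = ()   # attribute labels, e.g. ("@entry", "@past")

    def render(self):
        # Attributes are appended to the UW, each introduced by a dot.
        return self.uw + "".join("." + a for a in self.attrs)

@dataclass
class UNLGraph:
    edges: list = field(default_factory=list)  # (relation, head, dependent)

    def add(self, relation, head, dependent):
        self.edges.append((relation, head, dependent))

    def expressions(self):
        """One 'rel(head, dependent)' line per arc of the graph."""
        return [f"{rel}({h.render()}, {d.render()})" for rel, h, d in self.edges]

    def entry(self):
        """Return the node marked @entry, or None if there is none."""
        for _, head, dependent in self.edges:
            for node in (head, dependent):
                if "@entry" in node.attrs:
                    return node
        return None

forward = Node("forward(icl>send)", ("@entry", "@past"))
he = Node("he(icl>person)")
mail = Node("mail(icl>collection)", ("@def",))
minister = Node("minister(icl>person)", ("@def",))

graph = UNLGraph()
graph.add("agt", forward, he)        # who forwards
graph.add("obj", forward, mail)      # what is forwarded
graph.add("gol", forward, minister)  # to whom it is forwarded

for line in graph.expressions():
    print(line)
```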

What is a Universal Word (UW)?

• Words of UNL
• Constitute the UNL vocabulary, the syntactic-semantic units to form UNL expressions
• A UW represents a concept
  • Basic UW (an English word/compound word/phrase with no restrictions or Constraint List)
  • Restricted UW (with a Constraint List)
• Examples:
  “crane(icl>device)”
  “crane(icl>bird)”
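The difference between a basic and a restricted UW can be made concrete by splitting a UW string into its headword and constraint list. The sketch below is a minimal Python illustration; `parse_uw` and its regular expression are assumptions for this example and accept only a comma-separated list of `relation>target` constraints in one pair of parentheses.

```python
import re

# Assumed shape of a UW: a headword, optionally followed by a
# parenthesised constraint list such as "(icl>device)".
UW = re.compile(r"^([^()]+?)(?:\((.*)\))?$")

def parse_uw(uw):
    """Return (headword, [(relation, target), ...]) for a UW string."""
    match = UW.match(uw.strip())
    if match is None:
        raise ValueError(f"not a universal word: {uw!r}")
    headword, constraints = match.groups()
    pairs = []
    if constraints:
        for item in constraints.split(","):
            relation, _, target = item.strip().partition(">")
            pairs.append((relation, target))
    return headword, pairs

def is_restricted(uw):
    """A restricted UW has a non-empty constraint list; a basic UW has none."""
    return bool(parse_uw(uw)[1])

for uw in ["crane", "crane(icl>device)", "crane(icl>bird)"]:
    print(uw, parse_uw(uw), is_restricted(uw))
```

The two restricted UWs share the headword "crane" and differ only in the constraint, which is what keeps the device and the bird apart.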

The Lexicon

Format of the dictionary entry

[headword] {} “Universal word” (Attribute list);

e.g., [minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN);

• Head word
• Universal word
• Attributes
  • Morphological - Pl (plural), V_ed (past tense form)
  • Syntactic - V (verb), VOA (verb of action)
  • Semantic - ANIMT (animate), PLACE, TIME
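A dictionary line in this format can be taken apart with a single regular expression. The Python sketch below is illustrative only: `parse_entry` and the `ENTRY` pattern are assumed names, the pattern accepts both straight and curly quotation marks, and an optional trailing `<...>` field (as seen in other entries of this lecture) is kept as uninterpreted strings.

```python
import re

ENTRY = re.compile(
    r'^\[(?P<headword>[^\]]*)\]\s*'      # [headword]
    r'\{[^}]*\}\s*'                      # {} (left uninterpreted here)
    r'["“”](?P<uw>[^"“”]*)["“”]\s*'      # "Universal word"
    r'\((?P<attrs>[^)]*)\)\s*'           # (Attribute list)
    r'(?:<(?P<flags>[^>]*)>\s*)?;$'      # optional <...> field, then ;
)

def parse_entry(line):
    """Split one lexicon line into headword, UW, attributes and flags."""
    match = ENTRY.match(line.strip())
    if match is None:
        raise ValueError(f"not a dictionary entry: {line!r}")
    flags = match.group("flags")
    return {
        "headword": match.group("headword"),
        "uw": match.group("uw"),
        "attributes": [a.strip() for a in match.group("attrs").split(",") if a.strip()],
        "flags": flags.split(",") if flags is not None else None,
    }

entry = parse_entry('[minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN);')
print(entry["headword"], entry["uw"], entry["attributes"])
```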

The Lexicon (contd)

Content words:

He forwarded the mail to the minister.

[forward] {} “forward(icl>send)” (V,VOA) <E,0,0>;
[mail] {} “mail(icl>message)” (N,PHSCL,INANI) <E,0,0>;
[minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN) <E,0,0>;

Fields, left to right: Headword, Universal Word, Attributes

The Lexicon (contd)

Function words:

He forwarded the mail to the minister.

[he] {} “he” (PRON,SUB,SING,3RD) <E,0,0>;
[the] {} “the” (ART,THE) <E,0,0>;
[to] {} “to” (PRE,#TO) <E,0,0>;

Fields, left to right: Headword, Universal Word, Attributes

Hindi example: example of a noun (1/2)

Universal word: farmer(icl>creator)

• Head word: farmer; attributes: N,ANIMT,FAUNA,MML,PRSN; language flag: E
• Head word: शेतकरी; attributes: N,M,ANIMT,FAUNA,MML,PRSN; language flag: M
• Head word: किसान; attributes: N,M,ANIMT,FAUNA,MML,PRSN,Na; language flag: H

The Features of a UW

• Every concept existing in any language must correspond to a UW
• The constraint list should be as small as necessary to disambiguate the headword
• Every UW should be defined in the UNL Knowledge-Base

Restricted UWs

• Examples
  • He will hold office until the spring of next year.
  • The spring was broken.
• Restricted UWs, which are Headwords with a constraint list, for example:
  “spring(icl>season)”
  “spring(icl>device)”
  “spring(icl>jump)”
  “spring(icl>fountain)”

How to create UWs?

• Pick up a concept
  • the concept of “crane” as “a device for lifting heavy loads” or as “a long-legged bird that wades in water in search of food”
• Choose an English word for the concept.
  • In the case of “crane”, since it is a word of English, the corresponding word should be ‘crane’
• Choose a constraint list for the word.
  • [ ] ‘crane(icl>device)’
  • [ ] ‘crane(icl>bird)’

How to create UNL expressions

English sentences: basic structure

• A <verb> B
  • John eats bread
  • agt(eat.@entry, John)
  • obj(eat.@entry, bread)
  [Figure: verb node with arc R1 to A and arc R2 to B]
• A <verb>
  • John sleeps
  • aoj(sleep.@entry, John)
  [Figure: verb node with arc R1 to A]
• A <be> B
  • John is good
  • aoj(good.@entry, John)
  [Figure: node B with an aoj arc to A]

Hindi sentences: basic structure

• A B <verb>
  • John roti khaataa hai (John eats bread)
  • agt(eat.@entry, John)
  • obj(eat.@entry, bread)
  [Figure: verb node with arc R1 to A and arc R2 to B]
• A <verb>
  • John sotaa hai (John sleeps)
  • aoj(sleep.@entry, John)
  [Figure: verb node with arc R1 to A]
• A <be> B
  • John acchaa hai (John is good)
  • aoj(good.@entry, John)
  [Figure: node B with an aoj arc to A]
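The English and Hindi transitive sentences differ in word order (verb in the middle versus verb at the end) but lead to the same UNL expressions. A toy Python sketch of this point follows. The `LEXICON` table and the `transitive_unl` function are hypothetical, the word-order tag is supplied by hand, and the Hindi auxiliary "hai" is dropped before the call; a real analyser would have to work all of this out itself.

```python
# Hypothetical toy lexicon: surface word -> headword used in the UNL expression.
LEXICON = {
    "John": "John",
    "eats": "eat",
    "khaataa": "eat",
    "bread": "bread",
    "roti": "bread",
}

def transitive_unl(tokens, order):
    """Map a three-word transitive sentence to agt/obj relations.

    order is "SVO" for the English pattern A <verb> B
    or "SOV" for the Hindi pattern A B <verb>.
    """
    if order == "SVO":
        subject, verb, obj = tokens
    elif order == "SOV":
        subject, obj, verb = tokens
    else:
        raise ValueError(f"unknown word order: {order!r}")
    entry = LEXICON[verb] + ".@entry"
    return [
        f"agt({entry}, {LEXICON[subject]})",
        f"obj({entry}, {LEXICON[obj]})",
    ]

english = transitive_unl(["John", "eats", "bread"], "SVO")
hindi = transitive_unl(["John", "roti", "khaataa"], "SOV")  # "hai" removed by hand

print(english)
print(hindi)
```

Both calls yield the same two relations, which is the sense in which the representation is language independent.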


Complex English sentences: Use recursion on the basic structure

A <verb> B

• John who is a good boy eats bread which is toasted
• agt(eat.@entry, :01)
• obj(eat.@entry, :02)
• aoj:01(boy, John.@entry)
• mod:01(boy, good)
• obj:02(toast, bread.@entry.@focus)

[Figure: node eat with an agt arc to scope :01 (boy, John and good, joined by aoj and mod arcs) and an obj arc to scope :02 (toast and bread, joined by an obj arc)]

Red arrows indicate entry nodes
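The recursion can be sketched in Python by treating each scope reference as a pointer to a sub-list of relations and expanding it in place. The tuple layout and the `expand` function are assumptions made for this illustration, and the toast relation is placed in scope :02, following the graph on the slide. Only references in the second argument are expanded, and the sketch assumes scopes do not refer to each other in a cycle.

```python
# (relation, scope id or None for the main clause, first argument, second argument)
RELATIONS = [
    ("agt", None, "eat.@entry", ":01"),
    ("obj", None, "eat.@entry", ":02"),
    ("aoj", "01", "boy", "John.@entry"),
    ("mod", "01", "boy", "good"),
    ("obj", "02", "toast", "bread.@entry.@focus"),
]

def expand(scope, relations):
    """Return the relations of one scope, with scope references such as
    ':01' in the second argument replaced by that scope's own relations."""
    result = []
    for label, rel_scope, head, dependent in relations:
        if rel_scope != scope:
            continue
        if dependent.startswith(":"):
            # Recurse into the embedded clause.
            dependent = expand(dependent[1:], relations)
        result.append((label, head, dependent))
    return result

tree = expand(None, RELATIONS)
for relation in tree:
    print(relation)
```

The main clause keeps the basic A <verb> B shape; A and B are simply replaced by the embedded structures for "John who is a good boy" and "bread which is toasted".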
