Natural Language Processing Chapter 4

323-670 Artificial Intelligence, Chapter 4

NLP

• Language translation / multilingual translation

• Language understanding
– Figure 14.5 p. 365 Interaction among components
– Figure 14.6 p. 366 A speech waveform

Figure 14.5: More Interaction among Components

John saw the boy in the park with a telescope.

[Parse tree: S → NP (John) VP. "in the park" attaches to the NP "the boy"; "with a telescope" attaches to the VP (the seeing).]

Figure 14.5: More Interaction among Components

John saw the boy in the park with a dog.

[Parse tree: S → NP (John) VP. "in the park" and "with a dog" both attach to the NP "the boy".]

Figure 14.5: More Interaction among Components

John saw the boy in the park with a statue.

[Parse tree: S → NP (John) VP. "in the park" attaches to the NP "the boy"; "with a statue" attaches inside that PP, to "the park".]

Figure 14.6: Local Ambiguity in a Speech Problem

The cat scares all the birds away.

k a t s k a r s

A cat’s cares are few.
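The ambiguity in this figure can be illustrated with a small segmentation search. The sketch below is only an illustration: the lexicon and its phoneme spellings are hypothetical and simply mirror the string "k a t s k a r s" from the slide. It shows that the same sound sequence can be cut into words in two ways.

```python
# Toy lexicon mapping phoneme spellings to words (hypothetical, for illustration only).
LEXICON = {
    "kat": "cat",
    "kats": "cat's",
    "skars": "scares",
    "kars": "cares",
}

def segmentations(phonemes):
    """Return every way to split the phoneme string into lexicon entries."""
    if not phonemes:
        return [[]]  # one way to segment nothing: the empty segmentation
    results = []
    for end in range(1, len(phonemes) + 1):
        prefix = phonemes[:end]
        if prefix in LEXICON:
            # Keep this word and segment the remainder recursively.
            for rest in segmentations(phonemes[end:]):
                results.append([LEXICON[prefix]] + rest)
    return results

readings = segmentations("katskars")
for reading in readings:
    print(" ".join(reading))
```

Both readings survive the local search; only the rest of the sentence ("all the birds away" versus "are few") can decide between them.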

The Problem: English sentences are incomplete descriptions of the information that they are intended to convey:

Some dogs are outside.
    Some dogs are on the lawn.
    Three dogs are on the lawn.
    Rover, Tripp, and Spot are on the lawn.

I called Lynda to ask her to the movies. She said she’d love to go.
    She was home when I called.
    She answered the phone.
    I actually asked her.

The Good Side: Language allows speakers to be as vague or precise as they like. It also allows speakers to leave out things they believe their hearers already know.

The Problem: The same expression means different things in different contexts:

Where’s the water? (in a chemistry lab, it must be pure)
Where’s the water? (when you are thirsty, it must be potable)
Where’s the water? (dealing with a leaky roof, it can be filthy)

The Good Side: Language lets us communicate about an infinite world using a finite (and thus learnable) number of symbols.

The Problem: No natural language program can be complete because new words, expressions, and meanings can be generated quite freely:

I’ll fax it to you.

The Good Side: Language can evolve as the experiences that we want to communicate about evolve.

The Problem: There are lots of ways to say the same thing:

Mary was born on October 11.
Mary’s birthday is October 11.

The Good Side: When you know a lot, facts imply each other. Language is intended to be used by agents who know a lot.

Figure 15.1: Features of Language That Make It Both Difficult and Useful

NLP Problems

• Figure 15.1 p. 378
• English sentences are incomplete descriptions of the information that they are intended to convey.
• The same expression means different things in different contexts.
• No natural language program can be complete, because new words, expressions, and meanings can be generated quite freely.
• There are lots of ways to say the same thing.

NLP Problems

1) Processing written text
– uses lexical, syntactic, and semantic knowledge of the language
– requires real-world information

2) Processing spoken language
– uses all the information needed above, plus additional knowledge about phonology
– must handle ambiguities in speech

Steps in NLP

1) Morphological Analysis
2) Syntactic Analysis
3) Semantic Analysis
4) Discourse Integration
5) Pragmatic Analysis

– The boundaries between these five phases are often fuzzy.

1. Morphological Analysis

• Individual words are analyzed into components

• Nonword tokens such as punctuation are separated from the words

• I want to print Bill’s .init file.
– Bill: proper noun
– ’s: possessive suffix
– .init: file extension
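A minimal sketch of this step, assuming a regular-expression tokenizer and a crude capitalization heuristic for proper nouns (neither comes from the slides), might look like this. It separates the possessive suffix, the file extension, and the final punctuation from the words of "I want to print Bill's .init file."

```python
import re

# Token pattern (an illustrative choice): file extensions such as ".init",
# the possessive suffix "'s", plain words, and single punctuation marks.
TOKEN_RE = re.compile(r"\.[A-Za-z]+\b|'s\b|[A-Za-z]+|[^\sA-Za-z]")

def analyze(sentence):
    """Split a sentence into (token, category) pairs."""
    sentence = sentence.replace("\u2019", "'")  # normalize curly apostrophes
    result = []
    for index, match in enumerate(TOKEN_RE.finditer(sentence)):
        token = match.group()
        if token == "'s":
            category = "possessive suffix"
        elif token.startswith(".") and len(token) > 1:
            category = "file extension"
        elif token.isalpha():
            # Crude heuristic: a capitalized word that is not sentence-initial.
            category = "proper noun" if index > 0 and token[0].isupper() else "word"
        else:
            category = "punctuation"
        result.append((token, category))
    return result

tokens = analyze("I want to print Bill's .init file.")
for token, category in tokens:
    print(token, "->", category)
```

A real morphological analyzer would consult a lexicon rather than rely on capitalization; the heuristic is only there to keep the sketch short.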

2. Syntactic Analysis

• A linear sequence of words is transformed into structures that show how the words relate to each other
• English syntactic analyzer
• A word sequence that does not pass the syntactic analyzer is rejected, e.g. "Boy the go to store the"

2. Syntactic Analysis

• Example of syntactic analysis: Figure 15.2 p. 382 (RM2, RM5, RM5)

• A knowledge base fragment: Figure 15.3 p. 383 (User073, F1, Printing, File_Structure, Waiting, Mental Event / Physical Event, Animate / Event)

• Partial meaning for a sentence: Figure 15.4 p. 384

Syntax: The dog bites the man.

Apply rule

Parse Tree: The man bites the dog.

Parse Tree: The dog likes a man.

Internal Representation

Syntactic Processing (2)

• Top-down Parsing
– Begin with the start symbol and apply the grammar rules forward until the symbols at the terminals of the tree correspond to the components of the sentence being parsed.

• Bottom-up Parsing
– Begin with the sentence to be parsed and apply the grammar rules backward until a single tree whose terminals are the words of the sentence and whose top node is the start symbol has been produced.
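Top-down parsing can be sketched as a small recursive-descent parser with backtracking. The grammar below (S → NP VP, NP → Det N | N, VP → V NP | V) and the tiny lexicon are assumptions chosen to cover the example sentences in these slides; this is an illustrative sketch, not the parser the course uses.

```python
# Assumed toy grammar and lexicon (hypothetical, for illustration only).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["N"]],
    "VP": [["V", "NP"], ["V"]],
}
LEXICON = {
    "Det": {"the", "a"},
    "N": {"dog", "man"},
    "V": {"bites", "likes"},
}

def expand_sequence(symbols, words, pos):
    """Yield (subtrees, next_pos) for each way the symbol sequence matches words[pos:]."""
    if not symbols:
        yield [], pos
        return
    for tree, mid in expand(symbols[0], words, pos):
        for rest, end in expand_sequence(symbols[1:], words, mid):
            yield [tree] + rest, end

def expand(symbol, words, pos):
    """Top-down step: rewrite `symbol` with each of its rules, starting at words[pos]."""
    if symbol in LEXICON:  # word category: must match the next word
        if pos < len(words) and words[pos] in LEXICON[symbol]:
            yield (symbol, words[pos]), pos + 1
        return
    for rule in GRAMMAR[symbol]:
        for children, end in expand_sequence(rule, words, pos):
            yield (symbol, *children), end

def parse(sentence):
    """Return the first parse tree that covers the whole sentence, or None."""
    words = sentence.lower().rstrip(".").split()
    for tree, end in expand("S", words, 0):
        if end == len(words):
            return tree
    return None

tree = parse("The dog bites the man.")
print(tree)
```

The search starts from the start symbol S and only looks at words when it reaches a word category, which is the top-down order described above. A scrambled input such as "the bites dog man" yields no tree, so `parse` returns `None`.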

Transition Network

The man bites the dog.

ATN: Augmented Transition Network

• Similar to a finite state machine
– Figure 15.8 p. 392 An ATN network
– Figure 15.9 p. 393 An ATN Grammar in List Form

• Sentence: “The long file has printed.”
– S: NP Q1 AUX Q3 V Q4 (F) halt
– NP: Det Q6 Adj Q6 N Q7 (F)
– Result (p. 394): (S DCL (NP (FILE (LONG) DEFINITE)) HAS (VP PRINTED))
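The path through the network for this sentence can be sketched in code. The version below is a hand-coded simplification that follows only the arcs named on the slide (NP, AUX, V for S; Det, Adj, N for NP). The lexicon, the register names, and the fixed DCL mood are assumptions made for illustration, and a real ATN has more arcs and tests.

```python
# Hypothetical lexicon covering only the example sentence.
LEXICON = {"the": "Det", "long": "Adj", "file": "N", "has": "AUX", "printed": "V"}

def category(words, pos):
    """Return the word category at position pos, or None past the end."""
    return LEXICON.get(words[pos]) if pos < len(words) else None

def traverse_np(words, pos):
    """NP network: Det arc to Q6, Adj arc loops on Q6, N arc to Q7 (final)."""
    if category(words, pos) != "Det":
        return None
    determiner = "DEFINITE" if words[pos] == "the" else "INDEFINITE"
    pos += 1
    adjectives = []
    while category(words, pos) == "Adj":  # loop on Q6
        adjectives.append(words[pos].upper())
        pos += 1
    if category(words, pos) != "N":
        return None
    noun = words[pos].upper()
    return ["NP", [noun, adjectives, determiner]], pos + 1

def traverse_s(words):
    """S network: NP arc to Q1, AUX arc to Q3, V arc to Q4 (final), then halt."""
    result = traverse_np(words, 0)
    if result is None:
        return None
    subject, pos = result
    if category(words, pos) != "AUX":
        return None
    auxiliary = words[pos].upper()
    pos += 1
    if category(words, pos) != "V" or pos + 1 != len(words):
        return None
    return ["S", "DCL", subject, auxiliary, ["VP", words[pos].upper()]]

def to_list_form(node):
    """Render the nested structure in parenthesized list form."""
    if isinstance(node, list):
        return "(" + " ".join(to_list_form(child) for child in node) + ")"
    return node

text = to_list_form(traverse_s("the long file has printed".split()))
print(text)
```

The "augmented" part is the bookkeeping: while the arcs are followed, values such as the noun, its adjectives, and the determiner type are stored and then assembled into the final structure.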

3. Semantic Analysis

• The structures created by the syntactic analyzer are assigned meanings

• A mapping is made between the syntactic structures and objects in the task domain

• If there is no mapping, the sentence is rejected (e.g. "Colorless green ideas sleep furiously")

• 1) It must map individual words into appropriate objects in the knowledge base or database.

• 2) It must create the correct structures to correspond to the way the meanings of the individual words combine with each other.
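The two requirements can be sketched with a toy knowledge base. Everything named below (the entries, the type labels, the slot names, and the rule that a verb constrains the type of its agent) is a hypothetical illustration, not the knowledge base from the textbook figures.

```python
# Hypothetical knowledge base: each known word maps to an object and its type.
KNOWLEDGE_BASE = {
    "dog": ("Dog", "Animate"),
    "man": ("Man", "Animate"),
    "ideas": ("Idea", "Abstract"),
}
# Hypothetical verb frames: the event a verb denotes and the type its agent must have.
VERB_FRAMES = {
    "bites": ("Biting", "Animate"),
    "sleep": ("Sleeping", "Animate"),
}

def interpret(subject, verb, obj=None):
    """Return a meaning frame for the triple, or None when no mapping exists."""
    # Requirement 1: every word must map to an object in the knowledge base.
    if subject not in KNOWLEDGE_BASE or verb not in VERB_FRAMES:
        return None
    if obj is not None and obj not in KNOWLEDGE_BASE:
        return None
    agent, agent_type = KNOWLEDGE_BASE[subject]
    event, required_type = VERB_FRAMES[verb]
    # Requirement 2: the word meanings must combine into a legal structure.
    if agent_type != required_type:
        return None
    frame = {"instance": event, "agent": agent}
    if obj is not None:
        frame["object"] = KNOWLEDGE_BASE[obj][0]
    return frame

accepted = interpret("dog", "bites", "man")
rejected = interpret("ideas", "sleep")
print(accepted)
print(rejected)
```

Under these assumptions "dog bites man" produces a frame, while "ideas sleep" is rejected because an abstract object cannot fill the agent slot of the sleeping event.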

Figure: result of the syntactic analysis of the sentence “I want to print Bill’s .init file.”

The result of the semantic analysis is shown in the figure.

The final result of the pragmatic analysis is the Unix command that tells Unix to print the desired file:

lpr /wsmith/stuff.init

4. Discourse Integration

• The meaning of an individual sentence may depend on the sentences that precede it and may influence the meanings of the sentences that follow it.

• Ex. “John wanted it.” What “it” refers to depends on the previous sentence.

• The current user, who typed the word “I”, is User068 = Susan_Black

• We get F1, whose file name is in the /wsmith/ directory
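A minimal sketch of this resolution step follows. The identifiers User068, Susan_Black, User073, F1, /wsmith/ and stuff.init appear in the slides; the table layout, the field names, and the link between User073, the first name Bill and the login wsmith are assumptions made for illustration.

```python
# Hypothetical session context and knowledge base entries.
CONTEXT = {"current_user": "User068"}
USERS = {
    "User068": {"name": "Susan_Black"},
    "User073": {"first_name": "Bill", "login": "wsmith"},  # assumed link
}
FILES = {
    "F1": {"owner": "User073", "name": "stuff", "extension": ".init"},
}

def resolve_speaker(pronoun):
    """Resolve the pronoun "I" to the user who typed the sentence."""
    return CONTEXT["current_user"] if pronoun == "I" else None

def resolve_file(owner_first_name, extension):
    """Find the file with this extension that belongs to the named user."""
    for file_id, info in FILES.items():
        owner = USERS[info["owner"]]
        if owner.get("first_name") == owner_first_name and info["extension"] == extension:
            path = "/" + owner["login"] + "/" + info["name"] + info["extension"]
            return file_id, path
    return None

speaker = resolve_speaker("I")
target = resolve_file("Bill", ".init")
print(speaker, target)
```

Neither "I" nor "Bill's .init file" can be interpreted from the sentence alone; both need the session context and the knowledge base.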

5. Pragmatic Analysis

• The structure representing what was said is reinterpreted to determine what was actually meant.

• Ex. “Do you know what time it is?” should be understood as a request to be told the time, not as a yes/no question. The system must understand the sentence well enough to decide what to do as a result.

• Representing the intended meaning
– Figure 15.5 p. 385
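For the printing example used in this chapter, the last step can be sketched as a translation from a meaning frame to a command. The frame layout and slot names are hypothetical; only the resulting command `lpr /wsmith/stuff.init` comes from the slides.

```python
def to_command(meaning):
    """Translate a "wanting a printing event" frame into a Unix print command."""
    wanted = meaning.get("object", {})
    if meaning.get("instance") == "Wanting" and wanted.get("instance") == "Printing":
        # A statement that the user wants printing is treated as a request to print.
        return "lpr " + wanted["file_path"]
    return None  # no action is known for this meaning

# Hypothetical frame for "I want to print Bill's .init file."
meaning = {
    "instance": "Wanting",
    "agent": "User068",
    "object": {
        "instance": "Printing",
        "agent": "User068",
        "file_path": "/wsmith/stuff.init",
    },
}
command = to_command(meaning)
print(command)
```

The sentence literally states a desire; the reinterpretation step is what turns that statement into something the system should do.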

Turbo Prolog

TURBO PROLOG

ftp://172.28.80.6/older/DosProgram/TPROLOG

Alt + Enter = Big Screen
F1 : Help
F2 : Save
F3 : Load
F6 : Next/Switch
F8 : Previous Goal
F9 : Compile
F10 : Step (For trace) / End
Alt + T : Trace ON/OFF
Set up window size (edit): use the arrow keys to adjust the size

TURBO PROLOG

Use the examples from the EXAMPLE directory to practice programming.

Start with EX03EX01.PRO

predicates
    likes(symbol,symbol)

clauses
    /* FACTS */
    likes(ellen, tennis).
    likes(john, football).
    likes(tom, baseball).
    likes(eric, swimming).
    likes(mark, tennis).

    /* RULES */
    likes(bill, Activity) if likes(tom, Activity).
    likes(mark, Activity) :- likes(ellen, Activity).
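To see what the facts and rules of EX03EX01.PRO imply, the following Python sketch mimics the lookup by hand. It is not how Prolog resolution is implemented; it relies on the assumption that each rule body refers only to a person described by facts, which holds for this program.

```python
# Facts from EX03EX01.PRO, in clause order: likes(Person, Activity).
FACTS = [
    ("ellen", "tennis"),
    ("john", "football"),
    ("tom", "baseball"),
    ("eric", "swimming"),
    ("mark", "tennis"),
]
# Each pair (person, other) stands for: likes(person, Activity) if likes(other, Activity).
RULES = [("bill", "tom"), ("mark", "ellen")]

def likes(person):
    """Collect every Activity for which likes(person, Activity) can be shown."""
    answers = [activity for who, activity in FACTS if who == person]
    for head, body in RULES:
        if head == person:
            # The body only refers to facts here, so one lookup level is enough.
            answers += [activity for who, activity in FACTS if who == body]
    return answers

print(likes("bill"))
print(likes("mark"))
```

Bill has no fact of his own, so his only answer comes through the rule about tom. For mark, tennis is found twice, once as a fact and once through the rule about ellen, much as a Prolog system that enumerates all solutions would report it twice.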

PROLOG.HELP

ARITHMETIC

Arithmetic operators: +, -, *, /, mod, div

Relational operators: >, <, =, >=, <=, <>, ><

Functions: sin, cos, tan, arctan, ln, log, exp, sqrt, round, trunc, abs

EX: 1 + 2 = 2 + 1, X = 5/2, X = 5 mod 2, 5 <> 9

PREDEFINED DOMAINS

char       1 byte characters
integer    2 byte integer numbers
real       8 byte floating point numbers
symbol     strings inserted in the internal symbol table
string     sequences of chars, e.g. "hello world\n"

SUMMARY OF PROGRAM SECTIONS

CONSTANTS
    const1 = definition
    const2 = definition

[GLOBAL] DOMAINS
    dom [,dom] = [reference] declaration1; declaration2
    listdom = dom*
    dom = <basisdom>

[GLOBAL] DATABASE [ - <databasename> ]
    [determ] pred1(....)
    pred2(.....)

GLOBAL PREDICATES
    [determ|nondeterm] pred1(.........) - (i,i,o,..)(i,o,i,..) [ language c|pascal|fortran ] [ as "name" ]
    pred2(........)

PREDICATES
    [determ|nondeterm] pred1(.........)
    pred2(........)

CLAUSES
    p(....):-p1(...), p2(.....), ... .
    p(....):-p1(...), p2(.....), ... .

include "filename"
    Include a file during compilation.

MISCELLANEOUS

random(RealVariable)
    (real) - (o)

random(MaxValue,RandomInt)
    (integer,integer) - (i,o)

sound(Duration,Frequency)
    (integer,integer) - (i,i)

beep

date(Year,Month,Day)
    (integer,integer,integer) - (o,o,o) (i,i,i)

time(Hours,Minutes,Seconds,Hundredths)
    (integer,integer,integer,integer) - (o,o,o,o) (i,i,i,i)

trace(on/off)
    (string) - (i) (o)

ERROR & BREAK CONTROL

trap(PredicateCall,ExitCode,PredicateToCallOnError)

exit

exit(ExitCode)
    (integer) - (i)
    If the program exits to DOS, the DOS errorlevel task processing variable will contain the value given to the exit predicate.

break(on/off)
    (string) - (i) (o)

EDITOR

display(String)
    (string) - (i)

edit(InputString,OutputString)
    (string,string) - (i,o)

edit(InputString,OutputString,Headstr,Headstr2,Msg,Pos,Helpfilename,EditMode,Indent,Insert,TextMode,RetPos,RetStatus)
    (string,string,string,string,string,integer,string,integer,integer,integer,integer,integer,integer) - (i,o,i,i,i,i,i,i,i,i,i,o,o)
    If the user saves the text from the editor, HeadStr2 will be used as the file name.

editmsg(InputString,OutputString,Headstr,Headstr2,Msg,Pos,Helpfilename,RetStatus)
    (string,string,string,string,string,integer,string,integer) - (i,o,i,i,i,i,i,o)

WINDOW SYSTEM

makewindow(WindowNo,ScrAtt,FrameAtt,Framestr,Row,Column,Height,Width)
    (integer,integer,integer,string,integer,integer,integer,integer)

shiftwindow(WindowNo)
    (integer) - (i) (o)

gotowindow(WindowNo)
    (integer) - (i)

resizewindow(StartRow,NoOfRows,StartCol,NoOfCols)
    (integer,integer,integer,integer) - (i,i,i,i)

colorsetup(Main_Frame)
    (integer) - (i)

INPUT

readln(StringVariable)
    (string) - (o)

readint(IntgVariable)
    (integer) - (o)

readreal(RealVariable)
    (real) - (o)

readchar(CharVariable)
    (char) - (o)

keypressed

unreadchar(CharToBePushedBack)
    (char) - (i)

readterm( Domain, Variable )
    (DomainName,Domain) - (i,_)

OUTPUT

write( Variable|Constant * )

nl

writef( FormatString, Variable|Constant* )

In the format string the following options are known after a percentage sign:

%d Normal decimal number. (chars and integers)
%u As an unsigned integer. (chars and integers)
%R As a database reference number. (database reference numbers)
%X As a long hexadecimal number. (strings, database reference numbers)
%x As a hexadecimal number. (chars and integers)
%s Strings. (symbols and strings)
%c As a char. (chars and integers)
%g Reals in shortest possible format (default for reals)
%e Reals in exponential notation
%f Reals in fixed notation
%lf Only for C compatibility (fixed reals)

\n - newline
\t - tabulator
\nnn - character with code nnn

Natural Language Processing using Prolog

sentence :- noun_phrase, verb_phrase.

noun_phrase :- det, noun.
noun_phrase :- noun.

verb_phrase :- verb, noun_phrase.
verb_phrase :- verb.

EX: The cat eats the fish.
    A man likes an apple.

EX13EX04.pro NLP.pro

domains
    sentence    = s(noun_phrase,verb_phrase)
    noun_phrase = noun(noun) ; noun_phrase(detrm,noun)
    noun        = string
    verb_phrase = verb(verb) ; verb_phrase(verb,noun_phrase)
    verb        = string
    detrm       = string

predicates
    s_sentence(string,sentence)
    s_noun_phrase(string,string,noun_phrase)
    s_verb_phrase(string,verb_phrase)
    d(string)
    n(string)
    v(string)
    start

goal
    start.

Sample run of the goal:
    Please enter a sentence > Bill eats apple

EX13EX04.pro NLP.pro (cont)

clauses
    /* read a sentence and try to parse it */
    start :-
        write("\n Please enter a sentence > "),
        readln(Str),
        s_sentence(Str,s(_,_)).

    /* sentence = noun phrase followed by verb phrase */
    s_sentence(Str, s(N_Phrase,V_Phrase)) :-
        s_noun_phrase(Str, Rest, N_Phrase),
        s_verb_phrase(Rest, V_Phrase).

    /* noun phrase = determiner + noun, or a noun alone */
    s_noun_phrase(Str, Rest, noun_phrase(Detr,Noun)) :-
        fronttoken(Str,Detr,Rest1), d(Detr),
        fronttoken(Rest1,Noun,Rest), n(Noun).
    s_noun_phrase(Str, Rest, noun(Noun)) :-
        fronttoken(Str,Noun,Rest), n(Noun).

    /* verb phrase = verb + noun phrase, or a verb alone */
    s_verb_phrase(Str, verb_phrase(Verb,N_Phrase)) :-
        fronttoken(Str,Verb,Rest1), v(Verb),
        s_noun_phrase(Rest1,"",N_Phrase).
    s_verb_phrase(Str, verb(Verb)) :-
        fronttoken(Str,Verb,""), v(Verb).

EX13EX04.pro NLP.pro (cont)

    /* determiner */
    d("the"). d("a"). d("an").

    /* nouns */
    n("Bill"). n("dog"). n("cat"). n("fish").
    n("ant"). n("apple"). n("man"). n("bus").

    /* verbs */
    v("is"). v("eats"). v("likes"). v("takes").

The cat likes fish

A man takes a bus
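For readers without a Turbo Prolog installation, the behavior of NLP.pro can be approximated in Python. This is a rough port under stated assumptions: `fronttoken` is replaced by a simple whitespace split (the Turbo Prolog built-in has richer tokenization rules), and the clause alternatives are tried in the same order as in the program.

```python
DETERMINERS = {"the", "a", "an"}
NOUNS = {"Bill", "dog", "cat", "fish", "ant", "apple", "man", "bus"}
VERBS = {"is", "eats", "likes", "takes"}

def fronttoken(text):
    """Rough stand-in for Turbo Prolog's fronttoken: split off the first
    whitespace-delimited token. Returns (token, rest) or None."""
    parts = text.split(None, 1)
    if not parts:
        return None
    return parts[0], (parts[1] if len(parts) > 1 else "")

def s_noun_phrase(text):
    """Yield (tree, rest) for each noun phrase found at the front of text."""
    first = fronttoken(text)
    if first is None:
        return
    token, rest1 = first
    if token in DETERMINERS:  # first clause: determiner + noun
        second = fronttoken(rest1)
        if second is not None and second[0] in NOUNS:
            yield ("noun_phrase", token, second[0]), second[1]
    if token in NOUNS:  # second clause: noun alone
        yield ("noun", token), rest1

def s_verb_phrase(text):
    """Return a verb phrase tree that consumes all of text, or None."""
    first = fronttoken(text)
    if first is None or first[0] not in VERBS:
        return None
    verb, rest1 = first
    for phrase, rest in s_noun_phrase(rest1):  # first clause: verb + noun phrase
        if rest == "":
            return ("verb_phrase", verb, phrase)
    if rest1 == "":  # second clause: verb alone
        return ("verb", verb)
    return None

def s_sentence(text):
    """Return s(noun_phrase, verb_phrase) for the sentence, or None."""
    for noun_phrase, rest in s_noun_phrase(text):
        verb_phrase = s_verb_phrase(rest)
        if verb_phrase is not None:
            return ("s", noun_phrase, verb_phrase)
    return None

tree = s_sentence("Bill eats apple")
print(tree)
```

In this sketch the comparison with the word lists is exact, so "the cat likes fish" parses while "The cat likes fish" does not unless the input is lowercased first or a capitalized determiner is added to the word list.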

The End