Pertemuan 22 (Session 22): Natural Language Processing, Syntactic Processing. Course: T0264/Intelijensia Semu (Artificial Intelligence). Year: July 2006. Version: 2/2


Session 22: Natural Language Processing

Syntactic Processing

Course: T0264/Intelijensia Semu (Artificial Intelligence)

Year: July 2006

Version: 2/2


Learning Outcomes

By the end of this session, students are expected to be able to:

• << TIK-99 >>

• << TIK-99 >>


Topic Outline

• Topic 1

• Topic 2

• Topic 3

• Topic 4

• Topic 5


15.2. Syntactic Processing

• Syntactic processing is the step that converts a flat input sentence into a hierarchical structure whose parts correspond to the units of meaning in the sentence.

• This process is called parsing.

• It is important for two reasons:

1. Semantic processing must operate on the constituents of the sentence.

2. It is sometimes possible to extract the meaning of a sentence without using grammar, but not always.


Roles

• Constrain the number of constituents that semantics can consider. Since syntactic processing is cheaper than semantic processing, this is cost-effective.

• Force syntactically required interpretations, for example in distinguishing the meanings of:

– The satellite orbited Mars.

– Mars orbited the satellite.


Two Main Components

• Grammar

A declarative representation, called a grammar, of the syntactic facts about the language.

• Parser

A procedure, called a parser, that compares the grammar against input sentences to produce parsed structures.


15.2.1. Grammar and Parser

A Simple Grammar for a Fragment of English

S → NP VP
NP → the NP1
NP → PRO
NP → PN
NP → NP1
NP1 → ADJS N
ADJS → ε | ADJ ADJS
VP → V
VP → V NP
N → file | printer
PN → Bill
PRO → I
ADJ → short | long | fast
V → printed | created | want


A Parse Tree for a Sentence

“Bill printed the file”
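The parse tree for this sentence can be derived mechanically from the grammar fragment. The following is a minimal sketch, not the course's own implementation: it assumes the grammar is stored as a Python dictionary (the names `GRAMMAR`, `expand`, `expand_seq`, and `parse` are illustrative), that ADJS may be empty, and that trees are nested tuples of the form (symbol, child, ...).

```python
# Grammar fragment as a dictionary: nonterminal -> list of alternative right-hand sides.
# Any symbol that is not a key is treated as a terminal (a word).
GRAMMAR = {
    "S":    [["NP", "VP"]],
    "NP":   [["the", "NP1"], ["PRO"], ["PN"], ["NP1"]],
    "NP1":  [["ADJS", "N"]],
    "ADJS": [[], ["ADJ", "ADJS"]],          # [] is the empty (epsilon) alternative
    "VP":   [["V"], ["V", "NP"]],
    "N":    [["file"], ["printer"]],
    "PN":   [["Bill"]],
    "PRO":  [["I"]],
    "ADJ":  [["short"], ["long"], ["fast"]],
    "V":    [["printed"], ["created"], ["want"]],
}


def expand(symbol, words, pos):
    """Yield (tree, next_pos) for every way `symbol` can cover words[pos:next_pos]."""
    if symbol not in GRAMMAR:
        # Terminal: it must match the next input word exactly.
        if pos < len(words) and words[pos] == symbol:
            yield symbol, pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        for children, end in expand_seq(rhs, words, pos):
            yield (symbol, *children), end


def expand_seq(symbols, words, pos):
    """Yield (list_of_trees, next_pos) for a sequence of grammar symbols."""
    if not symbols:
        yield [], pos
        return
    for first, mid in expand(symbols[0], words, pos):
        for rest, end in expand_seq(symbols[1:], words, mid):
            yield [first, *rest], end


def parse(sentence):
    """Return the first parse tree that covers the whole sentence, or None."""
    words = sentence.split()
    for tree, end in expand("S", words, 0):
        if end == len(words):
            return tree
    return None


if __name__ == "__main__":
    print(parse("Bill printed the file"))
```

For “Bill printed the file” the sketch first tries VP → V, finds that input remains, and then succeeds with VP → V NP. The result is the nested tuple (S (NP (PN Bill)) (VP (V printed) (NP the (NP1 (ADJS) (N file))))), where the empty ADJS node reflects the ε alternative.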


Ambiguity

Examples:

• “Have the students who missed the exam take it today.”

• “Have the students who missed the exam taken it today?”

• “The horse raced past the barn fell down.”


Parsing Strategies

• Top-Down

• Bottom-Up

• All Paths

• Best Path with Backtracking

• Best Path with Patchup

• Wait and See


Parsing Strategies

• Top-Down Parsing – Begin with the start symbol and apply the grammar rules forward until the symbols at the terminals of the tree correspond to the components of the sentence being parsed.

• Bottom-Up Parsing – Begin with the sentence to be parsed and apply the grammar rules backward until a single tree has been produced whose terminals are the words of the sentence and whose top node is the start symbol.
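To contrast with the top-down direction, here is a hedged sketch of bottom-up recognition. It uses a CYK-style chart, which is one common bottom-up technique and not necessarily the one the course has in mind. It assumes an ε-free rewriting of the English grammar fragment (NP1 → N | ADJS N and ADJS → ADJ | ADJ ADJS); the names `LEXICON`, `UNARY`, `BINARY`, and `recognize` are illustrative.

```python
# Words mapped to the symbols that can sit directly above them.
LEXICON = {
    "the": {"the"}, "file": {"N"}, "printer": {"N"}, "Bill": {"PN"}, "I": {"PRO"},
    "short": {"ADJ"}, "long": {"ADJ"}, "fast": {"ADJ"},
    "printed": {"V"}, "created": {"V"}, "want": {"V"},
}

# Rules written as (parent, child) and (parent, left, right).
# Assumption: the empty ADJS alternative has been compiled away.
UNARY = [("NP", "PRO"), ("NP", "PN"), ("NP", "NP1"),
         ("NP1", "N"), ("ADJS", "ADJ"), ("VP", "V")]
BINARY = [("S", "NP", "VP"), ("NP", "the", "NP1"), ("NP1", "ADJS", "N"),
          ("ADJS", "ADJ", "ADJS"), ("VP", "V", "NP")]


def recognize(sentence):
    """Return True if the words can be reduced, bottom-up, to the start symbol S."""
    words = sentence.split()
    n = len(words)
    chart = {}  # chart[i, j] = set of symbols that can cover words[i:j]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # Spans of one word start from the lexicon; longer spans start empty.
            cell = set(LEXICON.get(words[i], ())) if length == 1 else set()
            # Apply binary rules backward: combine two adjacent smaller spans.
            for k in range(i + 1, j):
                for parent, left, right in BINARY:
                    if left in chart[i, k] and right in chart[k, j]:
                        cell.add(parent)
            # Apply unary rules backward until nothing new can be added.
            changed = True
            while changed:
                changed = False
                for parent, child in UNARY:
                    if child in cell and parent not in cell:
                        cell.add(parent)
                        changed = True
            chart[i, j] = cell
    return n > 0 and "S" in chart[0, n]


if __name__ == "__main__":
    print(recognize("Bill printed the file"))
```

The chart starts from the words and only ever builds upward, so S appears in the cell for the whole sentence exactly when a single tree with S at the top can be assembled.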


Parsing Strategies

• All Paths – Follow all possible paths and build all the possible intermediate components.

• Best Path with Backtracking – Follow only one path at a time, but record, at every choice point, the information that is necessary to make another choice if the chosen path fails to lead to a complete interpretation of the sentence.


Parsing Strategies

• Best Path with Patchup – Follow only one path at a time, but when an error is detected, explicitly shuffle around the components that have already been formed.

• Wait and See – Follow only one path, but rather than making decisions about the function of each component as it is encountered, defer the decision until enough information is available to make it correctly.


15.2.2. Augmented Transition Network

• An Augmented Transition Network (ATN) is a top-down parsing procedure that allows various kinds of knowledge to be incorporated into the parsing system so that it can operate efficiently.

• ATN in graphical notation: see “An ATN Network for a Fragment of English”.

• Example sentence: “The long file has printed”

• The execution on this sentence proceeds as follows:

1. Begin in state S.

2. Push to NP.

3. Do a category test to see if “the” is a determiner.

4. This test succeeds, so set the DETERMINER register to DEFINITE and go to state Q6.


Augmented Transition Network

5. Do a category test to see if “long” is an adjective.

6. This test succeeds, so append “long” to the list contained in the ADJS register. (This list was previously empty). Stay in state Q6.

7. Do a category test to see if “file” is an adjective. This test fails.

8. Do a category test to see if “file” is a noun. This test succeeds, so set the NOUN register to “file” and go to state Q7.

9. Push to PP.

10. Do a category test to see if “has” is a preposition. This test fails, so pop and signal failure.


Augmented Transition Network

11. There is nothing else that can be done from state Q7, so pop and return the structure (NP (FILE (LONG) DEFINITE)).

The return causes the machine to be in state Q1, with the SUBJ register set to the structure just returned and the TYPE register set to DCL.

12. Do a category test to see if “has” is a verb. This test succeeds, so set the AUX register to NIL and set the V register to “has”. Go to state Q4.

13. Push to state NP. Since the next word, “printed”, is not a determiner or proper noun, NP will pop and return failure.

14. The only other thing to do in state Q4 is to halt. But more input remains, so a complete parse has not been found. Backtracking is now required.


Augmented Transition Network

15. The last choice point was at state Q1, so return there. The AUX and V registers must be unset.

16. Do a category test to see if “has” is an auxiliary. This test succeeds, so set the AUX register to “has” and go to state Q3.

17. Do a category test to see if “printed” is a verb. This test succeeds, so set the V register to “printed”. Go to state Q4.

18. Now, since the input is exhausted, Q4 is an acceptable final state. Pop and return the structure (S DCL (NP (FILE (LONG) DEFINITE)) HAS (VP PRINTED)).

This structure is the output of the parse.
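The trace above can be reproduced with a small simulation. This is only a sketch under stated assumptions: the network is reconstructed from the arcs the trace actually exercises (states S, Q1, Q3, Q4 and the NP sub-network with Q6 and Q7); the PP push, pronoun and proper-noun arcs are omitted; what happens after an object NP is found in Q4 is not shown in the trace, so the sketch simply accepts if the input is exhausted. All function names and the tuple layout of the result are illustrative.

```python
# Hypothetical lexicon covering only the words of the example sentence.
LEXICON = {
    "the": {"DET"},
    "long": {"ADJ"},
    "file": {"N"},
    "has": {"V", "AUX"},
    "printed": {"V"},
}


def has_cat(words, i, category):
    """Category test: does word i belong to the given lexical category?"""
    return i < len(words) and category in LEXICON.get(words[i].lower(), set())


def np_network(words, i):
    """NP sub-network: determiner, adjective loop (Q6), noun (Q7), then pop."""
    if not has_cat(words, i, "DET"):
        return  # pop and signal failure
    determiner = "DEFINITE"
    i += 1
    adjs = []
    while has_cat(words, i, "ADJ"):       # stay in Q6 while adjectives arrive
        adjs.append(words[i].upper())
        i += 1
    if has_cat(words, i, "N"):            # go to Q7 and pop with the NP structure
        yield ("NP", (words[i].upper(), tuple(adjs), determiner)), i + 1


def state_q1(words, i, regs):
    # First arc: treat the word as the main verb (AUX is NIL, here None).
    if has_cat(words, i, "V"):
        yield from state_q4(words, i + 1, {**regs, "AUX": None, "V": words[i].upper()})
    # Second arc, reached on backtracking: treat the word as an auxiliary.
    # Registers are copied per path, so AUX and V are "unset" automatically.
    if has_cat(words, i, "AUX"):
        yield from state_q3(words, i + 1, {**regs, "AUX": words[i].upper()})


def state_q3(words, i, regs):
    if has_cat(words, i, "V"):
        yield from state_q4(words, i + 1, {**regs, "V": words[i].upper()})


def state_q4(words, i, regs):
    # Push to NP for an object (assumption: accept right after the object).
    for obj, j in np_network(words, i):
        if j == len(words):
            yield ("S", regs["TYPE"], regs["SUBJ"], regs["AUX"], ("VP", regs["V"], obj))
    # Q4 is an acceptable final state only when the input is exhausted.
    if i == len(words):
        yield ("S", regs["TYPE"], regs["SUBJ"], regs["AUX"], ("VP", regs["V"]))


def parse(sentence):
    """Begin in state S, push to NP, then continue from Q1; return first full parse."""
    words = sentence.split()
    for subj, i in np_network(words, 0):
        result = next(state_q1(words, i, {"TYPE": "DCL", "SUBJ": subj}), None)
        if result is not None:
            return result
    return None


if __name__ == "__main__":
    print(parse("The long file has printed"))
```

Python generators give depth-first search with backtracking for free: the first arc out of Q1 (“has” as a verb) yields nothing because input remains in Q4, so control falls through to the auxiliary arc, mirroring steps 12 to 18.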


An ATN Network for a Fragment of English


15.2.3. Unification Grammars

• Purely declarative representations

• Unification simultaneously performs two operations:

– Matching

– Structure building, by combining constituents

• Think of graphs as sets, not lists, i.e., order doesn’t matter.


Unification Grammars (cont’d)

• Lexical items as graphs:

[CAT: DET
 LEX: the]

[CAT: N
 LEX: file
 NUMBER: SING]

• Nonterminal constituents as graphs:

[NP: [DET: the
      HEAD: file
      NUMBER: SING]]


Unification Grammars (cont’d)

• Grammar rules (e.g., NP → DET N) as graphs:

[CONSTITUENT1: [CAT: DET
                LEX: {1}]
 CONSTITUENT2: [CAT: N
                LEX: {2}
                NUMBER: {3}]
 BUILD: [NP: [DET: {1}
              HEAD: {2}
              NUMBER: {3}]]]
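One way to see how such a rule graph is used is to encode the graphs as nested Python dictionaries and treat {1}, {2}, {3} as variables. The sketch below is a simplification and an assumption on my part: it does one-way matching with variable binding rather than full unification, writes variables as strings such as "?1", and uses illustrative names (`RULE`, `match`, `build`, `apply_rule`).

```python
# The rule NP -> DET N as a graph; "?1", "?2", "?3" stand for {1}, {2}, {3}.
RULE = {
    "CONSTITUENT1": {"CAT": "DET", "LEX": "?1"},
    "CONSTITUENT2": {"CAT": "N", "LEX": "?2", "NUMBER": "?3"},
    "BUILD": {"NP": {"DET": "?1", "HEAD": "?2", "NUMBER": "?3"}},
}

# Lexical items as graphs.
THE = {"CAT": "DET", "LEX": "the"}
FILE = {"CAT": "N", "LEX": "file", "NUMBER": "SING"}


def is_var(value):
    return isinstance(value, str) and value.startswith("?")


def match(pattern, graph, bindings):
    """Match a rule pattern against a constituent graph, recording variable bindings."""
    if is_var(pattern):
        if pattern in bindings:
            return bindings[pattern] == graph   # a variable must keep one value
        bindings[pattern] = graph
        return True
    if isinstance(pattern, dict):
        return isinstance(graph, dict) and all(
            key in graph and match(value, graph[key], bindings)
            for key, value in pattern.items()
        )
    return pattern == graph                     # atomic values must be equal


def build(template, bindings):
    """Structure building: copy the template, replacing variables by their values."""
    if is_var(template):
        return bindings[template]
    if isinstance(template, dict):
        return {key: build(value, bindings) for key, value in template.items()}
    return template


def apply_rule(rule, first, second):
    """Return the graph built by the rule, or None if the constituents do not match."""
    bindings = {}
    if match(rule["CONSTITUENT1"], first, bindings) and \
       match(rule["CONSTITUENT2"], second, bindings):
        return build(rule["BUILD"], bindings)
    return None


if __name__ == "__main__":
    print(apply_rule(RULE, THE, FILE))
```

Applied to the graphs for “the” and “file”, the rule yields the NP graph with DET the, HEAD file, and NUMBER SING. Because the graphs are dictionaries, attribute order plays no role in matching.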


Algorithm: Graph-Unify (Unification Grammars)

1. If either G1 or G2 is an attribute value that is not itself a set of attribute-value pairs, then:

a. If the values conflict, then fail.

b. If either is a variable, then bind it to the value of the other and return that value.

c. Otherwise, return the most general value that is consistent with both the original values. Specifically, if disjunction is allowed, then return the intersection of the values.

2. Otherwise, do:

a. Set variable NEW to empty.

b. For each attribute A that is present (at the top level) in either G1 or G2, do:


Algorithm: Graph-Unify (cont’d)

(i) If A is not present at the top level in the other input, then add A with its value to NEW.

(ii) If it is, then call Graph-Unify with the two values for A. If that fails, then fail. Otherwise, take the new value of A to be the result of that unification and add A with its value to NEW.

c. If there are any labels attached to G1 or G2, then bind them to NEW and return NEW.
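The algorithm above can be sketched in a few lines of Python. This is a hedged illustration rather than a complete implementation: graphs are assumed to be nested dictionaries, atomic values are strings, a disjunction is written as a Python set of alternatives, and variables and labels (steps 1b and 2c) are left out for brevity. The names `graph_unify`, `as_set`, and `UnificationFailure` are illustrative.

```python
class UnificationFailure(Exception):
    """Raised when two graphs carry conflicting values."""


def as_set(value):
    """View an atomic value or a disjunction as a set of alternatives."""
    return set(value) if isinstance(value, (set, frozenset)) else {value}


def graph_unify(g1, g2):
    """Unify two graphs; return the merged graph or raise UnificationFailure."""
    g1_is_graph, g2_is_graph = isinstance(g1, dict), isinstance(g2, dict)

    # Step 1: at least one input is an atomic value (or a disjunction of them).
    if not (g1_is_graph and g2_is_graph):
        if g1_is_graph or g2_is_graph:
            raise UnificationFailure("cannot unify a graph with an atomic value")
        common = as_set(g1) & as_set(g2)        # step 1c: intersection of values
        if not common:                          # step 1a: the values conflict
            raise UnificationFailure(f"{g1!r} conflicts with {g2!r}")
        return next(iter(common)) if len(common) == 1 else common

    # Step 2: both inputs are graphs; merge them attribute by attribute.
    new = {}
    for attr in g1.keys() | g2.keys():
        if attr not in g2:                      # step 2b(i)
            new[attr] = g1[attr]
        elif attr not in g1:                    # step 2b(i), other direction
            new[attr] = g2[attr]
        else:                                   # step 2b(ii): recursive call
            new[attr] = graph_unify(g1[attr], g2[attr])
    return new


if __name__ == "__main__":
    noun = {"CAT": "N", "LEX": "file"}
    singular = {"CAT": "N", "NUMBER": {"SING", "PLURAL"}}
    print(graph_unify(noun, singular))
    print(graph_unify({"NUMBER": {"SING", "PLURAL"}}, {"NUMBER": "SING"}))
```

Unifying a graph that allows either number with one that requires SING narrows the value to SING, which is the intersection described in step 1c. Unifying [CAT: N] with [CAT: DET] raises `UnificationFailure`, the "fail" outcome of step 1a.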


<< Closing >>

End of Session 22

Good Luck