Learning to Transform Natural to Formal Languages

Page 1: Learning to Transform Natural to Formal Languages

Machine Learning Group

Department of Computer Sciences

University of Texas at Austin

Learning to Transform Natural to Formal Languages

July 13, 2005

Rohit J. Kate Yuk Wah Wong Raymond J. Mooney

Page 2: Learning to Transform Natural to Formal Languages

2

Introduction

• Semantic Parsing: Transforming natural language sentences into executable complete formal representations

• Different from Semantic Role Labeling, which involves only shallow semantic analysis

• Two application domains:

– CLang: RoboCup Coach Language

– GeoQuery: A Database Query Application

Page 3: Learning to Transform Natural to Formal Languages

3

CLang: RoboCup Coach Language

• In the RoboCup Coach Competition, teams compete to coach simulated players

• The coaching instructions are given in a formal language called CLang

Diagram: the coach sends CLang instructions to the players on the simulated soccer field. Semantic parsing maps the natural language advice to CLang:

NL: If the ball is in our penalty area, then all our players except player 4 should stay in our half.

CLang: ((bpos (penalty-area our)) (do (player-except our {4}) (pos (half our))))

Page 4: Learning to Transform Natural to Formal Languages

4

GeoQuery: A Database Query Application

• Query application for a U.S. geography database containing about 800 facts [Zelle & Mooney, 1996]

Diagram: semantic parsing maps the user's question to a database query:

User: How many cities are there in the US?

Query: answer(A, count(B, (city(B), loc(B, C), const(C, countryid(USA))), A))

Page 5: Learning to Transform Natural to Formal Languages

5

Outline

• Semantic Parsing using Transformation Rules

• Learning Transformation Rules

• Experiments

• Conclusions

Page 6: Learning to Transform Natural to Formal Languages

6

Semantic Parsing using Transformation Rules

• SILT (Semantic Interpretation by Learning Transformations)

• Uses pattern-based transformation rules which map natural language phrases to formal language constructs

• Transformation rules are repeatedly applied to the sentence to construct its formal language expression

Page 7: Learning to Transform Natural to Formal Languages

7

Formal Language Grammar

NL: If our player 4 has the ball, our player 4 should shoot.

CLang: ((bowner our {4}) (do our {4} shoot))

CLang Parse:

RULE
  CONDITION
    bowner
    TEAM → our
    UNUM → 4
  DIRECTIVE
    do
    TEAM → our
    UNUM → 4
    ACTION → shoot

• Non-terminals: RULE, CONDITION, ACTION…

• Terminals: bowner, our, 4…

• Productions:

RULE → CONDITION DIRECTIVE

DIRECTIVE → do TEAM UNUM ACTION

ACTION → shoot
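The CLang parse on this slide can be made concrete with a minimal Python sketch. It assumes a CLang expression can be read as a plain parenthesized list; the helper names `tokenize` and `read_expr` are illustrative and not part of SILT, and the sketch does not check the expression against the CLang grammar.

```python
def tokenize(text):
    """Split a CLang string into parentheses and atoms such as 'our' or '{4}'."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def read_expr(tokens):
    """Consume tokens from the front and return a nested list (or a single atom)."""
    tok = tokens.pop(0)
    if tok == "(":
        node = []
        while tokens[0] != ")":
            node.append(read_expr(tokens))
        tokens.pop(0)  # drop the closing parenthesis
        return node
    return tok


clang = "((bowner our {4}) (do our {4} shoot))"
tree = read_expr(tokenize(clang))
print(tree)
```

The two sublists correspond to the CONDITION and DIRECTIVE children of the RULE node in the parse shown on the slide.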

Page 8: Learning to Transform Natural to Formal Languages

8

Transformation Rule Representation

• A rule has two components: a natural language pattern and an associated formal language template

• Two versions of SILT:

– String-based rules: used to convert a natural language sentence directly to formal language

– Tree-based rules: used to convert a syntactic tree to formal language

String-pattern: TEAM UNUM has [1] ball (here [1] is a word gap: up to one word may be skipped)

Template: CONDITION → (bowner TEAM {UNUM})

Tree-pattern: (S (NP TEAM UNUM) (VP (VBZ has) (NP (DT the) (NN ball))))

Template: CONDITION → (bowner TEAM {UNUM})

Page 9: Learning to Transform Natural to Formal Languages

9

Example of Semantic Parsing

If our player 4 has the ball, our player 4 should shoot.

Transformation rules (pattern → template):

our → TEAM our

player 4 → UNUM 4

shoot → ACTION shoot

TEAM UNUM has [1] ball → CONDITION (bowner TEAM {UNUM})

TEAM UNUM should ACTION → DIRECTIVE (do TEAM {UNUM} ACTION)

If CONDITION, DIRECTIVE. → RULE (CONDITION DIRECTIVE)

Page 10: Learning to Transform Natural to Formal Languages

10

Example of Semantic Parsing

Applying rule: our → TEAM our (both occurrences)

If [our → TEAM our] player 4 has the ball, [our → TEAM our] player 4 should shoot.

Page 11: Learning to Transform Natural to Formal Languages

11

Example of Semantic Parsing

If [TEAM our] player 4 has the ball, [TEAM our] player 4 should shoot.

Page 12: Learning to Transform Natural to Formal Languages

12

Example of Semantic Parsing

Applying rule: player 4 → UNUM 4 (both occurrences)

If [TEAM our] [player 4 → UNUM 4] has the ball, [TEAM our] [player 4 → UNUM 4] should shoot.

Page 13: Learning to Transform Natural to Formal Languages

13

Example of Semantic Parsing

If [TEAM our] [UNUM 4] has the ball, [TEAM our] [UNUM 4] should shoot.

Page 14: Learning to Transform Natural to Formal Languages

14

Example of Semantic Parsing

Applying rule: shoot → ACTION shoot

If [TEAM our] [UNUM 4] has the ball, [TEAM our] [UNUM 4] should [shoot → ACTION shoot].

Page 15: Learning to Transform Natural to Formal Languages

15

Example of Semantic Parsing

If [TEAM our] [UNUM 4] has the ball, [TEAM our] [UNUM 4] should [ACTION shoot].

Page 16: Learning to Transform Natural to Formal Languages

16

Example of Semantic Parsing

Applying rule: TEAM UNUM has [1] ball → CONDITION (bowner TEAM {UNUM})

If [TEAM our UNUM 4 has the ball → CONDITION (bowner our {4})], [TEAM our] [UNUM 4] should [ACTION shoot].

Page 17: Learning to Transform Natural to Formal Languages

17

Example of Semantic Parsing

If [CONDITION (bowner our {4})], [TEAM our] [UNUM 4] should [ACTION shoot].

Page 18: Learning to Transform Natural to Formal Languages

18

Example of Semantic Parsing

Applying rule: TEAM UNUM should ACTION → DIRECTIVE (do TEAM {UNUM} ACTION)

If [CONDITION (bowner our {4})], [TEAM our UNUM 4 should ACTION shoot → DIRECTIVE (do our {4} shoot)].

Page 19: Learning to Transform Natural to Formal Languages

19

Example of Semantic Parsing

If [CONDITION (bowner our {4})], [DIRECTIVE (do our {4} shoot)].

Page 20: Learning to Transform Natural to Formal Languages

20

Example of Semantic Parsing

Applying rule: If CONDITION, DIRECTIVE. → RULE (CONDITION DIRECTIVE)

[If CONDITION (bowner our {4}), DIRECTIVE (do our {4} shoot). → RULE ((bowner our {4}) (do our {4} shoot))]

Final result: RULE ((bowner our {4}) (do our {4} shoot))
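The derivation walked through on these slides can be reproduced with a small Python sketch. It is only an illustration under simplifying assumptions, not SILT's implementation: rules are tried in a fixed order and applied greedily left to right, a gap `[k]` skips up to k plain words, and templates are filled by replacing each nonterminal name with the meaning of the matched constituent. The names `match_at`, `instantiate`, and `semantic_parse` are hypothetical.

```python
import re

NONTERMINALS = {"TEAM", "UNUM", "ACTION", "CONDITION", "DIRECTIVE", "RULE"}

# (pattern tokens, nonterminal produced, template), copied from the slides
RULES = [
    (["our"], "TEAM", "our"),
    (["player", "4"], "UNUM", "4"),
    (["shoot"], "ACTION", "shoot"),
    (["TEAM", "UNUM", "has", "[1]", "ball"], "CONDITION", "(bowner TEAM {UNUM})"),
    (["TEAM", "UNUM", "should", "ACTION"], "DIRECTIVE", "(do TEAM {UNUM} ACTION)"),
    (["If", "CONDITION", ",", "DIRECTIVE", "."], "RULE", "(CONDITION DIRECTIVE)"),
]


def match_at(pattern, items, start):
    """Match pattern at items[start:]; return (end, bound constituents) or None."""
    def rec(p, i, bound):
        if p == len(pattern):
            return i, bound
        tok = pattern[p]
        gap = re.fullmatch(r"\[(\d+)\]", tok)
        if gap:
            for skip in range(int(gap.group(1)) + 1):
                # every skipped item must exist and be an ordinary word
                if i + skip > len(items) or any(
                    not isinstance(x, str) for x in items[i:i + skip]
                ):
                    break
                found = rec(p + 1, i + skip, bound)
                if found is not None:
                    return found
            return None
        if i >= len(items):
            return None
        item = items[i]
        if tok in NONTERMINALS:
            if isinstance(item, tuple) and item[0] == tok:
                return rec(p + 1, i + 1, bound + [item])
            return None
        if isinstance(item, str) and item.lower() == tok.lower():
            return rec(p + 1, i + 1, bound)
        return None
    return rec(0, start, [])


def instantiate(template, bound):
    """Fill the template with the meanings of the matched constituents, in order."""
    for label, meaning in bound:
        template = template.replace(label, meaning, 1)
    return template


def semantic_parse(sentence, rules):
    """Apply the rules repeatedly until no rule matches anywhere."""
    items = sentence.split()  # words are str, constituents are (label, meaning)
    changed = True
    while changed:
        changed = False
        for pattern, label, template in rules:
            start = 0
            while start < len(items):
                found = match_at(pattern, items, start)
                if found is not None:
                    end, bound = found
                    items[start:end] = [(label, instantiate(template, bound))]
                    changed = True
                start += 1
    return items


result = semantic_parse(
    "If our player 4 has the ball , our player 4 should shoot .", RULES
)
print(result)  # [('RULE', '((bowner our {4}) (do our {4} shoot))')]
```

Punctuation is written as separate tokens so that a plain whitespace split is enough for tokenization.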

Page 21: Learning to Transform Natural to Formal Languages

21

Learning Transformation Rules

• SILT induces rules from a corpus of NL sentences paired with their formal representations

• Patterns are learned for each production by bottom-up rule learning

• For every production:

– Call those sentences positives whose formal representations’ parses use that production

– Call the remaining sentences negatives
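The positive/negative split can be sketched in a few lines of Python. This is a minimal illustration that assumes each training sentence already comes with the set of productions used in the parse of its formal representation; the function name `split_examples` and the production sets assigned to the two sample sentences are made up for the example.

```python
def split_examples(examples, production):
    """Partition sentences by whether their formal parse uses the production.

    examples: list of (sentence, set of productions used in its parse).
    """
    positives, negatives = [], []
    for sentence, productions_used in examples:
        if production in productions_used:
            positives.append(sentence)
        else:
            negatives.append(sentence)
    return positives, negatives


# Hypothetical annotations: the production sets are assumed, not taken from the corpus.
examples = [
    ("If the ball is in REGION and not in REGION then player 3 should intercept the ball.",
     {"CONDITION -> (bpos REGION)", "RULE -> (CONDITION DIRECTIVE)"}),
    ("If our player 6 has the ball then he should take a shot on goal.",
     {"CONDITION -> (bowner TEAM {UNUM})", "RULE -> (CONDITION DIRECTIVE)"}),
]

positives, negatives = split_examples(examples, "CONDITION -> (bpos REGION)")
print(len(positives), len(negatives))
```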

Page 22: Learning to Transform Natural to Formal Languages

22

Rule Learning for a Production

• SILT applies a greedy-covering, bottom-up rule induction method that repeatedly generalizes positives until they start covering negatives

Production: CONDITION → (bpos REGION)

Positives:

• The ball is in REGION , our player 7 is in REGION and no opponent is around our player 7 within 1.5 distance.

• If the ball is in REGION and not in REGION then player 3 should intercept the ball.

• During normal play if the ball is in the REGION then player 7 , 9 and 11 should dribble the ball to the REGION .

• When the play mode is normal and the ball is in the REGION then our player 2 should pass the ball to the REGION .

• All players except the goalie should pass the ball to REGION if it is in RP18.

• If the ball is inside rectangle ( -54 , -36 , 0 , 36 ) then player 10 should position itself at REGION with a ball attraction of REGION .

• Player 2 should pass the ball to REGION if it is in REGION .

Negatives:

• If our player 6 has the ball then he should take a shot on goal.

• If player 4 has the ball , it should pass the ball to player 2 or 10.

• If the condition DR5C3 is true , then player 2 , 3 , 7 and 8 should pass the ball to player 3.

• During play on , if players 6 , 7 or 8 is in REGION , they should pass the ball to players 9 , 10 or 11.

• If "Clear_Condition" , players 2 , 3 , 7 or 5 should clear the ball REGION .

• If it is before the kick off , after our goal or after the opponent's goal , position player 3 at REGION .

• If the condition MDR4C9 is met , then players 4-6 should pass the ball to player 9.

• If Pass_11 then player 11 should pass to player 9 and no one else.

Page 23: Learning to Transform Natural to Formal Languages

23

Generalization of String Patterns

ACTION → (pos REGION)

Pattern 1: Always position player UNUM at REGION .

Pattern 2: Whenever the ball is in REGION, position player UNUM near the REGION .

Find the highest scoring common subsequence:

$\mathrm{score}(c) = \mathrm{length}(c) - \eta \times (\text{sum of word gaps})$

Page 24: Learning to Transform Natural to Formal Languages

24

Generalization of String Patterns

ACTION → (pos REGION)

Pattern 1: Always position player UNUM at REGION .

Pattern 2: Whenever the ball is in REGION, position player UNUM near the REGION .

Find the highest scoring common subsequence:

$\mathrm{score}(c) = \mathrm{length}(c) - \eta \times (\text{sum of word gaps})$

Generalization: position player UNUM [2] REGION .
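The generalization step on this slide can be approximated with a short Python sketch. It is a simplification and not SILT's procedure: it takes a plain longest common subsequence (LCS) as a stand-in for the highest scoring subsequence rather than maximizing the score directly, it assumes the gap between two kept tokens is the larger of the two gaps in the original patterns, and the value of `eta` used below is arbitrary. The names `lcs_alignment` and `generalize` are illustrative.

```python
def lcs_alignment(a, b):
    """Return index pairs (i, j) of one longest common subsequence of a and b."""
    n, m = len(a), len(b)
    # table[i][j] = LCS length of a[i:] and b[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def generalize(p1, p2, eta):
    """Return (generalized pattern, score) for two tokenized patterns."""
    pairs = lcs_alignment(p1, p2)
    tokens, total_gap = [], 0
    for k, (i, j) in enumerate(pairs):
        if k > 0:
            prev_i, prev_j = pairs[k - 1]
            gap = max(i - prev_i - 1, j - prev_j - 1)  # words skipped in either pattern
            if gap > 0:
                tokens.append(f"[{gap}]")
                total_gap += gap
        tokens.append(p1[i])
    score = len(pairs) - eta * total_gap
    return " ".join(tokens), score


p1 = "Always position player UNUM at REGION .".split()
p2 = "Whenever the ball is in REGION , position player UNUM near the REGION .".split()
pattern, score = generalize(p1, p2, eta=0.5)  # eta=0.5 is an arbitrary choice
print(pattern)  # position player UNUM [2] REGION .
```

With these two patterns the sketch reproduces the generalization shown on the slide: five shared tokens and a single gap of two words between UNUM and REGION.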

Page 25: Learning to Transform Natural to Formal Languages

25

Generalization of Tree Patterns

REGION → (penalty-area TEAM)

Pattern 1: (NP (NP TEAM (POS ’s)) (NN penalty) (NN box))

Pattern 2: (NP (PRP$ TEAM) (NN penalty) (NN area))

Find common subgraphs.

Page 26: Learning to Transform Natural to Formal Languages

26

Generalization of Tree Patterns

REGION → (penalty-area TEAM)

Pattern 1: (NP (NP TEAM (POS ’s)) (NN penalty) (NN box))

Pattern 2: (NP (PRP$ TEAM) (NN penalty) (NN area))

Find common subgraphs.

Generalization: (NP (* TEAM) (NN penalty) (NN))

Page 27: Learning to Transform Natural to Formal Languages

27

Rule Learning for a Production

Production: CONDITION → (bpos REGION)

Positives:

• The ball is in REGION , our player 7 is in REGION and no opponent is around our player 7 within 1.5 distance.

• If the ball is in REGION and not in REGION then player 3 should intercept the ball.

• During normal play if the ball is in the REGION then player 7 , 9 and 11 should dribble the ball to the REGION .

• When the play mode is normal and the ball is in the REGION then our player 2 should pass the ball to the REGION .

• All players except the goalie should pass the ball to REGION if it is in REGION.

• If the ball is inside REGION then player 10 should position itself at REGION with a ball attraction of REGION .

• Player 2 should pass the ball to REGION if it is in REGION .

Negatives: the same eight sentences as on Page 22.

Bottom-up Rule Learner output:

ball is [2] REGION → CONDITION (bpos REGION)

it is in REGION → CONDITION (bpos REGION)

Page 28: Learning to Transform Natural to Formal Languages

28

Rule Learning for a Production

Production: CONDITION → (bpos REGION)

Bottom-up Rule Learner output:

ball is [2] REGION → CONDITION (bpos REGION)

it is in REGION → CONDITION (bpos REGION)

Positives after applying the learned rules:

• The CONDITION , our player 7 is in REGION and no opponent is around our player 7 within 1.5 distance.

• If the CONDITION and not in REGION then player 3 should intercept the ball.

• During normal play if the CONDITION then player 7 , 9 and 11 should dribble the ball to the REGION .

• When the play mode is normal and the CONDITION then our player 2 should pass the ball to the REGION .

• All players except the goalie should pass the ball to REGION if CONDITION.

• If the CONDITION then player 10 should position itself at REGION with a ball attraction of REGION .

• Player 2 should pass the ball to REGION if CONDITION .

Negatives: the same eight sentences as on Page 22.

Page 29: Learning to Transform Natural to Formal Languages

29

Rule Learning for All Productions

• Transformation rules for productions should cooperate globally to generate complete semantic parses

• Redundantly cover every positive example by β = 5 best rules

• Find the subset of these rules which best cooperate to generate complete semantic parses on the training data

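$\mathrm{goodness}(r) = \mathrm{pos}(r) \times \dfrac{\mathrm{pos}(r)}{\mathrm{pos}(r) + \mathrm{neg}(r)}$

The first factor, $\mathrm{pos}(r)$, is the coverage; the second factor, $\mathrm{pos}(r) / (\mathrm{pos}(r) + \mathrm{neg}(r))$, is the accuracy.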
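As a worked example of the goodness measure, the following Python sketch computes it from counts of covered positives and negatives and keeps the top-scoring rules. The rule names and counts are invented for illustration, and picking the β best rules by a simple sort is an assumption about how the ranking could be done, not a description of SILT's code.

```python
def goodness(pos, neg):
    """Coverage times accuracy: pos * pos / (pos + neg)."""
    if pos + neg == 0:
        return 0.0
    return pos * (pos / (pos + neg))


def best_rules(rules, beta=5):
    """Keep the beta rules with the highest goodness.

    rules: list of (name, positives covered, negatives covered).
    """
    ranked = sorted(rules, key=lambda r: goodness(r[1], r[2]), reverse=True)
    return ranked[:beta]


# Hypothetical candidate rules with made-up coverage counts.
candidates = [("rule_a", 8, 2), ("rule_b", 3, 0), ("rule_c", 1, 9)]
for name, pos, neg in best_rules(candidates, beta=2):
    print(name, goodness(pos, neg))
```

A rule covering 8 positives and 2 negatives scores 8 × 0.8 = 6.4, so it outranks a perfectly accurate rule that covers only 3 positives (score 3.0).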

Page 30: Learning to Transform Natural to Formal Languages

30

Experimental Corpora

• CLang

– 300 randomly selected pieces of coaching advice from the log files of the 2003 RoboCup Coach Competition

– 22.52 words on average in NL sentences

– 14.24 tokens on average in formal expressions

• GeoQuery [Zelle & Mooney, 1996]

– 250 queries for the given U.S. geography database

– 6.87 words on average in NL sentences

– 5.32 tokens on average in formal expressions

Page 31: Learning to Transform Natural to Formal Languages

31

Experimental Methodology

• Evaluated using standard 10-fold cross validation

• Syntactic parses needed by tree-based version were obtained by training Collins’ parser [Bikel, 2004] on WSJ treebank and gold-standard parses of training sentences

• Correctness

– CLang: output exactly matches the correct representation

– GeoQuery: the resulting query retrieves the same answer as the correct representation

• Metrics

$\mathrm{Precision} = \dfrac{|\text{Correct Completed Parses}|}{|\text{Completed Parses}|}$

$\mathrm{Recall} = \dfrac{|\text{Correct Completed Parses}|}{|\text{Sentences}|}$
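The two metrics can be illustrated with a small Python sketch. The function name `precision_recall` and the counts in the example are hypothetical and are not results from the experiments.

```python
def precision_recall(num_sentences, num_completed, num_correct):
    """Precision over completed parses, recall over all test sentences."""
    precision = num_correct / num_completed if num_completed else 0.0
    recall = num_correct / num_sentences if num_sentences else 0.0
    return precision, recall


# Made-up counts: 30 test sentences, 20 completed parses, 15 of them correct.
precision, recall = precision_recall(num_sentences=30, num_completed=20, num_correct=15)
print(precision, recall)  # 0.75 0.5
```

Because a sentence with no completed parse counts against recall but not against precision, a parser that abstains on hard sentences can have high precision and low recall.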

Page 32: Learning to Transform Natural to Formal Languages

32

Compared Systems

• CHILL

– Learns control rules for shift-reduce parsing using Inductive Logic Programming (ILP)

– CHILLIN [Zelle & Mooney, 1996]

– COCKTAIL [Tang & Mooney, 2001]

• GEOBASE

– Hand-built parser for GeoQuery [Borland International, 1988]

Page 33: Learning to Transform Natural to Formal Languages

33

Precision Learning Curves for CLang

Page 34: Learning to Transform Natural to Formal Languages

34

Recall Learning Curves for CLang

Page 35: Learning to Transform Natural to Formal Languages

35

Precision Learning Curves for GeoQuery

Page 36: Learning to Transform Natural to Formal Languages

36

Recall Learning Curves for GeoQuery

Page 37: Learning to Transform Natural to Formal Languages

37

Related Work

• SCISSOR [Ge & Mooney, 2005]

– Integrates semantic and syntactic statistical parsing

– Requires extensive annotations but gives better results

• PRECISE [Popescu et al., 2003]

– Designed specifically for NL database interfaces

• Speech Recognition Community [Zue & Glass, 2000]

– Simpler queries in ATIS corpus

Page 38: Learning to Transform Natural to Formal Languages

38

Conclusions

• New approach for semantic parsing, SILT, which uses transformation rules

• SILT learns transformation rules by doing bottom-up rule induction exploiting the target language grammar

• Tested on two very different domains, SILT performs better than previous ILP-based approaches

Page 39: Learning to Transform Natural to Formal Languages

39

Thank You!

Our corpora can be downloaded from: http://www.cs.utexas.edu/~ml/nldata.html

Questions??

Page 40: Learning to Transform Natural to Formal Languages

40

F-measure Learning Curves for CLang

Page 41: Learning to Transform Natural to Formal Languages

41

F-measure Learning Curves for GeoQuery

Page 42: Learning to Transform Natural to Formal Languages

42

Extra Slide: Average Training Time in Minutes

System        CLang   GeoQuery

SILT-string    3.2     0.35

CHILLIN       10.4     6.3

SILT-tree     81.4    21.5

COCKTAIL        -     39.6

Page 43: Learning to Transform Natural to Formal Languages

43

Extra Slide: Variations of Rule Representation

• Context in the patterns:

in REGION

CONDITION (bpos REGION)

Page 44: Learning to Transform Natural to Formal Languages

44

Extra Slide: Variations of Rule Representation

• Context in the patterns:

Pattern with context: the ball in REGION → CONDITION (bpos REGION)

TEAM UNUM has [1] ball → CONDITION (bowner TEAM {UNUM})

Example: TEAM UNUM has the ball in REGION

Here "in REGION" is transformed to CONDITION (bpos REGION).

Page 45: Learning to Transform Natural to Formal Languages

45

Extra Slide: Variations of Rule Representation

• Context in the patterns

• Templates with multiple productions:

TEAM UNUM has the ball in REGION

CONDITION → (and (bowner TEAM {UNUM}) (bpos REGION))

Page 46: Learning to Transform Natural to Formal Languages

46

Extra Slide: Experimental Methodology

• Correctness

– CLang: output exactly matches the correct representation

– GeoQuery: the resulting query retrieves the same answer as the correct representation

If the ball is in our penalty area, all our players except player 4 should stay in our half.

Correct: ((bpos (penalty-area our)) (do (player-except our {4}) (pos (half our))))

Output: ((bpos (penalty-area opp)) (do (player-except our {4}) (pos (half our))))

Page 47: Learning to Transform Natural to Formal Languages

47

Extra Slide: Future Work

• Hard-matching symbolic patterns are sometimes too brittle; exploit string and tree kernels as classifiers [Lodhi et al., 2002]

• Unified implementation of string and tree-based versions for direct comparisons