COGEX at the Second RTE. Marta Tatu, Brandon Iles, John Slavick, Adrian Novischi, Dan Moldovan. Language Computer Corporation. April 10th, 2006.


Page 1: COGEX at the Second RTE

COGEX at the Second RTE

Marta Tatu, Brandon Iles, John Slavick, Adrian Novischi, Dan Moldovan

Language Computer Corporation

April 10th, 2006

Page 2: COGEX at the Second RTE

COGEX@RTE2 2

LCC’s Submission to RTE2

Linear combination of three entailment scores

1. COGEX with constituency parse tree-derived logic forms
2. COGEX with dependency parse tree-derived logic forms
3. Lexical alignment between T and H

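For each pair i (T_i, H_i):

If $\lambda_{\mathrm{COGEX}_C}\,\mathrm{score}_{\mathrm{COGEX}_C}(i) + \lambda_{\mathrm{COGEX}_D}\,\mathrm{score}_{\mathrm{COGEX}_D}(i) + \lambda_{\mathrm{LexAlign}}\,\mathrm{score}_{\mathrm{LexAlign}}(i) > 0.5$, then T_i entails H_i

Lambda (λ) parameters learned on the development data for each task (IE, IR, QA, SUM)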
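The decision rule on this slide is a weighted sum of three scores compared against 0.5. A minimal sketch follows, assuming each score lies in [0, 1] and that the comparison is strict; the function name `entails` and all weight and score values are invented for illustration and are not the parameters LCC learned.

```python
def entails(scores, lambdas, threshold=0.5):
    """Return True if the weighted sum of the scores exceeds the threshold.

    scores and lambdas are dicts keyed by system name. The strict '>'
    comparison is an assumption of this sketch.
    """
    combined = sum(lambdas[name] * scores[name] for name in lambdas)
    return combined > threshold


# Hypothetical weights for one task; the slides say a separate set of
# lambdas is learned per task (IE, IR, QA, SUM).
example_lambdas = {"COGEX_C": 0.5, "COGEX_D": 0.3, "LexAlign": 0.2}

strong_pair = {"COGEX_C": 0.9, "COGEX_D": 0.4, "LexAlign": 0.6}
weak_pair = {"COGEX_C": 0.2, "COGEX_D": 0.5, "LexAlign": 0.3}

print(entails(strong_pair, example_lambdas))
print(entails(weak_pair, example_lambdas))
```

Keeping the weights in a per-task dictionary would make it straightforward to swap in a different set of lambdas for each of the four tasks.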

Page 3: COGEX at the Second RTE


Approach to RTE with COGEX

Transform the two text fragments into 3-layered logic forms
  Syntactic
  Semantic
  Temporal

Automatically create axioms to be used during the proof
  Lexical Chains axioms
  World Knowledge axioms
  Linguistic transformation axioms
  Semantic and temporal axioms

Load COGEX's set of support (SOS) with T and H, and its USABLE list of clauses with the generated axioms

Search for a proof by iteratively removing clauses from the SOS and searching the USABLE list for possible inferences until a refutation is found

If no contradiction is detected
  Relax arguments
  Drop entire predicates from H

Compute proof score

Page 4: COGEX at the Second RTE


COGEX Enhancements (1/3)

Logic Form Transformation

Negations

not_RB(x1,e1) & walk_VB(e1,x2,x3) » -walk_VB(e1,x2,x3)

not_RB(x1,e1) & walk_VB(e1,x2,x3) & fast_RB(x4,e1) » -fast_RB(x4,e1)

no/DT case_NN(x1) & confirm_VB(e1,x2,x1) » -confirm_VB(e1,x2,x1)

Page 5: COGEX at the Second RTE


COGEX Enhancements (1/3)

Logic Form Transformation

Temporal normalization of date/time predicates

13th of January 1990 vs. January 13th, 1990

13th_of_January_1990_NN(x1) vs. January_13th_1990_NN(x1)

time_TMP(BeginFN(x1), year, month, day, hour, minute, second) & time_TMP(EndFN(x1), year, month, day, hour, minute, second)

time_TMP(BeginFN(x1), 1990, 1, 13, 0, 0, 0) & time_TMP(EndFN(x1), 1990, 1, 13, 23, 59, 59)
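The normalization on this slide maps two surface forms of the same date onto identical begin/end time_TMP predicates. The sketch below illustrates the idea for just these two patterns; the helper names `parse_date` and `time_predicates` and the regular expressions are assumptions of this example, not COGEX's actual implementation.

```python
import re
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}


def parse_date(text):
    """Parse '13th of January 1990' or 'January 13th, 1990' into a date."""
    day_first = re.fullmatch(r"(\d{1,2})(?:st|nd|rd|th)? of (\w+) (\d{4})", text)
    month_first = re.fullmatch(r"(\w+) (\d{1,2})(?:st|nd|rd|th)?,? (\d{4})", text)
    if day_first:
        day, month, year = day_first.groups()
    elif month_first:
        month, day, year = month_first.groups()
    else:
        raise ValueError(f"unsupported date format: {text!r}")
    return date(int(year), MONTHS[month], int(day))


def time_predicates(text, var="x1"):
    """Render the begin/end time_TMP predicates for a whole-day interval."""
    d = parse_date(text)
    begin = f"time_TMP(BeginFN({var}), {d.year}, {d.month}, {d.day}, 0, 0, 0)"
    end = f"time_TMP(EndFN({var}), {d.year}, {d.month}, {d.day}, 23, 59, 59)"
    return f"{begin} & {end}"


print(time_predicates("13th of January 1990"))
print(time_predicates("January 13th, 1990"))
```

Both calls produce the same predicate string, which is what lets the prover unify the two date expressions.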

Page 6: COGEX at the Second RTE


COGEX Enhancements (1/3)

Logic Form Transformation

Temporal context

SUMO predicates (Clark et al., 2005)

(S,E1,E2) : S is the temporal signal linking two events E1 and E2

during_TMP(e1,x1), earlier_TMP(e1,x1), …

Page 7: COGEX at the Second RTE


Logic Forms Differences

Generate LF from two different sources
  Constituency parse of the data
  Dependency parse trees (data provided by the challenge organizers)

Constituency
  Semantic information
  Temporal information

Dependency
  Captures better the (long-range) syntactic dependencies
  Temporal normalization (only)
  NEs imported from the constituency LF whenever the tokens matched (no control over tokenization)

Page 8: COGEX at the Second RTE


Logic Forms Differences

Gilda Flores was kidnapped on the 13th of January 1990.

Constituency: Gilda_NN(x1) & Flores_NN(x2) & nn_NNC(x3,x1,x2) & _human_NE(x3) & kidnap_VB(e1,x9,x3) & on_IN(e1,x8) & 13th_NN(x4) & of_NN(x5) & January_NN(x6) & 1990_NN(x7) & nn_NNC(x8,x4,x5,x6,x7) & _date_NE(x8) & THM_SR(x3,e1) & TMP_SR(x8,e1) & time_TMP(BeginFN(x1), 1990, 1, 13, 0, 0, 0) & time_TMP(EndFN(x1), 1990, 1, 13, 23, 59, 59) & during_TMP(e1,x8)

Dependency: Gilda_Flores_NN(x2) & _human_NE(x2) & kidnap_VB(e1,x4,x2) & on_IN(e1,x3) & 13th_NN(x3) & of_IN(x3,x1) & January_1990_NN(x1)

Page 9: COGEX at the Second RTE


COGEX Enhancements (2/3)

Axioms on Demand

Lexical Chains

Consider the first k=3 senses for each word

Maximum length of a lexical chain = 3

DERIVATIONAL WordNet relation is ambiguous with respect to the role of the noun

Derivation-ACT: employ_VB(e1,x1,x2) → employment_NN(e1)

Derivation-AGENT: employ_VB(e1,x1,x2) → employer_NN(x1)

Derivation-THEME: employ_VB(e1,x1,x2) → employee_NN(x2)

Morphological derivations between adjectives and verbs

Page 10: COGEX at the Second RTE


COGEX Enhancements (2/3)

Axioms on Demand

Lexical Chains

Augment with the NE predicate for NE target concepts
  nicaraguan_JJ(x1,x2) → Nicaragua_NN(x1) & _country_NE(x1)

Discard lexical chains
  with more than 2 HYPONYMY relations (H too specific)
  with a HYPONYMY followed by an ISA: Chicago_NN(x1) → Detroit_NN(x1)
  which include general concepts: object/NN, act/VB, be/VB

$W_{generality}(c_i) = \frac{\log(n_i + 1)}{\log(N + 1)}$

n_i = number of hyponyms of concept c_i

N = number of concepts in c_i's hierarchy
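The discard rules for lexical chains can be read as a simple filter. The sketch below assumes a chain is a list of (relation, target concept) steps; that representation, the relation label strings, the `word_POS` concept spelling, and the function name `keep_chain` are all assumptions of this example. The maximum chain length of 3 comes from the previous slide. The generality weight is deliberately left out, and general concepts are approximated by the three listed on the slide.

```python
# Concepts named on the slide as too general (written here as word_POS).
GENERAL_CONCEPTS = {"object_NN", "act_VB", "be_VB"}
MAX_CHAIN_LENGTH = 3
MAX_HYPONYMY = 2


def keep_chain(chain):
    """Return True if a lexical chain survives all discard rules.

    chain is a list of (relation, target_concept) tuples, a hypothetical
    representation chosen for this sketch.
    """
    relations = [relation for relation, _ in chain]
    if len(chain) > MAX_CHAIN_LENGTH:
        return False
    if relations.count("HYPONYMY") > MAX_HYPONYMY:
        return False
    # A HYPONYMY step directly followed by an ISA step links siblings,
    # as in the Chicago -> Detroit example.
    for first, second in zip(relations, relations[1:]):
        if first == "HYPONYMY" and second == "ISA":
            return False
    if any(concept in GENERAL_CONCEPTS for _, concept in chain):
        return False
    return True


print(keep_chain([("DERIVATION", "employment_NN")]))
print(keep_chain([("HYPONYMY", "city_NN"), ("ISA", "Detroit_NN")]))
```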

Page 11: COGEX at the Second RTE


More Axioms

Another 73 World Knowledge axioms

Semantic Calculus: combinations of two semantic relations (82 axioms)
  ISA, KINSHIP, CAUSE are transitive relations
  ISA_SR(x1,x2) & PAH_SR(x3,x2) → PAH_SR(x3,x1)
  Mike is a rich man → Mike is rich

Temporal Reasoning Axioms (Clark et al., 2005) (65 axioms)
  Dates entail more general times
    October 2000 → year 2000
  during_TMP(e1,e2) & during_TMP(e2,e3) → during_TMP(e1,e3)
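The transitivity axiom for during_TMP can be illustrated by computing the closure of a small set of facts. This is a minimal sketch, assuming facts are stored as (inner, outer) pairs; the function name `transitive_closure` and the fixed-point loop are choices made for this example, whereas COGEX itself derives such facts through resolution.

```python
def transitive_closure(pairs):
    """Return all (a, c) pairs implied by chaining (a, b) and (b, c)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# during_TMP(e1,e2) & during_TMP(e2,e3) should yield during_TMP(e1,e3).
during_facts = {("e1", "e2"), ("e2", "e3")}
print(sorted(transitive_closure(during_facts)))
```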

Page 12: COGEX at the Second RTE


COGEX Enhancements (3/3)

Proof Re-Scoring

Entities mentioned in T and H are existentially quantified
  (T) smart people → people (H): entailment holds
  (T) people → smart people (H): entailment does not hold

Universally quantified T and H entities
  (T) people → smart people (H): entailment holds
  (T) smart people → people (H): entailment does not hold

Page 13: COGEX at the Second RTE


Shallow Lexical Alignment

Compute the edit distance between T and H
  Cost (deletion of a word from T) = 0
  Cost (replacement of a word from T with another in H) = ∞
  Cost (insertion of a word from H) = 30 for WN nouns, adj and adv; 13 for WN verbs; 10 otherwise
  Edit distance between synonyms = 0

T: The Council of Europe has 45 member states. Three countries from …

H: The Council of Europe is made up by 45 member states.

Edit operations shown: DEL, INS, DEL
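The edit distance described on this slide (free deletions from T, forbidden replacements, costly insertions from H, free synonym matches) can be sketched with a standard dynamic program. The function name `alignment_cost`, the `insert_cost` callback, and the `are_synonyms` hook are assumptions of this example; the uniform insertion cost of 1.0 used below is a placeholder, not the part-of-speech-dependent costs from the slide.

```python
import math


def alignment_cost(t_words, h_words, insert_cost, are_synonyms=lambda a, b: False):
    """Cost of turning T into H: deleting a T word is free, inserting an
    H word costs insert_cost(word), and a T word may only be kept when it
    equals or is a synonym of the H word (replacement cost is infinite)."""
    rows, cols = len(t_words) + 1, len(h_words) + 1
    dp = [[math.inf] * cols for _ in range(rows)]
    dp[0][0] = 0.0
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                continue
            best = math.inf
            if i > 0:
                best = min(best, dp[i - 1][j])  # delete a T word at cost 0
            if j > 0:
                best = min(best, dp[i][j - 1] + insert_cost(h_words[j - 1]))
            if i > 0 and j > 0:
                t_word, h_word = t_words[i - 1], h_words[j - 1]
                if t_word == h_word or are_synonyms(t_word, h_word):
                    best = min(best, dp[i - 1][j - 1])  # match at cost 0
            dp[i][j] = best
    return dp[-1][-1]


t = "the council of europe has 45 member states three countries from".split()
h = "the council of europe is made up by 45 member states".split()
print(alignment_cost(t, h, insert_cost=lambda word: 1.0))
```

With a uniform cost, the result simply counts the H words that have no in-order counterpart in T; plugging in a cost function that depends on part of speech would weight those insertions differently.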

Page 14: COGEX at the Second RTE


Results

Learned parameters
  IE: score given by COGEX_C with some correction from COGEX_D
  IR: the highest contribution is made by LexAlign (~62%)

COGEX_D better on IE, IR, QA (~69% accuracy)

COGEX_C better on SUM (~66% accuracy)

Three-way combination outperforms any individual result and any two-system combination

Page 15: COGEX at the Second RTE


Results, Future Work

Higher accuracy on the SUM task
  SUM is the highest-accuracy task for all systems (false entailment pairs had H completely unrelated to the texts T)

IE: highest number of false positives

Future enhancements
  Other types of context: report, planning, etc.
  Need for more axioms
    Automatic gathering of semantic axioms
    Paraphrase acquisition (phrase1 → phrase2)

Page 16: COGEX at the Second RTE

Thank you!

Questions?