Heuristic Inverse Subsumption in Full-clausal Theories Y. Yamamoto 1 , K. Inoue 2 and K. Iwanuma 1 1 University of Yamanashi 2 National Institute of Informatics Int. Conf. on Inductive Logic Programming (ILP2012) Dubrovnik

  • Heuristic Inverse Subsumption in Full-clausal Theories

    Y. Yamamoto (1), K. Inoue (2) and K. Iwanuma (1)

    (1) University of Yamanashi

    (2) National Institute of Informatics

    Int. Conf. on Inductive Logic Programming (ILP2012)

    Dubrovnik

  • Motivation

    Progress in ILP

    2010: proposed a new form of inverse subsumption (IS) for complete explanatory induction

    2011: embedded this complete IS into CF-induction and statistically characterized the obtained hypotheses

    Question: how well does the complete IS work in practice?

    2012: for an empirical evaluation, we provide a heuristic IS algorithm applicable to full-clausal theories

    From inverse entailment to inverse subsumption

  • Contents

      Overview

      Lattice-search in Progol-like ILP systems

      Case study

      Empirical result

      Conclusion and future work

  • Problem setting (explanatory induction)

    Input:
    B : The (prior) background theory
    E : (Positive) examples/observations

    Task: Finding a hypothesis H such that B ∧ H ⊨ E and B ∧ H is consistent.

  • Inverse Subsumption (IS) [Y. Yamamoto et al., 10]

    B ∧ ¬E ⊨ F ⊨ ¬H

    Constructing a bridge theory F

    F* is a CNF formula equivalent to ¬F

    Generalizing F* to H: ∀C ∈ F*, ∃D ∈ H s.t. D subsumes C

    How to construct F*?
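The subsumption condition between H and F* can be sketched for the ground case as follows. This is only an illustration: modelling clauses as frozensets of literal strings and the names `clause_subsumes` and `theory_subsumes` are assumptions, not part of the slides. For ground clauses, D subsumes C exactly when every literal of D occurs in C.

```python
def clause_subsumes(d, c):
    """Ground case: D subsumes C iff every literal of D occurs in C."""
    return d <= c


def theory_subsumes(h, f_star):
    """Check: for all C in F*, there exists D in H such that D subsumes C."""
    return all(any(clause_subsumes(d, c) for d in h) for c in f_star)


# Toy theories (hypothetical literals p, q, r; "~" marks negation).
f_star = {frozenset({"p", "~q", "r"}), frozenset({"~p", "r"})}
h_good = {frozenset({"r"})}
h_bad = {frozenset({"p"})}

print(theory_subsumes(h_good, f_star))  # True
print(theory_subsumes(h_bad, f_star))   # False
```

Non-ground hypotheses additionally need a substitution search, which this sketch leaves out.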

  • Bottom theory F*

    Definition (Induction field). An induction field IH is defined as ⟨L⟩, where L is a finite set of ground literals to appear in ground hypotheses. Given an induction field IH = ⟨L⟩, Taut(IH) is defined as the set of tautologies: Taut(IH) = { ¬A ∨ A | A ∈ L, ¬A ∈ L }.

    Definition (Bottom theory). Given a bridge theory F and an induction field IH, the bottom theory wrt F and IH is defined as the theory τ(MD(F ∪ Taut(IH))), where τ(MD(X)) is the minimal complement of X, which contains no subsumed clauses and no tautologies.

    Key idea: adding the tautologies

    Every hypothesis subsumes the bottom theory

    How can we practically search the subsumption lattice bounded by the bottom theory for a hypothesis?

  • How to practically search the lattice?

    Lattice-search techniques in Progol-like ILP systems

    1. Reducing the search space
       –  Mode declarations
       –  A specific (weak and ordered) subsumption lattice

    2. Evaluating hypotheses
       –  A heuristic function evaluating description length and coverage of positive/negative examples

    3. Best-first search
       –  Called an A*-like algorithm

  • Comparison

    Properties             Progol-like ILP systems (Progol / Aleph)                   Inverse subsumption (general setting)
    Hypothesis class       Horn theory                                                Full-clausal theory
    Inductive bias         Mode declaration                                           Induction field
    Subsumption            Ordered                                                    General
    Bottom theory          ⊥(B, E)                                                    τ(MD(F ∪ Taut(IH)))
    Heuristic function f   f = |covered examples| - (|size| + |singleton variables|)  Nothing
    Search strategy        Best-first search (called A*-like)                         Nothing

  • A practical setting of IS (proposal)

    Properties             Progol-like ILP systems (Progol / Aleph)                      Inverse subsumption (practical setting)
    Hypothesis class       Horn theory                                                   Full-clausal theory
    Inductive bias         Mode declaration                                              Full-clausal mode declaration
    Subsumption            Ordered                                                       General
    Bottom theory          ⊥(B, E)                                                       e ∈ τ(MD(F ∪ Taut(IH)))
    Heuristic function f   f = |covered examples| - (|size| + |inconsistent variables|)  f = |covered clauses of τ(MD(F ∪ Taut(IH)))| - (|size| + |inconsistent or singleton variables|)
    Search strategy        Best-first search (called A*-like)                            Best-first search

  • Case study

    •  Mode declarations M
       –  Modeh(1, buy(+man, #item)).
       –  Modeh(1, shopping(+man, #date)).
       –  Modeb(1, buy(+man, #item)).
       (Type of variables) man(john). item(diaper). item(beer). date(at_night).

    •  Background theory B
       buy(john, diaper) ∨ buy(john, beer).

    •  Examples E
       shopping(john, at_night).

  • Case study

    •  Step 0: extracting an induction field IH from M

    M :
       –  Modeh(1, buy(+man, #item)).
       –  Modeh(1, shopping(+man, #date)).
       –  Modeb(1, buy(+man, #item)).
       (Type of variables) man(john). item(diaper). item(beer). date(at_night).

    IH : ⟨ buy(john, diaper), buy(john, beer), ¬shopping(john, at_night), ¬buy(john, diaper), ¬buy(john, beer) ⟩

  • Case study

    •  Step 1: constructing a bridge theory F = B ∪ ¬E

    •  Step 2: computing τ(MD(F ∪ Taut(IH)))

    τ(MD(F ∪ Taut(IH))) = {
      buy(john, diaper) ∨ ¬buy(john, beer) ∨ shopping(john, at_night),
      ¬buy(john, diaper) ∨ buy(john, beer) ∨ shopping(john, at_night),
      ¬buy(john, diaper) ∨ ¬buy(john, beer) ∨ shopping(john, at_night) }

    Computing the best hypothesis clause for each clause in τ(MD(F ∪ Taut(IH))) one by one
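Steps 1 and 2 can be reproduced with a small ground-level sketch. It assumes one particular reading of the minimal complement: pick one literal from each clause of F ∪ Taut(IH), negate it, collect the picks as a clause, and then drop tautologies and subsumed clauses. The string encoding of literals ("~" for negation) and the helper names are illustrative assumptions, not the authors' implementation.

```python
from itertools import product


def neg(lit):
    """Negate a literal written as 'atom' or '~atom'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def taut(ih):
    """Taut(IH): one clause {A, ~A} per atom A with both A and ~A in IH."""
    return {frozenset({l, neg(l)})
            for l in ih if not l.startswith("~") and neg(l) in ih}


def is_tautology(clause):
    return any(neg(l) in clause for l in clause)


def minimal_complement(theory):
    """Negate one literal per clause in every possible way, then remove
    tautologies and clauses that are strict supersets of another clause."""
    clauses = [sorted(c) for c in theory]
    cands = {frozenset(neg(l) for l in pick) for pick in product(*clauses)}
    cands = {c for c in cands if not is_tautology(c)}
    return {c for c in cands if not any(d < c for d in cands)}


d, b, s = "buy(john,diaper)", "buy(john,beer)", "shopping(john,at_night)"
F = {frozenset({d, b}), frozenset({neg(s)})}   # bridge theory F = B ∪ ¬E
IH = {d, b, neg(s), neg(d), neg(b)}            # induction field from Step 0
bottom = minimal_complement(F | taut(IH))
for clause in bottom:
    print(" ∨ ".join(sorted(clause)))
```

Under these assumptions the result is the three clauses listed in Step 2. The enumeration is exponential in the number of clauses, which fits the later remark that computing the bottom theory is the expensive part.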

  • Case study

    Best-first search in the subsumption lattice bounded by some (selected) clause in τ(MD(F ∪ Taut(IH)))

    Most general clause: h = □

    Most specific clause: buy(john, diaper) ∨ ¬buy(john, beer) ∨ shopping(john, at_night)

    ρ is a refinement operator!

    ρ(h): best clause hb1 in ρ(h)

    ρ² = ρ(h) ∪ ρ(hb1): best clause hb2 in ρ²

    ρ³ = ρ² ∪ ρ(hb2)

    ・・・

    ρⁿ
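The search scheme above can be sketched with a priority queue that holds the union of all refinements generated so far. This is a simplified illustration under stated assumptions: the refinement operator only adds one literal of the selected bottom clause (no variables are introduced), and `score` and `is_goal` are toy placeholders rather than the heuristic used in the slides.

```python
import heapq
from itertools import count


def refine(h, bottom_clause):
    """rho(h): add one literal of the bottom clause that h does not contain."""
    return [h | {l} for l in sorted(bottom_clause - h)]


def best_first(bottom_clause, score, is_goal, max_steps=100):
    """Repeatedly expand the best-scoring open clause until is_goal holds."""
    tie = count()  # tie-breaker so the heap never has to compare clauses
    start = frozenset()  # the empty clause, i.e. the most general clause
    open_heap = [(-score(start), next(tie), start)]
    seen = {start}
    for _ in range(max_steps):
        if not open_heap:
            return None
        _, _, h = heapq.heappop(open_heap)
        if is_goal(h):
            return h
        for r in refine(h, bottom_clause):
            if r not in seen:
                seen.add(r)
                heapq.heappush(open_heap, (-score(r), next(tie), r))
    return None


# Toy run with hypothetical literals and a made-up score.
bottom_clause = frozenset({"a", "~b", "c"})
score = lambda h: (2 if "c" in h else 0) - len(h)
best = best_first(bottom_clause, score, is_goal=lambda h: len(h) == 2)
print(sorted(best))  # ['a', 'c']
```

Keeping every unexpanded refinement in one heap mirrors the growing sets ρ², ρ³, ... : the next clause to expand is always the best one seen so far, not just the best child of the last expansion.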

  • Case study

    •  Step 3: best-first search in the subsumption lattice

    Selected clause: buy(john, diaper) ∨ ¬buy(john, beer) ∨ shopping(john, at_night)

    Heuristic function, e.g. f = 3 – (1 + 1) for shopping(X, at_night):
      (Number of covered examples) = 3
      (Description length) = 1
      (Num. of singleton variables) = 1

    Refinements of □:
      shopping(X, at_night)    f = 3 – (1 + 1)
      buy(X, diaper)           f = 1 – (1 + 1)

    Select the best, then refine it:
      shopping(X, at_night) ← buy(X, beer)    f = 2 – (2 + 0)

    Terminate here as there is no singleton variable!
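The f values in Step 3 can be checked with a short sketch. It assumes that "covered" counts the bottom-theory clauses subsumed by the candidate, that "size" is the number of literals, and that variables are the capitalised arguments. The tuple encoding of literals and the brute-force subsumption test are illustrative choices, not the authors' code.

```python
from itertools import product


def lit(sign, pred, *args):
    """A literal as (sign, predicate, argument tuple)."""
    return (sign, pred, args)


def subsumes(h, c):
    """Brute-force theta-subsumption of a ground clause c by clause h."""
    variables = sorted({a for _, _, args in h for a in args if a[0].isupper()})
    constants = sorted({a for _, _, args in c for a in args})
    for values in product(constants, repeat=len(variables)):
        theta = dict(zip(variables, values))
        inst = {(s, p, tuple(theta.get(a, a) for a in args)) for s, p, args in h}
        if inst <= c:
            return True
    return False


def singleton_vars(h):
    """Number of variables that occur exactly once in the clause."""
    occ = [a for _, _, args in h for a in args if a[0].isupper()]
    return sum(1 for v in set(occ) if occ.count(v) == 1)


def f(h, bottom):
    covered = sum(subsumes(h, c) for c in bottom)
    return covered - (len(h) + singleton_vars(h))


d, nd = lit("+", "buy", "john", "diaper"), lit("-", "buy", "john", "diaper")
b, nb = lit("+", "buy", "john", "beer"), lit("-", "buy", "john", "beer")
s = lit("+", "shopping", "john", "at_night")
bottom = [frozenset({d, nb, s}), frozenset({nd, b, s}), frozenset({nd, nb, s})]

h1 = {lit("+", "shopping", "X", "at_night")}
h2 = {lit("+", "buy", "X", "diaper")}
h3 = {lit("+", "shopping", "X", "at_night"), lit("-", "buy", "X", "beer")}
print(f(h1, bottom), f(h2, bottom), f(h3, bottom))  # 1 -1 0
```

The three printed values correspond to 3 – (1 + 1), 1 – (1 + 1) and 2 – (2 + 0) on the slide.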

  • Case study

    τ(MD(F ∪ Taut(IH))) = {
      buy(john, diaper) ∨ ¬buy(john, beer) ∨ shopping(john, at_night),
      ¬buy(john, diaper) ∨ buy(john, beer) ∨ shopping(john, at_night),
      ¬buy(john, diaper) ∨ ¬buy(john, beer) ∨ shopping(john, at_night) }

    * So far, we have one hypothesis clause: shopping(X, at_night) ← buy(X, beer)

    •  Step 4: removing the clauses from τ(MD(F ∪ Taut(IH))) that have already been explained by this clause

    Computing the best hypothesis clause for the remaining clause (Go to Step 3)

  • Case study

    •  Step 3: best-first search in the subsumption lattice

    Selected clause: ¬buy(john, diaper) ∨ buy(john, beer) ∨ shopping(john, at_night)

    Refinements of □:
      shopping(X, at_night)    f = 1 – (1 + 1)
      buy(X, beer)             f = 1 – (1 + 1)

    Select the best, then refine it:
      buy(X, beer) ← buy(X, diaper)    f = 1 – (2 + 0)

    Terminate here as there is no singleton variable!

  • Case study

    τ(MD(F ∪ Taut(IH))) = {
      buy(john, diaper) ∨ ¬buy(john, beer) ∨ shopping(john, at_night),
      ¬buy(john, diaper) ∨ buy(john, beer) ∨ shopping(john, at_night),
      ¬buy(john, diaper) ∨ ¬buy(john, beer) ∨ shopping(john, at_night) }

    * So far, we have two hypothesis clauses:
      shopping(X, at_night) ← buy(X, beer).
      buy(X, beer) ← buy(X, diaper).

    •  Step 4: removing the clauses from τ(MD(F ∪ Taut(IH))) that have already been explained; return the hypothesis clauses.
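The alternation between Step 3 and Step 4 amounts to a greedy covering loop, sketched below. The function `cover` and its callback parameters are hypothetical names. In the example, the search of Step 3 is replaced by a lookup table holding the clauses found in the case study, instantiated with X = john so that plain subset testing can stand in for subsumption.

```python
def cover(bottom, find_best, subsumes):
    """Learn one hypothesis clause per still-unexplained bottom clause."""
    remaining = list(bottom)
    hypothesis = []
    while remaining:
        target = remaining[0]
        h = find_best(target, remaining)          # Step 3 (search)
        hypothesis.append(h)
        # Step 4: drop the target and every other clause explained by h.
        remaining = [c for c in remaining[1:] if not subsumes(h, c)]
    return hypothesis


d, b, s = "buy(john,diaper)", "buy(john,beer)", "shopping(john,at_night)"
c1 = frozenset({d, "~" + b, s})
c2 = frozenset({"~" + d, b, s})
c3 = frozenset({"~" + d, "~" + b, s})

# Stand-in for the best-first search: ground instances of the two clauses
# obtained in the case study.
found = {
    c1: frozenset({s, "~" + b}),   # shopping(X, at_night) <- buy(X, beer)
    c2: frozenset({b, "~" + d}),   # buy(X, beer) <- buy(X, diaper)
}
hypothesis = cover([c1, c2, c3],
                   find_best=lambda target, remaining: found[target],
                   subsumes=lambda h, c: h <= c)
print(len(hypothesis))  # 2
```

The third clause never becomes a search target because the first hypothesis clause already explains it, which is why two search rounds suffice here.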

  • Empirical result

    •  Learn the concept of "addition of numbers"
    •  Comparing the performance in two cases
       –  IS with/without tautologies
    •  Predictive accuracy is obtained by the leave-one-out strategy
    •  We obtain the correct concept (with 100% accuracy) in the case of IS with tautologies (red line), though computing the bottom theory takes considerable execution time

    [Figure: plots of Accuracy [%] and Exec. time [msec]]
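The leave-one-out strategy mentioned above can be sketched generically. The learner here is a trivial majority-class stand-in and the data are made up; neither reflects the actual experiment, and the function names are assumptions.

```python
from collections import Counter


def leave_one_out_accuracy(examples, learn, predict):
    """Train on all but one example, test on the held-out one, and average."""
    hits = 0
    for i, (x, y) in enumerate(examples):
        model = learn(examples[:i] + examples[i + 1:])
        hits += predict(model, x) == y
    return 100.0 * hits / len(examples)


def learn_majority(train):
    """Hypothetical learner: remember the most frequent label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def predict_majority(model, x):
    return model


data = [(0, "a"), (1, "a"), (2, "a"), (3, "b")]
acc = leave_one_out_accuracy(data, learn_majority, predict_majority)
print(acc)  # 75.0
```

In the setting of the slides, `learn` would run the whole IS procedure on the training examples and `predict` would test whether the held-out example follows from B and the learned hypothesis.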

  • Conclusion and future work

    Summary:
    - Inverse subsumption (IS) in full-clausal theories
    - Lattice-search techniques in Progol-like ILP systems
    - Implementing IS with those techniques
    - An empirical result

    Future work:
    - Further empirical evaluations using practical examples
    - Improving the scalability of the complete IS system