Regular types, tree automata and model checking

Lecture 1
John Gallagher
University of Roskilde, Denmark

Computer Science, building 42.2
Roskilde University
Universitetsvej 1, P.O. Box 260
DK-4000 Roskilde, Denmark
Phone: +45 4674 2000, Fax: +45 4674 3072
www.dat.ruc.dk

Supported by Danish Research Council project SAFT



Why tree automata?

• Finite tree automata define sets of trees
  • trees are ubiquitous in computing
  • trees model data, abstract syntax, computation states and computation traces (computation trees) ...
  • (it is no surprise that XML - a "universal" data exchange language - is tree structured)
• Tree automata ought to be as familiar in computer science as string automata, which define sets of strings.

Outline of the lecture

• Motivating examples
• What are finite tree automata (FTAs)?
  • different ways to represent them
• Fundamental algorithms and procedures
  • Runs of a tree automaton
  • Bottom-up and top-down deterministic automata
  • Union, intersection, complementation, completion
  • Emptiness check
  • Determinisation, minimisation

Focus on Logic Programming

• FTAs are independent of any programming language
• In these lectures we focus on logic programs
  • to represent tree automata
  • as the target of tree-automata-based analysis
• But the ideas are applicable in many other contexts
  • and we will look at some of them briefly

Motivating examples (1)

oddEven ← even(X), even(s(X)).

even(0).
even(s(X)) ← odd(X).
odd(s(X)) ← even(X).

Can the query oddEven succeed?

main(X) ← zeroList(X), ....., member(1,X).

zeroList([]).
zeroList([0|X]) ← zeroList(X).

member(X,[X|_]).
member(X,[_|Y]) ← member(X,Y).

Can the query main(X) succeed?

Motivating examples (2)

Transition system defining a token ring (with any number of processes) (example from Roychoudhury et al.).

gen([0,1]).
gen([0 | X]) ← gen(X).
trans(X,Y) ← trans1(X,Y).
trans([1|X],[0|Y]) ← trans2(X,Y).
trans1([0,1|T],[1,0|T]).
trans1([H|T],[H|T1]) ← trans1(T,T1).
trans2([0],[1]).
trans2([H|T],[H|T1]) ← trans2(T,T1).
reachable(X) ← gen(X).
reachable(X) ← reachable(Y), trans(Y,X).

What are the possible answers for reachable(X)? Can X be a list containing more than one '1'?

gen([0,1]).
gen([0,0,1]).
gen([0,0,0,...,1]).
.....

Intended reachable states: reachable([0,0,...,1,...0,0]) (lists with exactly one 1)

Motivating Examples (3)

/* transpose a matrix */

transpose(Xs,[]) :-
    nullrows(Xs).
transpose(Xs,[Y|Ys]) :-
    makerow(Xs,Y,Zs),
    transpose(Zs,Ys).

makerow([],[],[]).
makerow([[X|Xs]|Ys],[X|Xs1],[Xs|Zs]) :-
    makerow(Ys,Xs1,Zs).

nullrows([]).
nullrows([[]|Ns]) :-
    nullrows(Ns).

row → []; [any | row]
matrix → []; [row | matrix]

Show "type correctness" of transpose(X,Y), i.e. X and Y are both of type "matrix" in all possible solutions.

Show "mode correctness" of transpose(X,Y), i.e. X is a ground term iff Y is a ground term.

Motivating Examples (4)

/* rewrite system from Feuillade et al. */

plus(0,x) → x
plus(s(x),y) → s(plus(x,y))
times(0,x) → 0
times(s(x),y) → plus(y,times(x,y))

square(x) → times(x,x)
even(0) → true
even(s(0)) → false
even(s(x)) → odd(x)
odd(0) → false
odd(s(0)) → true
odd(s(x)) → even(x)

even(square(x)) → odd(square(s(x)))
odd(square(x)) → even(square(s(x)))

Show that the square of an even (resp. odd) number is even (resp. odd).

That is, show that false is not reachable in the term rewriting system, starting from the term even(square(0)).

Motivating Example (5)

• A Muller C-element is a device containing a delay element and some logic gates.

% A logic program representation
mullerC_2(Ss) :-
    mullerC_3(0,Ss).
mullerC_3(A,[s(B,D,F)|Ss]) :-
    and2logic(B,D,H),
    and2logic(B,F,I),
    and2logic(D,F,J),
    or3logic(H,I,J,K),
    delayLogic(K,F,A,L),
    mullerC_3(L,Ss).

Prove that if the two inputs of the device are both true at some point, then the output should become true at most three cycles later. Focus on analysing possible values of the argument highlighted.

A class of static analysis problems

• The above small examples illustrate the main kinds of problem we are interested in.

1. Describing the reachable states of some system, and checking whether they satisfy some desired property.
2. Checking some specific property of a computation; the property is specified as a regular type.
3. Checking properties of traces or computation trees.

A quick survey of FTA applications

• Logic program mode and type analysis
  • Janssens & Bruynooghe, Gallagher & de Waal, Gallagher & Puebla, Van Hentenryck et al., Vaucheret et al., Pietrzak et al.
• Automatic binding type analysis
  • Craig et al.
• Control-flow analysis
  • Aiken, Heintze & Jaffar, Charatonik & Podelski, Boichut et al., Jensen
• Cryptographic protocol analysis
  • Monniaux, Genet, Goubault-Larrecq, Feuillade et al.
• Shape analysis
  • Yang et al.
• Property checking
  • Banda & Gallagher

Approximating sets of terms

• Let Σ be a signature - a set of function symbols, each having a rank (arity)
• Term(Σ) is the set of all terms (trees) constructible from Σ
  • i.e. terms of form f(t1,...,tn) where f ∈ Σ, f has arity n and t1 ∈ Term(Σ),...,tn ∈ Term(Σ)
  • when arity is 0, we write f() as f.
• Term^n(Σ) denotes the set of n-ary relations over Term(Σ).

Regular/Recognizable Tree Languages

• Suppose Σ = {[], [.|.], 0, f(.)}
• We can specify (finitely) the (infinite) set of all lists, i.e. {[], [0], [f(0)], [f(f([])), 0], [[]], [[0],[0,0]], ...}

[] → list
[any|list] → list

0 → any
[] → any
[any|any] → any
f(any) → any

NFTA - Nondeterministic finite tree automata

Tree automata provide a means of specifying infinite sets of trees (terms) over some signature Σ.

A (nondeterministic) finite tree automaton (N)FTA is a tuple <Q, Qf, Σ, Δ> where

• Q is a finite set of states
• Qf ⊆ Q is the set of accepting states
• Δ is a finite set of transitions (rules) of the form f(q1,…,qn) → q0, where q0, q1,…,qn ∈ Q, and f is an n-ary function in Σ.

An FTA A defines a set of terms L(A).

The previous example: A = <{list, any}, {list}, {[], [.|.], 0, f(.)}, Δ> where
Δ = {[] → list, [any|list] → list, 0 → any, [] → any, [any|any] → any, f(any) → any}
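
As a rough illustration of the definition, the list/any automaton can be written down as data and used to test membership. This is only a sketch under assumptions that are not in the slides: terms are nested Python tuples, the names "nil" and "cons" stand for [] and [.|.], and the helper names states_of, accepts and py_list are invented for the example.

```python
# Transitions are triples (symbol, argument_states, result_state).
# "nil" and "cons" are illustrative names for [] and [.|.].
DELTA = {
    ("nil", (), "list"),
    ("cons", ("any", "list"), "list"),
    ("0", (), "any"),
    ("nil", (), "any"),
    ("cons", ("any", "any"), "any"),
    ("f", ("any",), "any"),
}
FINAL = {"list"}

def states_of(term, delta):
    """Return every state q such that the term can be rewritten to q."""
    symbol, children = term[0], term[1:]
    child_states = [states_of(child, delta) for child in children]
    reached = set()
    for f, args, q in delta:
        if f == symbol and len(args) == len(children):
            if all(a in s for a, s in zip(args, child_states)):
                reached.add(q)
    return reached

def accepts(term, delta, final):
    """A term is in L(A) if it reaches at least one accepting state."""
    return bool(states_of(term, delta) & final)

def py_list(items):
    """Build the term for a list from a Python list of terms."""
    term = ("nil",)
    for item in reversed(items):
        term = ("cons", item, term)
    return term

zero = ("0",)
print(accepts(py_list([("f", zero), zero]), DELTA, FINAL))  # the list [f(0),0]
print(accepts(("f", zero), DELTA, FINAL))                   # the non-list f(0)
```

Computing the full set of reachable states for each subterm avoids backtracking over the nondeterministic choices.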

Approximation of relations using FTAs

• The set of values in each argument will be approximated using an FTA.
• So we could approximate the reverse relation as {<x,y> | x ∈ L(A), y ∈ L(A)} where A is the FTA defining lists
• We write reverse(list, list) as the approximation.
• It is usual to think of such an approximation as a regular type

Running an FTA

• The transitions of an FTA define a ground rewrite system over Term(Σ ∪ Q).
  • consider the states Q as constants (nullary function symbols)
• Given a term t ∈ Term(Σ), t is accepted by the FTA iff there is a rewriting t →* qf where qf is a final state of the FTA.
• E.g. [f(0),0] → [f(any),0] → [any,0] → [any,any] → [any,any|list] → [any|list] → list

Bottom-up vs top-down run

• The runs defined above are called bottom-up runs.
• If we just reverse the arrows in the transitions, the resulting rewrite system gives top-down runs.
• list → [any|list] → [any,any|list] → [any,any] → [f(any),any] → [f(0),any] → [f(0),0]

Regular tree languages

• The language of an automaton A, called L(A), is the set of terms t ∈ Term(Σ) such that t is accepted by some run of A.
• If a set of terms can be represented as L(A) for some FTA A, we say that the set of terms is recognizable.
• Such a set of terms is also known as a regular tree language
  • the set Δ can be seen as a regular tree grammar
  • (grammar = generator, automaton = recognizer).

Some alternative notations for FTAs

• Transitions
  [] → list, [any|list] → list, 0 → any, [] → any, [any|any] → any, f(any) → any
• Regular type rules/tree grammars
  list ==> []; [any|list]
  any ==> 0; []; f(any); [any|any]
• Regular unary Horn clauses
  list([]).
  list([X|Y]) :- any(X), list(Y).
  any(0).
  any([]).
  any(f(X)) :- any(X).
  any([X|Y]) :- any(X), any(Y).
• Normalised set constraints
  Xlist ⊇ [], Xlist ⊇ [Xany|Xlist],
  Xany ⊇ 0, Xany ⊇ [], Xany ⊇ f(Xany), Xany ⊇ [Xany|Xany]

Exercise: notations for FTAs

• Define the set of binary trees whose nodes contain numbers (in successor notation) as an FTA.
• Write the FTA as a regular type, a unary logic program and a set of normalised set constraints.

[Figure: example binary tree with node labels 8, 3, 5, 6]

Closure Properties and Operations

• Languages defined by FTAs are closed under the Boolean operations (intersection, union, complement).
• I.e. given recognizable languages L1 and L2, then L1 ∪ L2, L1 ∩ L2, and ¬L1 are recognizable.
• Emptiness of an FTA and membership of a term in L(A) are decidable.
• Given FTAs A1 and A2, we can construct FTAs A1∪A2, A1∩A2, and ¬A1 such that L(A1∪A2) = L(A1) ∪ L(A2), etc.
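
One common way to realise two of these constructions is the product automaton for intersection and the disjoint union of rules for union. The sketch below assumes transitions are Python triples (symbol, argument_states, result_state); the function names and the two small number automata (even numbers, multiples of three) are illustrative choices, not taken from the slides.

```python
from itertools import product

def intersect(delta1, final1, delta2, final2):
    """Product construction: states of the result are pairs (q1, q2)."""
    delta = set()
    for (f, args1, q1), (g, args2, q2) in product(delta1, delta2):
        if f == g and len(args1) == len(args2):
            delta.add((f, tuple(zip(args1, args2)), (q1, q2)))
    final = {(q1, q2) for q1 in final1 for q2 in final2}
    return delta, final

def union(delta1, final1, delta2, final2):
    """Disjoint union: tag each state with the automaton it came from."""
    def tag(delta, k):
        return {(f, tuple((k, a) for a in args), (k, q)) for f, args, q in delta}
    final = {(1, q) for q in final1} | {(2, q) for q in final2}
    return tag(delta1, 1) | tag(delta2, 2), final

def states_of(term, delta):
    """All states the term can be rewritten to."""
    child_states = [states_of(c, delta) for c in term[1:]]
    return {q for f, args, q in delta
            if f == term[0] and len(args) == len(child_states)
            and all(a in s for a, s in zip(args, child_states))}

def accepts(term, delta, final):
    return bool(states_of(term, delta) & final)

def num(n):
    """The number n in successor notation."""
    term = ("0",)
    for _ in range(n):
        term = ("s", term)
    return term

# Even numbers (final state "e") and multiples of three (final state "m0").
EVEN = {("0", (), "e"), ("s", ("e",), "o"), ("s", ("o",), "e")}
THREE = {("0", (), "m0"), ("s", ("m0",), "m1"),
         ("s", ("m1",), "m2"), ("s", ("m2",), "m0")}

both, both_final = intersect(EVEN, {"e"}, THREE, {"m0"})
either, either_final = union(EVEN, {"e"}, THREE, {"m0"})
print([n for n in range(13) if accepts(num(n), both, both_final)])
print([n for n in range(7) if accepts(num(n), either, either_final)])
```

Complement is not covered here, since it needs a complete deterministic automaton first.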

Exercises: intersection and union

• Define an FTA A1 such that L(A1) is the set of lists of even numbers in successor notation, i.e. the list elements are from {0, s(s(0)), s(s(s(s(0)))), ...}.
• Define an FTA A2 such that L(A2) is the set of lists of multiples of 3, i.e. elements from {0, s(s(s(0))), s(s(s(s(s(s(0)))))), ...}.
• Try to construct FTAs for L(A1) ∪ L(A2) and L(A1) ∩ L(A2).
• What can you say about the size of these automata?

Complete FTAs

• An FTA is complete if there is at least one rule f(q1,...,qn) → q ∈ Δ for all n-ary functions f ∈ Σ, and q1,...,qn ∈ Q.
• An FTA can always be completed by
  i. adding an extra state, say qerr, which is not a final state, and
  ii. adding a rule f(q1,...,qn) → qerr for each f(q1,...,qn) that does not already appear on the LHS of a rule (note that q1,...,qn can include qerr)
• The completed FTA accepts the same set of terms as the original.
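
The two completion steps can be sketched directly. The representation is an assumption made for illustration: the signature is a dict from symbol name to arity, transitions are triples (symbol, argument_states, result_state), and "nil"/"cons" stand for [] and [.|.]. The function name complete is hypothetical.

```python
from itertools import product

def complete(sigma, states, delta, err="qerr"):
    """Add the state err and a rule to err for every missing left-hand side."""
    all_states = set(states) | {err}
    present = {(f, args) for f, args, _ in delta}
    completed = set(delta)
    for f, arity in sigma.items():
        # Enumerate every possible left-hand side f(q1,...,qn),
        # including those that mention err itself.
        for args in product(sorted(all_states), repeat=arity):
            if (f, args) not in present:
                completed.add((f, args, err))
    return all_states, completed

# Lists of numbers in successor notation.
SIGMA = {"nil": 0, "cons": 2, "0": 0, "s": 1}
DELTA = {
    ("nil", (), "list"),
    ("cons", ("num", "list"), "list"),
    ("0", (), "num"),
    ("s", ("num",), "num"),
}
states, completed = complete(SIGMA, {"list", "num"}, DELTA)
print(len(completed))
```

The final states are left unchanged, which is why the accepted language is the same as before.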

Example: completion

• Let
  • Σ = {[], [.|.], 0, s(.)}
  • Q = {list, num}, Qf = {list}
  • Δ = {[]→list, [num|list]→list, 0→num, s(num)→num}
• Completion
  • Δ = {[]→list, [num|list]→list, 0→num, s(num)→num, [list|list]→qerr, [num|num]→qerr, [list|num]→qerr, [qerr|num]→qerr, [num|qerr]→qerr, [qerr|list]→qerr, [list|qerr]→qerr, [qerr|qerr]→qerr, s(list)→qerr, s(qerr)→qerr}

Exercise: completion

• Define an FTA defining the finite set of terms {a, b, f(a,b), f(a,a)}. (Assume the signature Σ = {a, b, f(.,.)}.)
• Form the completion of the FTA.
• What can you say in general about the size of the completed FTA?
• Note that in a complete FTA, there is a run t →* q for every term t ∈ Term(Σ) and some (not necessarily final) state q.

Deterministic FTAs

• Unlike string automata, determinism comes in two flavours.
• An FTA is bottom-up deterministic (DFTA) if there are no two rules in Δ having the same left-hand side.
  • f(q1,...,qn) → q and f(q1,...,qn) → q', q ≠ q' disallowed
• An FTA is top-down deterministic (DTTA) if there are no two rules in Δ having both the same right-hand side and the same function symbol on the left.
  • f(q1,...,qn) → q and f(s1,...,sn) → q disallowed

Equivalence of FTAs and DFTAs

• For every FTA, there is an equivalent DFTA (bottom-up deterministic FTA).
• However, this does not hold for top-down deterministic FTAs.
  • there are some FTAs that have no equivalent DTTA.
  • E.g.
    • Σ = {[], [.|.], a, b},
    • Δ = {[]→ablist, [ta|ablist]→ablist, [tb|blist]→ablist, []→blist, [tb|blist]→blist, a→ta, b→tb}
    • (lists of a's followed by b's, [a,a,a,....,b,b,b])
• Regular type notations are usually top-down deterministic
  • hence regular types are not as expressive as FTAs in general

Complementation

• It is a simple matter to complement a complete DFTA.
• Just exchange the final and non-final states.
• E.g. (example of completion above)
  • Δ = {[]→list, [num|list]→list, 0→num, s(num)→num, [list|list]→qerr, [num|num]→qerr, [list|num]→qerr, [qerr|num]→qerr, [num|qerr]→qerr, [qerr|list]→qerr, [list|qerr]→qerr, [qerr|qerr]→qerr, s(list)→qerr, s(qerr)→qerr}
• In the original, Qf = {list}. The complement is obtained by setting Qf = {num, qerr}.
• Hence terms that were not accepted before are now accepted, and vice versa.
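
The swap of final states can be tried out on the completed list/num automaton. In this sketch the completed rule set is rebuilt programmatically rather than typed in; the tuple encoding of terms, the names "nil" and "cons" for [] and [.|.], and the helper run are assumptions made for illustration.

```python
from itertools import product

STATES = {"list", "num", "qerr"}
BASE = {
    ("nil", (), "list"), ("cons", ("num", "list"), "list"),
    ("0", (), "num"), ("s", ("num",), "num"),
}
ARITY = {"cons": 2, "s": 1}
present = {(f, args) for f, args, _ in BASE}
# Completed, bottom-up deterministic rule set (14 rules).
DELTA = BASE | {
    (f, args, "qerr")
    for f, n in ARITY.items()
    for args in product(sorted(STATES), repeat=n)
    if (f, args) not in present
}

FINAL = {"list"}
CO_FINAL = STATES - FINAL  # final states of the complement automaton

def run(term, delta):
    """The unique state reached on a term by a complete DFTA."""
    child_states = tuple(run(child, delta) for child in term[1:])
    (state,) = [q for f, args, q in delta
                if f == term[0] and args == child_states]
    return state

zero, nil = ("0",), ("nil",)
for term in [("cons", zero, nil), ("s", zero), ("cons", nil, nil)]:
    q = run(term, DELTA)
    print(term, "->", q, "| original:", q in FINAL, "| complement:", q in CO_FINAL)
```

Because every term reaches exactly one state, membership in the complement is just the negation of membership in the original.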

Limited Precision of Top-Down Deterministic FTAs

append([], Ys, Ys).
append([X|Xs], Ys, [X|Zs]) ← append(Xs,Ys,Zs).

?- append(A,B,C).

[] → A
[a | A] → A
(lists [a,a,….a])

[] → B
[b | B] → B
(lists [b,b,….b])

Type of C: ?

With a top-down deterministic automaton, the best we can do is

[] → C
[D | C] → C
a → D
b → D

This is the set of lists of a and b (mixed), e.g. [a,a,b,a,b,b,….a]

Disjoint Accepting States in DFTAs

• Given a DFTA and a term t, we can see that a bottom-up run starting from t is deterministic.
• Hence each term can be accepted by at most one state of a DFTA.
• Thus the sets of terms accepted by the states of a DFTA are disjoint.

Determinizing FTAs

• An algorithm exists for converting an arbitrary FTA to a DFTA.
• Consider transitions for list and any
  [] → list
  [any|list] → list
  [] → any
  [any|any] → any
  0 → any
  s(any) → any
• This is not bottom-up deterministic ([] occurs twice in the lhs of a transition)

Determinization of FTAs

• Any FTA can be determinized.
• There is an equivalent FTA that is bottom-up deterministic
• In a deterministic FTA, each term is in at most one type (state). Types are disjoint.

[Figure: diagram with regions labelled list, any, nonlist, list+]

Determinization of list/any

[] → list'
[list'|list'] → list'
[nonlist|list'] → list'
[nonlist|nonlist] → nonlist
[list'|nonlist] → nonlist
0 → nonlist
s(list') → nonlist
s(nonlist) → nonlist

list' = [list ∩ any]
nonlist = [any]

An expression [q1, q2, ...., qn] denotes a state in the DFTA that accepts terms accepted by all of q1,...,qn and accepted by no other state.

Determinization algorithm

Determinization Algorithm DET
input: NFTA A = <Q, Qf, Σ, Δ>
begin
    Set Qd to ∅; Set Δd to ∅;
    repeat
        Set Qd to Qd ∪ {s}; Set Δd to Δd ∪ {f(s1,..., sn) → s}
        where f ∈ Σ, s1,..., sn ∈ Qd,
              s = {q ∈ Q | ∃q1 ∈ s1,..., qn ∈ sn, f(q1,..., qn) → q ∈ Δ}
    until no rule can be added to Δd
    Set Qdf to {s ∈ Qd | s ∩ Qf ≠ ∅}
    output: DFTA Ad = <Qd, Qdf, Σ, Δd>
end
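
The loop above can be sketched in a few lines. This version makes two assumptions that go beyond the pseudocode: the signature is passed as a dict from symbol to arity, and the empty set is never added as a DFTA state, so the result is deterministic but not necessarily complete. DFTA states are frozensets of NFTA states; all names are illustrative.

```python
from itertools import product

def determinize(sigma, delta, final):
    """Subset construction; sigma maps each symbol to its arity."""
    qd, delta_d = set(), set()
    changed = True
    while changed:
        changed = False
        for f, arity in sigma.items():
            for args in product(list(qd), repeat=arity):
                # s collects every state reachable by an f-rule whose
                # arguments are drawn from the chosen DFTA states.
                s = frozenset(
                    q for g, qs, q in delta
                    if g == f and len(qs) == arity
                    and all(a in si for a, si in zip(qs, args))
                )
                if s and (f, args, s) not in delta_d:
                    delta_d.add((f, args, s))
                    qd.add(s)
                    changed = True
    final_d = {s for s in qd if s & set(final)}
    return qd, final_d, delta_d

# The list/any automaton with s(.) as the unary symbol.
SIGMA = {"nil": 0, "0": 0, "s": 1, "cons": 2}
DELTA = {
    ("nil", (), "list"), ("cons", ("any", "list"), "list"),
    ("nil", (), "any"), ("cons", ("any", "any"), "any"),
    ("0", (), "any"), ("s", ("any",), "any"),
}
QD, FINAL_D, DELTA_D = determinize(SIGMA, DELTA, {"list"})
for f, args, s in sorted(DELTA_D, key=str):
    print(f, [sorted(a) for a in args], "->", sorted(s))
```

On this input the two resulting states are {list, any} and {any}, which play the roles of list' and nonlist.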

Exercise: Determinization

• Determinize the FTA for "a sequence of as followed by a sequence of bs"
  • Σ = {[], [.|.], a, b},
  • Δ = {[]→ablist, [ta|ablist]→ablist, [ta|blist]→ablist, []→blist, [tb|blist]→blist, a→ta, b→tb}
  • Q = {ablist, blist, ta, tb}
  • Qf = {ablist}

Determinization: Complexity

• Determinization can lead to exponential blow-up.
  • The states in the DFTA are from the set 2^Q
• If "any" is in the original FTA then the corresponding DFTA is complete
  • hence the number of transitions for an n-ary function symbol is up to (2^|Q|)^n
• For this reason determinization has often been regarded as impractical

Properties of DFTAs

• The transitions of a complete DFTA for a given f form a function f: 2^Q × ... × 2^Q → 2^Q
  • just regard the → as =

[] = list'
cons(list',list') = list'
cons(nonlist,list') = list'
cons(nonlist,nonlist) = nonlist
cons(list',nonlist) = nonlist
0 = nonlist
s(list') = nonlist
s(nonlist) = nonlist

Thus we can regard a bottom-up run of a DFTA as "evaluating" a term over the set of DFTA states.

As we will see, this can lead to abstract interpretations.
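
Read as equations, the transitions give each function symbol a finite table, and a run is ordinary recursive evaluation. A small sketch, assuming terms are nested tuples, "nil" and "cons" stand for [] and [.|.], and the two DFTA states are the strings "list'" and "nonlist"; the names INTERP and evaluate are invented here.

```python
# Each function symbol is interpreted as a finite function on DFTA states.
INTERP = {
    "nil": {(): "list'"},
    "0": {(): "nonlist"},
    "s": {("list'",): "nonlist", ("nonlist",): "nonlist"},
    "cons": {
        ("list'", "list'"): "list'",
        ("nonlist", "list'"): "list'",
        ("nonlist", "nonlist"): "nonlist",
        ("list'", "nonlist"): "nonlist",
    },
}

def evaluate(term):
    """Evaluate a term bottom-up over the set of DFTA states."""
    args = tuple(evaluate(child) for child in term[1:])
    return INTERP[term[0]][args]

nil, zero = ("nil",), ("0",)
print(evaluate(("cons", ("s", zero), ("cons", nil, nil))))  # the list [s(0), []]
print(evaluate(("cons", zero, zero)))                       # the pair [0|0]
```

Since the automaton is complete and deterministic, every table lookup succeeds and returns exactly one state.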

Product representation of transitions

• f(Q1,...,Qn) → q represents the set of transitions {f(q1,...,qn) → q | qj ∈ Qj, 1≤j≤n}

E.g. determinized list/nonlist example

[] → list
[{list,nonlist}|{list}] → list
[{list,nonlist}|{nonlist}] → nonlist
f({list,nonlist},..., {list,nonlist}) → nonlist
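
Expanding a product transition back into ordinary transitions is a Cartesian product over the argument sets. The helper name expand and the triple encoding (symbol, argument_states, result_state) are assumptions made for this sketch; "cons" stands for [.|.].

```python
from itertools import product

def expand(symbol, arg_sets, target):
    """All ordinary transitions represented by symbol(Q1,...,Qn) -> target."""
    return {(symbol, args, target) for args in product(*arg_sets)}

both = {"list", "nonlist"}
rules = expand("cons", [both, {"list"}], "list") \
      | expand("cons", [both, {"nonlist"}], "nonlist")
for rule in sorted(rules):
    print(rule)
```

Two product transitions for cons stand for four ordinary ones here; the saving grows with the arity and the size of the argument sets.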

Determinization algorithm generating product form

qmap(q, f^n, j) = {f(q1,...,qn) → q0 ∈ Δ | j ≤ n, q = qj}
Qmap(Q0, f^n, j) = ∪{qmap(q, f^n, j) | q ∈ Q0}
states(Δ) = {q0 | f(q1,...,qn) → q0 ∈ Δ}
fmap(f^n, i, D) = {Qmap(Q0, f^n, i) | i ≤ n, Q0 ∈ D} \ {∅}
C = {{q | f^0 → q ∈ Δ} | f^0 ∈ Σ}
F(D) = ({states(Δ1 ∩ ... ∩ Δn) | Δi ∈ fmap(f^n, i, D), 1 ≤ i ≤ n} \ {∅}) ∪ C

The algorithm finds the least set D ∈ 2^(2^Q) such that D = F(D). The set D is computed by a fixpoint iteration as follows.

initialise i = 0; D0 = ∅;
repeat
    Di+1 = F(Di); i = i + 1
until Di = Di−1

Example: list/nonlist

t1: [] → list,
t2: [dynamic|list] → list,
t3: [] → dynamic,
t4: [dynamic|dynamic] → dynamic,
t5: f(dynamic,dynamic) → dynamic,
. . .

qmap(list,cons,1) = {}
qmap(list,cons,2) = {t2}
qmap(list,f,1) = {}
qmap(list,f,2) = {}
qmap(dynamic,cons,1) = {t2,t4}
qmap(dynamic,cons,2) = {t4}
qmap(dynamic,f,1) = {t5}
qmap(dynamic,f,2) = {t5}

Example: continued

• D0 = ∅
• D1 = {{list, dynamic}}
• 2nd iteration
  • fmap(cons,1,D1) = fmap(cons,2,D1) = {{t2,t4}}
  • fmap(f,1,D1) = fmap(f,2,D1) = {{t5}}
  • D2 = F(D1) = {{list, dynamic},{dynamic}}
• 3rd iteration
  • fmap(cons,1,D2) = {{t2,t4}}
  • fmap(cons,2,D2) = {{t2,t4},{t4}}
  • fmap(f,1,D2) = fmap(f,2,D2) = {{t5}}
  • D3 = F(D2) = {{list, dynamic},{dynamic}}
• D3 = D2
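
The fixpoint iteration can be replayed mechanically on the five named transitions t1 to t5. The sketch below is a direct, unoptimised reading of the qmap/Qmap/fmap/F definitions and uses only t1 to t5 (the rules elided by ". . ." are left out). The dict encoding, the names "nil" and "cons" for [] and [.|.], and the arity table SIGMA are assumptions.

```python
from itertools import product

DELTA = {
    "t1": ("nil", (), "list"),
    "t2": ("cons", ("dynamic", "list"), "list"),
    "t3": ("nil", (), "dynamic"),
    "t4": ("cons", ("dynamic", "dynamic"), "dynamic"),
    "t5": ("f", ("dynamic", "dynamic"), "dynamic"),
}
SIGMA = {"nil": 0, "cons": 2, "f": 2}

def qmap(q, f, j):
    """Names of the f-transitions whose j-th argument (1-based) is q."""
    return {t for t, (g, args, _) in DELTA.items()
            if g == f and j <= len(args) and args[j - 1] == q}

def Qmap(qs, f, j):
    return frozenset().union(*(qmap(q, f, j) for q in qs))

def states(ts):
    """Result states of a set of named transitions."""
    return frozenset(DELTA[t][2] for t in ts)

def fmap(f, i, d):
    return {Qmap(qs, f, i) for qs in d} - {frozenset()}

# One DFTA state per constant: the set of states its rules lead to.
C = {frozenset(q for g, _, q in DELTA.values() if g == f0)
     for f0, n in SIGMA.items() if n == 0}

def F(d):
    new = set()
    for f, n in SIGMA.items():
        if n == 0:
            continue
        for choice in product(*(fmap(f, i, d) for i in range(1, n + 1))):
            new.add(states(frozenset.intersection(*choice)))
    return (new - {frozenset()}) | C

d = set()
while F(d) != d:
    d = F(d)
print(d)
```

The loop passes through the same values D1 and D2 as the worked example before stopping.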

Extracting product transitions

fmap(cons,1,D3) = {{t2,t4}}
fmap(cons,2,D3) = {{t2,t4},{t4}}

To generate the product transitions for cons, form the product of the fmap values.

[{t2,t4}|{t2,t4}] → {t2,t4}∩{t2,t4}
[{t2,t4}|{t4}] → {t2,t4}∩{t4}

[{{list,dynamic},{dynamic}}|{{list,dynamic}}] → {list,dynamic}
[{{list,dynamic},{dynamic}}|{{dynamic}}] → {dynamic}

Reduction in size with product representation

Q     Δ      Qd    (Δd)          ΔΠ
3     1933   4     (1130118)     1951
4     1934   5     (10054302)    1951
3     655    4     (20067)       433
4     656    5     (86803)       433
105   803    46    (6567)        141
16    65     16    (268436271)   89

Q = no. of FTA states
Δ = no. of FTA rules
Qd = no. of DFTA states
Δd = no. of DFTA rules
ΔΠ = no. of DFTA product rules

Decision procedures

• Emptiness of an FTA.
  • Consider each rule f(q1,...,qn) → q
    • if q1,...,qn are all non-empty, then q is non-empty.
  • Interpret each state q as a propositional variable which is true if q is non-empty
  • Then we can write a propositional formula for each rule, q1∧...∧qn → q
  • If q is a consequence of all such formulas for an FTA, then q is non-empty
  • This is decidable (in linear time)
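
The propositional reading suggests a simple marking procedure. The sketch below repeats a pass over all rules until nothing changes, so it is quadratic rather than linear; a linear-time version would need per-rule counters and a worklist. Transitions are assumed to be triples (symbol, argument_states, result_state), and the function names are illustrative.

```python
def nonempty_states(delta):
    """Least set of states closed under: all arguments non-empty => result non-empty."""
    nonempty = set()
    changed = True
    while changed:
        changed = False
        for _, args, q in delta:
            if q not in nonempty and all(a in nonempty for a in args):
                nonempty.add(q)
                changed = True
    return nonempty

def is_empty(delta, final):
    """L(A) is empty iff no final state is non-empty."""
    return not (nonempty_states(delta) & set(final))

LIST_ANY = {
    ("nil", (), "list"), ("cons", ("any", "list"), "list"),
    ("0", (), "any"), ("cons", ("any", "any"), "any"),
}
# No constant ever reaches state p, so p accepts no term at all.
NO_BASE = {("cons", ("p", "p"), "p"), ("0", (), "any")}

print(is_empty(LIST_ANY, {"list"}))
print(is_empty(NO_BASE, {"p"}))
```

Rules for constants have no arguments, so they mark their result state in the first pass and seed the propagation.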

Other decision procedures

• Intersection non-emptiness
  • EXPTIME-complete.
• Finiteness
  • polynomial time
• Equivalence
  • EXPTIME-complete.
• Singleton Set Property
  • polynomial time

ε-transitions

• Just as for word automata, FTAs may contain ε-transitions
  • q → q'
• These can always be eliminated so are not important for the theory
  • (basic idea) remove q → q' and add all transitions f(q1,...,qn) → q' such that there is a transition f(q1,...,qn) → q
• ε-transitions arise naturally in analysing programs
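
The basic idea extends to chains of ε-transitions by first computing which states each state can reach through ε-steps. A minimal sketch, assuming ordinary transitions are triples (symbol, argument_states, result_state) and ε-transitions are pairs (q, q2) standing for q → q2; the function name and the small examples are made up for illustration.

```python
def eliminate_epsilon(delta, eps):
    """Return ordinary transitions equivalent to delta plus the ε-transitions."""
    states = {q for _, _, q in delta} | {s for pair in eps for s in pair}
    # reach[q] = states reachable from q by zero or more ε-steps
    reach = {q: {q} for q in states}
    changed = True
    while changed:
        changed = False
        for q, q2 in eps:
            for src in states:
                if q in reach[src] and not reach[q2] <= reach[src]:
                    reach[src] |= reach[q2]
                    changed = True
    # Every rule into q is copied to each state reachable from q.
    return {(f, args, target) for f, args, q in delta for target in reach[q]}

# Lists whose elements are 0 or, via list → elem, lists themselves.
DELTA = {("nil", (), "list"), ("cons", ("elem", "list"), "list"), ("0", (), "elem")}
EPS = {("list", "elem")}
for rule in sorted(eliminate_epsilon(DELTA, EPS)):
    print(rule)
```

With a single ε-transition this adds exactly the copied rules described in the basic idea; the reachability step only matters when ε-transitions follow one another.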