Model Checking XML Manipulating Software Xiang Fu Tevfik Bultan Jianwen Su Department of Computer Science University of California, Santa Barbara {fuxiang,bultan,su}@cs.ucsb.edu

Model Checking XML Manipulating Software

Xiang Fu Tevfik Bultan Jianwen Su

Department of Computer Science

University of California, Santa Barbara

{fuxiang,bultan,su}@cs.ucsb.edu

Web Services

• Loosely coupled, interaction through standardized interfaces

• Standardized data transmission via XML

• Asynchronous messaging

• Platform independent (.NET, J2EE)

[Figure: Web Service Standards. Layer labels: Data, Type, Message, Service, Interaction, Composition. Standards shown: XML, XML Schema, SOAP, WSDL, WSCI, BPEL4WS. Implementation platforms: Microsoft .Net, Sun J2EE.]

Outline

• An Example: Stock Analysis Service

• Capturing Global Behaviors

– Conversations, Conversation Protocols

• Web Service Analysis Tool

• XML Messaging

– XML data, MSL types, XPath expressions

• Model Checking Conversation Protocols

– Translation to Promela

• Conclusions and Future Work

An Example: Stock Analysis Service (SAS)

[Figure: the peers Investor (Inv), Stock Broker (SB), and Research Dept. (RD), connected by message arrows labeled "register, ack, cancel", "accept, reject, bill", "request, terminate", and "report".]

• SAS is a composite web service

– a finite set of peers: Investor (Inv), Stock Broker (SB), and Research Department (RD)

– and a finite set of message classes: register, ack, cancel, accept, ...

Communication Model

• We assume that the messages among the peers are exchanged through reliable and asynchronous messaging

– FIFO and unbounded message queues

• This model is similar to industry efforts such as

– JMS (Java Message Service)

– MSMQ (Microsoft Message Queuing Service)

[Figure: Stock Broker (SB) and Research Dept. (RD) connected by a message queue holding req messages.]

Conversations

• A virtual watcher records the messages as they are sent

• A conversation is a sequence of messages the watcher sees during an execution

[Figure: a Watcher observing Investor (Inv), Stock Broker (SB), and Research Dept. (RD) as they exchange register, accept, request, report, ack, bill, and terminate; the watcher's record reads reg, acc, req, rep, ack, bil, ter.]
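The communication model and the watcher can be sketched in a few lines of Python. This is only an illustration, not WSAT code: the class `Composition` and its method names are hypothetical, and each peer is given a single unbounded FIFO input queue, as the slides assume. The two messages sent follow the directions given in the SAS guarded automata listing (register: Investor to Broker, accept: Broker to Investor).

```python
from collections import deque


class Composition:
    """Peers with unbounded FIFO input queues plus a virtual watcher (hypothetical names)."""

    def __init__(self, peers):
        self.queues = {peer: deque() for peer in peers}  # one input queue per peer
        self.conversation = []                           # what the watcher has recorded

    def send(self, receiver, message):
        # The watcher records a message when it is sent, not when it is received.
        self.conversation.append(message)
        self.queues[receiver].append(message)

    def receive(self, peer):
        # FIFO: a peer always consumes its oldest pending message first.
        return self.queues[peer].popleft()


sas = Composition(["Inv", "SB", "RD"])
sas.send("SB", "register")   # Investor -> Stock Broker
sas.send("Inv", "accept")    # Stock Broker -> Investor
first = sas.receive("SB")
print(sas.conversation)
```

Because the watcher records at send time, the conversation is fixed by the order of sends, regardless of when each peer gets around to consuming its queue.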

Conversation Protocols

• Conversation Protocol: An automaton that accepts the desired conversation set

[Figure: SAS conversation protocol, an automaton with states 1 to 12 and transitions labeled register, reject, accept, request, report, ack, cancel, bill, and terminate.]
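A conversation protocol can be read as an ordinary finite automaton over message classes. The sketch below encodes only a small fragment, and parts of it are assumptions: the transitions for register (s1 to s2) and accept (s2 to s5) and the final state s4 come from the SAS guarded automata listing in these slides, while the reject transition into s4 is assumed for illustration. Guards are ignored.

```python
# Fragment of a conversation protocol as a deterministic finite automaton.
TRANSITIONS = {
    ("s1", "register"): "s2",   # t1 in the SAS listing
    ("s2", "accept"): "s5",     # t2 in the SAS listing
    ("s2", "reject"): "s4",     # assumed for illustration
}
INITIAL = "s1"
FINAL = {"s4"}


def accepts(conversation):
    """Return True if the message sequence drives the automaton to a final state."""
    state = INITIAL
    for message in conversation:
        state = TRANSITIONS.get((state, message))
        if state is None:        # no transition for this message: reject
            return False
    return state in FINAL


print(accepts(["register", "reject"]))
print(accepts(["register", "accept"]))
```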

Properties of Conversations

• The notion of conversation enables us to reason about temporal properties of the composite web services

• LTL framework extends naturally to conversations

– LTL temporal operators: X (neXt), U (Until), G (Globally), F (Future)

– Atomic properties: predicates on message classes (or contents)

– Example: G ( accept ⇒ F bill )

• Model checking problem: Given an LTL property, does the conversation set satisfy the property?
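To make the example property concrete, here is a small check of it on a single finite conversation, reading the property as G(accept ⇒ F bill): every accept must be followed, at some later point, by a bill. This is a finite-trace simplification chosen for illustration; the function name is hypothetical, and a model checker such as SPIN verifies the property over all conversations of the protocol rather than over one trace.

```python
def always_eventually_follows(conversation, trigger, response):
    """Finite-trace reading of G(trigger => F response) on one conversation."""
    for i, message in enumerate(conversation):
        # F is evaluated from position i onward (the current position counts).
        if message == trigger and response not in conversation[i:]:
            return False
    return True


good = ["register", "accept", "request", "report", "ack", "bill", "terminate"]
bad = ["register", "accept", "request", "report"]

print(always_eventually_follows(good, "accept", "bill"))
print(always_eventually_follows(bad, "accept", "bill"))
```

A conversation with no accept at all satisfies the property vacuously, which matches the usual LTL semantics of implication under G.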

Web Service Analysis Tool (WSAT)

[Figure: WSAT architecture. Front End: BPEL web services are translated to guarded automata by "BPEL to GFSA" (bottom-up), and a conversation protocol is read by the GFSA parser into a guarded automaton (top-down); both feed the Intermediate Representation. Analysis: Synchronizability Analysis and Realizability Analysis, with success, fail, and skip outcomes. Back End: GFSA to Promela (bounded queue), GFSA to Promela (synchronous communication), and GFSA to Promela (single process, no communication). Verification Languages: Promela.]

• Friday 4:00pm, tool presentation at CAV

• Demonstration Saturday (or anytime you find me with my laptop)

SAS Guarded Automata

Topdown {
  Schema{
    PeerList{ Investor, Broker, ResearchDept },
    TypeList{
      Register ...
      Accept ...
    },
    MessageList{
      register{ Investor -> Broker : Register },
      accept{ Broker -> Investor : Accept },
      ...
    }
  },
  GProtocol{
    States{ s1,s2,s3,s4,s5,s6,s7,s8,s9,s10,s11,s12 },
    InitialState{ s1 },
    FinalStates{ s4 },
    TransitionRelation{
      t1{ s1 -> s2 : register, Guard{ true } },
      t2{ s2 -> s5 : accept,
          Guard{ true => $accept[//orderID := $register//orderID] } },
      ...
    }
  }
}

XML (eXtensible Markup Language)

• XML is a markup language like HTML

• Similar to HTML, XML tags are written as <tag> followed by </tag>

• HTML vs. XML

– In HTML, tags are used to describe the appearance of the data: <b> </b> <i> </i> ...

– In XML, tags are used to describe the content of the data rather than the appearance: <date> </date> <address> </address>

• XML documents can be modeled as trees where each internal node corresponds to a tag, and leaf nodes correspond to basic types

An XML Document and Its Tree

<Register>
  <investorID>VIP01</investorID>
  <requestList>
    <stockID>0001</stockID>
    <stockID>0002</stockID>
  </requestList>
  <payment>
    <accountNum>0425</accountNum>
  </payment>
</Register>

[Figure: the corresponding tree]

Register
  investorID: VIP01
  requestList
    stockID: 0001
    stockID: 0002
  payment
    accountNum: 0425
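The document-to-tree correspondence can be reproduced with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch: the helper `to_tree` and its tuple representation are hypothetical choices made for illustration, with internal nodes as `(tag, children)` and leaf tags as `(tag, text)`.

```python
import xml.etree.ElementTree as ET

DOC = """<Register>
  <investorID>VIP01</investorID>
  <requestList><stockID>0001</stockID><stockID>0002</stockID></requestList>
  <payment><accountNum>0425</accountNum></payment>
</Register>"""


def to_tree(element):
    """Return (tag, children) for internal nodes and (tag, text) for leaf tags."""
    children = list(element)
    if not children:
        return (element.tag, element.text)
    return (element.tag, [to_tree(child) for child in children])


tree = to_tree(ET.fromstring(DOC))
print(tree)
```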

MSL (Model Schema Language)

• MSL is a language for defining XML data types

– MSL captures core features of XML Schema

• Basic MSL syntax

g → ε | b | t [ g ] | g { m , n } | g , g | g & g | g | g

g is an XML type (i.e., an MSL type expression)

ε is the empty sequence

b is a basic type such as string, boolean, int, etc.

t is a tag

m and n are positive integers

[ ] { } & , | are MSL type constructors

MSL Semantics

t [ g ] denotes a type with root node labeled t with children of type g

g { m , n } denotes a sequence of size at least m and at most n where each member is of type g

g1 , g2 denotes an ordered sequence where the first member is of type g1 and the second member is of type g2

g1 & g2 denotes an unordered sequence where one member is of type g1 and the other member is of type g2

g1 | g2 denotes a choice between type g1 and type g2, i.e., either type g1 or type g2, but not both

An MSL Type Declaration and an Instance

Register[
  investorID[string] ,
  requestList[ stockID[int]{1,3} ] ,
  payment[ creditCardNum[int] | accountNum[int] ]
]

<Register>
  <investorID>VIP01</investorID>
  <requestList>
    <stockID>0001</stockID>
    <stockID>0002</stockID>
  </requestList>
  <payment>
    <accountNum>0425</accountNum>
  </payment>
</Register>
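The relationship between the type declaration and its instance can be checked mechanically. The sketch below is a simplified validator written for illustration, not the authors' implementation: the tuple encoding of MSL types (`"tag"`, `"seq"`, `"rep"`, `"choice"`, `"basic"`) and the function names are assumptions, unordered sequences (`&`) are left out, and `int` is approximated as a string of digits.

```python
import xml.etree.ElementTree as ET


def ends(t, elems, i):
    """Set of positions j such that elems[i:j] matches type t."""
    kind = t[0]
    if kind == "tag":                                  # t[g]
        _, name, content = t
        if i < len(elems) and elems[i].tag == name and valid_content(content, elems[i]):
            return {i + 1}
        return set()
    if kind == "seq":                                  # g1 , g2 , ...
        positions = {i}
        for part in t[1]:
            positions = {j for p in positions for j in ends(part, elems, p)}
        return positions
    if kind == "choice":                               # g1 | g2
        return {j for alt in t[1] for j in ends(alt, elems, i)}
    if kind == "rep":                                  # g{m,n}, with m >= 1
        _, inner, m, n = t
        positions, result = {i}, set()
        for count in range(1, n + 1):
            positions = {j for p in positions for j in ends(inner, elems, p)}
            if count >= m:
                result |= positions
        return result
    raise ValueError(kind)


def valid_content(content, element):
    if content[0] == "basic":                          # leaf: check the text only
        text = (element.text or "").strip()
        return len(element) == 0 and (text.isdigit() if content[1] == "int" else True)
    children = list(element)
    return len(children) in ends(content, children, 0)


def valid(t, element):
    return 1 in ends(t, [element], 0)


REGISTER = ("tag", "Register", ("seq", [
    ("tag", "investorID", ("basic", "string")),
    ("tag", "requestList", ("rep", ("tag", "stockID", ("basic", "int")), 1, 3)),
    ("tag", "payment", ("choice", [
        ("tag", "creditCardNum", ("basic", "int")),
        ("tag", "accountNum", ("basic", "int")),
    ])),
]))

ok = ET.fromstring(
    "<Register><investorID>VIP01</investorID>"
    "<requestList><stockID>0001</stockID><stockID>0002</stockID></requestList>"
    "<payment><accountNum>0425</accountNum></payment></Register>"
)
print(valid(REGISTER, ok))
```

An instance with four stockID elements, or with both creditCardNum and accountNum under payment, is rejected because the child sequence can no longer be consumed exactly.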

Mapping MSL types to Promela

• Restrictions: no unbounded or unordered sequences, no string manipulation

• Basic types

– integer and boolean types are mapped to Promela basic types int and bool

– strings are mapped to enumerated type (mtype) in Promela

• we only allow constant string values

• Type constructors are handled using

– structured types (declared using typedef) in Promela

– or arrays

Example

Register[
  investorID[string] ,
  requestList[ stockID[int]{1,3} ] ,
  payment[ creditCardNum[int] | accountNum[int] ]
]

typedef t1_investorID{ mtype stringvalue; }
typedef t2_stockID{ int intvalue; }
typedef t3_requestList{
  t2_stockID stockID [3];
  int stockID_occ;
}
typedef t4_accountNum{ int intvalue; }
typedef t5_creditCard{ int intvalue; }
mtype {m_accountNum, m_creditCard}
typedef t6_payment{
  t4_accountNum accountNum;
  t5_creditCard creditCard;
  mtype choice;
}
typedef Register{
  t1_investorID investorID;
  t3_requestList requestList;
  t6_payment payment;
}

XPath

• In order to write specifications or programs that manipulate XML documents we need:

– an expression language to access values and nodes in XML documents

• XPath is a language for writing expressions (queries) that navigate through XML trees and return a set of answer nodes

• An XPath query defines a function which

– takes an XML tree and a context node (in the same tree) as input and

– returns a set of nodes (in the same tree) as output

XPath Syntax

Basic XPath syntax:

q → . | .. | b | t | * | q / q | q // q | q [ exp ]

q is an XPath query

exp denotes a predicate on basic types, i.e., on the leaf nodes of the XML tree

b denotes a basic type such as string, boolean, int, etc.

t denotes a tag

XPath Semantics

XPath expressions are evaluated from left to right

Given an XML tree and a node n as a context node

. returns n

.. returns the parent of n

Given an XML tree and a set of nodes

* returns all the nodes

b returns the nodes that are of basic type b

t returns the nodes which are labeled with tag t

XPath Semantics Contd.

Starting at the context node:

q1 / q2 returns each node which matches q2 starting at a child of a node which matches q1

q1 // q2 returns each node which matches q2 starting at a descendant of a node which matches q1

(if q1 is missing, then start at the root)

q [ exp ] returns the nodes that match q and with children for which exp evaluates to true

Examples

//payment/* returns the node labeled accountNum

/Register/requestList/stockID/int returns the nodes labeled 0001 and 0002

//stockID[int > 1]/int returns the node labeled 0002

[Figure: XML tree of the Register document: Register has children investorID (VIP01), requestList (stockID 0001, stockID 0002), and payment (accountNum 0425).]
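The three example queries can be approximated with Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath. Two caveats apply to this sketch: ElementTree stores a leaf value as the text of its element rather than as a separate node, so the trailing `/int` step becomes reading `.text`, and it cannot evaluate numeric comparisons, so the predicate `[int > 1]` is applied as a Python filter. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<Register><investorID>VIP01</investorID>"
    "<requestList><stockID>0001</stockID><stockID>0002</stockID></requestList>"
    "<payment><accountNum>0425</accountNum></payment></Register>"
)

# //payment/* : every child of a payment element.
payment_children = [e.tag for e in root.findall(".//payment/*")]

# /Register/requestList/stockID/int : root is Register, so the path is relative to it.
stock_values = [e.text for e in root.findall("./requestList/stockID")]

# //stockID[int > 1]/int : numeric predicate applied in Python.
big_values = [e.text for e in root.iter("stockID") if int(e.text) > 1]

print(payment_children, stock_values, big_values)
```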

XPath to Promela

• Generate code that evaluates the XPath expression

– Restrictions: no ancestors-axis, no string expressions

• Uses two data structures

– Type tree shows the structure of the corresponding MSL type

– Abstract statements which are mapped to Promela code

• Traverse the XPath expression from left to right

– Statements generated in each step are inserted into the BLANK spaces left in the code from the previous step

– The type tree is used to keep track of the context of the generated code

Statement: Promela Code

IF(c):
    if
    :: c -> BLANK
    :: else -> skip
    fi

FOR(v,l,h):
    v = l - 1
    do
    :: v < h -> BLANK
       v++
    :: else -> break
    od

EMPTY:
    BLANK

INC(v):
    v++

SET(v,a):
    v = a
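The statement-to-code mapping and the BLANK insertion step can be mimicked with string templates. This is a sketch under stated assumptions, not WSAT's generator: the names `TEMPLATES`, `expand`, and `insert` are hypothetical, the `;` separators in the FOR template are added here so the output is plausible Promela, and only the first BLANK is filled at each step.

```python
# Abstract statements mapped to Promela text, following the table above.
TEMPLATES = {
    "IF": "if\n:: {0} ->\n  BLANK\n:: else -> skip\nfi",
    "FOR": "{0} = {1} - 1;\ndo\n:: {0} < {2} ->\n  BLANK;\n  {0}++\n:: else -> break\nod",
    "EMPTY": "BLANK",
    "INC": "{0}++",
    "SET": "{0} = {1}",
}


def expand(name, *args):
    """Instantiate one abstract statement as Promela text."""
    return TEMPLATES[name].format(*args)


def insert(code, statement):
    """Fill the first BLANK left by a previous step with new code."""
    return code.replace("BLANK", statement, 1)


code = expand("IF", "bRes1")                       # leaves one BLANK
code = insert(code, expand("SET", "bRes2", "0"))   # fill it in the next step
print(code)
```

Each traversal step either closes a BLANK with a statement that has none (INC, SET) or replaces it with a statement that opens a new one (IF, FOR, EMPTY), which is how nested Promela code is built up from left to right.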

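Type Tree

Register[
  investorID[string] &
  requestList[ stockID[int]{1,3} ] &
  payment[ creditCardNum[int] | accountNum[int] ]
]

[Figure: type tree for the MSL type above, with nodes numbered 1 to 11: Register; investorID with child string; requestList with child stockID (idx: i1), which has child int; payment with children creditCard and accountNum, each with child int.]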

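$register // stockID / [int()>5] / [position() = last()] / int()

[Figure: abstract statements generated for the XPath expression above, combined by Sequence and Insert steps and annotated with type tree node numbers (1, 5, 6). Statements shown: FOR (i1,1,3), IF (cond), SET (bRes1,0), SET (bRes1,1), IF (bRes1), SET (i2,1), INC (i2), IF (i2==i3), SET (bRes2,0), SET (bRes2,0), IF (bRes2), EMPTY.]

cond: v_register.requestlist.stockID[i1] > 5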

$request//stockID=$register//stockID[int()>5][position()=last()]

/* result of the XPath expression */
bool bResult = false;
/* results of the predicates 1, 2, and 1 resp. */
bool bRes1, bRes2, bRes3;
/* index, position(), last(), index, position() */
int i1, i2, i3, i4, i5;

i2=1;
/* pre-calculate the value of last(), store in i3 */
i4=0; i5=1; i3=0;
do
:: i4 < v_register.requestList.stockID_occ ->
   /* compute first predicate */
   bRes3 = false;
   if
   :: v_register.requestList.stockID[i4].intvalue>5 -> bRes3 = true
   :: else -> skip
   fi;
   if
   :: bRes3 -> i5++; i3++;
   :: else -> skip
   fi;
   i4++;
:: else -> break;
od;

$request//stockID=$register//stockID[int()>5][position()=last()]

i1=0;
do
:: i1 < v_register.requestList.stockID_occ ->
   bRes1 = false;
   if
   :: v_register.requestList.stockID[i1].intvalue>5 -> bRes1 = true
   :: else -> skip
   fi;
   if
   :: bRes1 ->
      bRes2 = false;
      if
      :: (i2 == i3) -> bRes2 = true;
      :: else -> skip
      fi;
      if
      :: bRes2 ->
         if
         :: (v_request.stockID.intvalue == v_register.requestList.stockID[i1].intvalue) -> bResult = true;
         :: else -> skip
         fi
      :: else -> skip
      fi;
      i2++;
   :: else -> skip
   fi;
   i1++;
:: else -> break;
od;

Model Checking Using Promela

• Error in SAS conversation protocol

t14{ s8 -> s12 : bill,
     Guard{
       $request//stockID = $register//stockID [position() = last()]
       =>
       $bill[ //orderID := $register//orderID ]
     }
}

• Repeating stockID will cause error

• One can only discover these kinds of errors by analysis of XPath expressions
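One plausible reading of this error, which the slide does not spell out, is that the guard is meant to fire only once the last requested stock has been handled, but it compares values rather than positions. The sketch below illustrates that reading only: the function names are hypothetical, and it assumes requests are issued one per stockID in list order.

```python
def guard_holds(request_stock, register_stocks):
    """$request//stockID = $register//stockID[position() = last()]"""
    return request_stock == register_stocks[-1]


def first_position_where_guard_holds(register_stocks):
    # Assumption: one request per stockID, issued in the order of the list.
    for position, stock in enumerate(register_stocks, start=1):
        if guard_holds(stock, register_stocks):
            return position
    return None


print(first_position_where_guard_holds([1, 2, 3]))   # distinct stockIDs
print(first_position_where_guard_holds([7, 8, 7]))   # repeated stockID
```

With distinct stockIDs the guard first holds at the final request; when the last stockID also appears earlier in the list, it already holds at that earlier request, so the bill transition becomes enabled too soon.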

Related Work

• Verification of web services

– Simulation, verification, composition of web services using a Petri net model [Narayanan, McIlraith WWW’02]

– Using MSC to model BPEL web services which are translated to labeled transition systems and verified using model checking [Foster, Uchitel, Magee, Kramer ASE’03]

– Model checking Web Service Flow Language specifications using SPIN [Nakajima ICWE’04]

– BPEL verification using a process algebra model and Concurrency Workbench [Koshkina, van Breugel TAV-WEB’04]

Related Work

• Conversation specification

– IBM Conversation support project http://www.research.ibm.com/convsupport/

– Conversation support for business process integration [Hanson, Nandi, Kumaran EDOCC’02]

Future Work

• Other input languages in the front end

– WSCI, OWL-S

• Other verification tools at the back end

– SMV, Action Language Verifier

• Symbolic representations for XML data

• Abstraction for XML data and XML data manipulation

Current and Future Work

[Figure: extended WSAT architecture. Front End: Web Service Specification Languages (BPEL, WSCI, ...) are handled by a translator for bottom-up specifications and a translator for top-down specifications, producing guarded automata and conversation protocols in the Intermediate Representation. Analysis: Synchronizability Analysis and Realizability Analysis, with success, fail, and skip outcomes, plus Automated Abstraction. Back End: translation with bounded queue, translation with synchronous communication, and translation with single process, no communication. Verification Languages: Promela, SMV, Action Language, ...]