
[IEEE 2011 IEEE 5th International Symposium on Theoretical Aspects of Software Engineering (TASE) - Xi'an, China (2011.08.29-2011.08.31)] 2011 Fifth International Conference on Theoretical


A Method to Generate Verification Condition Generator

Zhaopeng Li†,‡, Yang Zhang†,‡, Yiyun Chen†,‡

† School of Computer Science and Technology, University of Science and Technology of China, Hefei, China

‡ Software Security Laboratory, Suzhou Institute for Advanced Study, University of Science and Technology of China, Suzhou, China

Email: {zpli, huahuax}@mail.ustc.edu.cn [email protected]

Abstract—We propose a method to automatically generate certain verification condition generators (VCGens, for short) for use in certifying compilers and other verification tools, to alleviate the burden of developing many kinds of VCGens for domain-specific program verification tools. We introduce a new methodology for describing the rules of verification condition calculation. We have implemented a prototype, VCGEN2 (VCGenGen), in C++. The tool provides users with a series of interfaces called action functions, and users describe the calculation rules by combining these action functions. The tool also embeds a parser generator, so users need only supply the grammar of the languages along with the calculation rules. If there are no errors, VCGEN2 outputs the VCGen corresponding to the user-defined languages and rules. As a demonstration, we have used our prototype to successfully generate a number of VCGens.

I. INTRODUCTION

In high-confidence software research, formal verification is a typical method for constructing high-confidence software. There are roughly two approaches to formal verification. The first is model checking [1], which consists of a systematic, exhaustive exploration of a mathematical model of the system. Model checking has been used extensively in the verification of hardware designs. The second approach is logical inference, which reasons about the system using a formal version of mathematical reasoning, usually with theorem-proving software to prove the resulting formulae.

In this paper we focus on program verification via Hoare-style logical inference and theorem-proving techniques, since this method is widely used and can provide a mathematical, even machine-checkable, proof of the properties of given programs. Much current research uses this approach to certify critical software and provide formal proofs.

Verification condition generation is a common way to carry out Hoare-style logical inference. A program is annotated at certain program points with sufficient specifications (pre-/post-conditions and loop invariants), which are assertions describing the desired program properties. A program logic, with inference rules over Hoare triples ({P}S{Q}), must be designed to reason about such programs. Then a strongest post-condition (or weakest pre-condition) calculation is used to generate a set of proof obligations called verification conditions (VCs, for short). These are simply formulae of the assertion language that contain no occurrences of program constructs. All verification conditions must be shown valid in order to conclude that the given program conforms to its specifications.

In recent years, many projects using the verification-condition-based method have been proposed. It is not easy to verify various properties of programs written in different languages using a single all-in-one program logic. To solve this problem, domain-specific logics are designed to certify programs written in the corresponding domain-specific languages [2].

To alleviate the burden of developing many kinds of VCGens for domain-specific program verification tools, we propose in this paper a method to generate certain verification condition generators automatically. Although different VCGens work on roughly the same principle, they cannot easily be reused across projects, because they deal with programs in different source languages, different assertion languages, and even different logics. At the same time, generating verification conditions is a mechanical process once the calculation rules are clearly designed. This motivates us to design a tool that helps anyone who needs to build a VCGen for their project. Instead of coding different VCGens again and again for different requirements, users of our tool only need to provide the grammars of the source language and the assertion language, and the rules of the verification condition calculus. Roughly speaking, our tool works in a manner similar to the compiler tool yacc [3]: it generates the VCGen corresponding to the input information. We hope that the work presented here can be leveraged to provide convenience and tool support in program verification.

The main contributions are:

• An efficient method for generating VCGens automatically.

• A design of a series of embedded functions called action functions, including how to combine them to express the calculus rules of verification condition generation.

• An implementation of a prototype named VCGEN2, which automatically generates a VCGen based on strongest post-condition calculation from simple input describing the grammar of the languages and the calculus rules.

This paper is organized as follows. In Section II, the framework and the workflow of the generator for VCGen are introduced. The action functions are presented in Section III. In the final section, we compare our work with related work, summarize our work, and introduce future work.

2011 Fifth IEEE International Conference on Theoretical Aspects of Software Engineering

978-0-7695-4506-6/11 $26.00 © 2011 IEEE

DOI 10.1109/TASE.2011.25

Fig. 1. The Framework of VCGEN2 and a Typical Application

II. THE VCGen GENERATOR

In this section we present the framework of the VCGen generator. The key challenges are to work out how to express the calculus rules and how to relate them to the corresponding language structures. We use a method similar to that of the compiler tool yacc, in which users write code to relate the syntax to the program text. We provide users with a set of embedded functions, named action functions, with which they can describe the calculus rules of verification condition generation.

As shown in Fig. 1, the input of VCGEN2 includes the description of a source language, an assertion language, and the calculus rules. VCGEN2 contains two corresponding components: a lexical and syntax analyzer generator (Module A in Fig. 1) and a calculus rule analyzer based on action functions (Module B in Fig. 1).

After analysis, VCGEN2 generates a parser that parses the program into an abstract syntax tree (AST). The other component of VCGEN2 then generates a VCGen that traverses the AST. Together, they yield a VCGen with a parser as its front-end. A typical application is also shown in Fig. 1. When a program with specifications, written in the corresponding source language and assertion language, is given as input, the verification conditions are output to an automated theorem prover, which produces a result indicating whether or not the verification conditions are valid.

Next we emphasize three aspects of the design of this framework.

A. Interfaces

To generate a VCGen, three inputs must be provided: the programming language, the assertion language, and the corresponding calculus rules for verification condition generation. The generated tool must embed a parser as its front-end to construct the syntax skeleton (abstract syntax tree) of the input program. Based on the syntax tree, the calculus rules can be applied to generate verification conditions.

As is common when generating a compiler, to generate a parser users should at least supply the lexical and grammar rules according to some standard, either using existing techniques or designing one from scratch. Users could also build symbol tables and provide rules for type checking, making the front-end more practical. For this part of the input, we recommend existing notations for lexical and grammar rules, so that the implementation can be simplified by directly using existing tools such as lex, flex, and yacc.

Calculus rules for verification condition generation are essential both when developing a VCGen by hand and when generating one. In general, programmers designing a VCGen first set up a set of logical rules. Once these rules are worked out, implementing the VCGen is routine work. Describing the calculus rules clearly is not enough, however: the tool must also interpret the rules exactly as they were designed. Making VCGEN2 understand the rules is one of the main challenges. We propose a method called action function combination, in which a calculus rule is represented by a set of actions executed in sequence. A library of action functions is provided with VCGEN2, and the rule analyzer generates a VCGen based on these action functions. To support more logics, new action functions can easily be added.

B. The Structure of AST

The abstract syntax tree bridges grammar analysis and the process of verification condition generation. To generate verification conditions, the VCGen traverses and processes the AST produced by the parser. Traditionally, the parsers of compilers for different programming languages in different projects output different ASTs. It would be impractical to design a VCGen generator that works with an arbitrary AST structure. So although the programming languages and assertion languages may vary, a uniform (fixed) AST structure should be used. We therefore need general design standards for the AST structure, so that no matter what language constructs the tool deals with, the AST generated by the parser can always be recognized by the generated VCGen:

• For each grammar rule, the left-hand-side symbol is regarded as the parent node, and all terminal and nonterminal symbols on the right-hand side are regarded as its child nodes;

• The child nodes are stored in a singly-linked list, in the same order as the symbols they represent in the rule. The parent has a pointer to the first child. The node corresponding to a terminal symbol is a leaf of the abstract syntax tree, while the node of a nonterminal symbol is constructed as a non-leaf tree node.

Users do not need to understand the structure of the AST. They input the grammar of the languages and use a number following the symbol “$” to refer to the corresponding terminal or nonterminal symbol. Because a production and its AST node have the same structure, $n denotes the nth symbol of the production, and thus the nth child on which to operate in the AST. In this way, symbols constructed by the parser can be used in action functions and processed in the VCGen. As discussed above, we need a fixed AST structure to bridge the parser module and the VCGen module. If we used yacc, users would have to take part in constructing the AST, which would make implementation troublesome. Instead, our prototype embeds a parser module that analyzes the lexical and syntax rules and generates the AST at the same time.
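The uniform AST layout and the `$n` lookup described in this subsection can be sketched in C++ as follows. This is only an illustration under stated assumptions: the names `Node`, `appendChild`, `nthChild` and `isLeaf` are hypothetical and are not taken from the VCGEN2 sources.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical uniform AST node (first-child / next-sibling layout).
struct Node {
    std::string token;            // grammar symbol or lexeme
    Node* firstChild = nullptr;   // the parent points only to its first child
    Node* nextSibling = nullptr;  // children form a singly-linked list

    explicit Node(std::string t) : token(std::move(t)) {}
};

// Append `child` to the end of `parent`'s child list, so that children keep
// the left-to-right order of the symbols in the production.
inline void appendChild(Node& parent, Node& child) {
    assert(child.nextSibling == nullptr && "child must not already be linked");
    if (parent.firstChild == nullptr) {
        parent.firstChild = &child;
        return;
    }
    Node* cur = parent.firstChild;
    while (cur->nextSibling != nullptr) cur = cur->nextSibling;
    cur->nextSibling = &child;
}

// Resolve `$n`: the n-th (1-based) right-hand-side symbol of the production
// that produced `parent`. Returns nullptr when n is out of range.
inline Node* nthChild(const Node& parent, std::size_t n) {
    if (n == 0) return nullptr;
    Node* cur = parent.firstChild;
    for (std::size_t i = 1; cur != nullptr && i < n; ++i) cur = cur->nextSibling;
    return cur;
}

// Terminal symbols become leaves; nonterminals become interior nodes.
inline bool isLeaf(const Node& node) { return node.firstChild == nullptr; }
```

For a production with three right-hand-side symbols, such as an assignment of the form a = E, `nthChild(stmt, 1)` and `nthChild(stmt, 3)` would return the nodes for a and E, which is the role `$1` and `$3` play in the rule notation.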

C. Rule Presentation and Analyzer

The calculus rule analyzer (for the rules of verification condition generation) is the core module of VCGEN2. A grammar production rule and its calculus rule, written as an action function sequence, are described in the following form:

production rule {f1; ...; fn},

where f1, ..., fn are action functions that represent a calculus rule, or operations that modify the structure of the AST.

A set of action functions is pre-defined and implemented as a library. Users can combine action functions to represent different calculus rules. In essence, action functions describe a series of operations that are applied sequentially to the abstract syntax tree of the program. Such a combination of action functions can represent the process of verification condition generation. We introduce our pre-defined action functions in detail in Section III.

Note that it is the users’ responsibility to ensure that the calculus rules do exactly what they are intended to do. Currently we do not provide any mechanism to check whether the given action function sequences are “correct”. The generated VCGen simply traverses the AST, performs a series of actions on it, and produces the resulting verification conditions.
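The rule form above, a production paired with an action sequence executed in order, can be pictured with the following minimal C++ sketch. The types `Context`, `Action` and `Rule` and the helpers `applyRule` and `logged` are hypothetical; the actual rule analyzer operates on AST nodes rather than on a log of names.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the VCGen state. The real tool works on AST
// nodes; here we only record which actions ran, and in what order.
struct Context {
    std::vector<std::string> log;
};

using Action = std::function<void(Context&)>;

// One entry of the rule description: `production rule {f1; ...; fn}`.
struct Rule {
    std::string production;       // e.g. "stmt : ID '=' exp"
    std::vector<Action> actions;  // f1, ..., fn
};

// Apply a rule by running its actions strictly in the order written.
inline void applyRule(const Rule& rule, Context& ctx) {
    for (const Action& f : rule.actions) {
        assert(f && "every action in a rule must be callable");
        f(ctx);
    }
}

// Build a named action that records its own execution.
inline Action logged(const std::string& name) {
    return [name](Context& ctx) { ctx.log.push_back(name); };
}
```

As the text notes, nothing in such a scheme checks that the sequence is a sound calculus rule; the sketch only fixes the order of execution.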

III. THE ACTION FUNCTIONS

One of the difficulties is how to express a calculus rule so that it is easy to understand and can be “executed” automatically. To solve this problem, this paper proposes embedded functions called “action functions”. This section introduces the design of action functions and how to use them. One challenge is to design a small number of action functions that can be used across different languages and logics. Only informal descriptions of the action functions are given here; interested readers may refer to the web site [10] for the detailed implementation.

VCGEN2 provides a library of embedded functions that implement specific operations on the AST structure, and users can combine these functions in a certain order into a sequence that describes a calculus rule. The names of action functions, such as AND, SUB and EQ, are all keywords. This method can be applied to generate VCGens implementing either strongest post-condition calculation or weakest pre-condition calculation. Due to limited space, we focus only on strongest post-condition calculation in this paper.

A. Categories of Action Functions

In the strongest post-condition calculation, a calculus rule in essence describes how to obtain the post-condition (assertion Q), given that a pre-condition P holds at the point before the statement S is executed. That is, Q = SP(P, S). For example, if assertions just describe the states at program points, the strongest post-condition directly represents the semantics of the executed code. In a sense, the calculus rule describes a transformation between assertions (from the pre-condition to the post-condition). Such a transformation can be divided into many smaller atomic transformations called actions. Some actions represent an operation on assertions (which are also nodes in the AST), and other actions operate directly on the structure of the AST to construct a new AST. Thus we can compose them to construct the AST and express the calculus of verification condition generation.

As we can see, the pre-condition (current assertion) P is used frequently. We omit it from the parameter lists of the action functions and use p (a pointer to the node representing an assertion) to access it, treating it like a global variable. Thus SP(node S) denotes the main function implementing the strongest post-condition calculation in the generated VCGen. Next, some of the categories of action functions are listed according to what the actions do. For the other categories, please refer to our technical report [10].

• Action functions that modify the current assertion.

This class of action functions operates directly on the current assertion (an AST whose root node has the token ASSERT). SetP(x) resets the current assertion p to the new value x. AND(x) conjoins the expression x to the current assertion p. SUB(x, y) performs variable substitution recursively. EQ(x1, x2) is used when describing the actions of an assignment statement such as “a = E”. This action function creates a new node whose token is EXP and whose children are the nodes representing a, EQ (==) and E.

• Store the current assertion.

During verification condition generation, users may want to store the current assertion before modifying it, in order to use it later. This usually happens when dealing with conditional statements, such as the if-else statement, and loop statements. VCGEN2 provides a vector V of node pointers (initially NULL), with which users can store and load node pointers pointing to a copy of a certain assertion (AST). Users can use the action function SetV(i) to store the current assertion p into the vector V at index i, and load the assertion at index i using V(i).


• Generate the verification condition.

The action function IMPLY(x) generates a verification condition, which is a tree whose root node has the token IMPLY. The two children of the IMPLY tree are p (the current assertion) and x.
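To make the three categories above concrete, the following C++ sketch models SetP, AND, EQ, SetV, V and IMPLY on a simplified tree. It is an illustration only, under several assumptions: children are kept in a `std::vector` instead of the singly-linked list described in Section II, the implicit variable substitution performed by the real AND is omitted, and the names `VCGenState`, `mk`, `clone`, `show`, `slots` and `vcs` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node;
using NodePtr = std::shared_ptr<Node>;

// Simplified AST node; a child vector stands in for the linked list.
struct Node {
    std::string token;
    std::vector<NodePtr> kids;
};

inline NodePtr mk(std::string token, std::vector<NodePtr> kids = {}) {
    return std::make_shared<Node>(Node{std::move(token), std::move(kids)});
}

// Deep copy, so that a stored assertion is unaffected by later changes.
inline NodePtr clone(const NodePtr& n) {
    if (!n) return nullptr;
    std::vector<NodePtr> kids;
    for (const NodePtr& k : n->kids) kids.push_back(clone(k));
    return mk(n->token, std::move(kids));
}

// Render a tree as token(child,...); for inspection only.
inline std::string show(const NodePtr& n) {
    if (!n) return "NULL";
    if (n->kids.empty()) return n->token;
    std::string s = n->token + "(";
    for (std::size_t i = 0; i < n->kids.size(); ++i) {
        if (i > 0) s += ",";
        s += show(n->kids[i]);
    }
    return s + ")";
}

// EQ(x1, x2): a new EXP node whose children are x1, EQ and x2.
inline NodePtr EQ(NodePtr x1, NodePtr x2) {
    return mk("EXP", {std::move(x1), mk("EQ"), std::move(x2)});
}

struct VCGenState {
    NodePtr p;                   // current assertion
    std::vector<NodePtr> slots;  // the vector V; unset entries are null
    std::vector<NodePtr> vcs;    // generated verification conditions

    void SetP(NodePtr x) { p = std::move(x); }

    // Conjoin x to the current assertion (no implicit substitution here).
    void AND(NodePtr x) {
        assert(p && "current assertion must be set before AND");
        p = mk("AND", {p, std::move(x)});
    }

    // Store a copy of the current assertion at index i.
    void SetV(std::size_t i) {
        if (i >= slots.size()) slots.resize(i + 1);
        slots[i] = clone(p);
    }

    // Load the assertion stored at index i (null if nothing was stored).
    NodePtr V(std::size_t i) const {
        if (i < slots.size()) return slots[i];
        return nullptr;
    }

    // Emit the verification condition IMPLY(p, x).
    void IMPLY(NodePtr x) { vcs.push_back(mk("IMPLY", {p, std::move(x)})); }
};
```

A rule for a conditional statement would typically call SetV before processing one branch and restore the saved assertion with SetP(V(i)) before processing the other; the exact sequences used by the authors are given in their technical report [10], not here.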

B. Using Action Functions

Next we give some simple examples showing how to use various action functions to represent the calculus rules for some statements according to their production rules. Two points must be understood before using these action functions:

• The parameters of action functions are written as a number following the symbol “$”, which stands for the position of the corresponding terminal or nonterminal symbol in the production rule.

• If the production rule is for a statement, the action functions that follow describe its corresponding calculus rule; otherwise, the action functions describe operations that build the AST or supply the tool with information such as types.

Representing the Calculus Rule of the Assignment Statement. The inference rule for the assignment statement (where x is a non-pointer variable) is given in Section II. It is designed to suit the strongest post-condition calculus using variable substitution. The calculus rule is quite simple when written with our action functions: AND(EQ($1, $3)), since we make the complicated variable substitution implicit (the substitution is done inside the action function AND).
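The substitution that AND hides can be illustrated with the textbook Floyd-style strongest post-condition for assignment, SP(P, x = E) = P[x0/x] ∧ x == E[x0/x], where x0 names the old value of x. The C++ sketch below implements that textbook rule, which may differ in detail from the authors' rule; `Expr`, `node`, `subst`, `spAssign` and `show` are hypothetical names, and the caller is assumed to pass a fresh name that occurs in neither the pre-condition nor the right-hand side.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

// Expression/assertion tree; nodes without children are variables or constants.
struct Expr {
    std::string token;
    std::vector<ExprPtr> kids;
};

inline ExprPtr node(std::string token, std::vector<ExprPtr> kids = {}) {
    return std::make_shared<Expr>(Expr{std::move(token), std::move(kids)});
}

// SUB-like helper: rebuild `e` with every leaf named `var` replaced by `repl`.
// The input tree is left unmodified.
inline ExprPtr subst(const ExprPtr& e, const std::string& var, const ExprPtr& repl) {
    assert(e && repl);
    if (e->kids.empty()) return e->token == var ? repl : e;
    std::vector<ExprPtr> kids;
    for (const ExprPtr& k : e->kids) kids.push_back(subst(k, var, repl));
    return node(e->token, std::move(kids));
}

// Floyd-style strongest post-condition of `x = rhs` under `pre`:
//   AND(pre[fresh/x], EQ(x, rhs[fresh/x]))
// `fresh` stands for the old value of x (implicitly existentially quantified).
inline ExprPtr spAssign(const ExprPtr& pre, const std::string& x,
                        const ExprPtr& rhs, const std::string& fresh) {
    ExprPtr old = node(fresh);
    return node("AND", {subst(pre, x, old),
                        node("EQ", {node(x), subst(rhs, x, old)})});
}

// Render a tree as token(child,...); for inspection only.
inline std::string show(const ExprPtr& e) {
    if (e->kids.empty()) return e->token;
    std::string s = e->token + "(";
    for (std::size_t i = 0; i < e->kids.size(); ++i) {
        if (i > 0) s += ",";
        s += show(e->kids[i]);
    }
    return s + ")";
}
```

With the pre-condition x > 0 and the statement x = x + 1, this yields the conjunction of x0 > 0 and x == x0 + 1. Folding the substitution into a single action keeps the user-visible rule as short as AND(EQ($1, $3)).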

IV. RELATED WORK AND CONCLUSION

A. Related Work

It is not easy to verify various properties of programs written in different languages using a single all-in-one program logic. To solve this problem, domain-specific logics are designed to certify programs written in the corresponding domain-specific languages. In recent years, many projects using the verification-condition-based method have been proposed.

Boogie [5] is an intermediate language used by Spec# [6] to verify programs. FreeBoogie is a VCGen for Boogie that supports both the weakest pre-condition calculus and the strongest post-condition calculus. Why [7] is a general-purpose VCGen, which is used as a back-end in several verification tools and can also be used directly to verify programs. Two tools built on it deal with different languages: Krakatoa [7] for the verification of Java programs and Caduceus [7] for the verification of C programs. In our previous work on certifying compilers [8], different VCGens were designed as components of source-level verification for two different program logics: the pointer logic [9] and separation logic.

The most closely related work is Why [7] and Boogie [5]. Although a series of tools supporting different programming languages can be built on top of their intermediate languages, their key contributions do not address how to generate a VCGen automatically. Our work is a new experiment in VCGen generation.

B. Conclusion

In this paper we propose a method to automatically generate certain verification condition generators for use in certifying compilers and other verification tools.

VCGEN2 is a research prototype still under heavy development, and therefore there is plenty of room for improvement. Thanks to the introduction of action functions, VCGEN2 can support more language features and logics by adding more action functions. Currently, users are required to have some knowledge of verification condition generation in order to describe the calculus rules correctly with action functions. Therefore, at the moment, VCGEN2 may be somewhat complex for ordinary programmers to use.

In the future, we plan several improvements that may help alleviate this problem and make the tool more useful. We plan to design several sets of action functions to cover a wide range of language features (e.g., to support object-oriented languages) and logics (e.g., separation logic).

ACKNOWLEDGMENT

We thank the anonymous referees for their suggestions and comments. This research is based on work supported in part by grants from the National Natural Science Foundation of China (under grants No. 60928004, No. 61073040 and No. 61003043). Any opinions, findings and conclusions contained in this document are those of the authors and do not reflect the views of these agencies.

REFERENCES

[1] Muller-Olm, M., Schmidt, D.A. and Steffen, B. Model checking: a tutorial introduction. In Proc. 6th Static Analysis Symposium, Springer LNCS 1694, 1999, pp. 330-354.

[2] Xinyu Feng, Zhong Shao, Yu Guo and Yuan Dong. Combining Domain-Specific and Foundational Logics to Verify Complete Software Systems. In Proc. Second IFIP Working Conference on Verified Software: Theories, Tools, and Experiments (VSTTE’08), Toronto, Canada, October 2008. Lecture Notes in Computer Science Vol. 5295, pages 54-69.

[3] Stephen C. Johnson. YACC: Yet Another Compiler-Compiler. Unix Programmer’s Manual Vol 2b, 1979.

[4] J. C. Reynolds. Separation logic: a Logic for Shared Mutable Data Structures. In Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science, pages 55-74, July 2002.

[5] Mike Barnett, Bor-Yuh Evan Chang, Robert DeLine, Bart Jacobs, and K. Rustan M. Leino. Boogie: A Modular Reusable Verifier for Object-Oriented Programs. In FMCO 2005, LNCS vol. 4111, Springer, 2006.

[6] Mike Barnett, K. Rustan M. Leino, and Wolfram Schulte. The Spec# Programming System: An overview. In CASSIS 2004, LNCS vol. 3362, Springer, 2004.

[7] Jean-Christophe Filliatre and Claude Marche. The Why/Krakatoa/Caduceus platform for deductive program verification. In Werner Damm and Holger Hermanns, editors, 19th International Conference on Computer Aided Verification, Lecture Notes in Computer Science, Berlin, Germany, July 2007. Springer-Verlag.

[8] Zhaopeng Li, Zhong Zhuang, Yiyun Chen, Simin Yang, Zhenting Zhang and Dawei Fan. A Certifying Compiler for Clike Subset of C Language. In Proc. of 4th IEEE International Symposium on Theoretical Aspects of Software Engineering (TASE’2010): 47-56, Taiwan, China, August 2010.

[9] Yiyun Chen, Zhaopeng Li, Zhifang Wang and Baojian Hua. A Pointer Logic for Verification of Pointer Programs. Chinese Journal of Software. Vol. 21 (No. 3): 124-137, March 2010.

[10] Zhaopeng Li, Yang Zhang and Yiyun Chen. Technical Report and Implementation of A Generator for Verification Condition Generator. Website URL: http://kyhcs.ustcsz.edu.cn/content/vcgen-generator, April 2011.
