Checking Reachability using Matching Logic Grigore Rosu and Andrei Stefanescu University of Illinois, USA


From Hoare Logic to Matching Logic Reachability


Main Goal

  • Language-independent program verification framework
  • Derive program properties from operational semantics
  • Questions: Is it possible? Is it practical?
  • Answers: Sound and complete proof system, so YES, it is possible! Efficient automated verifier MatchC, so YES, it is practical!

The main goal of our research is to develop a language-independent program verification framework, which can derive program correctness properties directly from the operational semantics of a language, without any axiomatic semantics. Questions: Is this approach possible? If yes, is it practical? We present a sound and complete proof system, so the answer is yes, it is possible. We have implemented MatchC, an efficient automated verifier based on the proof system, so the answer is yes, it is practical.

Overview

  • State-of-the-art in Certifiable Verification
  • Our Approach
  • Specifying Reachability Properties
  • Reasoning about Reachability

During this presentation, I will briefly discuss the state of the art in certifiable verification, then I will present our approach, in particular how we specify program reachability properties and how we reason about them efficiently using our proof system.

Operational Semantics

  • Easy to define and understand
  • Can be regarded as formal implementations
  • Require little mathematical knowledge
  • Great introductory topics in PL courses
  • Scale up well
  • C (>1000 rules), Java, Scheme, Verilog, ... defined
  • Executable, so testable
  • C semantics tested against real benchmarks

Operational semantics are easy to define and understand. Giving an operational semantics to a language is like implementing a formal interpreter. That requires little mathematical knowledge, which makes operational semantics great introductory topics in PL courses. Operational semantics scale well. A number of real-life languages have been defined in this style, including C (with a definition consisting of more than 1000 semantic rules), Java, Scheme, Verilog, and so on. A major point in favor of operational semantics is that they are executable, so they can be tested against benchmarks of programs to ensure their accuracy. For example, the C semantics was tested on the GCC torture tests. For these reasons, operational semantics serve as trusted models of programming languages.

Operational Semantics

  • Sample rule (may require a configuration context)

  • Define languages only with rules of the form l ⇒ r if b
  • l, r are configuration terms
  • b is a Boolean side condition

To give an example, the rule on the slide (show on the slide) captures the semantics of while via loop unrolling. The rule is given in reduction semantics style and applies in a configuration context. Quite intuitive, right? In general, we can give an operational definition to any programming language using only rules of the form "left reduces to right if the side condition holds", where left and right are configuration terms.
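A minimal sketch of this style in Python, under assumptions that are not part of the talk: a toy language with only assignment, sequencing, if and while; configurations modelled as pairs (code, state); expressions modelled as Python functions of the state; and hypothetical names `step` and `run`. Each branch of `step` plays the role of one rule "left reduces to right if the side condition holds", and the while branch is the loop-unrolling rule.

```python
# Toy reduction semantics: every rule has the shape  l => r  if b.
# Configurations are pairs (code, state). All names here are illustrative.

SKIP = ("skip",)

def step(code, state):
    """Apply one semantic rule; return the next configuration, or None if no rule applies."""
    kind = code[0]
    if kind == "assign":                      # x = e  =>  skip   (state updated)
        _, var, expr = code
        return SKIP, {**state, var: expr(state)}
    if kind == "seq":
        _, first, rest = code
        if first == SKIP:                     # skip; s  =>  s
            return rest, state
        nxt = step(first, state)              # rule applied inside a context: the head of the sequence
        if nxt is None:
            return None
        first2, state2 = nxt
        return ("seq", first2, rest), state2
    if kind == "if":                          # two rules, told apart by the side condition
        _, cond, then_branch, else_branch = code
        return (then_branch if cond(state) != 0 else else_branch), state
    if kind == "while":                       # while(e) s  =>  if(e) { s; while(e) s } else skip
        _, cond, body = code
        return ("if", cond, ("seq", body, code), SKIP), state
    return None                               # plain skip: execution is finished

def run(code, state, fuel=10_000):
    """Iterate step until no rule applies (fuel guards against divergence)."""
    while fuel > 0:
        nxt = step(code, state)
        if nxt is None:
            return code, state
        code, state = nxt
        fuel -= 1
    raise RuntimeError("out of fuel")

# s = 0; while (n) { s = s + n; n = n - 1 }
SUM = ("seq", ("assign", "s", lambda st: 0),
       ("while", lambda st: st["n"],
        ("seq", ("assign", "s", lambda st: st["s"] + st["n"]),
                ("assign", "n", lambda st: st["n"] - 1))))

final_code, final_state = run(SUM, {"n": 4})
```

Because the definition is executable, it can be tested like any interpreter, which is the point made above about operational semantics.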

Unfortunately ...

  • Operational semantics considered inappropriate for program verification; proofs are low-level and tedious:
  • Formalization of and working with transition system
  • Typically by induction
  • on the structure of the program
  • on the number of execution steps
  • etc.

However, operational semantics are in general considered inappropriate for program verification, mainly because proofs based on operational semantics are low-level and tedious, as one has to formalize the transition system induced by the semantics and then work in it, typically by explicit induction on the structure of the program or on the number of execution steps.

Axiomatic Semantics (Hoare Logic)

  • Emphasis on program verification
  • Programming language captured as a formal proof system deriving Hoare triples {precondition} code {postcondition}

Axiomatic semantics address this limitation by emphasizing program verification instead. They capture a programming language as a formal proof system deriving Hoare triples of the form precondition, code, postcondition.

Axiomatic Semantics

  • Not easy to define and understand, error-prone
  • Not executable, hard to test
  • Require program transformations, behavior loss
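The Hoare logic rule for while that accompanies this slide is not preserved in the transcript. As a sketch only, a textbook form of that rule for a C-like language is given below, assuming that non-zero means true, that 0 means false, and that the condition e has no side effects. The symbols ψ (the invariant), e and s are assumed names.

```latex
\[
\frac{\{\psi \wedge e \neq 0\}\ s\ \{\psi\}}
     {\{\psi\}\ \mathtt{while}(e)\ s\ \{\psi \wedge e = 0\}}
\]
```

Replacing the premise condition e ≠ 0 with e = 1 gives a rule that still looks plausible but no longer agrees with a language in which every non-zero value counts as true.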

Write e = 1 and you've got a wrong semantics!

However, axiomatic semantics are not as easy to define and understand as operational semantics, which makes them more error-prone. Moreover, they are not executable, so they are harder to test. For example, the Hoare logic rule for while (show on the slide) takes into account certain language-specific features, like the fact that non-zero means true and 0 means false. It also makes certain assumptions, like the fact that e must not have side effects. Thus, axiomatic semantics may require program transformations, which could lose behaviors.

State-of-the-art in Certifiable Verification

  • Define an operational semantics: trusted language model
  • Define an axiomatic semantics: for verification purposes
  • Prove axiomatic semantics sound for operational semantics
  • Now we have trusted verification BUT
  • Requires two semantics of the same language
  • C operational semantics took more than 2 years!
  • Must be done individually for each language

So, to achieve trusted verification, we define an operational semantics, which acts as a trusted model of the language, then we define an axiomatic semantics for verification, then we prove the axiomatic semantics sound for the operational semantics, and voila, we have certifiable verification. BUT, this requires two semantics of the same language, which is not convenient considering that, for example, defining C operationally took more than 2 years. Also, as languages evolve, this requires changes in both semantics and in the soundness proof. And all of this must be done individually for each language. For these reasons, many existing verification tools are not semantically sound.

Overview

  • State-of-the-art in Certifiable Verification
  • Our Approach
  • Specifying Reachability Properties
  • Reasoning about Reachability

We want to change that!

Our Approach

  • Underlying belief: one semantics for each language!
  • Executable (testable), easy to define and understand
  • Suitable for program verification, as is
  • Approach: language-independent proof system
  • Takes operational semantics unchanged
  • Derives program properties
  • Both operational semantics rules and program specifications stated as reachability rules

Our approach is based on the underlying belief that a language needs only one semantics! (pause) This semantics should be executable (so testable) and easy to define and understand; essentially, it should have all the nice properties of operational semantics. And it should be suitable for program verification as it is, without any axiomatic semantics or explicit induction. So, our approach is to devise a language-independent proof system which takes operational semantics unchanged and derives program properties directly from them. To achieve this, both operational semantics rules and program specifications are stated as reachability rules.

Reachability Rules

  • Pairs of configuration predicates

  • Reachability: Any concrete configuration satisfying φ and terminating reaches a configuration satisfying φ′, in the transition system induced by the operational semantics S.

Reachability rules are pairs of configuration predicates, capturing the dynamic properties of configurations. By configuration we refer to both code and state. Formally, the meaning of a rule φ ⇒ φ′ is that any concrete configuration satisfying φ and terminating eventually reaches a configuration satisfying φ′. The execution takes place in the transition system induced by the operational semantics S.
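To make this definition concrete, here is a small Python sketch that checks a rule against an explicitly given finite transition system. It rests on assumptions beyond the talk: configurations are hashable Python values, predicates are Python functions, "terminating" is read as "no infinite execution starts from the configuration", and "reaches" is read as "some execution reaches". The names `terminates`, `reachable` and `holds` are hypothetical.

```python
def terminates(ts, start):
    """True if no infinite path starts at `start` (in a finite graph: no reachable cycle)."""
    on_path, safe = set(), set()
    def visit(node):
        if node in safe:
            return True
        if node in on_path:               # back edge: a cycle, hence an infinite execution
            return False
        on_path.add(node)
        ok = all(visit(n) for n in ts.get(node, ()))
        on_path.discard(node)
        if ok:
            safe.add(node)
        return ok
    return visit(start)

def reachable(ts, start):
    """All configurations reachable from `start` in zero or more steps."""
    seen, todo = {start}, [start]
    while todo:
        node = todo.pop()
        for n in ts.get(node, ()):
            if n not in seen:
                seen.add(n)
                todo.append(n)
    return seen

def holds(ts, configs, phi, phi_prime):
    """Every terminating configuration satisfying phi reaches one satisfying phi_prime."""
    return all(
        any(phi_prime(g2) for g2 in reachable(ts, g))
        for g in configs
        if phi(g) and terminates(ts, g)
    )

# Toy system: configuration (n, s) steps to (n - 1, s + n) while n > 0; "spin" loops forever.
ts = {(n, s): [(n - 1, s + n)] for n in range(1, 6) for s in range(16)}
ts["spin"] = ["spin"]
configs = list(ts) + [(0, s) for s in range(31)]

# The rule "n = N and s = 0  reaches  n = 0 and s = N(N+1)/2", checked for each N separately.
sum_rule_ok = all(
    holds(ts, configs, lambda g, N=N: g == (N, 0), lambda g, N=N: g == (0, N * (N + 1) // 2))
    for N in range(6)
)
```

Note the partial-correctness reading: a rule whose left-hand side is satisfied only by the diverging configuration "spin" holds vacuously, whatever its right-hand side says.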

Overview

  • State-of-the-art in Certifiable Verification
  • Our Approach
  • Specifying Reachability Properties
  • Reasoning about Reachability

Let's see how to specify program reachability properties.

Reachability Rules - Operational + Axiomatic -

  • Operational flavor
  • Axiomatic flavor

We care about reachability rules because they have both operational and axiomatic elements. This rule reducing *x to its value V (show on the slide) has an operational flavor. The evaluation takes place in a particular configuration containing an environment mapping x to location L and a heap with value V at location L. The rule mentions only what it needs; the ellipses refer to the rest of the environment, the heap, or the configuration, which is not changed.

The next rule specifies the behavior of a code fragment computing the sum of the first n natural numbers and has an axiomatic flavor. The configuration with the code SUM and variable n greater than 0 in the environment reaches a configuration where the code SUM has been executed and variable s in the environment holds the value of the sum. In the second rule, the heap is not needed, so it is not mentioned, and it stays unchanged. This is how we achieve framing.

Hoare Triple = Syntactic Sugar

This is a code fragment from a program reversing a singly-linked list, verified by MatchC (which we will discuss later). The invariant states that p points to the part of the list already reversed, and x points to the part of the list yet to be reversed. This Hoare-style invariant is just syntactic sugar for a reachability rule. The LHS combines the code of the loop (shown in red) and the invariant, while in the RHS the code has been executed, and the condition of the loop has been evaluated with the semantics and its negation added as a constraint (shown in blue).

Matching Logic

  • State static properties of program configurations
  • Parametric in a model of configurations
  • Extends first-order logic with patterns
  • Special predicates which are configuration terms
  • Configurations satisfy patterns iff they match them
  • C Configurations

  • Extra 70 cells

Matching logic is designed to state and reason about static properties of the program configuration, so it is parametric in a model of program configurations. It extends first-order logic with special predicates, called patterns, which are open configuration terms. Concrete configurations satisfy patterns if they match them. The C configuration contains, among other things, code (also called k), heap, input and output buffers, and some 70 extra cells.

Model of Configurations - Properties -

  • Configuration abstraction (list)
  • Separation achieved at term level
  • Operations (reverse)

As mentioned before, matching logic is parametric in a model of configurations, which are characterized by certain properties. The linked list abstraction we have just seen has the following property: either the address p is NULL and the list is empty, or the address p points to some list entry whose next field points to the rest of the list. The comma in the heap means that the entry and the rest of the list are disjoint, because they match different subterms. We also have properties of the operations defined on the mathematical domains, like this one for reverse.
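The list abstraction can be sketched in Python as a matcher over a concrete heap. This is an illustration under assumptions, not MatchC's implementation: the heap is a dict from addresses to (value, next) pairs, NULL is `None`, and the names `match_list`, `invariant` and `reverse` are hypothetical. The invariant used here (reversing the part behind p and appending the part at x gives back the original sequence) is an assumed formalization of the loop invariant described earlier in the talk.

```python
def match_list(heap, p):
    """Match the pattern list(p, alpha): return (alpha, cells used), or None if p is not a list."""
    alpha, cells = [], set()
    while p is not None:                      # case 1: p is NULL and the list is empty
        if p not in heap or p in cells:       # dangling pointer, or a cell matched twice
            return None
        value, nxt = heap[p]                  # case 2: an entry whose next field leads to the rest
        alpha.append(value)
        cells.add(p)
        p = nxt
    return alpha, cells

def invariant(heap, p, x, original):
    """list(p, beta), list(x, gamma) on disjoint cells, with rev(beta) followed by gamma == original."""
    mp, mx = match_list(heap, p), match_list(heap, x)
    if mp is None or mx is None:
        return False
    (beta, cells_p), (gamma, cells_x) = mp, mx
    return cells_p.isdisjoint(cells_x) and list(reversed(beta)) + gamma == original

def reverse(heap, x):
    """In-place list reverse that checks the invariant before every iteration and at exit."""
    original, _ = match_list(heap, x)
    p = None
    while x is not None:
        assert invariant(heap, p, x, original)
        value, nxt = heap[x]
        heap[x] = (value, p)                  # x->next = p
        p, x = x, nxt
    assert invariant(heap, p, x, original)
    return p

heap = {1: (10, 2), 2: (20, 3), 3: (30, None)}
head = reverse(heap, 1)
```

The disjointness check plays the role of the comma in the heap pattern: the two lists must occupy different cells, which is what matching different subterms guarantees.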

Separation Logic = Matching Logic Instance

  • Separation logic: popular logic for heap properties
  • Mechanical translation to matching logic (see paper)
  • Configuration:
  • Separation encoded using different sub-terms
  • No expressiveness loss from using matching logic
  • Matching logic gives structural separation anywhere in the configuration, not only in the heap

Separation logic is a popular choice for specifying heap properties. As it turns out, it can be mechanically translated into a matching logic instance: pick a configuration with only one component, namely the heap; then the separation constraint is encoded using different sub-terms. So by using matching logic for stating properties, we do not lose expressiveness. In general, matching logic gives structural separation anywhere in the configuration, not just in the heap.

Operational and Axiomatic Semantics Rules as Reachability Rules

  • Reachability rules generalize
  • Operational semantics rules
  • Hoare triples
  • Operational semantics rule is syntactic sugar for reachability rule
  • Hoare triple encoded in a reachability rule with the empty code in the right-hand-side (see FM12)
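The two encodings listed on this slide can be written schematically as follows. This is a sketch in assumed notation: the angle brackets for configurations, σ and σ′ for the state before and after, and "·" for the empty code are illustrative choices, and the precise encoding of Hoare triples is the one in the FM12 paper cited on the slide.

```latex
% Operational semantics rule: the side condition becomes part of the left-hand pattern.
\[
l \Rightarrow r \ \ \text{if}\ \ b
\qquad\text{is sugar for}\qquad
l \wedge b \;\Rightarrow\; r
\]

% Hoare triple: the right-hand side holds the empty code (schematic form).
\[
\{\psi\}\ \mathtt{code}\ \{\psi'\}
\qquad\text{is encoded as}\qquad
\langle \mathtt{code},\ \sigma \rangle \wedge \psi
\;\Rightarrow\;
\langle \cdot,\ \sigma' \rangle \wedge \psi'
\]
```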

Reachability rules capture both operational semantics rules and Hoare triples. The operational semantics rule "left reduces to right if condition" is syntactic sugar for the reachability rule "left and condition reaches right". A Hoare triple can be encoded in a reachability rule with the empty code in the RHS.

Overview

  • State-of-the-art in Certifiable Verification
  • Our Approach
  • Specifying Reachability Properties
  • Reasoning about Reachability

Now let's see how to reason about reachability.

Reasoning about Reachability

  • The main result of our paper is a proof system deriving reachability rules from reachability rules:
  • Trusted reachability rules (starts with operational semantics)
  • Target reachability rule
  • Claimed reachability rules

The main result of the paper is a language-independent proof system which derives reachability rules specifying program properties from trusted reachability rules. In the beginning, the trusted rules are just the operational semantics rules. During the proof, one can claim additional reachability rules, which cannot be used right away. They can be used only after taking at least one step with the trusted rules in A; that is, the rules in C are added to those in A only after at least one such step.

Reachability Proof System - 8 Rules -

  • Symbolic execution (multiple steps)
  • Symbolic execution (one step)
  • Code with circular behavior

The language-independent proof system for matching logic reachability consists of 8 rules. The proof rules of Axiom and Logic Framing allow for one step of symbolic execution (for instance, a basic statement). The rules of Axiom, Transitivity, Reflexivity, Logic Framing, Consequence, Case Analysis, and Abstraction allow for symbolic execution of code fragments without circular behavior (without loops, mutually recursive functions, etc.). The rule of Circularity deals with such repetitive behaviors. Once claims have been made, Reflexivity cannot be used until one step is performed, which means that a non-empty set of claims ensures at least one step is taken with the trusted rules.

Circular behaviors

  • Circularity and Transitivity proof rules (language-independent)
  • Hoare logic rule for while loops (language-specific)
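The formulas for the Circularity and Transitivity rules did not survive in this transcript. One way to write rules with the behavior described in the notes is sketched below, in assumed notation: A is the set of trusted rules, C is the set of claimed rules (written as a subscript on the turnstile), and φ with indices stands for configuration predicates. This is a reconstruction from the spoken description, not a copy of the slide.

```latex
% Circularity: the rule being proved may be added to the claimed rules C.
\[
\text{Circularity:}\quad
\frac{\mathcal{A} \vdash_{\mathcal{C} \cup \{\varphi \Rightarrow \varphi'\}} \varphi \Rightarrow \varphi'}
     {\mathcal{A} \vdash_{\mathcal{C}} \varphi \Rightarrow \varphi'}
\]

% Transitivity: after the first part of the derivation, the claims in C become usable as trusted rules.
\[
\text{Transitivity:}\quad
\frac{\mathcal{A} \vdash_{\mathcal{C}} \varphi_1 \Rightarrow \varphi_2
      \qquad
      \mathcal{A} \cup \mathcal{C} \vdash_{\emptyset} \varphi_2 \Rightarrow \varphi_3}
     {\mathcal{A} \vdash_{\mathcal{C}} \varphi_1 \Rightarrow \varphi_3}
\]
```

In this reading, restricting Reflexivity to an empty set of claims is what forces the first premise of Transitivity to take at least one real step before any claim is used.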

So how do we reason about circular behaviors? Circularity allows us to add the current rule to the set C of claimed circularities. Then Transitivity allows us to use the circularities in C in their own proofs, but only after deriving at least one step with the trusted axioms in A. For instance, to achieve an effect similar to the Hoare logic invariant rule, we apply Circularity for the specification of the while loop, we execute one iteration of the loop with the operational semantics, which counts as trusted steps for Transitivity, and then we use the specification for the rest of the iterations.

Soundness

  • Theorem: If S ⊢ φ ⇒ φ′ is derivable by the proof system, then S ⊨ φ ⇒ φ′ is semantically valid.

Our proof system is sound: any derived specification is correct with respect to the transition system induced by the operational semantics S. This result is language-independent, so we prove it only once and then use it for all languages.

Relative Completeness

  • Relativity: validity oracle for static configuration properties
  • Language-independent result, unlike Hoare logics
  • Theorem: If S ⊨ φ ⇒ φ′ is semantically valid, then S ⊢ φ ⇒ φ′ is derivable by the proof system, with S the operational semantics of a language.

Our proof system is also relatively complete: any correct specification is derivable from the semantics. The relativity refers to the fact that we assume an oracle capable of establishing the validity of static configuration properties. This assumption is made by the completeness results of Hoare logics as well. Effectively, the result says that our proof system for dynamic properties is complete. This result is language-independent, so we prove it once and for all, unlike the similar results for Hoare logics, which are proven individually for each language.

MatchC

  • Proof-of-concept verifier for a C fragment
  • Derives program specifications from the operational semantics (in K framework) using the proof system
  • No Hoare/separation logic, no WP, no VC generation
  • Automated, user only provides specifications for recursive functions and loops

MatchC is a proof-of-concept verifier for a C fragment. It derives program specifications, given as reachability rules, from the operational semantics written in the K framework, using the proof system. There is no Hoare or separation logic involved, no weakest precondition, and no verification condition generation. The verifier is automated, requiring only specifications for while loops and recursive functions.

MatchC Snapshot

  • List reverse: code + invariant

This is a snapshot of MatchC verifying the list reverse example mentioned before.

Implementation

  • Heuristics for applying the proof system
  • (forward) symbolic execution
  • Matching logic reasoning
  • Maude: efficient structure matching and rearranging
  • matching a list in the heap, ...
  • SMTs (CVC3, Z3): simplifying constraints
  • small queries (milliseconds each)

MatchC symbolically executes the LHS of rules searching for the RHS, applying the proof system according to simple heuristics. For the static configuration reasoning required along the execution path, we use Maude for efficient structural matching and rearranging (for instance, matching a list in the heap), and we use SMT solvers for simplifying constraints. The size of each query is small and does not depend on the length of the execution path, only on the complexity of the specifications. Each query takes milliseconds, enabling MatchC to scale well with the code size.

Preliminary Evaluation

Program                  Time (s)
Buffered read-write      0.15
Stack inspection         0.24
Insertion sort           0.41
Merge sort               0.47
Quicksort                1.97
AVL find                 0.15
AVL insert               43.5
AVL delete               133.58
Schorr-Waite (tree)      0.28
Schorr-Waite (graph)     1.73

  • Dozens more programs at matching-logic.org
  • Only annotated main functions (insert/delete).
  • Inlined auxiliary functions (balance, rotate, ...).

We have evaluated MatchC on a list of programs that use data structures and I/O and implement call policies, sorting algorithms, search trees, and the Schorr-Waite graph marking algorithm. This is a selection of interesting programs verified by MatchC for functional correctness. There are dozens more available on the matching logic webpage. Most times are low (a few seconds or less), so there is no performance loss from basing the verification on the operational semantics instead of a dedicated axiomatic semantics. The AVL insert and delete examples take longer because we have only annotated the main functions (find, insert, delete), while the others (rotate, balance, and so on) are inlined, leading to a path explosion. This is an extreme case, and one can always annotate all functions (which is done by default in Hoare logic).

Conclusions

  • Matching logic reachability proof system
  • Sound and (relatively) complete
  • Practical
  • MatchC, an automated verifier
  • Expressive
  • Efficient
  • Operational semantics based verification is viable!
  • matching-logic.org

To conclude, we have presented the matching logic reachability proof system, which is sound and relatively complete. We have established that it is practical by implementing MatchC, an expressive and efficient automated verifier. And most importantly, we provided evidence that verification based on operational semantics is viable.