
IEEE Intelligent Systems, November/December 2006. 1541-1672/06/$20.00 © 2006 IEEE. Published by the IEEE Computer Society.

Trends & Controversies

Intelligent Systems and Formal Methods in Software Engineering
Bernhard Beckert

Software is vital for modern society. It manages our finances, regulates communication and power generation, controls airplanes, and processes security-critical information. Consequently, the efficient development of dependable software is of ever-growing importance. Over the last few years, technologies for the formal description, construction, analysis, and validation of software—based mostly on logics and formal reasoning—have matured. We can expect them to complement and partly replace traditional software engineering methods in the future.

Formal methods in software engineering are an increasingly important application area for intelligent systems. The field has outgrown the area of academic case studies, and industry is showing serious interest. This installment of Trends & Controversies looks at the state of the art in formal methods and discusses the developments that make successful applications possible. Tony Hoare, a pioneer in the field, convincingly argues that we've reached the point where we can solve the problem of how to formally verify industrial-scale software. He has proposed program verification as a computer science Grand Challenge. Deductive software verification is a core technology of formal methods. Reiner Hähnle describes recent dramatic changes in the way it's perceived and used. Another important base technique of formal methods, besides software verification, is synthesizing software that's correct by construction because it's formally derived from its specification. Douglas R. Smith and Cordell Green discuss recent developments and trends in this area. Surprisingly efficient decision procedures for the satisfiability modulo theories problem have recently emerged. Silvio Ranise and Cesare Tinelli explain these techniques and why they're important for all formal-methods tools. Thomas Ball and Sriram Rajamani look at formal methods from an industry perspective. They explain the success of Microsoft Research's SLAM project, which has developed a verification tool for device drivers.

–Bernhard Beckert

Verified Software: Proposal for a Grand Challenge Project
Tony Hoare, Microsoft Research

Verified software consists of programs that have been proved free of certain rigorously specified kinds of error. Confidence in the correctness of the proof is very high because it has been generated and checked automatically by computer. Our understanding of the concept of a proof goes back to the Greek philosophers Pythagoras and Aristotle.

The German philosopher Leibniz proposed mechanical reasoning for mathematical theorems. James King explored the idea of a mechanical program verifier in his 1969 doctoral thesis.1 Since then, the ideal of verified software has inspired a generation of research into program correctness, programming language semantics, and mechanical theorem proving. The results have moved into industrial applications to detect errors in software simulations of hardware devices.

A new Grand Challenge?

A Grand Challenge is a scientific project that an international team of scientists coordinates and carries out over 10 years or more. A recent example, the Human Genome Project (1991–2004), was inspired by an altruistic scientific ideal to uncover the six billion bits of the human genome—effectively, the blueprint from which nature manufactures each of us. It was expected that on successful completion of the project (15 years ahead), the pharmaceutical industry could exploit its results and experimental techniques and tools to benefit human health. This is now beginning to happen.

Many scientists took the turn of the millennium as an opportunity to ask themselves and their colleagues whether the time was ripe for a Grand Challenge project of similar scale and ambition in their own branch of science. I propose that computer scientists reactivate the program verifier challenge and solve it by concerted collaborative and competitive efforts within the foreseeable future. Leading computer scientists have discussed this proposal at conferences in Europe, North America, and Asia over the last two years.

Vision

The scientists planning this project envision a world in which computer programmers seldom make mistakes and never deliver them to their customers. At present, programming is the only engineering profession that spends half its time and effort on detecting and correcting mistakes made in the other half of its time.

We envision a world in which computer programs are always the most reliable component of any system or device that contains them. At present, devices must often be rebooted to clear a software error that has stopped them from working.


Cost-benefit

Current estimates of the cost of software error to the world's economies are in the region of US$100 billion a year and increasing. Software producers and users share this cost. The total cost of our proposed research over a period of 10 years is $1 billion, shared among all participating nations.

If the research is successful, the cost of deploying its results will be many times larger than the research cost. Those who profit from the deployment will meet these costs.

Security

Verification methods already exist that detect some errors that make software vulnerable to virus attack. However, the technology is also available to those who plan such attacks. We must develop the technology not only to detect but also to eliminate such errors.

The risks

The main scientific risk to the project is that methods of computer generation and checking of proofs continue to require skilled human intervention. The tools will therefore remain as scientific tools, suitable to advance research but inadequate for engineers' needs.

Another risk is that the computing research community has little experience with long-term research, requiring collaboration among specialized research areas. The project's success will require adjustment of the current research culture in computer science and alignment of research-funding mechanisms to new scientific opportunities, methods, and ideals.

The main impediment to deploying the research results will be educating and training programmers to exploit the tools. The primary motivator for deployment will be commercial competition among software suppliers.

The method

The project will exploit normal scientific research methods. Advances in knowledge and understanding will be accumulated in comprehensive mathematical theories of correct programming. These will be tested by extensive experimentation on real working programs from a broad range of applications. The research results will be incorporated in sets of tools that will enable the practicing programmer to exploit them reliably and conveniently.

Scientific ideals

As with the Human Genome Project, the driving force of the project is scientific curiosity to answer basic questions underlying the whole discipline. For software engineering, the basic questions are: What do programs do? How do they work? Why do they work? What evidence exists for the correctness of the answers? And finally, how can we exploit the answers to improve the quality of delivered software?

The ideal of program correctness is like other scientific ideals—for example, accuracy of measurement in physics, purity of materials in chemistry, rigor of proofs in mathematics. Although these absolute ideals will never in practice be achieved, scientists will pursue them far beyond any immediate market need for their realization. Experience shows that eventually some ingenious inventor or entrepreneur will find an opportunity to exploit such scientific advances for commercial benefit.

Commitment to such ideals contributes to the cohesion and integrity of scientific research. It's essential to the organization of a large-scale, lengthy project such as a Grand Challenge. A similar commitment by the engineer contributes to the trust and esteem that society accords to the profession.

Industrial participation

The programming tools developed in this project must be tested against a broad range of programs representative of those in use today. We propose that our industrial partners contribute realistic challenge programs of various sizes and degrees of difficulty. These must be programs with no competitive value so that they can be released for scientific scrutiny. Industry might also contribute valuable prizes to the teams making the greatest progress on these and other challenge problems.

The tools will be continuously improved in the light of experience of their use. Throughout the project, students will be trained to apply the tools to commercially relevant programs and will obtain employment that exploits their skills on competitive proprietary products.

The scientists engaged in the project will give any required support and advice for the development of commercial toolsets that will spread the project's results throughout the programming profession.

The state of the art

Advanced engineering industries—including transport, aerospace, electronics hardware, communications, defense, and national security—already exploit verification technology in small, safety-critical cases. Specialized tools in these areas are available from numerous start-up companies, together with consultation and collaboration in their use. Other tools are available for rapid detection of as many errors as possible.

In scientific research laboratories, guarantees based on complete verification have been given for small microprocessors, operating systems, programming language compilers, communication protocols, large mathematical proofs, and even the essential kernels of the proof tools themselves.

In the last 10 years, some of the software carrying out the basic task of constructing proofs has improved in performance by a factor of 1,000. This is in addition to the factor of 100 achieved by improvements in the hardware's speed, capacity, and affordability. Both the software and hardware continue to improve, giving a fair chance of substantial program verification over the next 10 years.

The start of the project

Many research centers throughout the world are already engaged in the early stages of this project. The next step is to persuade the national funding agencies of the countries most capable of contributing that this Grand Challenge potentially offers a good return on investment.

Reference

1. J.C. King, "A Program Verifier," PhD thesis, Carnegie Inst. of Technology, 1969.


Deductive Software Verification
Reiner Hähnle, University of Koblenz

Deductive software verification is a formal technique for reasoning about properties of programs that has been around for nearly 40 years (see the previous essay for more details about verification's history). However, numerous developments during the last decade have dramatically changed how we perceive and use deductive verification:

• The era of verification of individual algorithms written in academic languages is over. Contemporary verification tools support commercial programming languages such as Java or C#, and they're ready to deal with industrial applications.

• Deductive verification tools used to be stand-alone applications that could be used effectively only after years of academic training. Now we see a new generation of tools that require only minimal training to use and are integrated into modern development environments.

• Perhaps the most striking trend is that deductive verification is emerging as a base technology. It's employed not only in formal software verification but also in automatic test generation, in bug finding, and within the proof-carrying code framework—with more applications on the horizon.

What is deductive software verification?

Three ingredients characterize deductive software verification: First, it represents target programs as well as the properties to be verified as logical formulas that must be proven to be valid. Second, it proves validity by deduction in a logic calculus. Third, it uses computer assistance for proof search and bookkeeping.

In contrast to static analysis and model checking, you can model the semantics of the target programming language precisely—that is, without abstracting from unbounded data structures (integers, lists, trees, and so on) or unbounded programming constructs (such as loops or recursive method calls). The logics used for deductive verification are at least as expressive as first-order logic with induction. So, you can formalize and prove far-reaching properties of target programs.

This precision has a price, of course: even more so than in model checking, predicting the amount of computing resources necessary to achieve a verification task is difficult. In general, computability theory implies theoretical limitations that exclude a verification system that can prove all program properties over infinite structures.

In light of these results, in the past, designers of theorem provers often traded off automation for precision by relying on human interaction. A recent trend is to insist on automation but to accept occasionally approximate results. In contrast to static analysis and model checking, which use abstraction, in deductive verification you give up on completeness by shortcutting proof search. In this case, you can't obtain functional correctness proofs, but automatic generation of unit tests and warnings about potential bugs are still useful.

Embedding versus encoding

Two approaches to logical modeling of programming languages and their semantics exist. In the first, target programs appear as a separate syntactical category embedded in logical expressions. The best-known formalism of this kind is Hoare logic. More recent ones include Dijkstra's weakest precondition calculus and dynamic logic. Logical rules that characterize the derivability of those formulas that contain programs reflect the target programming language's operational semantics. Rule application is syntax driven, and at least one rule exists for each target-language construct. You can view proofs in such calculi as symbolic execution of the program under verification. You can compute either a program's strongest postcondition with respect to a given start state or, vice versa, the weakest precondition with respect to a given final state. In either case, symbolic execution results in a set of program-free formulas, the verification conditions. A first-order theorem prover can then automatically compare the strongest postcondition (weakest precondition) with the final state (start state) in the specification.
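To make the weakest-precondition view concrete, here is a small worked computation, added for illustration (it isn't from the essay); it assumes Dijkstra's standard rules wp(x := e, Q) = Q[e/x] and wp(S1; S2, Q) = wp(S1, wp(S2, Q)).

    % Weakest precondition of a two-statement program for the postcondition y > x:
    wp(\texttt{x := x + 1;\ y := 2*x},\; y > x)
      = wp(\texttt{x := x + 1},\; wp(\texttt{y := 2*x},\; y > x))
      = wp(\texttt{x := x + 1},\; 2x > x)
      = 2(x + 1) > x + 1
      \;\equiv\; x + 1 > 0 .
    % The final formula is program-free: it is the verification condition that a
    % first-order prover compares against the specified precondition.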

Representative state-of-the-art systems (see table 1) employing the embedding approach include Spec# and KeY. Spec#, although its authors refer to it as a "static program verifier," is an advanced verification condition generator with a theorem prover back end for C# programs annotated with specifications written in the Spec# language. KeY supports the full Java Card 2.2 standard (http://java.sun.com/products/javacard). It's not merely a verification condition generator; it interleaves symbolic execution and automated theorem proving to increase efficiency.

The second modeling approach encodes both the target language syntax and semantics as theories, often in higher-order logic. This involves formalizing many auxiliary data structures such as sets, functions, lists, tuples, and records. To verify a target program—that is, to prove a theorem about it—you need a large library of lemmas for the auxiliary theories used in the formalization. Programs are typically encoded as mutually recursive function definitions, which can also be seen as an abstract functional programming language. Needless to say, fully formalizing an imperative programming language such as Java in this way is a substantial undertaking.

Arguably the most powerful verification system along these lines is ACL2, which its authors have been improving since they created it more than 30 years ago. Another well-known system is HOL, a general interactive theorem-proving system based on typed higher-order logic. The generic theorem prover Isabelle is increasingly popular—it even lets you define the logic it uses as a basis of formalization. HOL and Isabelle weren't designed to be software verification tools; they're logical and reasoning frameworks that can be just as well applied (and in fact have been applied) to pure mathematics.

A changing perspective

Both approaches to logic modeling of programs—embedding and encoding—have specific strengths and weaknesses. These lead to distinct usage scenarios. The embedding approach is giving rise to a new generation of program analysis tools.

Despite ACL2's impressive performance and detailed documentation,1 using it effectively requires a long apprenticeship: the formalization style is tightly interwoven with the control of heuristics that are essential for automation. Consequently, detailed tool knowledge is required for effective use, limiting the number of potential users.

General-purpose verification tools based on encoding programs into logical frameworks require you to master their theoretical underpinnings before you can use them. Large, imperative programming languages don't match these tools' functional flavor well, and formalizing the target language semantics is a considerable undertaking. In conjunction with their generality, this makes them predestined for theoretical investigations (for example, precise semantical modeling of programming languages) rather than routine verification tasks. On the other hand, these tools' expressivity lets you formally verify the correctness of compilers or of the calculi used in the systems based on embedding that have a simpler theory. This latter capability of cross-validating the target language's formal semantics in different verification tools is essential to ensure trust in the verification process. In the light of certification, this is increasingly important.

Research groups within the theorem-proving community rooted in logic and proof theory (where formal foundations and theoretical completeness are major virtues) often developed tools based on encoding of target programs. The recent generation of tools based on embedding, however, is heavily influenced by the programming languages and software engineering communities' needs:

• User interfaces shouldn’t be more com-plex than those of debuggers or compilers.

• Tools must be integrated into develop-ment environments and processes.

• Bug finding and testing are as importantas verification.

On the other hand, approximation viaincompleteness is quite acceptable.

For verification tools to be usable, they must cover to a considerable extent modern industrial programming languages such as Java and C# as verification targets. Equally important is the possibility of writing specifications in languages close to those developers use. For example, KeY supports OCL (the Object Constraint Language, which is part of the Unified Modeling Language; see www.uml.org) and JML (the Java Modeling Language; see www.cs.iastate.edu/~leavens/JML). The latter is compatible with Java expression syntax and has become the de facto standard for verification tools targeting Java. JML is also supported by JACK (Java Applet Correctness Kit) and JIVE (Java Interactive software Visualization Environment), among other tools. Spec#, named identically to the tool set, is a JML analog for C#. In all cases, high-level specifications are compiled automatically to logic-based representations, so that users don't need to write such low-level specifications.
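To give a flavor of such developer-facing specifications, the following JML-annotated Java method is a small illustrative sketch added here; the class and its contract are hypothetical and not taken from the essay.

    public class Account {
        private /*@ spec_public @*/ int balance;

        //@ invariant balance >= 0;

        /*@ requires amount > 0 && amount <= balance;
          @ ensures balance == \old(balance) - amount;
          @ ensures \result == balance;
          @*/
        public int withdraw(int amount) {
            balance -= amount;   // must preserve the class invariant
            return balance;
        }
    }

A JML-aware tool reads the requires and ensures clauses and the invariant, generates the corresponding proof obligations, and either discharges them automatically or reports a potential violation.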

Trends and challenges

One major trend is that verification tools attempt to combine formal verification, automatic test generation, and automatic bug finding into a uniform framework that, in turn, is integrated into standard software development platforms such as Eclipse (www.eclipse.org). The driving force behind this trend is the insight that mere formal verification is too limited in many cases: it depends on the existence of relatively extensive formal specifications and, as I explained earlier, often isn't fully automatic. In addition, formal verification target languages are at the source-code or, at most, bytecode level. But there must also be ways to validate binaries deployed on, for example, smart cards. We should also remember that industrial users often fail to see the point of proving correctness, but they always see the need for debugging and testing.

A uniform view of verification, test generation, and bug finding is well justified: the Extended Static Checking project (http://secure.ucd.ie/products/opensource/ESCJava2) pioneered verification technology for bug finding. A theoretical basis also exists for recasting the search for bugs as verification attempts of invalid claims.2 Likewise, automated white- and black-box test generation can be embedded into a verification framework.3


Table 1. A selection of verification tools (name and creator; availability; modeling approach; target language; remarks).

• ACL2 (University of Texas); public; encoding; ACL2.
• HOL, Higher-order logic (University of Cambridge); public; encoding; various target languages; programming environment.
• Isabelle (University of Cambridge, Technical University Munich); public; encoding; various target languages.
• JACK, Java Applet Correctness Kit (INRIA); registration needed; embedding; Java.
• JIVE, Java Interactive software Visualization Environment (Swiss Federal Institute of Technology (ETH) Zürich, University of Kaiserslautern); not publicly available; embedding; Java.
• KeY (Chalmers University, University of Karlsruhe, University of Koblenz); public; embedding; Java Card; Eclipse and Borland Together plug-ins, test generation.
• KIV, Karlsruhe Interactive Verifier (University of Augsburg); public, except the Java version; hybrid; ASM, Pascal-like, and Java; multiple specification methods.
• Spec# (Microsoft Research); public; embedding; C#; Visual Studio plug-in, bug finding.

A renewed interest in verification itself is motivating the development of better integrated, more automatic verification tools. This is partly driven by enhanced certification requirements for safety-critical software, but also by technological and economical considerations. Mobile software hosted on smart cards and cell phones shares important deployment characteristics with hardware: recalling and updating it is difficult and costly. In addition, typical mobile software applications bear a high economic risk. Most security policies can't be enforced statically or by language-based mechanisms alone, but they can be seen as particular instances of verification problems and, therefore, can be tackled with deductive verification tools. The EU Integrated Project MOBIUS (Mobility, Ubiquity, and Security, http://mobius.inria.fr) combines type-based program analysis and deductive software verification in a proof-carrying code architecture to establish security policies offline in Java-enabled mobile devices.

Exciting new applications of deductive verification technology are emerging in dependable computing—for example, symbolic fault injection, symbolic fault coverage analysis, or deductive cause-consequence analysis.

Deductive verification is quickly moving forward, but formidable challenges lie ahead. Several important program features can't yet be handled adequately—notably concurrent threads, support for modularity, and floating-point numbers.

The lack of available documentation outside of technical research articles threatens to hamper progress. Currently, only one book attempts to document the theory and application of a newer-generation verification tool.4 We need more.

Verification tools' reach can be extended with powerful automated theorem provers that deal with verification conditions and intermittent simplification of proof goals. Numerous theorem provers optimized for verification have appeared recently. The needs of abstract interpretation and software model checking have mostly driven their development, so these systems aren't yet fully optimized for deductive verification. Nevertheless, early experiments have been promising. Most deductive-verification tools offer a theorem prover back-end interface in the recently adopted SMT-LIB (Satisfiability Modulo Theories Library) format (www.smt-lib.org).

Although tool and method integration have progressed substantially, we don't yet fully understand how best to integrate verification into software development processes or into large-scale enterprise software frameworks.

In business and automotive applications, using software that has been automatically generated from more abstract modeling languages such as UML or Simulink is becoming popular. This poses a challenge to verification because modeling languages lack conventional programming languages' maturity and precise semantics. Besides, few verification tools can handle them. Verifying the generated source code isn't possible either because it's too unstructured.

The available specification languages for expressing formal software requirements aren't fully satisfactory. General-purpose formalisms such as Z or RSL tend to be too cumbersome to use. OCL is too abstract, while JML and Spec# are in flux and are tied to a particular target programming language. Specification is, of course, a problem not only of deductive verification but also of any verification technique.

References

1. M. Kaufmann, P. Manolios, and J Strother Moore, Computer-Aided Reasoning: An Approach, Kluwer Academic, 2000.

2. P. Rümmer, "Generating Counterexamples for Java Dynamic Logic," Proc. CADE-20 Workshop Disproving, 2005, pp. 32–44; www.cs.chalmers.se/~ahrendt/cade20-ws-disproving/proceedings.pdf.

3. A.D. Brucker and B. Wolff, "Interactive Testing with HOL-TestGen," Proc. Workshop on Formal Aspects of Testing (FATES 05), LNCS 3997, Springer, 2005, pp. 87–102.

4. B. Beckert, R. Hähnle, and P. Schmitt, eds., Verification of Object-Oriented Software: The KeY Approach, LNCS 4334, Springer, 2006.

Software Development by Refinement
Douglas R. Smith and Cordell Green, Kestrel Institute

The term automatic programming has been used since computing's early days to refer to the generation of programs from higher-level descriptions. The first COBOL compilers were called automatic programmers, and since then the notion of what constitutes "higher level" description has been rising steadily. Here, we focus on automating the generation of correct-by-construction software from formal-requirements-level specifications. Interest in this topic arises from several disciplines, including AI, software engineering, programming languages, formal methods, and mathematical foundations of computing, resulting in a variety of approaches. Our work at the Kestrel Institute emphasizes formal representation and reasoning about expert design knowledge, thereby taking an AI approach to automated software development.

Automated software development (also called software synthesis) starts with real-world requirements that are formalized into specifications. Specifications then undergo a series of refinements that preserve properties while introducing implementation details. These refinements result from applying representations of abstract design knowledge to an intermediate design.

The sweet spot for automated software development lies in domain-specific modeling and specification languages accompanied by automatic code generators. Domain-specific code generation helps put programming power in the hands of domain experts who aren't skilled programmers. An active research topic is how to compose heterogeneous models to get a global model of a system. Another trend is toward increased modularity through aspect- and feature-oriented technologies. These techniques provide a novel means for automatically incorporating new requirements into existing code. Here, we summarize Kestrel's approach to automated software development, which is embodied in the Specware system (www.specware.org).

Applications of software synthesis

Projects spanning various application areas convey a sense of what's possible in automated software development.

Synthesis of scheduling software

In the late 1990s, Kestrel developed a strategic airlift scheduler for the US Air Force that was entirely evolved at the specification level. From a first-order-logic specification, the application had approximately 24,000 lines of generated code. Using the Kestrel Interactive Development System (KIDS), more than 100 evolutionary derivations were carried out over a period of several years,1 each consisting of approximately a dozen user design decisions. KIDS used abstract representations of knowledge about global search algorithms, constraint propagation, data type refinements (for example, sets to splay trees), and high-level code optimization techniques such as finite differencing and partial evaluation. The resulting code ran more than two orders of magnitude faster than comparable code that experts had manually written.

Continued progress in generating scheduling software led to the development of the domain-specific Planware system.2 Planware defines a domain-specific requirement language for modeling planning and scheduling problems. From such a problem model (typically 100 to 1,000 lines of text derived from mixed text and graphical input), it automatically generates a complex planner/scheduler together with editors and visual display code (more than 100,000 lines of code for some applications). Design theories similar to the ones used in KIDS are specialized and fully automated in Planware, so code generation (or regeneration) takes only a few minutes.

Synthesis of Java Card applets

The current Java Card system uses a domain-specific language for specifying Java Card applets.3 The system not only generates Java Card code but also produces formal proofs of consistency between the source specification and the target code. Certification is a major cost in the deployment of high-assurance software. By delivering both Java Card code and proofs, an external certifier can mechanically check the proofs' correctness. Our goal is to dramatically lower the cost of deploying software that must pass government certification processes. Elsewhere, NASA Ames' AutoBayes and AutoFilter projects are also exploring the simultaneous generation of code and certification evidence from domain-specific specifications.

Model-based code generation

The Forges project explored generating production-quality code from MatLab models. To do this, the project developed an operational semantics model of the StateFlow language. It then used partial evaluation of that operational semantics with respect to a given StateFlow model to produce executable code. The result was then optimized and sent to a C compiler. The code quality compared favorably to code that commercially available tools produced.

Deriving authentication protocols

The Protocol Derivation Assistant tool explores the transformations and refinements needed to derive correct-by-construction authentication protocols. A major result has been the derivation of various well-known families of protocols.4 Each family arises from choices of various features that are composed in. Hand-simulating this approach, Cathy Meadows and Dusko Pavlovic found a flaw in a protocol (called Group Domain of Interpretation, GDOI) that had been extensively studied and analyzed.

Automated policy enforcement

Many safety and security policies have effects that cut across a system's component structure. Recent theoretical work has shown that you can treat many such crosscutting requirements as system invariants. Moreover, we've developed techniques for calculating where and how to modify the system to enforce those constraints. These techniques depend on automated inference, static analysis of program flow, and synthesis of many small segments of code for insertion at appropriate locations.

More details about Specware

Most of the example applications we discussed earlier were developed using Kestrel's Specware system.

From requirements to specifications

Refinement-oriented development starts with the procuring organization's requirements, which are typically a mixture of informal and semiformal notations reflecting various stakeholders' needs. To provide the basis for a clear contract, the stakeholders must formalize the requirements into specifications that both the procuring organization (the buyer) and the developer (the seller) can agree to. You can express specifications at various levels of abstraction. At one extreme, a suitable high-level programming language can sometimes serve to express executable specifications. However, an executable specification must include implementation detail that's time-consuming to develop and get right and that might be better left to the developer's discretion. At the other extreme, a property-oriented language (such as a higher-order logic) can be used to prescribe the intended software's properties with minimal prescription of implementation detail. The solution in Specware is a mixture of logic and high-level programming constructs providing a wide-spectrum approach. This lets specification writers choose an appropriate level of abstraction from implementation detail.

Refinement: From specifications to code

A formal specification serves as the central document of the development and evolution process. It's incrementally refined to executable code. A refinement typically embodies a well-defined unit of programming knowledge. Refinements can range from situation-specific or ad hoc rules, to domain-specific transformations, to domain-independent theories or representations of abstract algorithms, data structures, optimization techniques, software architectures, design patterns, protocol abstractions, and so on. KIDS and a Specware extension called Designware are systems that automate the construction of refinements from reusable or abstract design theories. A crucial feature of a refinement from specification A to specification B is that it preserves A's properties and behaviors in B while typically adding more detail in B. This preservation property lets you compose refinements, meaning that you can treat a chain of refinements from an initial specification to a low-level executable specification as a single property- and behavior-preserving refinement, thereby establishing that the generated code satisfies the initial specification.
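Stated abstractly (a sketch added for illustration, not from the essay), writing A ⊑ B when specification B preserves all the properties and behaviors required by specification A, the composition argument is simply transitivity of the refinement relation:

    A \sqsubseteq B \;\wedge\; B \sqsubseteq C \;\Longrightarrow\; A \sqsubseteq C,
    \qquad\text{so}\qquad
    S_0 \sqsubseteq S_1 \sqsubseteq \cdots \sqsubseteq S_n
    \;\Longrightarrow\; S_0 \sqsubseteq S_n ,
    % that is, the executable specification S_n at the end of the chain still
    % satisfies the initial requirement-level specification S_0.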

The Specware framework

Specware provides a mechanized framework for composing specifications and refining them into executable code. The framework is founded on a category of specifications. The specification language, called MetaSlang, is based on a higher-order logic with predicate subtypes and extended with a variety of ML-like programming constructs. MetaSlang supports pure property-oriented specifications, executable specifications, and mixtures of these two styles. You use specification morphisms to structure, parameterize, and refine specifications. You use colimits to compose specifications, instantiate parameterized specifications, and construct refinements. You use diagrams to express the structure of large specifications, the refinement of specifications to code, and the application of design knowledge to a specification. A recent Specware extension supports specifying, composing, and refining behavior through a category of abstract state machines.

The framework features a collection of techniques for constructing refinements based on formal representations of programming knowledge. Abstract algorithmic concepts, data type refinements, program optimization rules, software architectures, abstract user interfaces, and so on are represented as diagrams of specifications and morphisms. These diagrams can be arranged into taxonomies, which allow incremental access to and construction of refinements for particular requirement specifications.

Concluding remarks

A key feature of Kestrel's approach is the automated application of reusable refinements and the automated generation of refinements by specializing design theories. Previous attempts to manually construct and verify refinements have tended to require costly rework when requirements change. In contrast, automated construction of refinements allows larger-scale applications and more rapid development and evolution.

Near-term practical applications of automated software development will result from narrowing down the specification language and design knowledge to specific application domains, as in the Planware and Java Card prototypes we mentioned earlier. By narrowing the specification language, the generator developers can effectively hardwire a fixed sequence of design choices (of algorithms, data type refinements, and optimizations) into an automatic-design tactic. Also, by restricting the domain, a generator developer can specialize the design knowledge to the point that little or no inference is necessary, resulting in completely automated code generation.

References

1. D.R. Smith, "KIDS: A Semiautomatic Program Development System," IEEE Trans. Software Eng., vol. 16, no. 9, 1990, pp. 1024–1043.

2. M. Becker, L. Gilham, and D.R. Smith, Planware II: Synthesis of Schedulers for Complex Resource Systems, tech. report, Kestrel Technology, 2003.

3. A. Coglio, "Toward Automatic Generation of Provably Correct Java Card Applets," Proc. 5th ECOOP Workshop Formal Techniques for Java-like Programs, 2003; ftp://ftp.kestrel.edu/pub/papers/coglio/ftjp03.pdf.

4. A. Datta et al., "A Derivation System for Security Protocols and Its Logical Formalization," Proc. 16th IEEE Computer Security Foundations Workshop (CSFW 03), IEEE CS Press, 2003, pp. 109–125.

Satisfiability Modulo Theories
Silvio Ranise, INRIA
Cesare Tinelli, University of Iowa

Many applications of formal methods rely on generating formulas of first-order logic and proving or disproving their validity. Despite the great progress in the last 20 years in automated theorem proving (and disproving) in FOL, general-purpose theorem provers, such as provers based on the resolution calculus, are typically inadequate to work with the sort of formulas that formal-methods tools generate. The main reason is that these tools aren't interested in validity in general but in validity with respect to some background theory, a logical theory that fixes the interpretations of certain predicate and function symbols. For instance, in formal methods involving the integers, we're only interested in showing that the formula

∀x ∃y (x < y ∧ x < y + y)

is true in those interpretations in which < denotes the usual ordering over the integers and + denotes the addition function. When proving a formula's validity, general-purpose reasoning methods have only one way to consider only the interpretations allowed by a background theory: add as a premise to the formula a conjunction of the theory's axioms. When this is possible at all (some background theories can't be captured by a finite set of FOL formulas), generic theorem provers usually perform unacceptably for realistic formal-method applications. A more viable alternative is using specialized reasoning methods for the background theory of interest. This is particularly the case for ground formulas, FOL formulas with no variables (so also no quantifiers) but possibly with free constants—constant symbols not in the background theory.
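As a concrete illustration of the role of the background theory (an example added here, not from the essay), consider the ground formula below, built from the free constants a and b:

    % Ground formula over free constants a and b (no variables, no quantifiers):
    a < b \;\wedge\; b < a + 1
    % Modulo the theory of the integers (with < and + given their standard
    % meanings), the formula is unsatisfiable: no integer lies strictly between
    % a and a + 1. Without the background theory, where < and + may be
    % interpreted arbitrarily, it is satisfiable; for example, over the
    % rationals with the usual ordering, take a = 0 and b = 1/2.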

For many theories, specialized methods actually yield decision procedures for the validity of ground formulas or some subset of them. For instance, this is the case, thanks to classical results in mathematics, for the theory of integer numbers (and formulas with no multiplication symbols) or the theory of real numbers. In the last two decades, however, specialized decision procedures have also been discovered for a long (and growing) list of theories of other important data types, such as

• certain theories of arrays,
• certain theories of strings,
• several variants of the theory of finite sets,
• some theories of lattices,
• the theories of finite, regular, and infinite trees, and
• theories of lists, tuples, records, queues, hash tables, and bit vectors of a fixed or arbitrary finite size.

The literature on these procedures often describes them in terms of satisfiability in a theory—relying on the fact that a formula is valid in a theory T exactly when no interpretation of T satisfies the formula's negation. So, we call the field satisfiability modulo theories and call those procedures SMT solvers.

Using SMT solvers in formal methods isn't new. It was championed in the early 1980s by Greg Nelson and Derek Oppen at Stanford University, by Robert Shostak at SRI, and by Robert Boyer and J Strother Moore at the University of Texas at Austin. Building on this work, however, the last 10 years have seen an explosion of interest and research on the foundations and practical aspects of SMT. Several SMT solvers have been developed in academia and industry with continually increasing scope and improved performance. Some of them have been or are being integrated into

• interactive theorem provers for higher-order logic (such as HOL and Isabelle),

• extended static checkers (such as CAsCaDE, Boogie, and ESC/Java),

• verification systems (such as ACL2, Caduceus, SAL, and UCLID),

• formal CASE environments (such as KeY),

• model checkers (such as BLAST, MAGIC, and SLAM),

• certifying compilers (such as Touchstone), and

• unit test generators (such as CUTE and MUTT).

In industry, Cadence, Intel, Microsoft, and NEC (among others) are undertaking SMT-related research projects.

Main approaches

The design, proof of correctness, and implementation of SMT methods pose several challenges. First, formal methods naturally involve more than one data type, each with its own background theory, so suitable combination techniques are necessary. Second, satisfiability procedures must be proved sound and complete. Although proving soundness is usually easy, proving completeness requires specific model construction arguments showing that, whenever the procedure finds a formula satisfiable, a satisfying theory interpretation for it does indeed exist. This means that each new procedure in principle requires a new completeness proof. Third, data structures and algorithms for a new procedure, precisely for being specialized, are often implemented from scratch, with little software reuse. Three major approaches for implementing SMT solvers currently exist, each addressing these challenges differently and having its own pros and cons.

SAT encodings

This approach is based on ad hoc translations that convert an input formula and relevant consequences of its background theory into an equisatisfiable propositional formula (see, for instance, Ofer Strichman, Sanjit A. Seshia, and Randal E. Bryant's work1). The approach applies in principle to all theories whose ground satisfiability problem is decidable, but possibly at the cost of an exponential blow-up in the translation (that is, the size of the equisatisfiable formula the translation produces is exponentially larger than the size of the original formula). The approach is nevertheless appealing because SAT solvers can quickly process extremely large formulas. Proving soundness and completeness is relatively simple because it reduces to proving that the translation preserves satisfiability. Also, the implementation effort is relatively small for being limited to the translator—after that, you can use any off-the-shelf SAT solver. Eager versions of the approach, which first generate the complete translation and then pass it to a SAT solver, have recently produced competitive solvers for the theory of equality and for certain fragments of the integers theory. However, current SAT encodings don't scale up as well as SMT solvers based on the small-engines approach (which we discuss next) because of the exponential blow-up of the eager translation and the difficulty of combining encodings for different theories.
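As a small illustration of the eager style for the theory of equality (an example added here; the formula is hypothetical), every equality atom becomes a Boolean variable, and the relevant theory consequences, here a single transitivity constraint, are added before the result is handed to a SAT solver:

    % Ground formula to check modulo the theory of equality:
    x = y \;\wedge\; y = z \;\wedge\; x \neq z
    % Eager propositional encoding: introduce Boolean variables e_{xy}, e_{yz},
    % e_{xz} for the equality atoms and add the needed transitivity constraint:
    (e_{xy} \wedge e_{yz} \wedge \neg e_{xz}) \;\wedge\; (e_{xy} \wedge e_{yz} \rightarrow e_{xz})
    % An off-the-shelf SAT solver reports this propositional formula
    % unsatisfiable, matching the unsatisfiability of the original formula
    % modulo equality.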

Small engines

This approach is the most popular and consists of building procedures implementing an inference system specialized on a theory T. The lure of these "small engines" is that you can use whatever algorithms and data structures are best for T, typically leading to better performance. A disadvantage is that proving an ad hoc procedure's correctness might be nontrivial. A possibly bigger disadvantage is that you must write an entire solver for each new theory, possibly duplicating internal functionalities and implementation effort.

One way to address the latter problem is to reduce a theory solver to its essence by separating generic Boolean reasoning from theory reasoning proper. The common practice is to write theory solvers just for conjunctions of ground literals—atomic formulas and negations of atomic formulas. These pared-down solvers are then embedded as separate submodules into an efficient SAT solver, letting the joint system accept arbitrary ground formulas. Such a scheme (formulated generally and abstractly by Robert Nieuwenhuis, Albert Oliveras, and Cesare Tinelli2) lets you plug in a new theory solver into the same SAT engine as long as the solver conforms to a simple common interface.
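The division of labor just described can be sketched as a simple "lazy" loop: a SAT solver enumerates Boolean models of the formula's propositional skeleton, and a theory solver checks the corresponding conjunction of literals, with spurious models blocked and the search repeated. The sketch below is purely illustrative and added here; the interfaces (SatSolver, TheorySolver, Literal) are hypothetical stand-ins, not the interface from the essay, and the cited DPLL(T) scheme integrates the two components far more tightly.

    import java.util.List;
    import java.util.Optional;

    // Hypothetical interfaces standing in for a real SAT engine and theory solver.
    interface SatSolver {
        // Returns a propositional model of the skeleton (as a set of literals), if any remain.
        Optional<List<Literal>> nextModel();
        // Rules out a spurious model by asserting the negation of its literals.
        void addBlockingClause(List<Literal> literals);
    }

    interface TheorySolver {
        // Decides satisfiability of a conjunction of ground theory literals.
        boolean isSatisfiable(List<Literal> conjunction);
    }

    record Literal(String atom, boolean positive) {}

    final class LazySmtLoop {
        // Naive lazy SMT: alternate Boolean model enumeration with theory checks.
        static boolean isSatisfiableModuloTheory(SatSolver sat, TheorySolver theory) {
            while (true) {
                Optional<List<Literal>> model = sat.nextModel();
                if (model.isEmpty()) {
                    return false;                        // no Boolean model left: unsatisfiable
                }
                if (theory.isSatisfiable(model.get())) {
                    return true;                         // model is also consistent with the theory
                }
                sat.addBlockingClause(model.get());      // spurious model: block it and retry
            }
        }
    }

Either way, the SAT engine handles all Boolean structure, and the theory solver only ever sees conjunctions of literals, which is exactly what makes plugging in new theory solvers cheap.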

Another way to reduce development costs is to decompose, when possible, a background theory into two or more component theories, write a solver for each smaller theory, and then use the solvers cooperatively. Greg Nelson and Derek Oppen developed a general and highly influential method for doing this.3 All major SMT solvers based on the small-engines approach use some enhancement or variation of this method.

Big engines

You can apply this rather recent approach to theories T that admit a finite FOL axiomatization, capitalizing on the power and flexibility of current automated theorem provers for FOL. In particular, you can apply it to provers based on the superposition calculus, a modern version of resolution with a built-in treatment of the equality predicate and powerful techniques for reducing the search space. The approach consists of instrumenting a superposition prover with specialized control strategies that, together with the axioms of T, effectively turn this "big engine" into a decision procedure for ground satisfiability in T.

A big plus of the approach is a simplified proof of correctness, which reduces to a routine termination proof for an exhaustive and fair application of the superposition calculus' rules.4 Another advantage is that, when the approach applies to two theories, obtaining a decision procedure for their union is, under reasonable assumptions, as simple as feeding the union of the axioms to the prover. A further advantage is the reuse of efficient data structures and algorithms for automated deduction implemented in state-of-the-art provers.

The main disadvantage is that to get additional functionalities (such as incrementality, proof production, or model building), you might need to modify the prover in ways that the original implementers didn't foresee, which might require a considerable implementation (or reimplementation) effort.

Standardization efforts

Because of their specialized nature, different SMT solvers often are based on different FOL variants, work with different theories, deal with different classes of formulas, and have different interfaces and input formats. Until a few years ago, this made it arduous to assess the relative merits of existing SMT solvers and techniques, theoretically or in practice. In fact, even testing and evaluating a single solver in isolation was difficult because of the dearth of benchmarks.

To mitigate these problems, in 2002 the SMT community launched the SMT-LIB (Satisfiability Modulo Theories Library) initiative, a standardization effort co-led by us and supported by the vast majority of SMT research groups. The initiative's main goals are to define standard I/O formats and interfaces for SMT solvers and to build a large online repository of benchmarks for several theories. Several SMT solvers worldwide now support the current version of the input format, and numerous formal-methods applications are adopting it. The repository, which is still growing, includes about 40,000 benchmarks from academia and industry. (See www.smt-lib.org for more details on SMT-LIB and on SMT-COMP, the affiliated solver competition.)

Future directions

Although SMT tools are proving increasingly useful in formal-methods applications, more work is needed to improve the trade-off between their efficiency and their expressiveness. For example, software verification problems often require handling formulas with quantifiers, something that SMT solvers don't yet do satisfactorily. Finding good heuristics for lifting current SMT techniques from ground formulas to quantified formulas is a key challenge.

Other lines of further research come from the need to enhance SMT solvers' interface functionalities. For instance, to validate the results of a solver or integrate it in interactive provers, we need solvers that can produce a machine-checkable proof every time they declare a formula to be unsatisfiable. Similarly, to be useful to larger tools such as static checkers, model checkers, and test set generators, an SMT solver must be able to produce, for a formula it finds satisfiable, a concrete, finite representation of the interpretation that satisfies it. Other potentially useful functionalities are the generation of unsatisfiable cores of unsatisfiable formulas or of interpolants for pairs of jointly unsatisfiable formulas. More research on efficiently realizing all these functionalities is under way.

References

1. O. Strichman, S.A. Seshia, and R.E. Bryant, "Deciding Separation Formulas with SAT," Proc. 14th Int'l Conf. Computer Aided Verification (CAV 02), LNCS 2404, Springer, 2002, pp. 209–222.

2. R. Nieuwenhuis, A. Oliveras, and C. Tinelli, "Abstract DPLL and Abstract DPLL Modulo Theories," Proc. 11th Int'l Conf. Logic for Programming, Artificial Intelligence, and Reasoning (LPAR 04), LNCS 3452, Springer, 2005, pp. 36–50.

3. G. Nelson and D.C. Oppen, "Simplification by Cooperating Decision Procedures," ACM Trans. Programming Languages and Systems, vol. 1, no. 2, 1979, pp. 245–257.

4. A. Armando, S. Ranise, and M. Rusinowitch, "A Rewriting Approach to Satisfiability Procedures," Information and Computation, June 2003, pp. 140–164.

Static Driver Verifier: An Industrial Application of Formal Methods
Thomas Ball and Sriram K. Rajamani, Microsoft Research

Microsoft will soon release Windows Vista, the latest in its line of PC operating systems. Each version of the Windows OS has simultaneous releases of development kits that let third-party developers create software that builds upon it. An important class of software that extends Windows is device drivers (or simply “drivers”), which connect the huge and diverse number of devices to the Windows OS. The Windows Driver Kit (www.microsoft.com/whdc/devtools/ddk/default.mspx) “provides a build environment, tools, driver samples, and documentation to support driver development for the Windows family of operating systems.”

The Static Driver Verifier (SDV) tool, part of the Windows Driver Kit, uses static analysis to check if Windows device drivers obey the rules of the Windows Driver Model (WDM), a subset of the Windows kernel interface that device drivers use to link into the operating system. The SDV tool is powered by the SLAM software-model-checking engine,1 which uses several formal methods such as model checking, theorem proving, and static program analysis. We describe technical and nontechnical factors that led to this tool’s successful adoption and release.

Successful applications of formal methods in commercial software engineering are rare for several reasons, including

• the lack of a compelling value proposition to users,
• the high cost of obtaining formal specifications, and
• a lack of scalability and automation in tools based on formal methods.

We describe how the SLAM engine and SDV address these obstacles.

SDV’s value proposition

The biggest reason SDV succeeded is the importance of the driver reliability problem. The reliable operation of device drivers is fundamental to the reliability of Windows. According to Rob Short, Corporate Vice President of Windows Core Technology, failed drivers cause 85 percent of Windows XP crashes.2 Device-driver failures appear to the customer as Windows failures. So, Microsoft invests heavily in improving the Windows Driver Kit, which includes new driver models (interfaces with supporting libraries) as well as a diverse set of tools to catch bugs early in the development of device drivers.

SDV’s value proposition is that it detects errors in device drivers that are hard to find using conventional testing. SDV places a driver in a hostile environment and uses SLAM to systematically test all code paths while looking for violations of WDM usage rules. SLAM’s symbolic execution makes few assumptions about the state of the OS or the driver’s initial state, so it can exercise situations that are difficult to reach by traditional testing.
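The following C sketch conveys, in highly simplified form, what such a hostile environment looks like: the driver’s entry points are exercised with unconstrained inputs in arbitrary order, so the analysis explores every resulting path. The function names are hypothetical and this is not the actual SDV environment model; in SDV the nondeterministic values are left symbolic rather than stubbed out.

    #include <stdlib.h>

    /* Stand-in for a value the analysis treats as completely unknown;
       stubbed with rand() only to keep this sketch self-contained. */
    static int sdv_nondet_int(void) { return rand() % 4; }

    /* Hypothetical driver entry points, stubbed for illustration. */
    static void my_driver_add_device(void)      { }
    static void my_driver_dispatch(int request) { (void)request; }
    static void my_driver_unload(void)          { }

    int main(void)
    {
        my_driver_add_device();

        /* The environment may issue any sequence of requests; a model
           checker explores all of these choices, not one concrete run. */
        while (sdv_nondet_int())
            my_driver_dispatch(sdv_nondet_int());

        my_driver_unload();
        return 0;
    }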

The cost of creating formal specifications

SDV analyzes a driver (written in C) against interface rules for the WDM. These rules are written in a state-machine description language called SLIC (Specification Language for Interface Checking). Because many thousands of drivers use the same WDM interface, it made economic sense for Microsoft to write the specifications because the cost is amortized by running SDV across many drivers. Furthermore, the specifications themselves are programs, and, like all programs, they’re error prone. As the specifications were used on drivers, they were debugged, and only those that produced valuable results are part of the SDV release. Creating the SLIC specifications as well as the environment model (a C program) representing how Windows invokes a device driver took several staff-years of effort.
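To give a flavor of what such a rule expresses, here is a simple lock-usage rule rendered as a C monitor rather than in actual SLIC syntax: acquiring a lock that is already held, or releasing one that isn’t, is a violation. A SLIC specification pairs exactly this kind of state machine with the interface events it observes; the code below is only our illustration of the idea.

    /* A lock-usage rule rendered as a C monitor (illustrative only;
       SLIC states the same rule declaratively over interface events). */
    #include <stdio.h>

    enum lock_state { UNLOCKED, LOCKED };
    static enum lock_state state = UNLOCKED;

    static void rule_violation(const char *msg)
    {
        printf("rule violated: %s\n", msg);
    }

    /* To be "called" whenever the driver invokes the acquire routine. */
    void on_acquire(void)
    {
        if (state == LOCKED) rule_violation("lock acquired twice");
        state = LOCKED;
    }

    /* To be "called" whenever the driver invokes the release routine. */
    void on_release(void)
    {
        if (state == UNLOCKED) rule_violation("lock released but not held");
        state = UNLOCKED;
    }

Because the tool, not the driver author, weaves these observation points into the analysis, the same rule can be reused unchanged across the thousands of drivers that share the WDM interface.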

Scalability and automation

Exhaustively analyzing all possible behaviors of large programs isn’t possible because programs have an astronomical number of behaviors. Instead, the SLAM engine focuses on an “abstraction” of the program, which contains only the state of the program that’s relevant to the property being verified. For example, to check if a driver is properly using a spinlock, SLAM tracks only the state of the lock itself and the variables in the driver that guard the state transitions involving the lock. SLAM uses a technique called counterexample-driven refinement to automatically discover the relevant state. The SLAM toolkit uses Boolean programs as models to represent these abstractions. Boolean programs are simple, and property checking for Boolean programs is decidable. Our experience with SLAM demonstrates that for control-dominated properties, this model suffices to represent useful abstractions.
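A minimal sketch of what such an abstraction looks like, using an invented fragment rather than real driver code: in the C function below, only two facts matter to the locking property, so the Boolean program tracks just the corresponding two predicates and replaces everything else with nondeterministic choice.

    /* Illustrative fragment and its Boolean abstraction; neither is
       taken from a real driver or from SLAM's output. */
    static int nondet(void) { return 0; }   /* stands for an unknown value */

    static int locked = 0;                  /* lock state          */
    static int count  = 0;                  /* guards the release  */

    void work(int n)
    {
        if (n > 0) { locked = 1; count = n; }      /* acquire */
        /* ... device work ... */
        if (count > 0) { locked = 0; count = 0; }  /* release */
    }

    /* Boolean program over the predicates b1 == (locked != 0) and
       b2 == (count > 0); the branch on n becomes a nondeterministic
       choice, and all other state disappears. */
    void work_abstracted(void)
    {
        int b1 = 0, b2 = 0;
        if (nondet()) { b1 = 1; b2 = 1; }
        if (b2)       { b1 = 0; b2 = 0; }
        /* Property checked on the abstraction: the release branch
           executes only when b1 is 1, that is, when the lock is held. */
    }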

SLAM’s contributions

SLAM builds on much work from the formal-methods community, such as symbolic model checking, predicate abstraction, abstract interpretation, counterexample-guided abstraction refinement, and automatic theorem proving using decision procedures.

SLAM’s research contribution is to apply these concepts in the context of programming languages. Its abstraction tool is the first to solve the problem of modularly building a sound predicate abstraction (Boolean program) of a C program. SLAM’s model checker is a symbolic dataflow analysis engine (a cross between an interprocedural analysis and a model checker) that checks Boolean programs. SLAM symbolically executes a C program path to determine its feasibility and to discover predicates to eliminate infeasible execution paths, thus refining the Boolean program abstraction. Adapting existing techniques, such as predicate abstraction, to work in the context of a real-world programming language required new research, in addition to the engineering effort to make them scale and perform acceptably.
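As a tiny illustration of that refinement loop (our example, not one from the SLAM papers): the error-reaching path below is infeasible, the decision procedure reports why, and the resulting predicate refines the Boolean program so the spurious counterexample disappears.

    /* The path reaching report_error() assumes both x == 0 and x > 0.
       Symbolic execution of the path produces that conjunction, a
       theorem prover finds it unsatisfiable, and the predicate (x > 0)
       is added to the abstraction to rule the path out. */
    static void report_error(void) { }

    void example(void)
    {
        int x = 0;
        if (x > 0)
            report_error();   /* infeasible: x == 0 && x > 0 */
    }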

Directions

Since SLAM’s invention, many in academia and industry have done considerable work on improving counterexample-driven refinement for software and addressing key related problems.

Performance

Of course, users want tools to run in minutes rather than hours (and seconds are even better). Increased performance also lets tool providers more quickly debug the properties that their tools take as input. Recent work on using interpolant-based model checking (rather than predicate abstraction) appears promising, as do approaches that leverage recent advances in satisfiability solvers.

Precision

SLAM and many other tools sacrifice precision for performance. For example, SLAM doesn’t precisely model integer overflow or bit-vector operations. Using more precise decision procedures for modular arithmetic and bit vectors would reduce the number of false error reports in tools such as SLAM and might also increase performance.
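A standard illustration of the precision issue (ours, not from SLAM’s documentation): in the function below, the error branch can never execute because x & 1 is always 0 or 1, but a tool that abstracts bitwise operations as unknown values cannot establish that bound and may report a false error. A bit-precise decision procedure models the operation exactly and rules the path out.

    static void report_error(void) { }

    void check_low_bit(unsigned int x)
    {
        unsigned int low = x & 1;   /* always 0 or 1 on every execution */
        if (low > 1) {
            /* Infeasible, but a tool that treats x & 1 as an
               unconstrained value cannot prove it and may flag
               a spurious error here. */
            report_error();
        }
    }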

Explaining the cause of errors

Tools such as SLAM produce error paths that are often long and contain much irrelevant information. We and others have shown that you can automatically prune away this irrelevant information and present a much more concise, relevant error report.

Inferring specifications

As we noted earlier, developing useful specifications is error prone and time consuming. Recent work on automatically inferring simple specifications from a code corpus shows much promise and will be an essential part of methodologies for inserting software tools such as SLAM into legacy environments.

Acknowledgments

We’re grateful for the contributions of many people in academia and inside Microsoft. In addition to providing, over several decades, the theoretic and algorithmic foundations on which SLAM relied, the academic community was generous enough to license software for use inside SDV. In particular, the CUDD (CU Decision Diagram) binary decision diagram package (http://vlsi.colorado.edu/~fabio/CUDD) from the University of Colorado and the OCaml (http://caml.inria.fr) runtime environment from INRIA ship as part of SDV. This wouldn’t have been possible without the cooperation of the researchers who built these tools in working out the legal agreements that enabled Microsoft to include these components.

The Windows product group showed remarkable foresight in signing up to ship SLAM, which was a research prototype at that time. In addition to support from senior management, the SDV development team in Windows invested many staff-years of time and energy in turning SDV from a research prototype to a product. Without their efforts, SDV wouldn’t be a shipping Microsoft product today.

References

1. T. Ball et al., “Thorough Static Analysis of Device Drivers,” Proc. European Systems Conf. (EuroSys 06), 2006, pp. 73–85.

2. M.M. Swift et al., “Recovering Device Drivers,” Proc. 6th USENIX Symp. Operating Systems Design and Implementation (OSDI 04), USENIX, 2004, pp. 1–16.


Bernhard Beckert is an assistant professor of computer science at the University of Koblenz, Germany. Contact him at [email protected].

Tony Hoare is a senior researcher with Microsoft Research in Cambridge and emeritus professor at Oxford University. Contact him at [email protected].

Reiner Hähnle is a professor in the Department of Computer Science at Chalmers University of Technology in Gothenburg, Sweden. Contact him at [email protected].

Douglas R. Smith is principal scientist with the Kestrel Institute and the executive vice president and chief technology officer of Kestrel Technology. Contact him at [email protected].

Cordell Green is the founder, director, and chief scientist of the Kestrel Institute. Contact him at [email protected].

Silvio Ranise has a permanent research position at INRIA, in the LORIA laboratory common to CNRS (Centre National de la Recherche Scientifique), INRIA, and the Universities of Nancy. Contact him at [email protected].

Cesare Tinelli is an associate professor of computer science at the University of Iowa. Contact him at [email protected].

Thomas Ball is a principal researcher and research manager at Microsoft Research, Redmond. Contact him at [email protected].

Sriram K. Rajamani is a senior researcher and research manager at Microsoft Research India, Bangalore. Contact him at [email protected].
