Automatic Synthesis of Safety-Related Software*

— Short Paper —

Johann Schumann
RIACS / NASA Ames

email: schumann@email.arc.nasa.gov

Abstract

For specific domains (e.g., data analysis, planning and scheduling, or state estimation), automated program synthesis systems have been developed which are capable of producing hundreds of lines of non-trivial code. However, the potential applicability of an automatic program synthesis system depends not only on the size and quality of the generated code, but also on its ability to be integrated into the overall software process. Therefore, the generation of executable code alone is not enough. In this paper, we will describe three techniques which enhance the capabilities of a synthesis tool with respect to the generation of explanations, certificates, and simulation data. The synthesis system encodes enough domain knowledge that the appropriate information can be extracted directly during the synthesis process.

ExplainIt! is a component for the AMPHION/NAV system (synthesis of state estimation software) which automatically generates and displays explanations for each piece of the synthesized code, thus effectively achieving traceability between code and specification.

For safety-relevant applications, software must undergo a rigorous certification process in which it must be demonstrated that certain safety policies are not violated. Traditional formal verification approaches (e.g., with Hoare-style rules) are impractical, because they require large amounts of manual code annotations. In this paper, we discuss an extension of the AUTOBAYES system (synthesis of data analysis programs) for the automatic generation of code annotations which can be handled by a verification condition generator and an automated theorem prover. The speed of this approach compares favorably with commercial static analysis tools (e.g., PolySpace).

Finally, we discuss a module of AUTOBAYES which synthesizes code for the generation of artificial data for simulation, experimentation, and testing purposes.

Introduction

Over the recent years, the size and complexity of software in safety-related areas has grown tremendously. A major reason for this is that functionality which has traditionally been realized by hardware is now implemented as a program on a general-purpose processor, thus reducing production costs and increasing functionality. Typical application areas range from avionics and process control (e.g., for chemical or nuclear plants) to the car industry. However, the production of reliable, high-quality code for safety-related applications is far from easy. In particular, modern, highly iterative software lifecycles (e.g., spiral or use-case based processes) are major cost drivers, because each iteration requires substantial testing, documentation, and certification efforts. For example, flight-critical software (e.g., for position estimation or control of an aircraft) requires rigorous certification by an independent certification authority (e.g., the FAA). This time-consuming, highly manual process, which is defined in standard documents (e.g., DO-178B), prescribes the required testing, documentation, and engineering efforts to guarantee traceability between the specification and the executable binary.

*This paper discusses work done in several synthesis projects at the Automated Software Engineering group (Guillaume Brat, Bernd Fischer, Mike Lowry, John Penix, Tom Pressburger, Phil Oh, Grigore Rosu, Mahadevan Subramaniam, Jeffrey van Baalen, Jonathan Whittle). Copyright © 2002, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

An approach which could facilitate the production of such pieces of software is automated program synthesis. Given a high-level specification, an automated program synthesis tool generates executable code which implements the specification. Because rigorous formal logic underlies this approach, the synthesized code is often considered to be “correct-by-construction”.

Deduction-based program synthesis has been around for a long time, and several synthesis systems (e.g., Amphion (Stickel et al. 1994), KIDS (Smith 1990), or Planware (Burstein & Smith 1996)) have been developed over the years. It seems that in certain (albeit small) domains such systems are capable of producing reasonably good code. However, the usability of such systems in safety-related domains is still rather limited. In fact, they share many severe limitations with state-of-the-art code generators for traditional modeling systems (e.g., MatrixX (MatrixX 2001), ControlShell (ControlShell 2001)). As discussed above, the production of a piece of code is not enough. Rather, a code-producing system needs to synthesize the following artifacts:

• well documented, human-understandable code. Only if a piece of software can be easily understood can manual modifications be applied or can it be subjected to (successful) code reviews.

• traceability information between code and specification such that all pieces of the code can be related to their origin in the specification.

• support for simulation, animation, and testing. A successful synthesis system needs to be able to produce artificial data which conform to the given specification. State-of-the-art modeling tools (e.g., Simulink/MatLab, ControlShell) are already quite advanced in that respect.

• support for certification (e.g., providing annotations or even proofs).

In this paper, we demonstrate that a program synthesis system encodes enough domain knowledge to support the requirements listed above. We will discuss three extensions to a program synthesis architecture which, in addition to producing executable code, generate detailed documentation/explanations, certificates for the synthesized code with respect to a given safety policy, and test/simulation data, respectively.

The work described in this paper is ongoing. Therefore, these extensions have not been developed for one single program synthesis system, but rather for two tools, namely AMPHION/NAV and AUTOBAYES. AMPHION/NAV (Whittle et al. 2001; Schumann & Robinson 2001) is a tool based on the Amphion system (Stickel et al. 1994) which is capable of automatically synthesizing C/C++ code for state estimation and navigation of aircraft or spacecraft. The domain of AUTOBAYES (Fischer & Schumann 2001; Fischer, Schumann, & Pressburger 2000) is data analysis, using the approach of Bayesian networks. This tool can be used for scientific data analysis (e.g., clustering or classification problems), but it can also synthesize code to model sensors and sensor failures. Both systems are aimed at applications where safety is important, for example, state estimation of Mars rovers or (on-board) scientific data analysis.

Architecture of an Extended Synthesis System

Figure 1 shows the system architecture of a modern, extended program synthesis system. Given a specification, the synthesis system produces executable code. For this core task, domain knowledge in the form of a domain theory is used to guide the synthesis process. The underlying principle of the synthesis engine is of no great importance for the discussion in this paper. For example, the AMPHION/NAV system is based upon deduction-based synthesis (using the first-order theorem prover SNARK), whereas AUTOBAYES uses schema-guided synthesis. However, all these systems have in common that they rely on a substantial body of encoded domain knowledge. This domain knowledge, combined with information on how the program was assembled (e.g., a proof), can be used to extend the synthesis system to produce commented code, design documents, test data, and support for rigorous certification. These extensions will be described in the following sections.

[Figure 1 (block diagram): the Program Synthesis System, driven by the domain theory/knowledge, takes an input specification (e.g., "x ~ N(mu,sigma), max pr(x | mu ..") and produces synthesized, commented code, a design document, certification support (annotated code passed to a Certifier, which yields a product certificate such as "mem_safety: OK, op_safety: OK"), and simulation data.]

Figure 1: System Architecture for an Extended Program Synthesis System

Explaining Synthesized Code

In the AMPHION/NAV system, most axioms in the domain theory¹ are given as a set of first-order equations. These equations relate the various objects on different abstraction levels. Due to the synthesis process (deductive synthesis) and additional program transformation steps, it is nearly impossible to tell which parts of the synthesized code correspond to which part of the specification, or why the code is structured in a specific way. In a safety-related application environment, traceability between specification and code is of major importance. During manual development of such software, considerable effort is spent on writing detailed documentation on all aspects of the code.

Here, deductive program synthesis can help, because all information relating specification, code, and domain theory is available in the proof produced by the automated theorem prover. The proof, containing hundreds of inference steps, is converted in such a way that it relates the input specification with the final product (C/C++ code). The explanation thus can be seen as a description of the program design “from first principles”. AMPHION/NAV contains the subsystem “ExplainIt!” which produces explanations for the synthesis task (for details see (Whittle et al. 2001; Schumann & Robinson 2001)). Each axiom of the domain theory is annotated with explanation templates, consisting of plain text and (logical) variables. Whenever an axiom is used in the proof, the variables in the templates are instantiated. In order to find the entire explanation, a set of explanation equalities (van Baalen et al. 1998) is generated which is used to compose the corresponding explanation templates.
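The template mechanism described above can be illustrated with a minimal Python sketch. It is not the ExplainIt! implementation: the axiom names, template texts, and variable bindings are hypothetical, and the composition of templates via explanation equalities is not modeled. The sketch only shows how templates attached to axioms are instantiated for the axioms that appear in a proof, while steps without a template stay hidden.

```python
from string import Template

# Hypothetical axioms; each carries an explanation template with logical variables.
AXIOM_TEMPLATES = {
    "frame_conversion": Template("$vec is converted from frame $src to frame $dst."),
    "measurement_model": Template("$z is the measurement predicted from state $x."),
}


def explain(proof_steps):
    """Instantiate the template of every axiom used in the proof.

    proof_steps is a list of (axiom_name, variable_bindings) pairs.
    Inference steps without a template produce no text.
    """
    lines = []
    for axiom, bindings in proof_steps:
        template = AXIOM_TEMPLATES.get(axiom)
        if template is None:
            continue  # low-level step: nothing is shown to the domain engineer
        lines.append(template.substitute(bindings))
    return lines


# Hypothetical proof trace: two annotated axioms and one purely logical step.
proof = [
    ("frame_conversion", {"vec": "v_gps", "src": "ECEF", "dst": "NED"}),
    ("resolution", {}),
    ("measurement_model", {"z": "z_gps", "x": "x_est"}),
]

for line in explain(proof):
    print(line)
```

Keeping the templates next to the axioms means the explanation text is maintained together with the domain theory rather than with the generated code.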

¹ The domain theory for AMPHION/NAV is built on top of the domain theory of the AMPHION system (Stickel et al. 1994) on geometric relationships, coordinate systems, and celestial mechanics.

Human readability and understandability of such an explanation is extremely important. However, the target audience is not a logically trained synthesis person, but a domain expert/engineer. This means not only that all evidence of low-level deduction needs to be hidden from the user, but also that the representation of data should use the form and vocabulary of the domain. In the domain of AMPHION/NAV, the commonly used data structures are vectors and matrices (as opposed to lists and lists-of-lists in AMPHION/NAV’s internal representation). Thus, the explanation of a matrix is best represented in tabular form, as shown in the screen dump in Figure 2. It shows a part of the explanation for a matrix (“measurement matrix �”) which relates the measurements to the current position estimate. Each cell of the table corresponds to a single entry in the matrix. This HTML document is produced from the internal XML representation which is generated by “ExplainIt!”. Translation/formatting is done with XSLT. Hyperlinked HTML documents have the advantage that all statements of the synthesized code can be linked to their explanations. Thus, a simple click on a statement immediately produces the related documentation. Using XML as a flexible internal document format also enables us to generate printed PDF documentation in a standardized form.

Figure 2: Screen dump of a part of the explanation document
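As a rough illustration of the tabular presentation, the following Python sketch renders per-entry explanations of a matrix as an HTML table, one cell per matrix entry. ExplainIt! produces its HTML from XML via XSLT; the direct string generation below is a simplification, and the function name, matrix name, and cell texts are hypothetical.

```python
from html import escape


def matrix_to_html(name, rows):
    """Render a matrix of per-entry explanation strings as an HTML table.

    rows is a list of rows; each row is a list of explanation strings.
    All text is escaped so that symbols such as '<' display correctly.
    """
    parts = [f"<table><caption>{escape(name)}</caption>"]
    for row in rows:
        cells = "".join(f"<td>{escape(cell)}</td>" for cell in row)
        parts.append(f"<tr>{cells}</tr>")
    parts.append("</table>")
    return "\n".join(parts)


# Hypothetical 2x2 excerpt of a measurement matrix explanation.
example = matrix_to_html(
    "measurement matrix",
    [
        ["range w.r.t. position x", "range w.r.t. position y"],
        ["bearing w.r.t. position x", "bearing w.r.t. position y"],
    ],
)
print(example)
```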

Certifying Synthesized Code

Code certification is a lightweight approach (as opposed to, e.g., full functional verification) to demonstrate software quality on a formal level. Its basic idea is to produce formal proofs demonstrating that the code satisfies certain quality properties (e.g., memory or operator safety). These proofs can be seen as certificates (for the produced code) which can be checked independently by a simple proof checker. Since code certification uses the same underlying technology as Hoare-style program verification, it also requires many detailed annotations (e.g., loop invariants) to make the proofs possible. However, manually adding these annotations to the code is an extremely time-consuming and error-prone task.

In a certification extension of AUTOBAYES, we address this problem (Whalen, Schumann, & Fischer 2002). AUTOBAYES contains sufficient high-level domain knowledge to generate the required detailed annotations. Because all constraints and information on design decisions are available during synthesis time, detailed and powerful local annotations can be generated easily by AUTOBAYES. A separate propagation algorithm distributes the annotations to all places in the code where they are valid. When annotations were generated by AUTOBAYES, the original 380 lines of commented code grew to more than 2100 lines of code with annotations. This is a clear indication that writing annotations manually is infeasible.

From this annotated code, a general-purpose verification condition generator (in our case MOPS (Kaiser, Fischer, & Struckmann 2000)) produces a set of proof obligations in first-order logic. The obligations are then processed by the automated theorem prover E-SETHEO (CASC 2001).
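To make the notion of a proof obligation concrete, the following Python sketch formulates a memory-safety obligation for an array access inside a counted loop and decides it. It is a toy stand-in, not MOPS or E-SETHEO: the data structure, the textual formula syntax, and the restriction to accesses of the form a[i + offset] with constant bounds are assumptions made for illustration. Because the index is monotone in i, checking the two loop end points is sufficient here, whereas the real system hands general first-order obligations to a theorem prover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayAccess:
    """Access array[i + offset] inside 'FOR i := lo TO hi' (bounds inclusive)."""
    array: str
    size: int        # number of elements; valid indices are 0 .. size-1
    lo: int
    hi: int
    offset: int = 0


def index_expr(offset):
    """Textual form of the index expression i + offset."""
    if offset == 0:
        return "i"
    sign = "+" if offset > 0 else "-"
    return f"i {sign} {abs(offset)}"


def obligation(acc):
    """Format the memory-safety obligation as a first-order formula (text only)."""
    idx = index_expr(acc.offset)
    return (f"forall i: int. {acc.lo} <= i and i <= {acc.hi} "
            f"-> 0 <= {idx} and {idx} < {acc.size}")


def holds(acc):
    """Decide the obligation; the index is monotone in i, so end points suffice."""
    if acc.lo > acc.hi:
        return True  # empty loop: the obligation holds vacuously
    return 0 <= acc.lo + acc.offset and acc.hi + acc.offset < acc.size


safe = ArrayAccess("mu", size=10, lo=0, hi=9)
unsafe = ArrayAccess("mu", size=10, lo=0, hi=9, offset=1)
for acc in (safe, unsafe):
    print(obligation(acc), "=>", "OK" if holds(acc) else "VIOLATED")
```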

In (Whalen, Schumann, & Fischer 2002) we have demonstrated our approach by certifying operator safety and memory safety for a generated iterative data classification program ( ����� lines of documented C++ code) without manual annotation of the code. For this example, a total of 69 proof tasks were generated. E-SETHEO could solve 65 of them automatically with a run-time limit of �� seconds on a 1000 MHz SunBlade workstation. Most of the tasks could be solved in about one second, but several tasks took up to ��� seconds (average time: �� � seconds). The remaining four proof tasks currently require some manual preprocessing, which will be automated in future versions. A comparison with the state-of-the-art commercial static analysis tool PolySpace (PolySpace 2002) showed that our approach could reach better coverage with a substantially shorter run time.

Generation of Simulation and Test Data

Testing and simulation play a vital role in most software development processes. Whereas testing aims at showing that the piece of code works correctly, simulation is often used to demonstrate how the code works and to assess its quality and performance. Therefore, the availability of simulation and test data is of great importance. Setting up a simulation environment manually, however, is usually a very time-consuming and error-prone task. This is especially true when the requirements specifications are modified in rapid succession (e.g., in an iterative life cycle).

With program synthesis, the development of a simulation environment can be very straightforward: we synthesize a program from our given specification which generates test data. The advantages are obvious: we already have a specification, and most of the synthesizer’s infrastructure (e.g., symbolic handling, code generation) can be used as is for this task. For AUTOBAYES, we have developed a tool component which can synthesize a program to generate randomized data according to the given specification. This data generator could be implemented in less than 200 lines of Prolog code on top of the AUTOBAYES system.
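The idea of deriving a data generator from the same specification can be sketched in Python as follows. This is not the AUTOBAYES component, which is written in Prolog and emits generator code; here the "specification" is a hypothetical dictionary, the supported distributions are an assumption, and the generator is returned as a closure rather than as synthesized source code.

```python
import random


def synthesize_generator(spec):
    """Return a function that draws one sample according to spec.

    spec is a hypothetical dictionary form of a statistical model,
    e.g. {"distribution": "gauss", "mu": 5.0, "sigma": 2.0}.
    """
    dist = spec["distribution"]
    if dist == "gauss":
        mu, sigma = spec["mu"], spec["sigma"]
        return lambda rng: rng.gauss(mu, sigma)
    if dist == "uniform":
        lo, hi = spec["lo"], spec["hi"]
        return lambda rng: rng.uniform(lo, hi)
    raise ValueError(f"unsupported distribution: {dist}")


def generate(spec, n, seed=0):
    """Generate n reproducible data points that follow the specification."""
    rng = random.Random(seed)  # private generator, so runs are repeatable
    draw = synthesize_generator(spec)
    return [draw(rng) for _ in range(n)]


# Data for a model of the form x ~ N(mu, sigma).
data = generate({"distribution": "gauss", "mu": 5.0, "sigma": 2.0}, n=1000)
print(len(data), sum(data) / len(data))
```

Fixing the seed makes a simulation run repeatable, which is convenient when the specification changes between iterations and results need to be compared.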

Conclusions

In this paper, we have briefly described three extensions to bare-bones program synthesis technology which can increase the usability of a synthesis tool in safety-related application areas. In the AMPHION/NAV system, a detailed explanation is generated fully automatically and presented in a way suitable for the domain engineer. It fully hides the underlying logic and reasoning system used to synthesize the program. The proof steps are converted in such a way that they relate the input specification with the final product, thus opening up an entirely new level of traceability between specification and source code.

Explanation and documentation is only one aspect. Current practice of certification of safety-critical code requires huge testing effort and lengthy manual code reviews. Automatic certification of synthesized code has the potential to substantially facilitate and accelerate certification. In combination with techniques from proof-carrying code (Necula & Lee 1998), dynamic certification of field-loadable software can be addressed. Here again, we benefit from the fact that the synthesis system encodes enough domain knowledge that the required Hoare-style annotations can be generated automatically. Last, but not least, trying out synthesized code during simulation runs is an important feature for a practically usable system. The test data generator provides immediate feedback on the specification (does it make sense or are there some obvious bugs?) and helps to navigate through the design space.

All these features form essential ingredients of a modern program synthesis system if it is to have a chance of being used in practice. Bare-bones synthesis power does not help here; it only leads to repeating the same mistakes that have been made with automated theorem provers, which are usually restricted “more by general usability than by raw deductive power”².

² M. Kaufmann in his invited talk during CADE 15, 1998.

References

Burstein, M. B., and Smith, D. 1996. ITAS: A Portable Interactive Transportation Scheduling Tool Using a Search Engine Generated from Formal Specifications. In Proceedings of the 3rd International Conference on AI Planning Systems (AIPS-96), 35–44. AAAI Press.

CASC-JC, 2001. The CASC-JC theorem proving competition. URL: http://www.cs.miami.edu/~tptp/CASC/JC.

ControlShell. 2001. RTI Real-Time Innovations. http://www.rti.com.

Fischer, B., and Schumann, J. 2001. AutoBayes: A system for generating data analysis programs from statistical models. Submitted for publication. Preprint available at http://ase.arc.nasa.gov/people/.. .fischer/papers.html.

Fischer, B.; Schumann, J.; and Pressburger, T. 2000. Generating data analysis programs from statistical models (position paper). In Taha, W., ed., Proc. Intl. Workshop Semantics, Applications, and Implementation of Program Generation, volume 1924 of Lect. Notes Comp. Sci., 212–229. Montreal, Canada: Springer.

Kaiser, T.; Fischer, B.; and Struckmann, W. 2000. Mops: Verifying Modula-2 programs specified in VDM-SL. In Proc. 4th Workshop Tools for System Design and Verification, 163–167.

MatrixX: AutoCode Product Overview. ISI. URL: http://www.isi.com.

Necula, G. C., and Lee, P. 1998. Efficient representation and validation of logical proofs. In Proceedings of the 13th Annual Symposium on Logic in Computer Science (LICS’98), 93–104. IEEE Computer Society Press.

PolySpace technologies. URL: http://www.polyspace.com.

Schumann, J., and Robinson, P. 2001. [] or success is not enough: Current technology and future directions in proof presentation. In Future Trends in Automated Deduction (during IJCAR 2001).

Smith, D. R. 1990. KIDS: A Semiautomatic Program Development System. IEEE Trans. on Software Engineering 16(9):1024–1043.

Stickel, M.; Waldinger, R.; Lowry, M.; Pressburger, T.; and Underwood, I. 1994. Deductive composition of astronomical software from subroutine libraries. In Bundy, A., ed., Proc. 12th International Conference Automated Deduction, volume 814 of Lecture Notes in Artificial Intelligence, 341–355. Springer.

van Baalen, J.; Robinson, P.; Lowry, M.; and Pressburger, T. 1998. Explaining synthesized software. In Thirteenth International Conference on Automated Software Engineering, 240–248. IEEE Computer Society Press.

Whalen, M.; Schumann, J.; and Fischer, B. 2002. Synthesizing certified code. In Proc. ICSE 2002. (submitted).

Whittle, J.; van Baalen, J.; Schumann, J.; Robinson, P.; Pressburger, T.; Penix, J.; Oh, P.; Lowry, M.; and Brat, G. 2001. Amphion/NAV: Deductive Synthesis of State Estimation (short paper). In Proceedings of the 16th Automated Software Engineering Conference 2001 (ASE 2001). IEEE.
