Symbolic Execution for Python: Progress & Challenges
Maverick Woo
[email protected]
2017-09-26


My Collaborators
•  CMU
  –  Peter Chapman; now at Duolingo
  –  David Brumley
•  University of Iowa
  –  Andrew Reynolds
  –  Tianyi Liang; now at Two Sigma
  –  Cesare Tinelli
•  Stanford University
  –  Clark Barrett

We also acknowledge the rest of the CVC4 developer team and Thomas Ball (MSR) for their advice!


Python is Important
•  Python is everywhere
  –  Data analysis: NumPy, Pandas, Matplotlib, …
  –  Machine learning: Scikit-Learn, TensorFlow, …
  –  Web applications: Django, Tornado, …
•  Python development environment is mature
  –  PyCharm
  –  Emacs, Vim etc. with suitable plugins
  –  Code completion included (“IntelliSense” in MS lingo)


IntelliTest for C# (VS 2015)

https://msdn.microsoft.com/en-us/library/dn823749.aspx


Driven by Dynamic Symbolic Execution


Dynamic Symbolic Execution
“The instrumented executions of a program where
•  each concrete load/store of a variable is accompanied with
•  a corresponding symbolic dereference/assignment of that variable’s expression”

Consider this Python program:
    1. x = int(input("Num? "))
    2. y = x + 42


Example

    1. x = int(input("Num? "))
    2. y = x + 42

Concrete Execution
    1.  x ← 10 /let's say/
    2.  y ← 52

Symbolic Execution
    1.  x_1 == sym_int()
    2.  y_2 == plus(x_1, 42)

Logic constraints: Equality for assignment; Inequality for if-then-else
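
The pairing of concrete and symbolic state in this example can be sketched in a few lines of Python. This is a toy illustration only, not the tool described in the talk: the class `SymInt` and the helpers `sym_int`, `assign`, and `constraints` are names made up for this sketch, and only `+` is modeled.

```python
# Toy illustration: each value carries a concrete int and a symbolic term.
# SymInt, sym_int, assign and constraints are hypothetical names.

constraints = []  # equalities recorded for each assignment


class SymInt:
    def __init__(self, concrete, term):
        self.concrete = concrete  # value observed in this run
        self.term = term          # symbolic expression, kept as a string

    def __add__(self, other):
        # Concrete addition, shadowed by a symbolic plus(...) term.
        return SymInt(self.concrete + other, f"plus({self.term}, {other})")


def sym_int(name, concrete):
    """Introduce a fresh symbolic integer together with a concrete value."""
    constraints.append(f"{name} == sym_int()")
    return SymInt(concrete, name)


def assign(name, value):
    """Record an equality constraint for an assignment."""
    constraints.append(f"{name} == {value.term}")
    return SymInt(value.concrete, name)


x = sym_int("x_1", 10)     # line 1: x = int(input("Num? ")), say 10
y = assign("y_2", x + 42)  # line 2: y = x + 42

print(y.concrete)   # 52
print(constraints)  # ['x_1 == sym_int()', 'y_2 == plus(x_1, 42)']
```

A real executor would obtain the same effect by instrumenting the program rather than by asking the user to call `sym_int` and `assign` by hand.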


Satisfiability Modulo Theory Solver


Dynamic Symbolic Executor

    1. x = int(input("Num? "))
    2. y = x + 42

DSE comprises:
•  Automatic generation of these constraints from the program
•  More sophisticated control logic to explore the program
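
The "control logic" part can be illustrated with a toy exploration loop: run the program, record the branch decisions taken, then flip each decision and ask a solver for an input that follows the new path. Everything below is an assumption made for illustration. In particular, the brute-force `solve` stands in for an SMT solver, and the hand-written `PREDICATES` table stands in for constraints that a real executor would generate automatically.

```python
# Toy exploration loop. The "solver" is a brute-force search over a small
# integer range, standing in for an SMT solver; all names are hypothetical.

def program(x, trace):
    """Program under test; branch() records each decision taken."""
    def branch(label, cond):
        trace.append((label, cond))
        return cond

    y = x + 42
    if branch("y > 100", y > 100):
        if branch("x % 2 == 0", x % 2 == 0):
            return "big-even"
        return "big-odd"
    return "small"


# Branch predicates over the input, keyed by the labels used above.
# Hand-written here; a real executor derives them by instrumentation.
PREDICATES = {
    "y > 100": lambda x: x + 42 > 100,
    "x % 2 == 0": lambda x: x % 2 == 0,
}


def solve(path):
    """Find an input that follows the given (label, outcome) prefix."""
    for x in range(-200, 201):
        if all(PREDICATES[label](x) == want for label, want in path):
            return x
    return None


def explore(seed):
    worklist, seen, results = [seed], set(), {}
    while worklist:
        x = worklist.pop()
        trace = []
        result = program(x, trace)
        key = tuple(trace)
        if key in seen:
            continue  # this path was already explored
        seen.add(key)
        results[key] = (x, result)
        # Flip each decision in turn, keeping the prefix before it.
        for i, (label, taken) in enumerate(trace):
            new_input = solve(trace[:i] + [(label, not taken)])
            if new_input is not None:
                worklist.append(new_input)
    return results


paths = explore(seed=10)
print(sorted(r for _, r in paths.values()))  # ['big-even', 'big-odd', 'small']
```

Starting from the single seed input 10, the loop reaches all three paths of the toy program.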


IntelliTest Facilitates Maintenance

    1. x = int(input("Num? "))
    2. y = x + 427

Concrete Execution
    1.  x ← 10 /let's just say/
    2.  y ← 437

Symbolic Execution
    1.  x_1 == sym_int()
    2.  y_2 == plus(x_1, 427)

Previous model {x_1 ← -42, y_2 ← 0} no longer satisfies the new set of generated constraints: we have made a regression or a functional change!
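
A minimal sketch of that check, assuming the constraints are represented as Python predicates over a model dictionary (a representation chosen here for illustration, not the one used by IntelliTest or by our tool):

```python
# Check a stored model against constraints from the old and new program.
# Encoding constraints as Python lambdas is an assumption of this sketch.

def constraints_for(addend):
    """Constraints generated from: x = int(input(...)); y = x + addend."""
    return [lambda m: m["y_2"] == m["x_1"] + addend]


def satisfies(model, constraints):
    return all(c(model) for c in constraints)


previous_model = {"x_1": -42, "y_2": 0}

old_ok = satisfies(previous_model, constraints_for(42))   # y = x + 42
new_ok = satisfies(previous_model, constraints_for(427))  # y = x + 427

print(old_ok, new_ok)  # True False
```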


Let’s Build “IntelliTest for Python”
Dynamic Symbolic Execution for Python
•  Started in Fall 2014, led by Peter Chapman
•  Uses the new string reasoning in the CVC4 SMT solver
•  Initially sponsored by NSF Secure and Trustworthy Cyberspace (SaTC)

“This being CyLab, where does security come in?”
•  DSE affords test case generation
•  DSE affords exploit generation :-)
  –  “Generate an input that (i) drives the execution to a vulnerable part of the program and (ii) triggers that bug”


Test Harness for str.substring

    from symbolic.args import symbolic

    @symbolic(s="foo")
    def strsubstring(s):
        """Test case for Python slicing, negative indices and steps are not currently tested."""
        if s[2:] == "obar":
            return 0
        elif s[:2] == "bb":
            return 1
        elif s[1:3] == "bb":
            return 2
        else:
            return 3

    def expected_result():
        return [0, 1, 2, 3]


    $ pyex --cvc strsubstring.py
    Exploring ./test/cvc/strsubstring.py.strsubstring
    [('s', 'foo')]
    3
    [('s', 'Abb')]
    2
    [('s', 'bb')]
    1
    [('s', 'AAobar')]
    0
    strsubstring test passed <---
    Execution time: 0.43 seconds
    Solver CPU: 0.06 seconds
    Instrumentation CPU: 0.08 seconds
    Path coverage: 4 paths
    Line coverage: 10/11 lines (90.91%)
    Branch coverage: 15 branches
    Exceptions: 0 exceptions raised
    Triaged exceptions: 0 triaged exceptions raised
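
The four generated inputs can be replayed by hand to confirm that each one reaches a different return statement. The sketch below uses an undecorated copy of the function; the `@symbolic` decorator is left out because the `symbolic.args` module belongs to the tool and is not assumed to be installed.

```python
# Replay of the reported inputs against an undecorated copy of the function.
# The @symbolic decorator from the harness is deliberately omitted.

def strsubstring(s):
    if s[2:] == "obar":
        return 0
    elif s[:2] == "bb":
        return 1
    elif s[1:3] == "bb":
        return 2
    else:
        return 3


generated_inputs = ["foo", "Abb", "bb", "AAobar"]
results = [strsubstring(s) for s in generated_inputs]

print(results)          # [3, 2, 1, 0]
print(sorted(results))  # [0, 1, 2, 3], the list returned by expected_result()
```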


The Rest of This Talk
•  What has been achieved so far?
•  What are the current obstacles?
•  What are our ideas to overcome them?
•  What can you do to speed up DSE research?


To Build a DSE for Python
1.  We need the semantics of Python

[1] G. J. Smeding, “An Executable Operational Semantics for Python,” Universiteit Utrecht, 2009.

[2] J. G. Politz, A. Martinez, M. Milano, S. Warren, D. Patterson, J. Li, A. Chitipothu, and S. Krishnamurthi, “Python: The Full Monty - A Tested Semantics for the Python Programming Language,” in Proceedings of the 2013 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2013, pp. 217–232.

[3] S. Sapra, M. Minea, S. Chaki, A. Gurfinkel, and E. M. Clarke, “Finding Errors in Python Programs Using Dynamic Symbolic Execution,” in Proceedings of the International Conference on Testing Software and Systems, 2013, pp. 283–289.

[4] T. Ball and J. Daniel, “Deconstructing Dynamic Symbolic Execution,” in Proceedings of the 2014 Marktoberdorf Summer School on Dependable Software Systems Engineering, 2014.

Built on top of Ball-Daniel [4]


To Build a DSE for Python
1.  Implement the semantics of Python

Python has many versions!
•  Implementation strategy taken by Ball-Daniel 2014 can cope with language changes
  –  Divergence detection is a must for any DSE anyway
•  But NOT with library changes, e.g.,
  –  What’s New in Python 3.7: “bytes.fromhex() and bytearray.fromhex() now ignore all ASCII whitespace, not only spaces. (Contributed by Robert Xiao in bpo-28927.)”


To Build a DSE for Python
1.  Implement the semantics of Python

For each string function, we need to manually encode its semantics using the low-level string operations supported by CVC4:

[Code listing not preserved. Callout: Dispatch by version]
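
To give a flavor of what "dispatch by version" means for a library function, here is a hedged sketch built on the `bytes.fromhex()` change quoted earlier. It is a concrete Python model written for this transcript, not the lost listing and not a CVC4 encoding; `model_fromhex` and its `version` parameter are hypothetical, and the error message is not meant to match CPython's.

```python
# Hypothetical model of bytes.fromhex() whose behavior depends on the
# targeted Python version. Names and structure are illustrative only.

SPACES_ONLY = " "
ASCII_WHITESPACE = " \t\n\r\x0b\x0c"
HEX_DIGITS = "0123456789abcdefABCDEF"


def model_fromhex(text, version):
    """Decode hex pairs, skipping the separators allowed by `version`."""
    # Dispatch by version: 3.7 widened the set of ignored characters.
    ignored = ASCII_WHITESPACE if version >= (3, 7) else SPACES_ONLY
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] in ignored:
            i += 1
            continue
        pair = text[i:i + 2]
        if len(pair) < 2 or any(c not in HEX_DIGITS for c in pair):
            raise ValueError(f"non-hexadecimal input at position {i}")
        out.append(int(pair, 16))
        i += 2
    return bytes(out)


print(model_fromhex("de ad\nbe\tef", (3, 7)))  # b'\xde\xad\xbe\xef'

try:
    model_fromhex("de ad\nbe\tef", (3, 6))
except ValueError as exc:
    print("3.6 model rejects the input:", exc)
```

The same input is accepted or rejected depending on the version being modeled, which is why a model written for one release can silently diverge from the interpreter on another.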


To Build a DSE for Python
2.  Reason with Python data types

Python has many built-in types:
•  Numeric: int, float, complex
•  Sequence: list, tuple, range
•  Text sequence: str
•  Set types: set, frozenset
•  Map types: dict
•  …

Reasoning w/ strings is a significant 4-year NSF project


To Build a DSE for Python
2.  Reason with Python data types

String reasoning is much harder than numerical reasoning:
•  Give me a string that (i) does not start with “e”, (ii) has length 8, (iii) contains “eric” and “mark” as subsequences, and (iv) does not contain more than one “r”
  –  There are lots of strings with length 8: 2^(8*8) = 2^64, assuming 8-bit characters
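
The four conditions are easy to check for a given string but expensive to search for blindly. The sketch below writes the conditions as a Python predicate and searches only a tiny, hand-picked slice of the space: orderings of the seven required letters plus one filler character "x". That restriction is an assumption made to keep the example fast; a string solver has no such hint.

```python
from itertools import permutations


def is_subsequence(pattern, s):
    """True if pattern's characters appear in s in order, gaps allowed."""
    it = iter(s)
    return all(ch in it for ch in pattern)


def satisfies(s):
    return (
        not s.startswith("e")          # (i)
        and len(s) == 8                # (ii)
        and is_subsequence("eric", s)  # (iii)
        and is_subsequence("mark", s)  # (iii)
        and s.count("r") <= 1          # (iv)
    )


def search():
    # Restricted search space: 8! = 40320 orderings of "emarickx",
    # versus 2**64 strings of eight 8-bit characters overall.
    for perm in permutations("emarickx"):
        candidate = "".join(perm)
        if satisfies(candidate):
            return candidate
    return None


witness = search()
print(witness, satisfies(witness))
```

Note that the two obvious concatenations fail: "ericmark" starts with "e", and "markeric" contains two "r"s, so any solution has to share a single "r" between both subsequences.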


To Build a DSE for Python
2.  Reason with Python data types

This is a fragment of the CVC4 logic for string reasoning:

[Figure not preserved: fragment of the CVC4 string logic]

This is why we have world-class logicians in our team!


To Build a DSE for Python
2.  Reason with Python data types

This is how we simplify logic statements in this fragment:

[Figure not preserved: simplification rules for the string fragment]

This is why we have world-class logicians in our team!


Advances in CVC4: 2017 Edition
New simplifier in ~2000 lines of C++, plus many more improvements in the rest of the solver:

A. Reynolds, M. Woo, C. Barrett, D. Brumley, T. Liang, and C. Tinelli, “Scaling Up DPLL(T) String Solvers Using Context-Dependent Simplification,” in Proceedings of the 29th International Conference on Computer Aided Verification, Springer, 2017, pp. 453–474.

[Results chart not preserved. Callouts:]
•  Previous state-of-the-art in literature
•  New CVC4 is way better
•  Can still take hours


High Priority TODO: Logic
•  “High-Fidelity Python” regex
  –  CVC4 has “textbook regex”: +, |, *
  –  Python regex supports backreferences + named groups (easy) and non-greedy captures (hard, or is it?)
•  Symbolic dictionary
  –  A dictionary is a function mapping keys to values
  –  Efficient model-finding when keys and values are concrete: see papers on function synthesis
  –  When the keys and values can be symbolically specified, it stresses solvers “to go where no one has gone before”
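
For readers less familiar with the Python regex features named in the first bullet, the following short example shows them using the standard `re` module. The patterns and inputs are made up for illustration.

```python
import re

# Backreference via a named group: the same word must appear twice.
# The language {w-w} is not regular, so the textbook operators
# (+, |, *) alone cannot describe it.
doubled = re.compile(r"(?P<w>[ab]+)-(?P=w)")
print(bool(doubled.fullmatch("ab-ab")))  # True
print(bool(doubled.fullmatch("ab-ba")))  # False

# Greedy versus non-greedy capture on the same input.
greedy = re.match(r"<(.+)>", "<a><b>").group(1)
lazy = re.match(r"<(.+?)>", "<a><b>").group(1)
print(greedy)  # a><b
print(lazy)    # a
```

The greedy and non-greedy patterns accept the same strings here but capture different substrings, so a symbolic model has to track what each group captures and not merely whether the match succeeds.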


High Priority TODO: Systems
•  Effective horizontal scaling
  –  Ball-Daniel 2014 is a prototype designed for teaching at a summer school
  –  Real challenge with DSE is scaling across multiple machines in a cluster/datacenter (many PhD theses have yet to be written)
•  Integration with PyCharm/Emacs/Vim
  –  Match the real IntelliTest experience
  –  Persist generated test cases for the developer


Call to Action
•  If you believe Dynamic Symbolic Execution for Python is worth pursuing, please talk with Michael Lisanti or me
  –  Partners at 100K or above get to pick their favorite projects to support
  –  We acknowledge your organization in the paper and in the conference presentation: exposure ==> advantages in hiring!
  –  We can integrate your engineers into our tool study
