A Symbolic Execution Framework for JavaScriptdawnsong/papers/2010 symbolic execution.pdf · DOI 10.1109/SP.2010.38 513 Authorized licensed use limited to: Univ of Calif Berkeley

A Symbolic Execution Framework for JavaScript

Prateek Saxena, Devdatta Akhawe, Steve Hanna, Feng Mao, Stephen McCamant, Dawn SongComputer Science Division, EECS Department

University of California, Berkeley{prateeks, devdatta, sch, fmao, smcc, dawnsong}@cs.berkeley.edu

Abstract—As AJAX applications gain popularity, client-sideJavaScript code is becoming increasingly complex. However,few automated vulnerability analysis tools for JavaScript exist.In this paper, we describe the first system for exploring theexecution space of JavaScript code using symbolic execution.To handle JavaScript code’s complex use of string operations,we design a new language of string constraints and implementa solver for it. We build an automatic end-to-end tool, Kudzu,and apply it to the problem of finding client-side code injectionvulnerabilities. In experiments on 18 live web applications,Kudzu automatically discovers 2 previously unknown vulner-abilities and 9 more that were previously found only with amanually-constructed test suite.

Keywords-web security; symbolic execution; string decisionprocedures

I. INTRODUCTION

Rich web applications have a significant fraction of theircode written in client-side scripting languages, such asJavaScript. As an increasing fraction of code is found onthe client, client-side security vulnerabilities (such as client-side code injection [18], [22], [24], [25]) are becominga prominent threat. However, a majority of the researchon web vulnerabilities so far has focused on server-sideapplication code written in PHP and Java. There is agrowing need for powerful analysis tools for the client-sidecomponents of web applications. This paper presents thefirst techniques and system for automatically exploring theexecution space of client-side JavaScript code. To explorethis execution space, our techniques generate new inputsto cover a program’s value space using dynamic symbolicexecution of JavaScript, and to cover its event space byautomatic GUI exploration.Dynamic symbolic execution for JavaScript has numerous

applications in web security. In this paper we focus on oneof these applications: automatically finding client-side codeinjection vulnerabilities. A client-side code injection attackoccurs when client-side code passes untrusted input to adynamic code evaluation construct, without proper validationor sanitization, allowing an attacker to inject JavaScript codethat runs with the privileges of a web application.JavaScript execution space exploration is challenging for

many reasons. In particular, JavaScript applications acceptmany kinds of input, and those inputs are structured justas strings. For instance, a typical application might takeuser input from form fields, messages from its server via

XMLHttpRequest, and data from code running concur-rently in other browser windows. Each kind of input stringhas its own format, so developers use a combination of cus-tom routines and third-party libraries to parse and validatethe inputs they receive. To effectively explore a program’sexecution space, a tool must be able to supply values for allof these different kinds of inputs and reason about how theyare parsed and validated.

Approach. In this paper, we develop the first com-plete symbolic-execution based framework for client-sideJavaScript code analysis. We build an automated, stand-alone tool that, given a URL for a web application, automat-ically generates high-coverage test cases to systematicallyexplore its execution space. Automatically reasoning aboutthe operations we see in real JavaScript applications requiresa powerful constraint solver, especially for the theory ofstrings. However, the power needed to express the semanticsof JavaScript operations is beyond what existing stringconstraint solvers [14], [16] offer. As a central contributionof this work, we overcome this difficulty by proposinga constraint language and building a practical solver thatsupports the specification of boolean, machine integer (bit-vector), and string constraints, including regular expressions,over multiple variable-length string inputs. This language’srich support for string operations is crucial for reasoningabout the parsing and validation checks that JavaScriptapplications perform.To show the practicality of our constraint language, we

detail a translation from the most commonly used JavaScriptstring operations to our constraints. This translation alsoharnesses concrete information from a dynamic executionof the program in a way that allows the analysis to scale.We analyze the theoretical expressiveness of the theory ofstrings supported by our language (including in comparisonto existing constraint solvers), and bound its computationalcomplexity. We then give a sound and complete decisionprocedure for the bounded-length version of the constraintlanguage. We develop an end-to-end system, called Kudzu,that performs symbolic execution with this constraint solverat its core.

End-to-end system. We identify further challenges in build-ing an end-to-end automated tool for rich web applications.For instance, because JavaScript code interacts closely with a

2010 IEEE Symposium on Security and Privacy

1081-6011/10 $26.00 © 2010 IEEE

DOI 10.1109/SP.2010.38

513

Authorized licensed use limited to: Univ of Calif Berkeley. Downloaded on August 09,2010 at 20:12:15 UTC from IEEE Xplore. Restrictions apply.

user interface, its input space can be divided into two classes,the events space and the value space. The former includesthe state (check boxes, list selections) and sequence ofactions of user-interface elements, while the latter includesthe contents of external inputs. These kinds of input jointlydetermine the code’s behavior, but they are suited to differ-ent exploration techniques. Kudzu uses GUI exploration toexplore the event space, and symbolic execution to explorethe value space.We evaluate Kudzu’s end-to-end effectiveness by applying

it to a collection of 18 JavaScript applications. The resultsshow that Kudzu is effective at getting good coverageby discovering new execution paths, and it automaticallydiscovers 2 previously-unknown vulnerabilities, as well as 9client-side code injection vulnerabilities that were previouslyfound only with a manually-created test suite.

Contributions. In summary, this paper makes the followingmain contributions:

• We identify the limitations of previous string constraintlanguages that make them insufficient for parsing-heavyJavaScript code, and design a new constraint languageto resolve those limitations. (Section IV)

• We design and implement a practical decision proce-dure for this constraint language. (Section V)

• We build the first symbolic execution engine forJavaScript, using our constraint solver. (Sections III andVI)

• Combining symbolic execution of JavaScript with au-tomatic GUI exploration and other needed components,we build the first end-to-end automated system forexploration of client-side JavaScript. (Section III)

• We demonstrate the practical use of our implementationby applying it to automatically discovering 11 client-side code injection vulnerabilities, including two thatwere previously unknown. (Section VII)

II. PROBLEM STATEMENT AND OVERVIEW

In this section we state the problem we focus on, exploringthe execution space of JavaScript applications; describeone of its applications, finding client-side code injectionvulnerabilities; and give an overview of our approach.

Problem statement. We develop techniques to systemati-cally explore the execution space of JavaScript applicationcode.JavaScript applications often take many kinds of input.

We view the input space of a JavaScript program as splitinto two categories: the event space and the value space.

• Event space. Rich web applications typically define tensto hundreds of JavaScript event handlers, which mayexecute in any order as a result of user actions suchas clicking buttons or submitting forms. Event handlercode may check the state of GUI elements (such ascheck-boxes or selection lists). The ordering of events

and the state of the GUI elements together affects thebehavior of the application code.

• Value space. The values of inputs supplied to a programalso determine its behavior. JavaScript has numerousinterfaces through which input is received:

– User data. Form fields, text areas, and so on.– URL and cross-window communication abstrac-

tions. Web principals hosted in other windowsor frames can communicate with JavaScript codevia inter-frame communication abstractions such asURL fragment identifiers and HTML 5’s proposedpostMessage, or via URL parameters.

– HTTP channels. Client-side JavaScript code canexchange data with its originating web server usingXMLHttpRequest, HTTP cookies, or additionalHTTP GET or POST requests.

This paper primarily focuses on techniques to systemat-ically explore the value space using symbolic execution ofJavaScript, with the goal of generating inputs that exercisenew program paths. However, automatically exploring theevent space is also required to achieve good coverage. Todemonstrate the efficacy of our techniques in an end-to-endsystem, we combine symbolic execution of JavaScript for thevalue space with a GUI exploration technique for the eventspace. This full system is able to automatically explore thecombined input space of client-side web application code.

Application: finding client-side code injection vulnerabil-ities. Exploring a program’s execution space has a number ofapplications in the security of client-side web applications.In this paper, we focus specifically on one security applica-tion, finding client-side code injection vulnerabilities.

Client-side code injection attacks, which are sometimesreferred to as DOM-based XSS, occur when client-sidecode uses untrusted input data in dynamic code evaluationconstructs without sufficient validation. Like reflected orstored XSS attacks, client-side code injection vulnerabilitiescan be used to inject script code chosen by an attacker,giving the attacker the full privileges of the web application.We call the program input that supplies the data for an attackthe untrusted source, and the potentially vulnerable codeevaluation construct the critical sink. Examples of criticalsinks include eval, and HTML creation interfaces likedocument.write and .innerHTML.In our threat model, we treat all URLs and cross-window

communication abstractions as untrusted sources, as such in-puts may be controlled by an untrusted web principal. In ad-dition, we also treat user data as an untrusted source becausewe aim to find cases where user data may be interpreted ascode. The severity of attacks from user-data on client-side isoften less severe than a remote XSS attack, but developerstend to fix these and Kudzu takes a conservative approach ofreporting them. HTTP channels such as XMLHttpRequestare currently restricted to communicating with a web server

514


from the same domain as the client application, so wedo not treat them as untrusted sources. Developers maywish to treat HTTP channels as untrusted in the futurewhen determining susceptibility to cross-channel scriptingattacks [5], or when enhanced abstractions (such as theproposed cross-origin XMLHttpRequest [27]) allow cross-domain HTTP communication directly from JavaScript.To effectively find XSS vulnerabilities, we require two

capabilities: (a) generating directed test cases that explorethe execution space of the program, and (b) checking, ona given execution path, whether the program validates alluntrusted data sufficiently before using it in a critical sink.Custom validation checks and parsing routines are the normrather than the exception in JavaScript applications, so ourtool must check the behavior of validation rather than simplyconfirming that it is performed.In previous work, we developed a tool called FLAX which

employs taint-guided fuzzing for finding client-side codeinjection attacks [24]. However, FLAX relies on an external,manually developed test harness to explore the path space.Kudzu, in contrast, automatically generates a test suite thatexplores the execution space systematically. Kudzu also usessymbolic reasoning (with its constraint solver) to check if thevalidation logic employed by the application is sufficient toblock malicious inputs — this is a one-step mechanism fordirected exploit generation as opposed to multiple roundsof undirected fuzzing employed in FLAX. Static analysistechniques have also been employed for JavaScript [12]to reason about multiple paths, but can suffer from falsepositives and do not produce test inputs or attack instances.Symbolic analyses and model-checking have been used forserver-side code [2], [19]; however, the complexity of pathconditions we observe requires more expressive symbolicreasoning than supported by tools for server-side code.

Approach Overview. The value space and event space ofa web application are two different components of its inputspace: code reachable by exploring one part of the inputspace may not be reachable by exploring the other com-ponent alone. For instance, exploring the GUI event spaceresults in discovering new views of the web application, butthis does not directly affect the coverage that can be achievedby systematically exploring all the paths in the code imple-menting each view. Conversely, maximizing path coverage isunlikely to discover functionality of the application that onlyhappens when the user explores a different application view.Therefore, Kudzu employs different techniques to exploreeach part of the input space independently.Value space exploration. To systematically explore differ-

ent execution paths, we develop a component that performsdynamic symbolic execution of JavaScript code, and a newconstraint solver that offers the desired expressiveness forautomatic symbolic reasoning.In dynamic symbolic execution, certain inputs are treated

as symbolic variables. Dynamic symbolic execution differsfrom normal execution in that while many variable have theirusual (concrete) values, like 5 for an integer variable, thevalues of other variables which depend on symbolic inputsare represented by symbolic formulas over the symbolicinputs, like input

1+ 5. Whenever any of the operands of a

JavaScript operation is symbolic, the operation is simulatedby creating a formula for the result of the operation in termsof the formulas for the operands. When a symbolic valuepropagates to the condition of a branch, Kudzu can use itsconstraint solver to search for an input to the program thatwould cause the branch to make the opposite choice.Event space exploration. As a component of Kudzu we

develop a GUI explorer that searches the space of all eventsequences using a random exploration strategy. Kudzu’s GUIexplorer component randomly selects an ordering among theuser events registered by the web page, and automaticallyfires these events using an instrumented version of the webbrowser. Kudzu also has an input-feedback component thatcan replay the sequence of GUI events explored in anygiven run, along with feeding new values generated by theconstraint solver to the application’s data inputs.Testing for client-side code injection vulnerabilities. For

each input explored, Kudzu determines whether there is aflow of data from an untrusted data source to a criticalsink. If it finds one, it seeks to determine whether theprogram sanitizes and/or validates the input correctly toprevent attackers from injecting dangerous elements intothe critical sink. Specifically, it attempts to prove that thevalidation is insufficient by constructing an attack input. Aswe will describe in more detail in Section III-B, it combinesthe results of symbolic execution with a specification forattacks to create a constraint solver query. If the constraintsolver finds a solution to the query, it represents an attackthat can reach the critical sink and exploit a client-side codeinjection vulnerability.

III. END-TO-END SYSTEM DESIGN

This section describes the various components that worktogether to make a complete Kudzu-based vulnerability-discovery system work. The full explanation of the constraintsolver is in Sections IV through VI. For reference, therelationships between the components are summarized inFigure 1.

A. System Components

First, we discuss the core components that would be usedin any application of Kudzu: the GUI explorer that generatesinput events to explore the event space, the dynamic symbolicinterpreter that performs symbolic execution of JavaScript,the path constraint extractor that builds queries based onthe results of symbolic execution, the constraint solver thatfinds satisfying assignments to those queries, and the input

515


Figure 1: Architecture diagram for Kudzu. The components drawn in the dashed box perform functions specific to ourapplication of finding client-side code injection. The remaining components are application-agnostic. Components shaded inlight gray are the core contribution of this paper.

feedback component that uses the results from the constraintsolver as new program inputs.

The GUI explorer. The first step in automating JavaScriptapplication analysis is exploring the event space of userinteractions. Each event corresponds to a user interactionsuch as clicking a check-box or a button, setting focuson a field, adding data to data fields, clicking a link,and so on. Kudzu currently explores the space of all se-quences of events using a random exploration strategy. Oneof the challenges is to comprehensively detect all eventsthat could result in JavaScript code execution. To addressthis, Kudzu instruments the browser functions that processHTML elements on the current web page to record when anevent handler is created or destroyed. Kudzu’s GUI explorercomponent randomly selects an ordering among the userevents registered by the web page and executes them1. Therandom seed can be controlled to replay the same orderingof events. While invoking handlers, the GUI component alsogenerates (benign) random test strings to fill text fields.(Later, symbolic execution will generate new input valuesfor these fields to explore the input space further.) Linksthat navigate the page away from the application’s domainare cancelled, thereby constraining the testing to a singleapplication domain at a time. In the future, we plan toinvestigate alternative strategies to prioritize the executionof events discovered as well.

Dynamic symbolic interpreter. Kudzu performs dynamicsymbolic execution by first recording an execution of theprogram with concrete inputs, and then symbolically in-terpreting the recorded execution in a dynamic symbolic

1Invoking an event handler may invalidate another handler (for instance,when the page navigates as a result). In that case, the invalidated handlersare ignored and if new handlers are created by the event that causesinvalidation, these events are explored subsequently.

interpreter. For recording an execution trace, Kudzu employsan existing instrumentation component [24] implemented inthe web browser’s JavaScript interpreter. For each JavaScriptbytecode instruction executed, it records the semantics ofthe operation, its operands and operand values in a sim-plified intermediate language called JASIL [24]. The setof JavaScript operations captured includes all operationson integers, booleans, strings, arrays, as well as control-flow decisions, object types, and calls to browser-nativemethods. For the second step, dynamic symbolic execution,we have developed from scratch a symbolic interpreter forthe recorded JASIL instructions.Symbolic inputs for Kudzu are configurable to match the

needs of an application. For instance, in the application weconsider, detecting client-side code injection, all URL data,data received over cross-window communication abstrac-tions, and user data fields are marked symbolic. Symbolicinputs may be strings, integers, or booleans. Symbolic execu-tion proceeds on the JASIL instructions in the order they arerecorded in the execution trace. At any point during dynamicsymbolic execution, a given operand is either symbolic orconcrete. If the operand is symbolic, it is associated with asymbolic value; otherwise, its value is purely concrete and isstored in the dynamic execution trace. When interpreting aJASIL operation in the dynamic symbolic interpreter, theoperation is symbolically executed if one or more of itsinput operands is symbolic. Otherwise the operation of thesymbolic interpreter on concrete values would be exactly thesame as the real JavaScript interpreter, so we simply reusethe concrete results already stored in the execution trace.The symbolic value of an operand is a formula that

represents its computation from the symbolic inputs. Forinstance, for the JASIL assignment operation y := x, if xis symbolic (say, with the value input

1+ 5), then symbolic

execution of the operation copies this value to y, giving

516


y the same symbolic value. For an arithmetic operation,say y := x1 + x2 where x1 is symbolic (say with valueinput

2+ 3) and x2 is not (say with the concrete value 7),

the symbolic value for y is the formula representing thesum (input

2+10). Operations over strings and booleans are

treated in the same way, generating formulas that involvestring operations like match and boolean operations likeand. At this point, string operations are treated simplyas uninterpreted functions. During the symbolic execution,whenever the symbolic interpreter encounters an operationoutside the supported formula grammar, it forces the desti-nation operand to be concrete. For instance, if the operationis x = parseFloat(s) for a symbolic string s, x ands can be replaced with their concrete values from the trace(say, 4.3 and “4.3”). This allows symbolic computation tocontinue for other values in the execution.

Path constraint extractor. The execution trace recordseach control-flow branch (e.g., if statement) encounteredduring execution, along with the concrete value (true orfalse) representing whether the branch was taken. Duringsymbolic execution, the corresponding branch condition isrecorded by the path constraint extractor if it is symbolic.As execution continues, the formula formed by conjoiningthe symbolic branch conditions (negating the conditions ofbranches that were not taken) is called the path constraint. Ifan input value satisfies the path constraint, then the programexecution on that input will follow the same execution path.To explore a different execution path, Kudzu selects a

branch on the execution path and builds a modified pathconstraint that is the same up to that branch, but that hasthe negation of that branch condition (later conditions fromthe original branch are omitted). An input that satisfiesthis condition will execute along the same path up to theselected branch, and then explore the opposite alternative.There are several strategies for picking the order in whichbranch conditions can be negated — Kudzu currently usesa generational search strategy [11].

Constraint solver.Most symbolic execution tools in the pasthave relied on an existing constraint solver. However, Kudzugenerates a rich set of constraints over string, integer andboolean variables for which existing off-the-shelf solversare not powerful enough. Therefore, we have built a newsolver for our constraints (we present the algorithm anddesign details in Section V). In designing this component,we examined the symbolic constraints Kudzu generatesin practice. From the string constraints arising in these,we distilled a set of primitive operations required in acore constraint language. (This core language is detailedin Section IV, while the solver’s full interface is given inSection VI.) We justify our intuition that solving the coreconstraints is sufficient to model JavaScript string operationsin Section VI, where we show a practical translation ofJavaScript string operations into our constraint language.

Input feedback. Solving the path constraint formula usingthe solver creates a new input that explores a new programpath. These newly generated inputs must be fed back tothe JavaScript program: for instance simulated user inputsmust go in their text fields, and GUI events should bereplayed in the same sequence as on the original run. Theinput feedback component is responsible for this task. As aparticular HTML element (e.g a text field) in a documentis likely allocated a different memory address on everyexecution, the input feedback component uses XPath [29]and DOM identifiers to uniquely identify HTML elementsacross executions and feed appropriate values into them. Ifan input comes from an attribute for a DOM object, the inputfeedback component sets that attribute when the object iscreated. If the input comes via a property of an event thatis generated by the browser when handling cross-windowcommunication, such as the origin and data propertiesof a postMessage event, the component updates thatproperty when the JavaScript engine accesses it. Kudzuinstruments the web browser to determine the context ofaccesses, to distinguish between accesses coming from theJavaScript engine and accesses coming from the browsercore or instrumentation code.

B. Application-specific components

Next, we discuss three components that are specialized forthe task of finding client-side code injection vulnerabilities: asink-source identification component that determines whichcritical sinks might receive untrusted input, a vulnerabilitycondition extractor that captures domain knowledge aboutclient-side code injection attacks, and the attack verificationcomponent that checks whether inputs generated by the toolin fact represent exploits.

Sink-source identification. To identify if external inputsare used in critical sink operations such as eval ordocument.write, we perform a dynamic data flow anal-ysis on the execution trace. As outlined earlier, we treat allURL data, data received over cross-window communicationabstractions (such as postMessage), and data filled intouser data fields as potentially untrusted. The data flowanalysis is similar to a dynamic taint analysis. Any executiontrace that reveals a flow of data to a critical sink is subjectto further symbolic analysis for exploit generation. Weuse an existing framework, FLAX, for this instrumentationand taint-tracking [24] in a manner that is faithful to theimplementation of JavaScript in the WebKit interpreter.

Vulnerability condition extractor. An input represents anattack against a program if it passes the program’s validationchecks, but nonetheless implements the attacker’s goals (i.e.,causes a client-side code injection attack) when it reaches acritical sink. The vulnerability condition extractor collectsfrom the symbolic interpreter a formula representing the(possibly transformed) value used at a critical sink, and

517


combines it with the path constraint to create a formuladescribing the program’s validation of the input.2. To de-termine whether this value constitutes an attack, the vulner-ability condition extractor applies a sink-specific vulnera-bility condition specification, which is a (regular) grammarencoding a set of strings that would constitute an attackagainst a particular sink. This specification is conjoined withthe formula representing the transformed input to create aformula representing values that are still dangerous after thetransformation.For instance, in the case of the eval sink, the vulnera-

bility specification asserts that a valid attack should be zeroor more statements each terminated by a ‘;’, followed bythe payload. Such grammars can be constructed by usingpublicly available attack patterns [13]. The tool’s attackgrammars are currently simple and can be extended easilyfor comprehensiveness and to incorporate new attacks.To search only for realistic attacks, the specification also

incorporates domain knowledge about the possible valuesof certain inputs. For instance, when a string variable cor-responds to the web URL for the application, we assert thatthe string starts with the same domain as the application.

Attack verification. Kudzu automatically tests the exploitinstance by feeding the input back to the application, andchecking if the attack payload (such as a script with an alertmessage) is executed. If this verification fails, Kudzu doesnot report an alarm.

IV. CORE CONSTRAINT LANGUAGE

In order to support a rich language of input constraintswith a simple solving back end, we have designed anintermediate form we call the core constraint language.This language is rich enough to express constraints fromJavaScript programs, but simple enough to make solving theconstraints efficient. In this section we define the constraintlanguage, analyze its expressiveness and the theoreticalcomplexity of deciding it, and compare its expressivenessto the core languages of previous solvers.

A. Language Definition

The abstract syntax for our core constraint language isshown in Figure 2. A formula in the language is an ar-bitrary boolean combination of constraints. Variables whichrepresent strings may appear in five types of constraints. Thefirst three constraint types indicate that a string is a memberof the language defined by a regular expression, that twostrings are equal, or one string is equal to the concatenationof two other strings. The two remaining constraints relatethe length of one string to a constant natural number, or tothe length of another string, by any of the usual equality or

2Sanitization for critical client-side sink operations may happen on theserver side (when data is sent back via XMLHttpRequest). Our imple-mentation handles this by recognizing such transformations using approxi-mate tainting techniques [24] for data transmitted over XMLHttpRequest

Formula ::= ¬Formula| Formula ∧ Formula| Constraint

Constraint ::= Var ∈ RegExp| Var = Var| Var = Var ◦ Var| length(Var) Rel Number| length(Var) Rel length(Var)

RegExp ::= Character| ε

| RegExp RegExp| RegExp|RegExp| RexExp*

Rel ::= < | ≤ | = | ≥ | >

Figure 2: Abstract grammar of the core constraint language.

ordering operations. Regular expressions are formed fromcharacters or the empty string (denoted by ε) via the usualoperations of concatenation (represented by juxtaposition),alternation (|), and repetition zero or more times (Kleenestar *).

The constraints all have their usual meanings. Any numberof variables may be introduced, and Characters are drawnfrom an arbitrary non-empty alphabet, but Numbers must benon-negative integers. For present purposes, strings may beof unbounded length, though we will introduce upper boundson their lengths later.

B. Expressiveness and Complexity

Though the core constraint language is intentionally small,it is not minimal: some types of constraints are included forthe convenience of translating to and from the core language,but do not fundamentally increase its expressiveness. Stringequality, comparisons between lengths and constants, andinequality comparisons between lengths can be expressedusing concatenation, regular expressions, and equality be-tween string lengths respectively; the details are omitted forspace.

Each of the remaining constraint types (regular expres-sion membership, concatenation, and length) makes its owncontribution to the expressiveness of the core constraints.Appendix A gives examples of the sets of strings that eachconstraint type can uniquely define. The core constraint lan-guage is expressive enough that the complexity of decidingit is not known precisely; it is at least PSPACE-hard. Theserelationships are summarized in Figure 3. The complexity ofour core constraint language falls to NP-complete when thelengths of string variables are bounded, as they are in ourimplementation. Further details are in the extended versionof this paper [23].

518


Figure 3: Relations between the unbounded versions ofseveral theories of strings. Theories higher in the graph arestrictly more expressive but are also at least as complexto decide. Kudzu’s core constraint language (shaded) isstrictly more expressive than either the core language ofHAMPI [16] or the theory of word equations and an equallength predicate (the “pure library language” of [4]).

C. Expressiveness Comparison

Our system’s core constraint language is more expressivethan the constraints used in similar previous systems, andthis expressiveness is key in allowing it to handle a morecomplex class of applications.Bjørner et al. [4] present a “pure library language” that,

like our core constraint language, includes word equationsand the ability to assert that two strings have the same length,so like our language its decidability is open. However,their language does not include regular expressions. Regularexpressions may be less common in the .NET applicationsBjørner et al. study, but they are used ubiquitously inJavaScript, so regular expression support is mandatory in ourdomain. Similarly the work of Caballero et al. [3], [7] dealswith programs compiled from C-family languages, whosestring operations are much more limited.The DPRLE tool [14] focuses on a class of constraints that

combine concatenation and regular expression matching, butin a more limited way than our tool supports. DPRLE ad-dresses a different problem domain, since it gives solutionsfor constraints over languages (potentially infinite sets ofstrings) rather than single strings, but this makes the tasksignificantly more difficult. We were unable to express theconstraints from our application in DPRLE’s input format orany straightforward extension of it. For instance, there is noway to express the constraint that two language expressionsshould be equal, not surprising since such constraints ingeneral are undecidable [8].HAMPI [16] provides support for regular expression

constraints (and in fact we build on its implementation for

this feature), but its support for other constraints is limited,particularly by the fact that it supports only a single stringvariable. The variable can be concatenated with constantstrings, but these string expressions cannot be compared witheach other, only with regular expressions, so HAMPI lacksthe full generality of word equations. For instance, HAMPIconstraints cannot define the set {uv#u#v : u, v ∈ {0, 1}∗}.

It is worth reemphasizing that these limitations are not justtheoretical: they make these previous systems unsuitable forour applications. One of the most common operations inthe programs we examine is to parse a single input string(such as a URL) into separate input variables using splitor repeated regular expression matching. Representing thesemantics of such an operation requires relating the contentsof one string variable to those of another, something thatneither DPRLE nor HAMPI supports.

V. CORE CONSTRAINT SOLVING APPROACH

In this section, we explain our algorithm for solving thecore set of constraints. We introduce a bounded versionof the constraints where we assume a user-supplied upperbound k on the length of the variables. This allows us toemploy a SAT-based solution strategy without reducing thepractical expressiveness of the language.The algorithm satisfies three important properties, whose

informal proof appears in the extended version of thispaper [23]:

1) Soundness. Whenever the algorithm terminates with anassignment of values to string variables, the solutionis consistent with the input constraints.

2) Bound-k completeness. If there exists a solution forthe string variables where all strings have length k orless, then our algorithm finds one such solution.

3) Termination. The algorithm requires only a finite num-ber of steps (a function of the bound) for any input.

The solver translates constraints on the contents of stringsinto bit-vector constraints that can be solved with a SAT-based SMT solver. For this purpose, the solver translateseach input string into a sequence of n-bit integers (n = 8 inthe current implementation). Each string variable S also hasan associated integer variable LS representing its length. Asingle string is converted to a bit-vector by concatenatingthe binary representations of each character. Then, the bit-vectors representing each string are themselves concatenatedinto a single long bit-vector. (The order in which thestrings are concatenated into the long vector reflects theconcatenation constraints, as detailed in step 1 below.) Thesolver passes the constraints over this bit vector to a SMT(satisfiability modulo theories) decision procedure for thetheory of bit vectors, STP [10] in our implementation.Informally, it is convenient to refer to the combined bitvector as if it were an array indexed by character offsets, butwe do not use STP’s theory of arrays, and character offsets

519


Input: C : constraint listOutput: (IsSat : bool, Solutions : string list)G← BuildConcatGraph(C);(C′, StrOrderMap)← DecideOrder(G);C ← C ∪ C′;FailLenDB : length assignment list← ∅;while true do

(X, lengths)← SolveLengths(C,FailLenDB);if (X = UNSAT) then

print “Unsatisfiable”;halt(false, ∅);

endFinal : bitvector constraints;Final← CreateBVConstraints(StrOrderMap, C, lengths);(Result,BVSolutions)← BVSolver(Final);if (Result = SAT) then

print “Satisfiable”;printSolutions(BVSolutions, lengths, StrOrderMap);halt(true,BVsToStrings(BVSolutions));

endelse

FailLenDB← FailLenDB ∪ lengths;end

end

Figure 4: Algorithm for solving the core constraints.

are multiplied by n to give bit offsets before producing thefinal constraints.Our algorithm is shown in Figure 4. At a high level,

it has three steps. First, it translates string concatenationconstraints into a layout of variables (with overlap) in thefinal character array mentioned above. Second, it extractsinteger constraints on the lengths of strings and finds asatisfying length assignment using the SMT solver. Finally,given a position and length for each string, the solvertranslates the constraints on the contents of each string intobit-vector constraints and checks if they are satisfiable.In general, because of the interaction of length constraints

and regular expressions, the length assignment chosen in step2 might not correspond to satisfiable contents constraints,even when a different length assignment would. So if step 3fails to find a satisfying assignment, the algorithm returns tostep 2 to generate a new length assignment (distinct from anytried previously). Steps 2 and 3 repeat until the solver findsa satisfying assignment, or it has tried all possible lengthassignments (up to the length bound k).

Step 1: Translating concatenation constraints.. The intu-ition behind Kudzu’s handling of concatenation constraintsis that for a constraint S1 = S2 ◦ S3, it would be sufficientto ensure that S2 comes immediately before S3 in the finalcharacter array, and to lay out S1 as overlapping with S2

and S3 (so that S1 begins at the same character as S2 andends at the same character as S3). This overlapping layoutalso has the advantage of reducing the total length of bit-vectors required. Each concatenation constraint suggests anordering relation among the string variables, but it mightnot be possible to satisfy all such ordering constraints

S1

S2 S3

S4 S5

S6

S7

L

L L

R

R R(1,1)

(2,2) (3,3) (4,4)

(3,4)(2,3)

(1,3)

S1 = S2 . S3S3 = S4 . S5S6 = S5 . S7

INPUT CONCAT CONSTRAINTS

Figure 5: A sample concat graph for a set of concatenationconstraints. The relative ordering of the strings in the finalcharacter array is shown as start and end positions inparentheses alongside each node.

simultaneously.To systematically choose an ordering, the solver builds

a graph of concatenation constraints (a concat graph forshort). The graph has a node for each string variable, and foreach constraint S1 = S2◦S3, S2 and S3 are the left and rightchildren (respectively) of S1. An example of such a graph isshown in Figure 5. Without loss of generality, we can assumethat the graph is acyclic: if there is a cycle from S1 to S2

to S3 . . . back to S1, then S1= S2 ◦ S3 ◦ · · · ◦ S1 (or someother order), so all the variables other than S1 must be theempty string, and can be removed from the constraints. (Inour applications the constraints will in any case be acyclic byconstruction.) Given this graph, the algorithm then choosesthe relative ordering of the strings in the character array byassigning start and end positions to each node with a post-order traversal of the graph. (In Figure 5, these positions areshown in parentheses next to each node.)For the layout generated by the algorithm to be correct, the

concat graph must be a DAG in which each internal nodehas exactly two children, and those children are adjacentin the layout. (This implies that the graph is planar.) Thegraph may not have these properties at construction; forinstance, Figure 6 gives a set of example constraints withcontradictory ordering requirements: S2 cannot be simulta-neously to the left and to the right of S3. The algorithmresolves such requirements by duplicating a subtree of thegraph (for instance as shown in the right half of Figure 6).To maintain the correct semantics, the algorithm adds stringequality constraints to ensure that any duplicated stringshave the same contents as the originals. The algorithmperforms duplications to ensure that the graph satisfies thecorrectness invariant, but our current algorithm does notattempt to perform the minimal number of copies (forinstance, in Figure 6 it would suffice to copy either onlyS2 or only S3), which in our experience has not hurt thesolver’s performance.

Step 2: Finding a satisfiable length assignment. Eachstring variable S has an associated length variable LS . Eachcore string constraint implies a corresponding constrainton the lengths of the strings, as detailed in Table I. For

520


S1

S2 S3

S4L LR R

S1

S2 S3

S4

L LR R

S3_COPY S2_COPY

COPY CREATION

(1,1) (2,2) (3,3) (4,4)

(3,4)(1,2)

S2 = S2_COPYS3 = S3_COPY

NEW CONSTRAINTSDUE TO COPY CREATION

S1 = S2 . S3S4 = S3 . S2

INPUT CONCAT CONSTRAINTS

Figure 6: A set of concat constraints with contradictoryordering requirements. Nodes are duplicated to resolve thecontradiction.

the regular expression containment constraint (S1 ∈ R),the set of possible lengths is an ultimately periodic set:whether a length is possible depends only on its remainderwhen divided by a fixed period, except for a finite set ofexceptions. (Yu et al. use the equivalent concept of a semi-linear set in a conservative automaton-based approach [30].)The details of computing this set are covered in the lit-erature [20]; we note that such sets can be convenientlyrepresented with our SMT solver since it supports a modulusoperation. At each iteration of step 2, the solver conjoinsthe length constraints corresponding to all of the originalstring constraints, along with a constraint to rule out eachlength assignment that had previously been tried, and passesthis formula to the SMT solver. If it returns a satisfyingassignment, it represents a new length assignment to try;if the constraint is unsatisfiable, then so were the originalstring constraints.

Core Constraint Implication on lengthsS1 = S2 ◦ S3 LS1

= LS2+ LS3

S1 ∈ R LS1∈ LengthSet(R)

S1 = S2 LS1= LS2

length(S1) � i LS1� i

length(S1) � length(S2) LS1� LS2

Table I: Length constraints implied by core string con-straints, where LS is the length of a string S, and � rangesover the operators {<,≤,=,≥,>}.

It is not necessary for correctness that the length abstrac-tion performed by the solver be precise, but determiningprecise length bounds improves performance by avoidingwasted iterations. In the complete system, the integer con-straints over lengths are solved together with integer con-straints arising directly from the original program, discussedin Section VI. In our experience it is important for goodperformance to solve these two sets of integer constraintstogether. The two sets of constraints may be interrelated,and solving them together prevents the solver from wastingtime exploring solutions that satisfy one set but not the other.

Step 3: Encoding as bitvectors. Given the array layout andlengths computed in steps 1 and 2, the remaining constraintsover the contents of strings can be expressed as constraintsover fixed-size bit-vectors. String equality translates directly

into bit-vector equality. For the encoding of regular expres-sion constraints, we reuse part of the implementation ofthe HAMPI tool [16]. At a high-level, the translation firstunrolls uses of the Kleene star ∗ in a regular expressioninto a finite number of repetitions (never more than thestring length). Next, where the regular expression has con-catenation, HAMPI determines all possible combinations oflengths that sum to the total length, and instantiates each as aconjunction of constraints. Along with the alternations thatappeared in the original regular expression, each of theseconjunctions also represents an alternative way in which theregular expression could match the string. To complete thetranslation, the choice between all of these alternatives isrepresented with a disjunction. (See [16] for a more detailedexplanation and some optimizations.)HAMPI supports only a single, fixed-length input, so we

invoke it repeatedly to translate each constraint between aregular expression and a string into an STP formula. We thencombine each of these translations with our translations ofother string contents constraints (e.g., string equality), andconjoin all of these constraints so that they apply to thesame single long character array. It is this single combinedformula that we pass to the SMT solver (STP) to find asatisfying assignment.

VI. REDUCING JAVASCRIPT TO STRING CONSTRAINTS

In this section we describe our tool’s translation fromJavaScript to the language of our constraint solver, focusingon the treatment of string operations. We start by giving thefull constraint language the solver supports, then describeour general approach to modeling string operations, our useof concrete values from the dynamic trace, and the process oftranslating real regular expressions into textbook-style ones.

Full constraint language. The core constraint languagepresented in Section IV captures the essence of our solvingapproach, but it excludes several features for simplicity, mostnotably integer constraints. The full constraint language sup-ported by our solver supports values of string, integer, andboolean types, and its grammar is given in Figure 8, alongwith its type system in Figure 7. The additional constraintsare solved at step 2 of the string solution procedure, togetherwith the integer constraints on the lengths of strings. Tomatch common JavaScript implementations (which reservea bit as a type tag), we model integers as 31-bit signed bit-vectors in our SMT solver, which supports all the integeroperations that JavaScript does. The solver replaces eachtoString constraint with the appropriate string once avalue for its argument is selected: for instance, if i is giventhe value 12, toString(i) is replaced with “12”.

JavaScript string operations. JavaScript has a large libraryof string operations, and we do not aim to support everyoperation, or the full generality of their behavior. Beyond theengineering challenge of building such a complete transla-

521


τ ::= string | int | boolConstRegex ::= Regex | CapturedBrack(R, i) |

BetweenCapBrack(R, i, j)

Figure 7: Type system for the full constraint languageS1 : string = (S2 : string) ◦ (S3 : string) ◦ · · ·

I1 : int = length(S : string)S1 : string ∈ R : ConstRegexS1 : string /∈ R : ConstRegex

I1 : int = (I2 : int) {+,−, ·, /} (I3 : int)B1 : bool = (A1 : τ) {=, �=} (A2 : τ)B1 : bool = (I1 : int) {<,≤,≥, >} (I2 : int)B1 : bool = ¬(B2 : bool)B1 : bool = (B1 : bool) {∧,∨} (B2 : bool)

S1 : string = toString(I1 : int)

Figure 8: Grammar and types for the full constraintlanguage including operations on strings, integers, andbooleans.

function validate (input) {//input = ’{"action":"","val":""}’;mustMatch = ’{]:],]:]}’;re1 =/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g;re2 =/"[ˆ"\\\n\r]*"|true|false|null|

-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g;re3 = /(?:ˆ|:|,)(?:\s*\[)+/g;rep1 = str.replace(re1, "@");rep2 = rep1.replace(re2, "]");rep3 = rep2.replace(re3, "");if(rep3==mustMatch) { eval(input); return true; }return false; }

Figure 9: Example of a regular-expression-based validationcheck, adapted from a real-world JavaScript application. Thisillustrates the complexity of real regular expression syntax.

tion, having very complex symbolic translations for commonoperators would likely cause the system to bog down, andthe generality would usually be wasted. Instead, our choicehas been to model the string operations that occur commonlyin web applications, and the core aspects of their behavior.For other operations and behavior aspects our tool usesvalues from the original execution trace (described furtherbelow), so that they are accurate with respect to the originalexecution even if the tool cannot reason symbolically abouthow they might change on modified executions. The detailedtranslation from several common operators (a subset of thosesupported by our implementation) to our constraint languageis shown in Table II.

Using dynamic information. One of the benefits of dy-namic symbolic execution is that it provides the flexibil-ity to choose between symbolic values (which introducegenerality) and concrete values (which are less general,but guaranteed to be precise) to control the scope of thesearch process. Our tool’s handling of string constraintstakes advantage of concrete values from the dynamic tracesin several ways. An example is string replace, whichis often used in sanitization to transform unsafe charactersinto safe ones. Our translation uses a concrete value forthe number of occurrences of the searched-for pattern: if

a pattern was replaced six times in the original run, thetool will search for other inputs in which the pattern occurssix times. This sacrifices some generality (for instance, if acertain attack is only possible when the string appears seventimes). However, we believe this is a beneficial trade-offsince it allows our tool to analyze and find bugs in many usesof replace. For comparison, most previous string con-straint solvers do not support replace at all, and addinga replace that applied to any number of occurrences of astring (even limited to single-character strings) would makeour core constraint language undecidable in the unboundedcase [6].

Regular expressions in practice. The “regular expressions”supported by languages like JavaScript have many morefeatures than the typical definition given in a computabilitytextbook (or Figure 2). Figure 9 shows an example (adaptedfrom a real web site) of one of many regular expressionsKudzu must deal with. Kudzu handles a majority of thesyntax for regular expressions in JavaScript, which includessupport for (possibly negated) character classes, escapedsequences, repetition operators ({n}/?/*/+/) and sub-match extraction using capturing parentheses. Kudzu keepstrack of the nesting of capturing parentheses, so that itcan express the relation between the input string and theparts of it that match the captured groups (as shown inTable II). Kudzu does not currently support back-references(they are strictly more expressive than true regular expres-sions), though if we see a need in the future, many usesof back-references could be translated using (non-regular)concatenation constraints.

VII. EXPERIMENTAL EVALUATION

We have built a full-implementation of Kudzu using theWebKit browser, with 650, 7430 and 2200 lines of code inthe path constraint extraction component, constraint solver,and GUI explorer component, respectively. The system iswritten in a mixture of C++, Ruby, and OCaml languages.

We evaluate Kudzu with three objectives. One objectiveis to determine whether Kudzu is practically effective inexploring the execution space of real-world applications anduncovering new code. The second objective is to determinethe effectiveness of Kudzu as a stand-alone vulnerability dis-covery tool — whether Kudzu can automatically find client-side code injection vulnerabilities and prune away falsereports. Finally, we measure the efficiency of the constraintsolver. Our evaluation results are promising, showing thatKudzu is a powerful system that finds previously unknownvulnerabilities in real-world applications fully automatically.

A. Experiment Setup

We select 18 subject applications consisting of populariGoogle gadgets and AJAX applications, as these werestudied by our previous tool FLAX [24]. FLAX assumesavailability of an external (manually developed) test suite to

522


JavaScript

operation

Reductio

nto

ourconstraint

lang

uage

S1

:string

=charAt(

S:string

,I:int)

(((L

S=length(S

))∧

(I≥

0)∧

(I<

LS)

∧(S

1=

T1◦

T2◦

T3)∧

(T2

=S)

∧(I

=length(T

1)+

1))

∨((

(I≥

LS)∨

(I<

0))∧

(S1

=“”

)))

I 1:int=

charcodeAt(

S:string

,I:int)

(((L

S=length(S

))∧

(I≥

0)∧

(I<

LS)∧

(S1

=T1◦

T2◦

T3)∧

(T2

=S)

∧(I

=length(T

1)+

1)∧

(S1

=toString(I

1))

)∨

(((I≥

LS)∨

(I<

0))∧

(S1

=toString(I

1))∧

(S1

=“NaN”)

))S

:string

=concat

(S1

:string

,S2

:string

,...

,Sk

:string

)(S

=S

1◦

S2◦···◦

Sk)

I:int=

indexO

f(S

:string

,s:string

,startpos:int)

(((I≥

0)∧

((startpos

<length(S

))∧

(startpos≥

0))∧

(S=

S1◦

S2)

∧(s

tartp

os

=length(S

1))∧

(S2

=T1◦

s◦

T3)∧

(I=length(T

1))

∧(T

1/∈Regex

(.*s.*))

)∨

((I

<0)∧

((startpos

<length(S

))∧

(startpos≥

0))∧

(S/∈Regex

(.*s.*))

)∨

(¬((startpos

<length(S

))∧

(startpos≥

0))∧

(S1

=toString(I

1))∧

(S1

=“NaN”)

))I

:int=

lastIndexO

f(S

:string

,s:string

,startpos:int)

(((I≥

0)∧

((startpos

<length(S

))∧

(startpos≥

0))∧

(S=

S1◦

S2)∧

(sta

rtp

os

=length(S

1))

∧(S

2=

T1◦

s◦

T3)∧

(I=length(T

1))∧

(T3

/∈Regex

(.*s.*))

)∨

((I

<0)∧

((startpos

<length(S

))∧

(startpos≥

0))∧

(S/∈Regex

(.*s.*))

)∨

(¬((startpos

<length(S

))∧

(startpos≥

0))∧

(S1

=toString(I

1))∧

(S1

=“NaN”)

))[S

1,S

2,.

..,S

k]:string

list

=match

(S:string

,r:Con

stRegex

)((

(k>

0)∧

(S∈

r)∧

((S

1∈

Cap

turedB

rack

(r,1

))∨

(S1

=“”

))

(non

-greedy)

∧(S

=T0◦

S1◦

T1◦···◦

Sk◦

Tk)∧

(Vk i=

0T

i∈

BetweenC

apBrack

(r,i

,i+

1))

)∧···∧

((S

k∈

Cap

turedB

rack

(r,k

))∨

(Sk

=“”

)))

∨((

k≤

0)∧

(S/∈Regex

(.*r.*))

))[S

1,S

2,.

..,S

n]:string

list

=match

(S:string

,r:Con

stRegex

,n:int)

(((S

=T1◦

M1◦

T2◦

M2◦···◦

Tn◦

Mn◦

Tn+

1)∧

(T1,T

2,.

..,T

n+

1/∈Regex

(.*r.*))

(greedymatch)

∧(M

1,M

2,.

..,M

n∈

Regex

(r))

)nis

theconcrete

numberof

occurrencesof

string

smatching

r.

S1

:string

=replace(

S:string

,r:Con

stRegex

,s:string

,n:int)

((S

=T1◦

M1◦

T2◦

M2◦···◦

Tn◦

Mn◦

Tn+

1)∧

(T1,T

2,.

..,T

n+

1/∈Regex

(.*r.*))

∧(M

1,M

2,.

..,M

n∈

Regex

(r))

))∧

(S1

=T1◦

s◦

T2◦

s◦···◦

Tn+

1)

nis

theconcrete

numberof

occurrencesof

string

smatching

rin

S.

[S1,S

2,.

..,S

k]:string

list

=split

(S:string

,s:string

,n:int)

(((S

=S

1◦

s◦

S2◦

s◦···◦

Sn◦

s◦

Sn+

1)∧

(S1,S

2,.

..,S

n+

1/∈Regex

(.*s.*))

))nis

theconcrete

numberof

occurrencesof

string

smatching

r.

I 1:int=

search

(S:string

,r:Con

stRegex

)((

(I1

<0)∧

(S/∈.*r.*))

∨((

I 1≥

0)∧

(S=

T1◦

T2◦

T3)∧

(I1

=length(T

1))∧

(T2∈

Regex

(r))∧

(T1,T

3/∈

(.*r.*))

))S

1:string

=substring(

S:string

,start

:int,end

:int)

((start≥

0)∧

(end

<length(S

))∧

(end≥

start)∧

(S=

T1◦

S1◦

T2)∧

(start

=length(T

1))

∧(I

1=

end−

start)∧

(I1

=length(S

1))

∨((start≥

0)∧

(end≥length(S

))∧

(end≥

start)∧

(S=

T1◦

S1◦

T2)∧

(start

=length(T

1))

∧(L

S=length(S

))∧

(I1

=L

S−

start)∧

(I1

=length(S

1))

∨((start<

0)∧

(end

<length(S

))∧

(end≥

start)∧

(S=

T1◦

S1◦

T2)∧

(0=length(T

1))

∧(I

1=

end−

start)∧

(I1

=length(S

1))

∨((start<

0)∧

(end≥length(S

))∧

(end≥

start)∧

(S=

T1◦

S1◦

T2)∧

(0=length(T

1))

∧(L

S=length(S

))(L

S=length(S

1))

B1

:bo

ol=

match

(S,r

)((

B1)∧

(S∈

Regex

(r))∨

(¬(B

1)∧

(S/∈Regex

(r))

))I

:int=

parseInt

(S)

(S=toString(I

))∧

((S

=“NaN”)∨

(S=

Regex

([0-9]+))

)

TableII:Our

reductionfrom

common

JavaScript

operations

toourfullconstraint

language.C

apitalized

variablesmay

beconcrete

orsymbolic,w

hile

lowercase

variables(including

regularexpressions)

musttake

aconcrete

value.

Our

implem

entatio

nsupports

alarger

setof

operations,whose

translations

aresimila

r.

523


Substr /

Substring /

CharAt /

CharCodeAt

5%

IndexOf /

LastIndexOf

/ Strlen

78%

Replace /

EncodeURI /

DecodeURI

8%

Match / Test

/ Split

1%Concat

8%

Figure 10: Distribution of string operations in our subjectapplications.

seed its testing; in contrast, Kudzu automatically generatesa much more comprehensive test suite and finds the pointsof vulnerability without requiring any external test harnessa priori. Further, in our experiments Kudzu discovers 2 newvulnerabilities within a few hours of testing which weremissed by the FLAX because of its lack of coverage. Inaddition, as we show later in this section, many of thegenerated constraints are highly complex and not suitablefor manual inspection or fuzzing, whereas Kudzu eitherasserts the safety of the validation checks or finds exploitsfor vulnerabilities in one iteration as opposed to many roundsof random testing.To test each subject application, we seed the system with

the URL of the application. For the gadgets, the URLs arethe same as those used by iGoogle page to embed the gadget.We configure Kudzu to give a pre-prepared username andlogin password for applications that required authentication.We report the results for running each application underKudzu, capping the testing time to a maximum of 6 hoursfor each application. All tests ran on a Ubuntu 9.10 Linuxworkstation with 2.2 GHz Intel dual-core processors and 2GB of RAM.

B. Results

Table III presents the final results of testing the subjectapplications. The summary of our evaluation highlightsthree features of Kudzu: (a) it automatically discovers newprogram paths in real applications, significantly enhancingcode coverage; (b) it finds 2 client-side code injection inthe wild and several in applications that were known tocontain vulnerabilities; and (c) Kudzu significantly prunesaway false positives, successfully discarding cases that doemploy sufficient validation checks.

Characteristics of string operations in our applications.Constraints arising from our applications have an average

of 63 JavaScript string operations, while the remaining

Application # of new Initial / Final Buginputs Code Coverage found

Academia 20 30.27 / 76.47% �

AJAXIm 15 49.58 / 77.67% �

FaceBook Chat 54 26.85 / 76.84% -ParseUri 13 53.90 / 86.10% �

Plaxo 31 5.72 / 76.43% �

AskAWord 10 29.30 / 67.95 % �

Birthday Reminder 27 59.47 / 73.94% -Block Notes 457 65.06 / 71.50 % �

Calorie Watcher 16 64.54 / 73.53% -Expenses Manager 133 61.09 / 76.56% -

Listy 19 65.31 / 79.73% �

Notes LP 25 46.62 / 76.67% -Progress Bar 12 63.60 / 75.09% -

Simple Calculator 1 46.96 / 80.52% �

Todo List 15 72.51 / 86.41% �

TVGuide 6 30.39 / 75.13% �

Word Monkey 20 14.84 / 75.36% �

Zip Code Gas 11 59.05 / 74.28% -

Average 51 46.95 / 76.68% 11

Table III: The top 5 applications are AJAX applications,while the rest are Google/IG gadget applications. Column2 reports the number of distinct new inputs generated, andcolumn 3 reports the increase in code coverage from theinitial run to and the final run.

are boolean, logical and arithmetic constraints. Figure 10groups the observed string operations by similarity. Thelargest fraction are operations like indexOf that take stringinputs and return an integer, which motivate the need fora solver that reasons about integers and strings simulta-neously. A significant fraction of the operations, includingsubtring, split and replace, implicitly give riseto new strings from the original one, thereby giving riseto constraints involving multiple string variables. Of thematch, split and replace operations, 31% are regularexpression based. Over 33% of the regular expressions haveone or more capturing parentheses. Capturing parentheses inregular expression based match operations lead to constraintsinvolving multiple string variables, similar to operations suchas split.

These characteristics show that a significant fraction of thestring constraints arising in our target applications require asolver that can reason about multiple string variables. Weempirically see examples of complex regular expressionsas well as concatenation operations, which stresses theneed for our solver that handles both word equations andregular expression constraints. Prior to this work, off-the-shelf solvers did not support word equations and regularexpressions simultaneously.

Vulnerability Discovery. Kudzu is able to find client-sidecode injection vulnerabilities in 11 of the applications tested.2 of these were not known prior to these experimentsand were missed by FLAX. One of them is on a social-networking application (http://plaxo.com) that was

524


0

1

2

3

4

5

6x 10

4

ask−

a−wor

d

birth

day

block

−not

es

calor

ie−wat

cher

listy

expe

nse−

man

age

note

s−lab

prog

ress

−bar

calcu

lator

todo

−list

tv−gu

ide

word−

mon

key

zipco

de−g

as

acad

emia

ajaxim

face

book

−cha

t

pars

eUri

plaxo

Num

ber

of in

stru

ctio

ns

ExecutedCompiled

Figure 11: Kudzu code coverage improvements over thetesting period. For each experiment, the right bar shows theincrease in the executed code from the initial run to totalcode executed. The left bar shows the increase in the codecompiled from initial run to the total code compiled in theentire test period.

missed by our FLAX tool because the vulnerability existson a page linked several clicks away from the initial post-authentication page. The vulnerable code is executed onlyas part of a feature in which a user sets focus on a textbox and uses it to update his or her profile. This is oneof the many different ways to update the profile that theapplication provides. Kudzu found that only one of theseways resulted in a client-side code injection vulnerability,while the rest were safe. In this particular functionality,the application fails to properly validate a string from apostMessage event before using it in an eval operation.The application implicitly expects to receive this messagefrom a window hosted at a sub-domain of facebook.com;however, Kudzu automatically determines that any webprincipal could inject any data string matching the formatFB_msg:.*{.*}. This subsequently results in code injec-tion because the vulnerable application fails to validate theorigin of the sender and the structure of JSON string beforeits use in eval.The second new vulnerability was found in a ToDo

Google/IG gadget. Similar to the previous case, the vul-nerability becomes reachable only when a specific value isselected from a dropdown box. This interaction is amongmany that the gadget provides and we believe that Kudzu’sautomatic exploration is the key to discovering this use case.In several other cases, such as AjaxIM, the vulnerable codeis executed only after several events are executed after initialsign-in page—Kudzu automatically reaches them during itsexploration.

Kudzu did not find vulnerabilities in only one case that

0

1

2

3

4

5

6x 10

4

ask−

a−wor

d

birth

day

block

−not

es

calor

ie−wat

cher

listy

expe

nse−

man

age

note

s−lab

prog

ress

−bar

calcu

lator

todo

−list

tv−gu

ide

word−

mon

key

zipco

de−g

as

acad

emia

ajaxim

face

book

−cha

t

pars

eUri

plaxo

Num

ber

of in

stru

ctio

ns

ExecutedCompiled

Figure 12: Benefits from symbolic execution alone (darkbars) vs. complete Kudzu (light bars). For each experiment,the right bar shows the increase in the total executed codewhen the event-space exploration is also turned on. The leftbar shows the observed increase in the code compiled whenthe event-space exploration is turned on.

FLAX reported a bug. This is because the vulnerabilitywas patched in the time period between our experimentalevaluation of FLAX and Kudzu.

Code and Event-space Coverage. Table III shows thecode coverage by executing the initial URL, and the finalcoverage after the test period. Measuring code coverage ina dynamically compiled language is challenging because allthe application code is not known prior to the experiments.In our experiments, we measured the total code compiledduring our experiments and the total code executed 3.

Table III shows an average improvement of over 29% incode coverage. The coverage varies significantly dependingon the application. Figure 11 provides more detail. Onseveral large applications, such as Facebook Chat, AjaxIM,and Plaxo, Kudzu discovers a lot of new code during testing.Kudzu is able to concretely execute several code paths, asshown by the increase in the right-side bars in Figure 11.On the other less complex gadget applications, most of therelevant code is observed during compilation in the initialrun itself, leaving a relatively smaller amount of new codefor Kudzu to discover. We also manually analyzed the sourcecode of these applications and found that a large fraction oftheir code branches were not dependent on data we treat asuntrusted.

To measure the benefits of symbolic execution alone, werepeated the experiments with the event-space exploration

3One unit of code in our experiments is a JavaScript bytecode compiledby the interpreter. To avoid counting the same bytecode across several runs,we adopted a conservative counting scheme. We assigned a unique identifierto each bytecode based on the source file name, source line number, lineoffset and a hash of the code block (typically one function body) compiled.

525


Application # of initial # of total Total eventsevents fired events fired discovered

Academia 20 78 310AJAXIm 72 481 988

FaceBook Chat 15 989 1354ParseUri 5 16 17Plaxo 88 381 688

AskAWord 2 8 11Birthday Reminder 12 20 20

Block Notes 7 85 319Calorie Watcher 14 18 22

Expenses Manager 10 107 1473Listy 15 470 638

Notes LP 10 592 1034Progress Bar 8 24 36

Simple Calculator 17 34 67Todo List 8 26 61TVGuide 17 946 1517

Word Monkey 3 10 22Zip Code Gas 12 12 12

Average 18.61 238.72 477.17

Table IV: Event space Coverage: Column 2 and 3 show thenumber of events fired in the first run and in total. The lastcolumn shows the total events discovered during the testing.

turned off during the test period and report the comparisonto full-featured Kudzu in Figure 12. We consistently observethat symbolic execution alone discovers and executes asignificant fraction of the application by itself. The event-exploration combined with symbolic execution does performstrictly better than symbolic execution in all but 3 cases. In amajority of the cases, turning on the event-space explorationsignificantly complements symbolic execution, especiallyfor the AJAX applications which have a significant GUIcomponent. In the 3 cases where improvements are notsignificant, we found that the event exploration generallyeither led to off-site navigations or the code executed couldbe explored by symbolic execution alone. For example, inthe parseUri case, same code is executed by text-box inputas well as by clicking a button on the GUI.Table IV shows the increase in number of events executed

by Kudzu from the initial run to the total at the end of testperiod. These events include all keyboard and mouse eventswhich result in execution of event handlers, navigation,form submissions and so on. We find that new events aredynamically generated during one particular execution aswell as when new code is discovered. As a result, Kudzugradually discovers new events and was able to executeapproximately 50% of the events it observes during theperiod of testing.

Effectiveness. Kudzu automatically generates a test suite ofover 50 new distinct inputs on average for an application inthe test period (shown in column 2 of table III).In the exploitable cases we observed, Kudzu was able

to show the existence of a vulnerability with an attackstring once it reached the point of vulnerability. That is,

its constraint solver correctly determines the sufficiency orinsufficiency of validation checks in a single query withoutmanual intervention or undirected iteration. This eliminatesfalse positives significantly in practice. For instance, Kudzufound that the Facebook web application has several uses ofpostMessage data in eval constructs, but all uses werecorrectly preceded by checks that assert that the origin of themessage is a domain ending in .facebook.com. In con-trast, the vulnerability in Plaxo fails to check this and Kudzuidentifies the vulnerability the first time it reaches that point.Some of the validation checks Kudzu deals with are quitecomplex — Figure 9 shows an example which is simplifiedfrom a real application. These examples are illustrative ofthe need for automated reasoning tools, because checkingthe sufficiency of such validation checks would be onerousby hand and impractical by random fuzzing. Lastly, we pointout that like most other vulnerability discovery tools, Kudzucan have false negatives because it may fail to cover code,or because of overly strict attack grammars.

Constraint Solver Evaluation. Figure 13 shows the runningtimes for solving queries of various input constraint sizes.Each constraint is a either a JavaScript string, arithmetic,logical, or boolean operation. The sizes of the equationsvaried from 1 to up to 250 constraints. The solver decidessatisfiability of the constraints typically under a second forsatisfiable cases. As expected, to assert unsatisfiability, thesolver often takes time varying from nearly a second to 50seconds. The variation is large because in many cases thesolver asserts unsatisfiable by asserting the unsatisfiabilityof length constraints, which is inexpensive because thestep of bit-vector encoding is avoided. In other cases, theunsatisfiability results only when the solver determines theunsatisfiability of bit-vector solutions.Our solver requires only an upper bound on the lengths

of input variables, and is able to infer satisfiable lengths ofvariables internally. In these experiments, we increase theupper bound of the input variables from 10 to 100 charactersin steps of 20 each. If the solver asserts unsatisfiabilityup to the length bound of 100, the constraints are deemedunsatisfiable.

VIII. RELATED WORK

Kudzu is the first application of dynamic symbolic execu-tion to client-side JavaScript. Here, we discuss some relatedprojects that have applied similar techniques to server-sideweb applications, or have used different techniques to searchfor JavaScript bugs. Finally, we summarize why we neededto build a new string constraint solver.

Server-side analysis. JavaScript application code is similarin some ways to server-side code (especially PHP); forinstance, both tend to make heavy use of string operations.Several previous tools have demonstrated the use of sym-bolic execution for finding SQL injection and reflected or

526


0 50 100 150 200 250

0.05

0.5

5

50

Solve Time (SAT cases) Solve Time(UNSAT cases)

Figure 13: The constraint solver’s running time (in seconds)as a function of the size of the input constraints (in termsof the number of symbolic JavaScript operations)

stored cross-site scripting attacks to code written in PHPand Java [1], [17], [28]. However, JavaScript code usuallyparses its own input, so JavaScript symbolic executionrequires more expressive constraints, specifically to relatedifferent strings that were previously part of a single string.Additional challenges unique to JavaScript arise becauseJavaScript programs take many different kinds of input,some of which come via user interface events.Like Kudzu, the Saner [2] tool for PHP aims to check

whether sanitization routines are sufficient, not just that theyare present. However their techniques are quite different:they select paths and model transformations statically, thenperform testing to verify some vulnerabilities. Their def-inition of sanitization covers only string transformations,not validation checks involving branches, which occur fre-quently in our applications.

Analysis frameworks for JavaScript. Several works haverecently applied static analysis to detect bugs in JavaScriptapplications (e.g., [9], [12]). Static analysis is complemen-tary to symbolic execution: if a static analysis is sound, anabsence of bug reports implies the absence of bugs, but staticanalysis warnings may not be enough to let a developerreproduce a failure, and in fact may be false positives.FLAX uses taint-enhanced blackbox fuzzing to detect if

the JavaScript application employs sufficient validation ornot [24]; like Kudzu, it searches for inputs to trigger afailure. However, FLAX requires an external test suite tobe able to reach the vulnerable code, whereas Kudzu gen-erates a high-coverage test suite automatically. Also, FLAXperforms only black-box fuzz testing to find vulnerabilities,while Kudzu’s use of a constraint solver allows it to reasonabout possible vulnerabilities based on the analyzed code.

Crawljax is a recently developed tool for event-spaceexploration of AJAX applications [21]. Specifically, Crawl-jax builds a static representation of a Web 2.0 applicationby clicking elements on the page and building a stategraph from the resulting transitions. Kudzu’s value spaceexploration complements such GUI exploration techniquesand enables a more complete analysis of the applicationusing combined symbolic execution and GUI exploration.

String constraint solvers. String constraint solvers haverecently seen significant development, and practical toolsare beginning to become available, but as detailed in Sec-tion IV-C, no previous solvers would be sufficient forJavaScript, since they lack support for regular expres-sions [3], [4], [7], string equality [14], or multiple vari-ables [16], which are needed in combination to reasonabout JavaScript input parsing. In concurrent work, Veaneset al. give an approach based on automata and quantifiedaxioms to reduce regular expressions to the Z3 decisionprocedure [26]. Combined with [4], this would providesimilar expressiveness to Kudzu.

IX. CONCLUSION

With the rapid growth of AJAX applications, JavaScriptcode is becoming increasingly complex. In this regard,security vulnerabilities and analysis of JavaScript code isan important area of research. In this paper, we presentedthe design of the first complete symbolic-execution basedsystem for exploring the execution space of JavaScriptprograms. In making the system practical we addressedchallenges ranging from designing a more expressive lan-guage for string constraints to implementing exploration andreplay of GUI events. We have implemented our ideas ina tool called Kudzu. Given a URL for a web application,Kudzu automatically generates a high-coverage test suite.We have applied Kudzu to find client-side code injectionvulnerabilities and Kudzu finds 11 vulnerabilities (2 previ-ously unknown) in live applications without producing falsepositives.

X. ACKNOWLEDGMENTS

We thank David Wagner, Adam Barth, Domagoj Babic,Adrian Mettler, Juan Caballero, Pongsin Poosankam and ouranonymous reviewers for helpful feedback on the paper andsuggestions for improvements to earlier drafts of this paper.This material is based upon work partially supported by theNational Science Foundation under Grants No. 0311808,No. 0448452, No. 0627511, and CCF-0424422, by theAir Force Office of Scientific Research under Grant No.22178970-4170, and by the Army Research Office underGrant No. DAAD19-02-1-0389. Any opinions, findings, andconclusions or recommendations expressed in this materialare those of the authors and do not necessarily reflect theviews of the National Science Foundation, the Air ForceOffice of Scientific Research, or the Army Research Office.

527


REFERENCES

[1] S. Artzi, A. Kiezun, J. Dolby, F. Tip, D. Dig, A. Paradkar, andM. D. Ernst. Finding bugs in dynamic web applications. InInternational Symposium on Software Testing and Analysis,2008.

[2] D. Balzarotti, M. Cova, V. Felmetsger, N. Jovanovic, E. Kirda,C. Kruegel, and G. Vigna. Saner: Composing Static andDynamic Analysis to Validate Sanitization in Web Applica-tions. In Proceedings of the IEEE Symposium on Securityand Privacy, Oakland, CA, May 2008.

[3] A. Barth, J. Caballero, and D. Song. Secure content sniffingfor web browsers or how to stop papers from reviewingthemselves. In Proceedings of the 30th IEEE Symposium onSecurity and Privacy, Oakland, CA, May 2009.

[4] N. Bjørner, N. Tillmann, and A. Voronkov. Path feasibilityanalysis for string-manipulating programs. In Proceedings ofthe 15th International Conference on Tools and Algorithmsfor the Construction and Analysis of Systems, 2009.

[5] H. Bojinov, E. Bursztein, and D. Boneh. XCS: Cross channelscripting and its impact on web applications. In CCS, 2009.

[6] J. R. Buchi and S. Senger. Definability in the existentialtheory of concatenation and undecidable extensions of thistheory. Mathematical Logic Quarterly, 34(4):337–342, 1988.

[7] J. Caballero, S. McCamant, A. Barth, and D. Song. Ex-tracting models of security-sensitive operations using string-enhanced white-box exploration on binaries. Technical Re-port UCB/EECS-2009-36, EECS Department, University ofCalifornia, Berkeley, 2009.

[8] A. Chandra, J. Halpern, A. Meyer, and R. Parikh. Equationsbetween regular terms and an application to process logic. InProceedings of the 13th annual ACM Symposium on Theoryof Computing (STOC), pages 384–390, May 1981.

[9] R. Chugh, J. A. Meister, R. Jhala, and S. Lerner. Stagedinformation flow for JavaScript. In PLDI, 2009.

[10] V. Ganesh and D. L. Dill. A decision procedure for bit-vectorsand arrays. In Computer Aided Verification, 19th InternationalConference (CAV), pages 519–531, July 2007.

[11] P. Godefroid, M. Y. Levin, and D. Molnar. Automatedwhitebox fuzz testing. In Network and Distributed SystemSecurity, Feb. 2008.

[12] S. Guarnieri and B. Livshits. Gatekeeper: mostly staticenforcement of security and reliability policies for JavaScriptcode. In Usenix Security, 2009.

[13] R. Hansen. XSS cheat sheet. http://ha.ckers.org/xss.html.[14] P. Hooimeijer and W. Weimer. A decision procedure for

subset constraints over regular languages. In ACM SIGPLANConference on Programming Language Design and Imple-mentation (PLDI), pages 188–198, June 2009.

[15] J. Karhumaki, W. Plandowski, and F. Mignosi. The express-ibility of languages and relations by word equations. InAutomata, Languages and Programming, 24th InternationalColloquium, (ICALP), pages 98–109, 1997.

[16] A. Kiezun, V. Ganesh, P. J. Guo, P. Hooimeijer, and M. D.Ernst. HAMPI: A solver for string constraints. In Interna-tional Symposium on Software Testing and Analysis, 2009.

[17] A. Kiezun, P. J. Guo, K. Jayaraman, and M. D. Ernst.Automatic creation of SQL injection and cross-site script-ing attacks. In 30th International Conference on SoftwareEngineering (ICSE), May 2009.

[18] A. Klein. DOM based cross site scripting or XSS of the thirdkind. Technical report, Web Application Security Consortium,2005.

[19] M. Martin and M. S. Lam. Automatic generation of XSS andSQL injection attacks with goal-directed model checking. In17th USENIX Security Symposium, 2008.

[20] A. B. Matos. Periodic sets of integers. Theoretical ComputerScience, 127(2):287–312, May 1994.

[21] A. Mesbah, E. Bozdag, and A. van Deursen. Crawling ajaxby inferring user interface state changes. In Proceedings ofthe International Conference on Web Engineering, 2008.

[22] Samy. I’m popular. Description of the MySpace worm bythe author, including a technical explanation., Oct 2005.

[23] P. Saxena, D. Akhawe, S. Hanna, F. Mao, S. McCamant, andD. Song. A symbolic execution framework for javascript.Technical Report UCB/EECS-2010-26, EECS Department,University of California, Berkeley, 2010.

[24] P. Saxena, S. Hanna, P. Poosankam, and D. Song. FLAX:Systematic discovery of client-side validation vulnerabilitiesin rich web applications. In 17th Annual Network & Dis-tributed System Security Symposium, (NDSS), 2010.

[25] TwitPwn. DOM based XSS in Twitterfall. 2009.[26] M. Veanes, P. de Halleux, and N. Tillmann. Rex: Symbolic

regular expression explorer. In International Conference onSoftware Testing, Verification and Validation, 2010.

[27] W3C. HTML 5 specification. http://www.w3.org/TR/html5/.[28] G. Wassermann, D. Yu, A. Chander, D. Dhurjati, H. Inamura,

and Z. Su. Dynamic test input generation for web appli-cations. In Proceedings of the International symposium onSoftware testing and analysis, 2008.

[29] XML path language 2.0. http://www.w3.org/TR/xpath20/.[30] F. Yu, T. Bultan, and O. H. Ibarra. Symbolic string verifica-

tion: Combining string analysis and size analysis. In Toolsand Algorithms for the Construction and Analysis of Systems(TACAS), pages 322–336, Mar. 2009.

APPENDIX

A. Expressiveness

Each of the following constraint types allows new sets ofstrings to be defined:

• The theory of just concatenation constraints v1 = v2◦v3

(plus constant strings) is equivalent to the theory ofword equations, which is known to be neither a subsetnor a superset of the theory of regular expressions interms of expressiveness [15]. For instance, the set ofstrings that consist of a single string repeated {ww :w ∈ Σ∗} can be easily expressed with a word equation,but is not regular.

• Conversely, the regular languages expressed by regularexpressions include sets of strings that cannot be ex-pressed solely with word equations. An example givenin [15] is the set of strings consisting of a and b overa three-letter alphabet {a,b,c}, which is representedby the regular expression (a|b)*.

• The constraint length(v1) = length(v2) also addsexpressiveness, since the language {u$v : |u| = |v|} isnot regular, and the relation of two strings having equallength also can not be expressed in word equations forany non-unary alphabet [15].

528


Documents

A Symbolic Execution Framework for JavaScriptdawnsong/papers/2010 symbolic execution.pdf · DOI 10.1109/SP.2010.38 513 Authorized licensed use limited to: Univ of Calif Berkeley