Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
PMP: Cost-effective Forced Execution with Probabilistic Memory Pre-planning
Wei You1, Zhuo Zhang1, Yonghwi Kwon2, Yousra Aafer1, Fei Peng1, Yu Shi1, Carson Harmon1, Xiangyu Zhang11Department of Computer Science, Purdue University, Indiana, USA
2Department of Computer Science, University of Virginia, Virginia, USAEmail: {you58, zhan3299, yaafer, pengf, shi442, harmon35, xyzhang}@purdue.edu, [email protected]
Abstract—Malware is a prominent security threat and exposingmalware behavior is a critical challenge. Recent malware oftenhas payload that is only released when certain conditions aresatisfied. It is hence difficult to fully disclose the payload bysimply executing the malware. In addition, malware samplesmay be equipped with cloaking techniques such as VM detectorsthat stop execution once detecting that the malware is beingmonitored. Forced execution is a highly effective method topenetrate malware self-protection and expose hidden behavior, byforcefully setting certain branch outcomes. However, an existingstate-of-the-art forced execution technique X-Force is very heavy-weight, requiring tracing individual instructions, reasoning aboutpointer alias relations on-the-fly, and repairing invalid pointersby on-demand memory allocation. We develop a light-weight andpractical forced execution technique. Without losing analysis pre-cision, it avoids tracking individual instructions and on-demandallocation. Under our scheme, a forced execution is very similarto a native one. It features a novel memory pre-planning phasethat pre-allocates a large memory buffer, and then initializesthe buffer, and variables in the subject binary, with carefullycrafted values in a random fashion before the real execution.The pre-planning is designed in such a way that dereferencingan invalid pointer has a very large chance to fall into thepre-allocated region and hence does not cause any exception,and semantically unrelated invalid pointer dereferences highlylikely access disjoint (pre-allocated) memory regions, avoidingstate corruptions with probabilistic guarantees. Our experimentsshow that our technique is 84 times faster than X-Force, has6.5X and 10% fewer false positives and negatives for programdependence detection, respectively, and can expose 98% moremalicious behaviors in 400 recent malware samples.
I. INTRODUCTION
The proliferation of new strains of malware every yearposes a prominent security threat. Recently reported attacksdemonstrate the emergence of new attacking trends, wheremalware authors are designing for stealth and leaving lighterfootprints. For example, Fileless malware [5] infects a targethost through exploiting built-in tools and features, withoutrequiring the installation of malicious programs. Clicklessinfections [1] avoid end-user interaction through exploitingshared access points and remote execution exploits. Cryptocur-rency malware [4] allow attackers to generate huge revenuesby illegally running mining algorithms using victim’s systemresources. According to [3], a massive cryptocurrency miningbotnet has generated $3 million revenue in 2018. Under thisnew threatscape, malicious payloads have evolved and lookmuch different than traditional ones. Thus, a critical challengethe security community is facing today is to understand andanalyze emerging malware’s behavior in an effort to preventpotentially epidemic consequences.
A popular approach to understanding malware behavior is torun it in a sandbox. However, a well-known difficulty is thatthe needed environment or setup may not be present (e.g.,C&C server is down and critical libraries are missing) suchthat the malware cannot be executed. In addition, recent mal-ware often makes use of time-bomb and logic-bomb that definevery specific temporal and contextual conditions to releasepayload, and some samples even use cloaking techniques suchas packing, and VM/debugger detectors that prevent executionwhen the malware is being monitored.
Researchers in [32] proposed a technique called forced-execution (X-Force) that penetrates these malware self-protection mechanisms and various trigger conditions. It worksby force-setting branch outcomes of some conditional instruc-tions. (e.g., those checking trigger conditions). As forcingexecution paths could lead to corrupted states and henceexceptions, X-Force features a crash-free execution modelthat allocates a new memory block on demand upon anyinvalid pointer dereference. However, X-Force is a veryheavy-weight technique that is difficult to deploy in practice.Specifically, in order to respect program semantics, when X-Force fixes an invalid pointer variable (by assigning a newlyallocated memory block to the variable), it has to updateall the correlated pointer variables (e.g., those have constantoffsets with the original invalid pointer). To do so, it hasto track all memory operations (to detect invalid accesses)and all move/addition/subtraction operations (to keep track ofpointer variable correlations/aliases). Such tracking not onlyentails substantial overhead, but also is difficult to implementcorrectly due to the complexity of instruction set and thenumerous corner situations that need to be considered (e.g., incomputing pointer relations). As a result, the original X-Forcedoes not support tracing into library functions.
In this paper, we propose a practical forced executiontechnique. It does not require tracking individual memory orarithmetic instructions. Neither does it require on demandmemory allocation. As such, the forced execution is veryclose to a native execution, naturally handling libraries anddynamically generated code. Specifically, it achieves crash-free execution (with probabilistic guarantees) through a novelmemory pre-planning phase, in which it pre-allocates a regionof memory starting from address 0, and fills the region withcarefully crafted random values. These values are designed insuch a way that (1) if they are interpreted as addresses andfurther dereferenced, the addresses fall into the pre-allocatedregion and do not cause exception; (2) they have diverse
1
random values such that semantically unrelated pointer vari-ables unlikely dereference the same random address and avoidcausing bogus program dependencies and corrupted states. Anexecution engine is developed to systematically explores dif-ferent paths by force-setting different sets of branch outcomes.For each path, multiple processes are spawned to execute thepath with different randomized memory pre-planning schemes,further reducing the probability of coincidental failures. Theresults of these processes are aggregated to derive the resultsfor the particular path. The engine then moves forward to thenext path.
Our contributions are summarized as follows.
• We develop a practical forced-execution engine that doesnot entail any heavy-weight instrumentation.
• We propose a novel memory pre-planning scheme thatprovides probabilistic guarantees to avoid crashes andbogus program dependencies. The execution under ourscheme is very similar to a native execution. Once thememory is pre-planned and initialized at the beginning,the execution just proceeds as normal, without requiringany tracking or on the fly analysis (e.g., pointer correla-tion analysis).
• We have implemented a prototype called PMP and eval-uated it on SPEC2000 programs (which include gcc),and 400 recent real-world malware samples. Our resultsshow that PMP is a highly effective and efficient forcedexecution technique. Compared to X-Force, PMP is 84time faster, and the false positive (FP) and false negative(FN) rates are 6.5X and 10% lower, respectively, regard-ing dependence analysis; and detect 98% more maliciousbehaviors in malware analysis. It also substantially super-sedes recent commercial and academic malware analysisengines Cuckoo [2], Habo [10] and Padawan [8].
II. MOTIVATION
In this section, we use an example to motivate the problem,explain the limitations of existing techniques, and illustrate ouridea. The code snippet in Figure 1 simulates the command andcontrol (C&C) behavior of a variant of Mirai [7], a notoriousIoT malware that launches distributed denial of service attackswhen receiving commands from the remote C&C server. Inparticular, it reads the maximum number of destination hosts(to attack) from a configuration file (line 9), and allocatesa Cmd object with sufficient memory to store destinationinformation in the Dest objects (lines 10-12). When theC&C server is connectable (line 15), the malware scans thelocal network for the destination hosts (line 16), receives therequested command (line 17), and performs the correspondingactions on the destination hosts (lines 18-22).
To expose such malicious behavior, analysts could run thesample in a sandbox and monitor its system call sequencesand network flows [8]. Unfortunately, a naive execution-basedanalysis is incomplete and hence cannot reveal all the mali-cious payloads, especially those that are condition-guarded andenvironment-specific. In our example, if the configuration file
does not exist or the C&C server is not connectable, the mali-cious behavior will not be exposed at all. One may consider toconstruct an input file and simulate the network data. However,such a task is time-consuming and not practical for zero-day malware whose input format and network communicationprotocol are unknown. In addition, recent malware samples areincreasingly equipped with anti-analysis mechanism, whichprevents these samples from execution even if they are givenvalid inputs (please refer to Section IV for real-world cases).This poses great difficulties for dynamic analysis.
Forced execution [32] provides a practical solution to sys-tematically explore different execution paths (and, hence revealdifferent program behaviors) without any input or environmentsetup. It works by force-setting branch outcomes of a small setof predicates and jump tables. One critical problem faced byforced execution is invalid memory accesses due to the absenceof necessary memory allocations and initializations, whichare present in normal execution. Without appropriate handlingof invalid memory accesses, the program is most likely tocrash before reaching any malicious payload. In our example,the malicious behaviors were supposed to be exposed, if thepredicate in line 15 is forced to take the true branch, andthe jump table in line 18 is forced to iterate different entries.However, the forced execution fails in line 30, because cmd isnot properly allocated and its dests field is not initialized.
X-Force. In X-Force [32], researchers show that simply ignor-ing exceptions does not work as that leads to cascading failures(i.e., more and more crashes), they propose to recover frominvalid memory accesses by performing on-demand memoryallocation. In particular, X-Force monitors all memory oper-ations (i.e., allocate, free, read and write) to maintain a listof valid memory addresses. If an accessed memory address isnot in the valid list, a new memory block will be allocatedon demand for the access. To respect program semantics,when a pointer variable holding an invalid address x is setto the address of the allocated memory, all the other pointervariables that hold a value denoting the same invalid addressor its offset (e.g., x + c with c some constant) need to beupdated. X-Force achieves this through linear set tracing,which identifies linearly correlated pointer variables that areinduced by address offsetting. When a pointer variable isupdated, all the correlated pointers in its linear set need tobe updated accordingly based on their offsets.
Assume in an execution instance, line 8 takes the falsebranch and line 15 is forced to take the true branch. In thisexecution, cmd is a NULL pointer, hence the dests pointerin line 27 points to 0x8 (the offset of dests field is 8). Therounded rectangle in Figure 1 illustrates what X-Force doesfor the memory access of dests[0]->ip in line 30. Linearsets are maintained for each register and each memory address.In particular, SR(r) and SM(a) are used to denote the linearset of register r and address a, respectively. After executinginstruction α, the linear set of register rbx is updated to bethe same as that of &dests, i.e., SR(rbx)← SM (&dests)such that SR(rbx)=SM (&dests)={0x7ffdfffffed0}, which
2
01 typedef struct{char ip[16]; long port;} Dest;02 typedef struct{long act; Dest* dests[0];} Cmd;0304 int main(int argc, char *argv[]) {05 Cmd *cmd = NULL;06 int max = 0;0708 if (config_file_exists()) {09 max = read_from_config_file();10 cmd = malloc(sizeof(Cmd) + max*sizeof(Dest*));11 for (int i = 0; i < max; i++)12 cmd->dests[i] = malloc(sizeof(Dest));13 }14 ...15 if (cnc_server_connectable()) {16 scan_intranet_hosts(cmd, max);17 cmd->act = get_action_from_cc_server();18 switch (cmd->act) {19 case 1: do_action_1(cmd->dest, max); break;20 case 2: do_action_2(cmd->dest, max); break;21 ...22 }23 }24 ...25 }
26 void scan_intranet_hosts(Cmd *cmd, int max) {27 Dest **dests = cmd->dests;28 for (int i = 0; i < max; i++) {29 struct sockaddr_in *host = iterate_host();
30 inet_ntop(host->ip, dests[i]->ip);
31 dests[i]->port = ntohl(host->port);32 }33 }
α. mov rbx, [rbp - 0x10] // rbx = [rbp - 0x10] = [0x7ffdfffffed0] = 0x8/* Validate Memory Address: get accessible(0x7ffdfffffed0) = true *//* Update Linear Set: SR(rbx) ← SM (&dests) = {0x7ffdfffffed0} */
β. mov ecx, [rbp - 0x14] // ecx = [rbp - 0x14] = [0x7ffdfffffecc] = 0x0/* Validate Memory Address: get accessible(0x7ffdfffffed4) = true *//* Update Linear Set: SR(rcx) ← SM (&i) = {0x7ffdfffffecc} */
γ. lea rdx, [rbx + 8*rcx] // rdx = rbx + 8*rcx = 0x8/* Update Linear Set: SR(rdx) ← SR(rbx) = {0x7ffdfffffed0} */
δ. mov rax, [rdx] // rax = [rdx] = [0x8]/* Validate Memory Address: get accessible(0x8) = false (invalid read on 0x8) *//* Allocate Memory Block: malloc(BLOCK SIZE) = 0x2531000 *//* Update Reference: rdx = *(0x7ffdfffffed0) = 0x2531000 + 0x8 = 0x2531008 */
ε. mov rax, [rax] // rax = [rax] = [0x0]/* Validate Memory Address: get accessible(0x0) = false (invalid read on 0x0) *//* Allocate Memory Block: malloc(BLOCK SIZE) = 0x2532000 *//* Update Reference: rdx = *(0x7ffdfffffed0) = 0x2532000 + 0x8 = 0x2532008 */
Fig. 1: Motivation example. The assembly code here is functionally equivalent with the original one for easy understanding.
is the address of dests. Intuitively, the pointer value in rbxis linearly correlated to that in dests. Hence, fixing eitherone entails updating the other. The linear correlation is furtherpropagated to register rdx after executing instruction γ, sinceits value is derived from rbx by address offsetting (i.e.,&dests[0] = &dests + 0). When executing instruction δ,X-Force detects an invalid access through the pointer denotedby rdx (i.e., &dests[0]), holding an invalid address 0x8.Hence, it allocates a memory block with address 0x2531000and initializes it with zero values. Register rdx is thenupdated to 0x2531008. The value of &dest should also beupdated, since it linearly correlates with rdx. Similar memoryrecovery operations are needed for instruction ε that accessesdests[0]->ip through an invalid memory address 0x0.
As we can see that each memory operation should beintercepted by X-Force for memory address validation andlinear set tracing. Upon the recovery of an (invalid) pointervariable, all the linearly correlated variables need to be updatedaccordingly. This causes substantial performance degradation.It was reported that X-Force has 473 times runtime overheadover the native execution [32]. Furthermore, since many libraryfunctions such as string functions in glibc can lead to linearset explosion (due to substantial heap array operations), X-Force chose not to trace into library functions to update linearsets. As a result, its memory recovery is incomplete (seeSection IV for a real-world example).
Our technique. We propose a novel randomized memory pre-planning technique (called PMP) to handle invalid memoryaccesses with probabilistic guarantees. Instead of allocatingnew memory blocks on demand, PMP pre-allocates a largememory block with a fixed size (e.g., 16KB) when theprogram is loaded. The pre-allocated memory area (PAMA)is filled with carefully crafted random values such that if thesevalues are interpreted as memory addresses, the corresponding
accesses still fall into PAMA. We call this self-containedmemory behavior (SCMB). In addition, these random val-ues are designed in a way that they are self-disambiguated.That is, it is highly unlikely that two semantically unrelatedmemory operations access the same random address, causingbogus dependencies. We call this self-disambiguated memorybehavior (SDMB). For example, the simplest way to achieveSCMB is to pre-allocate a chunk of memory starting at 0x00and fill it with 0x00. As such, dereferences of null pointers(e.g., ∗p with p = 0) or pointers with some offset from null(e.g., ∗(p+8)), yield value 0x00 due to the initialization.If the yielded value 0x00 is further interpreted as a pointer,its dereference continues to yield 0x00, without causing anymemory exception. However, such a scheme leads to sub-stantial bogus program dependencies as semantically unrelatedmemory operations through uninitialized/invalid pointer vari-ables all end up accessing address 0x00. For example, assumep and q are not properly initialized and both have a null valuedue to forced execution and there are two pointer dereferencestatements “1. ∗ p = ...; 2. ... = ∗q”. A bogus dependencewill be introduced between 1 and 2. Such bogus dependenciesfurther lead to highly corrupted program states. SDMB is toensure that unrelated pointer variables have a high likelihoodto contain disjoint addresses such that it is like they were allproperly allocated and initialized. Intuitively, PMP diversifiesthe values filled in the pre-allocated large memory region suchthat dereferences at different offsets yield different values.Consequently, follow-up dereferences (of these values) cancontinue to disambiguate themselves.
In addition to the aforementioned pre-planning, duringexecution, PMP also initializes global, local variables, andheap regions allocated by the original program logic withrandom values pointing to PAMA. Note that otherwise theyare initialized to 0 by default. As such, when these variablesare interpreted as pointers and dereferenced without being
3
0 1 2 3 4 5 6 7 8 9 a b c d e f0 x 0 0 0 0 8 0 f e 0 0 0 0 0 0 0 0 0 0 0 0 5 0 3 8 0 0 0 0 0 0 0 0 0 0 0 00 x 0 0 1 0 4 8 7 4 0 0 0 0 0 0 0 0 0 0 0 0 f 8 0 4 0 0 0 0 0 0 0 0 0 0 0 00 x 0 0 2 0 d 0 f f 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 x f f d 0 8 8 1 9 0 0 0 0 0 0 0 0 0 0 0 0 3 0 3 0 0 0 0 0 0 0 0 0 0 0 0 00 x f f e 0 4 0 f c 0 0 0 0 0 0 0 0 0 0 0 0 9 8 2 0 0 0 0 0 0 0 0 0 0 0 0 00 x f f f 0 2 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 e 8 a 7 0 0 0 0 0 0 0 0 0 0 0 0
......
Fig. 2: Pre-allocated memory area. The data is presented inthe little-endian format for the x86 64 architecture. The bytesin gray are free to be filled with 8-multiple random values.
properly initialized along some forced path, the accesses stillfall in PAMA and also have low likelihood to collide (onthe same address). Through SCMB, PMP enables crash-freememory operations, which are critical for forced execution.Since it does not require tracing memory operations or per-forming on-demand allocation, it is 84 times faster than X-Force (Section IV). Through SDMB, PMP respects programsemantics such that it can faithfully expose (hidden) programbehaviors with probabilistic guarantees. As shown in ourevaluation (Section IV), PMP has fewer false positives (FP)and false negatives (FN) than X-Force as well.
Figure 2 illustrates a 64-KB pre-allocated memory areamapped in the address space from 0x0 to 0xffff. Note thatalthough this memory region may overlap with some reservedaddress ranges, we leverage QEMU’s address mapping toavoid such overlap (see Section III-E). It is filled with craftedrandom values that ensure both SCMB and SDMB. For ourmotivation example, instruction δ reads the memory unit ataddress 0x8 (i.e., &dests[0]) and gets the value 0x3850.Subsequently, the instruction ε uses 0x3850 as the addressto access dests[0]->ip. These two accessed addresses(0x8, 0x3850) are contained in the PAMA, hence no memoryexception occurs. The data dependence between these twoaddresses are also faithfully exposed, without undesirableaddress collision. Observe that there is no memory validationand linear set tracing required.
We want to point out while SCMB and SDMB can beeffectively ensured in forced execution, they may not be aseffective in regular execution. Otherwise, dynamic memoryallocation could be completely avoided. The reason is thatforced execution aims to achieve good coverage to exposeprogram behaviors such that it bounds loop iterations [32].As a result, linear scannings of large memory regions aremostly avoided, allowing to establish SCMB and SDMBeffectively and efficiently. Intuitively, one can consider thatour design is equivalent to pre-allocating many small regionsthat are randomly distributed. This is particularly suitable forheap accesses in forced-execution as they tend to happen insmaller memory regions. Even if overflows might happen, thelikelihood of critical data being over-written is low due to therandom distribution.
III. DESIGN
A. Overview
Figure 3 presents the architecture of PMP, which consistsof three components: the path explorer, the dispatcher and the
Executor nlow address(0x0)
end of PAMA
high address(0x7fffffffffff)
Executor 1
Path Explorer Dispatcher
…path
scheme
execution result
memory scheme n
memory scheme 1
stack
heap
…
.bss
.text
Pre-AllocatedMemory Area
(PAMA)
80 fe 00 00 00 00 00 00 50 38 00 00 00 00 00 0048 74 00 00 00 00 00 00 f8 04 00 00 00 00 00 00d0 ff 00 00 00 00 00 00 08 00 00 00 00 00 00 00
88 19 00 00 00 00 00 00 30 30 00 00 00 00 00 0040 fc 00 00 00 00 00 00 98 20 00 00 00 00 00 0020 50 00 00 00 00 00 00 e8 a7 00 00 00 00 00 00
......
Fig. 3: Architecture of PMP.
executors. Given a target binary, the path explorer systemat-ically generates a sequence of branch outcomes to enforce,including the PCs of the conditional instructions and theirtrue/false values. We call it a path scheme. Note that likeX-Force, PMP does not enforce the branch outcome of allpredicates, but rather just a very small number of them (e.g.,less than 20). The other predicates will be evaluated as usual.PMP operates in rounds, each round executing a path scheme.For each path scheme, PMP further generates multiple versionsof variable initializations, each having different initial valuesbut satisfying both SCMB and SDMB. We call them memoryschemes. The reason of having multiple memory schemes isto reduce the likelihood of coincidental address collisions.A process is forked for each path and memory scheme anddistributed to an executor for execution. At the end of a round,the dispatcher aggregates the results from the executors (e.g.,coverage). Another path scheme is then computed by the pathexplorer to get into the next round, based on the results fromprevious rounds.
Path Explorer. In essence, path exploration is a search processthat aims to cover different parts of the subject binary. In eachround, a new path scheme is determined by switching ad-ditional/different predicates, or enforcing additional/differentjump table entries, to improve code coverage. Since the searchspace of all possible paths is prohibitively large for real-worldbinaries, PMP follows the same path exploration strategies inX-Force [32], including the linear search, the quadratic searchand the exponential search. In particular in each round, thelinear search selects a new predicate or jump table entry toenforce, which is usually the last one that does not have all itsbranches covered in previous rounds. The exponential strategyaims to explore all combinations of branch outcomes and ishence the most expensive. It is only used to explore somecritical code regions. Quadratic search falls in between thetwo. Since these are not our contributions, interested readersare referred to the X-Force project [32].
Dispatcher. The dispatcher aggregates execution results (e.g.,code coverage and program dependencies) of multiple ex-ecutors in a conservative fashion. Specifically, it considersa result valid if and only if it is agreed by n executors,with n configurable. In our experience, n = 2 is goodenough in practice. Such aggregation further improves our
4
Program Loading
address spacecra�ed file
PAMA
mmap PAMA Preparation
During Executionprogram entry Global Variable Init
call instructions Local Variable Init
memory allocation Heap Init
Fig. 4: Workflow of Memory-preplanning.
probabilistic guarantees. Intuitively, assume PMP ensures thata reported result has lower than p ∈ [0, 1] probability to beincorrect during a single execution (on an executor), due tothe inevitable accidental violations of SCMB or SDMB. Theaggregation further reduces the probability to pn if the memoryschemes on the various executors are truly randomized (andhence independent).
Executors. All executors are forked from the same mainprocess with the same initialized PAMA. Each executor thenenforces a given path and memory scheme assigned to it. Sucha design avoids the redundant initialization of PAMA. Notethat all memory accesses must start from some variable, whosevalue is fully randomized across executors.
The rest of this section will explain in details the memorypre-planning step and the probability analysis for SCMB andSDMB guarantees. Execution result aggregation is omitted dueto its simplicity.
B. Memory Pre-planning
Overview. Figure 4 presents the workflow of memory pre-planning. When a program is loaded, a pre-allocated memoryarea (PAMA) is prepared by invoking the mmap system callto map a crafted file to the program address space. The filecontent is randomly generated beforehand. During execution,program variables (including global, local variables and heapregions) are initialized by PMP with random eight-multiplevalues pointing to PAMA. Specifically, PMP intercepts: 1) theprogram entry point for initializing global variables; 2) callinstructions for initializing local variables; and 3) memoryallocations for initializing heap regions. Note that PAMApreparation happens a priori and incurs negligible runtimeoverhead, while variable initialization occurs on-the-fly duringexecution. Both are generic and do not require case-by-casecrafting. We further discuss these steps in the following.
PAMA Preparation. PAMA is mapped at the lower part ofthe address space starting from 0x0, in order to accommodatenull pointers or pointers with invalid small values. The word-aligned addresses within PAMA (i.e., those having 0 at thelowest three bits) are filled with carefully crafted randomvalues, such that if these values are interpreted as addresses,they fall within PAMA. As such, the range of random valuesthat we can fill is dependent on the size of PAMA. For a64-KB PAMA (i.e., in the address range of [0, 0xffff]), thefirst two least-significant bytes of a filling value are free tobe set with a random eight-multiple value. Other bytes arefixed to zero. Note that such a value is essentially a valid
word-aligned address in PAMA. For a 64-MB PAMA, thefirst three least-significant bytes of a filling value can be setrandomly, providing better SDMB. The maximum PAMA canbe as large as 128 TB, as a larger PAMA would overlap withthe kernel space. While a feasible design is to change the entirevirtual space layout (by changing kernel), it would hinder theapplicability of our technique. In practice, we find that 4-MBof PAMA provides a good balance of SCMB and SDMB.
Global Variable Initialization. In an ELF binary, the unini-tialized or zero-initialized global variables are stored in the.bss segment. During loading, PMP reads the offset and sizeinformation of the .bss segment from the ELF header. PMPthen initializes the segment like a heap region.
Heap Initialization. Pre-planning heap regions that are dy-namically allocated by instructions in the subject binary isrelatively easier. PMP intercepts all memory allocations andset the allocated regions to contain random word-alignedPAMA addresses. Note that PMP writes these values to eachword-aligned address in the heap region. If a regular compileris used to generate the subject binary, the compiler wouldenforce pointer-related memory accesses to be word-alignedthrough padding. However, malware may intentionally intro-duce pointer accesses that are not word-aligned. Section III-Ewill discuss how PMP handles such cases. In the followingdiscussion, we always assume word alignment.
Local Variable Initialization. Initializing local variables ismore complex. After initializing PAMA and before spawningthe executors, PMP initializes the entire stack region like aheap region. Note that stack frames are pushed and poppedfrequently and the same stack address space may be used bymany function calls. As such, the stack space may need to bere-initialized. A plausible solution is to identify stack frameallocations (e.g., updates of rsp register) and conduct initial-ization after each allocation. However, due to the flexibilityof stack allocations, it is difficult to precisely identify them.Inspired by stack canaries used to detect stack overflows, PMPuses the following design to initialize stack regions. It inter-cepts each function invocation. Then starting from the currentaddress denoted by rsp, it randomly checks eight 1 unevenlydistributed addresses lower than the rsp address (i.e., thepotential stack space to be allocated), in the order from highto low, to see if they are PAMA addresses (meaning that theywere not overwritten by previous function invocations). Wealso call these addresses canaries without causing confusion inour context and use Ci to denote the ith canary. PMP identifiesthe lowest (last) canary that is not PAMA address, say Ct, andthen re-initializes [Ct+1, rsp] (note that stack grows from highaddress to low address). If all eight canaries are overwritten,PMP continues to check the next eight. Observe that sincestack writes may not be continuous, the detection scheme hasonly probabilistic guarantees. In practice, our scheme is highly
1Eight is an empirical choice and works well in our evaluation. The numberand the distribution of canaries are configurable.
5
01 typedef struct{double *f1; long *f2;} T;02 typedef struct{char f3; long *f4; long *f5;} G;03 G *g;0405 void case3() {06 long *e = NULL, *f = NULL;07 if (cond1()) init(e, f);08 if (cond2()) {09 *e = 0x6038; // [0x0000] = 0x603810 long tmp = *f; // tmp = [0x0000]: bogus dep!11 }12 }1314 void case4() {15 if (cond1()) init(g);16 if (cond2()) {17 *(g->f4) = 0x0830;18 long tmp = *(g->f5); // &(g->f5) = 0x1000019 }20 }
21 void case1() {22 long **a = malloc(...);23 T *b;24 if (cond1()) init(b);25 if (cond2()) {26 long *alias = b->f2;27 *(b->f2) = **a; // [0x0008] = [0x0010]28 *(b->f1) = 0.1; // [0xffd0] = 0.129 long tmp = *alias;30 }31 }3233 void case2() {34 long *c; double **d;35 if (cond1()) init(c, d);36 if (cond2()) {37 *c = 0xdeadbeef; // [0xffd8] = 0xdeadbeef38 double tmp = **d; // [0xdeadbeef]: error!39 }40 }
(a) code snippet.
0 1 2 3 4 5 6 7 8 9 a b c d e f0 x 0 0 0 0 8 0 f e 0 0 0 0 0 0 0 0 0 0 0 0 5 0 3 8 0 0 0 0 0 0 0 0 0 0 0 00 x 0 0 1 0 4 8 7 4 0 0 0 0 0 0 0 0 0 0 0 0 f 8 0 4 0 0 0 0 0 0 0 0 0 0 0 00 x 0 0 2 0 d 0 f f 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 x f f d 0 8 8 1 9 0 0 0 0 0 0 0 0 0 0 0 0 3 0 3 0 0 0 0 0 0 0 0 0 0 0 0 00 x f f e 0 4 0 f c 0 0 0 0 0 0 0 0 0 0 0 0 9 8 2 0 0 0 0 0 0 0 0 0 0 0 0 00 x f f f 0 2 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 e 8 a 7 0 0 0 0 0 0 0 0 0 0 0 0
...
0 x 1 e d 7 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 a:0x01ed7010b:0x20
...canary C1
...canary C2
...canary C3
...
g:0xfff0
Local Variable (case1)
Local Variable (case2)
Global Variable
PAMA
Heap Region
ef be ad de 00 00 00 00 after line 37
c:0xf1d8d:0xffd8
(b) memory scheme.
Fig. 5: Memory pre-planning.
effective and we haven’t encountered any problems caused byincorrect stack initialization.
Example. We use the code snippet shown in Figure 5a asan example to explain the memory pre-planning process. Inthe code, a global variable g is defined at line 3, two localvariables a, b are defined in function case1(). Assume inan execution instance, line 24 takes the false branch and bis not allocated and initialized; and line 25 is forced to takethe true branch. Although a is initialized by the originalprogram code with an allocated heap region, the data in theheap region is not initialized. Without memory pre-planning,the program would have exception at any of the memoryoperations in lines 26-29.
In this example, the global variable g is set to a randomPAMA address at the beginning. Upon calling case1(),PMP checks the canaries at C1, C2, and so on (see the stackframe in the top-left corner of Figure 5b), and then identifies,say, the region from [C3,rsp] needs re-initialization, whichincludes local variables a and b. Inside the function body,a is set to a dynamically allocated heap region at line 22,but other variables such as g and b keep their initial PAMAaddress value (as line 24 is not executed). Specifically, g and bpoint to 0xfff0 and 0x20 (in PAMA), respectively. Consider theread operation at line 28 that triggers pointer dereferences on
b and then b->f1. The former dereferences address 0x20 andyields value 0xffd0, which is further interpreted as an addressin the follow-up dereference of b->f1, yielding another validPAMA address. Observe that any following dereferences willbe within PAMA and do not cause any exceptions, illustratingthe SCMB property. The value of b->f1 (i.e., 0xffd0) deref-erenced at line 28 is different from that of b->f2 (i.e. 0x08)dereferenced at line 27, and hence disambiguate themselves,illustrating SDMB.
C. Other PAMA Memory Behavior and Interference withRegular Memory Operations.
Memory pre-planning is particularly designed to handleexceptional memory operations (caused by forced execution).As such, all the values filled in PAMA are essentially inpreparation for these values being interpreted as addresses andfurther dereferenced. It is completely possible that the subjectbinary does not interpret values from PAMA as addresses.For example, it may interpret a PAMA region as a stringand access individual bytes in the region. In such cases, theaccessed values are just random values. This is equivalent tohow X-Force handles uninitialized/undefined buffers.
A PAMA location can be written to and later read fromby instructions in the subject binary, dictated by the programsemantics. Program dependencies induced by PAMA are no
6
different from those induced through regular memory regions.For example, the code at line 26 in Figure 5a establishesan alias between variable alias and b->f2. At line 27, amemory write is conducted on b->f2. At line 29, a memory-read is conducted on alias. PMP can correctly establish thedependence between line 27 and line 29, since they both pointto the same memory address 0x8.
It may happen that a PAMA location is written to by thesubject binary and then read through a semantically unrelatedinvalid pointer dereference later. As the written value may notbe a legitimate PAMA address, the later read causes exception.For example, line 37 at function case2() of Figure 5awrites a value 0xdeadbeef that is not a word-aligned addresswithin PAMA to the address indicated by pointer c. Assumec happens to have the same value 0xffd8 as an unrelatedpointer d. The write to *c also changes the value in *d to0xdeadbeef. As such at line 38, an exception is triggered forthe read of **d. In the next subsection, our probability anal-ysis shows that such cases rarely happen as the likelihood fortwo semantically unrelated pointers are initialized to the samerandom value is very low. Furthermore, PMP employs differentmemory schemes in multiple executors, further reducing suchpossibility.
In the worst situation, the subject binary uses its own in-structions to set semantically unrelated pointers to null. In nor-mal execution, these pointers would point to different properlyallocated memory regions. However in forced execution, theymay not be allocated, and all point to address 0. In such cases,PMP cannot disambiguate the accesses of these variables, andlead to bogus dependencies. For example, the local variablese and f in function case3 () of Figure 5a are explicitlyset to null by the original program code. In forced executionwhere line 7 is not executed, they point to the same address0x0, resulting in bogus dependence (e.g., between lines 9 and10). Our experimental results in Section IV show that suchcases rarely happen.
D. Probability AnalysisIn this section, we study the probabilistic guarantee of
PMP for the SCMB and SDMB properties. Violations ofSCMB lead to exceptions whereas violations of SDMB lead tobogus dependences and corrupted variable values. To facilitatediscussion, we introduce the following definitions. Let PA bethe set of all possible addresses within PAMA, and WA be itsword-aligned subset. Assume the size of PAMA is S. Then,on a 64-bit architecture, we have equation (1).
S = |PA| = |WA| × 8 (1)
In addition, let FV be a random subset of WA, called thefilling value set, whose elements are used as the values tobe filled in PAMA. Without loss of generality, we assume 0belongs to FV. We define the ratio between the size of FVand the size of WA as diversity, denoted as d. Then, we haveequation (2).
|FV| = |WA| × d =d·S8
(2)
The initialization of PAMA can be formulated as a mappingf : WA 7→ FV, which assigns each word (with 8 bytesalignment) in PAMA (i.e., denoted by addresses in WA) witha random value selected from FV. Intuitively, a more diverseFV leads to a more random memory scheme. The initializationthat fills the whole PAMA with value 0 can be considered anextremal case where FV contains only a single element 0. Notethat in this case, SCMB is fully respected, while SDMB issubstantially violated as all invalid memory operations collideon address 0.
Probabilistic Guarantee of SCMB. When a pointer variableis initialized (by PMP) with a value indicating an address closeto the end of PAMA, dereference of its offset may result in anaccess out of the bound of PAMA. As an example, considerthe dereference of g->f5 at line 18 of function case4() inFigure 5a. Recall that g is set to be 0xfff0 by PMP. The addressof g->f5 is hence 0x10000, out of the bound of PAMA with16 KB size.
Theorem 1. Let x be a filling value selected from FV, α be anoffset. The probability Perr1 of x+α being out of the boundof PAMA is calculated by equation (3).
Perr1 = P ((x+α) 6∈PA | x∈FV) =α
S−8·(1− 8
d · S
)(3)
Proof. For PMP to access an out-of-bound address x +α, x must belong to an address set IA = WA ∩{S−α, S−α+1, . . . , S−1}. To simplify discussion, let α′=|IA|= α/8 , S′= |WA| and N= |FV|. Let the size of IA ∩ FVbe i. We can infer conditional probabilityP (x∈IA |x∈FV) =i/N , denoted as Pi1. Additionally, because there are
(S′−1N−1)
possible FVs that could be uniformly chosen from (recall0∈FV always holds) and
(α′
i
)·(S′−α′−1N−i−1
)FVs have i common
elements with IA, P (|FV ∩ IA|= i)=(α′
i
)·(S′−α′−1N−i−1
)/(S′−1N−1)
,denoted as Pi2. Enumerating size i ∈ {1, . . . , α′}, Perr1 =∑α′
i=1 Pi1 ·Pi2=(α′/N)·((S′−2N−2)/(
S′−1N−1))= α
S−8 ·(1− 8
d·S)
Intuitively, the larger the pre-allocated memory area (i.e.,S) and the lower the diversity (i.e., d), the lower the Perr1. Inparticular, the Perr1 of a naive initialization that fills PAMAwith value 0 is 0. In a typical setting of S=0x400000, α=8and d=1, Perr1=1.9073e−06, illustrating a very low chanceof exception. A plausible way to completely avoid SCMBviolation is to avoid using address values close to the endof PAMA. However this requires knowing the largest possibleoffset, which is difficult in practice.
Probabilistic Guarantee of SDMB. SDMB will be compro-mised when two unrelated pointers are initialized to the samevalue by chance. Taking local variables c and d for case2()in Figure 5a as an example, both of them are initialized to0xffd8, causing invalid pointer dereference at line 38.
Theorem 2. Let x and y be two filling values independently se-lected from FV. The probability Perr2 of coincidental addresscollision, when x and y have the same value, is calculated byequation (4).
7
Perr2 = P (x = y | x∈FV, y∈FV) =8
d·S(4)
Proof. Recall x and y are independently selected from FV.Thus, fixing x=v0 as a constant, we can infer Perr2=P (y=v0 |y∈FV)= 1/|FV|= 8/(d·S) .
With a typical setting d = 1 and S = 0x400000, Perr2 =1.9073e−06, a very low probability.
Perr3 =P (l (x, β) ∩ l (y, γ) 6= ∅ | x∈FV, y∈FV)
≤64
d2 ·S2+(1−
8
d·S)2 ·
β+γ− 8
S−8(5)
Proof is elided due to space limitations. With a setting ofβ = 0x1000, γ = 0x1000, and the rest as the same before,Perr3 = 0.00195, still reasonably low. Note that one canalways improve the guarantee by having more executors withdifferent pre-plans.
E. Implementation
PMP is implemented based on the QEMU user-mode em-ulator [9]. Specifically, PMP instruments conditional jumpsand indirect jumps to enforce path scheme. A path scheme isa sequence of branch outcomes that need to be enforced. Asan instance, “401a4c:T, 4094fc:F, 40a322#40a566” is a pathscheme that contains three branch outcomes to be enforced inorder. Particularly, the predicates at 0x401a4c and 0x4094fcshould take the true branch and false branch respectively,the jump table at 0x40a322 should take the entry at 0x40a566.Currently, PMP supports ELF binary on the x86 64 platform.It can be easily extended to support other architectures due tothe cross-platform feature of QEMU. We leave it as our futurework. In the rest of the subsection, we discuss a number ofpractical challenges faced by PMP.
Handling File and Network I/O, Infinite Loop and Re-cursion. Forced execution may result in exceptional programbehaviors, such as invalid file/network access, infinite loopand infinite recursion. To make PMP applicable to real-worldexecutables, these issues need to be handled. PMP followssimilar solutions to X-Force regarding these problems. Thedifference lies in that we implement them on QEMU whileX-Force was on PIN. We briefly discuss these solutions forthe completeness of discussion.
To handle invalid file access, PMP wraps file open functions(e.g., open and fopen). If the file to be opened does notexist, a file padded with random values will be used. Tohandle infinite loop, PMP adopts the profiling-based approachproposed in [31] to dynamically identify loop structures. Foreach identified loop structure, PMP resets the loop boundto a pre-define constant. This is more sophisticated than X-Force, which uses a fixed global loop bound. To handleinfinite recursion, PMP intercepts call and return instructionsto maintain a call stack. At each function invocation, PMPchecks whether the appearances of the target function in thecall stack exceed a pre-defined threshold. If so, PMP skipsthe function invocation. Note that while maintaining a faithful
shadow call stack is very challenging due to the various strangecalling conventions, PMP does not require a precise shadowstack.
Allocation of Large PAMA. PAMA is located at the lowerpart of the address space starting from 0x0. The default loadaddress for non-position-independent executables is usually0x400000. If the size of PAMA is larger than 4MB, therewill be overlap between PAMA and the text/data segment ofthe subject executable, which is problematic.
To support large-size PAMA, we enable the address map-ping mechanism provided by QEMU, which translates a guestaddress (denoted as GA) used by the subject executable to ahost address (denoted as HA) used by QEMU. In the user-mode emulation, QEMU and the subject executable share thesame address space. The address mapping g2h is flattened toessentially an offsetting operation, such that ha = g2h(ga) =ga+ base, where ga∈GA, ha∈HA, and base is a pre-definedbase address. We set the base address to the size of PAMA toavoid any overlap. Consequently, we need to adjust the fillingvalues accordingly such that they are mapped to the addresseswithin PAMA (started from 0x0 in the host space). Formally,let FV
′be the set of the adjusted filling values. Then we have
FV′={x− base | x∈FV}.
Misaligned Memory Access. The memory pre-planning ofPMP assumes that any pointer field of a structure is word-aligned. It is a reasonable assumption for most real-worldapplications, since making pointer fields word-aligned (bypadding if needed) is the default behavior of compilers. Forexample, mainstream compilers will place a 7-byte paddingbetween the f3 field and the f4 field of the structure G inFigure 5a by default, such that the offset of f4 is word-aligned.
Although we didn’t find any real-world cases in our eval-uation, it is possible to disable word-alignment via a spe-cial compilation option. The misalignment of a pointer field(within PAMA) may result in invalid memory access. Forexample, assume the global variable g in Figure 5a pointsto 0xfff0 set by PMP. If its pointer field f4 is not word-aligned, its value will be loaded from 0xfff1, which would be0xe800000000000050. If this value is used as an address, theaccess falls out of PAMA (even out of the user address space)and causes exception.
We develop the following mechanism in the dispatcherto handle misaligned memory accesses in a demand drivenfashion. If a path scheme results in invalid memory access inall the executors (most likely induced by misaligned accesses),the dispatcher checks the QEMU exception log to acquirethe instruction i that accesses misaligned address. Then PMPadditionally intercepts the code generation of instruction ito mask the most-significant bytes of the accessed memoryaddress to make it fall within PAMA. Note that while ourdesign anticipates misaligned pointer field accesses are rare,which is true according to our experience (see Section IV), it ispossible future malware may purposely introduce lots of suchmisalignments. In this case, PMP would have to instrumentall memory operations to sanitize the addresses.
8
IV. EVALUATION
A. Experiment Setup
We evaluate PMP with the SPEC2000 benchmark set aswell as a set of malware samples provided by VirusTotal [12]and Padawan [8]. The experiment on SPEC2000 is conductedon a desktop computer equipped with an 8-core CPU (Intel R©CoreTM i7-8700 @ 3.20GHz) and 16G main memory. Theexperiment on the malware samples is conducted on a virtualmachine (to sandbox their malicious behaviors) hosted onthe same desktop. On both experiments, the configuration ofPMP is as follows: 4-MB pre-allocated memory area (i.e.,S = 0x400000), diversity d = 1, and 2 executors (i.e., n = 2).
B. SPEC2000
SPEC2000 is a well-known benchmark set contains 12 realworld programs, some of them are large (e.g., 176.gcc). Thelist of programs and the characteristics of their executables canbe found in Appendix A. We choose SPEC2000 for the pur-pose of comparison as it was used in X-Force. Table I presentsthe comparative results on different aspects, including forcedexecution outcomes, code coverage and memory dependence.
Forced Execution. In this experiment, both PMP and X-Forceuse the same linear path exploration strategy. Specifically, itfirst executes the binary once without forcing any branch out-come. Then it traverses the executed predicates in the reversetemporal order (the last predicate first) and finds the predicatethat has an uncovered branch. A new path scheme is thengenerated to force-set the uncovered branch. The procedurerepeats until there are no more schemes that can lead to newcoverage. Column 2 in Table I reports the total execution timewhen PMP finishes the exploration. Columns 3 and 4 presentthe number of executions that pass and fail (i.e., encountersan exception), respectively. The number in parentheses denotethe number of executions finished per second. Columns 11-13 show the corresponding results for X-Force. From theseresults, we have the following observations. (1) PMP canperform 12.6 forced executions per second on average, whichis 84 times faster than X-Force (0.15 execution per second).Since PMP uses 2 executors for each path scheme, one mayargue that X-Force can be parallelized to use two cores (for faircomparison). We want to point out that first it is unclear how toparallelize the linear search algorithm; and the second executorin PMP is just to provide better probabilistic guarantees. Inmost cases, such improvement may not have practical impact(see our next experiment). Hence in deployment, additionalexecutors may be turned off. (2) The execution failure rate ofPMP is 3.5%, which is reasonably low and comparative withX-Force. Note that the rate is higher than what we identified inthe SCMB probability analysis (Section III-D). The reason isthat the majority of failures reported by both PMP and X-Forceare not caused by memory exceptions, but rather inevitable asthe path explorer forces the execution to enter branches thatmust lead to failures (e.g., forcing the true branch of a stacksmash check inserted by the compiler).
Code Coverage. Columns 5∼7 and 14∼16 show the codecoverage of PMP and X-Force, respectively. Observe that onaverage PMP covers 83.8% instructions, 79.1% basic blocksand 91.8% functions, which is comparable to X-Force. Formost of the benchmark programs, PMP achieves more than80% code coverage. Specifically, for mcf and gzip, PMPachieves 100% code coverage.
The worst cases are eon and gcc. Further manual inspectionshows that this is due to some inherent shortcoming of thelinear search strategy. To illustrate, consider the code snippetin Figure 6, which is extracted from gcc that validates functionarguments before proceeding. When the check_arg() func-tion is invoked for the first time at line 2, the true branch ofpredicate at line is taken by default. The linear path explorationwill force the next execution to take the false branch, since ithas not been covered before. At the second-time invocation ofcheck_arg() at line 3, the false branch of the predicateat line 8 will not be forced to execute again (hence take thetrue branch by default), since it has been covered before.That means, the code after line 3 will not get executed due tothe validation failure at line 3.
The essence of the problem is that linear search onlyfocuses on predicates, without considering their context. Forexample, function check_arg() may be invoked from mul-tiple places, and each calling context should be considereddifferently. That is, a branch being covered in a context shouldnot prevent it from being explored again in a different context.In our future work, we will explore a context-sensitive pathexploration method that can provide probabilistic guarantees.Specifically, we will explore a sampling algorithm that cansample a predicate, together with its unique context, in aspecific distribution (e.g., uniform distribution).
Memory Dependence. We also conducted an experiment,in which we detect the program dependencies exercised byforced execution. A dependence is exercised when an in-struction writes to some address, which is later read byanother instruction. This is to evaluate the SDMB propertyof PMP. Note that it is intractable to acquire the groundtruth of program dependencies, even with source code (dueto reasons such as aliasing). Therefore, we use two methodsto evaluate the quality of detected dependencies. First, we runthe SPEC programs on the inputs provided by the SPEC suite(some of them are large and comprehensive) and collect thedependencies observed. These must be true positive programdependencies. As such, forced execution is supposed to exposemost of them. Any missing one is an FN. Second, we built astatic type checker to check if the source and destination of a(detected) dependence must have the same type. We developedan LLVM pass to propagate symbolic information to individualinstructions, registers, and memory locations such that weknow the type of each binary operation and its operands. Notethat we need the symbolic information just for this experiment.PMP operates on stripped binaries. Ideally, force executionshould report as few mistyped dependencies as possible. Eachmistyped dependence must be an FP. Columns 8∼10 and
9
TABLE I: SPEC2000 Results
BenchmarkPMP X-Force
execution status code coverage memory dependence execution status code coverage memory dependencetime (s) # run # fail # insn # block # func # found # correct # mistyped time (s) # run # fail # insn # block # func # found # correct # mistyped
164.gzip 24.6 382(15.6/s)
11(3%)
7,650(100%)
699(99%)
61(100%) 3,529 2,824
(80%)0
(0%) 2,112 369(0.17/s)
10(3%)
7,420(97%)
669(95%)
61(100%) 3,662 2,343
(64%)28
(1%)
175.vpr 76.8 1,006(13.1/s)
82(8%)
26,783(83%)
2,007(71%)
226(89%) 13,418 8,983
(67%)333(2%) 9,436 1,000
(0.10/s)79
(8%)26,677(83%)
2,004(70%)
226(89%) 13,332 7,199
(57%)2,428(18%)
176.gcc 3490.2 26,524(7.6/s)
822(3%)
186,310(49%)
16,104(44%)
1,239(65%) 573,375 384,161
(67%)11,467(2%) 347,014 26,647
(0.08/s)799(3%)
183,280(48%)
16,098(43%)
1,221(64%) 573,926 332,303
(58%)63,131(11%)
181.mcf 8.6 144(16.7/s)
2(1%)
2,977(100%)
213(100%)
24(100%) 1,718 1,248
(73%)0
(0%) 374 164(0.43/s)
2(1%)
2,947(99%)
213(100%)
24(100%) 1,487 1,011
(68%)130(9%)
186.crafty 860.3 2,753(3.2/s)
15(0.5%)
40,404(96%)
4,237(96%)
104(100%) 22,437 14,300
(64%)20
(0.08%) 99,764 2,830(0.03/s)
13(0.4%)
41,685(99%)
4,381(99%)
104(100%) 22,816 12,092
(53%)2,749(12%)
197.parser 98.2 1,590(16.2/s)
68(4%)
22,093(90%)
2,688(92%)
279(94%) 9,958 6,664
(67%)887(9%) 6,340 1,685
(0.27/s)69
(4%)23,331(95%)
2,799(96%)
288(97%) 11,740 5,870
(50%)3,682(31%)
252.eon 37.2 707(19.0/s)
27(4%)
28,600(71%)
5,560(70%)
502(82%) 9,521 4,457
(47%)142(1%) 4,020 659
(0.16/s)26
(4%)27,622(69%)
5,413(68%)
501(81%) 9,121 3,557
(39%)5,669(62%)
253.perlbmk 1,189 10,318(8.7/s)
508(5%)
118,135(88%)
11,600(90%)
692(97%) 66,726 28,394
(43%)4,001(6%) 176,096 10,400
(0.06/s)502(4%)
119,467(89%)
11,676(90%)
696(97%) 70,611 24,713
(35%)18,866(27%)
254.gap 1,054 7,754(7.3/s)
310(4%)
49,869(54%)
4,519(50%)
401(88%) 38,243 20651
(54%)3,059(8%) 103,458 7,461
(0.07/s)298(4%)
49,920(54%)
4,521(50%)
401(88%) 38,784 18228
(47%)6,593(17%)
255.vortex 487.0 7,232(14.9/s)
157(2%)
100,718(92%)
15,513(91%)
577(92%) 55,205 19,939
(36%)630(1%) 58,646 7,223
(0.12/s)132(2%)
100,652(92%)
15,489(91%)
577(92%) 54,977 15,393
(28%)14,072(26%)
256.bzip2 16.0 249(15.6/s)
13(5%)
6,338(92%)
545(94%)
60(95%) 2,755 2,375
(86%)0
(0%) 842 258(0.19/s)
11(4%)
5,179(76%)
471(82%)
53(84%) 2,434 1,849
(76%)215(9%)
300.twolf 221.4 2,972(13.4/s)
97(3%)
52,351(91%)
3,682(86%)
165(99%) 24,032 10,333
(43%)528(2%) 21,308 2,997
(0.14/s)90
(3%)52,831(92%)
3,749(88%)
165(99%) 25,664 8,212
(32%)3,132(12%)
Average - 12.6/s 3.5% 83.8% 79.1% 91.8% - 60.6% 2.6% - 0.15/s 3.4% 82.7% 81.0% 90.9% - 50.6% 19.6%
01 int some_func(char *arg1, char *arg2) {02 check_arg(arg1);03 check_arg(arg2);04 do_something(); // do nothing05 ...06 }07 void check_arg(char *arg) {08 if (strlen(arg) == 0) exit(-1);09 ...10 }
Fig. 6: Explaining problem of linear search using gcc.
17∼19 show the memory dependence results for PMP andX-Force, respectively.
Observe that X-Force has 6.5 times more mis-typed memorydependences compared to PMP (19.6% versus 2.6%), thatis, 6.5X more FPs. In addition, the must-be-true memorydependences reported by X-Force are 10% fewer than those byPMP. That is, X-Force has 10% more FNs. The main reasonis that X-Force does not trace into library execution such thatpointer relations are incomplete. We will use a case studyto explain this in the next paragraph. Mis-typed dependences(FPs) in PMP are mostly caused by violations of SDMB. Theresults are consistent with our analysis in Section III-D. Notethat our probabilistic guarantee for SDMB was computed fora pair of accesses, whereas the reported value is the expectedvalue over a large number of pairs.
Case Study. We use 181.mcf as a case study to demonstratethe advantages of PMP over X-Force, as well as over a naivememory pre-planning that fills the pre-allocated region andvariables with 0. To reduce the interference caused by thepath exploration algorithm, we use the execution traces of theruns on the provided test cases as the path schemes. That is,we enforce the branch outcomes in a way that strictly followsthe traces. The test cases fall into three categories: training,test, and reference, with difference sizes (reference tests are
01 long suspend_impl(..){..02 if (is_valid(arc)) {..03 memcpy(new_arc, arc, 0x40);..04 *(arc->tail) = node1;..05 node2 = *(new_arc->tail);..06 }07 }
Fig. 7: Explaining FPs and FNs by X-Force using mcf.
the largest). We use the memory dependences reported whileexecuting the test cases normally as the ground truth to identifythe false positives and false negatives for PMP and X-Force.Since both the forced and unforced executions of a test inputfollow the same path, the comparison particularly measuresthe effectiveness of the memory schemes. To be more fair, weonly run PMP on a single executor.
The results are shown in Table II. The 2nd and 3rd columnscompare the execution speed. Observe that PMP is muchfaster, consistent with our earlier observation. For the memorydependences, PMP has no FPs or FNs while the naive planningmethod has some; and X-Force has the largest number of FPsand FNs. The former is because SDMB is violated. The latteris due to the incompleteness of pointer relation tracking (i.e.,missing the library part). Note that the numbers of FPs andFNs are smaller compared to the previous experiment as theseare results for a small number of runs, without exploring paths.
Consider the code snippet from mcf shown in Figure 7.Variable arc is a buffer that contains many pointer fields. Asit is copied to new_arc at line 3, the pointer fields in arc andnew_arc are linearly correlated. However, X-Force missessuch correlations as it does not trace into memcpy() at line 2.This could lead to missing dependences such as that betweenlines 4 and 5; and also bogus dependences. For example, theread *(new_arc->tail) at line 5 must falsely depend onsome write that happened earlier.
10
Cuckoo
Habo
Padawan
Cuckoo++
X-Force
PMP
0 50 100 150 200 250
(a) number of exposed syscall sequences.
0.00
0.20
0.40
0.60
0.80
1.00
1.20
1.40
1.60
1.80
2.00 PMP X-Force
(b) executions per second.
0
5
10
15
20
25
30 PMP X-Force
(c) length of path scheme.
Fig. 8: Overall result of malware analysis.
TABLE II: Experiment with mcf.
ItemExecution Time (s) Memory Dependence
PMP X-Force ground PMP Naive X-Forcefound fp fn found fp fn found fp fn
test 0.0305 1.987 1847 1847 0 0 1848 5 4 1858 28 17train 0.0348 2.578 2065 2065 0 0 2069 13 9 2088 45 22ref 0.0609 4.390 2062 2062 0 0 2068 14 8 2080 37 19
C. Malware Analysis
We use 400 malware samples. Half of them are acquiredfrom VirusTotal under an academic license, and the other halffall into the set of malware used in the Padawan project.Note that the authors of Padawan cannot share their samplesdue to licensing limitations. Hence, we crawled the Internetfor these samples based on a set of hash values providedby the Padawan’s authors through personal communication.Many samples could not be found and are hence elided. The400 samples cover up-to-date malware of different familiescaptured from year 2016 to 2018. We compare the malwareanalysis result of PMP with that of Cuckoo [2] (a well-knownsandbox for automatic malware analysis), Padawan [8] (anacademic multi-architecture ELF malware analysis platform),Habo [10] (a commercial malware analysis platform used byVirusTotal for capturing behaviors of ELF malware samples)as well as X-Force [32].
In order to compare our technique with the state-of-the-art anti-evasion measures, we implemented two popularanti-evasion methods [19] (i.e. system time fast-forwardingand anti-virtualization-detection) as extensions to Cuckoo.We name the extended system Cuckoo++. Specifically inthe first method, we modify the kernel to make the sys-tem clock much faster (e.g., 100 times faster), mainly forthe following two reasons. First, a malware analysis VMoften has a very short uptime since it restarts for eachmalware execution. As such, advanced malware may checkthe system uptime to determine the presence of sandboxVM. Second, advanced malware samples often sleep for aperiod of time before executing their payload (in order todefeat dynamic analysis). In the other method, we inter-cept file system operations to conceal the artifacts producedby virtual machine (e.g., /sys/class/dmi/id/product name and/sys/class/dmi/id/sys vendor).
The detailed comparison results are shown in Appendix C.Note that the malware behaviors of Padawan are provided byits authors. We set up an execution environment similar toPadawan (Ubuntu 16.04 with Linux kernel version 4.4) for
TABLE III: Analysis on malware samples used for case study.
Case ID Cuckoo Habo Padawan Cuckoo++ X-Force PMP1 031 12 17 12 12 283 3012 004 27 29 28 27 32 2163 225 49 49 166 165 183 2204 309 153 169 292 221 274 705
the other tools, including PMP, X-Force, Habo, Cuckoo andCuckoo++, so that the results can be comparable. We set 5minutes timeout for each malware sample.
Result Summary. Figure 8 presents the overall result ofmalware analysis. Specifically, the number of unique systemcall sequences exposed by different tools are show in Fig-ure 8a. To avoid considering similar system call sequences thathave only small differences on argument values as differentsequences, we consider sequences that have more than 90%similarity as identical. As we can see that the executions withanti-evasion measures enabled (i.e., Cuckoo++ and Padawan)expose more system call sequences than the native executions(i.e., Cuckoo and Habo), but disclose fewer than the forcedexecution methods (i.e., X-Force and PMP). On average, PMPreports 220%, 243%, 150%, 151% and 98% more system callsequences over Cuckoo, Habo, Cuckoo++, Padawan and X-Force, respectively. Details can be found in Appendix C.
The comparison of execution speed and length of pathschemes between PMP and X-Force are shown in Figure 8band Figure 8c respectively. Note that Cuckoo and Padawanonly runs each sample once (instead of multiple executionson different path schemes as force execution tools do). Hencewe do not compare their execution speeds and length of pathscheme. On average, PMP is 9.8 times faster than X-Forceand yields path schemes with the length 1.5 times longer thanX-Force. The longer the path scheme, the deeper the code wasexplored. The second case studies in this subsection show thatwith the longer path schemes, PMP can expose some maliciousbehavior in deep program paths that could not be exposed byX-Force.
Case Studies. Next, we use four case studies from differentmalware families to illustrate the advantages of PMP.
Case1: 1e19b857a5f5a9680555fa9623a88e99. It isa ransom malware that uses UPX packer [11] to pack itsmalicious payload in order to evade static analysis. Figure 9ashows a constructed code snippet to demonstrate part of itsmalicious logic. It mmaps a writable and executable memoryarea (line 2), then unpacks itself (line 3) and transfers control
11
01 int main(int argc, char **argv) {02 void *code_area = map_exec_write_mem();03 upx_unpack(code_area);04 transfer_control(code_area, argc, argv);05 }0607 void code_area(int argc, char **argv) {08 if (!is_cmdline_valid(argc, argv)) exit();09 char *action = argv[1], *key = argv[2];10 delete_self();11 if (strcmp(action, encrypt) == 0) {12 for (FILE *file: traverse_directory()) {13 FILE *encrypted_file = encrypt(file, key);14 replace_file(encrypted_file, file);15 }16 }17 }
(a) simplified code.
a. mmap(0x400000,,PROT_EXEC|PROT_READ|PROT_WRITE,)b. unlink("/root/Malware/1e19b857a5f5a9680555fa9623a88e99")c. open("/etc",O_RDONLY|O_DIRECTORY|O_CLOEXEC)d. getdents64(0,)e. open("/etc/passwd",O_RDONLY)f. open("/etc/passwd.encrypted",O_WRONLY|O_CREAT,0666)g. unlink("/etc/passwd")
(b) captured system call sequence.
Fig. 9: Case 1: the ransom malware sample.
(line 4) to the unpacked payload (lines 7-17). The maliciouspayload checks the validity of command line parameters (line8) and deletes itself from the file system (line 10). If thecommand line parameter specifies the encrypt action, themalware traverses the file system to replace each file with itsencrypted copy (lines 13-14).
The comparison of different tools on this malware is shownin the second row of Table III. Triggering payload requiresthe correct command line parameters. Hence directly runningthe malware using Cuckoo, Habo, Cuckoo++ and Padawanfail to expose the malicious behavior. Both X-Force andPMP expose the payload. Figure 9b shows the capturedsystem call sequence. Observe the unlink syscall b thatremoves the malware itself and the encryption and removalof “/etc/passwd” by syscalls e-g.
Case2: 03cfe768a8b4ffbe0bb0fdef986389dc. It isa bot malware that receives command from a remote server.Figure 10a shows the simplified code of its processing logic. Itchecks whether a file exists that indicates the right executionenvironment (line 2) and whether the remote server is con-nectable (line 4). If both conditions are satisfied, the malwarecommunicates with the remote server. The remote server willvalidate the identity of the malware by its own communicationprotocol (lines 4-7). If the validation is successful, a commandreceived from the remote server will be executed on the victimmachine (lines 8-9).
The comparison of different tools on this malware is shownin the third row of Table III. The malicious payload of thismalware sample is hidden in a deeper path, which requires amuch longer path scheme. Figure 10b shows the path schemeenforced by PMP to expose the malicious behaviors. Thelength is 28, which is larger than the longest path schemethat is enforced by X-Force within the 5 minutes limit. Theseforced branches are to get through the ID validation protocol.
01 int main(int argc, char **argv) {02 if (!files_exist("/tmp/ReV1112")) exit(0);03 if (!connectable("ka3ek.com")) exit(0);04 Info *info = get_system_info();05 Greet *greet = get_validation(info);06 Reply *reply = compute_reply(greet);07 Cmd *cmd = get_command(reply);08 if (!cmd) exit(0);09 execute_cmd(cmd);10 }
(a) simplified code.
40492b:T | 404aec:T | 404e07:T | 401f3f:F | 401ee3:T |404fdc:F | 404fea:T | 405118:F | 40513a:F | 405144:F |40517b:F | 40517f:F | 40523e:F | 405254:T | 40523e:F |405254:T | 40523e:F | 405254:T | 40523e:F | 405254:T |40523e:F | 405254:F | 4044be:T | 4044e9:F | 40454b:F |404565:T | 404596:T | 404794:F
(b) path scheme.
Fig. 10: Case 2: the bot malware sample.
Case3: 14b788d4c5556fe98bd767cd10ac53ca. It isan enhanced variant of Mirai, which is equipped with a time-based cloaking technique. Figure 11 shows a simplified versionof its code snippet. At line 4, it checks whether the system up-time is short, which indicates a potential analysis environment.If the system uptime is long enough, it checks whether thereexists any initialization script in the “/etc/init.d” directory (line8) 2. If both conditions are satisfied, the malware sample addsitself to an initialization script for launching at system reboot.
Cuckoo and Habo cannot expose the aforementioned be-haviors. Cuckoo++ and Padawan can expose the traversalof the “/etc/init.d” directory (line 6), by passing though theuptime check via fast-forwarding system time and using along-running VM snapshot, respectively. However, they cannotexpose the modification of initialization script (line 9), dueto the failure of the initialization script check, as the defaultOS environment does not have any initialization script. PMPand X-Force can expose both behaviors by forcing the branchresults.
Case4: 8ab6624385a7504e1387683b04c5f97a. Thisis a sniffer equipped with a vm-detection-based cloaking tech-nique. Figure 12 shows a simplified version of its code snippet.If a VM environment is detected, the malware sample deletesitself and exits (lines 2-3). Otherwise, it enters a sniffing loop,which randomly selects an intranet IP address and a knownvulnerability and checks whether the host with the IP containsthe vulnerability (lines 5-7). If so, the information about thevulnerable host is sent to the server and the payload is sent tothe vulnerable host (lines 8-9).
Cuckoo and Habo cannot expose the aforementioned be-haviors. Cuckoo++ and Padawan can expose the networkcommunication to the selected IP address, since they areenhanced to conceal VM-generated artifacts. However, theycannot expose sending the vulnerable host information andpayload, since the analysis environment is often offline andthere may not exist a vulnerable host on the intranet. PMPcan expose both behaviors. X-Force can expose both in theory
2An initialization script has a file name that starts with ‘S’, followed by anumber indicating the priority.
12
01 int main(int argc, char **argv) {02 struct sysinfo info;03 sysinfo(&info);04 if (info.uptime < 128) exit(0);05 DIR *dir = opendir("/etc/init.d");06 while (struct dirent *ent = readdir(dir)) {07 char name = ent->d_name;08 if (name[0] == ’S’ && is_num(name[1]))09 add_to_init_script("/etc/init.d/S99");10 }11 }
Fig. 11: Case 3: the enhanced variant of Mirai.
but fails within the timeout limit due to its substantially largerruntime cost.
D. Time Distribution
We measure the runtime overhead of different components.The distribution is shown in Appendix B. As we can seethat most of the time (84%) is spent on code execution,while only 13% and 3% of time are spent on memory pre-planning and path exploration, respectively. In memory pre-planning, 2%, 5%, 69% and 24% of time are spent on PAMApreparation, initialization of global variables, local variablesand heap variables. Observe that PAMA preparation takes verylittle time as most work is done offline.
V. RELATED WORK
Forced Execution. Most related to our work is X-Force [32].The technical differences between the two were discussedin the introduction section. As shown by our results, PMPis 84 times faster than X-Force, has 6.5X, and 10% fewerFPs and FNs of dependencies, respectively, and exposes 98%more payload in malware analysis. Following X-Force, otherforced-execution tools are developed for different platforms,including Android runtime [33] and JavaScript engine [25],[21]. Compared to these techniques, PMP targets x86 bina-ries and addresses the low level invalid memory operations.Additionally, PMP is based on novel probabilistic memorypre-planning instead of demand driven recovery.
Memory Randomization. Memory randomization has beenleveraged for different purposes, such as reducing vulnerabilityto heap-based security attacks through randomizing the baseaddress of heap regions [14] and randomly padding alloca-tion requests [15]. DieHard [13] tolerates memory errors inapplications written in unsafe languages through replicationand randomization. It features a randomized memory managerthat randomizes objects in a “conceptual heap” whose size isa multiple of the maximum real size allowed. PMP shares asimilar probabilistic flavor to DieHard. The difference lies inthat PMP pre-plans the memory by pre-allocation and fillingthe pre-allocated space and variables with crafted values. Inaddition, PMP aims to survive memory exceptions caused byforced-execution whereas DieHard is for regular execution.
Malware Analysis. The proliferation of Malware in the pastdecades provide strong motivation for research on detecting,analyzing and preventing malware, on various platforms suchas Windows [16], [23], Linux [19], [20], as well as Web
01 char *data = read_file("/sys/class/dmi/id/product_name");02 if (contains(data, "VirtualBox", "VMware"))03 remove_self_and_exit();04 while (1) {05 char *ip = select_intranet_ip(ip_list);06 char *vuln = select_known_vuln(vuln_list);07 if (connect_and_check(ip, vuln)) {08 send_info_to_server(ip, vuln);09 send_payload(ip, vuln);10 }11 }
Fig. 12: Case 4: the sniffer malware sample.
browsers [24], [22]. Traditional malware analysis fall intotwo categories: signature-based scanning and behavioral-basedanalysis. The former [12], [28] detects malware by matchingextracted features with known signatures. Although commonlyused by anti-malware industry, signature-based approaches aresusceptible to evasion through obfuscation. To address this,behavioral-based approaches [34], [26], [17] execute a subjectprogram and monitor its behavior to observe any maliciousbehavior. However, traditional behavioral-based approachesare limited to observing code that is actually executed.
Anti-targeted Evasion. Modern sophisticated malware sam-ples are equipped with various cloaking techniques (e.g.,stalling loop [27] and VM detection [6]) to evade detection.To fight against evasion, unpacking techniques [18], [29] areapplied to enhance signature-based scanning, and dynamicanti-evasion methods [26], [30] are developed to hide dynamicfeatures of analysis environment such as execution time andfile system artifacts. These techniques are very effective forknown targeted evasion methods. Compared to these tech-niques, PMP is more general. More importantly, PMP andforced execution type of techniques allow exposing payloadguarded by complex conditions that are irrelevant to cloaking.
VI. CONCLUSION
We develop a lightweight and practical force-executiontechnique that features a novel memory pre-planning method.Before execution, the pre-planning stage pre-allocates a mem-ory region and initializes it (and also variables in the subjectbinary) with carefully crafted values in a random fashion. As aresult, our technique provides strong probabilistic guaranteesto avoid crashes and state corruptions. We apply the prototypePMP to SPEC2000 and 400 recent malware samples. Ourresults show that PMP is substantially more efficient andeffective than the state-of-the-art.
ACKNOWLEDGMENT
The authors would like to thank the anonymous reviewersand Dr. William Robertson (the PC contact) for their con-structive comments. Also, the authors would like to expresstheir thanks to VirusTotal and the authors of Padawan fortheir kindness in sharing malware samples and the analysisresults. The Purdue authors were supported in part by DARPAFA8650-15-C-7562, NSF 1748764, 1901242 and 1910300,ONR N000141410468 and N000141712947, and Sandia Na-tional Lab under award 1701331. The UVA author was sup-ported in part by NSF 1850392.
13
REFERENCES
[1] Clickless powerpoint malware installs when users hover over alink. https://blog.barkly.com/powerpoint-malware-installs-when-users-hover-over-a-link.
[2] Cuckoo. https://cuckoosandbox.org/.[3] Cybersecurity statistics. https://blog.alertlogic.com/10-must-know-
2018-cybersecurity-statistics/.[4] Evil clone attack. https://gbhackers.com/evil-clone-attack-legitimate-
pdf-software.[5] Fileless malware. https://www.cybereason.com/blog/fileless-malware.[6] Linux anti-vm. https://www.ekkosec.com/blog/2018/3/15/linux-anti-
vm-how-does-linux-malware-detect-running-in-a-virtual-machine-.[7] Mirai malware. https://en.wikipedia.org/wiki/Mirai (malware).[8] Padawan. https://padawan.s3.eurecom.fr/about.[9] Qemu user emulation. https://wiki.debian.org/QemuUserEmulation.
[10] Tencent habo. https://blog.virustotal.com/2017/11/malware-analysis-sandbox-aggregation.html.
[11] Upx. https://upx.github.io/.[12] Virustotal. https://www.virustotal.com/gui/home/upload.[13] Emery D. Berger and Benjamin G. Zorn. Diehard: Probabilistic memory
safety for unsafe languages. In Proceedings of the 27th ACM SIGPLANConference on Programming Language Design and Implementation,PLDI ’06. ACM, 2006.
[14] Sandeep Bhatkar, Daniel C. DuVarney, and R. Sekar. Address obfusca-tion: An efficient approach to combat a board range of memory errorexploits. In Proceedings of the 12th Conference on USENIX SecuritySymposium - Volume 12, SSYM’03. USENIX Association, 2003.
[15] Sandeep Bhatkar, R. Sekar, and Daniel C. DuVarney. Efficient tech-niques for comprehensive protection from memory error exploits. InProceedings of the 14th Conference on USENIX Security Symposium -Volume 14, SSYM’05. USENIX Association, 2005.
[16] Leyla Bilge, Davide Balzarotti, William Robertson, Engin Kirda, andChristopher Kruegel. Disclosure: detecting botnet command and controlservers through large-scale netflow analysis. In Proceedings of the 28thAnnual Computer Security Applications Conference. ACM, 2012.
[17] Ahmet Salih Buyukkayhan, Alina Oprea, Zhou Li, and William Robert-son. Lens on the endpoint: Hunting for malicious software throughendpoint data analysis. In International Symposium on Research inAttacks, Intrusions, and Defenses. Springer, 2017.
[18] Binlin Cheng, Jiang Ming, Jianmin Fu, Guojun Peng, Ting Chen,Xiaosong Zhang, and Jean-Yves Marion. Towards paving the way forlarge-scale windows malware analysis: generic binary unpacking withorders-of-magnitude performance boost. In Proceedings of the 2018ACM SIGSAC Conference on Computer and Communications Security.ACM, 2018.
[19] Emanuele Cozzi, Mariano Graziano, Yanick Fratantonio, and DavideBalzarotti. Understanding linux malware. In Proceedings of the 39thIEEE Symposium on Security and Privacy, 2018.
[20] Yanick Fratantonio, Antonio Bianchi, William Robertson, Engin Kirda,Christopher Kruegel, and Giovanni Vigna. Triggerscope: Towardsdetecting logic bombs in android applications. In 2016 IEEE symposiumon security and privacy (SP). IEEE, 2016.
[21] Xunchao Hu, Yao Cheng, Yue Duan, Andrew Henderson, and Heng Yin.Jsforce: A forced execution engine formalicious javascript detection. InXiaodong Lin, Ali Ghorbani, Kui Ren, Sencun Zhu, and Aiqing Zhang,editors, Security and Privacy in Communication Networks. SpringerInternational Publishing, 2018.
[22] Alexandros Kapravelos, Chris Grier, Neha Chachra, ChristopherKruegel, Giovanni Vigna, and Vern Paxson. Hulk: Eliciting maliciousbehavior in browser extensions. In 23rd {USENIX} Security Symposium({USENIX} Security 14), 2014.
[23] Amin Kharraz, William Robertson, Davide Balzarotti, Leyla Bilge,and Engin Kirda. Cutting the gordian knot: A look under the hoodof ransomware attacks. In International Conference on Detection ofIntrusions and Malware, and Vulnerability Assessment. Springer, 2015.
[24] Amin Kharraz, William Robertson, and Engin Kirda. Surveylance:automatically detecting online survey scams. In 2018 IEEE Symposiumon Security and Privacy (SP). IEEE, 2018.
[25] Kyungtae Kim, I Luk Kim, Chung Hwan Kim, Yonghwi Kwon, YunhuiZheng, Xiangyu Zhang, and Dongyan Xu. J-force: Forced execution onjavascript. In Proceedings of the 26th International Conference on WorldWide Web, WWW ’17. International World Wide Web ConferencesSteering Committee, 2017.
[26] Clemens Kolbitsch, Paolo Milani Comparetti, Christopher Kruegel,Engin Kirda, Xiaoyong Zhou, and Xiaofeng Wang. Effective andefficient malware detection at the end host. In USENIX 2009, 18thUsenix Security Symposium, 2009.
[27] Clemens Kolbitsch, Engin Kirda, and Christopher Kruegel. The power ofprocrastination: detection and mitigation of execution-stalling maliciouscode. In Proceedings of the 18th ACM conference on Computer andcommunications security. ACM, 2011.
[28] Christopher Kruegel, Engin Kirda, Darren Mutz, William Robertson,and Giovanni Vigna. Polymorphic worm detection using structuralinformation of executables. In International Workshop on RecentAdvances in Intrusion Detection. Springer, 2005.
[29] Lorenzo Martignoni, Mihai Christodorescu, and Somesh Jha. Omniun-pack: Fast, generic, and safe unpacking of malware. In 23rd AnnualComputer Security Applications Conference (ACSAC 2007), 2007.
[30] Kirti Mathur and Saroj Hiranwal. A survey on techniques in detectionand analyzing malware executables. International Journal of AdvancedResearch in Computer Science and Software Engineering, 3(4), 2013.
[31] Tipp Moseley, Dirk Grunwald, Daniel A Connors, Ram Ramanujam,Vasanth Tovinkere, and Ramesh Peri. Loopprof: Dynamic techniquesfor loop detection and profiling. In Proceedings of the 2006 Workshopon Binary Instrumentation and Applications (WBIA), 2006.
[32] Fei Peng, Zhui Deng, Xiangyu Zhang, Dongyan Xu, Zhiqiang Lin, andZhendong Su. X-force: Force-executing binary programs for securityapplications. In Proceedings of the 23rd USENIX Security Symposium,2014.
[33] Zhenhao Tang, Juan Zhai, Minxue Pan, Yousra Aafer, Shiqing Ma,Xiangyu Zhang, and Jianhua Zhao. Dual-force: Understanding web-view malware via cross-language forced execution. In Proceedings ofthe 33rd ACM/IEEE International Conference on Automated SoftwareEngineering, ASE 2018. ACM, 2018.
[34] Heng Yin, Dawn Song, Manuel Egele, Christopher Kruegel, and EnginKirda. Panorama: Capturing system-wide information flow for malwaredetection and analysis. In Proceedings of the 14th ACM Conference onComputer and Communications Security, CCS ’07. ACM, 2007.
APPENDIX
A. Spec2000 Benchmark
Benchmark source lines binary size # insn # block # func164.gzip 8,643 143,760 7,650 707 61175.vpr 17,760 435,888 32,218 2,845 255176.gcc 230,532 4,709,664 378,261 36,931 1,899181.mcf 2,451 62,968 2,977 213 24
186.crafty 21,195 517,952 42,084 4,433 104197.parser 11,421 367,384 24,584 2,911 297252.eon 41,188 3,423,984 40,119 7,963 615
253.perlbmk 87,070 1,904,632 133,755 12,933 717254.gap 71,461 1,702,848 91,608 9,020 458
255.vortex 67,257 1,793,360 109,739 16,970 624256.bzip2 4,675 108,872 6,859 577 63300.twolf 20,500 753,544 57,460 4,280 167
B. Time Distribution
PathExploration
3%GlobalVar Init
5% Memory
Pre-Planning
13% PAMA Preparation
2%
Heap Init
24%
69%
LocalVar Init
CodeExecution
84%
C. Details of Malware Analysis Result
Cuckoo Habo Padawan Cuckoo++ X-Force PMPAvg. 41.65 38.88 53.15 53.28 67.40 133.36
14
IDM
D5
Cuc
koo
Hab
oPa
daw
anC
ucko
o++
X-F
orce
PMP
001
0005
6adf
d698
2498
c184
f429
d7af
61d4
2922
2831
4292
002
0054
49f2
6bb0
033c
8ba5
cfbb
5c2c
6f6b
1515
2527
3474
003
0191
642a
fcab
b6cb
2e94
4982
2ea1
0d37
7065
6998
126
128
004
03cf
e768
a8b4
ffbe
0bb0
fdef
9863
89dc
2729
2827
3221
600
504
5136
430e
dac1
24ea
134b
f2a3
2a4a
6015
1526
1516
3300
605
7857
3024
9052
1bd5
2d25
a141
bbbd
fb14
1427
2734
591
007
0686
a745
9152
174f
821c
8c63
5cfb
da8a
4753
4749
6790
008
08af
b611
1b6b
3d57
4036
cf10
fe78
7063
1414
2527
3345
009
09e4
b26d
f6b4
99a8
1453
766c
1722
6106
2232
2737
4680
010
0b85
5d8d
6a3c
3ac8
d5fd
6931
570e
02ae
4853
4855
6992
011
0bc2
cbb5
be3e
6513
55a5
0c07
8854
64bf
1515
2629
1733
012
0c0d
2ed3
3316
dc5a
92a2
7850
07db
cb50
710
1719
2527
013
0c1a
a91e
8cae
4352
eb16
d93f
17c0
da2b
1515
2523
3474
014
0cfe
8985
c56d
a5a8
21ff
9bf3
5aa3
dbd4
2235
3035
4458
015
0d18
6ccf
5829
dd5b
ffdc
2aff
944f
e2f6
2321
3437
4561
016
10c4
7191
922e
efcf
ae39
bf5b
e540
bd44
1516
2327
3535
017
10f5
beac
257a
9266
5866
cdc9
9550
b7bb
2017
1925
3134
701
811
3c07
9464
639b
4a12
826b
42c1
d96a
c724
2235
3546
7101
911
c489
ddea
8580
30b2
3f7a
c184
9944
3948
5447
5569
8702
012
26e4
36e5
e830
c9fb
e580
43fa
4f9f
3b43
4042
5976
8302
113
21bd
12e1
64aa
7c8b
7e39
afe7
bc8a
6220
1831
2742
5602
213
2397
a7e7
93fb
4052
f8d4
4634
a155
8236
4036
5063
7302
313
7c15
20b3
7dfc
3ce5
072b
e799
5c96
fc14
1424
2633
4502
413
f2bb
2af1
6f51
3b4a
35a2
6c6f
8f5c
bc40
4142
4058
6402
517
5793
13f1
4995
e2bf
a75a
7035
62de
bf17
1827
2640
5602
617
9c76
48bb
6071
4797
3c2f
ccbc
c0e5
3021
2531
3443
4302
719
9c8f
fc24
8a35
d99e
1f26
ff79
bd93
9814
2014
1418
3202
81a
7e8d
dc31
7806
db05
3c47
2e12
99fe
3315
1525
2634
7402
91b
5054
939e
e601
d89f
daa4
4c10
9943
cf29
2228
2942
9103
01b
74e8
a749
948d
2fbf
2f90
486c
e63f
cf17
1827
3040
5503
11e
19b8
57a5
f5a9
6805
55fa
9623
a88e
9912
1712
1228
330
103
220
7716
6b21
e971
7df7
06ca
897e
5bfc
9414
1424
1415
4403
321
0e42
43c8
edc8
7499
ce7c
aa40
76d4
3322
4541
4060
6903
422
dc1d
b1a8
7672
1727
cca3
7c21
d316
555
85
717
135
035
23c4
2760
5322
7011
3de5
7b97
346e
dff0
2018
3030
4256
036
24bf
1279
bc8f
fe0c
8380
675c
b8c1
b94a
1717
2730
3940
037
25c3
64af
9d80
25dc
aa8f
6ac1
0c82
83af
1718
2832
4055
038
2825
5eb4
c29e
f042
0572
126d
8bc0
e481
2018
3030
4254
039
28c8
6684
3a94
6211
3eb2
6aef
1024
db08
1717
2829
3655
040
28fe
d854
eead
d32a
bfd9
46e0
692c
9ae4
2119
3233
4242
041
2ad2
8d99
4083
eb88
d56e
ded3
61d7
e381
2245
4041
5380
042
2d66
f629
e000
42de
8662
b384
b3c7
c3bb
2030
2524
4346
043
2e64
53a7
eac4
07db
e47b
70b7
2082
490c
2018
3033
4256
044
31c5
5141
1291
51ee
4728
a406
13b9
3eca
2117
2121
3155
045
3544
c1e6
82d9
7dc5
e5db
ef68
98f1
7fcf
1718
2832
4055
046
3626
3d91
d726
dcdb
93b9
7ea0
5ae8
656a
3640
3636
6369
047
36a3
32f5
a8dc
058f
df43
7fa6
7ecc
06cf
3936
3840
5874
048
39d4
6a0c
d603
93e5
571b
720c
915d
b30d
4854
4754
6993
049
3ad6
f8a2
57cf
a2d1
1292
cb64
20ed
884a
1819
2826
4157
050
3b0d
923c
f179
2151
e654
0ca3
8b3d
6d19
2017
1920
3274
IDM
D5
Cuc
koo
Hab
oPa
daw
anC
ucko
o++
X-F
orce
PMP
051
3dec
f1b4
e5e8
21c1
59e0
51a0
4fbf
0452
77
2019
2727
052
3e21
a608
b643
41e9
7a73
861f
a0b2
4ec2
2018
3249
6175
053
3f19
3286
767c
269b
786e
117c
4380
7f7b
1718
2732
4055
054
3fb8
5717
3602
6538
61b4
d054
7a49
b395
1414
2727
3448
705
540
845a
4a90
24e1
a44b
f245
3c11
dc40
0318
1829
2841
5505
640
8737
6ef7
2170
f248
eb2f
0665
a267
9622
1922
2533
4005
742
4f94
d07b
45ea
b1bd
3249
4cde
b4d6
7b22
4540
3522
6605
846
eaf3
f07c
2a59
e0bb
284a
7aac
b41d
c440
3739
4462
136
059
483b
322b
4283
5227
d98f
523f
9df5
c6fc
307
2630
4351
060
49c1
7897
6c50
cf77
db3f
6234
efce
5eeb
1718
2725
4040
061
4b1e
9e8c
cf91
9983
9350
9290
d436
ede3
6061
5960
7722
406
24e
593a
f1ab
2587
3681
c62c
a4f4
9e31
e321
1931
3543
4306
34f
5d0e
d102
de7c
171d
1df4
989c
4cdc
d015
1625
2936
5506
44f
a426
9b7c
e44b
fce5
ef57
4e6a
37c3
8f25
1621
2537
7906
550
2a90
ed7a
851b
01b3
40ad
ed82
2c4d
e028
2227
3341
119
066
5242
87dd
a3d6
d8e5
9ebe
2494
76ed
8181
2723
2632
4274
067
53ad
943f
e07b
e315
d908
c6b8
fe30
5a08
2422
3540
4669
068
54b0
f140
da40
e571
3377
f4d4
a8f1
43ad
2417
2526
3415
906
955
9169
cd81
67dc
baaf
065d
6a12
2a28
9d20
4338
3340
5707
055
e0a8
737b
091d
a7bd
a706
0b75
b2e1
1960
6162
6010
122
707
156
cb1c
4e78
8e63
325b
bb53
1da1
87e6
0931
2759
6470
9607
257
b1ff
91b5
9aad
a9a1
c566
940d
b4d4
6a27
2928
2749
9007
357
b4d2
1080
51db
e43d
7b35
777b
a76d
4015
1526
2734
7807
458
2f47
ec97
5b0b
a8ca
fe5a
39cc
cbd5
5224
2234
3646
7107
558
af33
baf6
8feb
637b
59a2
0ba4
ea0c
0326
5348
5165
9607
65a
6fd6
3f4f
fc60
37dc
192b
6c3f
456e
8755
3258
5876
123
077
5b36
aebe
d504
b731
23e1
0de2
1529
b638
2119
3134
4343
078
5c1d
d20f
74da
c823
0686
4a41
1f96
171c
140
149
8714
015
718
307
95c
47f0
9a37
376d
9b6a
4e97
518c
435d
c917
1827
2739
3908
05c
f611
0f21
b801
23f5
77e8
5bf8
1af8
2f22
4540
4760
9308
15d
6aa6
7ce3
4270
3f67
3592
5d35
9c30
4943
4042
4377
8308
25e
890c
b3f6
cba8
168d
078f
dede
0909
9629
2528
2944
7608
35f
1332
6e2c
90b7
0593
b645
540f
2521
3f17
1828
3240
5508
45f
b565
eee5
336c
0b30
451a
0a02
3036
b811
520
2029
3008
55f
d2ed
4f42
f0cc
e701
482f
bdb7
8a00
b136
736
3739
7608
65f
eaa8
5c62
d111
7a79
31df
0bf8
b62d
d321
2431
3343
8208
760
25e1
4c04
a7c3
5e8a
0498
85f0
35b9
7b15
1523
2532
3208
861
3965
7db0
8c3e
9d5d
2399
259e
8eaa
a028
2427
3343
7408
962
c2d2
9606
0d14
061f
5c54
f316
62da
c931
2757
6188
105
090
6355
f0ea
6c19
090e
0bae
dc57
016b
eb6c
1920
1919
3131
091
6643
78d1
0f61
0552
d17e
97cc
06ad
e139
2043
4839
4857
092
6b0b
d959
9779
c3a4
899a
6ee9
fd2e
ee03
2824
2835
4568
093
6dc1
f557
eac7
093e
e9e5
8073
85db
cb05
1515
2622
3474
094
705d
f7bc
13a3
fc1b
bfc7
9735
455f
da68
2421
2528
3445
095
70ad
6b0a
94a0
ef3f
f974
833d
d729
6b8d
2925
2829
4517
309
671
7dfa
0468
33da
c608
b6f1
a274
a479
388
723
2329
444
097
72af
ccb4
55fa
a4bc
1e5f
16ee
67c6
f915
362
362
291
362
423
505
098
7412
4dae
8fdb
b903
bece
57d5
be31
246b
3640
3638
4084
099
74f0
ec75
b6bc
ed0b
e2ed
e454
55fc
90a5
417
4041
4458
100
75e0
4ad8
2835
9d2d
2571
8430
bc5f
3dd3
1414
2426
3354
15
IDM
D5
Cuc
koo
Hab
oPa
daw
anC
ucko
o++
X-F
orce
PMP
101
7707
56fd
aed2
3e4e
f3c0
a17f
26bc
22b6
1717
2828
3556
102
7dad
01f2
6f01
992d
24d0
f8e6
d08d
042e
1718
2824
4055
103
81b6
ee21
6e10
e171
0470
6536
c21a
479a
3936
3948
5915
710
481
ea37
9c23
7724
249c
137f
c83e
f21e
9a6
96
619
3510
585
0177
156d
5a01
0254
bba5
7466
64a3
c715
1526
2734
3410
686
2cfa
928c
8edf
d50e
d22e
08bb
b14c
6117
1827
2540
5610
789
8dde
6afb
3142
e607
5283
59b0
935e
9e48
5447
4869
8810
88b
d0c5
f369
8721
8a95
dc56
677c
40f8
8017
1727
2936
5710
98b
fed4
ef10
67ca
119d
4d71
a66a
84e0
6e8
725
2329
2911
08c
5c1e
62d7
37ff
d0dc
36b2
c125
2ddd
7537
4038
5063
266
111
8dba
0738
910e
f345
90ce
a87a
3c1a
c538
2732
4851
7474
112
8df9
ec7c
d1de
7895
7ea8
00fd
63d6
6051
3936
3939
5813
211
38f
1948
4738
7186
899c
c8d9
f9ca
903e
0710
311
213
914
314
719
011
490
1cbf
f407
84ee
4051
8fda
6471
e70b
aa15
1426
2633
7011
591
2bca
5947
944f
dcd0
9e96
20d7
aa8c
4a15
1525
2634
8311
693
53a0
60cc
5fc8
f26c
e8a0
105d
fac4
8f15
1426
2432
3411
793
61a4
d5b4
bf30
4175
9bd4
f727
920d
f214
1428
3334
587
118
93c2
f1ca
9949
435c
ffe8
1572
d3d2
1d5e
1515
2527
3434
119
942e
a0c4
cb72
9d48
78eb
5b89
9898
1228
4854
4854
6992
120
9680
4156
396b
ce25
d49c
4ea4
f058
d569
4753
4847
5983
121
96de
2982
978e
a899
ba4a
97ff
73e7
f466
1515
2627
3476
122
97ba
48a2
562e
856d
8eef
15e1
c9f6
585e
1718
2825
4257
123
9941
36a3
c183
9990
0f73
d085
bf42
a330
9794
4010
913
814
212
49d
2b50
7212
c19a
9dcf
9516
8745
e793
ea39
3638
3959
145
125
a254
70a5
b305
fc5e
7c80
b688
10e1
32b2
4340
4246
7683
126
a278
9638
8f0f
0dad
493e
7d78
6e48
eaab
1414
2321
1595
127
a3ab
4dfb
3e3b
160f
ed14
d923
db29
daec
2030
2523
4253
128
a440
4be6
7a41
f144
ea86
a783
8f35
7c26
4743
4655
6982
129
a494
4230
d620
8301
9d13
af86
1b47
6f33
1415
2425
3554
130
a4ee
cf76
f4c9
0fb8
0658
00d4
cad3
91df
2940
5956
7830
313
1a5
8fb8
3be4
0987
4271
fa04
7090
12b5
ad19
1729
3241
5513
2a6
2f2b
ca5c
0a5d
239c
6a37
32a2
f424
ab22
822
935
935
540
862
113
3a6
617c
5cb5
9135
e057
9949
8d26
4564
c775
7172
7595
129
134
a664
df72
a34b
863f
c0a6
e04c
9686
6d4c
1718
2833
3917
213
5a7
1079
102c
6f70
53a9
402f
72ce
c798
2522
1821
2222
3313
6a8
cd63
8e13
b184
8f34
7fc7
24e9
386e
a815
1523
2533
7913
7a8
f782
41bd
7b7c
ad50
e054
bcb4
dfa0
1b27
3248
4774
7413
8a9
6fc6
e018
d771
932b
70aa
f9eb
8b74
8434
735
335
941
552
371
313
9ab
2b93
6e95
da49
1789
caa8
02ec
4948
cf22
1921
2634
6514
0ab
40be
a438
fbf8
09b5
786d
52b3
8ea3
1839
3639
3959
144
141
abbf
052d
0c9d
84c5
a30b
f734
8e22
5b31
1819
2831
4041
142
ac2c
9ce2
b3ed
f070
4502
4d60
f9b4
e53e
2732
4860
7575
143
ad76
e4b7
470d
f936
8380
b2b5
3754
10b4
4037
3949
6214
814
4ae
c2df
8a6c
b35a
a5b0
1b0d
9f1f
879a
a120
3025
2342
4914
5b4
088d
aeb3
11c2
4d8f
9a20
b5ec
223b
c921
1720
2431
5514
6b7
5462
2e81
6fb2
2814
02b8
6f75
fa9c
cf26
2225
2642
337
147
b8f6
cdb7
360d
d241
1fcb
ed86
cf77
b775
1514
2522
3434
148
b91f
ed81
7500
f9c3
77ca
9c79
9e98
7c74
2732
4860
7474
149
be0d
b913
011e
51e3
424b
e784
1b13
fd05
1515
2626
3477
150
bf82
8780
5afd
fc72
ca6b
7c6e
76d5
b04a
347
355
352
351
521
716
IDM
D5
Cuc
koo
Hab
oPa
daw
anC
ucko
o++
X-F
orce
PMP
151
c276
4861
cacf
73cd
a222
7bfe
b67f
707d
88
710
1312
915
2c2
a5b7
5c72
73b3
b4d4
bf0a
234e
ea35
f228
2427
2943
6415
3c3
2a5d
9b0c
78b3
35af
5197
d383
1966
a941
4241
4150
6115
4c3
6625
389c
b473
9518
472d
e429
8536
fb54
3258
6075
120
155
c38d
08b9
04d5
e1c7
c798
e840
f1d8
f1ee
8284
8786
110
137
156
c533
1421
8033
7d02
f5e2
a6ee
2bf9
e099
1414
2727
3458
715
7c6
3cef
04d9
31d8
171d
0c40
b752
1855
e915
1526
3134
7915
8c6
4919
c972
36dc
ef4e
9714
0c11
53b2
7414
735
3146
327
159
c80b
8f2a
2d6a
9e15
00bf
a52f
864e
a46d
1718
2831
4055
160
c83b
5e8b
4782
4392
082c
8424
0bf2
f8b4
1718
2731
4055
161
c8c1
f2da
51fb
d0ae
a60e
11a8
1236
c9dc
2922
2829
4191
162
c97a
cd1f
ad05
a0b0
a782
5f56
47d4
244a
1718
2832
4055
163
cb04
7744
5fef
9c5f
1a5b
6689
bbfb
941e
7066
7099
125
127
164
cb3d
93f6
5c64
e48e
f812
74a4
9a74
8ce7
2925
2836
4511
616
5cc
29a2
24e3
2741
2e0d
b7f3
ce5c
4f4e
006
86
1418
3516
6cd
60f7
42fc
71f9
8b34
a264
c5f3
e55a
4214
1429
2534
3416
7cf
cd51
53e7
3940
6baa
7b35
4dd5
b28e
0417
1828
3140
5516
8d0
4c49
2a5b
7851
6a7a
36cc
2e1e
8bf5
2170
6669
9912
512
716
9d0
874b
a34c
fbdf
714f
cf2c
0a11
7cc8
e238
3537
3858
135
170
d0b9
d58f
3a45
4ad6
df2e
4d05
5858
c1e5
69
614
1833
171
d1a1
9e83
4ef3
a4f7
ecfc
d8af
04c6
ebe4
1414
2823
3458
717
2d2
1fb7
ed52
ba13
2942
4035
4c1f
528d
2f3
53
1013
269
173
d2cd
482b
a82e
592c
1dc5
ded7
db79
ec70
1515
2524
3474
174
d3a8
94f6
052e
cee1
ca87
b69e
619c
a0cb
3127
3031
4912
217
5d4
93af
745d
e315
c698
9355
a49d
21b2
a320
1831
3142
5617
6d7
21e7
efb5
d63e
af85
5407
4894
2f30
1d42
4243
4248
5217
7d7
d730
62d2
defe
111b
6ba3
bdcf
5e4e
1817
1728
3336
5517
8d9
79d2
dce9
7978
8c0c
e9cc
c72b
4456
1727
3248
6074
7417
9da
e9fd
1c16
b6fe
e713
f531
82cb
2d4e
1017
628
3140
4018
0db
1676
5a02
efbe
75ae
569c
5901
744c
1934
635
846
046
552
271
218
1dc
4db3
8f6d
3c1e
751d
cf06
bea0
72ba
9c15
1526
2934
7518
2dd
77f7
4445
d61c
8d80
335b
15d4
32c2
7b13
721
2633
110
183
de2e
4104
8e3a
54ac
1e6b
bae9
1ae9
99ab
2043
3942
3059
184
de57
98b6
9df9
2163
cdd2
5f36
2565
c521
2732
4944
7474
185
dff0
9a1a
31fa
dad5
18a6
760c
3cfb
dc17
2846
4251
6417
018
6e3
7ff9
a3fc
89bf
29ea
9633
3f3a
a7f2
9640
3740
4060
135
187
e3d8
0f2c
d1de
02c7
4f19
8189
aba3
3052
2922
2829
4291
188
e6ff
a02a
63c9
51e4
e8a1
31e4
3d9f
ea6a
1515
2520
3476
189
ec3d
e135
5a20
56a7
eb5e
799b
5e98
9d0b
2447
4449
6392
190
ec67
3fed
d528
23da
1eba
e701
9e04
2382
2119
3128
4282
191
ed62
ce1a
406b
2a0b
9d6d
79ca
4e35
72b6
1818
2833
3759
192
ee11
c233
77f5
3631
93b2
6dba
566b
9f5c
3127
5743
5587
193
f277
51af
292f
252f
1cc5
5f90
f15b
d30b
1422
284
162
203
423
194
f2b0
0b27
e6e8
d10d
3c27
525e
cd9a
f120
4753
4853
6992
195
f3e8
a50f
0c1c
3a51
0f88
2d0f
db12
1960
1414
2426
3333
196
f8cf
c2b7
f01c
3a26
f0a9
db32
b8c5
f51c
1716
2727
3636
197
fa68
eb45
4b37
401b
b047
6428
a3ae
84a5
2017
1924
3165
198
fa7a
3c25
7428
b4c7
fda9
f6ac
6731
1eda
2414
2027
3415
919
9fd
75a8
7293
ca32
15f3
c033
f64f
eefd
0f18
1729
3037
5820
0ff
02a1
6427
e320
0526
2203
50fa
8c9b
4f30
2630
3544
55
16
IDM
D5
Cuc
koo
Hab
oPa
daw
anC
ucko
o++
X-F
orce
PMP
201
011b
b615
de58
263b
483c
8fb0
4d04
525c
2016
1923
3051
920
202
7aaa
b9a6
c3a3
d94d
7885
8821
555a
8b31
2630
3136
107
203
02fc
2315
2110
db73
763d
50fa
2c9b
f8f9
1514
4243
3470
204
0356
1dd3
5406
b403
d854
0297
9b9d
05a2
4339
4243
7682
205
03b2
5978
73ba
0f0e
28e3
dc78
343d
d968
1716
2828
3552
206
03ed
77d8
a342
473b
ee10
0850
e42c
d11c
715
2820
2613
920
704
9d71
3e78
33ac
6fa0
cdf1
b632
dce1
dd15
1426
2733
7020
805
266e
c1f4
c998
1e70
2768
1563
fc88
6759
558
5962
112
209
0632
ef98
ee12
a475
4e7c
9142
8562
5ab0
216
178
299
216
269
344
210
0673
2943
0589
b374
c35e
1b69
6ada
34f9
2123
3121
2778
211
06a3
5dd4
6bae
273b
b428
5056
3c9f
51fe
3834
3745
5613
821
207
ce3c
632e
2399
c1b3
218a
7759
9ea7
7170
3999
102
9026
021
307
f5bb
c7f4
14bc
b25b
bb80
1424
0e8c
0f28
2427
2844
5621
408
dbfa
cee7
a4a7
7f25
f159
bc86
66a9
7420
1619
2330
563
215
0a44
d707
8bc1
c5f1
217f
f503
f2f3
ebc8
87
2017
2865
216
0b26
005c
71ce
a142
c87f
8e97
6cf7
04e0
7267
8972
9022
221
70b
9835
fd94
b8a9
6749
7835
cb13
e212
b126
1726
2633
3421
80d
4de5
0a28
c429
4576
aa83
4f13
d4f9
5915
2025
1516
7021
910
8079
ccf8
8556
2a92
cb36
3add
b418
2c7
1017
77
138
220
11f6
f1bb
81a8
37fa
b5b5
7835
2150
a7be
1818
2818
2356
221
125d
ca58
b815
61fa
fe56
7972
52d0
a39e
6863
7073
7172
222
135f
b83a
2a1f
ad99
4ac2
98da
a9a4
27bd
2823
2728
4255
223
13e0
645b
a42c
32bb
0494
19b8
3f2d
c804
1718
2830
3957
224
1408
f779
af2a
5ed4
e736
af10
7da2
9ec8
2017
1924
3046
225
14b7
88d4
c555
6fe9
8bd7
67cd
10ac
53ca
4949
166
165
183
220
226
15b0
9361
3803
80d3
bdcf
ec7d
316b
6951
306
306
337
340
350
457
227
1963
60a0
6bbe
f80d
5a9a
ae11
f589
4a34
2016
1923
3063
228
1a71
3da3
360a
3451
6ad8
2b15
23ab
f6d1
1717
2826
4064
229
1d21
a6d8
8e50
e371
e8bd
e993
d733
3d89
4846
4749
8486
230
1d54
16ae
2474
aedf
d68f
79e4
aacd
1b14
2017
3127
4170
231
1dbf
b9de
8ddd
9480
3969
3054
fe83
459c
7889
9183
104
227
232
1ed9
7c5d
e81a
7a90
3772
7c63
9faf
9bfe
2320
2227
3454
233
1f79
632b
b62b
3497
492e
c6fa
366d
98fc
349
407
422
419
495
655
234
21c7
5019
e965
cfa6
ca34
a670
c238
c379
137
2227
3414
723
523
6160
5b95
afa6
514d
d856
b218
54dd
2648
4547
5284
8623
623
70ef
9dbf
483c
20f4
8b4d
1a2a
6ab3
b234
620
835
434
636
058
223
725
6ad8
6b8c
ea17
b514
2304
97d6
2b89
0715
1526
2623
7123
825
a528
4bcd
99e2
4656
6e0a
927f
da27
fa17
1828
3139
5723
929
2d12
4aa5
8579
e182
3995
1f63
c38d
a714
412
920
516
620
830
024
029
5370
e5a3
afdb
8f6b
abdf
ff74
837f
0b49
4553
6684
8524
129
9557
4af0
3023
ed91
99bd
c54d
e34d
f013
1224
2614
3824
229
f518
d6fe
7de8
df67
91d1
1066
8b91
2d29
4466
6943
199
243
2b79
e388
966b
b783
ba81
e56b
490f
3b93
5247
5865
8114
124
42e
940a
e965
d9ff
64a0
b225
718e
7652
9021
1832
3343
5424
532
370b
31ab
6b2e
23e9
ab4a
dd4f
2819
aa8
820
2127
6424
633
1b1c
ca79
f04e
3ba0
c907
bcf0
7224
d138
3138
4046
6824
733
af29
cb0d
eee7
ee22
f994
f4a4
d23a
7420
1619
2431
5224
834
98ca
6576
a3ec
21cf
2884
0ffd
4db5
e717
1728
3035
5224
934
a4c3
3ba5
e445
1c57
96bb
4476
724d
6a15
1526
1516
7225
035
18cd
0ceb
ef50
798a
cda3
38f2
43f1
6c4
1315
1520
111
IDM
D5
Cuc
koo
Hab
oPa
daw
anC
ucko
o++
X-F
orce
PMP
251
356c
e264
ae08
67f6
0f34
cd78
a2f9
3ff0
3935
3941
5363
252
35bc
0e96
dec5
d36f
5533
2ea6
49c3
73d6
1413
2122
1578
253
364f
f454
dcf0
0420
cff1
3a57
bcb7
8467
314
152
362
314
341
385
254
38c9
40d0
37d6
5327
5b72
c9de
1b64
2727
103
6011
611
916
117
425
53e
c866
180f
9cac
1bcb
1d60
37d2
8465
6731
931
331
731
941
341
725
63f
037e
9dd4
4b74
b13d
6791
c6a2
d69f
1029
2528
3544
5325
742
7289
af22
c461
74ec
af98
7d21
7862
6d20
1619
2430
537
258
4547
60fe
8180
c3c3
bb06
2f8f
c4aa
1b7b
350
191
421
406
497
653
259
455c
a632
06d5
88da
68c0
7d7b
c2a6
eeeb
3935
3845
5814
326
045
8fe2
4395
25b3
f6b4
7ed4
ba9d
56f2
8e15
1523
1516
5926
145
a02f
b927
2e3a
cb5c
9a6c
65bf
41d7
6817
1627
3235
5226
245
a943
ce94
b89d
e26e
c923
dd79
b67c
6249
4649
6784
8526
348
1d0b
aa98
0493
79ab
7713
8273
93dc
317
1417
2127
139
264
487b
b61b
3eee
cb39
88bb
1d96
2b39
1470
2119
2021
3544
265
4996
9f44
8439
3afc
1e1f
4115
1512
e1b4
1916
1819
2943
266
4c78
c0b1
5048
a657
2136
9ec3
b076
a4d3
2624
3234
4456
267
4f46
355e
3b52
5340
dba5
4aae
f375
13b9
6059
6571
8915
426
851
bba8
09f6
6c8d
8df3
71f2
c5ec
690d
6814
1130
3037
486
269
51f5
16f9
1d06
a0ea
22b1
6a14
9901
9784
1717
2717
2263
270
528d
ded1
1385
d5f6
f0f2
cd1a
ed76
7612
1718
2833
2254
271
5512
7fe3
361c
858f
792c
1ed2
9397
9405
1816
2828
3685
272
5588
9bba
8c38
037b
6435
3664
e71e
4de2
1915
1822
2942
273
55a4
1048
7b1b
3332
0db1
89c7
330d
1d27
168
2623
3574
274
5835
a68f
0a6a
ca46
219e
2c3d
d67b
b08b
85
5342
5213
027
55a
8285
4f4c
17fd
eb96
d757
3775
d5c1
f726
2547
3645
5527
65d
5c68
9616
635c
7f1f
70e1
1f56
0cd7
a915
1427
2734
4727
761
c382
9b71
be53
cf53
1359
f117
9278
f843
4042
6076
8227
862
e8fa
e326
7ca4
77b5
bcf6
e20b
08db
5c70
569
970
270
581
882
027
967
e278
1ab7
6e0f
df90
e16f
eda6
f9bb
9218
1728
2836
5828
06b
dbf2
3cef
66b6
87d8
770c
dbb9
7515
2a51
6569
8811
311
428
16d
b508
7356
5946
688a
dbc2
95b7
1df7
9217
1730
3039
5528
26f
0182
8bff
7489
d754
3092
2d88
2802
ac7
1921
2330
119
283
7058
a6ff
263e
337c
28d0
2555
d4d5
d840
193
150
266
194
243
323
284
706c
0b48
c890
88fa
b58c
b1ea
a5cc
8481
2823
2733
4266
285
70da
56d8
1aac
fdd9
8303
2de8
d153
b134
1915
1823
2941
286
7191
1c87
0331
7d85
550f
b2c8
434c
ba2e
115
2016
2911
828
771
9b1b
9f69
1458
af3b
0da9
7464
9f42
bf8
724
2228
486
288
71f0
165f
8f32
3fab
eabb
6e78
99bd
82d9
1818
2931
4056
289
73e2
2cbf
6931
32f1
8efd
7de3
70b2
c649
1415
284
162
203
434
290
76f0
a6e2
e2b0
041e
b99f
d38b
e1a1
0d30
1817
2925
2052
291
7705
b32a
c794
8398
5284
4bb9
9d49
4797
215
180
282
285
266
340
292
7731
bca7
a293
3660
73a9
6bbe
ff46
ef1e
2622
2526
4359
293
78b3
573a
0b1c
48e1
ce76
8159
0729
b933
3436
4146
5469
294
78fa
cb6f
ed49
3a21
4931
b38d
a717
e0c7
1717
2832
2264
295
797c
5c00
edd1
b91c
c97c
c37d
dc0e
fd4a
2921
2529
3349
296
7b11
921e
962d
d58a
2a0d
91c1
3f35
8e6f
2118
3135
4382
297
80ea
54e6
b09a
879a
0049
6113
146b
9fe4
1717
2727
3552
298
826c
991f
c57c
b3ca
5938
54c2
6b0e
90d9
3025
3331
4428
029
983
c57d
b78a
4114
3f99
52f4
dfa0
be4e
8012
262
141
148
186
265
300
8416
c4a8
4f49
5fe4
7f5c
ddec
e8af
bb74
1716
2824
3551
17
IDM
D5
Cuc
koo
Hab
oPa
daw
anC
ucko
o++
X-F
orce
PMP
301
85c7
e24b
1c61
0e95
a00e
67de
4530
6475
2118
3232
4274
302
867a
c455
ee27
fbd7
872d
3aef
ac72
9bf3
7771
7877
139
236
303
8711
3c55
c539
8c65
c8aa
7157
b5b6
4f1a
85
5332
4177
304
88ec
b917
21cf
a62e
7243
17aa
0029
3278
8932
9894
110
252
305
8932
bd03
aeaf
81b1
b6d6
a7c9
7ea2
da1a
87
2415
2016
830
689
3b1e
ddfc
d390
b26b
8ddd
a3ac
725f
c651
4750
5590
9730
78a
4dab
eef4
e887
49a6
abe1
d272
003d
1572
6773
7712
512
630
88a
8248
3ea3
4fd1
5601
0d9e
a8a8
234a
4943
3942
6175
8130
98a
b662
4385
a750
4e13
8768
3b04
c5f9
7a15
316
929
222
127
470
531
08d
3ea7
5f16
0fdd
9ff8
efed
ab7e
4368
5151
4850
7189
9531
18d
68e2
86eb
caf2
a3e7
6e78
14bc
4084
fd18
1929
1823
5531
28f
184c
2e09
d6e5
c19e
1ede
a508
50c3
4742
4762
5762
8331
391
88f0
ff60
70eb
28b6
5aa1
c396
d898
3535
3535
4153
7831
491
f5d4
5b7a
24d6
9e9d
2d0b
8887
0c8c
4027
2326
2740
9831
593
30d7
d311
4fc7
bfa2
ee8d
05ad
6882
ec17
1727
3129
5431
693
7e25
c1c0
5915
0dab
0ec9
5a3a
7152
6221
2531
2127
7531
796
1c82
4f20
8dbd
57c2
c489
9558
30b1
9530
2529
3045
5631
896
9dd7
0e0d
ba7d
f04f
c935
4822
4ba8
a217
1728
3339
5531
998
4a05
24e3
3306
0a33
7c5a
6cea
e06b
4218
1828
3241
6432
098
579b
2885
81d0
2dcb
2e65
81f9
be5a
2f68
6369
7072
7432
19b
0ee3
bf1f
a0a2
b5e3
d07c
0b52
dab1
a643
4042
4375
8132
29b
6acc
30fc
9e22
4fea
7459
06ed
8f88
8923
2133
3544
5832
39b
7f5a
1228
fa66
cbd3
5e75
fb77
4fdc
8e15
320
324
616
119
965
532
49b
b32c
8115
e3c6
43ee
55dc
41e7
54da
737
1017
1225
139
325
a21e
5260
be78
4afb
d01b
93b2
0932
ce8c
1818
2827
2356
326
a273
83a4
644c
8f25
db5d
fbd6
496a
b5d7
1717
2730
3955
327
a27e
e2b8
f214
dfbb
5e15
7417
51c0
9bf7
144
129
166
144
177
285
328
a4d4
b5a8
4268
22a6
e261
41f0
a997
81a4
1414
2525
3245
329
a8c8
6a50
e561
3d22
84c7
e1a0
f18e
5bf2
1515
2522
3373
330
a976
deb5
1d29
5834
b033
609f
9d55
44ff
2823
2728
4261
331
a9b6
a5e7
044e
e975
dbdb
de90
245f
3938
1819
2918
2356
332
aa5b
ebb8
4c2b
aae8
2478
2a85
e2bd
e15a
5852
6058
7725
833
3aa
646e
4158
bc48
ecf4
c745
ef36
664f
1c26
2225
2642
131
334
ab27
fa9c
2b79
7eda
cfbe
961a
e013
72ad
2824
2728
4363
335
ac6d
0498
30db
2f68
ba01
425b
e8b6
d141
8783
8687
146
147
336
b03c
3233
0edd
83d1
0f23
c941
ce11
412f
2017
1923
3041
337
b04a
b29c
9a7a
4fb9
9c1a
8e60
aebc
5f38
1717
2826
2254
338
b0c2
3492
048f
6cd5
595c
f847
381f
d5b2
2017
1920
3054
433
9b1
598c
6f6e
9552
b8c0
7761
6379
3b52
9e18
1828
3140
5634
0b2
7d6f
a312
b314
d49a
7e4e
a7e8
5fc6
8515
1425
2033
7034
1b2
cd98
a0b6
f6ac
9de9
2c92
a702
ee5f
768
821
2027
6234
2b4
bc9b
6f1c
6898
1bad
1cb4
0e8c
f71e
974
615
1722
112
343
b58f
0433
67e6
057c
9c79
418d
332e
38c8
1416
284
162
204
420
344
b805
3ad0
8478
30c6
98b0
bdc3
5020
f0d8
1718
2730
3957
345
b855
20dd
2d64
d6d0
5fe7
5b61
1225
3fce
1413
2625
3248
034
6b9
3174
8458
cfb2
261c
f7c1
4fb0
441d
9524
2023
2734
4534
7ba
cee6
5f81
6151
2834
5982
051c
4a60
5f15
1426
1516
7034
8bc
225b
cbb8
0bef
9c0b
0d01
4305
a954
3d55
5454
7292
100
349
bf2d
e60d
4ddd
f43b
4313
668a
d04c
cafb
3835
3738
5613
935
0bf
ef69
6178
596e
2b80
1b39
6f8e
c4c2
0314
1326
1415
44
IDM
D5
Cuc
koo
Hab
oPa
daw
anC
ucko
o++
X-F
orce
PMP
351
c39c
03e2
76ac
9bf6
4a50
2aa9
7f41
87a9
2118
3534
2789
352
c425
54ae
f957
0285
5ecc
2c01
f01d
5cb9
7265
6799
124
125
353
c505
029f
2342
e045
2eed
10d7
7055
92fb
5147
5051
8995
354
c618
80e6
9964
0afb
bba3
e0ba
7a84
98b4
1718
2826
2254
355
c838
4a4b
1951
5354
48fe
3433
74e3
8629
2925
2833
4583
356
c87f
1455
ce2a
5d3b
68ce
4bd4
bb0f
2ffb
1718
2629
2254
357
c8d2
fbac
602f
a261
aa58
276a
2fd1
c1d9
2229
4651
4274
358
cba0
943d
3321
347d
28b2
93c1
4e2d
352f
88
2322
2816
835
9cc
acb9
6752
4b88
ec37
f977
9e82
6b89
ea24
2025
2734
4536
0cd
3f83
5f1e
f72f
9dc4
8be1
ea7f
912d
ee17
1727
1722
6336
1cd
9db4
3547
82ac
9a26
d927
7d2d
119e
c613
313
316
816
713
844
936
2ce
cd49
88e0
23f5
be02
ae9f
b8db
fd80
c318
1828
3240
5736
3cf
f22e
3737
8dbc
2800
72c7
51cd
13c6
1275
7174
103
130
131
364
d73f
ace1
dbd4
5383
e743
89a1
bb3a
2790
1515
2526
3372
365
d766
b045
d130
c0ab
c5d6
5be9
2548
66d2
2016
1923
3052
436
6db
5d47
8bdd
8c50
ee44
25c3
b7aa
7a03
4219
1220
2229
263
367
dbd1
c1eb
767a
4589
40a9
16a5
5e50
783b
2824
2728
4261
368
dc11
905d
b6d7
b885
d067
2836
690b
0789
1717
2722
3955
369
dced
35ba
29ce
e865
0406
4bf4
5c1f
dd34
88
2422
2949
037
0dd
1e01
91db
b0d9
e6c3
0f6a
17b9
6865
7e39
3539
3953
6337
1de
91ad
771b
54f7
3a92
4ac2
4a83
0c7b
d917
1628
1718
5337
2e4
1f79
65cb
a7e0
29c9
c803
274a
928e
f467
8610
882
102
198
373
e4be
b0ca
ef12
0a31
7c73
fc56
40ef
284b
1514
2526
3371
374
e5c6
6d51
421e
6f90
b8b7
095d
68f2
c9fa
1411
2929
3749
237
5e7
130e
2ca5
049b
e3ac
db4f
e013
06f9
5017
1827
2517
6337
6e8
b597
edd5
d41b
ce90
4b6d
4176
58c4
bf28
2327
2843
6337
7ea
d453
a063
15bf
c702
ad30
2821
337f
c220
3349
4565
7037
8ea
dfb2
b017
02d2
2f23
e1af
425f
2613
e917
1824
1722
6537
9eb
d879
0e97
fb14
03f7
2224
429d
6f89
e443
3942
4376
8338
0ec
5266
3c2e
836f
ab94
482c
345a
ab9c
5e24
1824
2931
4638
1ed
692a
dcc9
57fb
54a2
4fe6
e0c0
77c1
3267
6266
6711
711
838
2ee
14c8
b9fc
8578
f321
8cd1
da1b
a469
4020
2331
3241
5638
3ee
92d8
5933
b024
e8d8
2e03
ed6a
cbaa
f628
2327
2844
5638
4f0
b820
b966
02eb
7c63
821d
f7ce
fe4c
cd38
3437
3856
135
385
f335
f585
7f2d
30d0
d811
e1b7
32f0
890a
1514
2515
1669
386
f3c7
855a
2bc3
0b9d
02ba
a896
0a11
f2ca
5044
5250
6626
138
7f3
ff94
15de
6bab
4f4c
55d8
6e94
ea1e
8531
931
531
732
741
241
638
8f7
0d18
2ac7
bb3d
398a
e47d
3889
3dc1
e220
118
820
320
525
826
838
9f7
e9e3
3108
373f
9252
7c3a
fd8a
107a
ff23
1922
2937
4839
0f8
b421
94ec
19f3
f5a7
d7ca
edfb
4188
db8
723
2228
168
391
fa5c
5264
f466
8f7a
40f7
576a
27cf
e78b
1717
2531
3968
392
fb9c
492c
daaf
4a6b
e703
2919
c1f3
a8df
2016
1923
3055
039
3fc
b718
4960
449a
6163
21c1
4409
0b3a
a222
1921
2532
5439
4fc
bfb2
34b9
12c8
4e05
2a4a
393c
516c
7826
335
283
263
285
298
395
fdb5
9400
9e2a
a9f7
a70f
5e3c
0b78
cb86
1818
2832
4056
396
fe68
1844
0841
77d1
4a0a
2e5d
9ce9
893e
7788
8994
104
228
397
fe74
2579
bfbd
d885
a81f
a16c
57f7
dcf7
1514
2630
3373
398
febe
af98
1abc
f790
fb2f
77d6
c67c
ed7b
88
2124
3066
399
ff0c
5979
03c6
6d6c
5577
c86c
acde
0baf
3621
3540
5262
400
ff3a
b204
3c7a
9c8d
84ad
785b
b930
1f83
1514
2526
1569
18