Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Inside VMProtect

Samuel Chevet

16 January 2015

Agenda

Describe what VMProtect is
Introduce code virtualization in software protection
Methods for circumvention
VM logic

Warning

Some assumptions are made in this presentation
Only a few binaries have been studied
Mostly 64-bit targets

Plan

1 Introduction

Software-based protection

Content of the executable’s sections is encrypted and/or compressed
New code is appended to decrypt/decompress the sections
All kinds of anti-debug, anti-VM, ... checks are added
The executable’s entry point is redirected into this new code
Execution is transferred back to the original entry point after decryption/decompression

VMProtect

Memory protection
  Protects the file image in memory from any changes
  Integrity is checked before giving execution to the original entry point

VMProtect

Import protection
  All entries used by the original binary are removed from the Import Table
  Redirection code is appended for API calls
  Replace CALL DWORD PTR [@IAT] / CALL QWORD PTR [@IAT] (encoded on 6 bytes)
  by CALL VMProtect.section (encoded on 5 bytes)

1 byte left: two variations
  Before: fake push (the stack is readjusted during redirection)
  After: dead code (the return address is incremented during redirection)
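
The size difference between the two call forms can be sketched as follows. This is a minimal illustration, assuming the standard x86 encodings (FF 15 + disp32 for the indirect call, E8 + rel32 for the direct call). The function names, the example addresses and the filler byte 0x50 (a one-byte push) are hypothetical and not taken from VMProtect itself.

```python
import struct

def encode_indirect_call(iat_slot, call_addr, is_64bit):
    """Original form: CALL [IAT slot] = FF 15 + 32-bit displacement (6 bytes)."""
    if is_64bit:
        # x86-64 uses a RIP-relative displacement from the next instruction.
        return b"\xff\x15" + struct.pack("<i", iat_slot - (call_addr + 6))
    # 32-bit code embeds the absolute address of the IAT slot.
    return b"\xff\x15" + struct.pack("<I", iat_slot)

def encode_direct_call(target, call_addr):
    """Patched form: CALL rel32 = E8 + 32-bit relative offset (5 bytes)."""
    return b"\xe8" + struct.pack("<i", target - (call_addr + 5))

def patch_call(call_addr, stub, filler_before, filler=0x50):
    """Fill the 6-byte slot with a 5-byte call plus one filler byte."""
    if filler_before:
        # Variation "before": the filler byte precedes the call.
        return bytes([filler]) + encode_direct_call(stub, call_addr + 1)
    # Variation "after": the filler byte follows the call and is never executed.
    return encode_direct_call(stub, call_addr) + bytes([filler])

original = encode_indirect_call(0x402000, 0x401000, is_64bit=False)
patched = patch_call(0x401000, 0x500000, filler_before=False)
print(original.hex(), patched.hex())
```

Both byte strings are 6 bytes long, which is why the patch fits in place without moving any surrounding code.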

VMProtect

Resource protection
  Encrypt resources, except icons, manifest and some other system types
  Hook:
    LoadStringA/W
    LdrFindResource_U
    LdrAccessResource

License manager
  Track your sales online and manage serial numbers
  I have never worked on it

Code virtualization

In a simple packer, native code is simply encrypted and/or compressed
With code virtualization, native code is disassembled and compiled into proprietary bytecode
The bytecode is executed by a custom interpreter at run-time
Interpreter: fetch, decode, execute
The original native code has disappeared
An efficient anti-reverse-engineering technique

Code virtualization

The VM must reproduce CPU instructions exactly
The application’s context must be saved/restored correctly before/after emulation
The result in EFLAGS|RFLAGS must be correct
No emulation error is acceptable

VMProtect

VMProtect doesn’t decrypt the code at all
Native code is compiled into a proprietary polymorphic bytecode
From one binary to another, the VM will not be the same
There can even be different VMs inside the same binary!

Agenda

Questions?
  What is the architecture of the generated virtual CPU?
  Is the VM generated randomly?
  Is the VM bytecode obfuscated?
  Is it difficult to reconstruct the original bytecode?
  ...

Plan

2 Internal

Virtual machine architecture

The virtualization obfuscator is a Reduced Instruction Set Computing (RISC) machine
One Complex Instruction Set Computing (CISC) instruction is translated into multiple virtualized instructions

lea ecx, [ecx + ebx * 4 + 42]

Translated into several virtual instructions
1 Fetch ebx
2 Multiply ebx by 4
3 Fetch ecx
4 Add these two registers
5 Add 42
6 Store result in ecx

Virtual machine architecture

The language used by the virtualization obfuscator is stack-based
A stack machine implements registers with a stack
The operands of the arithmetic logic unit (ALU) are always the top two registers of the stack
The result from the ALU is stored in the top register of the stack
Reconstructing the original native code will involve removing the stack machine features
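
To make the stack-based model concrete, here is a minimal sketch that runs the earlier lea ecx, [ecx + ebx * 4 + 42] example on a toy stack machine. The opcode names (PUSH_REG, PUSH_IMM, MUL, ADD, POP_REG) and the `run` function are hypothetical and only illustrate the principle; they are not VMProtect's actual handlers.

```python
MASK32 = 0xFFFFFFFF

def run(program, regs):
    """Execute a list of (opcode, argument) pairs on a toy stack machine."""
    stack = []
    for op, arg in program:
        if op == "PUSH_REG":
            stack.append(regs[arg])
        elif op == "PUSH_IMM":
            stack.append(arg)
        elif op in ("ADD", "MUL"):
            # The ALU always consumes the two top stack entries
            # and leaves its result on top of the stack.
            b, a = stack.pop(), stack.pop()
            stack.append((a + b if op == "ADD" else a * b) & MASK32)
        elif op == "POP_REG":
            regs[arg] = stack.pop()
        else:
            raise ValueError("unknown opcode: %r" % op)
    return regs

# lea ecx, [ecx + ebx * 4 + 42] expressed as stack operations
LEA_PROGRAM = [
    ("PUSH_REG", "ebx"),
    ("PUSH_IMM", 4),
    ("MUL", None),
    ("PUSH_REG", "ecx"),
    ("ADD", None),
    ("PUSH_IMM", 42),
    ("ADD", None),
    ("POP_REG", "ecx"),
]

print(run(LEA_PROGRAM, {"ebx": 3, "ecx": 10}))  # ecx becomes 10 + 3*4 + 42 = 64
```

A single native instruction turns into eight virtual ones here, which is the blow-up a deobfuscator has to fold back.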

Virtual machine context

Before entering the virtualization obfuscator, the host’s registers and flags must be saved into the VM’s context structure

VMProtect context structure
  8/16 for VM registers
  2 for Relocation-Difference and SECURITY_CONSTANT
  6 for temporary usage (mostly EFLAGS|RFLAGS)
  0x80 bytes free for pushed variables

sub esp, 0C0h ; 32bit
sub rsp, 140h ; 64bit
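
As a quick consistency check, the counts above can be added up. This sketch assumes that every counted entry is one pointer-sized slot (4 bytes on 32-bit, 8 bytes on 64-bit) and that the 8/16 split refers to 32-bit/64-bit targets; `context_size` is a hypothetical helper name.

```python
def context_size(vm_registers, pointer_size):
    """Bytes reserved on the native stack for the VM context."""
    # VM registers + relocation-difference/SECURITY_CONSTANT + temporaries
    slots = vm_registers + 2 + 6
    # plus the 0x80 bytes left free for pushed variables
    return slots * pointer_size + 0x80

print(hex(context_size(8, 4)))   # 0xc0, matches "sub esp, 0C0h"
print(hex(context_size(16, 8)))  # 0x140, matches "sub rsp, 140h"
```

Under these assumptions the totals agree with the two stack adjustments shown on the slide.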

Interesting fact

Register EDI|RDI holds VM context
Register EBP|RBP holds VM stack

Maximum value of RBP|EBP
  64 bit: RDI + 0xE0, 32 bit: EDI + 0x50
  If this value is reached, reserve more space on the stack and copy VM context and pushed variables

VMProtect Context

VM context is accessed through EDI|RDI (via memory location)
Index register is EDI|RDI
Index base is stored in opcode operands (can be encrypted, see later)
From one VM to another, VM registers will not be stored at the same index!
This makes the layout of the VM context totally random

Virtual machine implementations

VM Loop
  Read the bytecode at instruction pointer
  Compute opcode handler
  Call the handler
  Can have two variations
    Down-read VM-Bytes
    Up-read VM-Bytes

ESI|RSI: VM instruction pointer
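
The loop and its two read directions can be sketched as follows. This is an illustrative toy, assuming that "up-read" means the instruction pointer is incremented after each byte and "down-read" means it is decremented. The three handlers and their opcode numbers are hypothetical.

```python
def h_push_imm8(state, bytecode, ip, step):
    state["stack"].append(bytecode[ip])   # the operand byte follows the opcode
    return ip + step

def h_add(state, bytecode, ip, step):
    b, a = state["stack"].pop(), state["stack"].pop()
    state["stack"].append((a + b) & 0xFF)
    return ip

def h_exit(state, bytecode, ip, step):
    state["running"] = False
    return ip

HANDLERS = [h_push_imm8, h_add, h_exit]   # opcode value indexes the handler table

def vm_loop(bytecode, start, step):
    """step = +1 reads the bytecode upwards, step = -1 reads it downwards."""
    state = {"stack": [], "running": True}
    ip = start
    while state["running"]:
        opcode = bytecode[ip]              # read the bytecode at the instruction pointer
        ip += step
        handler = HANDLERS[opcode]         # compute the opcode handler
        ip = handler(state, bytecode, ip, step)  # call it; it may consume operands
    return state["stack"]

PROGRAM = bytes([0, 2, 0, 3, 1, 2])        # PUSH 2, PUSH 3, ADD, EXIT
print(vm_loop(PROGRAM, 0, +1))                       # up-read
print(vm_loop(PROGRAM[::-1], len(PROGRAM) - 1, -1))  # down-read, same program reversed
```

Both calls leave the same value on the VM stack; only the memory layout of the bytecode differs.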

Bytecode encryption

If an encryption key is present
  Start of code virtualization depends on an encryption key
  VM Loop depends on this key to decrypt the opcode
  Handler depends on this key to decrypt operands
  Key is updated during VM Loop and opcode handler execution
  Impossible to start studying the virtualized code at a chosen point (the key state there is unknown)
  EBX|RBX holds the encryption key
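
Why a rolling key prevents starting the analysis in the middle of the bytecode can be shown with a toy cipher. The transformation below (subtract the key, XOR with 0x30, then update the key with the decrypted byte) is a simplified, hypothetical stand-in and not the exact VMProtect routine.

```python
def decrypt_stream(encrypted, key):
    """Decrypt bytes with a key that changes after every decrypted byte."""
    plain = []
    for byte in encrypted:
        value = ((byte - key) & 0xFF) ^ 0x30
        key = (key - value) & 0xFF        # key update depends on the plaintext
        plain.append(value)
    return bytes(plain), key

def encrypt_stream(plain, key):
    """Inverse of decrypt_stream, used here only to build test data."""
    encrypted = []
    for value in plain:
        encrypted.append(((value ^ 0x30) + key) & 0xFF)
        key = (key - value) & 0xFF
    return bytes(encrypted)

PLAIN = bytes([0x10, 0x20, 0x30, 0x40])
START_KEY = 0x5A
ENCRYPTED = encrypt_stream(PLAIN, START_KEY)

print(decrypt_stream(ENCRYPTED, START_KEY)[0] == PLAIN)          # True
print(decrypt_stream(ENCRYPTED[2:], START_KEY)[0] == PLAIN[2:])  # False
```

Decoding from offset 2 only works with the key value reached after the first two bytes, which is the state a tracer has to record.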

Logical & Arithmetic operations

Some logical and arithmetic opcode handlers must take care of EFLAGS|RFLAGS
Each of them has code to store the flags after the operation

; ... operation
pushfq
pop qword ptr [rbp+0]

After such a handler, the VM calls a handler to POP the flags into a VM register
GUESS: there are VM opcode pairs
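
The opcode pair can be modelled as below. This is a sketch that assumes the arithmetic handler leaves the result below the flags on the VM stack and that only CF, ZF and SF are tracked (bit positions 0, 6 and 7, as in EFLAGS). The handler and register names are hypothetical.

```python
MASK64 = (1 << 64) - 1
CF, ZF, SF = 1 << 0, 1 << 6, 1 << 7   # EFLAGS bit positions

def vm_add(stack):
    """Pop two operands, push the sum, then push the resulting flags on top."""
    a, b = stack.pop(), stack.pop()
    total = a + b
    result = total & MASK64
    flags = 0
    if total > MASK64:
        flags |= CF
    if result == 0:
        flags |= ZF
    if result >> 63:
        flags |= SF
    stack.append(result)
    stack.append(flags)               # mirrors pushfq / pop qword ptr [rbp+0]

def vm_pop(stack, context, reg):
    """Second half of the pair: move the top of the stack into a VM register."""
    context[reg] = stack.pop()

stack, context = [MASK64, 1], {}
vm_add(stack)
vm_pop(stack, context, "vm_r2")
print(stack, hex(context["vm_r2"]))   # [0] 0x41
```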

VM Block Start

Push all registers, and EFLAGS|RFLAGS
  Order is totally random
Push SECURITY_CONSTANT
Push Relocation-Difference
Decrypt SECURITY_CONSTANT
Store all pushed registers, flags & others into VM context
  Index in VM context is totally random

VM Block End

Push from VM context registers to stack
IF VM_EXIT
  Pop all registers and EFLAGS|RFLAGS and return
ELSE
  Encrypt SECURITY_CONSTANT
  Push SECURITY_CONSTANT
  Push Relocation-Difference
  Jump to next VM_Block

Internal registers

What we know
  EBX|RBX: encryption key
  EDI|RDI: VM context
  ESI|RSI: VM instruction pointer
  EBP|RBP: VM stack
  EDX|RDX: arithmetic/result operation of handler address
  EAX|RAX: opcode value
  R13: relocation-difference
  R12: opcode handler table

Plan

3 Analysis

Dynamic analysis

Now that we know how it "works"
Before using symbolic execution to solve this problem
We have to write an "intelligent code tracer"
So that we can be sure our symbolic execution is not buggy

Dynamic analysis

Tracing the full execution would take too much time
Locate the VM Loop
Inject a DLL that sets up a hardware breakpoint (HBP) on execution at the VM Loop
Store in DB:
  VM Stack
  VM Context
Make a local web service to output the result (diff between two states of VM_STACK, VM_CONTEXT)
Initialize VM Context with default value
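
The diff step can be sketched as follows, assuming each recorded state is a dictionary mapping a slot label to its value. The `diff_states` name and the slot labels are hypothetical; the tool's real storage format is not described in the slides.

```python
def diff_states(before, after):
    """Return {slot: (old, new)} for every slot whose value changed."""
    changes = {}
    for slot in sorted(set(before) | set(after)):
        old, new = before.get(slot), after.get(slot)
        if old != new:
            changes[slot] = (old, new)
    return changes

# Two snapshots taken at consecutive hits of the VM loop breakpoint
state_1 = {"ctx+0x08": 0x2, "stack+0x00": 0x1}
state_2 = {"ctx+0x08": 0x5, "ctx+0x10": 0x7, "stack+0x00": 0x1}

print(diff_states(state_1, state_2))
```

Initializing the context with a recognizable default value makes slots that a handler writes for the first time stand out in such a diff.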

Static analysis

Now we can know the VM context at any point (perfect for debugging)
We want to be able to reconstruct the original bytecode
Automate the task
Use the metasm framework (https://github.com/jjyg/metasm)

Metasm

Ruby open source framework
Assembler, disassembler, compiler, linker, ...
Description of the semantics of each instruction
Allows us to compute the semantics of a set of instructions
  code_binding

Metasm

code_binding example
rax => (byte ptr [rsi-1]&0ffffff00h)|((((((byte ptr [rsi-1]>>1)&7fffff80h)|(((((byte ptr [rsi-1]>>1)&7fffff80h)|((((byte ptr [rsi-1]&0ffffffffh)-(rbx&0ffh))>>1)&7fh)^7fffffffh)&7fh))^30h)&7fh)|((((byte ptr [rsi-1]&0ffffffffh)-(rbx&0ffh))<<7)&80h))
rdx => qword ptr [rbp]&0ffffffffffffffffh
rbx => (rbx&0ffffffffffffff00h)|((rbx-((((((byte ptr [rsi-1]>>1)&7fffff80h)|(((((byte ptr [rsi-1]>>1)&7fffff80h)|((((byte ptr [rsi-1]&0ffffffffh)-(rbx&0ffh))>>1)&7fh)^7fffffffh)&7fh))^30h)&7fh)|((((byte ptr [rsi-1]&0ffffffffh)-(rbx&0ffh))<<7)&80h)))&0ffh)
rbp => (rbp+8)&0ffffffffffffffffh
rsi => (rsi-1)&0ffffffffffffffffh

We just need to replace (inject) the known values inside the expressions, so that they can be reduced

RSI: bytecode_ptr
RBX: encryption key
RBP: VM_stack
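
The idea of injecting known values and reducing can be sketched with a tiny expression evaluator. This is a plain Python stand-in, not metasm's API: expressions are nested tuples, `reduce_expr` is a hypothetical name, and the starting value of rsi is an assumed example.

```python
import operator

MASK64 = (1 << 64) - 1
OPS = {"+": operator.add, "-": operator.sub, "&": operator.and_,
       "|": operator.or_, "^": operator.xor}

def reduce_expr(expr, known):
    """Substitute known symbols and fold sub-expressions that become constant."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return known.get(expr, expr)      # unknown symbols stay symbolic
    op, lhs, rhs = expr
    lhs, rhs = reduce_expr(lhs, known), reduce_expr(rhs, known)
    if isinstance(lhs, int) and isinstance(rhs, int):
        return OPS[op](lhs, rhs)
    return (op, lhs, rhs)

# rsi => (rsi-1)&0ffffffffffffffffh  and  rbp => (rbp+8)&0ffffffffffffffffh
RSI_BINDING = ("&", ("-", "rsi", 1), MASK64)
RBP_BINDING = ("&", ("+", "rbp", 8), MASK64)

print(hex(reduce_expr(RSI_BINDING, {"rsi": 0x140087E02})))  # 0x140087e01
print(reduce_expr(RBP_BINDING, {}))                         # still symbolic
```

With the bytecode pointer and key known, the long rax and rbx expressions collapse to constants in the same way, while the VM stack pointer can stay symbolic.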

VM’s symbolic internal

The VM symbolic state is huge
All VM registers must implement each operand size (byte, word, dword, qword)
VM context contains a lot of internal registers

vm_symbolism = {
  :rax => :opcode,
  :rbx => :vmkey,
  :rsi => :bytecode_ptr,
  :rbp => :vm_stack,
  Indirection[[:vm_stack], 8, nil] => :QWORD_OP_1,
  ...
  Indirection[[:vm_stack, :+, 0x8], 8, nil] => :QWORD_OP_02,
  Indirection[[:rdi], 8, nil] => :qword_vm_r0,
  Indirection[[:rdi, :+, 0x8], 1, nil] => :byte_vm_r0,
  ...
  Indirection[[:rdi, :+, 0x8], 8, nil] => :qword_vm_r1,
  ...
  Indirection[[:rdi, :+, 0x18], 8, nil] => :qword_vm_r3,
  ...

Static analysis

Set up the start context of the VM
Disassemble and compute the semantics of the current opcode handler
Compute the next state with the solved semantics
Loop if not VM_EXIT

Problem
  We need to know the end address for the code_binding
  Check the list of basic blocks: if one basic block matches the check on the maximum value of VM_STACK (EBP) => STOP ADDR
  If not, we are back to the VM_LOOP or RETN (VM_EXIT)

Static analysis

VM Context requirements at start
  Bytecode pointer start address: arg_00 of VM_ENTRY (constant unfolding can be applied on it)
  Key stored in EBX|RBX, necessary to decrypt the bytecode, is equal to original PE ImageBase + RVA of bytecode pointer
  Opcode handler table (normally stored in r12)

With our dynamic analysis we know those 3parameters at any point!

Static analysis

Remove native registers / uninteresting VM registers from the solved binding
Keep only operations on EDI|RDI or VM_STACK
Thanks to the RISC architecture and stack-based language
Check if VM_STACK has been incremented or decremented

[+] solved binding
qword ptr [rsp] => 140087e62h
dword ptr [rsp-8] => dword ptr [rsp-8]+0e4018d37h
QWORD_IMM => 0ffffffffe4018d37h # Indirection[[:vm_stack, :+, -0x8], 8, nil] => :QWORD_IMM
virt_rax => 0ffffffffe4018d37h
vm_stack => (vm_stack-8)&0ffffffffffffffffh
bytecode_ptr => 140087e01h

########### After Remove register
QWORD_IMM => 0ffffffffe4018d37h
vm_stack => (vm_stack-8)&0ffffffffffffffffh

[DISAS]: PUSH 0XFFFFFFFFE4018D37
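
The filtering and naming step can be approximated as follows. This sketch represents the solved binding as a plain dictionary and the stack movement as an integer delta (`vm_stack_delta`), which is a simplification of the symbolic expression shown above. `filter_binding`, `classify` and the prefix list are hypothetical.

```python
# Destinations that describe native state or interpreter bookkeeping
UNINTERESTING = ("qword ptr [rsp", "dword ptr [rsp", "virt_", "bytecode_ptr")

def filter_binding(binding):
    """Keep only effects on the VM stack or VM context."""
    return {dst: src for dst, src in binding.items()
            if not dst.startswith(UNINTERESTING)}

def classify(filtered):
    """Name the virtual instruction from what is left of the binding."""
    delta = filtered.get("vm_stack_delta", 0)
    imm = filtered.get("QWORD_IMM")
    if delta == -8 and isinstance(imm, int):
        # The stack grew by one qword and received a constant: a PUSH.
        return "PUSH 0X%016X" % imm
    return "UNKNOWN"

solved = {
    "qword ptr [rsp]": 0x140087E62,
    "dword ptr [rsp-8]": "dword ptr [rsp-8]+0e4018d37h",
    "QWORD_IMM": 0xFFFFFFFFE4018D37,
    "virt_rax": 0xFFFFFFFFE4018D37,
    "vm_stack_delta": -8,
    "bytecode_ptr": 0x140087E01,
}

print(classify(filter_binding(solved)))  # PUSH 0XFFFFFFFFE4018D37
```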

Static analysis

With that we can start to disassemble the whole VM bytecode
Check "Practical Reverse Engineering" (Chapter 5) for a complete example on how to use metasm

Plan

4 VM Logic

VM Logic

Reminder
  Language used is stack-based
  Next opcode after a logical or arithmetic operation will store EFLAGS|RFLAGS inside VM context

For all the following slides we will use the followingsyntax:

QWORD_OP_1: [RBP + 0] ; operand 01
QWORD_OP_2: [RBP + 8] ; operand 02

Disassembler

...
[DISAS]: PUSH 0xC2666C77B83B1153
[DISAS]: PUSH vm_r6
[DISAS]: PUSH 0x000000014014A631
[DISAS]: ADD QWORD_OP_1, QWORD_OP_2
[DISAS]: POP vm_r2
[DISAS]: MOV QWORD_OP_1, [QWORD_OP_1]
[DISAS]: ADD QWORD_OP_1, QWORD_OP_2
[DISAS]: POP vm_r14
...

Optimization

We need to remove the stack machine "feature"
Replace push ; pop by assignment statements
Track the stack pointer
Check that the destination size matches!
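
A minimal sketch of this folding pass, applied to the disassembly listing from the previous slide, is shown below. It tracks a symbolic stack and emits one assignment per POP. It assumes that ADD leaves its result below the flags on the stack, and it does not model operand sizes. The tuple encoding of the listing and the `fold` name are hypothetical.

```python
def fold(listing):
    """Turn push/pop sequences into assignment statements."""
    stack, statements = [], []
    for op, arg in listing:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            a, b = stack.pop(), stack.pop()
            stack.append("(%s + %s)" % (b, a))       # result
            stack.append("flags(%s + %s)" % (b, a))  # flags on top
        elif op == "LOAD":                           # MOV QWORD_OP_1, [QWORD_OP_1]
            stack.append("[%s]" % stack.pop())
        elif op == "POP":
            statements.append("%s = %s" % (arg, stack.pop()))
        else:
            raise ValueError("unknown opcode: %r" % op)
    return statements, stack

LISTING = [
    ("PUSH", "0xC2666C77B83B1153"),
    ("PUSH", "vm_r6"),
    ("PUSH", "0x14014A631"),
    ("ADD", None),
    ("POP", "vm_r2"),
    ("LOAD", None),
    ("ADD", None),
    ("POP", "vm_r14"),
]

statements, remaining = fold(LISTING)
for line in statements:
    print(line)
print("left on stack:", remaining)
```

Both POPs here only save flags; the useful value, a memory read at vm_r6 + 0x14014A631 added to the first constant, is still on the symbolic stack for the following instructions.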

VM Logic

All push, pop with all different sizes & memory dereferences
Add
Div, Idiv
Mul
Rcl, Rcr
Shl, Shr
Shld, Shrd

VM Logic

Inside VM handlers, operations like AND|SUB|OR|NOT do not seem to be supported
In fact all those operations are managed by one handler, the "NOR" logical gate:

Native semantic of this handler
NOT QWORD_OP_1
NOT QWORD_OP_2
AND QWORD_OP_1, QWORD_OP_2
MOV QWORD_OP_2, QWORD_OP_1
MOV QWORD_OP_1, RFLAGS

VM Logic

A lot of logical instructions use this "NOR" logical gate handler:

NOT(OP_00) = NOR(OP_00, OP_00)
AND(OP_00, OP_01) = NOR(NOT(OP_00), NOT(OP_01))
XOR(OP_00, OP_01) = NOR(NOR(OP_00, OP_01), AND(OP_00, OP_01))
SUB(OP_00, OP_01) = NOT(ADD(OP_01, NOT(OP_00)))
...
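
These identities are easy to check on 64-bit values. The sketch below builds NOT, AND, OR and XOR from a single `nor` function and compares them with Python's native operators; the function names are hypothetical, and OR is added here as one more standard NOR identity.

```python
MASK64 = (1 << 64) - 1

def nor(a, b):
    """The single logical primitive: NOT(a OR b), same as (NOT a) AND (NOT b)."""
    return ~(a | b) & MASK64

def vm_not(a):
    return nor(a, a)

def vm_and(a, b):
    return nor(vm_not(a), vm_not(b))

def vm_or(a, b):
    return vm_not(nor(a, b))

def vm_xor(a, b):
    return nor(nor(a, b), vm_and(a, b))

a, b = 0xC2666C77B83B1153, 0x000000014014A631
print(vm_not(a) == (~a & MASK64))
print(vm_and(a, b) == (a & b))
print(vm_or(a, b) == (a | b))
print(vm_xor(a, b) == (a ^ b))
```

A deobfuscator has to run these rewrites in reverse: recognising the nested NOR patterns and replacing them with the single logical operation they encode.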

VM Logic

VM_ADC(OP_00, OP_01) = VM_ADD(OP_00, (OP_01 + CARRY))
VM_SUB(OP_00, OP_01) = VM_NOT(VM_ADD(OP_01, VM_NOT(OP_00)))
VM_CMP = VM_SUB
VM_NEG(OP_00) = VM_SUB(0, OP_00)
...
One original instruction is sometimes converted into more than 50 VM opcodes ...
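
The subtraction identity follows from two's complement arithmetic: NOT(a) = -a - 1, so NOT(NOT(a) + b) = -(b - a - 1) - 1 = a - b. A short sketch on 64-bit values, with hypothetical function names and without modelling the flags:

```python
MASK64 = (1 << 64) - 1

def vm_add(a, b):
    return (a + b) & MASK64

def vm_not(a):
    return ~a & MASK64

def vm_sub(a, b):
    # a - b == NOT(b + NOT(a)) modulo 2**64
    return vm_not(vm_add(b, vm_not(a)))

def vm_neg(a):
    return vm_sub(0, a)

def vm_adc(a, b, carry):
    return vm_add(a, vm_add(b, carry))

print(vm_sub(5, 3))                      # 2
print(vm_sub(3, 5) == (3 - 5) & MASK64)  # True
print(vm_neg(1) == MASK64)               # True
print(vm_adc(1, 2, 1))                   # 4
```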

VM block entry

POP REG ; relocation-difference
PUSH IMMEDIATE
ADD QWORD_OP_1, QWORD_OP_2 ; compute security constant
POP REG ; flags
POP REG ; pop result
...
; POP ALL HOST REGISTER (SAVE CONTEXT)
...

VM conditional jump

VM jcc
1 Push two vm_offset
2 Push VM_stack
3 Convert EFLAGS|RFLAGS for adjustment 0 or 4|8
4 Adjust pointer from result (ADD operation)
5 Prepare to load next vm block from [VM_STACK]

We will have to reconstruct JCC correctly
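
The branchless selection behind these steps can be sketched as follows. The example assumes a 64-bit target (slot size 8), a condition that depends only on ZF, and that a set flag selects the second slot. The real flag conversion and slot order are not given in the slides, so these choices and the names `vm_jcc` and `ZF_BIT` are hypothetical.

```python
ZF_BIT = 6  # position of the zero flag in EFLAGS|RFLAGS

def vm_jcc(memory, table_ptr, flags, slot_size=8):
    """Pick one of two pushed VM offsets without a native conditional jump."""
    adjust = ((flags >> ZF_BIT) & 1) * slot_size  # 0 or 8 (0 or 4 on 32-bit)
    return memory[table_ptr + adjust]             # next VM block to load

# Two candidate targets pushed next to each other on the VM stack
memory = {0x1000: 0x140001000, 0x1008: 0x140002000}

print(hex(vm_jcc(memory, 0x1000, flags=0x00)))  # first slot
print(hex(vm_jcc(memory, 0x1000, flags=0x40)))  # second slot
```

Reconstructing the original JCC means recognising this pattern and recovering which flag test produced the adjustment.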

VM CRC

There is a special opcode for computing a CRC
Op_01: Mem pointer, Op_02: Size
Check VM integrity, executable integrity
Collision :)

rcx = rax = 0;                    /* 0x19 + 0x07 = 32: values are treated as 32-bit */
for (i = 0; i < Size; i++) {
    rcx = rax;
    rcx = rcx >> 0x19;
    rax = (rax << 0x07) | rcx;    /* together: rotate left by 7 */
    rax = (rax & 0xFFFFFF00) | ((rax & 0xFF) ^ buf[i]);
}

Compared with SECURITY_CONSTANT
Found the same checksum in all samples
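
The checksum loop can be reproduced in a few lines, which also shows why collisions are easy to build. The sketch assumes 32-bit arithmetic. Since each step is a rotation followed by an XOR, a byte's contribution depends only on its position modulo 32 (7 * 32 is a multiple of 32), so swapping two bytes that are 32 positions apart leaves the result unchanged. Whether this is the collision the slide alludes to is an assumption; `vm_crc` is a hypothetical name.

```python
def vm_crc(buf):
    """Rotate the 32-bit accumulator left by 7, then XOR the next byte into it."""
    acc = 0
    for byte in buf:
        acc = ((acc << 7) | (acc >> 25)) & 0xFFFFFFFF
        acc ^= byte   # only touches the low byte, like the masked form above
    return acc

data = bytearray(range(64))
swapped = bytearray(data)
swapped[0], swapped[32] = swapped[32], swapped[0]   # positions 32 bytes apart

print(hex(vm_crc(b"\x01\x00")))           # 0x80
print(vm_crc(data) == vm_crc(swapped))    # True, although the buffers differ
```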

VM CPUID

There is a special opcode for executing the CPUID instruction
Op_01: Value
Save 0x0C on VM_STACK (EBP) for storing eax, ebx, ecx, edx

Listing opcodes

Try to compute the set of all opcode sequences needed to reconstruct the correct original instruction
A really long task, I have not finished it at this time
  1 bored
  2 need more samples

The mapping between the set of VM bytecode and the original one will work directly on all binaries

Plan

5 Conclusion

Conclusion

Always the same architecture: RISC + stack machine
Virtual machines are generated in a random way
Difficult to make a static disassembler, prefer to use symbolic execution
Before you ask: no tool is going to be released
VMProtect is a cool challenge (start with a 64-bit binary, the "obfuscation" is not difficult)

Questions ?

Thank you for your attention
