65
Shilpi Goel [email protected] Department of Computer Science The University of Texas at Austin Computer Architecture and Program Analysis Formal Verification of x86 Machine-Code Programs

Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Shilpi Goel [email protected]

Department of Computer Science The University of Texas at Austin

Computer Architecture and Program Analysis

Formal Verification of x86 Machine-Code Programs

Page 2: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Software and Reliability

2

Can we rely on our software systems?

Recent example of a serious bug:

CVE-2016-5195 or “Dirty COW”

• Privilege escalation vulnerability in Linux

• E.g.: allowed a user to write to files intended to be read only

• Copy-on-Write (COW) breakage of private read-only memory mappings

• Existed since around v2.6.22 (2007) and was fixed on Oct 18, 2016

Page 3: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Formal Verification of Software: Example 1

3

Software Formal Verification: proving or disproving that the implementation of a program meets its specification using mathematical techniques

Page 4: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Formal Verification of Software: Example 1

3

Software Formal Verification: proving or disproving that the implementation of a program meets its specification using mathematical techniques

Suppose you needed to count the number of 1s in the binary representation of a natural number (population count).

Specification:popcountSpec(v): [v: natural number] if v <= 0 then return 0 else lsb = v & 1 v = v >> 1 return (lsb + popcountSpec(v)) endif

Page 5: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Formal Verification of Software: Example 1

4Source: Sean Anderson’s Bit-Twiddling Hacks

popcountSpec(v): [v: natural number] if v <= 0 then return 0 else lsb = v & 1 v = v >> 1 return (lsb + popcountSpec(v)) endif

Specification:

Page 6: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Formal Verification of Software: Example 1

4

Implementation:int popcount_32 (unsigned int v) { v = v - ((v >> 1) & 0x55555555); v = (v & 0x33333333) + ((v >> 2) & 0x33333333); v = ((v + (v >> 4) & 0xF0F0F0F) * 0x1010101) >> 24; return(v); }

Source: Sean Anderson’s Bit-Twiddling Hacks

popcountSpec(v): [v: natural number] if v <= 0 then return 0 else lsb = v & 1 v = v >> 1 return (lsb + popcountSpec(v)) endif

Specification:

Page 7: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Formal Verification of Software: Example 1

4

Implementation:int popcount_32 (unsigned int v) { v = v - ((v >> 1) & 0x55555555); v = (v & 0x33333333) + ((v >> 2) & 0x33333333); v = ((v + (v >> 4) & 0xF0F0F0F) * 0x1010101) >> 24; return(v); }

Source: Sean Anderson’s Bit-Twiddling Hacks

popcountSpec(v): [v: natural number] if v <= 0 then return 0 else lsb = v & 1 v = v >> 1 return (lsb + popcountSpec(v)) endif

Specification:

Do the specification and implementation behave the same way for all inputs?

Page 8: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Formal Verification of Software: Example 2

5

Suppose you needed to check if a given natural number is a power of 2.

Specification:

isPowerOfTwoSpec(x): [x: natural number] if x == 0 then return 0 else if x == 1 then return 1 else if remainder(x,2) == 0 then return isPowerOfTwoSpec(x/2) else return 0 endif endif endif

Page 9: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Formal Verification of Software: Example 2

6

Can you trust your specification?

Source: Sean Anderson’s Bit-Twiddling Hacks

Page 10: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Formal Verification of Software: Example 2

6

Can you trust your specification?

Correctness of isPowerOfTwoSpec:

1. If isPowerOfTwoSpec(v) returns 1, then there exists a natural number n such that v = 2n.

2. If v = 2n, where n is a natural number, then isPowerOfTwoSpec(v) returns 1.

Source: Sean Anderson’s Bit-Twiddling Hacks

Page 11: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Formal Verification of Software: Example 2

6

Can you trust your specification?

Correctness of isPowerOfTwoSpec:

1. If isPowerOfTwoSpec(v) returns 1, then there exists a natural number n such that v = 2n.

2. If v = 2n, where n is a natural number, then isPowerOfTwoSpec(v) returns 1.

Implementation:bool powerOfTwo (long unsigned int v) { bool f; f = v && !(v & (v - 1)); return f; }

Source: Sean Anderson’s Bit-Twiddling Hacks

Page 12: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Formal Verification of Software: Example 2

6

Can you trust your specification?

Correctness of isPowerOfTwoSpec:

1. If isPowerOfTwoSpec(v) returns 1, then there exists a natural number n such that v = 2n.

2. If v = 2n, where n is a natural number, then isPowerOfTwoSpec(v) returns 1.

Implementation:bool powerOfTwo (long unsigned int v) { bool f; f = v && !(v & (v - 1)); return f; }

Do the specification and implementation behave the same way for all inputs?

Source: Sean Anderson’s Bit-Twiddling Hacks

Page 13: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Inspection of a Program’s Behavior

• Testing:

x����������� ������������������  ����������� ������������������  Exhaustive analysis is infeasible

• Formal Verification:

✓ Wide variety of techniques

‣ Lightweight: e.g., checking if array indices are within bounds

‣ Heavyweight: e.g., proving functional correctness

7

Page 14: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

8

Functional Correctness: RAX = popcountSpec(v)

specification function

popcountSpec(v): [v: unsigned int]

if v <= 0 then return 0 else lsb = v & 1 v = v >> 1 return (lsb + popcountSpec(v)) endif

popcount_64: 89 fa mov %edi,%edx 89 d1 mov %edx,%ecx d1 e9 shr %ecx 81 e1 55 55 55 55 and $0x55555555,%ecx 29 ca sub %ecx,%edx 89 d0 mov %edx,%eax c1 ea 02 shr $0x2,%edx 25 33 33 33 33 and $0x33333333,%eax 81 e2 33 33 33 33 and $0x33333333,%edx 01 c2 add %eax,%edx 89 d0 mov %edx,%eax c1 e8 04 shr $0x4,%eax 01 c2 add %eax,%edx 48 89 f8 mov %rdi,%rax 48 c1 e8 20 shr $0x20,%rax 81 e2 0f 0f 0f 0f and $0xf0f0f0f,%edx 89 c1 mov %eax,%ecx d1 e9 shr %ecx 81 e1 55 55 55 55 and $0x55555555,%ecx 29 c8 sub %ecx,%eax 89 c1 mov %eax,%ecx c1 e8 02 shr $0x2,%eax 81 e1 33 33 33 33 and $0x33333333,%ecx 25 33 33 33 33 and $0x33333333,%eax 01 c8 add %ecx,%eax 89 c1 mov %eax,%ecx c1 e9 04 shr $0x4,%ecx 01 c8 add %ecx,%eax 25 0f 0f 0f 0f and $0xf0f0f0f,%eax 69 d2 01 01 01 01 imul $0x1010101,%edx,%edx 69 c0 01 01 01 01 imul $0x1010101,%eax,%eax c1 ea 18 shr $0x18,%edx c1 e8 18 shr $0x18,%eax 01 d0 add %edx,%eax c3 retq

Example: Pop-Count Program

Page 15: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

9

Case Study: Pop-Count Program

(defthm x86-popcount-64-symbolic-simulation (implies (and (x86p x86) (equal (model-related-error x86) nil) (unsigned-byte-p 64 n) (equal n (read 'register *rdi* x86)) (equal *popcount-64-program* (read 'memory (address-range (read 'pc x86) (len *popcount-64-program*)) x86))) (equal (read 'register *rax* (x86-run *num-of-steps* x86)) (popcountSpec n))))

Page 16: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Heavyweight Formal Verification

10

Page 17: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Heavyweight Formal Verification

10

• Build a mathematical or formal model of programs

• Prove theorems about this model in order to establish program properties

Page 18: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Heavyweight Formal Verification

10

• Build a mathematical or formal model of programs

• Prove theorems about this model in order to establish program properties

ISA model

Page 19: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Heavyweight Formal Verification

10

• Build a mathematical or formal model of programs

• Prove theorems about this model in order to establish program properties

ISA model

Instruction Set Architecture: interface between hardware and software

- Defines the machine language

- Specification of state (registers, memory), machine instructions,

instruction encodings, etc.

Page 20: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Heavyweight Formal Verification

10

• Build a mathematical or formal model of programs

• Prove theorems about this model in order to establish program properties

ISA model

Instruction Set Architecture: interface between hardware and software

- Defines the machine language

- Specification of state (registers, memory), machine instructions,

instruction encodings, etc.

• An ISA model specifies the behavior of each machine instruction in terms of effects made to the processor state.

Page 21: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Heavyweight Formal Verification

10

• Build a mathematical or formal model of programs

• Prove theorems about this model in order to establish program properties

ISA model

Instruction Set Architecture: interface between hardware and software

- Defines the machine language

- Specification of state (registers, memory), machine instructions,

instruction encodings, etc.

• An ISA model specifies the behavior of each machine instruction in terms of effects made to the processor state.

• All high-level programs compile down to machine-code programs. - A program is just a sequence of machine instructions.

Page 22: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Heavyweight Formal Verification

10

• Build a mathematical or formal model of programs

• Prove theorems about this model in order to establish program properties

ISA model

Instruction Set Architecture: interface between hardware and software

- Defines the machine language

- Specification of state (registers, memory), machine instructions,

instruction encodings, etc.

• An ISA model specifies the behavior of each machine instruction in terms of effects made to the processor state.

• All high-level programs compile down to machine-code programs. - A program is just a sequence of machine instructions.

• We can reason about a program by inspecting the cumulative effects of its constituent instructions on the machine state.

Page 23: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Why Not Use Abstract Machine Models?

11

Page 24: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Why Not Use Abstract Machine Models?

11

Page 25: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Why x86 Machine-Code Verification?

• Why not high-level code verification?

x����������� ������������������  ����������� ������������������  Sometimes, high-level code is unavailable (e.g., malware)

x����������� ������������������  ����������� ������������������  High-level verification frameworks do not address compiler bugs

✓ Verified/verifying compilers can help

x����������� ������������������  ����������� ������������������  But these compilers typically generate inefficient code

x����������� ������������������  Need to build verification frameworks for many high-level languages

• Why x86?

✓ x86 is in widespread use

12

Page 26: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Overview

Goal: Specify and verify properties of x86 programs - E.g., correctness w.r.t. behavior, security, resource usage, etc.

13

Page 27: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Overview

Goal: Specify and verify properties of x86 programs - E.g., correctness w.r.t. behavior, security, resource usage, etc.

13

• Program property: statement about a program’s behavior - One state, set of states, relationship between a set of final & initial states

x860 x861

specify: in terms of states of

computation

Page 28: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Overview

Goal: Specify and verify properties of x86 programs - E.g., correctness w.r.t. behavior, security, resource usage, etc.

13

• Program property: statement about a program’s behavior - One state, set of states, relationship between a set of final & initial states

x860 x861

specify: in terms of states of

computation

• Program’s computation: how the execution of each instruction transforms one state to another

Page 29: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Overview

Goal: Specify and verify properties of x86 programs - E.g., correctness w.r.t. behavior, security, resource usage, etc.

13

• Program property: statement about a program’s behavior - One state, set of states, relationship between a set of final & initial states

x860 x861

specify: in terms of states of

computation

⤻verify:

reason about symbolic executions

• Program’s computation: how the execution of each instruction transforms one state to another

• Symbolic Executions: a final (or next) x86 state is described in terms of symbolic updates made to the initial x86 state

- Allows consideration of many, if not all, possible executions at once

Page 30: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Formal Tool Used: ACL2 Theorem-Proving System

14

• ACL2: A Computational Logic for Applicative Common Lisp ︎

- Programming Language

- Mathematical Logic

- ︎Mechanical Theorem Prover

• See ACL2 Home Page for more details.

- Extensive documentation!

- ACL2 Research Group located at GDC 7S

Page 31: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

x86 ISA Model

15

Page 32: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

x86 ISA Model

Interpreter-Style Operational Semantics: x86 ISA model is a machine-code interpreter written in ACL2’s formal logic

- x86 State: specifies the components of the ISA

- Run Function: takes n steps or terminates early if an error occurs

- Step Function: fetches, decodes, and executes one instruction

- Instruction Semantic Functions: specifies instructions’ behavior

16

x860 x861 x86k…Step 1

A Run of the x86 Interpreter that executes k instructions

Step 2 Step k

Page 33: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Run Function

Recursively defined interpreter that specifies the x86 model

run(n, x86): if n == 0 then return x86 else if model-related error encountered then return x86 else run(n - 1, step(x86)) end if end if

17

Page 34: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Step Function

State-transition function that corresponds to the execution of a single x86 instruction

step(x86): pc = rip(x86) [prefixes, opcode, ... , imm] = Fetch-and-Decode(pc, x86) case opcode: #x00 -> add-semantic-fn(prefixes, ... , imm, x86) ... ... #xFF -> inc-semantic-fn(prefixes, ... , imm, x86)

18

Page 35: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Instruction Semantic Functions

• A semantic function describes the effects of executing an instruction. - Input: x86 state and decoded parts of the instruction - Output: next x86 state

• Every instruction has its own semantic function.

19

add-semantic-fn(prefixes, ... , imm, x86): operand1 = getOperand1(prefixes, ... , imm, x86) operand2 = getOperand2(prefixes, ... , imm, x86) resultSum = fix(operand1 + operand2, ...) resultFlags = computeFlags(operand1, operand2, result, x86) x86 = updateState(resultSum, dst, resultFlags) return x86

Page 36: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Obtaining the x86 ISA Specification

20

~3000 pages~3400 pages

__asm__ volatile ("stc\n\t" // Set CF. "mov $0, %%eax\n\t" // Set EAX = 0. "mov $0, %%ebx\n\t" // Set EBX = 0. "mov $0, %%ecx\n\t" // Set ECX = 0. "mov %4, %%ecx\n\t" // Set CL = rotate_by. "mov %3, %%edx\n\t" // Set EDX = old_cf = 1. "mov %2, %%eax\n\t" // Set EAX = num. "rcl %%cl, %%al\n\t" // Rotate AL by CL. "cmovb %%edx, %%ebx\n\t" // Set EBX = old_cf if CF = 1. // Otherwise, EBX = 0. "mov %%eax, %0\n\t" // Set res = EAX. "mov %%ebx, %1\n\t" // Set cf = EBX. : "=g"(res), "=g"(cf) : "g"(num), "g"(old_cf), "g"(rotate_by) : "rax", "rbx", "rcx", "rdx");

Running tests on x86 machines

Page 37: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

x86 State

21

Vol. 1 3-5

BASIC EXECUTION ENVIRONMENT

• Debug registers — Debug registers expand to 64 bits. See Chapter 17, “Debug, Branch Profile, TSC, and Quality of Service,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.

• Descriptor table registers — The global descriptor table register (GDTR) and interrupt descriptor table register (IDTR) expand to 10 bytes so that they can hold a full 64-bit base address. The local descriptor table register (LDTR) and the task register (TR) also expand to hold a full 64-bit base address.

3.3 MEMORY ORGANIZATIONThe memory that the processor addresses on its bus is called physical memory. Physical memory is organized as a sequence of 8-bit bytes. Each byte is assigned a unique address, called a physical address. The physical address space ranges from zero to a maximum of 236 − 1 (64 GBytes) if the processor does not support Intel

Figure 3-2. 64-Bit Mode Execution Environment

0

2^64 -1

Sixteen 64-bit

64-bits

64-bits

General-Purpose Registers

Segment Registers

RFLAGS Register

RIP (Instruction Pointer Register)

Address Space

Six 16-bitRegisters

Registers

Eight 80-bitRegisters

Floating-PointData Registers

Eight 64-bitRegisters MMX Registers

XMM RegistersSixteen 128-bitRegisters

16 bits Control Register

16 bits Status Register

64 bits FPU Instruction Pointer Register

64 bits FPU Data (Operand) Pointer Register

FPU Registers

MMX Registers

XMM Registers

32-bits MXCSR Register

Opcode Register (11-bits)

Basic Program Execution Registers

16 bits Tag Register

Focus: Intel’s 64-bit modex860 x861

Source: Intel Manuals

Page 38: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Vol. 3A 2-3

SYSTEM ARCHITECTURE OVERVIEW

2.1.1 Global and Local Descriptor TablesWhen operating in protected mode, all memory accesses pass through either the global descriptor table (GDT) or an optional local descriptor table (LDT) as shown in Figure 2-1. These tables contain entries called segment descriptors. Segment descriptors provide the base address of segments well as access rights, type, and usage information.

Each segment descriptor has an associated segment selector. A segment selector provides the software that uses it with an index into the GDT or LDT (the offset of its associated segment descriptor), a global/local flag (deter-mines whether the selector points to the GDT or the LDT), and access rights information.

Figure 2-2. System-Level Registers and Data Structures in IA-32e Mode

Local DescriptorTable (LDT)

CR1CR2CR3CR4

CR0 Global DescriptorTable (GDT)

Interrupt DescriptorTable (IDT)

IDTR

GDTR

Interrupt Gate

Trap Gate

LDT Desc.

TSS Desc.

CodeStack

CodeStack

CodeStack

Current TSSCode

Stack

Interr. Handler

Interrupt Handler

Exception Handler

Protected Procedure

TR

Call-GateSegment Selector

Linear Address

PML4

PML4.

Linear Address Space

Linear Addr.

0

Seg. Desc.Segment Sel.

Code, Data or Stack Segment (Base =0)

InterruptVector

Seg. Desc.

Seg. Desc.

NULL

Call Gate

Task-StateSegment (TSS)

Seg. Desc.

NULL

NULL

Segment Selector

Linear Address

Task Register

CR3*

Page

LDTR

This page mapping example is for 4-KByte pagesand 40-bit physical address size.

Register

*Physical Address

Physical Address

CR8Control Register

RFLAGS

OffsetTableDirectory

Page Table

Entry

PhysicalAddr.Page Tbl

Entry

Page Dir.Pg. Dir. Ptr.

PML4 Dir. Pointer

Pg. Dir.Entry

Interrupt GateIST

XCR0 (XFEM)

x86 State

21

Vol. 1 3-5

BASIC EXECUTION ENVIRONMENT

• Debug registers — Debug registers expand to 64 bits. See Chapter 17, “Debug, Branch Profile, TSC, and Quality of Service,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.

• Descriptor table registers — The global descriptor table register (GDTR) and interrupt descriptor table register (IDTR) expand to 10 bytes so that they can hold a full 64-bit base address. The local descriptor table register (LDTR) and the task register (TR) also expand to hold a full 64-bit base address.

3.3 MEMORY ORGANIZATIONThe memory that the processor addresses on its bus is called physical memory. Physical memory is organized as a sequence of 8-bit bytes. Each byte is assigned a unique address, called a physical address. The physical address space ranges from zero to a maximum of 236 − 1 (64 GBytes) if the processor does not support Intel

Figure 3-2. 64-Bit Mode Execution Environment

0

2^64 -1

Sixteen 64-bit

64-bits

64-bits

General-Purpose Registers

Segment Registers

RFLAGS Register

RIP (Instruction Pointer Register)

Address Space

Six 16-bitRegisters

Registers

Eight 80-bitRegisters

Floating-PointData Registers

Eight 64-bitRegisters MMX Registers

XMM RegistersSixteen 128-bitRegisters

16 bits Control Register

16 bits Status Register

64 bits FPU Instruction Pointer Register

64 bits FPU Data (Operand) Pointer Register

FPU Registers

MMX Registers

XMM Registers

32-bits MXCSR Register

Opcode Register (11-bits)

Basic Program Execution Registers

16 bits Tag Register

Focus: Intel’s 64-bit modex860 x861

Source: Intel Manuals

Page 39: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Model Validation

How can we know that our model faithfully represents the x86 ISA?

Validate the model to increase trust in the applicability of formal analysis

22

Page 40: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Symbolic Execution

23

Page 41: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Supporting Symbolic Execution

24

Rules (theorems) describing interactions between these reads and writes to the x86 state enable symbolic execution of programs.

add %edi, %eax je 0x400304

1. read instruction from mem

2. read operands

3. write sum to eax

4. write new value to flags

5. write new value to pc

1. read instruction from mem

2. read flags

3. write new value to pc

Page 42: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

25

y

memory

non-interference

Program Order

i j

Read-over-Write Theorem #1

Page 43: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

25

y

Wi(x) memory

non-interference

Program Order

x

i j

Read-over-Write Theorem #1

Page 44: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

25

y

Wi(x)

Rj: y

memory

non-interference

Program Order

x

i j

Read-over-Write Theorem #1

Page 45: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

26

memory

i

overlap

Read-over-Write Theorem #2

Program Order

Page 46: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

26

Wi(x) memory

x

i

overlap

Read-over-Write Theorem #2

Program Order

Page 47: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

26

Wi(x)

Ri: x

memory

x

i

overlap

Read-over-Write Theorem #2

Program Order

Page 48: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

27

memory

independent writes commute safely

i j

Program Order

Write-over-Write Theorem #1

Page 49: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

27

memory

independent writes commute safely

Wi(x)

i j

x

Program Order

Write-over-Write Theorem #1

Page 50: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

27

memory

independent writes commute safely

Wi(x)

i j

x y

Wj(y)

Program Order

Write-over-Write Theorem #1

Page 51: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

27

=

memory

independent writes commute safely

memory

Wi(x)

i j

x y

Wj(y)

i j

Program Order

Write-over-Write Theorem #1

Program Order

Page 52: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

27

=

memory

independent writes commute safely

memory

Wi(x)

i j

x y

Wj(y)

i j

Wj(y)

y

Program Order

Write-over-Write Theorem #1

Program Order

Page 53: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

27

=

memory

independent writes commute safely

memory

Wi(x)

i j

x y

Wj(y)

i j

Wj(y)

Wi(x)

x y

Program Order

Write-over-Write Theorem #1

Program Order

Page 54: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

28

memory

visibility of writes

i

Write-over-Write Theorem #2

Program Order

Page 55: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

28

memory

visibility of writes

Wi(x)

i

x

Write-over-Write Theorem #2

Program Order

Page 56: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

28

memory

visibility of writes

Wi(x)

i

Wi(y)

y

Write-over-Write Theorem #2

Program Order

Page 57: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

28

=

memory

visibility of writes

memory

Wi(x)

i

Wi(y)

i

y

Write-over-Write Theorem #2

Program Order

Program Order

Page 58: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

28

=

memory

visibility of writes

memory

Wi(x)

i

Wi(y)

i

Wi(y)

y

y

Write-over-Write Theorem #2

Program Order

Program Order

Page 59: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Symbolic Execution

29

(implies (preconditions loc val x86) (let ((old-rbx (read 'register *rbx* x86)) (old-pc (read 'pc x86))) (equal (x86-run (clk) x86) (write 'register *rax* old-rbx (write 'pc (+ 18 old-pc) (write 'memory loc val x86))))))

These read-over-write and write-over-write lemmas operate on symbolic expressions that describe the program’s behavior.

Also, we can project out relevant parts of the resulting state.

Page 60: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Conclusions

30

Page 61: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

What I Haven’t Talked About Today…

31

1. How to prove theorems using a mechanical theorem prover - Useful to reason about both hardware and software - Fall’16 Grad-level Course: Programming Languages - Spring’17 Grad-level Course: Recursion and Induction

2. Supervisor-mode features of the x86 ISA - Useful for developing and analyzing kernel programs - An advanced architecture class - An OS class

Page 62: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Opportunities for Future Research

32

Operating System Verification

detect reliance on non-portable or undefined behaviors

User-friendly Program Analysis

automate the discovery of preconditions

Multi-process/threaded Program Verification

reason about concurrency-related issues

Reasoning about the Memory System

determine if caches are (mostly) transparent, as intended

Firmware Verification

formally specify software/hardware interfaces

Micro-architecture Verification

x86 ISA model serves as a build-to specification

Page 63: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Resources

33

• See ACL2 Home Page • Talk to people on GDC 7S • See some publications

We have exciting research and engineering projects in this area! Please feel free to email if you want to know more.

Page 64: Formal Verification of x86 Machine-Code Programsbyoung/cs429/shilpi-slides.pdf · 2016-11-09 · Heavyweight Formal Verification 10 • Build a mathematical or formal model of programs

Publications

Shilpi Goel, Warren A. Hunt, Jr., and Matt Kaufmann. Abstract Stobjs and Their Application to ISA Modeling. In Proceedings of the ACL2 Workshop 2013, EPTCS 114, pp. 54-69, 2013

Shilpi Goel and Warren A. Hunt, Jr. Automated Code Proofs on a Formal Model of the x86. In Verified Software: Theories, Tools, Experiments (VSTTE’13), volume 8164 of Lecture Notes in Computer Science, pages 222– 241. Springer Berlin Heidelberg, 2014

Shilpi Goel, Warren A. Hunt, Jr., Matt Kaufmann, and Soumava Ghosh. Simulation and Formal Verification of x86 Machine-Code Programs That Make System Calls. In Proceedings of the 14th Conference on Formal Methods in Computer-Aided Design (FMCAD’14), pages 18:91–98, 2014

Shilpi Goel, Warren A. Hunt, Jr., and Matt Kaufmann. Engineering a Formal, Executable x86 ISA Simulator for Software Verification. In Provably Correct Systems (ProCoS), 2015

Shilpi Goel. Formal Verification of Application and System Programs Based on a Validated x86 ISA Model. Ph.D. Dissertation, The University of Texas at Austin, 2016

34