
Self-defending software: Automatically patching security vulnerabilities

Michael Ernst

University of Washington

Demo

Automatically fix previously unknown security vulnerabilities in COTS software.

Goal: Security for COTS software

Requirements:

1. Preserve functionality

2. Protect against unknown vulnerabilities

3. COTS (commercial off-the-shelf) software

Key ideas:

4. Learn from failure

5. Leverage the benefits of a community

1. Preserve functionality

• Maintain continuity: the application continues to operate despite attacks

• For applications with high availability requirements
  – Important for mission-critical applications
  – Web servers, air traffic control, communications

• Technique: create a patch (repair the application)
• Previous work: crash the program when attacked
  – Loss of data, overhead of restart, attack recurs

2. Unknown vulnerabilities

• Proactively prevent attacks via unknown vulnerabilities
  – “Zero-day exploits”
  – No pre-generated signatures
  – No time for human reaction
  – Works for bugs as well as attacks

3. COTS software

• No modification to source or executables

• No cooperation required from developers
  – No source information (no debug symbols)
  – Cannot assume built-in survivability features

• x86 Windows binaries

4. Learn from failure

• Each attack provides information about the underlying vulnerability

• Detect all attacks (of given types)
  – Prevent negative consequences
  – The first few attacks may crash the application

• Repairs improve over time
  – Eventually, the attack is rendered harmless
  – Similar to an immune system

5. Leverage a community

• Application community = many machines running the same application (a monoculture)
  – Convenient for users, sysadmins, and hackers

• Problem: a single vulnerability can be exploited throughout the community

• Solution: community members collaborate to detect and resolve problems
  – Amortized risk: a problem in some leads to a solution for all
  – Increased accuracy: exercise more behaviors during learning
  – Shared burden: distribute the computational load

Evaluation

• Adversarial Red Team exercise
  – The Red Team was hired by DARPA
  – It had access to all source code, documents, etc.

• Application: Firefox (v1.0)
  – Exploits: malicious JavaScript, GC errors, stack smashing, heap buffer overflow, uninitialized memory

• Results:
  – Detected all attacks, prevented all exploits
  – For 7/10 exploits, generated a patch that maintained functionality
    • After an average of 8 attack instances
  – No false positives
  – Low steady-state overhead

Collaborative learning approach

Monitoring: detect attacks. The detector is pluggable and does not depend on learning.

Learning: infer normal behavior from successful executions.

Repair: propose, evaluate, and distribute patches based on the behavior of the attack.

All modes run on each member of the community; the modes are temporally and spatially overlapped.

[Diagram: normal execution feeds Learning, which builds a model of behavior; an attack (or bug) triggers Monitoring, which drives Repair.]


Structure of the system

(Server may be replicated, distributed, etc.)

[Diagram: the server communicates with the community machines over encrypted, authenticated channels.]

Threat model does not (yet!) include malicious nodes

Learning

The server distributes learning tasks; clients observe normal behavior and do local inference. Clients send their inference results to the server, which generalizes by merging them.

[Diagram: three clients infer copy_len < buff_size, copy_len ≤ buff_size, and copy_len = buff_size; the server merges these into copy_len ≤ buff_size.]

Monitoring

Community machines detect attacks. Assumption: few false positives. The detector collects information and terminates the application.

Detectors used in our research:
– Code injection (Memory Firewall)
– Memory corruption (Heap Guard)

Many other possibilities exist.

Monitoring

What was the effect of the attack? Instrumentation on the clients continuously evaluates the learned behavior. Clients send the difference in behavior (violated constraints, e.g. “Violated: copy_len ≤ buff_size”) to the server, which correlates the constraints to the attack.

Repair

The server proposes a set of patches for each behavior correlated to the attack (e.g. “Correlated: copy_len ≤ buff_size”).

Candidate patches:
1. Set copy_len = buff_size
2. Set copy_len = 0
3. Set buff_size = copy_len
4. Return from the procedure

Repair

The server distributes the patches to the community; different machines receive Patch 1, Patch 2, or Patch 3.

Initial ranking: Patch 1: 0, Patch 2: 0, Patch 3: 0, …

Repair

Evaluate the patches: success = no detector is triggered. The detector is still running on the clients. When attacked, clients send the outcome to the server (e.g., Patch 1 failed, Patch 3 succeeded), and the server ranks the patches.

Updated ranking: Patch 3: +5, Patch 2: 0, Patch 1: -5, …

Repair

The server redistributes the most effective patches (e.g., Patch 3) to the community.

Ranking: Patch 3: +5, Patch 2: 0, Patch 1: -5, …

Outline

• Overview

• Learning: create model of normal behavior

• Monitoring: determine properties of attacks

• Repair: propose and evaluate patches

• Evaluation: adversarial Red Team exercise

• Conclusion

Learning

Clients do local inference over observed behavior and send their results to the server, which generalizes by merging them.

[Diagram: clients infer copy_len < buff_size, copy_len ≤ buff_size, and copy_len = buff_size; the server merges these into copy_len ≤ buff_size.]

Dynamic invariant detection

• Idea: generalize observed program executions

• Initial candidates = all possible instantiations of constraint templates over program variables

• Observed data quickly falsifies most candidates (sketched below)
• Many optimizations for accuracy and speed

– Data structures, static analysis, statistical tests, …

Example:

Candidate constraints: copy_len < buff_size, copy_len ≤ buff_size, copy_len = buff_size, copy_len ≥ buff_size, copy_len > buff_size, copy_len ≠ buff_size

Observation: copy_len = 22, buff_size = 42

Remaining candidates: copy_len < buff_size, copy_len ≤ buff_size, copy_len ≠ buff_size
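The following is a minimal sketch, in C, of this falsification loop for the six comparison templates above. The sample data and names are illustrative only; real Daikon tracks many more template kinds, with substantial optimizations.

    /* Daikon-style falsification: start with all comparison templates as
       candidates and kill any candidate that a single observation violates. */
    #include <stdbool.h>
    #include <stdio.h>

    enum rel { LT, LE, EQ, GE, GT, NE, NUM_REL };
    static const char *rel_name[NUM_REL] = { "<", "<=", "==", ">=", ">", "!=" };

    static bool holds(enum rel r, long a, long b) {
        switch (r) {
        case LT: return a <  b;  case LE: return a <= b;
        case EQ: return a == b;  case GE: return a >= b;
        case GT: return a >  b;  default: return a != b;
        }
    }

    int main(void) {
        bool alive[NUM_REL];
        for (int r = 0; r < NUM_REL; r++) alive[r] = true;  /* all candidates */

        long samples[][2] = { {22, 42} };  /* copy_len = 22, buff_size = 42 */
        for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
            for (int r = 0; r < NUM_REL; r++)
                if (alive[r] && !holds((enum rel)r, samples[i][0], samples[i][1]))
                    alive[r] = false;      /* one counterexample falsifies */

        for (int r = 0; r < NUM_REL; r++)  /* prints <, <=, != */
            if (alive[r]) printf("copy_len %s buff_size\n", rel_name[r]);
        return 0;
    }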

Quality of inference results

• Not sound
  – Overfitting if the observed executions are not representative
  – Does not affect attack detection
  – For repair, mitigated by the correlation step
  – Continued learning improves results

• Not complete
  – Templates are not exhaustive

• Useful!
  – Sound in practice
  – Complete enough

Enhancements to learning

• Relations over variables in different parts of the program
  – Flow-dependent constraints

• Distributed learning across the community

• Filtering of constraint templates, variables

Learning for Windows x86 binaries

• No static or dynamic analysis to determine the validity of values and pointers
  – Use expressions computed by the application
  – There can be more or fewer variables (e.g., loop invariants)
  – Static analysis eliminates duplicated variables

• New constraint templates (e.g., stack pointer)
• Determine the code of a function from its entry point

Outline

• Overview

• Learning: create model of normal behavior

• Monitoring: determine properties of attacks
  – Detector details
  – Learning from failure

• Repair: propose and evaluate patches

• Evaluation: adversarial Red Team exercise

• Conclusion

Detecting attacks (or bugs)

Code injection (Determina Memory Firewall)
– Triggers if control jumps to code that was not in the original executable

Memory corruption (Heap Guard)
– Triggers if sentinel values are overwritten

These have low overhead and no false positives

Goal: detect problems close to their source

Other possible attack detectors

Non-application-specific:
– Crash: page fault, assertions, etc.
– Infinite loop

Application-specific checks:
– Tainting, data injection
– Heap consistency
– User complaints

Application behavior:
– Different failures/attacks
– Unusual system calls or code execution
– Violation of learned behaviors

Collections of detectors with high false positives

Non-example: social engineering

Learning from failures

Each attack provides information about the underlying vulnerability:
– That it exists
– Where it can be exploited
– How the exploit operates
– What repairs are successful

Monitoring

Community machines detect attacks. Assumption: few false positives. The detector collects information and terminates the application.

Monitoring

Where did the attack happen? The detector collects information, terminates the application, and the client sends the attack information to the server. The detector maintains a shadow call stack (e.g., main → process_record → read_input → scanf).

Monitoring

The server generates instrumentation for the targeted code locations (main, process_record, …) and sends it to all clients. Clients install the instrumentation: extra checking in the attacked code evaluates the learned constraints.

Monitoring

What was the effect of the attack? The instrumentation continuously evaluates the inferred behavior. Clients send the difference in behavior (violated constraints, e.g. “Violated: copy_len ≤ buff_size”) to the server, which correlates constraints to the attack (e.g. “Correlated: copy_len ≤ buff_size”).

Correlating attacks and constraints

Evaluate only constraints at the attack sites
– Low overhead

A constraint is correlated to an attack if:
– The constraint is violated iff the attack occurs (see the sketch below)

Create repairs for each correlated constraint
– There may be multiple correlated constraints
– There may be multiple candidate repairs per constraint
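A minimal sketch of this correlation test, assuming each client report pairs one constraint-check outcome with whether an attack was detected in that run; the data layout is an illustrative assumption, not the deployed algorithm.

    #include <stdbool.h>

    /* One client report: did the constraint check fail, and was an attack
       detected in the same run? */
    struct report { bool violated; bool attacked; };

    /* Correlated: the constraint is violated iff the attack occurs. */
    bool is_correlated(const struct report *r, int n) {
        for (int i = 0; i < n; i++)
            if (r[i].violated != r[i].attacked)
                return false;
        return n > 0;
    }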

Outline

• Overview

• Learning: create model of normal behavior

• Monitoring: determine properties of attacks

• Repair: propose and evaluate patches

• Evaluation: adversarial Red Team exercise

• Conclusion

Repair

The server distributes the patches (Patch 1, Patch 2, Patch 3, …) to the community. Success = no detector is triggered.

Initial ranking: Patch 1: 0, Patch 2: 0, Patch 3: 0, …

Patch evaluation uses additional detectors (e.g., a crash, or a difference in the attack).

Attack example

• Target: a JavaScript system routine (written in C++)
  – Casts its argument to a C++ object and calls a virtual method
  – Does not check the type of the argument

• The attack supplies an “object” whose virtual table points to attacker-supplied code

• A constraint at the method call is correlated
  – Constraint: the JSRI address target is one of a known set

• Possible repairs (sketched below):
  – Skip over the call
  – Return early
  – Call one of the known valid methods
  – No repair
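A hedged sketch of what enforcing that one-of constraint at the indirect call could look like, written here as C over function pointers; the names and the choice to skip the call on violation are illustrative assumptions.

    /* Guarded indirect call: only jump to targets seen during learning. */
    typedef void (*method_fn)(void *self);

    #define N_KNOWN 3
    static method_fn known_targets[N_KNOWN];  /* filled in during learning */

    static void checked_jsri(method_fn target, void *self) {
        for (int i = 0; i < N_KNOWN; i++)
            if (target == known_targets[i]) {  /* constraint holds */
                target(self);
                return;
            }
        /* Constraint violated: an attack is probably underway. Candidate
           repairs: skip the call (done here), return early, or call one of
           the known valid methods. */
    }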

Enabling a repair

• A repair may depend on constraint checking
  – if (!(copy_len ≤ buff_size)) copy_len = buff_size;
  – If the constraint is not violated, there is no need to repair
  – If the constraint is violated, an attack is (probably) underway

• The repair does not depend on the detector
  – It should fix the problem before the detector is triggered

• The repair is not identical to what a human would write
  – It is a stopgap until an official patch is released
  – Waiting for a human response is unacceptable

Evaluating a patch

• In-field evaluation
  – No attack detector is triggered
  – No crashes or other behavioral deviations

Pre-validation, before distributing the patch:

• Replay the attack
  + No need to wait for a second attack
  + Exactly reproduces the problem
  – Expensive to record the log; the log terminates abruptly
  – Must prevent irrevocable effects
  – Delays distribution of good patches

• Run the program’s test suite
  – May be too sensitive
  – Not available for COTS software

Outline

• Overview

• Learning: create model of normal behavior

• Monitoring: determine properties of attacks

• Repair: propose and evaluate patches

• Evaluation: adversarial Red Team exercise

• Conclusion

Red Team

• The Red Team attempts to break our system
  – Hired by DARPA; 10 engineers
  – (The researchers = the “Blue Team”)

• The Red Team created 10 Firefox exploits
  – Each exploit is a webpage
  – Firefox executes arbitrary code
  – Malicious JavaScript, GC errors, stack smashing, heap buffer overflow, uninitialized memory

Rules of engagement

• Firefox 1.0
  – The Blue Team may not tune the system to known vulnerabilities
  – Focus on the most security-critical components

• No access to a community for learning

• Focus on the research aspects
  – The Red Team may only attack Firefox and the Blue Team infrastructure

• No pre-infected community members
  – Future work

• The Red Team has access to the implementation and all Blue Team documents & materials
  – The Red Team may not discard attacks thwarted by the system

Results

• The Blue Team system:
  – Detected all attacks, prevented all exploits
  – Repaired 7/10 attacks
    • After an average of 5.5 minutes and 8 attacks
  – Handled attack variants
  – Handled simultaneous & intermixed attacks
  – Suffered no false positives
  – Monitoring/repair overhead was not noticeable

Results: Caveats

• The Blue Team fixed a bug with reusing filenames

• The communications infrastructure (from a subcontractor) failed during the last day

• Learning overhead is high
  – Optimizations are possible
  – It can be distributed over the community

Outline

• Overview

• Learning: create model of normal behavior

• Monitoring: determine properties of attacks

• Repair: propose and evaluate patches

• Evaluation: adversarial Red Team exercise

• Conclusion

Related work

• Distributed detection
  – Vigilante [Costa] (for worms; proof of exploit)
  – Live monitoring [Kıcıman]
  – Statistical bug isolation [Liblit]

• Learning (lots)
  – Typically for anomaly detection
  – FSMs for system calls

• Repair [Demsky]
  – Requires a specification; not scalable

Credits

• Saman Amarasinghe • Jonathan Bachrach • Michael Carbin • Danny Dig • Michael Ernst • Sung Kim • Samuel Larsen • Carlos Pacheco • Jeff Perkins • Martin Rinard • Frank Sherwood • Greg Sullivan • Weng-Fai Wong • Yoav Zibin

Subcontractor: Determina, Inc.
Funding: DARPA (PM: Lee Badger)
Red Team: SPARTA, Inc.

Contributions

Framework for collaborative learning and repair
• Pluggable detection, learning, and repair
• Handles unknown vulnerabilities
• Preserves functionality by repairing the vulnerability
• Learns from failure
• Focuses resources where they are most effective

Implementation for Windows x86 binaries

Evaluation via a Red Team exercise

My other research

Making it easier (and more fun!) to create reliable software

Security: quantitative information-flow; web vulnerabilities

Programming languages:
– Design: user-defined type qualifiers (in Java 7)
– Type systems: immutability, canonicalization
– Type inference: abstractions, polymorphism, immutability

Testing:
– Creating complex test inputs
– Generating unit tests from system tests
– Classifying test behavior

Reproducing in-field failures; combined static & dynamic analysis; analysis of version history; refactoring; …

Contributions

Framework for self-defending software
• Pluggable detection, learning, and repair
• Handles unknown vulnerabilities
• Preserves functionality by repairing the vulnerability
• Learns from failure
• Uses a community
• Focuses resources where they are most effective

Implementation for Windows x86 binaries

Evaluation via a Red Team exercise

Underlying technology

• x86 instrumentation/patching: DynamoRIO (HP/MIT) & LiveShield (Determina)
  – Low overhead
  – No observable change to applications

• Learning: Dynamic invariant detection (UW)

• Detector for code injection: Memory Firewall (Determina)

Determina MPEE and Client Library

[Diagram: basic blocks of the application binary pass through basic-block checking and transformation; the checked, transformed basic blocks are placed in the code cache, from which the PC executes.]

• In learning mode: send trace data to the learner
• In monitoring mode: check learned constraints
• In repair mode: patch data or control

Collaborative Learning System Architecture

[Diagram: learning client workstations run the application under the MPEE with a data-acquisition client library, feeding sample data to local learning (Daikon). Protected client workstations run the application under the MPEE with Memory Firewall, LiveShield, and evaluation (observing execution). The central management system merges constraints (Daikon) and performs patch/repair code generation; constraints, patches, and patch/repair results flow between the clients and the server.]

Learning Mode Architecture

[Diagram: on each community machine, the application runs under the Determina MPEE with a tracing client library; trace data flows to a local Daikon, and the resulting invariants flow through the node manager (over https/ssl) to the central Daikon on the server machine. The central Daikon sends invariant updates to the invariant database, which is administered through the management console.]

What Is Trace Data?

• A sequence of observations: <basic block, binary variable, value>

• Binary variables
  – Variables at the binary (not source) level
  – Type determined by use

• Example:
    1: mov edx, [eax]
    2: cmp edx, [ecx+4]

• Five binary variables:
  – 1:eax (ptr), 1:[eax] (int)
  – 2:edx (int), 2:ecx (ptr), 2:[ecx+4] (int)
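A minimal sketch of how such observations might be represented, assuming the tracer emits one <basic block, binary variable, value> triple per variable per execution; the field names, block id, and sample values are illustrative.

    #include <stdint.h>

    enum bvar_type { BV_INT, BV_PTR };

    /* One trace observation: <basic block, binary variable, value>. */
    struct observation {
        uint32_t    block_id;   /* basic block containing the instruction */
        const char *variable;   /* binary variable, e.g. "1:[eax]" */
        enum bvar_type type;    /* type determined by use */
        uint32_t    value;      /* value observed at runtime */
    };

    /* One execution of the example block could yield these five triples: */
    static const struct observation sample[] = {
        { 7, "1:eax",     BV_PTR, 0x0012ff70 },
        { 7, "1:[eax]",   BV_INT, 22 },
        { 7, "2:edx",     BV_INT, 22 },
        { 7, "2:ecx",     BV_PTR, 0x0012ff80 },
        { 7, "2:[ecx+4]", BV_INT, 42 },
    };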


What Does the Local Daikon Do?

• The local Daikon
  – Reads trace data
  – Performs invariant inference

• Standard set of invariants
  – One-of (var = one of {val1, …, valn})
  – Not null (var ≠ null)
  – Less than (var1 − var2 < c)
  – Many more (75 different kinds)

• Variables from the same basic block (for now)


What Does the Central Daikon Do?

• Takes invariants from the local Daikons
• Logically merges invariants into the invariant database
  – Each kind of invariant has merge rules (sketched below); for example:
    • x = 5 merged with x = 6 gives x one-of {5, 6}
    • x > 0 merged with x > 10 gives x > 0 (the weaker bound, which holds over both data sets)
    • x = 5 merged with no invariant about x gives no invariant about x
    • x = 5 merged with no data yet about x gives x = 5
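A sketch of merging for one invariant kind, the lower bound "x > c", under the union semantics above (the merged invariant must hold over both clients' data); the representation is an illustrative assumption, and real Daikon handles ~75 invariant kinds.

    #include <stdbool.h>

    struct lower_bound {
        bool has_data;  /* any samples of x seen on this side? */
        bool present;   /* does a lower-bound invariant survive? */
        long bound;     /* invariant: x > bound */
    };

    struct lower_bound merge(struct lower_bound a, struct lower_bound b) {
        if (!a.has_data) return b;    /* no data yet: keep the other side */
        if (!b.has_data) return a;
        if (!a.present || !b.present)       /* falsified anywhere: gone */
            return (struct lower_bound){ true, false, 0 };
        /* Both hold: keep the weaker bound, e.g. x > 0 merge x > 10 => x > 0. */
        long m = a.bound < b.bound ? a.bound : b.bound;
        return (struct lower_bound){ true, true, m };
    }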

Application Community Issues

• Lots of community members learn at the same time

• Each community member instruments a (randomly chosen) subset of basic blocks (see the sketch below)
  – Minimizes learning overhead
  – While obtaining reasonable coverage

• Learning takes place over successful executions (without attacks)
  – Controlled environment
  – A posteriori judgement
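One simple way to realize per-member random subsets is hashing: a minimal sketch, assuming a deterministic hash of (machine id, block id) and a sampling rate; both are illustrative, as the slides do not say how the subset is chosen.

    #include <stdbool.h>
    #include <stdint.h>

    static uint32_t mix(uint32_t x) {  /* small integer hash */
        x ^= x >> 16; x *= 0x7feb352dU;
        x ^= x >> 15; x *= 0x846ca68bU;
        return x ^ (x >> 16);
    }

    /* Instrument roughly 1/denom of the blocks, a different subset on each
       machine, so the community as a whole covers most blocks. */
    bool should_instrument(uint32_t machine_id, uint32_t block_id,
                           uint32_t denom) {
        return mix(machine_id ^ mix(block_id)) % denom == 0;
    }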

Monitoring Mode Architecture

[Diagram: on each community machine, the application runs under the Determina MPEE with the client library and node manager; attack information flows (over https/ssl) to the protection manager and management console on the server machine.]

Attack Detection

• The community machine detects an attack signal
  – Determina Memory Firewall
  – Fatal error (invalid address, divide by zero)
  – In principle, any indication of attack

• Attack information
  – The program counter where the attack occurred
  – The stack when the attack occurred

• Sent to the server as the application dies

Invariant Localization Overview

• Goal: find out which invariants are violated when the program is attacked

• Strategy:
  – Find invariants close to the attack
  – Make running applications check for violations of these invariants
  – Correlate invariant violations with attacks

Invariant Localization Mode Architecture

[Diagram: the attack & invariant-violation detector runs in the client library under the Determina MPEE on each community machine; attack & invariant information flows through the node manager (over https/ssl) to the protection manager on the server machine, which consults the invariant database, generates LiveShields, and distributes them back to the community machines for installation.]

Finding Invariants Close to Attack

• Attack information
  – The PC of the instruction where the attack was detected (a jump to invalid code, an instruction that accessed invalid memory, or a divide-by-zero instruction)
  – The call stack
    • A duplicate stack
    • Preserved even for stack-smashing attacks

• Find basic blocks that are close to the involved PCs

• Find invariants for those basic blocks

Detecting Invariant Violations

• Add checking code to the application
  – Check for violations of the selected invariants
  – Log any violated invariants

• Use the Determina LiveShield mechanism
  – Distribute code patches to basic blocks
  – Eject basic blocks from the code cache
  – Insert new versions of the basic blocks with the new checking code
  – Updates programs as they run

Using LiveShield Mechanism

• The protection manager selects invariants to check
• It generates C code that implements the check (a guess at its shape appears below)
• It passes the C code to scripts that
  – Compile the code
  – Generate a patch
  – Sign it and convert it to LiveShield format

• Distribute the LiveShields back to the applications
  – Each application gets all LiveShields
  – The goal is to maximize checking information
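The slides say only that the manager "generates C code that implements the check", so the following is a guess at the shape of that generated code for the running copy_len example; the block id and logging hook are hypothetical.

    #include <stdint.h>
    #include <stdio.h>

    /* Runtime hook for reporting a violated invariant (stub for illustration). */
    static void log_violation(uint32_t block_id, const char *constraint) {
        fprintf(stderr, "block %u violated: %s\n",
                (unsigned)block_id, constraint);
    }

    /* Injected into the attacked basic block: evaluate the learned constraint
       and log a violation. In localization mode we only observe, not repair. */
    void check_block7(uint32_t copy_len, uint32_t buff_size) {
        if (!(copy_len <= buff_size))
            log_violation(7, "copy_len <= buff_size");
    }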

Correlating Invariant Violations and Attacks

• The protection manager is fed two kinds of information
  – Invariant-violation information
  – Attack information

• It correlates the information
  – If an invariant violation is followed by an attack,
  – then the invariant is a candidate for enforcement

Protection Mode Architecture

[Diagram: same structure as invariant localization mode, with the attack detector & invariant enforcement running in the client library under the Determina MPEE; attack & invariant information flows (over https/ssl) to the protection manager, which uses the invariant database to generate LiveShields and distributes them to the community machines for installation.]

Invariant Enforcement

• Given an invariant to enforce, the protection manager generates LiveShields that correspond to different repair options

• The current implementation is for one-of constraints
  – The variable is a pointer to a function
  – A constraint violation is a jump to a function previously unseen at that jump instruction
  – Potential repairs:
    • Call one of the previously seen functions
    • Skip the call
    • Return immediately to the caller

Selecting A Good Repair

• The protection manager generates a LiveShield for each repair option
• It distributes the LiveShields across applications
• Random assignment, biased as follows (see the sketch below)
• Each LiveShield has a success number
  – Invariant enforcement followed by continued successful execution increments the number
  – An attack or crash decrements the number
  – The probability of selection is proportional to the success number
  – LiveShields are periodically reassigned to applications
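A minimal sketch of that biased assignment: selection probability proportional to each patch's success number, shifted so every weight stays positive (the shift is an illustrative assumption, since raw success numbers can be negative).

    #include <stdlib.h>

    /* Pick one of n patches with probability proportional to its
       (shifted) success number. Assumes the caller seeded rand(). */
    int pick_patch(const int *success, int n) {
        int min = success[0];
        for (int i = 1; i < n; i++)
            if (success[i] < min) min = success[i];

        long total = 0;
        for (int i = 0; i < n; i++)
            total += success[i] - min + 1;   /* every weight >= 1 */

        long r = rand() % total;
        for (int i = 0; i < n; i++) {
            r -= success[i] - min + 1;
            if (r < 0) return i;
        }
        return n - 1;
    }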

System in Action – Concrete Example

• Learning mode
  – The key binary variable is the target of a jsri instruction
  – Learn a one-of constraint (the target is one of the invoked functions)

• Monitoring mode
  – Memory Firewall detects an attempt to execute an unauthorized function

• Invariant localization mode
  – Attack information identifies the jsri instruction as the target of the attack
  – Correlates the invariant violation with the attack

• Protection mode
  – Distribute a range of repairs (skip the call; call a previously observed function)
  – Check that they successfully neutralize the attack

Attack Surface Issues

• The Determina runtime is itself an attack target
• Addressed with page protection policies (enforced as sketched below)
• Also randomize placement
  – Runtime data
  – Runtime code, code cache

  Page type      Runtime mode   Application mode
  App code       R              R
  App data       RW             RW
  Runtime code   RE             R
  Code cache     RW             RE
  Runtime data   RW             R
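A hedged sketch of enforcing one row of that policy on Windows with VirtualProtect: runtime data is writable while the runtime executes and read-only while application code runs. The region variables are placeholders; the actual Determina mechanism is not described in the slides.

    #include <windows.h>

    static void  *runtime_data;      /* hypothetical runtime-data region */
    static SIZE_T runtime_data_len;

    /* Entering application mode: runtime data becomes read-only (RW -> R)
       so application-mode code cannot corrupt the runtime's state. */
    void enter_application_mode(void) {
        DWORD old;
        VirtualProtect(runtime_data, runtime_data_len, PAGE_READONLY, &old);
    }

    /* Entering runtime mode: restore write access (R -> RW). */
    void enter_runtime_mode(void) {
        DWORD old;
        VirtualProtect(runtime_data, runtime_data_len, PAGE_READWRITE, &old);
    }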

Communication Issues

• What about forged communications?

• The management console has a certificate authority
  – Clients use a password to get certificates
  – All communications are signed, authenticated, and encrypted
  – Revocation if necessary


Parameterized Architecture and Implementation

• Parameterization points
  – Attack signal
  – Invariants
    • Inference
    • Enforcement mechanisms

• Flexibility in implementation strategies
  – Invariant localization strategies
  – Invariant repair strategies

Class of Attacks

Prerequisites for stopping an attack:

• Attack characteristics
  – An attack signal
  – The attack must violate invariants
  – Enforcing the invariants must neutralize the attack

• Invariant characteristics
  – Daikon must recognize the invariants
  – The system must be able to successfully repair violations of the invariants

Examples of Attacks We Can Stop

• Function pointer
  – Attack signal: Determina Memory Firewall
  – Invariant
    • One-of invariant
    • Function-pointer binary variable
  – Repair
    • Jump to a previously seen function
    • Skip the call

Examples of Attacks We Can Stop

• Code injection attacks via stack overwriting
  – Attack signal: Determina Memory Firewall
  – Invariant
    • Less-than invariant
    • Stack-pointer binary variable
  – Repair
    • Skip writes via the binary variable
    • Coerce the binary variable back into range

Conclusion

• Critical vulnerabilities are recognized
  – Code injection
  – Denial of service
  – The framework can be extended to other detectors

• The vulnerability is repaired
  – This attack will fail in the future

• Overhead is low
  – Detector overhead is low
  – Correlation is computed only for constraints at attack sites

• Effective on legacy x86 binaries

Collaborative learning approach

1. Learning: infer normal behavior from successful executions
2. Monitoring: detect attacks (plug-in module)

When an attack occurs:

3. Localization: what code does the attack target? How does the attack change behavior?
4. Fixing: propose patches based on the specifics of the attack
5. Protection: evaluate the patches and distribute the most successful ones

Result: applications are automatically protected

Collaborative Learning Framework

• Provide the best possible out-of-box proactive protection against vulnerabilities
  – Protect (and take advantage of) a community

• Automatically repair vulnerabilities for improved continuity
  – Use dynamically learned constraints

• Key features:
  – Proactive protection (Memory Firewall) against unknown vulnerabilities
  – Attack-detector-based constraint checking focuses repair on exploited vulnerabilities
  – Code injection and denial-of-service (crash) vulnerabilities are protected and repaired
  – Other detectors can be added to the framework
  – Supports arbitrary x86 binaries
  – Adaptive: repairs that perform poorly are removed

Goal: Security for COTS software

• Proactively prevent attacks via unknown vulnerabilities: zero-day exploits
  – No pre-generated signatures
  – No time for human reaction

• Maintain continuity and preserve functionality
  – The program continues to operate despite attacks
  – Not the right goal for all applications
  – Create a patch (a repair)
  – Low overhead

• COTS software
  – No built-in survivability features
  – No modification to source or executables
  – x86 Windows binaries

• Leverage the benefits of a monoculture
