
1

Dependable Intrusion Tolerance

Alfonso Valdes ([email protected])

Magnus Almgren, Dan Andersson, Steve Cheung, Bruno Dutertre, Yves Deswarte, Hassen Saïdi, Tomas Uribe

July 2001

Acknowledgements: Research sponsored under DARPA Contract N66001-00-C-8058. Views presented are those of the authors and do not represent the views of DARPA or the Space and Naval Warfare Systems Center.

2

Dependable Intrusion Tolerance

Intrusion Detection to Date

Seeks to detect a possibly infinite number of attacks in progress

Relies on signature analysis and probabilistic (including Bayes) techniques

Response components immature

No concept of intrusion tolerance

New Emphasis

Detection, diagnosis, and recovery

Finite number of attacks or deviations from expected system behavior

Seek a synthesis of intrusion detection, unsupervised learning, and proof-based methods for the detection aspect

Concepts from fault tolerance are adapted to ensure delivery of service (possibly degraded)

3

Outline

Architecture overview

Initial implementation experience

Simulation analysis of response tradeoffs (5 minute talk at Oakland Conference)

On-line verifiers (Tomas Uribe/Hassen Saïdi)

4

Architecture

[Figure: architecture diagram. Labeled components: External traffic, Proxy, four AppServers, Proxy-AS Subnet, Sensor Subnet, EMERALD Network Appliance.]

5

Architecture (2)

[Figure: architecture diagram with multiple proxies. Labeled components: External traffic, four Proxies, four AppServers, Proxy-AS Subnet, Sensor Subnet, EMERALD Network Appliance.]

6

The Sensor Picture

[Figure: sensor placement diagram. Application Server (shown four times): EMERALD Host Monitor, Critical APP, EMERALD APP Monitor, Proof Based Trigger. Tolerance Proxy: EMERALD Host Monitor, Proxy function, On-Line Verifier. IDS Network Appliance: EMERALD AMI, Net Experts, Blue Sensor, eBayes-TCP. Correlation.]

Note: The Net appliance has a passive interface for the network traffic. Net appliance and app servers have write-only access to sensor subnet (for alert reporting). Proxies use sensor subnet for alert reporting and management.


7

Proxy Implementation

Built as module in Squid proxy

Squid functionality:

Accept HTTP connection

Read client HTTP request

Check ACLs

Load balancing

Send reply to client

New functionality

Interface to sensor subsystem to change policy if needed

Check content agreement (depends on dynamic policy)

Alert the sensor subsystem if content disagrees

8

Ensuring Correct Content

In agreement modes, we compare content from more than one APP server

For efficiency and bandwidth, we actually check MD5 checksums for all polled servers

If these agree, we obtain content from one of the servers and actually verify the MD5 at the proxy

If this agrees with the previous MD5 check, the content is forwarded to the client

To save a request, we ask all but one server for the MD5 only, and ask the remaining server for the MD5 plus the full content. If the tests then succeed, the proxy already has the content
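The agreement check described on this slide can be sketched as follows. This is a minimal illustration, not the Squid module itself: the function names `md5_hex` and `agreed_content` and the calling convention are assumptions made for the example, and only the standard `hashlib` module is used.

```python
import hashlib


def md5_hex(body: bytes) -> str:
    """Hex MD5 digest of a reply body."""
    return hashlib.md5(body).hexdigest()


def agreed_content(reported_md5s, full_body, full_md5):
    """Return the body if every check passes, otherwise None.

    reported_md5s: digests reported by the servers polled for MD5 only.
    full_body, full_md5: content and digest from the one server polled
    for both. All names here are hypothetical.
    """
    digests = list(reported_md5s) + [full_md5]
    # Step 1: all polled servers must report the same digest.
    if len(set(digests)) != 1:
        return None
    # Step 2: the proxy recomputes the digest over the body it received.
    if md5_hex(full_body) != full_md5:
        return None
    return full_body
```

The second step matters because a server could report the agreed digest while returning different content; recomputing the digest at the proxy catches that case.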

9

Four policy levels

Benign - 1 GET request

Duplex (default regime at system start) -

1 HEAD request (get MD5 only) and 1 GET request (MD5 plus content).

If the MD5s agree, send content to client

Otherwise, go to Triplex

Triplex -

2 HEAD requests and 1 GET request.

If the MD5s all agree, send content to client. If only a majority agrees, consider the minority AS compromised: send content to client, rebuild the AS, continue in Triplex

Full Agreement - Not implemented yet

Transition to a more permissive regime after some time of normal activity
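The regime transitions above can be read as a small state machine. The sketch below is one hedged interpretation: the names `Regime`, `step`, and `relax` are hypothetical, the behavior when Duplex disagrees (withhold the reply and escalate) and when Triplex finds no majority (withhold the reply) is an assumption the slide does not spell out, and the timing of relaxation is left to the caller.

```python
from collections import Counter
from enum import IntEnum


class Regime(IntEnum):
    BENIGN = 1
    DUPLEX = 2
    TRIPLEX = 3


def step(regime, digests):
    """Return (next_regime, deliver, suspect_indices) for one request.

    digests: MD5 values reported by the polled servers, in server order.
    """
    if regime == Regime.BENIGN:
        return regime, True, []
    top, votes = Counter(digests).most_common(1)[0]
    if votes == len(digests):
        # All polled servers agree: deliver and stay in this regime.
        return regime, True, []
    if regime == Regime.DUPLEX:
        # Disagreement: escalate (assumed: reply is not delivered yet).
        return Regime.TRIPLEX, False, []
    if votes > len(digests) / 2:
        # Majority obtained: the minority servers are treated as compromised.
        suspects = [i for i, d in enumerate(digests) if d != top]
        return regime, True, suspects
    # No majority (assumed handling): withhold the reply.
    return regime, False, []


def relax(regime):
    """Move one level toward the more permissive regime."""
    return Regime(max(int(regime) - 1, int(Regime.BENIGN)))
```

A caller would invoke `relax` after some period without alerts; how long that period should be is not stated in the slides.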

10

Simulation Analysis

System is modeled as 14 Poisson processes

Processes include client requests, server replies, challenge/response requests (from proxy, to assess content validity), random failures, attacks (which make transitions between attack states), IDS false alarms, IDS detections, ...

Process rates are state dependent

Requests, attacks, failures always ON. Response process is ON if there are active requests. False alarms are always ON, detections are ON if there are active attacks in a detectable state.

System performance is based on true state. Tolerance response is based on sensor reports

Responses include various levels of content agreement as well as server reboot

Objective: Minimize dropped requests and requests with invalid replies (the latter come from a root-compromised app server)

All tolerance responses have a cost with respect to these objectives, but not responding can also cost
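The state-dependent ON/OFF rule for process rates can be illustrated with a short sketch. The function name `active_rates`, the dictionary keys, and the choice of which processes to include are assumptions for the example; the numeric rates are the ones listed on the "Processes and Rates" backup slide.

```python
# Subset of the base rates from the "Processes and Rates" backup slide.
BASE_RATES = {
    "request": 1000.0,      # always ON
    "reply": 4000.0,        # ON only if there are active requests
    "probe_attack": 50.0,   # always ON
    "crash": 1.0,           # always ON
    "false_detect": 5.0,    # always ON
    "root_detect": 50.0,    # ON only if an attack is in a detectable state
}


def active_rates(base, active_requests, detectable_attacks):
    """Return the rates in effect for the current true system state."""
    rates = dict(base)
    if active_requests == 0:
        rates["reply"] = 0.0
    if detectable_attacks == 0:
        rates["root_detect"] = 0.0
    return rates
```

Setting a rate to zero removes that process from the competition for the next event without changing the rest of the model.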

11

Initial Results

Requests arrive at 1000/unit time. Total reply capacity is 4000/unit time. Attack rate is 50/unit time.

Redundancy is beneficial, but diminishing returns beyond 2 App Servers (Total server capacity is 4000/unit time)

Frequent challenge/response requests improve system objectives

App Servers           % Drop   % Invalid
1 @ 4000/time         3.62     2.78
2 @ 2000/time each    0.04     1.26
3 @ 1333/time each    0.16     0.59
4 @ 1000/time each    0.99     0.51

Challenge rate        % Drop   % Invalid
0                     0        26.21
100                   0.43     1.89
500                   0.99     0.51
1000                  0.31     0.33

12

Status and Plans

Status (end of year 1)

Architecture definition

Continually refining specification

Clarification of the sensor landscape

Initial implementation of single-proxy system (static content)

Specification of on-line verifiers

Plans

Multi-proxy system

Additional protocols such as challenge/response

Dynamic content

Implement on-line verifiers

13

Summary

Exploring response tradeoffs as part of a larger dependable intrusion tolerant system study

Explicitly model shortcomings of foreseeable IDS technology with respect to false alarms, missed detections, inaccuracy, and delay

Considering emerging correlation and asset distress monitoring technologies

Recognize that actions have a cost, but not responding has a cost as well

Gives some idea of value of redundancy, enforcing agreement, rebooting servers, etc. with respect to some figure of merit

14

(Backup) Poisson Processes

Poisson process: Event stream where inter-event times have an exponential distribution. The parameter is referred to as the process rate, typically denoted $\lambda$

Mathematical properties of multiple simultaneous Poisson processes lead to tractable implementation:

Overall process is Poisson, with overall rate equal to the sum of the rates of the individual processes:

$$\lambda_{\text{overall}} = \sum_i \lambda_i$$

Next event is of a given class with the following probability:

$$P(\text{next event is of class } i) = \frac{\lambda_i}{\lambda_{\text{overall}}}$$
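These two properties suggest a simple event loop: draw the waiting time from an exponential distribution with the overall rate, then pick the event class with probability proportional to its rate. The sketch below shows one such step; the function name `next_event` and the dictionary representation of rates are assumptions, not the authors' simulator.

```python
import random


def next_event(rates, rng):
    """Sample (waiting_time, event_class) for competing Poisson processes.

    rates: dict mapping event class name to its current rate.
    rng: a random.Random instance, so that runs can be seeded.
    """
    total = sum(rates.values())
    if total <= 0:
        raise ValueError("at least one process must have a positive rate")
    # Time to the next event of any class is exponential with the summed rate.
    dt = rng.expovariate(total)
    # The class of that event is chosen with probability rate_i / total.
    names = list(rates)
    weights = [rates[name] for name in names]
    name = rng.choices(names, weights=weights, k=1)[0]
    return dt, name
```

A full simulation would call this in a loop, update the system state after each event, and recompute the rates before the next draw.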

15

Proxy Capabilities Simulated

IDS detect probes and root compromises, but occasionally fail to detect or are too slow, or generate false alerts

Asset distress monitor (blue sensor) can detect a “down” server by rate of failed requests

Proxy detects AHBL when request queue overflows

Challenge/Response: Periodically issues a request to all servers, for which the reply is known

Can detect compromised server if reply is invalid

Can detect a “down” server

These detections are typically much later than from an IDS

Available responses are:

Invoke a content agreement regime for client requests with 2..n servers

Reboot a server

16

Processes and Rates

Process               Rate per unit time   Comment
Request               1000
Reply                 4000 total           Active if there are active requests
Challenge/Response    500                  Compete with client requests for server bandwidth
Non-malicious crash   1
Reboot                100                  So E(reboot time)=0.01
Probe attack          50
Probe_to_root         10
Probe_to_crash        5
Probe_to_term         5
Root_to_crash         5
Root_to_term          5                    Attack in this state compromises host
Probe_detect          10
Root detect           50                   Must detect before root_to_term
False Detect          5

Note: Time units not specified. These rates should be viewed as relative.
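As a worked check on the table, the rates can be placed in a dictionary and combined using the Poisson properties from the previous backup slide. This is only an arithmetic sketch: the variable names are assumptions, and the overall rate computed here is for the hypothetical case where all 14 processes are ON at once, which the state-dependent model does not guarantee.

```python
# Rates per unit time, copied from the table above.
RATES = {
    "Request": 1000,
    "Reply": 4000,
    "Challenge/Response": 500,
    "Non-malicious crash": 1,
    "Reboot": 100,
    "Probe attack": 50,
    "Probe_to_root": 10,
    "Probe_to_crash": 5,
    "Probe_to_term": 5,
    "Root_to_crash": 5,
    "Root_to_term": 5,
    "Probe_detect": 10,
    "Root detect": 50,
    "False Detect": 5,
}

# Overall rate if every process were ON simultaneously (sum of the rates).
overall = sum(RATES.values())  # 5746

# Mean of an exponential waiting time is 1 / rate, hence E(reboot time) = 0.01.
mean_reboot_time = 1.0 / RATES["Reboot"]

# Probability that the next event is a client request, under the same assumption.
p_request = RATES["Request"] / overall

print(len(RATES), overall, mean_reboot_time, round(p_request, 3))
```

Because the rates are relative, only ratios such as `p_request` carry meaning; the absolute time unit drops out.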
