Formal Methods and Computer Security John Mitchell Stanford University


Invitation

• I'd like to invite you to speak about the role of formal methods in computer security.

• This audience is … on the systems end …

• If you're interested, let me know and we can work out the details.

Outline

What’s a “formal method”?
Java bytecode verification
Protocol analysis
• Model checking
• Protocol logic
Trust management
• Access control policy language

Big Picture

Biggest problem in CS
• Produce good software efficiently
Best tool
• The computer
Therefore
• Future improvements in computer science/industry depend on our ability to automate software design, development, and quality control processes

Formal method

Analyze a system from its description
• Executable code
• Specification (possibly not executable)
Analysis based on correspondence between system description and properties of interest
• Semantics of code
• Semantics of specification language

Example: TCAS [Leveson, Dill, …]

Specification
• Many pages of logical formulas specifying how TCAS responds to sensor inputs
Analysis
• If module satisfies specification, and aircraft proceeds as directed, then no collisions will occur
Method
• Logical deduction, based on formal rules

Formal methods: good and bad

Strengths
• Formal rules capture years of experience
• Precise, can be automated
Weaknesses
• Some subtleties are hard to formalize
• Methods cumbersome, time consuming

Formal methods sweet spot

[Figure: horizontal axis "System complexity * Property complexity" (example systems marked: Multiplier parity, OS verification); vertical axis "Users * Importance"; regions labeled Worthwhile, Not worth the effort, Not feasible]

Target areas

Hardware verification
Program verification
• Prove properties of programs
• Requirements capture and analysis
• Type checking and “semantic analysis”
Computer security
• Mobile code security
• Protocol analysis
• Access control policy languages, analysis

Computer Security

[Diagram: Security layer listing access control, network security, OS security, web browser/server, database/application, …; Crypto layer]

Goal: protect computer systems and digital information

Current formal methods use abstract view of cryptography

Mobile code: Java Applet

Local window
Download
• Seat map
• Airline data
Local data
• User profile
• Credit card
Transmission
• Select seat
• Encrypted msg

Java Virtual Machine Architecture

[Figure: A.java is compiled by the Java Compiler to A.class ("Compile source code"); B.class is labeled with Network; the Java Virtual Machine contains Loader, Verifier, Linker, and Bytecode Interpreter]

Java Sandbox

Four complementary mechanisms
• Class loader
  – Separate namespaces for separate class loaders
  – Associates protection domain with each class
• Verifier and JVM run-time tests
  – NO unchecked casts or other type errors, NO array overflow
  – Preserves private, protected visibility levels
• Security Manager
  – Called by library functions to decide if request is allowed
  – Uses protection domain associated with code, user policy
  – Enforcement uses stack inspection
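
The stack-inspection idea behind the Security Manager can be sketched roughly as follows. This is a simplified model, not the JDK implementation: the `Frame` class, the `POLICY` table, and the permission names are hypothetical, and privileged blocks are left out. A request is allowed only if every caller on the stack belongs to a protection domain that holds the permission.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    method: str
    domain: str  # protection domain of the class defining the method


# Hypothetical policy: protection domain -> granted permissions.
POLICY = {
    "system": {"read_file", "open_socket"},
    "applet": {"open_socket"},
}


def check_permission(call_stack, permission, policy=POLICY):
    """Allow the request only if every frame's domain holds the permission."""
    return all(permission in policy.get(f.domain, set()) for f in call_stack)


# An applet calling into a system library: the applet frame is on the stack,
# so a file read is refused even though the library itself is trusted.
stack = [Frame("Applet.run", "applet"), Frame("Library.open", "system")]
print(check_permission(stack, "read_file"))    # False
print(check_permission(stack, "open_socket"))  # True
```

Walking the whole stack is what prevents untrusted code from borrowing the privileges of a trusted library it calls.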

Verifier

Bytecode may not come from standard compiler
• Evil hacker may write dangerous bytecode
Verifier checks correctness of bytecode
• Every instruction must have a valid operation code
• Every branch instruction must branch to the start of some other instruction, not the middle of an instruction
• Every method must have a structurally correct signature
• Every instruction obeys the Java type discipline
Last condition is fairly complicated.
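
The first two checks are purely structural and can be sketched on a toy instruction set. Everything below is an assumption made for illustration: the opcode table, the one-byte operands, and the function name are hypothetical and do not correspond to real JVM opcodes.

```python
# Hypothetical toy instruction set: opcode byte -> (mnemonic, operand bytes).
OPCODES = {
    0x00: ("nop", 0),
    0x01: ("push", 1),    # push a one-byte constant
    0x02: ("add", 0),
    0x03: ("goto", 1),    # operand is an absolute byte offset
    0x04: ("return", 0),
}
BRANCHES = {"goto"}


def structural_check(code: bytes) -> list:
    """Return a list of error strings; an empty list means the checks passed."""
    errors = []
    starts = set()        # offsets at which an instruction begins
    branches = []         # (offset of branch, target offset)
    pc = 0
    while pc < len(code):
        if code[pc] not in OPCODES:
            errors.append(f"offset {pc}: invalid opcode {code[pc]:#04x}")
            return errors  # decoding past this point would be guesswork
        name, n_operands = OPCODES[code[pc]]
        if pc + 1 + n_operands > len(code):
            errors.append(f"offset {pc}: truncated operand for {name}")
            return errors
        starts.add(pc)
        if name in BRANCHES:
            branches.append((pc, code[pc + 1]))
        pc += 1 + n_operands
    for src, target in branches:
        if target not in starts:
            errors.append(f"offset {src}: branch into middle of instruction at {target}")
    return errors


good = bytes([0x01, 5, 0x01, 7, 0x02, 0x03, 7, 0x04])  # goto lands on 'return'
bad = bytes([0x01, 5, 0x03, 1, 0x04])                   # goto lands on an operand
print(structural_check(good))
print(structural_check(bad))
```

The type-discipline check is the hard part and needs a dataflow analysis over types rather than a single decoding pass.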

How do we know verifier is correct?

Many attacks based on verifier errors

Formal studies prove correctness
• Abadi and Stata
• Freund and Mitchell
• Nipkow and others …

A type system for object initialization in the Java bytecode language

Stephen Freund, John Mitchell
Stanford University

(Raymie Stata and Martín Abadi, DEC SRC)

Bytecode/Verifier Specification

Specifications from Sun/JavaSoft:
• 30 page text description [Lindholm, Yellin]
• Reference implementation (~3500 lines of C code)
These are vague and inconsistent
Difficult to reason about:
• safety and security properties
• correctness of implementation

Type system provides formal spec

JVM uses stack machine

Java
    class A extends Object {
        int i;
        void f(int val) { i = val + 1; }
    }

Bytecode
    Method void f(int)
        aload 0                      ; object ref this
        iload 1                      ; int val
        iconst 1
        iadd                         ; add val + 1
        putfield #4 <Field int i>    ; refers to const pool
        return

[Figure: JVM Activation Record, with a data area holding local variables, operand stack, and return addr, exception info, const pool res.]
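
To make the stack discipline concrete, here is a minimal interpreter sketch for just the instructions used in `f`. It is a toy under stated assumptions: objects are modelled as Python dicts, the constant-pool reference `#4` is replaced by the field name, and the function name `run` is hypothetical.

```python
def run(program, local_vars):
    """Execute a list of (opcode, argument) pairs on an operand stack."""
    stack = []
    for op, arg in program:
        if op in ("aload", "iload"):
            stack.append(local_vars[arg])        # push local variable
        elif op == "iconst":
            stack.append(arg)                    # push integer constant
        elif op == "iadd":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "putfield":
            value, obj = stack.pop(), stack.pop()
            obj[arg] = value                     # store into the object's field
        elif op == "return":
            return
        else:
            raise ValueError(f"unknown opcode {op}")


# Bytecode of A.f(int) from the slide; local 0 is `this`, local 1 is `val`.
F_BYTECODE = [
    ("aload", 0),
    ("iload", 1),
    ("iconst", 1),
    ("iadd", None),
    ("putfield", "i"),
    ("return", None),
]

this = {"i": 0}
run(F_BYTECODE, [this, 41])
print(this["i"])  # 42
```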

Java Object Initialization

No easy pattern to match
Multiple refs to same uninitialized object

    Point p = new Point(3);
    p.print();

    1: new Point
    2: dup
    3: iconst 3
    4: invokespecial <method Point(int)>
    5: invokevirtual <method print()>

JVMLi Instructions

Abstract instructions:
• new: allocate memory for object
• init: initialize object
• use: use initialized object
Goal
• Prove that no object can be used before it has been initialized

Typing Rules

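For program P, compute for each $i \in \mathrm{Dom}(P)$
• $F_i : \mathrm{Var} \to \mathrm{type}$, the type of each variable
• $S_i$ : stack of types, the type of each stack location

Example: static semantics of inc

$$\frac{P[i] = \mathtt{inc} \qquad F_{i+1} = F_i \qquad S_{i+1} = S_i = \mathrm{Int} \cdot \alpha \qquad i+1 \in \mathrm{Dom}(P)}{F, S, i \vdash P}$$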

Typing Rules

Each rule constrains successors of instruction:

Well-typed = Accepted by Verifier

Alias Analysis

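Other situations:

    1: new P
    2: new P
    3: init P

[Figure residue: two alternative control-flow fragments containing init P and new P, joined by "or"]

Equivalence classes based on line where object was created.
• $\hat{\sigma}_i$ : uninitialized object of type $\sigma$ allocated on line $i$.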

The new Instruction

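Uninitialized object type placed on stack of types:

$$\frac{P[i] = \mathtt{new}\ \sigma \qquad F_{i+1} = F_i \qquad S_{i+1} = \hat{\sigma}_i \cdot S_i \qquad \hat{\sigma}_i \notin S_i \qquad \hat{\sigma}_i \notin \mathrm{Range}(F_i) \qquad i+1 \in \mathrm{Dom}(P)}{F, S, i \vdash P}$$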

The init Instruction

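Substitution of initialized object type for uninitialized object type:

$$\frac{P[i] = \mathtt{init}\ \sigma \qquad S_i = \hat{\sigma}_j \cdot \alpha,\ j \in \mathrm{Dom}(P) \qquad S_{i+1} = [\sigma/\hat{\sigma}_j]\,\alpha \qquad F_{i+1} = [\sigma/\hat{\sigma}_j]\,F_i \qquad i+1 \in \mathrm{Dom}(P)}{F, S, i \vdash P}$$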

Soundness

Theorem: A well-typed program will not generate a run-time error when executed

Invariant:
• During program execution, there is never more than one value of type $\hat{\sigma}_i$ present.
• If this is violated, we could initialize one object and mistakenly believe that a different object was also initialized.
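
The new and init rules can be exercised with a small checker. This is a sketch under simplifying assumptions, not the verifier from the talk: it handles straight-line code only (no branches or subroutines), adds hypothetical `store` and `load` instructions to create aliases, and uses made-up names (`Uninit`, `check`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uninit:
    """Type of an object of class `cls` allocated on line `line`, not yet initialized."""
    cls: str
    line: int


def check(program):
    """Type-check a straight-line program; return the final (F, S) or raise TypeError.

    F maps variables to types and S is the stack of types (top of stack last).
    An initialized object of class C has the type "C".
    """
    F, S = {}, []
    for i, (op, arg) in enumerate(program, start=1):
        if op == "new":
            u = Uninit(arg, i)
            # Side conditions of the new rule. They only bite once loops are
            # modelled, but are kept here to mirror the rule.
            if u in S or u in F.values():
                raise TypeError(f"line {i}: {u} already present")
            S = S + [u]
        elif op == "init":
            if not S or not isinstance(S[-1], Uninit) or S[-1].cls != arg:
                raise TypeError(f"line {i}: init {arg} needs an uninitialized {arg} on top")
            u = S[-1]
            # Pop the reference, then substitute the initialized type for u
            # everywhere else, so every alias becomes usable at once.
            S = [arg if t == u else t for t in S[:-1]]
            F = {x: (arg if t == u else t) for x, t in F.items()}
        elif op == "use":
            if not S or S[-1] != arg:
                raise TypeError(f"line {i}: use {arg} without an initialized {arg} on top")
            S = S[:-1]
        elif op == "store":
            if not S:
                raise TypeError(f"line {i}: store on empty stack")
            F = {**F, arg: S[-1]}
            S = S[:-1]
        elif op == "load":
            if arg not in F:
                raise TypeError(f"line {i}: variable {arg} has no type")
            S = S + [F[arg]]
        else:
            raise ValueError(f"line {i}: unknown instruction {op}")
    return F, S


ok = [("new", "P"), ("store", 1), ("load", 1), ("load", 1), ("init", "P"), ("use", "P")]
bad = [("new", "P"), ("use", "P")]
print(check(ok))  # accepted: the alias left on the stack is initialized too
```

Running `check(bad)` raises `TypeError`, since the object on top of the stack still carries its uninitialized type when `use` is reached.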

Extensions

Constructors
• constructor must call superclass constructor
Primitive Types and Basic Operations
Subroutines [Stata, Abadi]
• jsr L: jump to L and push return address on stack
• ret x: jump to address stored in x
• polymorphic over untouched variables
• Dom(FL) restricted to variables used by subroutine

Bug in Sun JDK 1.1.4

    1: jsr 10
    2: store 1
    3: jsr 10
    4: store 2
    5: load 2
    6: init P
    7: load 1
    8: use P
    9: halt

    10: store 0
    11: new P
    12: ret 0

• Variables 1 and 2 contain references to two different objects with type $\hat{P}_{11}$.
• Verifier allows use of uninitialized object

Related Work

Java type systems
• Java Language [DE 97], [Syme 97], ...
• JVML [SA 98], [Qian 98], [HT 98], ...
Other approaches
• Concurrent constraint programs [Saraswat 97]
• defensive-JVM [Cohen 97]
• data flow analysis frameworks [Goldberg 97]
• Experimental tests [SMB 97]
TIL / TAL [Harper, Morrisett, et al.]

Protocol Security

Cryptographic Protocol
• Program distributed over network
• Use cryptography to achieve goal
Attacker
• Read, intercept, replace messages, remember their contents
Correctness
• Attacker cannot learn protected secret or cause incorrect protocol completion

Example Protocols

Authentication Protocols
• Clark-Jacob report >35 examples (1997)
• ISO/IEC 9798, Needham-Schroeder, Denning-Sacco, Otway-Rees, Woo-Lam, Kerberos
Handshake and data transfer
• SSL, SSH, SFTP, FTPS, …
Contract signing, funds transfer, …
Many others

Characteristics

Relatively simple distributed programs
• 5-7 steps, 3-10 fields per message, …
Mission critical
• Security of data, credit card numbers, …
Subtle
• Attack may combine data from many sessions
Good target for formal methods
However: crypto is hard to model

Run of protocol

[Figure: principals A, B, C, D exchanging Initiate and Respond messages, with an Attacker]

Correct if no security violation in any run

Protocol Analysis Methods

Non-formal approaches (useful, but no tools…)
• Some crypto-based proofs [Bellare, Rogaway]
• Communicating Turing Machines [Canetti]
BAN and related logics
• Axiomatic semantics of protocol steps
Methods based on operational semantics
• Intruder model derived from Dolev-Yao
• Protocol gives rise to set of traces
  – Denotation of protocol = set of runs involving arbitrary number of principals plus intruder

Example projects and tools

Prove protocol correct
• Paulson’s “Inductive method”, others in HOL, PVS
• MITRE: Strand spaces
• Process calculus approach: Abadi-Gordon spi-calculus
Search using symbolic representation of states
• Meadows: NRL Analyzer, Millen: CAPSL
Exhaustive finite-state analysis
• FDR, based on CSP [Lowe, Roscoe, Schneider, …]
• Clarke et al.: search with axiomatic intruder model

Protocol analysis spectrum

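[Figure: tools and methods plotted with horizontal axis "Protocol complexity" (Low to High) and vertical axis "Sophistication of attacks" (Low to High). Labels: Murφ, FDR, NRL, Athena, Hand proofs, Paulson, Bolignano, BAN logic, Spi-calculus, Poly-time calculus, Model checking, Multiset rewriting with ∃, Protocol logic]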

Important Modeling Decisions

How powerful is the adversary?
• Simple replay of previous messages
• Block messages; Decompose, reassemble, resend
• Statistical analysis, traffic analysis
• Timing attacks
How much detail in underlying data types?
• Plaintext, ciphertext and keys
  – atomic data or bit sequences
• Encryption and hash functions
  – “perfect” cryptography
  – algebraic properties: encr(x*y) = encr(x) * encr(y) for RSA, where $\mathrm{encrypt}(k, msg) = msg^{k} \bmod N$
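
The algebraic property in the last bullet can be checked numerically. The sketch below assumes textbook (unpadded) RSA; the tiny primes and the exponent are illustrative values chosen here, not parameters from the talk.

```python
# Illustrative toy parameters (far too small for real use).
P, Q = 61, 53
N = P * Q
K = 17  # public exponent


def encrypt(k, msg, n=N):
    """Textbook RSA: msg^k mod n, with no padding."""
    return pow(msg, k, n)


x, y = 42, 55
lhs = encrypt(K, (x * y) % N)
rhs = (encrypt(K, x) * encrypt(K, y)) % N
print(lhs == rhs)  # True: encryption commutes with multiplication mod N
```

A model that treats encryption as "perfect" cannot express this identity, which is why the level of detail in the data types is a real modeling decision.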

Four efforts (w/various collaborators)

Finite-state analysis
• Case studies: find errors, debug specifications
Logic based model: Multiset rewriting
• Identify basic assumptions
• Study optimizations, prove correctness
• Complexity results
Framework with probability and complexity
• More realistic intruder model
• Interaction between protocol and cryptography
• Significant mathematical issues, similar to hybrid systems (Panangaden, Jagadeesan, Alur, Henzinger, de Alfaro, …)
Protocol logic

Rest of talk

Model checking
• Contract signing
MSR
• Overview, complexity results
PPoly
• Key definitions, concepts
Protocol logic
• Short overview
Likely to run out of time …

Contract-signing protocols

John Mitchell, Vitaly Shmatikov
Stanford University

Subsequent work by Chadha, Kanovich, Scedrov
Other analysis by Kremer, Raskin

Example

Both parties want to sign the contract

Neither wants to commit first

[Figure label: Immunity deal]

General protocol outline

Trusted third party can force contract
• Third party can declare contract binding if presented with first two messages.

    A → B: I am going to sign the contract
    B → A: I am going to sign the contract
    A → B: Here is my signature
    B → A: Here is my signature

Assumptions

Cannot trust communication channel
• Messages may be lost
• Attacker may insert additional messages
Cannot trust other party in protocol
Third party is generally reliable
• Use only if something goes wrong
• Want TTP accountability

Desirable properties

Fair
• If one can get contract, so can other
Accountability
• If someone cheats, message trace shows who cheated
Abuse free
• No party can show that they can determine outcome of the protocol

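Asokan-Shoup-Waidner protocol

Agree
    A → B: m1 = sign(A, c, hash(r_A))
    B → A: sign(B, m1, hash(r_B))
    A → B: r_A
    B → A: r_B

Abort (A and T, over the network)
    A → T: a1
    T → A: sigT(a1, abort), if not already resolved

Resolve (with T, over the network)
    to T: m1, m2
    from T: sigT(m1, m2)

Attack? (open steps are marked "???" in the diagrams)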

Results

Exhaustive finite-state analysis
• Two signing parties, third party
• Attacker tries to subvert protocol
Two attacks
• Replay attack
  – Restart A’s conversation to fool B
• Inconsistent signatures
  – Both get contracts, but with different ID’s
Repair
• Add data to m3, m4; prevent both attacks

Related protocol

Designed to be “abuse free”
• B cannot take msg from A and show to C
• Uses special cryptographic primitive
• T converts signatures, does not use own
Finite-state analysis
• Attack gives A both contract and abort
• T colludes weakly, not shown accountable
• Simple repair using same crypto primitive
[Garay, Jakobsson, MacKenzie]

Garay, Jakobsson, MacKenzie

Agree
    A → B: PCSA(text, B, T)
    B → A: PCSB(text, A, T)
    A → B: sigA(text)
    B → A: sigB(text)

Abort (with T, over the network)
    m1 = PCSA(text, B, T)
    T answers sigT(abort)

Resolve (with T, over the network)
    PCSA(text, B, T) and PCSB(text, A, T) presented to T
    sigB(text) returned

Attack
• One party ends with "abort AND sigB(text)", the other with "abort"
• Leaked by T

Modeling Abuse-Freeness

Abuse = Ability to determine the outcome + Ability to prove it
• Not a trace property!
• Depends on set of traces through a state
Approximation for finite-state analysis
• Nondeterministically challenge A to resolve or abort
• If ∃ trace s.t. outcome ≠ challenge, then A cannot determine the outcome

Conclusions

Online contract signing is subtle
• Fairness
• Abuse-freeness
• Accountability
Several interdependent subprotocols
• Many cases and interleavings
Finite-state tool great for case analysis!
• Find bugs in protocols proved correct

Multiset Rewriting and Security Protocol Analysis

John Mitchell
Stanford University

I. Cervesato, N. Durgin, P. Lincoln, A. Scedrov

A notation for inf-state systems

• Many previous models are buried in tools
• Define common model in tool-independent formalism

[Figure: Multiset rewriting shown together with Linear Logic, Process Calculus, Finite Automata, and Proof search (Horn clause)]

Modeling Requirements

Express properties of protocols
• Initialization
  – Principals and their private/shared data
• Nonces
  – Generate fresh random data
Model attacker
• Characterize possible messages by attacker
• Cryptography
Set of runs of protocol under attack

Notation commonly found in literature

• The notation describes protocol traces
• Does not
  – specify initial conditions
  – define response to arbitrary messages
  – characterize possible behaviors of attacker

    A → B : { A, Nonce_a }Kb
    B → A : { Nonce_a, Nonce_b }Ka
    A → B : { Nonce_b }Kb

Rewriting Notation

Non-deterministic infinite-state systems
Facts (multi-sorted first-order atomic formulas)
    F ::= P(t1, …, tn)
    t ::= x | c | f(t1, …, tn)
States { F1, ..., Fn }
• Multiset of facts
  – Includes network messages, private state
  – Intruder will see messages, not private state

Rewrite rules

Transition
• F1, …, Fk → ∃x1 … ∃xm. G1, …, Gn
What this means
• If F1, …, Fk in state σ, then a next state σ’ has
  – Facts F1, …, Fk removed
  – G1, …, Gn added, with x1 … xm replaced by new symbols
  – Other facts in state σ carry over to σ’
• Free variables in rule universally quantified
Note
• Pattern matching in F1, …, Fk can invert functions
• Linear Logic: F1 ⊗ … ⊗ Fk ⊸ ∃x1 … ∃xm (G1 ⊗ … ⊗ Gn)
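
A single rewriting step can be sketched as follows. This is an illustration under stated assumptions rather than any tool from the talk: facts are flat tuples such as `("A1", "n0")` with no nested function terms, variables are strings starting with `?`, and the names `match`, `match_all`, and `step` are hypothetical. The three example rules are the first transitions of the simplified Needham-Schroeder protocol shown later in the talk.

```python
from collections import Counter
from itertools import count


def match(pattern, fact, env):
    """Match one pattern against a ground fact; return the extended env or None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    env = dict(env)
    for p, v in zip(pattern[1:], fact[1:]):
        if p.startswith("?"):
            if env.setdefault(p, v) != v:   # variable already bound differently
                return None
        elif p != v:
            return None
    return env


def match_all(patterns, state, env):
    """Yield (env, consumed facts) for each way to match all patterns in the multiset."""
    if not patterns:
        yield env, Counter()
        return
    for fact in list(state):
        new_env = match(patterns[0], fact, env)
        if new_env is None:
            continue
        rest = state - Counter([fact])      # each fact occurrence is used once
        for final_env, used in match_all(patterns[1:], rest, new_env):
            yield final_env, used + Counter([fact])


def step(state, rule, fresh):
    """Return every successor state of `state` under rule (lhs, exists, rhs)."""
    lhs, exists, rhs = rule
    successors = []
    for env, used in match_all(lhs, state, {}):
        env = dict(env)
        for x in exists:
            env[x] = next(fresh)            # existential: pick a new symbol
        added = Counter(tuple(env.get(t, t) for t in g) for g in rhs)
        successors.append(state - used + added)
    return successors


R1 = ([], ["?x"], [("A1", "?x")])                       #        -> Ex. A1(x)
R2 = ([("A1", "?x")], [], [("N1", "?x"), ("A2", "?x")])  # A1(x)  -> N1(x), A2(x)
R3 = ([("N1", "?x")], ["?y"], [("B1", "?x", "?y")])      # N1(x)  -> Ey. B1(x,y)

fresh = (f"n{i}" for i in count())
(s1,) = step(Counter(), R1, fresh)
(s2,) = step(s1, R2, fresh)
(s3,) = step(s2, R3, fresh)
print(sorted(s3))
```

Because `step` returns all successors, repeated application enumerates the set of runs, which is how nondeterminism in the model shows up in code.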

Finite-State Example

• Predicates: State, Input
• Function: · (prepends a symbol to the remaining input)
• Constants: q0, q1, q2, q3, a, b, nil
• Transitions:
    State(q0), Input(a · x) → State(q1), Input(x)
    State(q0), Input(b · x) → State(q2), Input(x)
    ...
• Set of rewrite transition sequences = set of runs of automaton

[Figure: automaton with states q0, q1, q2, q3 and transitions labeled a and b; sample input b a b]

Simplified Needham-Schroeder

Predicates
• Ai, Bi, Ni: Alice, Bob, Network in state i

Transitions
    → ∃x. A1(x)
    A1(x) → N1(x), A2(x)
    N1(x) → ∃y. B1(x,y)
    B1(x,y) → N2(x,y), B2(x,y)
    A2(x), N2(x,y) → A3(x,y)
    A3(x,y) → N3(y), A4(x,y)
    B2(x,y), N3(y) → B3(x,y)

(picture on next slide)

Protocol
    A → B: {na, A}Kb
    B → A: {na, nb}Ka
    A → B: {nb}Kb

Authentication
    A4(x,y) ∧ B3(x,y’) → y = y’

Sample Trace

    A → B: {na, A}Kb
    B → A: {na, nb}Ka
    A → B: {nb}Kb

    Rule applied                      State after the step
    → ∃x. A1(x)                       A1(na)
    A1(x) → A2(x), N1(x)              A2(na), N1(na)
    N1(x) → ∃y. B1(x,y)               A2(na), B1(na, nb)
    B1(x,y) → N2(x,y), B2(x,y)        A2(na), N2(na, nb), B2(na, nb)
    A2(x), N2(x,y) → A3(x,y)          A3(na, nb), B2(na, nb)
    A3(x,y) → N3(y), A4(x,y)          A4(na, nb), N3(nb), B2(na, nb)
    B2(x,y), N3(y) → B3(x,y)          A4(na, nb), B3(na, nb)

Common Intruder Model

Derived from Dolev-Yao model
• Adversary is nondeterministic process
• Adversary can
  – Block network traffic
  – Read any message, decompose into parts
  – Decrypt if key is known to adversary
  – Insert new message from data it has observed
• Adversary cannot
  – Gain partial knowledge
  – Guess part of a key
  – Perform statistical tests, …

Formalize Intruder Model

Intercept, decompose and remember messages
    N1(x) → M(x)
    N2(x,y) → M(x), M(y)
    N3(x) → M(x)
Decrypt if key is known
    M(enc(k,x)), M(k) → M(x)
Compose and send messages from “known” data
    M(x) → N1(x), M(x)
    M(x), M(y) → N2(x,y), M(x), M(y)
    M(x) → N3(x), M(x)
Generate new data as needed
    → ∃x. M(x)
Highly nondeterministic, same for any protocol
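
The decompose, decrypt, and compose rules amount to a closure computation over the intruder's memory. The sketch below is one way to picture it, under assumptions introduced here: terms are atoms (strings) or tuples `("pair", a, b)` and `("enc", key, body)`, keys are atomic, and the function names are hypothetical.

```python
def analyze(knowledge):
    """Close a set of terms under projection of pairs and decryption with known keys."""
    known = set(knowledge)
    changed = True
    while changed:
        changed = False
        for t in list(known):
            parts = []
            if isinstance(t, tuple) and t[0] == "pair":
                parts = [t[1], t[2]]
            elif isinstance(t, tuple) and t[0] == "enc" and t[1] in known:
                parts = [t[2]]              # M(enc(k,x)), M(k) -> M(x)
            for p in parts:
                if p not in known:
                    known.add(p)
                    changed = True
    return known


def can_build(term, known):
    """Can the intruder compose `term` from analyzed knowledge by pairing and encrypting?"""
    if term in known:
        return True
    if isinstance(term, tuple) and term[0] in ("pair", "enc"):
        return can_build(term[1], known) and can_build(term[2], known)
    return False


msg1 = ("enc", "kb", ("pair", "na", "A"))   # {na, A}Kb seen on the network
print("na" in analyze({msg1, "ka"}))        # False: wrong key, ciphertext stays opaque
print("na" in analyze({msg1, "kb"}))        # True: key known, so the nonce leaks
```

The "cannot" list from the intruder model is visible in what the code omits: there is no rule that extracts part of a ciphertext or guesses a key.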

Attack on Simplified Protocol

    Rule applied                      State after the step
    → ∃x. A1(x)                       A1(na)
    A1(x) → A2(x), N1(x)              A2(na), N1(na)
    N1(x) → M(x)                      A2(na), M(na)
    → ∃x. M(x)                        A2(na), M(na), M(na’)
    M(x) → N1(x), M(x)                A2(na), N1(na’), M(na), M(na’)
    N1(x) → ∃y. B1(x,y)               A2(na), B1(na’, nb), M(na), M(na’)

Continue “man-in-the-middle” to violate specification

Protocols vs Rewrite rules

Can axiomatize any computational system
But protocols are not arbitrary programs
• Choose principals
• Select roles (in the figure: Client, TGS, Server)

Thesis: MSR Model is accurate

Captures “Dolev-Yao-Needham-Millen-Meadows- …” model
• MSR defines set of traces of protocol and attacker
• Connections with approach in other formalisms
Useful for protocol analysis
• Errors shown by model are errors in protocol
• If no error appears, then no attack can be carried out using only the actions allowed by the model

Complexity results using MSR

Key insight: existential quantification (∃) captures cryptographic nonce; main source of complexity

[Durgin, Lincoln, Mitchell, Scedrov]

[Table residue; cell positions are not recoverable from the extraction. Labels: Intruder w/o ∃, Intruder with ∃, Unbounded use of ∃, Bounded use of ∃, Bounded # of roles. Entries: NP-complete, Undecidable, ??, DExp-time]

All: Finite number of different roles, each role of finite length, bounded message size

Additional decidable cases

Bounded role instances, unbounded msg size
• Huima 99: decidable
• Amadio, Lugiez: NP w/ atomic keys
• Rusinowitch, Turuani: NP-complete, composite keys
• Other studies, e.g., Kusters: unbounded # data fields
Constraint systems
• Cortier, Comon: Limited equality test
• Millen, Shmatikov: Finite-length runs
All: bound number of role instances

Probabilistic Polynomial-Time Process Calculus for Security Protocol Analysis

J. Mitchell, A. Ramanathan, A. Scedrov, V. Teague
P. Lincoln, M. Mitchell

Limitations of Standard Model

Can find some attacks
• Successful analysis of industrial protocols
Other attacks are outside model
• Interaction between protocol and encryption
Some protocols cannot be modeled
• Probabilistic protocols
• Steps that require specific property of encryption
Possible to “OK” an erroneous protocol

Non-formal state of the art

Turing-machine-based analysis
• Canetti
• Bellare, Rogaway
• Bellare, Canetti, Krawczyk
• others …
Prove correctness of protocol transformations
• Example: secure channel -> insecure channel

Language Approach

Write protocol in process calculus
Express security using observational equivalence
• Standard relation from programming language theory: P ≈ Q iff for all contexts C[ ], same observations about C[P] and C[Q]
• Context (environment) represents adversary
Use proof rules for ≈ to prove security
• Protocol is secure if no adversary can distinguish it from some idealized version of the protocol

[Abadi, Gordon]

Probabilistic Poly-time Analysis

Adopt spi-calculus approach, add probability
Probabilistic polynomial-time process calculus
• Protocols use probabilistic primitives
  – Key generation, nonce, probabilistic encryption, ...
• Adversary may be probabilistic
• Modal type system guarantees complexity bounds
Express protocol and specification in calculus
Study security using observational equivalence
• Use probabilistic form of process equivalence

[Lincoln, Mitchell, Mitchell, Scedrov]

Needham-Schroeder Private Key

Analyze part of the protocol P
    A → B : { i }K
    B → A : { f(i) }K
“Obviously” secret protocol Q (zero knowledge)
    A → B : { random_number }K
    B → A : { random_number }K
Analysis: P ≈ Q reduces to crypto condition related to non-malleability [Dolev, Dwork, Naor]
  – Fails for RSA encryption, f(i) = 2i
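
One way to see the failure, assuming textbook (unpadded) RSA with $\{m\}_K = m^{k} \bmod N$ and a publicly known exponent $k$ (both assumptions made for this sketch):

$$\{f(i)\}_K = (2i)^{k} \bmod N = \left(2^{k} \bmod N\right)\left(i^{k} \bmod N\right) \bmod N = \{2\}_K \cdot \{i\}_K \bmod N$$

An observer who sees $\{i\}_K$ can therefore predict B's reply without learning $i$. In Q the two ciphertexts carry independent random numbers, so the same prediction succeeds only by chance, and this gives a test that separates P from Q.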

Technical Challenges

Language for prob. poly-time functions
• Extend Hofmann language with rand
Replace nondeterminism with probability
• Otherwise adversary is too strong ...
Define probabilistic equivalence
• Related to poly-time statistical tests ...
Develop specification by equivalence
• Several examples carried out
Proof systems for probabilistic equivalence
• Work in progress

Basic example

Sequence generated from random seed
    P_n: let b = n^k-bit sequence generated from n random bits
         in PUBLIC b end
Truly random sequence
    Q_n: let b = sequence of n^k random bits
         in PUBLIC b end
P is crypto strong pseudo-random generator
    P ≈ Q
Equivalence is asymptotic in security parameter n

Compositionality

Property of observational equiv

$$\frac{A \approx B \qquad C \approx D}{A \mid C \;\approx\; B \mid D}$$

similarly for other process forms

Current State of Project

New framework for protocol analysis
• Determine crypto requirements of protocols!
• Precise definition of crypto primitives
Probabilistic ptime language
Pi-calculus-like process framework
• replaced nondeterminism with rand
• equivalence based on ptime statistical tests
Proof methods for establishing equivalence
Future: tool development

Protocol logic

Alice’s information
• Protocol
• Private data
• Sends and receives

[Figure: Alice, holding Protocol and Private Data, with Send and Receive arrows to Honest Principals, Attacker]

Intuition

Reason about local information
• I chose a new number
• I sent it out encrypted
• I received it decrypted
• Therefore: someone decrypted it
Incorporate knowledge about protocol
• Protocol: Server only sends m if it got m’
• If server not corrupt and I receive m signed by server, then server received m’

Bidding conventions (motivation)

Blackwood response to 4NT
  – 5♣ : 0 or 4 aces
  – 5♦ : 1 ace
  – 5♥ : 2 aces
  – 5♠ : 3 aces
Reasoning
• If my partner is following Blackwood, then if she bid 5♥, she must have 2 aces

Logical assertions

Modal operator
• [ actions ]_P φ - after actions, P reasons φ
Predicates in φ
• Sent(X,m) - principal X sent message m
• Created(X,m) - X assembled m from parts
• Decrypts(X,m) - X has m and key to decrypt m
• Knows(X,m) - X created m or received msg containing m and has keys to extract m from msg
• Source(m, X, S) - Y ≠ X can only learn m from set S
• Honest(X) - X follows rules of protocol

Correctness of NSL

Bob knows he’s talking to Alice

    [ recv encrypt( Key(B), A, m );          (msg1)
      new n;
      send encrypt( Key(A), m, B, n );
      recv encrypt( Key(B), n ) ]_B          (msg3)
    Honest(A) ⊃ Csent(A, msg1) ∧ Csent(A, msg3)

where Csent(A, …) ≡ Created(A, …) ∧ Sent(A, …)

Honesty rule (rule scheme)

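$$\frac{\forall\ \text{roles } R \text{ of } Q.\ \ \forall\ \text{initial segments } A \subseteq R.\quad Q \vdash [A]_X\, \varphi}{Q \vdash \mathrm{Honest}(X) \supset \varphi}$$

• This is a finitary rule:
  – Typical protocol has 2-3 roles
  – Typical role has 1-3 receives
  – Only need to consider A waiting to receive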

Conclusions

Security Protocols
• Subtle, Mission critical, Prone to error
Analysis methods
• Model checking
  – Practically useful; brute force is a good thing
  – Limitation: find errors in small configurations
• Proof methods
  – Time-consuming to use general logics
  – Special-purpose logics can be sound, useful
Room for another 5+ years of work

Access Control / Trust Mgmt

Conference Registration
• Regular $1000
• Academic $500
• Student $100

Certification
• Root CA → Stanford: Stanford is accred university
• Stanford → Mitchell: Mitchell is regular faculty; Faculty can ident students
• Mitchell → Chander: Chander is my student

Registration message
• Root signs: Stanford is accred university
• Stanford signs: Mitchell is regular faculty; Faculty can ident students
• Mitchell signs: Chander is my student
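
The chain of reasoning the registrar has to perform can be sketched as a small deduction over signed statements. This is an illustration only: signature checking is abstracted away (a credential is just an `(issuer, statement)` pair assumed to be verified already), and the statement encoding and function names are hypothetical.

```python
def is_student(credentials, root, subject):
    """Derive 'subject is a student' from a chain rooted at `root`."""
    universities = {
        stmt[1] for issuer, stmt in credentials
        if issuer == root and stmt[0] == "accredited_university"
    }
    for u in universities:
        # The university must delegate student identification to its faculty.
        if (u, ("faculty_can_identify_students",)) not in credentials:
            continue
        faculty = {
            stmt[1] for issuer, stmt in credentials
            if issuer == u and stmt[0] == "faculty"
        }
        if any((f, ("student", subject)) in credentials for f in faculty):
            return True
    return False


def registration_fee(credentials, root, subject):
    """Student rate if the chain checks out, regular rate otherwise."""
    return 100 if is_student(credentials, root, subject) else 1000


message = {
    ("Root", ("accredited_university", "Stanford")),
    ("Stanford", ("faculty", "Mitchell")),
    ("Stanford", ("faculty_can_identify_students",)),
    ("Mitchell", ("student", "Chander")),
}
print(registration_fee(message, "Root", "Chander"))  # 100
```

Dropping any one credential from `message` breaks the chain and the regular fee applies, which is the kind of policy consequence an access control policy language is meant to make checkable.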

Formal methods

[Figure: horizontal axis "System complexity * Property complexity"; vertical axis "Users * Importance"; regions labeled Worthwhile, Not worth the effort, Not feasible]

System-Property Tradeoff

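[Figure: tools and methods plotted with horizontal axis "Protocol complexity" (Low to High) and vertical axis "Sophistication of attacks" (Low to High). Labels: Murφ, FDR, NRL, Athena, Hand proofs, Paulson, Bolignano, BAN logic, Spi-calculus, Poly-time calculus, Model checking, Multiset rewriting with ∃, Protocol logic]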

The Wedge of Formal Verification

[Figure: "Value to Design" plotted against "Effort Invested"; labels: Verify, Refute, Abstract, Invisible FM]

Big Picture

Biggest problem in CS
• Produce good software efficiently
Best tool
• The computer
Therefore
• Future improvements in computer science/industry depend on our ability to automate software design, development, and quality control processes
