Secure Certifying Compilation
David Walker
Cornell University
What do you want to type check today?
April 12, 2000
David Walker, Cornell University 2
Extensible Systems
Many systems have programmable interfaces.
– printers and editors (PostScript printers, Emacs, Word)
– browsers and servers (applets, plugins, CGI-scripts)
– operating systems (virus scanners)
– networks (active networks, JINI)
[Figure: Code; System Interface; Download, Link & Execute]
Extensible Systems: Pros
• Client-side customization
– plug in your own devices, 3rd-party utilities
• Preservation of market-share
– vendors can add features, improve functionality easily
• System maintenance and evolution
– software subscriptions
Extensible Systems: Cons
• Security
– extensibility opens system to malicious attacks
– how do we prevent misuse of resources?
• Reliability
– flexibility makes it hard to reason about system evolution
– how do we limit damage done by erroneous extensions?
Extensible Systems: Reality
• Strong economic and engineering pros
– Mobile code, systems with programmable interfaces will proliferate
• A necessity: practical technology for increasing the security and reliability of extensible systems
Outline
• Framework for improved reliability and security
– Idea I: certifying compilation
– Idea II: security via code instrumentation
• An instance [POPL '00]
– Security automaton specifications
– A dependently-typed target language (TAL)
• Related work & research directions
Certified Code
• Attach annotations/certificate (types, proofs, ...) to untrusted object code extensions
• Certificates make verification feasible
• Move away from trust-based security & reliability
[Figure: Untrusted Code; Download & Check Certificate; System Interface; Link & Execute; Secure Code]
Certifying Compilation
[Figure: High-level Program → Compile → Optimize → Annotated IR (with certificate) → Transmit]
• Low-level certificate generation must be automated
• Necessary components:
1) a source-level programming language
2) a compiler to compile and optimize source programs while preserving the certificate
3) a certifying target language
Question
How should we obtain the initial certificate?
Answer
• Use a type-safe language
• Type inference relieves the tedium of proof construction
• Programmers will rewrite programs so they type check
Certifying Compilation So Far
[Figure: Type Safe High-level Program → Compile → Optimize → Typed Program (with types) → Transmit]
1) a strongly typed source-level programming language
2) a type-preserving compiler to compile and optimize source programs
3) a certificate language for type-safety properties
Certifying Compilers
• Proof-Carrying Code [Necula & Lee]
– an expressive base logic that can encode many security policies
– in practice, logic is extended with a type system
– compilers produce type safety proofs
• Typed Assembly Language [Morrisett, Walker, et al]
– flexible type constructor language that can encode high-level abstractions
– guarantees type safety properties
Conventional Type Safety
• Conventional types ensure basic safety:
– basic operations performed correctly
– abstraction/interfaces hide data representations and system code
• Conventional types don't describe complex security policies
– e.g., policies that depend upon history
• Melissa virus reads Outlook contacts list and then sends 50 emails
Outline
• Framework for improved reliability and security
– Idea I: certifying compilation
– Idea II: security via code instrumentation
• An instance [POPL '00]
– Security automaton specifications
– A dependently-typed target language (TAL)
• Related work & research directions
Flexible Security Policies
• Specify policies independently of extensible system
• Compiler instruments extensions
• Easier to understand, debug, evolve policies
[Figure: High-level Extension; Security Policy; Compiler; Instrument; Analyze & Optimize]
Security Policy Specifications
• Requirement: a language for specifying security policies
• Features:
– Notation for specifying events of interest
• "network send" and "file read" are security-sensitive
– Notation for specifying illegal behaviour
• a privacy policy: "no send after read"
– A feasible compilation strategy
• must be able to prevent programs from violating the policy
Examples
• SFI [Wahbe et al]
– events are read, write, jump
– enforce memory safety properties
• SASI [Erlingsson & Schneider], Naccio [Evans & Twyman]
– flexible policy languages
– not certifying compilers
Putting it Together
– define policies in a high-level, flexible and system-independent specification language
– instrument system extensions both with dynamic security checks and static information
– preserve proof of security policy during compilation and optimization
– verify certified compiler output to reduce the trusted computing base (TCB)
Outline
• Framework for improved reliability and security
– Idea I: certifying compilation
– Idea II: security via code instrumentation
• An instance [POPL '00]
– Security automaton specifications
– A dependently-typed target language (TAL)
• Related work & research directions
Secure Certified Code
• Overview of Architecture
• Security Automata [Erlingsson & Schneider]
– How to specify security properties
– A simple compilation strategy
• A dependently-typed target language (TAL)
– A brief introduction to TAL
– Extensions for certifying security properties
• theoretical core language proven sound
• can express any security automaton policy
Security Architecture
[Figure: security architecture. Labels: High-level Extension; Security Automaton Specification; Instrument; Optimize; Annotate; Secure Typed Extension; Transmit; Secure Typed Interface; System Interface; Type Check; Secure Executable]
Security Automata
• A general mechanism for specifying security policies
• Specify any safety property
– access control policies:
• “cannot access file foo”
– resource bound policies:
• “allocate no more than 1M of memory”
– the Melissa policy:
• “no network send after file read”
Example
• Policy: No send operation after a read operation
• States: start, has read, bad
• Inputs (program operations): send, read
• Transitions (state x input -> state):
– start x read(f) -> has read
[Figure: automaton diagram with states start, has read and bad; edges labeled send, read(f), read(f), send]
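The automaton on this slide can be written down as a small transition table. The following is a minimal Python sketch, not the notation used in the talk: the names `State`, `TRANSITIONS`, `step` and `run` are hypothetical, and only the transition `start x read(f) -> has read` is spelled out on the slide. The other three entries are read off the diagram and the follow-up example, so treat them as an assumption.

```python
from enum import Enum


class State(Enum):
    START = "start"
    HAS_READ = "has read"
    BAD = "bad"


# Transition table (state x input -> state).
# Assumption: only (START, "read") is listed explicitly on the slide;
# the remaining entries are inferred from the automaton diagram.
TRANSITIONS = {
    (State.START, "read"): State.HAS_READ,
    (State.START, "send"): State.START,
    (State.HAS_READ, "read"): State.HAS_READ,
    (State.HAS_READ, "send"): State.BAD,
}


def step(state, op):
    """Apply one program operation; bad is treated as a sink state."""
    if state is State.BAD:
        return State.BAD
    return TRANSITIONS[(state, op)]


def run(ops, state=State.START):
    """Feed a whole trace of operations to the automaton."""
    for op in ops:
        state = step(state, op)
    return state
```

A trace violates the policy exactly when `run` ends in `State.BAD`, which here happens only if a `send` follows a `read`.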
Example Cont’d
[Figure: the send/read automaton from the previous slide]
% untrusted program    % s.a.: start state
send();                % ok -> start
read(f);               % ok -> has read
send();                % bad, security violation
• S.A.s monitor program execution
• Entering the bad state = security violation
Bounding Resource Use
• Policy: "allocate fewer than n bytes"
[Figure: automaton with states 0, ..., i, ..., n - 1 and bad; edges labeled malloc(i)]
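One way to read the diagram: each state records how many bytes have been allocated so far, and a `malloc(i)` that would reach n or more leads to bad. The Python sketch below follows that reading. The names `BAD` and `make_malloc_transition` are hypothetical, and the sketch assumes requested sizes are non-negative.

```python
BAD = "bad"


def make_malloc_transition(n):
    """Build the transition function for 'allocate fewer than n bytes'.

    States are the integers 0 .. n - 1 (bytes allocated so far) plus BAD.
    """

    def on_malloc(state, i):
        if i < 0:
            raise ValueError("allocation size must be non-negative")
        if state == BAD:
            return BAD  # bad is a sink state
        total = state + i
        # Staying strictly below n keeps the program inside the policy.
        return total if total < n else BAD

    return on_malloc
```

For example, with `n = 10`, allocating 4 and then 5 bytes ends in state 9, and any further non-zero allocation moves to `BAD`.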
Enforcing S.A. Specs
• Every security-relevant operation has an associated function: checkop
• Trusted, provided by policy writer
• checkop implements the s.a. transition function
checksend (state) =
  if state = start then start
  else halt            % terminates execution
Enforcing S.A. Specs
• Easy, wrap all function calls in checks:
send()   becomes   let next_state = checksend(current_state) in send()
• Improve performance using program analysis
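The wrapping step can be sketched in Python as follows. This is an illustration under stated assumptions, not the compiler's actual instrumentation: `Halt`, `Monitor` and `guard` are hypothetical names, `checksend` mirrors the definition given earlier in the talk, and `checkread` is an assumed counterpart that always moves to the has read state.

```python
class Halt(Exception):
    """Raised by a check function to terminate the untrusted extension."""


def checksend(state):
    # As on the earlier slide: send is only allowed in start, and stays there.
    if state == "start":
        return "start"
    raise Halt("send not allowed in state " + repr(state))


def checkread(state):
    # Assumption: read is always allowed and leads to "has read".
    return "has read"


class Monitor:
    """Holds the current automaton state and wraps operations in checks."""

    def __init__(self):
        self.state = "start"

    def guard(self, check, op):
        def wrapped(*args, **kwargs):
            # Run the check first; op runs only if the check does not halt.
            self.state = check(self.state)
            return op(*args, **kwargs)

        return wrapped
```

A caller would wrap each security-relevant operation once, for example `send = monitor.guard(checksend, raw_send)`, and hand only the wrapped version to the extension.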
Outline
• Technology for improved reliability and security
– Idea I: certifying compilation
– Idea II: security via code instrumentation
• Secure certifying compilation [POPL '00]
– Security automaton specifications
– A dependently-typed target language (TAL)
• Related work & research directions
Brief TAL Overview
• Assembly or machine code with typing annotations
• Object files checked separately and linked together
• Ensures basic safety without run-time checks
– Memory safety: can't read/write arbitrary memory
– Control-flow safety: can't execute arbitrary data
– Type abstraction: TAL can encode and enforce high-level abstract data types
[Figure: Typecheck; Link]
A TAL Compiler
• TAL is practical
– We compile "safe C" (aka Popcorn)
– No pointer arithmetic, unsafe casts
– ML-style data types, polymorphism, exceptions
– Some simple optimizations
• null-check elimination, inlining, register allocation
– The compiler bootstraps
• most compiler hacking by Grossman, Morrisett, Smith
Other TAL Features
• Memory management features
– Stack types
– Aliasing
– Region-based MM
– See Dave’s thesis
• Other features
– Dynamic linking
– Run-time code generation
• http://www.cs.cornell/talc
Typing Assembly Code
• Programs divided into labeled code blocks
• Each block has a code type: {eax:τ1, ebx:τ2, ...}
• Code types specify expected register contents
– Assume code type to check the block
– Prove control transfers (jumps) meet the assumptions
Foo : {eax: int, ecx: {eax: int}}
  mov ebx, 3;      % {eax: int, ebx: int, ecx: {eax: int}}
  add eax, ebx;    % OK
  jmp ecx          % OK
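The "assume the code type, then prove each jump meets its target's assumptions" discipline can be illustrated with a toy checker. The Python sketch below is a drastic simplification and not the TAL type system: it knows only `int` and code types (modelled as dictionaries from register names to types), handles just `mov` with an integer immediate, `add` and `jmp`, and compares types by equality. All names (`INT`, `satisfies`, `check_block`) are hypothetical.

```python
INT = "int"


def satisfies(regs, required):
    """True if register file `regs` provides every register `required` asks for."""
    return all(r in regs and regs[r] == t for r, t in required.items())


def check_block(code_type, instrs):
    """Check a straight-line block, assuming `code_type` holds on entry."""
    regs = dict(code_type)
    for instr in instrs:
        op = instr[0]
        if op == "mov":  # mov r, n (integer immediate only)
            _, dst, imm = instr
            if not isinstance(imm, int):
                return False
            regs[dst] = INT
        elif op == "add":  # both operands must be integers
            _, dst, src = instr
            if regs.get(dst) != INT or regs.get(src) != INT:
                return False
        elif op == "jmp":  # target must be code whose assumptions we meet
            _, target = instr
            target_type = regs.get(target)
            if not isinstance(target_type, dict):
                return False
            return satisfies(regs, target_type)
        else:
            return False
    return False  # block fell off the end without a control transfer


FOO_TYPE = {"eax": INT, "ecx": {"eax": INT}}
FOO_BODY = [("mov", "ebx", 3), ("add", "eax", "ebx"), ("jmp", "ecx")]
```

`check_block(FOO_TYPE, FOO_BODY)` accepts the block from the slide, while replacing the last instruction with `("jmp", "eax")` is rejected because `eax` holds an `int`, not code.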
Increasing Expressiveness
• Basic types ensure standard type safety
– functions and data used as intended and cannot be confused
– security checks can’t be circumvented
• Introduce a logic into the type system to express security invariants
• Use the logic to encode the s.a. policy
• Use the logic to prove checks unnecessary
Target Language Predicates
• States (for compile-time reasoning)
• constants: start, has read, bad, ...
• variables: ρ1, ρ2, ...
• Predicates:
– describe security states
• instate(ρ)
– describe relationships between states
• transsend(ρ1,ρ2)
– describe dependencies between values
• (see the paper)
Preconditions
• Code types can specify preconditions:
foo: [ρ, instate(ρ), ρ ≠ bad].{eax:τ1, ecx:τ2}
• A typical use:
bar: {...}
  ...              % Known: instate(start)
  ...
  jmp foo [start]
– instantiate polymorphic variable
– prove residual preconditions
– e.g., instate(start), start ≠ bad
– hope proofs are easy (syntactic matching)
– otherwise place explicit proof at call site
– e.g., jmp foo [start, Proof, Proof]
Postconditions
• Expressed as a precondition on the return address type:
bar: { eax: τ1, ecx: [instate(has read)].{eax: τ2} }
• Before returning, bar proves instate(has read)
• After return, assume instate(has read)
Encoding Security Automata
• Each security-relevant function has a type specifying 3 preconditions, 1 postcondition
• the send function:
send: [curr,next,P1,P2,P3].{ ecx: [P4].{ } }
Pre: P1, P2, P3
– P1: instate(curr)
– P2: transsend(curr,next)
– P3: next ≠ bad
Post: P4
– P4: instate(next)
Technical Note
• State predicates behave linearly
– as in linear logic, each state predicate is used once
– instate(curr) is "consumed" at send call site
• can't be used in future proofs
• can't fool type system into thinking code continues to be in state curr
– instate(next) is "produced" on return
• will be used when next calling a security-sensitive function
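The consume/produce discipline can be mimicked at run time with a one-shot token. This Python sketch only illustrates the idea: in the talk the linearity is enforced statically by the type system, whereas here reuse is caught dynamically. The names `InState` and `TRANS_SEND` are hypothetical, and the `has read -> bad` entry is read off the automaton diagram.

```python
class InState:
    """A one-shot token standing in for the static fact instate(state)."""

    def __init__(self, state):
        self.state = state
        self._used = False

    def consume(self):
        if self._used:
            raise RuntimeError("instate(%s) was already consumed" % self.state)
        self._used = True
        return self.state


# transsend as a table (assumption: taken from the automaton diagram).
TRANS_SEND = {"start": "start", "has read": "bad"}


def send(token):
    curr = token.consume()  # P1: instate(curr) is consumed at the call site
    nxt = TRANS_SEND[curr]  # P2: transsend(curr, next)
    if nxt == "bad":        # P3: next must differ from bad
        raise PermissionError("send would enter the bad state")
    # ... the actual send would happen here ...
    return InState(nxt)     # P4: instate(next) is produced on return
```

Calling `send` twice with the same token fails on the second call, which is the run-time analogue of not being able to reuse `instate(curr)` in a later proof.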
Compile-time & Run-time
• Compile-time reasoning depends on run-time values
foo:  mov eax, state     % should represent the current state
      mov ecx, ret1
      jmp checksend      % state argument, state result in eax
ret1: push eax           % save next state on the stack
      mov ecx, ret2
      jmp send           % must establish precondition for send
checksend:               % postcond. == precond. for ret1, send
Checksend
• A type for checksend (first try)
checksend:
[curr,P1].{eax:state, ecx:[next,P1,P2,P3].{eax:state} }
where
P1 = instate(curr), P2 = transsend(curr,next), P3 = next ≠ bad
Checksend
• A type for checksend (first try)
• No correspondence between run-time argument and static information
checksend:
[curr,P1].{eax:state, ecx:[next,P1,P2,P3].{eax:state} }
where
P1 = instate(curr), P2 = transsend(curr,next), P3 = next ≠ bad
mov eax, wrong_state; mov ecx, next; jmp checksend
Checksend
• Solution: provide very precise types
• Singleton types
– A type containing one value
– eax : state(start)
• means eax contains a data structure that represents exactly the start state and no other state
– eax : state(ρ)
• eax contains data representing the unknown state ρ
• useful in many contexts
– Similar to Dependent ML [Xi & Pfenning]
Using Singletons
• checksend
[curr,P1].{eax:state(curr),ecx:[next,P1,P2,P3].{eax:state(next)}}
– P1 = instate(curr), P2 = transsend(curr,next), P3 = next ≠ bad
– implements the automaton transition function
• intuitively has type state -> state
• singletons help relate run-time values to compile-time predicates
Using Checksend
foo:  { … }
      ...                    % Assume: instate(curr), eax : state(curr)
      mov ecx, ret1
      jmp checksend [curr]
ret1: [next, instate(curr), transsend(curr,next), next ≠ bad].
      {eax:state(next)}.
      push eax;
      mov ecx, ret2;
      jmp send [curr,next]   % P1 & P2 & P3 ==> ok
ret2: ...
Optimization
• Analysis of s.a. structure makes redundant check elimination possible
– e.g.:
– identify transsend(start,start) as valid
[Figure: the send/read automaton with states start, has read and bad]
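A rough Python sketch of this kind of analysis on straight-line code is given below. It is not the compiler's algorithm: it tracks the set of states the program might be in and marks a dynamic check as unnecessary whenever no possible state can reach bad on that operation. The name `plan_checks` is hypothetical, and the transition table is read off the automaton diagram.

```python
# Assumption: transitions taken from the send/read automaton diagram.
TRANSITIONS = {
    ("start", "send"): "start",
    ("start", "read"): "has read",
    ("has read", "read"): "has read",
    ("has read", "send"): "bad",
}


def plan_checks(ops, transitions, possible, bad="bad"):
    """For each op in a straight-line trace, decide if its check can be dropped.

    `possible` is the set of states the program may be in on entry.
    """
    plan = []
    for op in ops:
        successors = {transitions[(s, op)] for s in possible}
        needs_check = bad in successors
        plan.append((op, "check" if needs_check else "no check"))
        # After a passed check (or a provably safe op), bad is excluded.
        possible = successors - {bad}
    return plan
```

Starting from `{"start"}`, every `send` in a run of sends is marked "no check", which corresponds to using `transsend(start,start)` as a known-valid fact. Starting from an unknown state, the first `send` needs a check but the following one does not, because a passed check implies the state is `start`.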
Optimization
[Figure: Policy; High-level Interface; Low-level Interface. The low-level interface lists types for send, read, checksend and checkread, plus Axiom A = transsend(start,start)]
Optimization
• Type-checker is simple but general
• Typical optimizations
– redundant check removal
– loop invariant removal
send: [curr,next,instate(curr),transsend(curr,next), next ≠ bad].{ecx: [P4].{ }}
loop: [instate(start)].{ }
      mov ecx, loop
      jmp send [start,start,By A];
Implementation
• TALx86 implementation is sufficient for these encodings
– includes polymorphism, higher-order type constructors, logical connectives, singleton types, ...
• Lots more work to be done
– axioms in module interfaces
– policy compiler
Outline
• Technology for improved reliability and security
– Idea I: certifying compilation
– Idea II: security via code instrumentation
• Secure certifying compilation [POPL '00]
– Security automaton specifications
– A certifying target language
• Related work & research directions
Research Directions
• Design of policy languages
– What kinds of logics can we compile & certify?
• Mawl [Sandholm & Schwartzbach]
• TALres [Crary & Weirich]
• Design of safety architecture
– How do we "clean up" after halting a program?
– Support for mutually distrustful agents
• Policy-directed optimizations
Summary
• A recipe for secure certified code:
– types
• ensure basic safety without run-time overhead
• add a logic to encode complex invariants
– policy-directed code instrumentation
• specify security policies independently of the rest of the system
• use dynamic checking to enforce policies when they can’t be proven statically