David Evans [email protected] http://www.cs.virginia.edu/~evans The Bugs and the Bees Research in Programming Languages and Security University of Virginia Department of Computer Science


Page 1: David Evans evans@cs.virginia cs.virginia/~evans

David Evans
[email protected]
http://www.cs.virginia.edu/~evans

The Bugs and the Bees

Research in Programming Languages and Security

University of Virginia
Department of Computer Science

23 Sept 2002 David Evans - CS696 2

Computer Science

• “How to” knowledge:
  – Ways of describing imperative processes (computations)
  – Ways of reasoning about (predicting) what imperative processes will do

• Most interesting CS problems concern:
  – Better ways of describing computations
  – Ways of reasoning about what they do (and don’t do)

My Research Projects

The Bugs – Splint
How can we detect code that describes unintended computations?

The Bees – “Programming the Swarm”
How can we program massively distributed collections of simple devices and reason about their behavior in hostile environments?

A Gross Oversimplification

[Figure: bugs detected (none to all) plotted against effort required (low to unfathomable), with compilers, Splint, and formal verifiers marked.]

(Almost) Everyone Likes Types

• Easy to Understand

• Easy to Use

• Quickly Detect Many Programming Errors

• Useful Documentation

• …even though they are lots of work!
  – 1/4 of text of typical C program is for types

Limitations of Standard Types

Standard types                     Attributes
Type of reference never changes    State changes along program paths
Language defines checking rules    System or programmer defines checking rules
One type per reference             Many attributes per reference

Approach

• Programmers add annotations (formal specifications)
  – Simple and precise
  – Describe programmers’ intent:
    • Types, memory management, data hiding, aliasing, modification, null-ity, buffer sizes, security, etc.

• Splint detects inconsistencies between annotations and code
  – Simple (fast!) dataflow analyses
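
To make the annotation idea concrete, here is a minimal sketch in C. `/*@null@*/` and `/*@only@*/` are Splint annotations; the functions `copy_name` and `first_initial` are hypothetical names invented for this illustration and do not come from the talk. An ordinary C compiler treats the annotations as comments.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src.
   null: the result may be NULL (allocation can fail).
   only: the caller receives the sole reference and must free it. */
/*@null@*/ /*@only@*/ char *copy_name(const char *src)
{
    size_t len = strlen(src) + 1;   /* include the terminating '\0' */
    char *dst = malloc(len);
    if (dst == NULL) {
        return NULL;
    }
    memcpy(dst, src, len);
    return dst;
}

/* Hypothetical helper: the parameter is declared as possibly NULL,
   so the body has to test it before dereferencing. */
char first_initial(/*@null@*/ const char *name)
{
    if (name == NULL) {
        return '\0';
    }
    return name[0];
}
```

If the `NULL` check in `first_initial` were deleted, the code would contradict its own `/*@null@*/` annotation, which is the kind of inconsistency between annotations and code that Splint is meant to report.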

Security Flaws

Malformed Input     16%
Resource Leaks       6%
Format Bugs          6%
Buffer Overflows    19%
Access              16%
Pathnames           10%
Symbolic Links      11%
Other               16%

Reported flaws in Common Vulnerabilities and Exposures Database, Jan-Sep 2001. [Evans & Larochelle, IEEE Software, Jan 2002.]

• 190 vulnerabilities
• Only 4 having to do with crypto
• 108 of them could have been detected with simple static analyses!

Example: Buffer Overflows
David Larochelle

• Most commonly exploited security vulnerability
  – 1988 Internet Worm
  – Still the most common attack
    • Code Red exploited buffer overflow in IIS
    • >50% of CERT advisories, 23% of CVE entries in 2001

• Attributes describe sizes of allocated buffers
• Heuristics for analyzing loops
• Found several known and unknown buffer overflow vulnerabilities in wu-ftpd
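
A sketch of how a buffer-size attribute might look, assuming Splint's `requires` clause and `maxSet` (the highest index of a buffer that may safely be written). `bounded_copy` is a hypothetical helper written for this illustration, not code from wu-ftpd or from the talk, and the exact clause syntax should be checked against the Splint manual.

```c
#include <stddef.h>

/* Hypothetical helper: copies at most cap - 1 characters of src into dst
   and always terminates dst. The annotation states the caller's
   obligation: dst must be writable up to index cap - 1. */
void bounded_copy(char *dst, size_t cap, const char *src)
    /*@requires maxSet(dst) >= (cap - 1)@*/
{
    size_t i = 0;

    if (cap == 0) {
        return;                     /* no room even for the terminator */
    }
    /* Stop one short of cap so the terminator always fits. */
    while (i + 1 < cap && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
}
```

A call that passes a `cap` larger than the real size of `dst` would violate the stated precondition, and the loop bound `i + 1 < cap` is the sort of pattern a loop heuristic has to recognize to conclude that every write stays in range.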

Some Open Issues

• Differential Program Analysis [Joel Winstead]
  – We usually don’t just have one program, we have lots of versions of similar programs
  – How can we discover interesting differences between two versions of a program?
    • e.g., find a test case that reveals the difference, find invariants that are different

• Design-level Properties
  – Can we develop annotations and checks that deal with design-level properties?

• Integrate run-time checking
  – Combine static and run-time checking to enable additional checking and completeness guarantees

Splint

• More information: splint.org
  IEEE Software ’02, USENIX Security ’01, PLDI ’96
• Public release – real users, mentioned in C FAQ, C Unleashed, Linux Journal, etc.
• Students (includes other PL/SE/security related projects):
  – David Larochelle: buffer overflows, automatic annotations
  – Joel Winstead: differential program analysis
  – Greg Yukl: source code generation
• Current Funding: NASA (joint with John Knight)

Programming the Swarm

Really Brief History of Computer Science

1950s: Programming in the small...
  • Programmable computers
  • Learned that programming is hard
  • Birth of higher-order languages
  • Tools for reasoning about trivial programs

1970s: Programming in the large...
  • Abstraction, objects
  • Methodologies for development
  • Tools for reasoning about component-based systems

2000s: Programming the Swarm!

What’s Changing

• Execution Platforms
  – Small, cheap and unreliable
  – Limited power – communication is expensive

• Execution environment
  – Interact with physical world
  – Unpredictable, dynamic

• Programs
  – Old style of programming won’t work
  – Is there a new paradigm?

Programming the Swarm: Long-Range Goal

Cement
10 GFlop

Why Might This Be Possible?

• We are surrounded by systems that:
  – Contain 50 Trillion (5 * 10^13) components
  – Continue to function when 50 million components fail every second
  – Survive in hostile environments (even Canada!)
  – Self-organize starting from a single component and a program that is smaller than Windows XP

A Biological Programming Model
Selvin George

• Program systems the way biology does

• Literal interpretation:
  – Cells can change state (genes turn on and off)
  – Cells can divide
    • Asymmetrically
  – Cells can communicate over short distances
    • Chemical diffusion

Example Cell Program

state s1 {
  transitions
    -> (s1, s1) normal;
}

Cell Programs

• Use chemicals to control development
• How can we produce cell programs that generate particular structures?
• How can we reason about the behavior of cell programs in the presence of failures and randomness?
• How can we describe cell programs at a higher level? (Making abstractions)

Less Literal Interpretation

• Learn about self-organization and robustness by mimicking biology
  – Learn principles from biology, not programs

• Use this to build real systems
  – Sensor networks
  – Distributed file sharing

Sensor Networks

Thousands of small, low-powered devices with sensors and actuators, communicating wirelessly

High-power base station

Sensor Networks

[Figure: sensor network with labels “High-power base station”, “Compromised Node!”, and “Enemy base station”]

Security for Sensor Networks

• Control Messages
  – Only messages from base station (or other nodes) should change device behavior

• Data Collection
  – A few compromised nodes should not be able to prevent or tamper with data collection

• Data Confidentiality
  – Some applications: eavesdropper shouldn’t be able to interpret messages

Why security for sensor networks is hard

• Low power devices
  – Cannot do traditional public-key algorithms

• Limited device communication
  – Sending messages is extremely expensive

• Communication is wireless
  – All messages are vulnerable to eavesdropping and forgery

• Devices start identical – no stored secrets

Asymmetric Cryptography

• Cryptography depends either on:
  – Shared secrets
  – Asymmetry (normally of information)

• Exploit time and space asymmetries
  – Public-key systems get asymmetry by only one party knowing private key
  – In sensor networks, we can get asymmetry by using time (key is revealed later, but in a verifiable way) and space (only nodes within a certain distance can hear)
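
One way to picture "revealed later, but in a verifiable way" is a one-way hash chain. This is an assumption made for illustration; the slide does not say which construction the project uses. In the sketch below, `toy_hash`, `build_chain`, and `verify_revealed_key` are hypothetical names, and `toy_hash` is a 64-bit FNV-1a hash standing in for a real one-way function. FNV-1a is not cryptographically secure, so this shows only the structure of the scheme.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a one-way function: 64-bit FNV-1a over the 8 key bytes.
   NOT cryptographically secure; a real system needs a cryptographic hash. */
static uint64_t toy_hash(uint64_t key)
{
    uint64_t h = 0xcbf29ce484222325ULL;      /* FNV offset basis */
    for (int i = 0; i < 8; i++) {
        h ^= (key >> (8 * i)) & 0xffu;       /* mix in one byte, low byte first */
        h *= 0x100000001b3ULL;               /* FNV prime */
    }
    return h;
}

/* Fills chain[0..n-1] so that chain[i] == toy_hash(chain[i + 1]).
   chain[n - 1] is the secret seed; chain[0] is the commitment that
   every node is given before any key is revealed. */
void build_chain(uint64_t seed, uint64_t *chain, size_t n)
{
    if (n == 0) {
        return;
    }
    chain[n - 1] = seed;
    for (size_t i = n - 1; i > 0; i--) {
        chain[i - 1] = toy_hash(chain[i]);
    }
}

/* A node accepts a newly revealed key only if hashing it gives the key
   the node already trusts. Returns 1 on success, 0 otherwise. */
int verify_revealed_key(uint64_t trusted, uint64_t revealed)
{
    return toy_hash(revealed) == trusted;
}
```

The sender reveals `chain[1]`, then `chain[2]`, and so on. Each key can be checked against the previous one, but with a genuinely one-way hash nobody could have computed it before it was revealed.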

Non-Cryptographic Techniques

• Redundancy
  – Lots of sensors, only a few will be compromised or bogus

• Snooping
  – Because communication is wireless, nodes can hear what their neighbors are saying
  – If they are lying, tattle tale!
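
A small sketch of why redundancy helps, assuming readings are combined by taking the median. The choice of the median and the name `median_reading` are illustrative assumptions, not something stated in the talk. As long as fewer than half of the readings are bogus, the median stays within the range of the honest ones.

```c
#include <stdlib.h>

/* qsort comparator for doubles: negative, zero, or positive. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical aggregator: sorts the readings in place and returns
   their median. n must be at least 1. */
double median_reading(double *readings, size_t n)
{
    qsort(readings, n, sizeof readings[0], compare_doubles);
    if (n % 2 == 1) {
        return readings[n / 2];
    }
    /* Even count: average the two middle values. */
    return (readings[n / 2 - 1] + readings[n / 2]) / 2.0;
}
```

With five temperature readings of 20.1, 19.9, 20.0, 500.0, and 20.2, where one node reports a wildly wrong value, the median is 20.1, whereas a plain average would be pulled far off by the single bad node.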

Programming the Swarm
swarm.cs.virginia.edu

• Students:
  – Selvin George: Biological Programming Model
  – Undergraduates: Keen Browne, Jacques Fournier, Chris Frost, Ami Malaviya, Jon McCune

• Funding: NSF Career Award, NSF ITR

Summary

• Programming the Swarm: Describing and reasoning about behavior of large ad hoc collections in hostile environments

• Splint: Detecting differences between what programs express and what programmers intend

• Be proactive about finding an advisor
  – Most important decision you will make in grad school
  – Matching process is last resort

• Email to arrange meetings: [email protected]