
Static Code Checking: Security and Concurrency Ben Watson The George Washington University CS 297 Security and Programming Languages June 9, 2005



The Video

The Problem

- How to discover errors in code without running it
- Code can run for weeks or months without displaying the error
- Many errors are caused by pieces of code that are very difficult to test
  - Device drivers: manufacturers aren’t always good at this, and one OS company can’t possibly test all the tens of thousands of devices out there
    - The Windows 98 crash was caused by a bad scanner driver
  - Concurrent code: debugging complicated concurrency problems is a nightmare x n.

The Scope

System              Lines of code (estimated)
Windows 3.1         3,000,000
Windows NT 3.51     4,000,000
Windows 95          15,000,000
RedHat Linux 7.1    30,000,000
Windows 2000        35,000,000
Windows XP          40,000,000
Debian Linux 2.2    56,000,000
Debian Linux 3.1    213,000,000

The Real Problem

- We’re only human
  - No person, no group of people can possibly manually debug anything as complicated as an OS and its related pieces
- Good tools are not enough
- Can’t rely on thorough annotations of the entire code base
- Can’t rely on manual directions: the more automated the better

The Solutions

- MC security checking system
- RacerX: race condition and deadlock detection
- General rule inference from source code

MECA: Statically Checking Security Properties

- Checks low-level properties (pointer safety, etc.)
- Relies on annotations that propagate through the analysis
- Goals
  - Expressiveness
  - Low manual overhead: programmers only have to type in a relatively small number of annotations
  - Few false positives

How MC Works

- Uses a modified GCC compiler
- Parses source along with the abstract syntax tree generated by the compiler
- AST used to build a control-flow graph
- Annotation propagator uses the CFG to propagate annotations through the entire graph
- Checkers are run on the completed graph
- Results are ranked and filtered

An example

- Rule: the OS kernel may not directly dereference a user pointer (there are “paranoid” functions to access the data pointed to by a user pointer)
  - Such pointers are referred to as “tainted” pointers
- Annotate:
  - Tainted variables, parameters, and fields
  - Functions that produce tainted values
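
The rule above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `user_space`, `user_range_ok`, `paranoid_copy_from_user`, and `sys_read_int` are hypothetical names invented for the illustration, and "user memory" is modeled as a single static array. In Linux, the real paranoid accessors are functions such as `copy_from_user`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: "user space" is this one array; any address
 * outside it is treated as not belonging to the user. */
static int user_space[16];

/* Returns 1 if the n bytes starting at p lie entirely inside user_space. */
static int user_range_ok(const void *p, size_t n)
{
    uintptr_t lo = (uintptr_t)user_space;
    uintptr_t a = (uintptr_t)p;

    return n <= sizeof user_space && a >= lo &&
           a - lo <= sizeof user_space - n;
}

/* "Paranoid" accessor: validates the tainted pointer before touching it.
 * Returns 0 on success, -1 if the range is not valid user memory. */
static int paranoid_copy_from_user(void *dst, /* tainted */ const void *src,
                                   size_t n)
{
    if (!user_range_ok(src, n))
        return -1;
    memcpy(dst, src, n);
    return 0;
}

/* Simulated system call: arg comes from the user and is therefore tainted. */
static int sys_read_int(/* tainted */ const int *arg, int *out)
{
    /* What a checker would flag:  *out = *arg;  (direct dereference) */
    return paranoid_copy_from_user(out, arg, sizeof *out);
}
```

Calling `sys_read_int(&user_space[0], &out)` succeeds, while passing the address of any object outside `user_space` returns -1 and leaves `out` untouched.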

Source annotations

    struct myStruct {
        /*@ tainted */ int *p;
    };

    /*@ tainted */ int *foo(/*@ tainted */ int *p);

    void memcpy(/*@ !tainted */ void *dst, /*@ !tainted */ void *src, unsigned nbytes);

Source annotations

    // Binding:
    /*@ set_length($ret, sz) */
    void *malloc(unsigned sz);

    // Global: parameters of all sys_* calls are tainted
    /*@ global $param ${!strncmp(current_fn, "sys_", 4)} ==> tainted */

Propagation

    void bar(/*@ tainted */ void *p);
    struct S { char *buf; };

    // Before analysis
    void foo(char **p, struct S *s)
    {
        char *r;
        struct S *ss;
        r = *p;
        bar(r);         // taints r and *p
        ss = s;
        bar(ss->buf);   // taints ss->buf and s->buf
    }

    // At the end of analysis:
    void foo(/*@ tainted (*p) */ char **p, /*@ tainted (s->buf) */ struct S *s);
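
One way to picture the propagation step is as merging values that flow into each other and marking the whole group once any member reaches a tainted parameter. The sketch below is a deliberately simplified, flow-insensitive model and is not MECA's actual algorithm; the variable ids and the functions `taint_init`, `taint_assign`, `taint_pass_to_tainted_param`, and `taint_is_tainted` are hypothetical.

```c
#include <assert.h>

/* Hypothetical ids for the values in the slide's foo(). */
enum { V_DEREF_P, V_R, V_S_BUF, V_SS_BUF, NVARS };

static int parent[NVARS];   /* union-find parent links */
static int tainted[NVARS];  /* taint flag, meaningful at each class root */

static void taint_init(void)
{
    for (int i = 0; i < NVARS; i++) {
        parent[i] = i;
        tainted[i] = 0;
    }
}

/* Find the representative of the class that v belongs to. */
static int taint_find(int v)
{
    while (parent[v] != v)
        v = parent[v];
    return v;
}

/* Model "dst = src": both now denote the same value, so merge classes. */
static void taint_assign(int dst, int src)
{
    int a = taint_find(dst), b = taint_find(src);

    if (a != b) {
        parent[a] = b;
        tainted[b] |= tainted[a];
    }
}

/* Model passing v to a parameter annotated as tainted. */
static void taint_pass_to_tainted_param(int v)
{
    tainted[taint_find(v)] = 1;
}

static int taint_is_tainted(int v)
{
    return tainted[taint_find(v)];
}
```

Replaying `r = *p; bar(r);` as `taint_assign(V_R, V_DEREF_P)` followed by `taint_pass_to_tainted_param(V_R)` marks `*p` as tainted too, which matches the annotation shown at the end of the analysis.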

MECA results

On average, one manual annotation led to 682 checks

Linux 2.5.63 bugs:

Type              Warnings   Fixed
Arbitrary write   11         11
Arbitrary read    8          8
Fault at will     19         17
Always fail       6          3
Total             44         39

False positives: 8

RacerX

- Static detection of race conditions and deadlocks
- Designed to find errors in large, multi-threaded systems
- Sorts errors by severity (the hard part)
- The authors checked Linux, FreeBSD, and a mystery OS that has only 500,000 lines of code

Deadlock

- Deadlock
  - Thread 1 has locked resource A
  - Thread 2 has locked resource B
  - Thread 1 needs resource B to complete
  - Thread 2 needs resource A to complete
  - Neither can proceed: these threads are deadlocked

Race condition

- Multiple threads access the same memory
- If memory is unprotected:
  - Two threads can simultaneously write to the same memory (bad)
  - One thread can read, another can write simultaneously (bad)
  - Two threads can simultaneously read from the same memory (probably ok)
- It’s a race because the final value is non-deterministically chosen by who gets there first.
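
The "who gets there first" effect can be shown deterministically by replaying two increments by hand instead of using real threads. This is a sketch only: the `thread_ctx` type and the step functions are hypothetical, and the interleaving is fixed in the code rather than chosen by a scheduler.

```c
#include <assert.h>

static int shared_counter;

/* Per-"thread" private register holding the value being incremented. */
struct thread_ctx { int reg; };

/* One increment split into the three steps a CPU actually performs. */
static void step_load(struct thread_ctx *t)  { t->reg = shared_counter; }
static void step_add(struct thread_ctx *t)   { t->reg += 1; }
static void step_store(struct thread_ctx *t) { shared_counter = t->reg; }

/* Serialized order, as if a lock were held around each increment. */
static int run_serialized(void)
{
    struct thread_ctx a, b;

    shared_counter = 0;
    step_load(&a); step_add(&a); step_store(&a);
    step_load(&b); step_add(&b); step_store(&b);
    return shared_counter;
}

/* Racy order: both threads load before either stores. */
static int run_interleaved(void)
{
    struct thread_ctx a, b;

    shared_counter = 0;
    step_load(&a);
    step_load(&b);
    step_add(&a); step_store(&a);
    step_add(&b); step_store(&b);   /* overwrites a's update */
    return shared_counter;
}
```

`run_serialized()` ends with the counter at 2, while `run_interleaved()` ends at 1 because one update is lost.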

Avoiding the Problem

- If data is never accessed by more than one thread, you don’t have to worry about concurrency
- If program logic ensures that only one thread accesses data, you don’t need to worry about locking the data
- If you’re writing a shared component, you almost always have to worry about concurrency

Algorithm

- The “lockset” algorithm detects both types of problems
- Lockset: the set of locks held at a program point, where a lock is any acquire/release pair such as
  - Lock()/Unlock()
  - InterruptDisable()/InterruptEnable()
  - Etc.

Algorithm

- Top-down analysis of the control-flow graph
- Add/remove locks as needed
- Check for race/deadlock on each statement
- Cache results to cope with the exponential number of paths through the graph
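
The add/remove/check loop can be sketched for a single straight-line path. This is a minimal illustration, not RacerX itself: the `event` encoding and `count_unprotected` are hypothetical, lock ids are assumed to be in the range 0..31, and branching, caching, and interprocedural analysis are all left out.

```c
#include <assert.h>
#include <stddef.h>

enum ev_kind { EV_LOCK, EV_UNLOCK, EV_ACCESS };

/* id is a lock id (0..31) for lock/unlock, or a variable id for access. */
struct event { enum ev_kind kind; int id; };

/* Walk one path in order, tracking the held locks as a bitmask.
 * Returns how many accesses happened while no lock was held. */
static int count_unprotected(const struct event *ev, size_t n)
{
    unsigned lockset = 0;
    int unprotected = 0;

    for (size_t i = 0; i < n; i++) {
        switch (ev[i].kind) {
        case EV_LOCK:
            lockset |= 1u << ev[i].id;      /* add lock to the lockset */
            break;
        case EV_UNLOCK:
            lockset &= ~(1u << ev[i].id);   /* remove lock from the lockset */
            break;
        case EV_ACCESS:
            if (lockset == 0)
                unprotected++;              /* candidate race */
            break;
        }
    }
    return unprotected;
}
```

For the path lock(0), access(7), unlock(0), access(7), the function reports one unprotected access: the second one.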

Deadlock Check

- Basically, finds whether there are cycles in the lockset dependencies
  - If lock a is obtained, then lock b, we have: a → b
  - Following this line of reasoning, we can discover cases that look like this: a → b → c → a
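
Finding such a cycle is a standard graph problem. The sketch below uses a depth-first search over a small adjacency matrix; `MAX_LOCKS`, `order_add`, and `order_has_cycle` are hypothetical names, and the code says nothing about how RacerX collects or ranks the dependencies.

```c
#include <assert.h>
#include <string.h>

enum { MAX_LOCKS = 8 };

static int edge[MAX_LOCKS][MAX_LOCKS];  /* edge[a][b]: b acquired while a held */
static int color[MAX_LOCKS];            /* 0 unvisited, 1 on DFS stack, 2 done */

static void order_reset(void)
{
    memset(edge, 0, sizeof edge);
}

/* Record the dependency "held -> acquired". */
static void order_add(int held, int acquired)
{
    edge[held][acquired] = 1;
}

/* Returns 1 if a cycle is reachable from lock u. */
static int order_dfs(int u)
{
    color[u] = 1;
    for (int v = 0; v < MAX_LOCKS; v++) {
        if (!edge[u][v])
            continue;
        if (color[v] == 1)
            return 1;                   /* back edge: cycle found */
        if (color[v] == 0 && order_dfs(v))
            return 1;
    }
    color[u] = 2;
    return 0;
}

/* Returns 1 if the lock-order graph contains any cycle. */
static int order_has_cycle(void)
{
    memset(color, 0, sizeof color);
    for (int u = 0; u < MAX_LOCKS; u++)
        if (color[u] == 0 && order_dfs(u))
            return 1;
    return 0;
}
```

With locks a, b, c numbered 0, 1, 2, the edges 0 → 1 and 1 → 2 alone are cycle-free; adding 2 → 0 makes `order_has_cycle()` return 1.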

Deadlock Check

- Deciding how important a cycle is turns out to be non-trivial.
- Basically, rank higher according to:
  - Global locks vs. local locks
  - Small depth difference vs. big depth difference
  - Fewer threads vs. more threads

Race Checking

- This is even harder than deadlock detection
- Must answer:
  - Is the lockset valid? (If not, you will have LOTS of false positives)
  - Can the unprotected memory be accessed by more than one thread?
  - Does the access need to be protected?
    - Two reads do not a wrong make
- Must annotate API functions that require locks

Race Checking

- Deciding if code is multithreaded:
  - Inferred from “programmer belief”: if a piece of code contains concurrency-related statements, the code is probably multi-threaded
  - Annotations: designate API functions as requiring locks

Race Checking

- Does memory need to be protected?
  - If it’s never written to, no.
  - If it’s only written on initialization, no.
  - On a certain code path, if there is a high number of variables that are potentially written to concurrently, probably.
  - Anything that can’t be written atomically, yes. (Although this is pretty much anything, especially if you have more than 1 CPU)
  - If a variable is statistically likely to be protected by locking code (“Programmer Belief”), probably.
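
The "statistically likely" test can be approximated with a plain ratio of locked to total accesses. This is a rough sketch and not the ranking used in the papers: `access_stats`, `looks_like_missing_lock`, and the `min_locked_ratio` threshold are hypothetical.

```c
#include <assert.h>

/* Per-variable counts gathered over all analyzed paths. */
struct access_stats {
    int locked;     /* accesses made while some lock was held */
    int unlocked;   /* accesses made with an empty lockset */
};

/* Returns 1 when most accesses hold a lock but at least one does not,
 * i.e. the code "believes" the variable is protected and then violates it.
 * min_locked_ratio is an assumed tuning parameter, e.g. 0.8. */
static int looks_like_missing_lock(struct access_stats s,
                                   double min_locked_ratio)
{
    int total = s.locked + s.unlocked;

    if (total == 0 || s.unlocked == 0)
        return 0;
    return (double)s.locked / total >= min_locked_ratio;
}
```

A variable locked on 9 of 10 accesses is flagged at a threshold of 0.8; one locked on only 1 of 10 is not, since the evidence suggests it was never meant to be protected.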

RacerX: Results

                Confirmed   Unconfirmed   Benign   False
Deadlock
  System X      2           3             -        7
  Linux 2.5.62  4           8             -        6
  FreeBSD       2           3             -        6
Race
  System X      7           4             13       14
  Linux 2.5.62  3           2             2        6

Pop Quiz – Question 1

If you have read the 3rd paper, you may not answer this question.

Find the bug:

    if (card == NULL) {
        printk(KERN_ERR "capidrv-%d: … %d!\n",
               card->contrnr, id);
    }

Pop Quiz – Answer 1

    if (card == NULL) {
        printk(KERN_ERR "capidrv-%d: … %d!\n",
               card->contrnr, id);
    }

The branch is only taken when card == NULL, yet card->contrnr dereferences card inside it.
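
A consistent version keeps the NULL branch free of any use of `card`. The sketch below is illustrative only: `struct card`, `report_card`, and both message strings are hypothetical stand-ins written for user space, not the driver's real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's card structure. */
struct card { int contrnr; };

/* Formats a status line into buf without ever dereferencing a NULL card.
 * Returns the value of snprintf (number of characters that were needed). */
static int report_card(char *buf, size_t len, const struct card *card, int id)
{
    if (card == NULL) {
        /* No controller number is available here, so do not print one. */
        return snprintf(buf, len, "capidrv: no card for id %d!\n", id);
    }
    return snprintf(buf, len, "capidrv-%d: id %d\n", card->contrnr, id);
}
```

The belief expressed by the check (card may be NULL) and the code in the branch now agree, which is the kind of internal consistency a checker looks for.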

Pop Quiz – Question 2

If you have read the 3rd paper, you may not answer this question.

Find the bug:

    struct mxser_struct *info = tty->driver_data;
    unsigned long flags;

    if (!tty || !info->xmit_buf)
        return 0;

Pop Quiz – Answer 2

    struct mxser_struct *info = tty->driver_data;
    unsigned long flags;

    if (!tty || !info->xmit_buf)
        return 0;

tty is dereferenced in the initializer before the !tty check runs, so either the check is redundant or the dereference can fault.
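
One possible repair is to move the dereference after the check. This is a sketch under assumptions: the two struct definitions are minimal hypothetical stand-ins for the kernel types, `tty_ready` is an invented wrapper, and the extra `!info` test is an addition that the slide's code does not contain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal stand-ins for the kernel types on the slide. */
struct mxser_struct { char *xmit_buf; };
struct tty_struct { void *driver_data; };

/* Returns 1 if the tty has a usable transmit buffer, 0 otherwise. */
static int tty_ready(struct tty_struct *tty)
{
    struct mxser_struct *info;

    if (!tty)                       /* check first ... */
        return 0;
    info = tty->driver_data;        /* ... dereference second */
    if (!info || !info->xmit_buf)   /* the !info test is an added assumption */
        return 0;
    return 1;
}
```

`tty_ready(NULL)` now returns 0 instead of faulting.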

General Methodology

- Take advantage of programmer beliefs
- Statistics are our friend
- If something is usually done a certain way, then instances that violate that should be examined
- Check internal consistency
  - Discover rules that are built in to the code
  - Minimal to no annotation

Conclusion

- The methods presented tonight provide some of the best ways to find errors:
  - Millions of lines of code can be checked with at most hundreds of lines of annotations
- The bugs these methods find are fairly specific in nature (they revolve around well-structured code constructs)

References

- Junfeng Yang, Ted Kremenek, Yichen Xie, and Dawson Engler. MECA: an Extensible, Expressive System and Language for Statically Checking Security Properties. ACM CCS, 2003.
- Dawson Engler and Ken Ashcraft. RacerX: Effective, Static Detection of Race Conditions and Deadlocks. SOSP 2003.
- Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf. Bugs as Deviant Behavior: A General Approach to Inferring Errors in Systems Code. OSDI 2000.
- Source Lines of Code, http://www.answers.com/topic/source-lines-of-code
- Concurrency – Part 2: Avoiding the Problem, http://blogs.msdn.com/larryosterman/archive/2005/02/15/373460.aspx