70
Finding software bugs with the Clang Static Analyzer Ted Kremenek, Apple Inc.

Finding software bugs with the Clang Static Analyzer

  • Upload
    others

  • View
    15

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Finding software bugs with the Clang Static Analyzer

Finding software bugs with theClang Static Analyzer

Ted Kremenek, Apple Inc.

Page 2: Finding software bugs with the Clang Static Analyzer

Findings Bugs with Compiler Techniques

Page 3: Finding software bugs with the Clang Static Analyzer

Findings Bugs with Compiler TechniquesCompile-time warnings

% clang t.c

t.c:38:13: warning: invalid conversion '%lb' printf("%s%lb%d", "unix", 10, 20); ~~~~^~~~~

Page 4: Finding software bugs with the Clang Static Analyzer

Findings Bugs with Compiler TechniquesCompile-time warnings

Static Analysis

• Checking performed by compiler warnings inherently limited• Find path-specific bugs• Deeper bugs: memory leaks, buffer overruns, logic errors

% clang t.c

t.c:38:13: warning: invalid conversion '%lb' printf("%s%lb%d", "unix", 10, 20); ~~~~^~~~~

Page 5: Finding software bugs with the Clang Static Analyzer

Benefits of Static Analysis

Page 6: Finding software bugs with the Clang Static Analyzer

Benefits of Static Analysis

Early discovery of bugs

• Find bugs early, while the developer is hacking on their code• Bugs caught early are cheaper to fix

Page 7: Finding software bugs with the Clang Static Analyzer

Benefits of Static Analysis

Early discovery of bugs

• Find bugs early, while the developer is hacking on their code• Bugs caught early are cheaper to fix

Systematic checking of all code

• Static analysis reasons about all corner cases

Page 8: Finding software bugs with the Clang Static Analyzer

Benefits of Static Analysis

Early discovery of bugs

• Find bugs early, while the developer is hacking on their code• Bugs caught early are cheaper to fix

Systematic checking of all code

• Static analysis reasons about all corner cases

Find bugs without test cases

• Useful for finding bugs in hard-to-test code• Not a replacement for testing

Page 9: Finding software bugs with the Clang Static Analyzer

This Talk: Clang “Static Analyzer”

Clang-based static analysis tool for finding bugs

• Supports C and Objective-C (C++ in the future)

Outline

• Demo• How it works• Design and implementation• Looking forward

Page 10: Finding software bugs with the Clang Static Analyzer

This Talk: Clang “Static Analyzer”

Clang-based static analysis tool for finding bugs

• Supports C and Objective-C (C++ in the future)

Outline

• Demo• How it works• Design and implementation• Looking forward

http://clang.llvm.org

Page 11: Finding software bugs with the Clang Static Analyzer

Demo

Page 12: Finding software bugs with the Clang Static Analyzer

How does static analysis work?

Page 13: Finding software bugs with the Clang Static Analyzer

How does static analysis work?• Can catch bugs with different degrees of analysis sophistication

Page 14: Finding software bugs with the Clang Static Analyzer

How does static analysis work?• Can catch bugs with different degrees of analysis sophistication• Per-statement, per-function, whole-program all important

Page 15: Finding software bugs with the Clang Static Analyzer

How does static analysis work?• Can catch bugs with different degrees of analysis sophistication• Per-statement, per-function, whole-program all important

compiler warnings (simple checks)

% gcc -Wall -O1 -c t.ct.c: In function ‘f’:t.c:5: warning: ‘x’ may be used uninitialized in this function

% clang -warn-uninit-values t.ct.c:13:12: warning: use of uninitialized variable return x; ^

int f(int y) {  int x;    if (y)    x = 1;    printf("%d\n", y);    return x;}

Page 16: Finding software bugs with the Clang Static Analyzer

How does static analysis work?

int f(int y) {  int x;    if (y)    x = 1;    printf("%d\n", y);    return x;}

Page 17: Finding software bugs with the Clang Static Analyzer

How does static analysis work?

int f(int y) {  int x;    if (y)    x = 1;    printf("%d\n", y);    return x;}

int x;if (y)

x = 1;

printf(“%d\n”, y);return x;

control-flow graph

Page 18: Finding software bugs with the Clang Static Analyzer

How does static analysis work?

int x;if (y)

x = 1;

printf(“%d\n”, y);return x;

control-flow graph

int f(int y) {  int x;    if (y)    x = 1;    printf("%d\n", y);    return x;}

Page 19: Finding software bugs with the Clang Static Analyzer

How does static analysis work?

int x;if (y)

x = 1;

printf(“%d\n”, y);return x;

control-flow graph

The bug occurs on this feasible path

int f(int y) {  int x;    if (y)    x = 1;    printf("%d\n", y);    return x;}

Page 20: Finding software bugs with the Clang Static Analyzer

How does static analysis work?

int f(int y) {  int x;    if (y)    x = 1;    printf("%d\n", y);

}  return x;

Page 21: Finding software bugs with the Clang Static Analyzer

How does static analysis work?

int f(int y) {  int x;    if (y)    x = 1;    printf("%d\n", y);

if (y)

}

  return x;

  return y; return x;

return y;

printf(“%d\n”, y);if (y)

int x;if (y)

x = 1;

Page 22: Finding software bugs with the Clang Static Analyzer

How does static analysis work?

return y;

return x;

printf(“%d\n”, y);if (y)

int x;if (y)

x = 1;

Page 23: Finding software bugs with the Clang Static Analyzer

% gcc -Wall -O1 -c t.ct.c: In function ‘f’:t.c:5: warning: ‘x’ may be used uninitialized in this function

% clang -warn-uninit-values t.ct.c:13:12: warning: use of uninitialized variable return x; ^

How does static analysis work?

return y;

return x;

printf(“%d\n”, y);if (y)

int x;if (y)

x = 1;

Page 24: Finding software bugs with the Clang Static Analyzer

% gcc -Wall -O1 -c t.ct.c: In function ‘f’:t.c:5: warning: ‘x’ may be used uninitialized in this function

% clang -warn-uninit-values t.ct.c:13:12: warning: use of uninitialized variable return x; ^

Two feasible paths:

How does static analysis work?

return y;

return x;

printf(“%d\n”, y);if (y)

int x;if (y)

x = 1;

Page 25: Finding software bugs with the Clang Static Analyzer

How does static analysis work?

return x;

return y;

printf(“%d\n”, y);if (y)

int x;if (y)

x = 1;

% gcc -Wall -O1 -c t.ct.c: In function ‘f’:t.c:5: warning: ‘x’ may be used uninitialized in this function

% clang -warn-uninit-values t.ct.c:13:12: warning: use of uninitialized variable return x; ^

Two feasible paths:

• Neither branch taken (y == 0)

Page 26: Finding software bugs with the Clang Static Analyzer

How does static analysis work?

return x;

return y;

printf(“%d\n”, y);if (y)

int x;if (y)

x = 1;

% gcc -Wall -O1 -c t.ct.c: In function ‘f’:t.c:5: warning: ‘x’ may be used uninitialized in this function

% clang -warn-uninit-values t.ct.c:13:12: warning: use of uninitialized variable return x; ^

Two feasible paths:

• Neither branch taken (y == 0)

• Both branches taken (y != 0)

Page 27: Finding software bugs with the Clang Static Analyzer

How does static analysis work?

return x;

return y;

printf(“%d\n”, y);if (y)

int x;if (y)

x = 1;

Page 28: Finding software bugs with the Clang Static Analyzer

How does static analysis work?

return x;

return y;

printf(“%d\n”, y);if (y)

int x;if (y)

x = 1;

Bogus warning occurs on infeasible path:

• Don’t take first branch (y == 0)

• Take second branch (y != 0)

Page 29: Finding software bugs with the Clang Static Analyzer

How does static analysis work?

Page 30: Finding software bugs with the Clang Static Analyzer

False Positives (Bogus Errors)

Page 31: Finding software bugs with the Clang Static Analyzer

False Positives (Bogus Errors)• False positives can occur due to analysis imprecision

■ False paths■ Insufficient knowledge about the program

Page 32: Finding software bugs with the Clang Static Analyzer

False Positives (Bogus Errors)• False positives can occur due to analysis imprecision

■ False paths■ Insufficient knowledge about the program

• Many ways to reduce false positives■ More precise analysis■ Difficult to eliminate false positives completely

Page 33: Finding software bugs with the Clang Static Analyzer

Flow-Sensitive Analyses

Page 34: Finding software bugs with the Clang Static Analyzer

Flow-Sensitive Analyses• Flow-sensitive analyses reason about flow of values

y = 1;x = y + 2; // x == 3

Page 35: Finding software bugs with the Clang Static Analyzer

Flow-Sensitive Analyses• Flow-sensitive analyses reason about flow of values

• No path-specific information

y = 1;x = y + 2; // x == 3

if (x == 0) ++x; // x == ?else x = 2; // x == 2y = x; // x == ?, y == ?

Page 36: Finding software bugs with the Clang Static Analyzer

Flow-Sensitive Analyses• Flow-sensitive analyses reason about flow of values

• No path-specific information

• LLVM’s SSA form designed for flow-sensitive algorithms

y = 1;x = y + 2; // x == 3

if (x == 0) ++x; // x == ?else x = 2; // x == 2y = x; // x == ?, y == ?

Page 37: Finding software bugs with the Clang Static Analyzer

Flow-Sensitive Analyses• Flow-sensitive analyses reason about flow of values

• No path-specific information

• LLVM’s SSA form designed for flow-sensitive algorithms• Linear-time algorithms

■ Used by optimization algorithms and compiler warnings

y = 1;x = y + 2; // x == 3

if (x == 0) ++x; // x == ?else x = 2; // x == 2y = x; // x == ?, y == ?

Page 38: Finding software bugs with the Clang Static Analyzer

Path-Sensitive Analyses

Page 39: Finding software bugs with the Clang Static Analyzer

Path-Sensitive Analyses• Reason about individual paths and guards on branches

if (x == 0) ++x; // x == 1else x = 2; // x == 2y = x; // (x == 1, y == 1) or (x == 2, y == 2)

Page 40: Finding software bugs with the Clang Static Analyzer

Path-Sensitive Analyses• Reason about individual paths and guards on branches

• Uninitialized variables example:■ Path-sensitive analysis picks up only 2 paths■ No false positive

if (x == 0) ++x; // x == 1else x = 2; // x == 2y = x; // (x == 1, y == 1) or (x == 2, y == 2)

Page 41: Finding software bugs with the Clang Static Analyzer

Path-Sensitive Analyses• Reason about individual paths and guards on branches

• Uninitialized variables example:■ Path-sensitive analysis picks up only 2 paths■ No false positive

• Worst-case exponential-time■ Complexity explodes with branches and loops■ Lots of clever tricks to reduce complexity in practice

if (x == 0) ++x; // x == 1else x = 2; // x == 2y = x; // (x == 1, y == 1) or (x == 2, y == 2)

Page 42: Finding software bugs with the Clang Static Analyzer

Path-Sensitive Analyses• Reason about individual paths and guards on branches

• Uninitialized variables example:■ Path-sensitive analysis picks up only 2 paths■ No false positive

• Worst-case exponential-time■ Complexity explodes with branches and loops■ Lots of clever tricks to reduce complexity in practice

• Clang static analyzer uses flow- and path-sensitive analyses

if (x == 0) ++x; // x == 1else x = 2; // x == 2y = x; // (x == 1, y == 1) or (x == 2, y == 2)

Page 43: Finding software bugs with the Clang Static Analyzer

Finding leaks in Objective-C code

Page 44: Finding software bugs with the Clang Static Analyzer

Memory Management in Objective-C

Objective-C in a Nutshell

• Used to develop Mac/iPhone apps• C with object-oriented programming extensions

Memory management

• Objective-C objects have embedded reference counts• Reference counts obey strict ownership idiom• Garbage collection also available... but there are subtle rules

Page 45: Finding software bugs with the Clang Static Analyzer

Ownership Idiom

Page 46: Finding software bugs with the Clang Static Analyzer

Ownership Idiom

// Allocate an NSString. Since the object is newly allocated,// ‘str’ is an owning reference (+1 retain count).NSString* str = [[NSString alloc] initWithCString:“hello world” encoding:NSASCIIStringEncoding];

Page 47: Finding software bugs with the Clang Static Analyzer

Ownership Idiom

// Allocate an NSString. Since the object is newly allocated,// ‘str’ is an owning reference (+1 retain count).NSString* str = [[NSString alloc] initWithCString:“hello world” encoding:NSASCIIStringEncoding];

// Pass ‘str’ to ‘foo’. ‘foo’ may increment the retain// count, but we are still obligated to decrement the +1// count we have because ‘str’ is an owning reference.foo(str);

Page 48: Finding software bugs with the Clang Static Analyzer

Ownership Idiom

// Allocate an NSString. Since the object is newly allocated,// ‘str’ is an owning reference (+1 retain count).NSString* str = [[NSString alloc] initWithCString:“hello world” encoding:NSASCIIStringEncoding];

// Pass ‘str’ to ‘foo’. ‘foo’ may increment the retain// count, but we are still obligated to decrement the +1// count we have because ‘str’ is an owning reference.foo(str);

// We’re done using str. Decrement our ownership count.[str release];

Page 49: Finding software bugs with the Clang Static Analyzer

Ownership Idiom

// Allocate an NSString. Since the object is newly allocated,// ‘str’ is an owning reference (+1 retain count).NSString* str = [[NSString alloc] initWithCString:“hello world” encoding:NSASCIIStringEncoding];

// Pass ‘str’ to ‘foo’. ‘foo’ may increment the retain// count, but we are still obligated to decrement the +1// count we have because ‘str’ is an owning reference.foo(str);

// We’re done using str. Decrement our ownership count.// LEAK!

Page 50: Finding software bugs with the Clang Static Analyzer

Memory Leak: Colloquy

7/29/08 11:08 PM/Users/resistor/Downloads/Colloquy/Views/MVTextView.m

Page 1 of 7file:///Volumes/Data/Users/kremenek/Desktop/ColloquyAnalysis/Colloquy/report-shpnE5.html#EndPath

[1] Method returns an object with a +1 retain count (owning reference).

[2] Taking true branch.

[3] Object allocated on line 34 and stored into 'newArray' is no longer referenced after this point and has a retain count of +1

(object leaked).

Bug Summary

File: Views/MVTextView.m

Location: line 39, column 3

Description: Memory Leak

Code is compiled without garbage collection.

Annotated Source Code

1 #import "MVTextView.h"

2 #import "JVTranscriptFindWindowController.h"

3

4 @interface MVTextView (MVTextViewPrivate)

5 - (BOOL) checkKeyEvent:(NSEvent *) event;

6 - (BOOL) triggerKeyEvent:(NSEvent *) event;

7 @end

8

9 #pragma mark -

10

11 @implementation MVTextView

12 - (id)initWithFrame:(NSRect)frameRect textContainer:(NSTextContainer *)aTextContainer {

13 if( (self = [super initWithFrame:frameRect textContainer:aTextContainer] ) )

14 defaultTypingAttributes = [[NSDictionary allocWithZone: ] init];

15 return self;

16 }

17

18 - (void) dealloc {

19 [defaultTypingAttributes release];

20 defaultTypingAttributes = ;

21

22 [_lastCompletionMatch release];

23 _lastCompletionMatch = ;

24

25 [_lastCompletionPrefix release];

26 _lastCompletionPrefix = ;

27

28 [super dealloc];

29 }

30

31 #pragma mark -

32

33 - (void) interpretKeyEvents:(NSArray *) eventArray {

34 NSMutableArray *newArray = [[NSMutableArray allocWithZone: ] init];

35 NSEnumerator *e = [eventArray objectEnumerator];

36 NSEvent *anEvent = ;

37

38 if( ! [self isEditable] ) {

39 [super interpretKeyEvents:eventArray];

40 return;

41 }

42

43 while( ( anEvent = [e nextObject] ) ) {

44 if( [self checkKeyEvent:anEvent] ) {

45 if( [newArray count] > 0 ) {

46 [super interpretKeyEvents:newArray];

nil

nil

nil

nil

nil

nil

Page 51: Finding software bugs with the Clang Static Analyzer

Ownership DFA

Page 52: Finding software bugs with the Clang Static Analyzer

Ownership DFA

Owned (+1)

Page 53: Finding software bugs with the Clang Static Analyzer

Ownership DFA

Owned (+2) Owned (+3)

retain

release

retain

release

retain

releaseOwned (+1)

Page 54: Finding software bugs with the Clang Static Analyzer

Ownership DFA

Released

release

Owned (+2) Owned (+3)

retain

release

retain

release

retain

releaseOwned (+1)

Page 55: Finding software bugs with the Clang Static Analyzer

Ownership DFA

Use after Release

any useReleased

release

Owned (+2) Owned (+3)

retain

release

retain

release

retain

releaseOwned (+1)

Page 56: Finding software bugs with the Clang Static Analyzer

Ownership DFA

Use after Release

any useReleased

release

Owned (+2) Owned (+3)

retain

release

retain

release

retain

releaseOwned (+1)

A memory leak occurs when we no longer reference an pointerOwned

Page 57: Finding software bugs with the Clang Static Analyzer

Ownership DFA

Use after Release

any useReleased

release

Owned (+2) Owned (+3)

retain

release

retain

release

retain

releaseOwned (+1)

A memory leak occurs when we no longer reference an pointerOwned

¬Owned ¬Owned (+1) ¬Owned (+2)

Invalid Release

retain

release

retain

release

release

retain

release

A leak occurs when we no longer referencea pointer with an excess retain count¬Owned

Page 58: Finding software bugs with the Clang Static Analyzer

Miscellanea

Checker-specific issues

• Autorelease pools• Objective-C 2.0 Garbage Collection• API-specific ownership rules• Educational diagnostics

Analysis issues

• Aliasing• Plenty of room for improvement

Page 59: Finding software bugs with the Clang Static Analyzer

Checker Results• Used internally at Apple• Announced in June 2008 (WWDC)

■ Hundreds of downloads of the static analyzer■ Thousands of bugs found

Page 60: Finding software bugs with the Clang Static Analyzer

Some Implementation Details

Page 61: Finding software bugs with the Clang Static Analyzer

Why Analyze Source Code?

Bug-finding requires excellent diagnostics

• Tool must explain a bug to the user

• Users cannot fix bugs they don’t understand• Need rich source and type information

What about analyzing LLVM IR?

• Loss of source information• High-level types discarded• Compiler lowers language constructs• Compiler makes assumptions (e.g., order of evaluation)

Page 62: Finding software bugs with the Clang Static Analyzer

Clang Libraries

Page 63: Finding software bugs with the Clang Static Analyzer

Clang Libraries

App LibrariesCore Libraries

Analysis

Rewrite

AST

Parse

Lex

Basic

CLI Driver IDE ?

LLVMGen

Sema

Page 64: Finding software bugs with the Clang Static Analyzer

App LibrariesCore Libraries

Analysis

Rewrite

AST

Parse

Lex

Clang Libraries

Basic

CLI Driver IDE ?

LLVMGen

Sema

Page 65: Finding software bugs with the Clang Static Analyzer

libAnalysis

Page 66: Finding software bugs with the Clang Static Analyzer

libAnalysis

Intra-Procedural Analysis

• Source-level Control-Flow Graphs (CFGs)• Flow-sensitive dataflow solver

■ Live Variables■ Uninitialized Values

• Path-sensitive dataflow engine■ Retain/Release checker■ Logic bugs (e.g., null dereferences)

• Various checks and analyses■ Dead stores■ API checks

Page 67: Finding software bugs with the Clang Static Analyzer

libAnalysis

Page 68: Finding software bugs with the Clang Static Analyzer

libAnalysis

Path Diagnostics (Bug-Reporting)

• PathDiagnosticClient■ Abstract interface to implement a “view” of bug reports■ Separates report visualization from generation■ HTMLDiagnostics (renders HTML, uses libRewrite)

• BugReporter■ Helper class to generate diagnostics for PathDiagnosticClient

Page 69: Finding software bugs with the Clang Static Analyzer

Looking Forward• Richer Diagnostics• Inter-procedural Analysis (IPA)• Lots of Checks• Scriptability• Multiple Analysis Engines

Page 70: Finding software bugs with the Clang Static Analyzer

Looking Forward• Richer Diagnostics• Inter-procedural Analysis (IPA)• Lots of Checks• Scriptability• Multiple Analysis Engines

http://clang.llvm.org