40

An Executable Formal Semantics of C with - Matching Logic

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

An Executable Formal Semantics of C with

Applications

Chucky Ellison

Department of Computer Science

University of Illinois

August 2011

Chucky Ellison An Executable Formal Semantics of C with Applications

Page 2: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

1 IntroductionIntroductionMotivation

2 Current WorkCurrent Work on CWork on Analysis Tools

Chucky Ellison An Executable Formal Semantics of C with Applications 2/33

Page 3: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

There is no formal semantics for C.

Chucky Ellison An Executable Formal Semantics of C with Applications 3/33

Page 4: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

There are partial semantics

Gurevich and Huggins (1993) [ASM]

Cook, Cohen, and Redmond (1994) [Denotational]

Cook and Subramanian (1994) [Denotational]

Norrish (1998) [Small- and big-step SOS]

Black (1998) [Axiomatic]

Papaspyrou (2001) [Denotational]

Blazy and Leroy (2009) [Big-step SOS]

But, they simplify or leave out large parts of the language:Nondeterminism, casts, bit�elds, unions, struct values, variadicfunctions, memory alignment, goto, dynamic memoryallocation (malloc()), . . .

Chucky Ellison An Executable Formal Semantics of C with Applications 4/33

Page 5: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

But, Previous De�nitions Leave out Features

De�nitionFeature GH CCR CR No Pa BL

Bit�elds G# # # G# #Enums G# # # #Floats # # # # G# Struct/Union G# Struct as Value # # # # #

Arithmetic G# # Bitwise # # # Casts G# G# # G# G# Functions G# Exp. Side E�ects # #Variadic Funcs. # # # # # #

Eval. Strategies # G# # #Over�ow # # # # # #Volatile # # # # # G#Concurrency # # # # # #

Break/Continue G# G# Goto G# # # # #Switch G# # # G#

Longjmp # # # # # #Malloc # # # # # #

: Fully DescribedG#: Partially Described#: Not Described

GH denotes Gurevich and Huggins (1993),CCR is Cook, Cohen, and Redmond (1994),CR is Cook and Subramanian (1994),No is Norrish (1998),Pa is Papaspyrou (2001), andBL is Blazy and Leroy (2009).

Chucky Ellison An Executable Formal Semantics of C with Applications 5/33

Page 6: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

No Semantics-Based Tools Either

There are many useful C analysis/veri�cation tools, including:

Lint/Purify/Coverity/Valgrind

Blast

Havoc

Slam

VCC

Frama-C/Caduceus

These tools are based on approximative models of C.

Chucky Ellison An Executable Formal Semantics of C with Applications 6/33

Page 7: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

The Need for Semantics Based Tools

Despite all this work on analyzing C programs. . .

There is still no formal semantics for C.

Hard to argue for the soundness of the tools

Most tools are not even based on an incomplete semantics.

Chucky Ellison An Executable Formal Semantics of C with Applications 7/33

Page 8: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

The Need for Semantics Based Tools

Despite all this work on analyzing C programs. . .

There is still no formal semantics for C.

Hard to argue for the soundness of the tools

Most tools are not even based on an incomplete semantics.

Chucky Ellison An Executable Formal Semantics of C with Applications 7/33

Page 9: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

The Need for Semantics Based Tools

Despite all this work on analyzing C programs. . .

There is still no formal semantics for C.

Hard to argue for the soundness of the tools

Most tools are not even based on an incomplete semantics.

Chucky Ellison An Executable Formal Semantics of C with Applications 7/33

Page 10: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

Our Contribution

1 A complete formal semantics for C;

2 Semantics-based analysis tools for C;

3 Constructive evidence that rewriting-based semantics scale.

Chucky Ellison An Executable Formal Semantics of C with Applications 8/33

Page 11: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

Our Contribution

1 A complete formal semantics for C;

2 Semantics-based analysis tools for C;

3 Constructive evidence that rewriting-based semantics scale.

Chucky Ellison An Executable Formal Semantics of C with Applications 8/33

Page 12: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

Our Contribution

1 A complete formal semantics for C;

2 Semantics-based analysis tools for C;

3 Constructive evidence that rewriting-based semantics scale.

Chucky Ellison An Executable Formal Semantics of C with Applications 8/33

Page 13: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

Outline

1 IntroductionIntroductionMotivation

2 Current WorkCurrent Work on CWork on Analysis Tools

Chucky Ellison An Executable Formal Semantics of C with Applications 9/33

Page 14: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

C Speci�cations

ANSI C (1989)

ISO/IEC 9899:1990 �C90�

ISO/IEC 9899:1999 �C99�

540 pp.62 person-years of work (from 1995�1999)Work continued until 2007About 50 new features over C90, and many �xes

ISO/IEC 9899:201x �C1X�

Adds �rst support for concurrency

Chucky Ellison An Executable Formal Semantics of C with Applications 10/33

Page 15: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

Do We Really Need Formal Analysis Tools?

Question.

What happens when the approximative models of C fall short?

Answer.

Bad programs get proved correct, or behaviors go missing.

Chucky Ellison An Executable Formal Semantics of C with Applications 11/33

Page 16: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

Two Unsequenced Writes to 'x'

int main(void) {

int x = 0;

return (x = 1) + (x = 2);

}

Unde�ned according to C standard

GCC4, MSVC: returns 4GCC3, ICC, Clang: returns 3

Both Frama-C and Havoc �prove� it returns 4

Chucky Ellison An Executable Formal Semantics of C with Applications 12/33

Page 17: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

What is Unde�ned Behavior?

unde�ned behavior Behavior, upon use of a non-portable orerroneous program construct or of erroneous data,[with] no requirements.

In essence, this refers to problematic situations that are hardto identify statically or expensive to identify dynamically

Implementations can do anything for unde�ned behavior,including failing to compile, crashing, or appearing to work

Examples: division by zero, referring to an object outside itslifetime, (x = 1) + (x = 2)

Chucky Ellison An Executable Formal Semantics of C with Applications 13/33

Page 18: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

Left Shift of Negative Number

int main(void){

return -5 << 2;

}

Unde�ned according to C standard

GCC, ICC, Clang: returns −20MSVC: returns 127

Both Frama-C and Havoc �prove� it returns −20

Chucky Ellison An Executable Formal Semantics of C with Applications 14/33

Page 19: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

Write to String Literal

int main(void) {

"foo"[0] = 'x';

return "foo"[0];

}

Unde�ned according to C standard

GCC: doesn't compileICC, Clang: segmentation faultMSVC: returns 'f'

Frama-C �proves� it returns 'x'

Chucky Ellison An Executable Formal Semantics of C with Applications 15/33

Page 20: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

Unde�ned Behaviors are Fundamental to C

This was just 3 unde�ned programs. There are over 190 explicitlyunde�ned categories of behaviors in C.

Chucky Ellison An Executable Formal Semantics of C with Applications 16/33

Page 21: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

Valid Nondeterminism

int r;

int f(int x) {

return (r = x);

}

int main(void) {

return f(1) + f(2), r;

}

De�ned (Could return 1 or 2)

GCC, ICC, MSVC, Clang: returns 2

Both Frama-C and Havoc �prove� it can only return 2

Chucky Ellison An Executable Formal Semantics of C with Applications 17/33

Page 22: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

Motivation Summary

When the models of C used by analysis tools are too simplistic

Tools can draw incorrect conclusions about programs

Hard to argue for soundness without a semantics to compareagainst

Chucky Ellison An Executable Formal Semantics of C with Applications 18/33

Page 23: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

Con�guration of C

Chucky Ellison An Executable Formal Semantics of C with Applications 19/33

Page 24: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

A Complete De�nition of C

We have the �rst arguably complete formal de�nition of aconforming freestanding implementation of C.

Conforming Must accept all portable programs, but can alsoaccept non-portable programs.

Freestanding A precisely de�ned subset of all possible C features.This is the subset of C used when writing the kernelof an operating system.

It includes only <float.h>

<iso646.h>, <limits.h>, <stdalign.h>,<stdarg.h>, <stdbool.h>, <stddef.h>, and<stdint.h>.

Chucky Ellison An Executable Formal Semantics of C with Applications 20/33

Page 25: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

A Complete De�nition of C

We have the �rst arguably complete formal de�nition of aconforming freestanding implementation of C.

Conforming Must accept all portable programs, but can alsoaccept non-portable programs.

Freestanding A precisely de�ned subset of all possible C features.This is the subset of C used when writing the kernelof an operating system.

It includes only <float.h>

<iso646.h>, <limits.h>, <stdalign.h>,<stdarg.h>, <stdbool.h>, <stddef.h>, and<stdint.h>.

Chucky Ellison An Executable Formal Semantics of C with Applications 20/33

Page 26: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

A Complete De�nition of C

We have the �rst arguably complete formal de�nition of aconforming freestanding implementation of C.

Conforming Must accept all portable programs, but can alsoaccept non-portable programs.

Freestanding A precisely de�ned subset of all possible C features.This is the subset of C used when writing the kernelof an operating system.

It includes only <float.h>

<iso646.h>, <limits.h>, <stdalign.h>,<stdarg.h>, <stdbool.h>, <stddef.h>, and<stdint.h>.

Chucky Ellison An Executable Formal Semantics of C with Applications 20/33

Page 27: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

IntroductionMotivation

A Complete De�nition of C

We have the �rst arguably complete formal de�nition of aconforming freestanding implementation of C.

Conforming Must accept all portable programs, but can alsoaccept non-portable programs.

Freestanding A precisely de�ned subset of all possible C features.This is the subset of C used when writing the kernelof an operating system. It includes only <float.h>

<iso646.h>, <limits.h>, <stdalign.h>,<stdarg.h>, <stdbool.h>, <stddef.h>, and<stdint.h>.

Chucky Ellison An Executable Formal Semantics of C with Applications 20/33

Page 28: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

Current Work on CWork on Analysis Tools

Outline

1 IntroductionIntroductionMotivation

2 Current WorkCurrent Work on CWork on Analysis Tools

Chucky Ellison An Executable Formal Semantics of C with Applications 21/33

Page 29: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

Current Work on CWork on Analysis Tools

Outline

1 IntroductionIntroductionMotivation

2 Current WorkCurrent Work on CWork on Analysis Tools

Chucky Ellison An Executable Formal Semantics of C with Applications 22/33

Page 30: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

Current Work on CWork on Analysis Tools

Our Current Work on C

We currently have a preliminary semantics that is more

complete than other semantics to date.

Tested against the GCC torture tests:

Of 1093 tests, 776 tests appear to be standards compliant. Ofthose, we pass 770 (>99%).

int f(void){

signed char c = -1;

return c < 0;

}

int main(void){

if (f() != 1) { abort(); }

return 0;

}

Chucky Ellison An Executable Formal Semantics of C with Applications 23/33

Page 31: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

Current Work on CWork on Analysis Tools

Our Current Work on C

We currently have a preliminary semantics that is more

complete than other semantics to date.

Tested against the GCC torture tests:

Of 1093 tests, 776 tests appear to be standards compliant. Ofthose, we pass 770 (>99%).

int f(void){

signed char c = -1;

return c < 0;

}

int main(void){

if (f() != 1) { abort(); }

return 0;

}

Chucky Ellison An Executable Formal Semantics of C with Applications 23/33

Page 32: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

Current Work on CWork on Analysis Tools

Our Current Work is Already More Complete

De�nitionFeature GH CCR CR No Pa BL ER

Bit�elds G# # # G# # Enums G# # # # Floats # # # # G# G#Struct/Union G# Struct as Value # # # # #

Arithmetic G# # Bitwise # # # Casts G# G# # G# G# Functions G# Exp. Side E�ects # # Variadic Funcs. # # # # # #

Eval. Strategies # G# # # Over�ow # # # # # # Volatile # # # # # G# #Concurrency # # # # # # G#

Break/Continue G# G# Goto G# # # # # Switch G# # # G#

Longjmp # # # # # # Malloc # # # # # #

: Fully DescribedG#: Partially Described#: Not Described

GH denotes Gurevich and Huggins (1993),CCR is Cook, Cohen, and Redmond (1994),CR is Cook and Subramanian (1994),No is Norrish (1998),Pa is Papaspyrou (2001),BL is Blazy and Leroy (2009), and

ER is Ellison and Ros,u (our current work) .

Chucky Ellison An Executable Formal Semantics of C with Applications 24/33

Page 33: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

Current Work on CWork on Analysis Tools

Some Statistics about Our Semantics

Mechanized in K Framework

150 syntactic operators

5900 source lines of code

1200 di�erent K rules

Only 80 rules for statements

Only 160 for expressions

500 rules for declarations and types!

Chucky Ellison An Executable Formal Semantics of C with Applications 25/33

Page 34: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

Current Work on CWork on Analysis Tools

Outline

1 IntroductionIntroductionMotivation

2 Current WorkCurrent Work on CWork on Analysis Tools

Chucky Ellison An Executable Formal Semantics of C with Applications 26/33

Page 35: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

Current Work on CWork on Analysis Tools

Generic Tools

These tools are provided �for free� by rewriting logic and Maude:

Interpreter

Debugger

State-space search

Our tests have shown these tools work just as well with C as withtools based on de�nitions of smaller languages.

Chucky Ellison An Executable Formal Semantics of C with Applications 27/33

Page 36: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

Current Work on CWork on Analysis Tools

Interpretation to Find Bugs

Chucky Ellison An Executable Formal Semantics of C with Applications 28/33

Page 37: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

Current Work on CWork on Analysis Tools

Search to Find Bugs

Chucky Ellison An Executable Formal Semantics of C with Applications 29/33

Page 38: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

Current Work on CWork on Analysis Tools

LTL-Based Model Checking

Chucky Ellison An Executable Formal Semantics of C with Applications 30/33

Page 39: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

Current Work on CWork on Analysis Tools

Test Case Reduction

Chucky Ellison An Executable Formal Semantics of C with Applications 31/33

Page 40: An Executable Formal Semantics of C with - Matching Logic

IntroductionCurrent Work

Current Work on CWork on Analysis Tools

Du�'s Device

Unstructured control �ow (goto, switchs)

int n = (count+7)/8;

switch(count%8) {

case 0: do{ *dest++ = *src++;

case 7: *dest++ = *src++;

case 6: *dest++ = *src++;

case 5: *dest++ = *src++;

case 4: *dest++ = *src++;

case 3: *dest++ = *src++;

case 2: *dest++ = *src++;

case 1: *dest++ = *src++;

} while(--n>0);

}

Chucky Ellison An Executable Formal Semantics of C with Applications 32/33