
1

Today

Random testing again

• Some background (Hamlet)

• Why not always use random testing?

More Dominion & project?

CUTE: “concolic” testing

2

Random Testing

“Random testing is, of course, the most used and least useful method”

• This uses the original slang meaning of “random”: “wrong” or “disorganized and useless”

We mean random in the mathematical sense

• Take a stream of pseudo-random numbers and map them into test operations/cases
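A minimal sketch of that mapping, assuming a hypothetical fixed-capacity stack as the system under test. The names Stack, stack_push, stack_pop, next_random, and random_test are illustrative only, and the xorshift32 generator is just one convenient deterministic choice. Each number drawn from the stream selects an operation, and a plain array kept in lockstep serves as the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical system under test: a fixed-capacity stack of ints. */
#define CAP 8
typedef struct { int data[CAP]; size_t len; } Stack;

static int stack_push(Stack *s, int v) {
    if (s->len == CAP) return -1;          /* full */
    s->data[s->len++] = v;
    return 0;
}

static int stack_pop(Stack *s, int *out) {
    if (s->len == 0) return -1;            /* empty */
    *out = s->data[--s->len];
    return 0;
}

/* Small deterministic PRNG (xorshift32) so a failing seed can be replayed. */
static uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Map the number stream to operations and check each result against a
   model array. Returns the number of operations executed. */
static size_t random_test(uint32_t seed, size_t steps) {
    Stack s = { {0}, 0 };
    int model[CAP];
    size_t model_len = 0;
    uint32_t rng = seed ? seed : 1;        /* xorshift state must be nonzero */

    for (size_t i = 0; i < steps; i++) {
        if (next_random(&rng) % 2 == 0) {  /* operation 0: push */
            int v = (int)(next_random(&rng) % 1000);
            int expect = (model_len == CAP) ? -1 : 0;
            int rc = stack_push(&s, v);
            assert(rc == expect);
            if (expect == 0) model[model_len++] = v;
        } else {                           /* operation 1: pop */
            int got = 0;
            int expect = (model_len == 0) ? -1 : 0;
            int rc = stack_pop(&s, &got);
            assert(rc == expect);
            if (expect == 0) {
                model_len--;
                assert(got == model[model_len]);
            }
        }
        assert(s.len == model_len);
    }
    return steps;
}
```

Calling random_test(42u, 10000) runs ten thousand randomly chosen operations; rerunning with the same seed replays the same sequence.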

3

Random Testing

Hamlet talks about one advantage of random testing (that often doesn’t really appear):

• Given an operational profile (the program’s usage patterns, with probabilities attached), random testing can establish statistically meaningful estimates of program reliability

“In program testing, with systematic methods we know what we are doing, but not what it means; only by giving up all systematization can the significance of testing be known.” - Hamlet, “Random Testing”

4

Operational Profiles & Reliability

Can make statements like:

• “It’s 99% certain that P will fail no more than 1 in 1,000,000 times.”

• “It’s 95% certain that P has a mean-time-to-failure greater than 100 hours of operation.”

• Real statistics!

Sadly, usable operational profiles with probabilities attached are very rare

• And the numbers mean nothing if the profile is something you make up
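One standard way a statement like the first arises, sketched under two assumptions: the N tests are drawn independently from the operational profile, and none of them fails. If the true per-run failure probability were θ, the chance of N straight passes would be (1 − θ)^N, so N passes give confidence C that the failure probability is at most θ:

```latex
\[
  1 - (1 - \theta)^{N} \ge C
  \quad\Longleftrightarrow\quad
  N \ge \frac{\ln(1 - C)}{\ln(1 - \theta)}
\]
\[
  C = 0.99,\ \theta = 10^{-6}:
  \qquad
  N \ge \frac{\ln 0.01}{\ln(1 - 10^{-6})} \approx 4.6 \times 10^{6}
\]
```

So the “99% certain, no more than 1 in 1,000,000” claim needs roughly 4.6 million failure-free runs, all drawn from a profile that actually reflects usage.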

5

Random Testing

Hamlet also notes that random testing is a good “baseline” against which to compare other methods

• Keeps us honest

• If a systematic method does no better than random testing, it may not be a very good approach

• What’s good about 80% (no loop) path coverage?

“If, on the other hand, a comparison with random testing as the standard were available, it might help us to write better standards, or to improve the significance of systematic methods.” - Hamlet, “Random Testing”

6

Hamlet’s Claims

Two cases “when only random testing will do” (Hamlet, Workshop on Random Testing 06)

• Well, maybe not only random testing

• Cases where systematic testing is meaningless (no plan has a rational basis)

• Cases where systematic testing is too difficult to carry out

• Hamlet emphasizes the dangers of adding systematic choice without justification: confusing what software should do with what it does do

7

Hamlet’s Claims

Danger of ignoring a test case because

• “Oh come on, it couldn’t possibly fail to handle that correctly” or

• “Nobody would ever do that”

Compare to game theory: if we really know something about the opponent’s play, we can take advantage of it

• But, lacking that, a random strategy may be “inefficient”, yet it is the only strategy that cannot be “gamed” if the opponent knows what we’re up to

This is not to imply that programs we test are adversaries, “out to get us”, but it’s sometimes useful to act as if they are

8

Sidebar: Proving an Assumption

...

int fs_read (int fd, char* buf, size_t nbytes) {
  if (buf == NULL) {
    errno = EINVAL;
    return -1;
  }
  if (!in_table(fd)) {
    errno = EBADF;
    return -1;
  }
  assert(0);  /* not reachable when buf is NULL */
  ...
}

int main () {
  int fd = nondet_int();            /* any file descriptor */
  size_t nbytes = nondet_size_t();  /* any byte count */
  havoc(file_system_state);         /* any starting file system state */
  old_file_system_state = file_system_state;  /* snapshot before the call */
  ...
  int res = fs_read (fd, NULL, nbytes);
  /* a NULL buffer must leave the file system state unchanged */
  assert(file_system_state == old_file_system_state);
}

> cbmc discharge.c

file discharge.c: Parsing
Converting
Checking discharge
Generating GOTO Program
Pointer Analysis
Adding Pointer Checks
Starting Bounded Model Checking
Unwinding loop 1 iteration 1
Unwinding loop 1 iteration 2
...
size of program expression: 337 assignments
removed 71 assignments
Generated 389 claims, 5 remaining
Passing problem to MiniSAT
Passing to decision procedure
Running MiniSAT
Solving with MiniSAT
34550 variables, 110234 clauses
SAT checker: negated claim is UNSATISFIABLE, i.e., holds
VERIFICATION SUCCESSFUL

9

Problems with Random Testing

Why not use random testing for everything?

• Oracle problem: figuring out if a random test is successful is often much harder than with a systematic test

• Sometimes we can’t do differential testing
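When a second implementation of the same behavior does exist, differential testing lets it act as the oracle. A minimal sketch, assuming a hypothetical insertion sort (my_sort) as the code under test and the standard library’s qsort as the reference; differential_test and next_random are illustrative names.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical implementation under test: insertion sort. */
static void my_sort(int *a, size_t n) {
    for (size_t i = 1; i < n; i++) {
        int key = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}

/* Comparison callback for qsort, written to avoid overflow. */
static int cmp_int(const void *p, const void *q) {
    int x = *(const int *)p, y = *(const int *)q;
    return (x > y) - (x < y);
}

/* Small deterministic PRNG (xorshift32). */
static uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Returns the number of random inputs on which the two sorts disagreed. */
static size_t differential_test(uint32_t seed, size_t trials) {
    enum { MAX_LEN = 32 };
    int a[MAX_LEN], b[MAX_LEN];
    uint32_t rng = seed ? seed : 1;
    size_t mismatches = 0;

    for (size_t t = 0; t < trials; t++) {
        size_t n = next_random(&rng) % (MAX_LEN + 1);
        for (size_t i = 0; i < n; i++)
            a[i] = (int)(next_random(&rng) % 201) - 100;  /* -100..100 */
        memcpy(b, a, n * sizeof a[0]);

        my_sort(a, n);                       /* implementation under test */
        qsort(b, n, sizeof b[0], cmp_int);   /* reference acts as oracle */

        if (memcmp(a, b, n * sizeof a[0]) != 0)
            mismatches++;
    }
    return mismatches;
}
```

The tester never needs to know what the right answer is, only that two independent implementations agree. Without a trustworthy reference, that shortcut is gone.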

10


Why not use random testing for everything?

• Generation problem: how do we make a random input?

• What, exactly, is a random C program?

• Is a random C program going to fit any sane (but unknown) operational profile?

• Are these the bugs we care about most?

• For some programs, producing well-formed input that makes for interesting tests is fundamentally hard
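A toy illustration of the generation problem, assuming a made-up grammar far simpler than C: expr ::= digit | '(' expr op expr ')'. The names gen_expr, random_expr, and is_balanced are hypothetical. A depth bound keeps generation finite, and every output is syntactically well formed.

```c
#include <stddef.h>
#include <stdint.h>

/* Small deterministic PRNG (xorshift32). */
static uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Write one random expression at `out` and return the next write position.
   Toy grammar: expr ::= digit | '(' expr op expr ')'.
   At depth 0 only a digit is produced, so recursion always terminates. */
static char *gen_expr(uint32_t *rng, int depth, char *out) {
    if (depth == 0 || next_random(rng) % 3 == 0) {
        *out++ = (char)('0' + next_random(rng) % 10);
        return out;
    }
    *out++ = '(';
    out = gen_expr(rng, depth - 1, out);
    *out++ = "+-*/"[next_random(rng) % 4];
    out = gen_expr(rng, depth - 1, out);
    *out++ = ')';
    return out;
}

/* Fill buf with a NUL-terminated expression and return its length.
   For depth <= 4 the longest output is 61 characters, so buf needs
   at least 62 bytes. */
static size_t random_expr(uint32_t seed, int depth, char *buf) {
    uint32_t rng = seed ? seed : 1;
    char *end = gen_expr(&rng, depth, buf);
    *end = '\0';
    return (size_t)(end - buf);
}

/* Check one well-formedness property: parentheses are balanced. */
static int is_balanced(const char *s) {
    int open = 0;
    for (; *s; s++) {
        if (*s == '(') open++;
        else if (*s == ')' && --open < 0) return 0;
    }
    return open == 0;
}
```

Even here, syntactic validity is not enough: the generator can emit a division by zero, so a test that evaluates the expression needs extra rules. A random C program raises the same issue at much larger scale.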

11


Why not use random testing for everything?

• Even with feedback, produces lots of redundant or uninteresting operations

• Not good at testing boundary conditions where the boundaries are drawn from a large range

• If the program only breaks when x = 2^31, don’t expect to find that randomly
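To put a number on it: a uniform draw over 32-bit values hits one specific value with probability about 2^-32, so roughly 4.3 billion draws are expected before the first hit. A minimal sketch of the usual workaround, assuming a hypothetical function buggy that fails only at x = 2^31 and a hand-written list of boundary values; all names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Small deterministic PRNG (xorshift32); it never outputs 0. */
static uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Hypothetical function under test: misbehaves only for x == 2^31. */
static int buggy(uint32_t x) {
    return x == UINT32_C(0x80000000) ? -1 : 0;
}

/* Plain random input: whatever the generator produces next. */
static uint32_t uniform_input(uint32_t *rng) {
    return next_random(rng);
}

/* Biased input: one draw in four comes from a hand-listed boundary set. */
static uint32_t biased_input(uint32_t *rng) {
    static const uint32_t boundaries[] = {
        0u, 1u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu
    };
    if (next_random(rng) % 4 == 0)
        return boundaries[next_random(rng) % 5];
    return next_random(rng);
}

/* Count failures seen in `trials` inputs from the given generator. */
static size_t count_failures(uint32_t (*gen)(uint32_t *),
                             uint32_t seed, size_t trials) {
    uint32_t rng = seed ? seed : 1;
    size_t failures = 0;
    for (size_t i = 0; i < trials; i++)
        if (buggy(gen(&rng)) != 0)
            failures++;
    return failures;
}
```

The biased generator only helps because someone already guessed that 2^31 might matter, which is a systematic choice layered on top of the random tester.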

12


Why not use random testing for everything?

• Related problem: not good when an error depends on an unlikely relationship between inputs

• Program only fails when x + y = MAXINT?

• Good luck finding that if you don’t bake it into the “random” tester explicitly...
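A minimal sketch of what “baking it in” could look like, assuming 32-bit int, with C’s INT_MAX standing in for MAXINT. The function fails_on is a hypothetical stand-in for the program under test, and related_pair and relation_hits are illustrative names.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Small deterministic PRNG (xorshift32). */
static uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Hypothetical failure condition: x + y == INT_MAX.
   The sum is computed in long long to avoid signed overflow. */
static int fails_on(int x, int y) {
    return (long long)x + (long long)y == (long long)INT_MAX;
}

/* Generate a pair of non-negative ints (assumes 32-bit int).
   One time in eight, y is derived from x so the relationship holds. */
static void related_pair(uint32_t *rng, int *x, int *y) {
    *x = (int)(next_random(rng) & 0x7FFFFFFFu);      /* 0..INT_MAX */
    if (next_random(rng) % 8 == 0)
        *y = INT_MAX - *x;                           /* baked-in relation */
    else
        *y = (int)(next_random(rng) & 0x7FFFFFFFu);  /* independent draw */
}

/* Count how many generated pairs trigger the failure condition. */
static size_t relation_hits(uint32_t seed, size_t trials) {
    uint32_t rng = seed ? seed : 1;
    size_t hits = 0;
    for (size_t i = 0; i < trials; i++) {
        int x, y;
        related_pair(&rng, &x, &y);
        if (fails_on(x, y))
            hits++;
    }
    return hits;
}
```

With independent draws the two values would almost never sum to INT_MAX; deriving y from x makes the case common, but only for relationships the tester’s author thought to encode.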
