Testing Concurrent Programs COMP 402 - Production Programming Mathias Ricken Rice University Spring 2009

Page 1

Testing Concurrent Programs

COMP 402 - Production Programming

Mathias Ricken

Rice University

Spring 2009

Page 2

Moore’s Law

Page 3

Page 4

Timeliness

CPU clock frequencies stagnate

Multi-Core CPUs provide additional processing power
– Multiple threads needed to use multiple cores

Writing concurrent programs is difficult!

Page 5

Programming Examples

Page 6

Unit Testing

Unit tests…
– Test a part, not the whole program
– Occur earlier
– Automate testing
– Serve as documentation
– Prevent bugs from reoccurring
– Help keep the shared repository clean

Effective with a single thread of control

Page 7

Foundation of Unit Testing

Unit tests depend on deterministic behavior

Known input, expected output…

Success → correct behavior
Failure → flawed code

Outcome of test is meaningful

Page 8

Problems Due to Concurrency

Thread scheduling is nondeterministic and machine-dependent
– Code may be executed under different schedules
– Different schedules may produce different results

Known input, expected output…

Success → correct behavior in this schedule, may be flawed in another schedule

Failure → flawed code

Success of unit test is meaningless
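To make "different schedules may produce different results" concrete, the following sketch simulates every interleaving of two threads that each increment a shared counter in two steps (read, then write). It is an illustration only: the class and method names are hypothetical, and the schedules are simulated on a single thread rather than produced by a real scheduler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ScheduleDemo {
    /** Runs one schedule: each entry names the thread (0 or 1) that executes its next step. */
    static int run(List<Integer> schedule) {
        int shared = 0;                 // the shared counter
        int[] local = new int[2];       // each thread's private copy
        int[] step = new int[2];        // next step per thread: 0 = read, 1 = write
        for (int t : schedule) {
            if (step[t] == 0) {
                local[t] = shared;      // step 0: read the shared counter
            } else {
                shared = local[t] + 1;  // step 1: write back the incremented copy
            }
            step[t]++;
        }
        return shared;
    }

    /** Collects the final counter value of every interleaving of two 2-step threads. */
    static Set<Integer> allOutcomes() {
        Set<Integer> outcomes = new TreeSet<>();
        enumerate(new ArrayList<>(), 2, 2, outcomes);
        return outcomes;
    }

    private static void enumerate(List<Integer> prefix, int left0, int left1, Set<Integer> outcomes) {
        if (left0 == 0 && left1 == 0) {
            outcomes.add(run(prefix));
            return;
        }
        if (left0 > 0) {
            prefix.add(0);
            enumerate(prefix, left0 - 1, left1, outcomes);
            prefix.remove(prefix.size() - 1);
        }
        if (left1 > 0) {
            prefix.add(1);
            enumerate(prefix, left0, left1 - 1, outcomes);
            prefix.remove(prefix.size() - 1);
        }
    }

    public static void main(String[] args) {
        System.out.println(allOutcomes());   // prints [1, 2]
    }
}
```

A test asserting that the counter equals 2 passes in the serial schedules and fails in the interleaved ones, so a single passing run says nothing about the other schedules.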

Page 9

Possible Solutions

Programming Language Features
– Ensuring that bad things cannot happen
– May restrict programmers

Lock-Free Algorithms
– Ensuring that if bad things happen, it’s ok
– May limit data structures available

Comprehensive Testing
– Testing if bad things happen in any schedule
– Does not prevent problems, but does not limit solutions either

Page 10

Contributions

Improvements to JUnit
– Detect exceptions and failed assertions in threads other than the main thread

Annotations for Concurrency Invariants
– Express complicated requirements about locks and threads

Tools for Schedule-Based Execution
– Record, deadlock monitor
– Random delays, random yields

Page 11

Improvements to JUnit

Uncaught exceptions and failed assertions
– Not caught in child threads

Page 12

Sample JUnit Tests

    import junit.framework.TestCase;

    public class Test extends TestCase {
        // Fails: the exception is thrown in the thread that runs the test.
        public void testException() {
            throw new RuntimeException("booh!");
        }

        // Fails: the assertion does not hold.
        public void testAssertion() {
            assertEquals(0, 1);
        }
    }

assertEquals(0, 1) amounts to: if (0 != 1) throw new AssertionFailedError();

Both tests fail.

Page 13

Problematic JUnit Tests

    import junit.framework.TestCase;

    public class Test extends TestCase {
        public void testException() {
            // The exception is thrown in the child thread, not in the thread running the test.
            new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");
                }
            }).start();
        }
    }

[Diagram: the main thread spawns the child thread; the exception in the child thread is uncaught; the main thread reaches the end of the test: success!]

Page 14

Problematic JUnit Tests

[Same test code as on the previous slide.]

[Diagram: main thread and child thread.]

Uncaught exception, test should fail but does not!
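The effect can be reproduced without JUnit. The sketch below is a stand-in for the test method, with hypothetical names: the "test body" starts a child thread that throws, waits for it, and still completes normally. The empty uncaught exception handler is an addition that only keeps the stack trace off standard error.

```java
class ChildThreadExceptionDemo {
    /** Returns true if the calling thread finishes the "test body" without seeing an exception. */
    static boolean testBodyCompletesNormally() throws InterruptedException {
        Thread child = new Thread(new Runnable() {
            public void run() {
                throw new RuntimeException("booh!");
            }
        });
        // Assumption for the demo: swallow the report; by default the JVM prints the stack trace.
        child.setUncaughtExceptionHandler((thread, error) -> { });
        try {
            child.start();
            child.join();   // wait, so the exception has definitely been thrown
            return true;    // reached: the exception never propagated to this thread
        } catch (RuntimeException e) {
            return false;   // never reached
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("test body completed normally: " + testBodyCompletesNormally());
    }
}
```

Because the exception unwinds only the child thread's stack, a test runner that watches just the thread executing the test method has nothing to catch.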

Page 15

Improvements to JUnit

Uncaught exceptions and failed assertions
– Not caught in child threads

Thread group with exception handler
– JUnit test runs in a separate thread, not main thread
– Child threads are created in same thread group
– When test ends, check if handler was invoked

Page 16

Thread Group for JUnit Tests

    import junit.framework.TestCase;

    public class Test extends TestCase {
        public void testException() {
            // The exception is thrown in the child thread.
            new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");
                }
            }).start();
        }
    }

[Diagram: test thread and child thread, with arrows labeled "invokes" and "checks" pointing to TestGroup’s Uncaught Exception Handler.]

Page 17

Thread Group for JUnit Tests

[Same test code as on the previous slide.]

[Diagram: the main thread spawns the test thread and waits; the test thread spawns the child thread; the exception in the child thread is uncaught and invokes the group’s handler; the test thread reaches the end of the test; the main thread resumes and checks the group’s handler: failure!]
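The mechanism in the timeline above can be sketched in plain Java. This is not ConcJUnit's actual implementation; `ThreadGroupTestRunner`, `TestGroup`, and `runTest` are hypothetical names. It relies on two documented behaviors: a thread created without an explicit group joins the creating thread's group, and a thread without its own handler reports uncaught exceptions to `ThreadGroup.uncaughtException`.

```java
import java.util.concurrent.atomic.AtomicReference;

class ThreadGroupTestRunner {
    /** Thread group that remembers the first uncaught exception thrown by any of its threads. */
    static class TestGroup extends ThreadGroup {
        final AtomicReference<Throwable> uncaught = new AtomicReference<>();

        TestGroup() {
            super("TestGroup");
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            uncaught.compareAndSet(null, e);
        }
    }

    /** Runs the test body in a thread of a fresh TestGroup; returns the uncaught exception, or null. */
    static Throwable runTest(Runnable testBody) throws InterruptedException {
        TestGroup group = new TestGroup();
        Thread testThread = new Thread(group, testBody, "test thread");
        testThread.start();            // main thread spawns the test thread ...
        testThread.join();             // ... and waits for the end of the test
        return group.uncaught.get();   // then checks the group's handler
    }

    public static void main(String[] args) throws InterruptedException {
        Throwable failure = runTest(() -> {
            // Threads created here inherit the test thread's group by default.
            Thread child = new Thread(() -> { throw new RuntimeException("booh!"); });
            child.start();
            try {
                child.join();   // make sure the child fails before the test ends
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println(failure == null ? "success" : "failure: " + failure.getMessage());
    }
}
```

The demo joins the child so that its exception is recorded before the handler is checked; without the join the outcome depends on the schedule.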

Page 18

Improvements to JUnit

Uncaught exceptions and failed assertions
– Not caught in child threads

Thread group with exception handler
– JUnit test runs in a separate thread, not main thread
– Child threads are created in same thread group
– When test ends, check if handler was invoked

Detection of uncaught exceptions and failed assertions in child threads that occurred before test’s end

Past tense: occurred!

Page 19

Child Thread Outlives Parent

    import junit.framework.TestCase;

    public class Test extends TestCase {
        public void testException() {
            // The test method returns without waiting for the child thread.
            new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");
                }
            }).start();
        }
    }

[Diagram, as on Page 17: the main thread spawns the test thread and waits; the test thread spawns the child thread; the exception in the child thread is uncaught and invokes the group’s handler; the test thread reaches the end of the test; the main thread resumes and checks the group’s handler: failure!]

Page 20

Child Thread Outlives Parent

[Same test code as on the previous slide.]

[Diagram: the main thread spawns the test thread and waits; the test thread spawns the child thread and reaches the end of the test; the main thread resumes and checks the group’s handler: success! Only afterwards is the exception in the child thread uncaught and the group’s handler invoked. Too late!]

Page 21

Improvements to JUnit

Child threads are not required to terminate
– A test may pass before an error is reached

Detect if any child threads are still alive
– Declare failure if test thread has not waited
– Ignore daemon threads, system threads (AWT, RMI, garbage collection, etc.)

Previous schedule is a test failure
– Should be prevented by using Thread.join()
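One way to detect surviving child threads is to enumerate the test's thread group once the test thread has ended. The sketch below makes that assumption; it is not taken from ConcJUnit, its names are hypothetical, and it filters out only daemon threads, not the system threads mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class LivingThreadCheck {
    /** Returns the non-daemon threads of the group that are still alive, excluding the caller. */
    static List<Thread> livingNonDaemonThreads(ThreadGroup group) {
        Thread[] threads = new Thread[group.activeCount() + 8];   // activeCount() is only an estimate
        int n = group.enumerate(threads);
        List<Thread> alive = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Thread t = threads[i];
            if (t != Thread.currentThread() && t.isAlive() && !t.isDaemon()) {
                alive.add(t);
            }
        }
        return alive;
    }

    /** Runs testBody in a fresh group and reports how many child threads outlive the test. */
    static int runAndCountSurvivors(Runnable testBody) throws InterruptedException {
        ThreadGroup group = new ThreadGroup("TestGroup");
        Thread testThread = new Thread(group, testBody, "test thread");
        testThread.start();
        testThread.join();
        return livingNonDaemonThreads(group).size();
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        int survivors = runAndCountSurvivors(() -> {
            // The test starts a child thread but "forgets" to join it.
            new Thread(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        });
        System.out.println(survivors > 0 ? "failure: child threads still alive" : "success");
        release.countDown();   // let the child finish so the JVM can exit
    }
}
```

The latch keeps the child alive deterministically for the demo; in a real test the child would simply still be running its own work when the test method returns.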

Page 22

Enforced Join

    import junit.framework.TestCase;

    public class Test extends TestCase {
        // join() can throw InterruptedException, so the test method declares it.
        public void testException() throws InterruptedException {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");
                }
            });
            t.start();
            // …
            t.join();   // the test thread waits for the child thread
            // …
        }
    }

[Diagram: timelines of the test thread and the child thread.]

Page 24

Testing ConcJUnit

Replacement for junit.jar or as plugin JAR for JUnit 4.2
– Available as binary and source at http://www.concutest.org/

Results from DrJava’s unit tests
– Child thread for communication with slave VM still alive in test
– Several reader and writer threads still alive in low level test (calls to join() missing)

DrJava currently does not use ConcJUnit
– Custom-made TestCase class
– Does not check if join() calls are missing

Page 25

Conclusion

Improved JUnit now detects problems in other threads
– Only in chosen schedule
– Needs schedule-based execution

Annotations ease documentation and checking of concurrency invariants
– Open-source library of Java API invariants

Support programs for schedule-based execution

Page 26

Future Work

Schedule-Based Execution
– Replay given schedule
– Generate possible schedules
– Dynamic race detection
– Probabilities/durations for random yields/sleeps

Extend annotations to Floyd-Hoare logic
– Preconditions, postconditions
– Representation invariants

Page 27

Extra Slides

Page 28

Tractability of Comprehensive Testing

Test all possible schedules
– Concurrent unit tests meaningful again

Number of schedules (N)
– t: # of threads, s: # of slices per thread
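The formula for N did not survive on this slide. Assuming the closed form N = (ts)! / (s!)^t that the "Extra: Number of Schedules" slide describes in words, the following sketch shows how quickly N grows; the class and method names are illustrative.

```java
import java.math.BigInteger;

class ScheduleCount {
    /** Computes N = (t*s)! / (s!)^t, the number of interleavings of t threads with s slices each. */
    static BigInteger numberOfSchedules(int t, int s) {
        return factorial(t * s).divide(factorial(s).pow(t));
    }

    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(numberOfSchedules(2, 2));    // 4!/(2!)^2 = 6
        System.out.println(numberOfSchedules(2, 10));   // 20!/(10!)^2 = 184756
        System.out.println(numberOfSchedules(3, 5));    // 15!/(5!)^3 = 756756
    }
}
```

Even two threads with ten slices each already yield 184,756 schedules, which is why the later slides look for ways to reduce the number of schedules that must be simulated.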

Page 29

Extra: Number of Schedules

Product of s-combinations
– For thread 1: choose s out of ts time slices
– For thread 2: choose s out of ts-s time slices
– …
– For thread t-1: choose s out of 2s time slices
– For thread t: choose s out of s time slices

Writing s-combinations using factorial

Cancel out terms in denominator and next numerator

Left with (ts)! in numerator and t factors of s! in denominator
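The equations on this slide were lost in extraction. The steps described above can be reconstructed as follows, with t threads and s slices per thread as defined on the "Tractability of Comprehensive Testing" slide; this is a reconstruction from the text, not a copy of the original slide.

```latex
\begin{align*}
N &= \binom{ts}{s}\binom{ts-s}{s}\cdots\binom{2s}{s}\binom{s}{s} \\
  &= \frac{(ts)!}{s!\,(ts-s)!}\cdot\frac{(ts-s)!}{s!\,(ts-2s)!}\cdots
     \frac{(2s)!}{s!\,s!}\cdot\frac{s!}{s!\,0!} \\
  &= \frac{(ts)!}{(s!)^{t}}
\end{align*}
```

Each factor's numerator cancels the second term of the previous factor's denominator, so only (ts)! and one s! per thread remain.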

Page 30

Tractability of Comprehensive Testing

If program is race-free, we do not have to simulate all thread switches
– Threads interfere only at “critical points”: lock operations, shared or volatile variables, etc.
– Code between critical points cannot affect outcome
– Simulate all possible arrangements of blocks delimited by critical points

Run dynamic race detection in parallel
– Lockset algorithm (e.g. Eraser by Savage et al.)
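As a rough illustration of the lockset idea, the sketch below keeps, for each shared variable, the set of locks that were held on every access so far; when that set becomes empty, no single lock protects the variable and a race is possible. It covers only this basic refinement step and omits the refinements of the full Eraser algorithm (for example its handling of initialization and read-shared data). All names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class LocksetSketch {
    // Candidate locks per shared variable; a missing entry means "not accessed yet".
    private final Map<String, Set<String>> candidates = new HashMap<>();

    /**
     * Records an access to a variable by a thread holding locksHeld.
     * Returns false if no single lock has protected every access so far (potential race).
     */
    boolean access(String variable, Set<String> locksHeld) {
        Set<String> c = candidates.get(variable);
        if (c == null) {
            c = new HashSet<>(locksHeld);   // first access: all held locks are candidates
            candidates.put(variable, c);
        } else {
            c.retainAll(locksHeld);         // refine: keep only locks held on every access
        }
        return !c.isEmpty();
    }

    public static void main(String[] args) {
        LocksetSketch detector = new LocksetSketch();
        System.out.println(detector.access("shared", Set.of("lock")));   // true
        System.out.println(detector.access("shared", Set.of("lock")));   // true
        System.out.println(detector.access("shared", Set.of()));         // false
    }
}
```

If such a check stays silent during a run, the assumption that threads interfere only at critical points is at least not contradicted by that run.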

Page 31

Critical Points Example

[Diagram: Thread 1 and Thread 2, each with a local variable ("Local Var 1"), plus a shared variable ("Shared Var") and a lock ("Lock"). The threads perform "lock, access, unlock" sequences on the shared variable. Callouts: "All accesses protected by lock" and "Local variables don’t need locking".]

Page 32

Fewer Schedules

Fewer critical points than thread switches
– Reduces number of schedules
– Example: Two threads, but no communication: N = 1

Unit tests are small
– Reduces number of schedules

Hopefully comprehensive simulation is tractable
– If not, heuristics are still better than nothing

Page 33

Limitations

Improvements only check chosen schedule
– A different schedule may still fail
– Requires comprehensive testing to be meaningful

May still miss uncaught exceptions
– Specify absolute parent thread group, not relative
– Cannot detect uncaught exceptions in a program’s uncaught exception handler (JLS limitation)

Page 34

Extra: Limitations

May still miss uncaught exceptions
– Specify absolute parent thread group, not relative (rare)
  Koders.com: 913 matches for ThreadGroup vs. 49,329 matches for Thread
– Cannot detect uncaught exceptions in a program’s uncaught exception handler (JLS limitation)
  Koders.com: 32 method definitions for uncaughtException method

Page 35

Extra: DrJava Statistics

                        2004      2006
    Unit tests           736       881
      passed             610       881
      failed              36         0
      not run             90         0
    Invariants          5116     34412
      met               4161     30616
      failed             965      3796
      % failed        18.83%    11.03%
    KLOC                 107       129
      “event thread”       1        99