Learning From Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics
Shan Lu, Soyeon Park, Eunsoo Seo and Yuanyuan Zhou
Appeared in ASPLOS'08
Presented by Michelle Goodstein
LBA Reading Group 3/27/08
Introduction
Multi-core computers are common
More programmers are having to write concurrent programs
Concurrent programs have different bugs than sequential programs
However, without a study, it is hard to know what those bugs are
First real-world study of concurrency bugs
Introduction
Knowing the types of concurrency bugs that actually occur in software will:
◦ Help create better bug detection schemes
◦ Inform the testing process software goes through
◦ Provide information to programming language designers
Introduction
Current state of affairs
◦ Reproducing concurrency bugs is difficult
◦ Test cases are critical to being able to diagnose a bug
◦ Most detection research focuses on: data races, deadlock bugs, and some new work on detecting atomicity violations
Few studies on real world concurrency bugs
◦ Most use programs that were buggy by design for the study
Most studies on bug characteristics focus on non-concurrent bugs
Methodology
4 representative open-source applications:
◦ MySQL
◦ Apache
◦ Mozilla
◦ OpenOffice
Each application has
◦ 9-13 years of development history
◦ 1-4 million lines of code
Methodology
Randomly selected bugs from bug databases that contained at least one keyword related to concurrency (e.g. "race", "concurrency", "deadlock", "synchronization", etc.)
From these, randomly choose 500 bugs that have
◦ Root causes explained well and in detail
◦ Source code available
◦ Bug fix info available
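The keyword step above can be sketched in a few lines. This is only an illustration: the report texts are invented, the keyword list contains just the four examples named on the slide (the "etc." is left open), and `mentions_concurrency` is a hypothetical helper, not the authors' tooling.

```python
import re

# Only the keywords quoted on the slide; the study's full list is not given here.
KEYWORDS = ("race", "concurrency", "deadlock", "synchronization")
_PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def mentions_concurrency(report_text: str) -> bool:
    """Return True if the report contains at least one keyword (substring match)."""
    return _PATTERN.search(report_text) is not None


# Hypothetical bug-report summaries, invented for illustration.
reports = [
    "Crash caused by a data race on the query cache",
    "Typo in the installer dialog",
    "Deadlock when two threads flush the log",
]
candidates = [r for r in reports if mentions_concurrency(r)]
```

A substring match like this also accepts reports such as "stack trace attached", which is one reason a keyword filter can only produce candidates that still need manual review.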
Methodology
Remove any bugs not truly caused by concurrency
Result: 105 concurrency bugs
Separate study of deadlock and non-deadlock bugs
Methodology
Evaluated bugs in 3 dimensions
◦ Bug pattern: {atomicity-violation, order-violation, other}
◦ Manifestation: required conditions for bug to occur, # threads involved, # variables, # accesses
◦ Bug fix strategy: look at final patch, mistakes in intermediate patches, and whether transactional memory (TM) can help
Results organized as a collection of findings
Motivation
34/105 concurrency bugs cause program crashes
37/105 concurrency bugs cause programs to hang
Concurrency bugs are important
Bug Patterns
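Findings: Bug Patterns
Atomicity Violation: a code region intended to execute atomically is interleaved by another thread's access
Order Violation: two (groups of) memory accesses execute in the opposite of the intended order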
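A minimal sketch of the two patterns, assuming a toy model in which each "thread" is a list of named steps and the schedule is chosen by hand, so the outcome is deterministic. The step names and shared data are invented for illustration and are not taken from the studied applications.

```python
def run_schedule(steps, schedule, crash_on):
    """Run named steps in the given order; report whether the bug manifested."""
    try:
        for name in schedule:
            steps[name]()
    except crash_on:
        return "crash"
    return "ok"


def atomicity_demo(schedule):
    """Thread 1 checks then uses `buf`; thread 2 may clear it in between."""
    shared = {"buf": "payload"}
    local = {}

    def t1_check():
        local["non_null"] = shared["buf"] is not None

    def t1_use():
        if local["non_null"]:
            local["length"] = len(shared["buf"])  # TypeError if buf was cleared

    def t2_clear():
        shared["buf"] = None

    steps = {"t1_check": t1_check, "t1_use": t1_use, "t2_clear": t2_clear}
    return run_schedule(steps, schedule, TypeError)


def order_demo(schedule):
    """Thread 2 assumes thread 1 has already initialised `table`."""
    shared = {}

    def t1_init():
        shared["table"] = {"k": 1}

    def t2_use():
        shared["result"] = shared["table"]["k"]  # KeyError if run before init

    steps = {"t1_init": t1_init, "t2_use": t2_use}
    return run_schedule(steps, schedule, KeyError)
```

In this model `atomicity_demo(["t1_check", "t2_clear", "t1_use"])` crashes because the check and the use are separated, while `order_demo(["t2_use", "t1_init"])` crashes because the use runs before the initialisation.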
Findings: Bug Patterns
Most (72/74) of the examined non-deadlock concurrency bugs are either atomicity-violations or order-violations
Focusing on atomicity and order-violations should detect most non-deadlock concurrency bugs
In fact, 24/74 are order violations
Since current tools don't address order-violation, new tools must be developed
Bug Manifestations
Findings: Bug Manifestations
Most (101/105) bugs involved ≤ 2 threads
• Most communication happens among a small number of threads
• Enforcing certain partial orderings among a small number of threads can expose bugs
• Heavy workloads can increase competition for resources, and make it more likely to observe a partial ordering that causes a bug
Pairwise testing can find many bugs
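One way to picture pairwise testing is to enumerate every interleaving of two threads' operations and run each one. The sketch below assumes a toy shared counter where each thread performs an unsynchronised read followed by a write; the thread names and operations are invented, and this is not a tool described in the paper.

```python
from itertools import combinations


def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's own order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        slots = set(positions)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]


def run(schedule):
    """Execute a schedule of (thread, op) pairs on a shared counter."""
    counter = 0
    tmp = {}  # per-thread local copy of the value it read
    for thread, op in schedule:
        if op == "read":
            tmp[thread] = counter
        else:  # "write": store the stale local value plus one
            counter = tmp[thread] + 1
    return counter


t1 = [("T1", "read"), ("T1", "write")]
t2 = [("T2", "read"), ("T2", "write")]
all_schedules = list(interleavings(t1, t2))
# A correct run increments twice; anything else is a lost update.
lost_updates = [s for s in all_schedules if run(s) != 2]
```

With two operations per thread there are only six schedules to try, and four of them lose an update, which is why forcing orderings between just two threads can already be productive.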
Findings: Bug Manifestations
Some (7/31) deadlock bugs involve only 1 thread!
Easy to detect/avoid
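A single-thread deadlock typically arises when a thread tries to re-acquire a non-reentrant lock it already holds. The sketch below shows the situation with Python's `threading.Lock`, using a non-blocking second acquire so the example reports the problem instead of hanging. `second_acquire_succeeds` is a hypothetical helper name.

```python
import threading


def second_acquire_succeeds(lock) -> bool:
    """Acquire `lock`, then try to acquire it again from the same thread."""
    lock.acquire()
    try:
        # A blocking acquire here would hang forever on a plain Lock.
        got_it = lock.acquire(blocking=False)
        if got_it:
            lock.release()  # undo the nested acquire
        return got_it
    finally:
        lock.release()  # undo the first acquire


plain_result = second_acquire_succeeds(threading.Lock())       # non-reentrant
reentrant_result = second_acquire_succeeds(threading.RLock())  # reentrant
```

The plain lock refuses the second acquire while the reentrant lock allows it, which hints at why this class of deadlock is comparatively easy to detect or avoid.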
Findings: Bug Manifestations
Many (49/74) non-deadlock bugs involve 1 variable. However, 34% involve ≥ 2 variables
Focusing on 1 variable is a good simplification
However, new tools are also necessary to discover multivariable concurrency bugs
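A multi-variable bug can be sketched as two variables that must stay consistent but are updated in separate steps. The example assumes a hand-picked schedule and invented names (`items`, `count`); each individual access looks harmless, and only the pair together is wrong.

```python
def reader_sees_consistent_state(schedule):
    """Writer updates `items` then `count`; reader snapshots both variables."""
    shared = {"items": ["a"], "count": 1}
    seen = {}

    def w_items():
        shared["items"].append("b")

    def w_count():
        shared["count"] = len(shared["items"])

    def r_both():
        seen["items"] = len(shared["items"])
        seen["count"] = shared["count"]

    steps = {"w_items": w_items, "w_count": w_count, "r_both": r_both}
    for name in schedule:
        steps[name]()
    return seen["items"] == seen["count"]
```

Here `reader_sees_consistent_state(["w_items", "r_both", "w_count"])` is false because the reader runs between the two writes; a detector that watches one variable at a time would have no inconsistency to report.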
Findings: Bug Manifestations
Most (30/31) deadlock bugs involved ≤ 2 resources
Pairwise testing of order among obtained and released resources should help reveal deadlocks
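As a rough illustration of checking acquisition order between pairs of resources, the sketch below scans per-thread acquire/release traces and reports resource pairs that are taken in both orders. The trace format and the function `find_order_inversions` are assumptions made for this example, not a tool from the paper.

```python
def find_order_inversions(traces):
    """traces maps a thread name to a list of ("acquire" | "release", resource)."""
    orders = set()  # (held, acquired) pairs observed in any thread
    for events in traces.values():
        held = []
        for action, resource in events:
            if action == "acquire":
                for h in held:
                    orders.add((h, resource))
                held.append(resource)
            else:
                held.remove(resource)
    # A pair acquired in both orders is a potential two-resource deadlock.
    return sorted((a, b) for (a, b) in orders if a < b and (b, a) in orders)


# Hypothetical traces: T1 takes A then B, T2 takes B then A.
risky = {
    "T1": [("acquire", "A"), ("acquire", "B"), ("release", "B"), ("release", "A")],
    "T2": [("acquire", "B"), ("acquire", "A"), ("release", "A"), ("release", "B")],
}
inversions = find_order_inversions(risky)
```

Since nearly all deadlocks in the study involve at most two resources, examining pairs like this covers most of the cases the slide describes.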
Findings: Bug Manifestations
Most (92%) bugs manifest if certain partial orderings among ≤ 4 memory accesses are enforced
Testing small groups of accesses will be polynomial time and expose most bugs
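The polynomial-time claim can be made concrete with a quick count. The sketch assumes that "small groups" means unordered groups of at most 4 accesses chosen from n, which is one reading of the slide rather than the paper's exact formulation.

```python
from math import comb, factorial


def small_groups(n: int, max_size: int = 4) -> int:
    """Number of groups of 1..max_size accesses out of n; grows as O(n**4)."""
    return sum(comb(n, k) for k in range(1, max_size + 1))


n = 10
groups = small_groups(n)       # groups of at most 4 accesses
all_orderings = factorial(n)   # every total order of the n accesses
```

Each group of at most 4 accesses has at most 4! = 24 orderings, so the work stays polynomial in n, whereas the number of total orders of all n accesses grows factorially.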
Bug Fixes
Findings: Bug Fixes
Adding/changing locks is the fix for only a minority (20/74) of non-deadlock concurrency bugs
Locks aren't enough to fix all concurrency bugs.
Locks don't promise ordering, just atomicity
Addition of locks can hurt performance or create new deadlock bugs
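Because a lock gives mutual exclusion but says nothing about which thread goes first, an order violation needs a different primitive. A minimal sketch, assuming Python's `threading.Event` as the ordering mechanism and invented shared data; this is one possible fix, not the one used in any particular studied bug.

```python
import threading


def run_with_event():
    """The user thread waits until the initializer has published `table`."""
    shared = {}
    ready = threading.Event()

    def initializer():
        shared["table"] = {"k": 1}
        ready.set()  # signal that initialisation is complete

    def user():
        ready.wait()  # blocks until set(), whichever thread started first
        shared["result"] = shared["table"]["k"]

    # Start the user first to show the result no longer depends on start order.
    t_user = threading.Thread(target=user)
    t_init = threading.Thread(target=initializer)
    t_user.start()
    t_init.start()
    t_user.join()
    t_init.join()
    return shared["result"]
```

Wrapping both accesses in the same lock would make each one atomic but would still let the user run first and fail.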
Findings: Bug Fixes
Most common fix (19/31) to deadlock bugs lets 1 thread give up acquiring a resource, such as a lock
This may get rid of deadlock bugs, but can create other non-deadlock bugs
Code may no longer be correct
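The "give up acquiring a resource" strategy can be sketched with a non-blocking acquire: the thread holds one lock, tries the second, and backs out if it is busy. The function name `run_if_both_available` and the callback are hypothetical, and the sketch deliberately leaves open what the caller does when the work is skipped.

```python
import threading


def run_if_both_available(first, second, work) -> bool:
    """Run work() under both locks, or give up if `second` is busy."""
    with first:
        if not second.acquire(blocking=False):
            return False  # give up rather than wait while holding `first`
        try:
            work()
            return True
        finally:
            second.release()


a, b = threading.Lock(), threading.Lock()
done = []
ran_when_free = run_if_both_available(a, b, lambda: done.append("first"))

b.acquire()  # simulate another thread holding the second resource
ran_when_busy = run_if_both_available(a, b, lambda: done.append("second"))
b.release()
```

The second call returns without doing its work, so the deadlock is gone, but the skipped work is exactly where a new non-deadlock bug can appear if the caller ignores the return value.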
Bug Fixes: Buggy Patches
17/57 Mozilla bugs have ≥ 1 buggy patch
On average, 0.4 buggy patches are released for every final correct patch
Of 23 distinct buggy patches for the 17 bugs:
◦ 6 decrease probability of occurrence but do not eliminate original bug
◦ 5 create new concurrency bugs
◦ 12 create new non-concurrency bugs
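The figures on this slide can be cross-checked with a little arithmetic. The check assumes that each of the 57 Mozilla bugs has exactly one final correct patch, which is how the 0.4 average appears to be derived; the dictionary labels are paraphrases of the slide.

```python
mozilla_bugs = 57
bugs_with_buggy_patch = 17

# Breakdown of the distinct buggy patches, as listed on the slide.
buggy_patches = {
    "original bug less likely but not eliminated": 6,
    "new concurrency bug introduced": 5,
    "new non-concurrency bug introduced": 12,
}

total_buggy = sum(buggy_patches.values())
buggy_per_final_patch = total_buggy / mozilla_bugs       # about 0.40
share_of_bugs_affected = bugs_with_buggy_patch / mozilla_bugs  # about 0.30
```

The three categories add up to the 23 buggy patches quoted, and 23 over 57 rounds to the 0.4 stated on the slide.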
Findings: Bug Fixes
In many (41/105) cases, TM can help avoid concurrency bugs
Findings: Bug Fixes
Also in many cases (44/105), TM might be able to help with concurrency bugs
◦ TM would need to cope with long regions, rollback of I/O, and the strange "nature" of the code
Findings: Bug Fixes
In 20/105 cases, TM provides little help
◦ TM cannot help with many order-violation bugs
While TM could be useful in preventing concurrency bugs, it will not fix all of them
Conclusion
First real-world concurrent bug study
Multiple findings on
◦ Type of concurrency bugs
◦ Conditions for manifestation
◦ Techniques for fixing concurrent bugs
Several heuristics proposed for:
◦ Bug detection
◦ Testing
◦ Language design (i.e., TM)
Future work can focus on detecting common types of errors
◦ Multi-variable bugs
◦ Order violation bugs
◦ Multiple-access bugs