It’s Not a Bug, It’s a Feature: Does Misclassification Affect Bug Localization?

Pavneet Singh Kochhar, Tien-Duy B. Le, David Lo
Singapore Management University



Misclassification of Issue Reports

Herzig et al. *
• 1/3 of issue reports are wrongly classified as bugs.
• 40% of issue reports are misclassified.

* It’s Not a Bug, It’s a Feature: How Misclassification Impacts Bug Prediction, K. Herzig, S. Just, A. Zeller, ICSE 2013

[Figure: issue report categories: BUG, DOCUMENTATION, IMPROVEMENT, REFACTORING, BACKPORT, CLEANUP, DESIGN DEFECT, TASK, TEST]


Bug Localization

GOAL: Find the buggy files among thousands of source code files.
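The slides do not say which localization technique was used. As an illustration only, the sketch below assumes a common information-retrieval formulation: every source file is scored by TF-IDF cosine similarity to the text of the issue report, and files are returned in descending order of score. The function names, the tokenizer, and the smoothed IDF formula are assumptions made for this example, not details taken from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def rank_files(report_text, files):
    """Rank files by TF-IDF cosine similarity to the report text.

    `files` maps a (hypothetical) file path to its textual content.
    Returns the file paths, most similar first; ties are broken by path.
    """
    docs = {path: Counter(tokenize(body)) for path, body in files.items()}
    n_docs = len(docs)
    # Document frequency: in how many files does each term occur?
    df = Counter(term for counts in docs.values() for term in counts)

    def weights(counts):
        # Smoothed IDF (an assumption) keeps every weight positive.
        return {t: tf * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
                for t, tf in counts.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query = weights(Counter(tokenize(report_text)))
    scores = {path: cosine(query, weights(counts))
              for path, counts in docs.items()}
    return sorted(scores, key=lambda p: (-scores[p], p))


# Toy data, invented for the example.
example_files = {
    "Parser.java": "parse header token stream",
    "Cache.java": "cache entry eviction policy",
    "Util.java": "string helper",
}
ranking = rank_files("Exception when cache eviction runs", example_files)
```

With this toy input, `Cache.java` is ranked first because it is the only file sharing terms with the report.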


Our Study

• Dataset (3 software systems): HTTPClient, Jackrabbit, Lucene-Java
• Over 7000 issue reports
• 3 research questions
• Suggest a mitigation strategy

How does misclassification impact bug localization?


RQ1: Effect of Misclassification on Bug Localization

Mean Average Precision (MAP) Scores

Project       Reported   Actual   Difference
HTTPClient    0.429      0.419    -2.33%
Jackrabbit    0.302      0.339    12.25%
Lucene-Java   0.301      0.322     6.98%

• Difference of -2.33% to 12.25% between MAP scores
• Significant differences (Mann-Whitney-Wilcoxon test)
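For readers unfamiliar with the metric, here is a minimal sketch of how MAP is commonly computed for bug localization, and of how the Difference column follows from the Reported and Actual scores. The function names are hypothetical, and the sketch assumes every buggy file appears somewhere in the ranked list. The percentages in the table are consistent with a difference taken relative to the Reported score.

```python
def average_precision(ranked_files, buggy_files):
    """AP for one report: mean of precision@k over the ranks k that hold a buggy file.

    Assumes the ranking covers all files, so every buggy file is found.
    """
    hits = 0
    precisions = []
    for k, path in enumerate(ranked_files, start=1):
        if path in buggy_files:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(results):
    """MAP: mean of AP over (ranked_files, buggy_files) pairs, one per report."""
    aps = [average_precision(ranked, buggy) for ranked, buggy in results]
    return sum(aps) / len(aps) if aps else 0.0


def relative_difference(reported, actual):
    """Percentage change from the Reported MAP to the Actual MAP."""
    return (actual - reported) / reported * 100.0


# MAP scores copied from the RQ1 table.
rq1_scores = {
    "HTTPClient": (0.429, 0.419),
    "Jackrabbit": (0.302, 0.339),
    "Lucene-Java": (0.301, 0.322),
}
rq1_differences = {
    project: round(relative_difference(reported, actual), 2)
    for project, (reported, actual) in rq1_scores.items()
}
```

Running this reproduces the table's Difference column: -2.33, 12.25 and 6.98.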


RQ2: Effect of Different Misclassification Types

Mean Average Precision (MAP) Scores

Actual to Reported    HC      JB      LJ      Overall
None                  0.429   0.302   0.301   0.312
IMPROVEMENT to BUG    0.416   0.299   0.295   0.307
TEST to BUG           0.429   0.328   0.313   0.334

Note: HC = HTTPClient, JB = Jackrabbit, LJ = Lucene-Java

• 13 different categories: BUG, RFE, IMPROVEMENT, DOCUMENTATION, REFACTORING, BACKPORT, CLEANUP, SPEC, TASK, TEST, BUILD_SYSTEM, DESIGN_DEFECT, and OTHERS
• TEST to BUG and IMPROVEMENT to BUG have the most impact.


RQ3: Mitigation Strategy

• Remove issue reports where no source code files are modified.
• Remove issue reports which explicitly mention the buggy files in summary or description.
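The two filters above could be sketched as follows. The `IssueReport` structure, the field names, and the file-name matching rule are assumptions made for illustration; the slides do not describe how the filters were implemented. Treating `.java` as the source extension relies on all three studied systems being Java projects.

```python
import posixpath
from dataclasses import dataclass, field

# Assumption: only .java files count as source code.
SOURCE_EXTENSIONS = (".java",)


@dataclass
class IssueReport:
    """Hypothetical record for one issue report and its fixing change."""
    summary: str
    description: str
    modified_files: list = field(default_factory=list)


def touches_source(report):
    """True if the fix modified at least one source code file."""
    return any(p.endswith(SOURCE_EXTENSIONS) for p in report.modified_files)


def mentions_buggy_file(report):
    """True if the summary or description names a modified source file.

    Matches on the file name without its extension (e.g. "Cache" for
    "src/Cache.java"); this simple substring check can over-match
    short class names.
    """
    text = f"{report.summary} {report.description}".lower()
    for path in report.modified_files:
        if not path.endswith(SOURCE_EXTENSIONS):
            continue
        stem = posixpath.splitext(posixpath.basename(path))[0].lower()
        if stem and stem in text:
            return True
    return False


def filter_reports(reports):
    """Keep reports that change source code and do not name the buggy files."""
    return [r for r in reports
            if touches_source(r) and not mentions_buggy_file(r)]


# Toy reports, invented for the example.
reports = [
    IssueReport("Typo in user guide", "Fix wording", ["docs/guide.txt"]),
    IssueReport("NPE in IndexWriter", "Crash on commit", ["src/IndexWriter.java"]),
    IssueReport("Crash on commit", "Stack trace attached", ["src/SegmentMerger.java"]),
]
kept = filter_reports(reports)
```

Here only the third report survives: the first changes no source file, and the second names `IndexWriter` in its summary.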


Conclusion

• Difference of -2.33%, 12.25% and 6.98% between MAP scores for the 3 projects.
• TEST to BUG and IMPROVEMENT to BUG have significant impact.
• Mitigation:
    • Remove issue reports which do not change source code files.
    • Remove issue reports which specify buggy files in the summary or description section.


Appendix (Statistical Analysis)

• Mann-Whitney-Wilcoxon (MWW) test: Given a significance level α = 0.001, if p-value < α, then the test rejects the null hypothesis.
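A minimal, dependency-free sketch of the decision rule above. It computes a two-sided MWW test with the normal approximation, average ranks for ties, and a tie-corrected variance, without a continuity correction. Which samples the study compared is not stated on the slide; the variable names below are placeholders, and a statistics library would normally be used instead of this hand-rolled version.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided MWW test via the normal approximation.

    Returns (U statistic of sample x, p-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y],
                    key=lambda pair: pair[0])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values pooled[i:j].
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # 2 * (1 - Phi(|z|))
    return u, p_value


def rejects_null(p_value, alpha=0.001):
    """Decision rule from the slide: reject the null hypothesis if p < alpha."""
    return p_value < alpha


# Placeholder samples, invented for the example.
u_stat, p = mann_whitney_u([1, 2, 3], [4, 5, 6])
```

For these tiny placeholder samples the p-value is close to 0.05, so `rejects_null(p)` is false at α = 0.001 even though the samples do not overlap; the normal approximation is only reliable for larger samples.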



Appendix (RQ2 Results)


Actual to Reported      HC      JB      LJ      Overall
None                    0.429   0.302   0.301   0.312
RFE to BUG              0.427   0.303   0.304   0.313
DOCUMENTATION to BUG    0.430   0.304   0.305   0.315
IMPROVEMENT to BUG      0.416   0.299   0.295   0.307
REFACTORING to BUG      0.428   0.301   0.301   0.311
BACKPORT to BUG         0.430   0.303   0.300   0.313
CLEANUP to BUG          0.429   0.303   0.303   0.314
SPEC to BUG             0.435   0.302   0.301   0.312
TASK to BUG             0.432   0.302   0.301   0.312
TEST to BUG             0.429   0.328   0.313   0.334
BUILD_SYSTEM to BUG     0.429   0.306   0.303   0.315
DESIGN_DEFECT to BUG    0.424   0.301   0.301   0.311
OTHERS to BUG           0.439   0.303   0.301   0.313
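As a quick check of the table, the sketch below ranks each misclassification type by how far its Overall MAP departs from the "None" baseline. The values are copied from the Overall column; the function name and the use of absolute difference as the impact measure are assumptions made for this example.

```python
# Overall MAP per misclassification type ("X to BUG"), from the table above.
OVERALL_MAP = {
    "None": 0.312,
    "RFE": 0.313,
    "DOCUMENTATION": 0.315,
    "IMPROVEMENT": 0.307,
    "REFACTORING": 0.311,
    "BACKPORT": 0.313,
    "CLEANUP": 0.314,
    "SPEC": 0.312,
    "TASK": 0.312,
    "TEST": 0.334,
    "BUILD_SYSTEM": 0.315,
    "DESIGN_DEFECT": 0.311,
    "OTHERS": 0.313,
}


def impact_ranking(scores, baseline="None"):
    """Sort types by absolute MAP change from the baseline, largest first."""
    base = scores[baseline]
    deltas = {name: round(value - base, 3)
              for name, value in scores.items() if name != baseline}
    # Ties are broken alphabetically so the order is deterministic.
    return sorted(deltas.items(), key=lambda item: (-abs(item[1]), item[0]))


ranking = impact_ranking(OVERALL_MAP)
top_two = [name for name, _ in ranking[:2]]
```

The two largest changes are TEST (+0.022) and IMPROVEMENT (-0.005), which matches the RQ2 finding that these two types have the most impact.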