Mining Software Repositories (MSR) 2010 presentation

Predicting the Severity of a Reported Bug

Ahmed Lamkanfi, Serge Demeyer (Ansymo) | Emanuel Giger (s.e.a.l.) | Bart Goethals (ADReM)

Proceedings of the 2010 7th IEEE Working Conference on Mining Software Repositories, p.1-10

Severity of a bug is important

✓ Critical factor in deciding how soon it needs to be fixed, i.e. when prioritizing bugs

Priority is business

✓ Severity varies:

➡ trivial, minor, normal, major, critical and blocker

➡ clear guidelines exist to classify severity of bug reports

✓ Both a short and longer description of the problem

✓ Bugs are grouped according to products and components

➡ e.g. plug-ins and bookmarks are components of the product Firefox

Can we accurately predict the severity of a reported bug by analyzing its textual descriptions?

Also the following questions:

Potential indicators?

Short versus long description?

Per component versus cross-component?

Approach

We use text mining to classify bug reports

• Bayesian classifier: based on the probabilistic occurrence of words

• training and evaluation period

• in first instance, per component
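The classifier described above can be sketched as follows. This is only a minimal illustration: the slides say no more than "Bayesian classifier: based on the probabilistic occurrence of words", so the multinomial variant, the add-one smoothing, the `tokenize` helper, the `NaiveBayesSeverity` class and the four training reports are all assumptions made for the example, not the authors' implementation.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a description and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesSeverity:
    """Multinomial Naive Bayes with add-one smoothing (an assumed variant)."""

    def fit(self, descriptions, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter(labels)        # label -> number of reports
        for text, label in zip(descriptions, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior of the class
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Made-up training reports, for illustration only (not data from the study).
train = [
    ("browser crashes with segfault on startup", "severe"),
    ("deadlock when opening mail folder", "severe"),
    ("typo in preferences dialog title", "non-severe"),
    ("favicon not aligned in titlebar", "non-severe"),
]
model = NaiveBayesSeverity().fit(*zip(*train))
prediction = model.predict("segfault after deadlock")
```

Training one such model per component would correspond to the "per component" setting; pooling reports from several components into one training set would correspond to the cross-component setting.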

Non-severe bugs (trivial, minor)

Severe bugs (major, critical, blocker)

Default (normal)
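The grouping above amounts to a small mapping from the six severity levels to two classes. The sketch below assumes that reports left at the default "normal" level are set aside rather than assigned to either class; the function name `to_binary_label` and the use of `None` for skipped reports are illustrative choices.

```python
SEVERE = {"major", "critical", "blocker"}
NON_SEVERE = {"trivial", "minor"}


def to_binary_label(severity):
    """Map a severity level to 'severe' or 'non-severe'.

    Returns None for the default level 'normal' (assumed here to be
    left out of training and evaluation).
    """
    level = severity.strip().lower()
    if level in SEVERE:
        return "severe"
    if level in NON_SEVERE:
        return "non-severe"
    if level == "normal":
        return None
    raise ValueError(f"unknown severity level: {severity!r}")


# Hypothetical reports given as (severity, short description) pairs.
reports = [("critical", "crash on exit"), ("normal", "odd layout"), ("minor", "typo")]
labelled = [(to_binary_label(s), d) for s, d in reports if to_binary_label(s)]
```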

Evaluation of the approach:

✓ precision and recall
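Precision and recall are computed separately for each class: precision is the fraction of reports predicted as a class that really belong to it, and recall is the fraction of reports of a class that the classifier finds. A minimal sketch follows; the function name and the toy labels are made up for illustration.

```python
def precision_recall(actual, predicted, positive):
    """Return (precision, recall) for the class named by `positive`."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels, not results from the study.
actual = ["severe", "severe", "severe", "non-severe", "non-severe"]
predicted = ["severe", "severe", "non-severe", "severe", "non-severe"]

severe_scores = precision_recall(actual, predicted, "severe")
non_severe_scores = precision_recall(actual, predicted, "non-severe")
```

Reporting both classes, as the result tables do, shows whether a classifier trades recall on one class for recall on the other.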

Cases drawn from the open-source community

✓ Mozilla, Eclipse and GNOME

Results

How does the basic approach perform?

➡ per component and using the short description

                      Non-severe          Severe
Component             precision  recall   precision  recall
Mozilla: Layout       0.701      0.785    0.752      0.653
Mozilla: Bookmarks    0.692      0.703    0.698      0.687
Eclipse: UI           0.707      0.633    0.668      0.738
Eclipse: JDT-UI       0.653      0.714    0.685      0.621
GNOME: Calendar       0.828      0.783    0.794      0.837
GNOME: Contacts       0.767      0.706    0.728      0.785

What keywords are good indicators of severity?

Component: Mozilla Firefox - General

Non-severe: inconsist, favicon, credit, extra, consum, licens, underlin, typo, inspector, titlebar

Severe: fault, machin, reboot, reinstal, lockup, seemingli, perman, instantli, segfault, compil

Component: Eclipse JDT UI

Non-severe: deprec, style, runnabl, system, cce, tvt35, whitespac, node, put, param

Severe: hang, freez, deadlock, thread, slow, anymor, memori, tick, jvm, adapt

Component: GNOME Mailer

Non-severe: mnemon, outbox, typo, pad, follow, titl, high, acceler, decod, reflec

Severe: deadlock, sigsegv, relat, caus, snapshot, segment, core, unexpectedli, build, loop

How does the approach perform when using the longer description?

                            Non-severe          Severe
Component                   precision  recall   precision  recall
Mozilla: Layout             0.583      0.961    0.890      0.314
Mozilla: Firefox general    0.578      0.948    0.856      0.308
Eclipse: UI                 0.548      0.976    0.892      0.197
Eclipse: JDT-UI             0.547      0.973    0.881      0.195
Eclipse: JDT-Text           0.570      0.988    0.955      0.257

How does the approach perform when combining bugs from different components?

             Non-severe          Severe
Project      precision  recall   precision  recall
Mozilla      0.704      0.750    0.733      0.685
Eclipse      0.693      0.553    0.628      0.755
GNOME        0.817      0.737    0.760      0.835

Much larger training set necessary

✓ ± 2000 reports instead of ± 500 per severity!

Conclusions

✓ It is possible to predict the severity of a reported bug

✓ The short description is a better source for predictions

✓ The cross-component approach works, but requires more training samples
