
Mining Software Repositories (MSR) 2010 presentation


Predicting the Severity of a Reported Bug

Ahmed Lamkanfi, Serge Demeyer | Emanuel Giger | Bart Goethals
Ansymo | s.e.a.l. | ADReM

Proceedings of the 2010 7th IEEE Working Conference on Mining Software Repositories, p. 1-10


Severity of a bug is important

✓ Critical factor in deciding how soon it needs to be fixed, i.e. when prioritizing bugs


Priority is business


Severity is technical

✓ Severity varies:
➡ trivial, minor, normal, major, critical and blocker
➡ clear guidelines exist to classify the severity of bug reports

✓ Both a short and a longer description of the problem

✓ Bugs are grouped according to products and components
➡ e.g., plug-ins and bookmarks are components of the product Firefox

Can we accurately predict the severity of a reported bug by analyzing its textual descriptions?

Also the following questions:
➡ Potential indicators?
➡ Short versus long description?
➡ Per component versus cross-component?


Approach

We use text mining to classify bug reports

• Bayesian classifier: based on the probabilistic occurrence of words
• training and evaluation period
• in first instance, per component

Non-severe bugs (trivial, minor)
Severe bugs (major, critical, blocker)
Default (normal): undecided
Evaluation of the approach:
✓ precision and recall

Cases drawn from the open-source community:
✓ Mozilla, Eclipse and GNOME
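For reference, both measures are computed per class from the counts of true positives (TP), false positives (FP) and false negatives (FN) over the evaluation period:

```latex
\text{precision} = \frac{TP}{TP + FP}
\qquad
\text{recall} = \frac{TP}{TP + FN}
```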


Results

How does the basic approach perform?
➡ per component and using the short description

component            Non-severe           Severe
                     precision   recall   precision   recall
Mozilla: Layout      0.701       0.785    0.752       0.653
Mozilla: Bookmarks   0.692       0.703    0.698       0.687
Eclipse: UI          0.707       0.633    0.668       0.738
Eclipse: JDT-UI      0.653       0.714    0.685       0.621
GNOME: Calendar      0.828       0.783    0.794       0.837
GNOME: Contacts      0.767       0.706    0.728       0.785

What keywords are good indicators of severity?
(terms shown in stemmed form)

Mozilla Firefox - General
➡ non-severe: inconsist, favicon, credit, extra, consum, licens, underlin, typo, inspector, titlebar
➡ severe: fault, machin, reboot, reinstal, lockup, seemingli, perman, instantli, segfault, compil

Eclipse JDT UI
➡ non-severe: deprec, style, runnabl, system, cce, tvt35, whitespac, node, put, param
➡ severe: hang, freez, deadlock, thread, slow, anymor, memori, tick, jvm, adapt

GNOME Mailer
➡ non-severe: mnemon, outbox, typo, pad, follow, titl, high, acceler, decod, reflec
➡ severe: deadlock, sigsegv, relat, caus, snapshot, segment, core, unexpectedli, build, loop
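Such lists can be read off the probabilities the classifier learns. A sketch of one way to do that, assuming the scikit-learn vectorizer/clf pair from the earlier sketch (the helper name top_indicators is ours): rank terms by the gap between their per-class log-probabilities.

```python
import numpy as np

def top_indicators(vectorizer, clf, n=10):
    """List the n terms the trained MultinomialNB model associates
    most strongly with each class, by difference in log P(term | class)."""
    terms = np.array(vectorizer.get_feature_names_out())
    diff = clf.feature_log_prob_[1] - clf.feature_log_prob_[0]
    order = np.argsort(diff)
    non_severe = terms[order[:n]].tolist()       # most non-severe-leaning
    severe = terms[order[-n:]][::-1].tolist()    # most severe-leaning
    return non_severe, severe
```

Note that the paper's lists contain stemmed terms ("memori", "seemingli"); reproducing them faithfully would also require a stemmer, e.g. a Porter stemmer, in the vectorizer's preprocessing, which this sketch omits.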

How does the approach perform when using the longer description?

component                 Non-severe           Severe
                          precision   recall   precision   recall
Mozilla: Layout           0.583       0.961    0.890       0.314
Mozilla: Bookmarks        0.536       0.963    0.820       0.166
Mozilla: Firefox general  0.578       0.948    0.856       0.308
Eclipse: UI               0.548       0.976    0.892       0.197
Eclipse: JDT-UI           0.547       0.973    0.881       0.195
Eclipse: JDT-Text         0.570       0.988    0.955       0.257

➡ recall for severe bugs drops sharply: the longer description is a weaker basis for prediction

How does the approach perform when combining bugs from different components?

            Non-severe           Severe
project     precision   recall   precision   recall
Mozilla     0.704       0.750    0.733       0.685
Eclipse     0.693       0.553    0.628       0.755
GNOME       0.817       0.737    0.760       0.835

Much larger training set necessary
✓ ± 2000 reports instead of ± 500 per severity!

Conclusions

✓ It is possible to predict the severity of a reported bug
✓ The short description is the better source for predictions
✓ The cross-component approach works, but requires more training samples