FROM BUGS TO DECISION SUPPORT - Selected Research Highlights Markus Borg SSE Meeting Nov 10, 2015


FROM BUGS TO

DECISION SUPPORT

- Selected Research Highlights

Markus Borg

SSE Meeting Nov 10, 2015

PHD STUDENT POSITION

“Aligning RE and V&V”

Bjarnason, Runeson, Borg, et al. Challenges and Practices in Aligning Requirements with Verification and Validation: A Case Study of Six Companies, Empirical Software Engineering, 19(6), 2014.

EMSE 14

“LARGE-SCALE REQUIREMENTS TO TEST LINKING”

• Traceability recovery

• Establish trace links in a system after the fact

• “if two artifacts share much text, they are more likely to be associated by a link” - Giuliano Antoniol (2002)

TRACEABILITY RECOVERY TOOLS

LUND CONTRIBUTIONS

• Johan Natt och Dag’s PhD Thesis

• “Textual approach” to find similar requirements

• Tool: ReqSimile

• A systematic review of traceability recovery:

  • Overview of techniques, evaluations, and results

• https://sites.google.com/site/tracerepo/

TEXTUAL SIMILARITY FROM THE 60S

[Figure: issue reports are represented as vectors, and angles between them are calculated in vector space]
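
The vector space idea can be sketched in a few lines of Python. This is a minimal illustration, not the ReqSimile or Lucene implementation: it assumes plain term counts as vector weights, and the issue reports, identifiers, and function names are invented for the example.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Represent a text as a bag-of-words vector (term -> count)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, reports):
    """Return (report_id, score) pairs, most similar first."""
    q = term_vector(query)
    scored = [(rid, cosine_similarity(q, term_vector(text)))
              for rid, text in reports.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical issue reports, invented for illustration.
reports = {
    "ISSUE-1": "Phone crashes when opening the camera app",
    "ISSUE-2": "Battery drains quickly in standby mode",
    "ISSUE-3": "Camera app crashes on startup",
}
ranking = rank_by_similarity("camera crashes when app is opened", reports)
for report_id, score in ranking:
    print(f"{report_id}: {score:.2f}")
```

A smaller angle (cosine closer to 1) means the two texts share more terms. Production tools typically add term weighting and stemming on top of this.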

PREVIOUS FOCUS ON REQTS.

Borg, Runeson, and Ardö. Recovering from a Decade: A Systematic Mapping of Information Retrieval Approaches to Software Traceability, Empirical Software Engineering, 19(6), 2014.

EMSE 14

ISSUE REPORTS AS HUBS

Issue tracker

ISSUE MANAGEMENT MODEL

[Figure: issue management model with the labels New issue, Change control board, Developers, and Resolution, annotated with five questions]

Q1. Has this issue been reported before?

Q2. Who should investigate the issue?

Q3. Will this issue result in a code change?

Q4. How long will it take to correct this bug?

Q5. If we make a corrective code change, what is the impact?

Q1. HAS THIS ISSUE BEEN REPORTED BEFORE?

ISSUE DUPLICATE DETECTION

Merge

FINDING TEXTUAL DUPLICATES

• Per’s second most cited paper (ICSE’07 – 278 cit.)

• Applied ReqSimile approach to issue reports

• Evaluated at Sony Ericsson Mobile Communications


CONTINUED PER’S WORK

• Now a standard feature of issue trackers

  • Bugzilla, HP Quality Center, JIRA

• Replication

  • Apache Lucene search engine library

  • Issue reports from Android

• Fundamental for textual analysis in later thesis work

  • Good at finding similar issue reports

  • Highly scalable solution: fast!

Borg, Runeson, Johansson, and Mäntylä. A Replicated Study on Duplicate Detection: Using Apache Lucene to Search among Android Defects, In Proc. of the 8th Int’l Symp. on Empirical Software Engineering and Measurement (ESEM), 2014.

ESEM 14

NETWORK ANALYSIS OF ISSUE REPORTS

Borg, Pfahl, and Runeson. Analyzing Networks of Issue Reports, In Proc. of the 17th European Conf. on Software Maintenance and Reengineering (CSMR), 2013.

CSMR 13

NETWORKS ARE POWERFUL!

Fundamental for artifact ranking in later thesis work

Q2. WHO SHOULD INVESTIGATE THE ISSUE?


AUTOMATED ISSUE ASSIGNMENT

• Supervised machine learning

• Train on historical bugs
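
As a rough illustration of training on historical bugs, the sketch below fits a small multinomial naive Bayes classifier on past issue texts and their assigned teams. Naive Bayes is a stand-in chosen for brevity, not the ensemble approach used in the Ericsson study; the class name, team names, and issue texts are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesAssigner:
    """Multinomial naive Bayes over issue text, with add-one smoothing."""

    def fit(self, issues):
        # issues: iterable of (text, team) pairs from the issue tracker history
        self.team_counts = Counter()
        self.term_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, team in issues:
            self.team_counts[team] += 1
            for term in tokenize(text):
                self.term_counts[team][term] += 1
                self.vocabulary.add(term)
        return self

    def predict(self, text):
        total_issues = sum(self.team_counts.values())
        best_team, best_score = None, -math.inf
        for team, n_issues in self.team_counts.items():
            # Log prior: share of historical issues handled by this team
            score = math.log(n_issues / total_issues)
            denom = sum(self.term_counts[team].values()) + len(self.vocabulary)
            for term in tokenize(text):
                if term not in self.vocabulary:
                    continue  # ignore terms never seen in training
                score += math.log((self.term_counts[team][term] + 1) / denom)
            if score > best_score:
                best_team, best_score = team, score
        return best_team


# Hypothetical training history, invented for illustration.
history = [
    ("radio link drops during handover", "radio"),
    ("handover fails between cells", "radio"),
    ("database migration script times out", "storage"),
    ("slow query on database index", "storage"),
]
assigner = NaiveBayesAssigner().fit(history)
print(assigner.predict("handover drops on radio link"))
```

In practice the text would be combined with structured fields of the issue report, and the model would be retrained as new resolved issues arrive.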

Issue tracker

FEATURE SELECTION

• How to represent an issue report?

COLLABORATION WITH ERICSSON

• Ensemble learners for team assignment

• Features

  • 100 dominant terms

  • Development site

  • Submitter type

  • System version

  • Priority

Leif Jonsson

EXPERIMENTAL SETUP

[Figure labels: Ericsson, Company A, 4 x, > 50,000]

Jonsson, Borg, Broman et al. Automated Bug Assignment: Ensemble-based Machine Learning in Large-scale Industrial Contexts, Empirical Software Engineering, 2015.

EMSE 15

RESULTS

• Prediction accuracy in line with humans

  • But instantaneous!

• Requires at least 2,000 bug reports in the training set

A WORD OF WARNING…

• Some systems need fresh training data

Q3. WILL THIS ISSUE RESULT IN A CODE CHANGE?


MSC THESIS AT SONY

• Limited resources to deal with issues

• 1,000s of bugs reported in projects

• Resources to resolve roughly 10%

• Will the bug cause a code change?

HIGHEST PREDICTIVE POWER?

• Machine learning to find patterns

• Empirical validation of previous suspicions

1. mastership

2. fix for

3. ratl mastership

4. externalsupplier

5. ratl keysite

6. project

7. proj id

8. attachment share saved

9. found during

10. business priority

11. found in product

12. found by

13. abc rank

14. detection

15. impact

16. occurrence

17. priority

18. is platform

19. qa state

Standard priority far down the list…

• Prediction accuracy 75%

Gulin and Olofsson. Development of a Decision Support System for Defect Reports, MSc Thesis, Lund University, 2014.

MSc Thesis 14

Q4. HOW LONG WILL IT TAKE TO CORRECT THIS BUG?


UNSUPERVISED MACHINE LEARNING

Automatic clustering of issue reports

REPLICATION OF RAJA (2013)

• “Textual clusters of issue reports have significantly different resolution times”

• Conceptual replication

  • Fully automatic clustering

  • Issue reports from large projects

  • Confirmed statistical differences

OPERATIONALIZATION

1. Put new issue in the right cluster

2. Predict resolution time from the cluster average
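
The two steps can be sketched as follows, assuming the clusters already exist. The sketch assigns a new issue to the cluster whose summed term vector is most similar by cosine, then returns that cluster's mean resolution time. The similarity measure, the cluster names, the texts, and the resolution times in days are all assumptions made for illustration, not the setup of the replicated study.

```python
import math
import re
from collections import Counter
from statistics import mean


def term_vector(text):
    """Bag-of-words vector (term -> count)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(texts):
    """Sum of a cluster's term vectors (only the direction matters for cosine)."""
    total = Counter()
    for text in texts:
        total.update(term_vector(text))
    return total


def predict_resolution_days(new_text, clusters):
    """clusters: name -> list of (issue_text, resolution_days)."""
    new_vec = term_vector(new_text)
    # Step 1: put the new issue in the most similar cluster
    best = max(
        clusters,
        key=lambda name: cosine(new_vec, centroid([t for t, _ in clusters[name]])),
    )
    # Step 2: predict resolution time as the cluster average
    return best, mean(days for _, days in clusters[best])


# Hypothetical, pre-computed clusters, invented for illustration.
clusters = {
    "ui": [("button label misaligned in settings screen", 2),
           ("wrong color on settings button", 4)],
    "crash": [("application crashes on startup after update", 10),
              ("crash when syncing contacts after update", 20)],
}
prediction = predict_resolution_days("settings screen button overlaps text", clusters)
print(prediction)
```

The prediction is only as good as the separation between clusters, which is why the confirmed statistical differences in resolution time matter.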

Assar, Borg, and Pfahl. Using Text Clustering to Predict Defect Resolution Time: A Conceptual Replication and an Evaluation of Prediction Accuracy, Empirical Software Engineering, 2015.

EMSE 15

Q5. IF WE MAKE A CORRECTIVE CHANGE, WHAT IS THE IMPACT?


De la Vara, Borg, Wnuk, and Moonen. Safety Evidence Change Impact Analysis in Practice, In revision, Transactions on Software Engineering, 2015.

TSE (In rev.)

RECOMMENDATION SYSTEM

• “a software application that aims to support users in their decision-making while interacting with large information spaces”

Decision-support system

TWO MAIN APPROACHES

RECOMMENDATIONS BASED ON HISTORICAL IMPACT (NON-CODE)

[Figure labels: Reqs., Tests]

Borg, Gotel, and Wnuk. Enabling Traceability Reuse for Impact Analyses: A Feasibility Study in a Safety Context, In Proc. of the Int’l WS on Traceability in Emerging Forms of Software Engineering (TEFSE), 2012.

EMSE 15

1. Mine issue tracker

2. Create network of previous impact

3. Index text with

4. Calculate centrality measures

IDENTIFY POTENTIAL IMPACT

Find similar issues using Apache Lucene

[Figure: linked artifacts Design Doc. X.Y, Req. X.Y, Test case UTC56, Req. Z.Y, Design Doc. X.Y]

Follow links to create candidate impact set

RANK THE POTENTIAL IMPACT

Use centrality measures to rank candidate impact

1. Requirement X.Y

2. Design Document X.Y

3. Test case UTC56

4. Design Document X.Y

5. Requirement Z.Y
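
The identify-then-rank flow can be sketched as below. This is not ImpRec's actual ranking function: it assumes the similar issues and their similarity scores come from a separate text search step, uses a simple degree centrality (the share of past issues that impacted an artifact), and weights it by similarity. The issue identifiers, scores, and trace links are hypothetical.

```python
from collections import defaultdict

# Hypothetical links mined from past impact analyses: issue -> impacted artifacts.
impact_links = {
    "ISSUE-10": ["Req. X.Y", "Design Doc. X.Y", "Test case UTC56"],
    "ISSUE-11": ["Req. X.Y", "Design Doc. X.Y"],
    "ISSUE-12": ["Req. Z.Y", "Req. X.Y"],
}


def degree_centrality(links):
    """Share of past issues that impacted each artifact."""
    counts = defaultdict(int)
    for artifacts in links.values():
        for artifact in set(artifacts):
            counts[artifact] += 1
    return {artifact: n / len(links) for artifact, n in counts.items()}


def recommend_impact(similar_issues, links, centrality):
    """Rank the candidate impact set.

    similar_issues: list of (issue_id, similarity) from the text search step.
    """
    scores = defaultdict(float)
    for issue_id, similarity in similar_issues:
        # Follow links from each similar issue to build the candidate set
        for artifact in links.get(issue_id, []):
            scores[artifact] += similarity * centrality[artifact]
    # Highest score first; ties broken alphabetically for a stable order
    return sorted(scores, key=lambda artifact: (-scores[artifact], artifact))


centrality = degree_centrality(impact_links)
# Assumed output of the similarity search for a new issue.
similar = [("ISSUE-10", 0.8), ("ISSUE-12", 0.5)]
recommendations = recommend_impact(similar, impact_links, centrality)
for rank, artifact in enumerate(recommendations, start=1):
    print(f"{rank}. {artifact}")
```

Artifacts that were impacted often in the past, and that are linked from issues close to the new one, rise to the top of the list.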

Borg and Runeson. Changes, Evolution and Bugs – Recommendation Systems for Issue Management, In Recommendation Systems in Software Engineering, Robillard, Maalej, Walker, and Zimmermann (Eds.), 2013.

RSSE book

ImpRec

https://github.com/mrksbrg/ImpRec

DELIVER THE RECOMMENDATIONS

TWO-UNIT CASE STUDY

• Deployed ImpRec in two teams

• Malmö and Bangalore

• Increases awareness and findability

• Helps project newcomers

Borg, Wnuk, Regnell, and Runeson. Supporting Change Impact Analysis Using a Recommendation System: An Industrial Case Study in a Safety-Critical Context, In submission, 2015.

In subm.

PARAMETER TUNING

• Finding a feasible parameter setting is difficult

• Presented a framework for doing it in R

  • Factorial designs

  • Response surface methodology
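
A full factorial design simply evaluates the tool at every combination of the chosen parameter levels. The sketch below shows the idea in Python rather than R and is not part of TuneR. The parameter names, levels, and response function are hypothetical; in a real tuning study the response would be a measured quality metric of the tool. Response surface methodology, which fits a model to such runs, is not shown.

```python
from itertools import product


def full_factorial(levels):
    """Yield every combination of parameter levels as a dict."""
    names = list(levels)
    for values in product(*(levels[name] for name in names)):
        yield dict(zip(names, values))


def tune(levels, response):
    """Evaluate the response at each design point and return the best setting."""
    runs = [(setting, response(setting)) for setting in full_factorial(levels)]
    best_setting, best_value = max(runs, key=lambda run: run[1])
    return best_setting, best_value, runs


# Hypothetical parameters and levels, invented for illustration.
levels = {"alpha": [0.1, 0.5, 0.9], "top_k": [5, 10]}


def response(setting):
    """Stand-in for running the tool and measuring its output quality."""
    return -(setting["alpha"] - 0.5) ** 2 - 0.01 * abs(setting["top_k"] - 10)


best_setting, best_value, runs = tune(levels, response)
print(best_setting, len(runs))
```

The number of runs grows multiplicatively with the number of parameters and levels, which is one reason to screen parameters before exploring them in detail.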

Borg. TuneR: A Framework for Tuning Software Engineering Tools with Hands-On Instructions in R, In revision, Journal of Software: Evolution and Process, 2015.

JSEP (In rev.)

SUMMARY


“Humans obscured by bug overload, but machine learning benefits from plentiful training data. Practitioners confirm value of developed tools.”

Tiny Transactions on Computer Science (@TinyToCS), Volume 3, 2015

Bug tracker

Machine Learning

Thank you!


mrksbrg.com

@mrksbrg