Prioritizing Test Cases for Regression Testing


Sebastian Elbaum University of Nebraska, Lincoln

Alexey Malishevsky Oregon State University

Gregg Rothermel Oregon State University

ISSTA 2000

Defining Prioritization

• Test scheduling

• During regression testing stage

• Goal: maximize a criterion/criteria
  – Increase rate of fault detection
  – Increase rate of coverage
  – Increase rate of fault likelihood exposure

Prioritization Requirements

• Definition of goal
  – Increase rate of fault detection

• Measurement criterion
  – % of faults detected over life of test suite

• Prioritization technique
  – Randomly
  – Total statement coverage
  – Probability of exposing faults

Previous Work

• Goal
  – Increase rate of fault detection

• Measurement
  – APFD: weighted average of the percentage of faults detected over the life of the test suite
  – Scale: 0 - 100 (higher means faster detection); a computation sketch follows below
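Concretely, APFD can be computed from the position at which each fault is first detected. Below is a minimal sketch using the standard formula APFD = 1 - (TF1 + ... + TFm)/(nm) + 1/(2n); the test ids and fault matrix are hypothetical.

```python
# Minimal sketch of the APFD computation. TF_i is the position of the
# first test exposing fault i, n = number of tests, m = number of faults.

def apfd(ordering, exposes):
    """ordering: test ids in execution order.
    exposes: dict mapping test id -> set of faults that test detects."""
    n = len(ordering)
    faults = set().union(*exposes.values())
    m = len(faults)
    first = {}
    for pos, test in enumerate(ordering, start=1):
        for fault in exposes.get(test, ()):
            first.setdefault(fault, pos)
    # Undetected faults are charged position n + 1 (a common convention).
    tf_sum = sum(first.get(f, n + 1) for f in faults)
    return 100 * (1 - tf_sum / (n * m) + 1 / (2 * n))  # 0-100 scale

exposes = {"t1": {"F1"}, "t2": set(), "t3": {"F2", "F3"},
           "t4": {"F1"}, "t5": set()}
print(apfd(["t3", "t1", "t2", "t4", "t5"], exposes))  # 83.3: faults found early
print(apfd(["t2", "t5", "t4", "t1", "t3"], exposes))  # 23.3: faults found late
```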

Previous Work (2)

[Figure: matrix of test cases A–E against faults 1–10, marking which faults each test exposes; APFD is compared for three test orderings: A-B-C-D-E, C-E-B-A-D, and E-D-C-B-A. Caption: Measuring Rate of Fault Detection]

Previous Work (3)

#  Label         Prioritize on
1  random        randomized ordering
2  optimal       optimized rate of fault detection
3  st-total      total coverage of statements
4  st-addtl      coverage of statements not yet covered
5  st-fep-total  probability of exposing faults
6  st-fep-addtl  probability of exposing faults, adjusted to consider previous test cases

Prioritization Techniques
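To make techniques 3 and 4 concrete, here is a minimal sketch of total and additional statement-coverage prioritization, assuming each test maps to the set of statement ids it covers (the data layout is an assumption, not the authors' tooling):

```python
# Sketch of the two coverage-based strategies above; test names and
# statement ids are hypothetical.

def prioritize_total(coverage):
    """st-total: order tests by total number of statements covered."""
    return sorted(coverage, key=lambda t: len(coverage[t]), reverse=True)

def prioritize_additional(coverage):
    """st-addtl: greedily pick the test covering the most statements not
    yet covered; once nothing new can be added, reset and repeat."""
    remaining = dict(coverage)
    covered, order = set(), []
    while remaining:
        best = max(remaining, key=lambda t: len(remaining[t] - covered))
        if remaining[best] - covered:
            order.append(best)
            covered |= remaining.pop(best)
        elif covered:
            covered = set()          # all remaining tests add nothing: new round
        else:
            order.extend(remaining)  # leftover tests cover no statements at all
            break
    return order

cov = {"t1": {1, 2, 3}, "t2": {3, 4}, "t3": {5}, "t4": {1, 2}}
print(prioritize_total(cov))       # ['t1', 't2', 't4', 't3']
print(prioritize_additional(cov))  # ['t1', 't2', 't3', 't4']
```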

Summary of Previous Work

• Performed empirical evaluation of general prioritization techniques
  – Even simple techniques generated gains

• Used statement level techniques

• Still room to improve

Research Questions

1. Can version specific TCP (test case prioritization) improve the rate of fault detection?

2. How do fine-granularity (statement level) techniques compare with coarse-granularity (function level) techniques?

3. Can the use of fault proneness improve the rate of fault detection?

Addressing RQ

• New family of prioritization techniques

• New series of experiments
  1. Version specific prioritization
     – Statement level
     – Function level
  2. Granularity
  3. Contribution of fault proneness

• Practical implications

Additional Techniques

#   Label            Prioritize on
7   fn-total         coverage of functions
8   fn-addtl         coverage of functions not yet covered
9   fn-fep-total     probability of exposing faults
10  fn-fep-addtl     probability of exposing faults, adjusted to consider previous tests
11  fn-fi-total      probability of fault likelihood
12  fn-fi-addtl      probability of fault likelihood, adjusted to consider previous tests
13  fn-fi-fep-total  combined probabilities of fault existence and fault exposure
14  fn-fi-fep-addtl  combined probabilities of fault existence/exposure, adjusted on previous coverage
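As one possible reading of technique 13 (fn-fi-fep-total), the sketch below weights each function's exposure probability by its fault index and sums per test; the paper's exact combination may differ, and the data shapes are assumptions:

```python
# Hypothetical sketch of fn-fi-fep-total scoring: weight each function's
# exposure probability (FEP) by its fault index (FI) and sum per test.

def fi_fep_total_order(tests, fi, fep):
    """tests: iterable of test ids; fi: {function: fault index};
    fep: {(test, function): probability the test exposes a fault there}."""
    def score(t):
        return sum(fi[f] * fep.get((t, f), 0.0) for f in fi)
    return sorted(tests, key=score, reverse=True)

fi = {"parse": 2.1, "emit": 0.4}
fep = {("t1", "parse"): 0.5, ("t2", "emit"): 0.9}
print(fi_fep_total_order(["t1", "t2"], fi, fep))  # ['t1', 't2']
```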

Family of Experiments

• 8 programs
• 29 versions
• 50 test suites per program
  – Branch coverage adequate
• 14 techniques
  – 2 control “techniques”: optimal & random
  – 4 statement level
  – 8 function level

“Generic” Factorial Design

[Diagram: factorial design crossing the 14 techniques with 8 programs, 29 versions, and 50 test suites, providing independence of code, of suite composition, and of changes]

Experiment 1a – Version Specific

RQ1: Prioritization works version-specifically at the statement level.
  – ANOVA: different average APFD among statement level techniques
  – Bonferroni: st-fep-addtl significantly better

Group  Technique     Value
A      st-fep-addtl  78.88
B      st-fep-total  76.99
B      st-total      76.30
C      st-addtl      74.44
       random        59.73

Experiment 1b – Version Specific

RQ1: Prioritization works version-specifically at the function level.
  – ANOVA: different average APFD among function level techniques
  – Bonferroni: fn-fep not significantly different from fn-total

Group  Technique     Value
A      fn-fep-addtl  75.59
A      fn-fep-total  75.48
A      fn-total      75.09
B      fn-addtl      71.66

Experiment 2: Granularity

• RQ2: Fine granularity has greater prioritization potential
  – Techniques at the statement level are significantly better than those at the function level
  – However, the “best” function level techniques are better than the “worst” statement level techniques

[Chart: average APFD (y-axis, 50–80) for the total, addtl, fep-total, fep-addtl, and random techniques, grouped by statement level vs. function level]

Experiment 3: Fault Proneness

• RQ3: Incorporating fault likelihood did not significantly increase APFD.
  – ANOVA: significant differences in average APFD values among all function level techniques
  – Bonferroni: surprisingly, techniques using fault likelihood did not rank significantly better

Group  Technique        Value
A      fn-fi-fep-addtl  76.34
A B    fn-fi-fep-total  75.92
A B    fn-fi-total      75.63
A B    fn-fep-addtl     75.59
A B    fn-fep-total     75.48
B      fn-total         75.09
C      fn-fi-addtl      72.62
C      fn-addtl         71.66

Reasons:
  – For small changes, fault likelihood does not seem to be worth it.
  – We believe it will be worthwhile for larger changes; further exploration is required.

Practical Implications

APFD:
  Optimal = 99%
  fn-fi-fep-addtl = 98%
  fn-total = 93%
  Random = 84%

Time:
  Optimal = 1.3
  fn-fi-fep-addtl = 2.0 (+0.7)
  fn-total = 11.9 (+10.6)
  Random = 16.5 (+15.2)

Conclusions

• Version specific techniques can significantly improve rate of fault detection during regression testing

• Technique granularity is noticeable
  – In general, statement level is more powerful, but
  – Advanced function level techniques are better than simple statement level techniques

• Fault likelihood may not be helpful

Working on …

• Controlling the threats
  – More subjects
  – Extending the model

• Discovery of additional factors

• Development of guidelines to choose “best” technique

Backup Slides

Threats

• Representativeness
  – Program
  – Changes
  – Tests and process

• APFD as a test efficiency measure

• Tool correctness

Experiment Subjects

Program     LOC   Avg. Test Suite Size
replace     516   19
printtok1   402   16
totinfo     346   7
printtok2   483   12
schedule1   299   8
schedule2   297   8
tcas        138   6
space       6218  155

FEP Computation

• Probability that a fault causes a failure

• Computed with mutation analysis
  – Insert mutants
  – Determine how many mutants are exposed by a test case

FEP(t, s) = (# of mutants of s exposed by t) / (# of mutants of s)
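A minimal sketch of this ratio, assuming a hypothetical kill matrix that records which tests expose which mutants of a statement:

```python
# Sketch of FEP from mutation results. kill_matrix maps each statement s
# to {mutant_id: set of tests that expose (kill) that mutant}; this data
# layout is an assumption, only the ratio matches the slide's formula.

def fep(test, stmt, kill_matrix):
    mutants = kill_matrix[stmt]          # mutants inserted at statement s
    if not mutants:
        return 0.0
    exposed = sum(1 for killers in mutants.values() if test in killers)
    return exposed / len(mutants)        # fraction of s's mutants t exposes

kill_matrix = {"s1": {"m1": {"t1", "t2"}, "m2": {"t2"}, "m3": set()}}
print(fep("t2", "s1", kill_matrix))  # 2 of 3 mutants exposed -> 0.666...
```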

FI Computation

• Fault likelihood

• Associated with measurable software attributes

• Complexity metrics
  – Size, control flow, and coupling
  – Generated fault index via principal component analysis
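A hedged sketch of how such a fault index could be derived: standardize per-function complexity metrics and project onto the first principal component. The specific metrics and index construction here are assumptions, not the paper's exact procedure.

```python
# Hypothetical fault index: PCA (via SVD) over standardized complexity
# metrics; the first component's score serves as the per-function index.
import numpy as np

def fault_index(metrics):
    """metrics: (n_functions x n_metrics) array of complexity measures."""
    X = np.asarray(metrics, dtype=float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each metric
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    pc1 = vt[0]
    if pc1.sum() < 0:       # PCA sign is arbitrary; orient so larger
        pc1 = -pc1          # metric values yield larger indexes
    scores = X @ pc1
    return scores - scores.min()               # shift indexes to be >= 0

m = np.array([[120, 14, 5],    # function A: size, control flow, coupling
              [40,   3, 1],    # function B
              [300, 25, 9]])   # function C
print(fault_index(m))          # larger value = higher fault likelihood
```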

Overall

Group  Technique        Value
A      optimal          94.24
B      st-fep-addtl     78.88
C      st-fep-total     76.99
C D    fn-fi-fep-addtl  76.34
C D    st-total         76.30
D E    fn-fi-fep-total  75.92
D E    fn-fi-total      75.63
D E    fn-fep-addtl     75.59
D E    fn-fep-total     75.48
E F    fn-total         75.09
F      st-addtl         74.44
G      fn-fi-addtl      72.62
G      fn-addtl         71.66
H      random           59.73
