View
5
Download
0
Category
Preview:
Citation preview
Integrating Artificial Intelligenceiin
Software Testing
Roni Stern and Meir Kalech, ISE department, BGU
Niv Gafni, Yair Ofir and Eliav Ben-Zaken, Software Eng., BGU
1
AbstractAbstract
Di i
Artificial Intelligence
DiagnosisPlanning
Software Engineering
TestingTesting
2
Testing Planningg g
Diagnosis
3
Testing is Important• Quite a few software development paradigms• All of them recognize the importance of testing• All of them recognize the importance of testing• There are several types of testing
i i bl k b (f i l) i– E.g., unit testing, black-box (functional) testing ….
Tester• Runs tests • Follows a test plan
Programmer• Write programs• Follows spec
4
• Follows a test plan• Goal: find bugs!
• Follows spec.• Goal: fix bugs!
Handling a BugHandling a Bug
1. Bugs are reported by the tester- “Screen A crashed on step 13 of test suite 4…”p
2. Prioritized bugs are assigned to programmers3. Bugs are diagnosed
- What caused the bug?g
4. Bugs are fixedH f ll- Hopefully…
5
Why is Debugging Hard?Why is Debugging Hard?
• The developer needs to reproduce the bug• Reproducing (correctly) a bug is non-trivialReproducing (correctly) a bug is non trivial
– Bug reports may be inaccurate or missingS f h h i l– Software may have stochastic elements• Bug occurs only once in a while
– Bugs may appear only in special cases
Let the tester provide more information!
6
7
Introducing AI!Introducing AI!es
ter Run a test suite
Fi d b mer Diagnose
Fi h bTe Find a bug
rogr
am
Fix the bug
Pr
Test
er Run a test suiteFind a bug
AI Compute Diagnoses m
mer Fix the bug
T Find a bugPlan next tests
Prog
ram
8
Test Diagnose and Plan (TDP)Test, Diagnose and Plan (TDP)r r
Test
er Run a test suiteFind a bug
AI Compute DiagnosesPl t t t am
mer Fix the bug
Plan next tests
Prog
ra
1. The tester runs a set of planned tests (test suite)2. The tester finds a bug3. AI generates possible diagnoses4. If there is only one candidate – pass to programmer5 Else AI plans new tests to prune candidates5. Else, AI plans new tests to prune candidates
9
DiagnosisDiagnosis
Find the reason of a problem given observed symptoms
Requirement: knowledge of the diagnosed systemGi b• Given by experts
• Learned by AI techniques
10
Model Based DiagnosisModel-Based Diagnosis
• Given: a model of how the system works
I f h t tInfer what went wrong
• Example:Example:
IF ok(battery) AND ok(ignition) THEN start(engine)
• What if the engine doesn’t start?
11
Where is MBD applied?Where is MBD applied?
i• Printers (Kuhn and de Kleer 2008)
• Vehicles(Pernestål et. al. 2006)
• Robotics (Steinbauer 2009)(Steinbauer 2009)
• Spacecraft - Livingstone (Williams and Nayak, 1996; Bernard et al., 1998)
….
12
Software is Difficult to ModelSoftware is Difficult to Model
• Need to code how the software should behave– Specs are complicated to modelp p
• Static code analysis ignores runtimeP l hi– Polymorphism
– Instrumentation
…
13
Zoltar [Ab t l 2011’]Zoltar [Abrue et. al. 2011’]
• Construct a model from the observations• Observations should include execution tracesObservations should include execution traces
– Functions visited during executionOb d b / b– Observed outcome: bug / no bug
• Weaker modelOk(function1) function1 outputs valid valueA bug entails that at least one comp was not OkA bug entails that at least one comp was not Ok
14
Execution MatrixExecution Matrix
• A key component in Zoltar is the execution matrix• It is built from the observed execution traces
– Observation 1 (BUG) : F1 F5 F6 – Observation 2 (BUG) : F2 F5( )– Observation 3 (OK) : F2 F6 F7 F8
F1 F2 F3 F4 F5 F6 F7 F8 BugF1 F2 F3 F4 F5 F6 F7 F8 Bug
Obs1 1 0 0 0 1 1 0 0 1
Obs2 0 1 0 0 1 0 0 0 1
Obs3 0 1 0 0 0 1 1 1 0Obs3 0 1 0 0 0 1 1 1 0
15
Diagnosis = Hitting Sets of ConflictsDiagnosis = Hitting Sets of Conflicts
In every BUG trace at least one function is faulty• Observations:Observations:
– Observation 1 (BUG) : F1 F5 F6 Ob i 2 (BUG) F2 F5– Observation 2 (BUG) : F2 F5
• Conflicts:– Ok(F1) AND Ok(F5) AND Ok(F6)
Ok(F2) AND Ok(F5) Hitting sets of Conflicts– Ok(F2) AND Ok(F5)
• Possible diagnoses: {F5} ,{F1,F2},{F6,F2}
Hitting sets of Conflicts
16
Software Diagnosis with ZoltarSoftware Diagnosis with Zoltar
P i itiBug Report Prioritize Diagnosesg
ZoltarSet of
Candidate Diagnoses
17
Plan More TestsPlan More Tests
Bug Report
AIEngineSet of
Possible Diagnoses
Suggest New Test to P C did
18
Prune Candidates
Plan More TestsPlan More Tests
Bug Report
AIEngine A single diagnosisA single diagnosis
Suggest New Test to P C did
19
Prune Candidates
Pacman ExamplePacman Example
20
Pacman ExamplePacman Example
21
Pacman ExamplePacman Example
22
Pacman ExamplePacman Example
23
Pacman ExamplePacman Example
24
Pacman ExamplePacman Example
25
Pacman ExamplePacman Example
26
Execution TraceExecution Trace
F1 F2 F3F1Move
F2Stop
F3Eat Pill
27
Which Diagnosis is Correct?Which Diagnosis is Correct?
28
KnowledgeKnowledge
• Most probable cause: Eat Pill
29
KnowledgeKnowledge
• Most probable cause: Eat Pill• Most easy to check: StopMost easy to check: Stop
30
ObjectiveObjective
Plan a sequence of testsf d h b dto find the best diagnosis
with minimal tester actionswith minimal tester actions
31
1 Highest Probability (HP)1. Highest Probability (HP)
h b f f• Compute the prob. of every function Ci
= Sum of prob. of candidates containing Cip g i
• Plan a test to check the most probable function
0.2F1
0.4 0.2
0 4
0.4F5F2
F6
32
0.4
1 Highest Probability (HP)1. Highest Probability (HP)
h b f f• Compute the prob. of every function Ci
= Sum of prob. of candidates containing Cip g i
• Plan a test to check the most probable function
0.2F1
0.4 0.6
0 4F5
F2
F6
33
0.4
2 Lowest Cost (LC)2. Lowest Cost (LC)
d l f h ( )• Consider only functions Ci with 0<P(Ci)<1• Test the closest function from this set
F1
F5F2
F6
34
3 Entropy-based3. Entropy-based• A test α = a set of components
Ω set of candidates that assume α will pass– Ω+ = set of candidates that assume α will pass – Ω- = set of candidates that assume α will fail– Ω? = set of candidates that are indifferent to α?
• Prob. α will pass = prob. of candidates in Ω+
• Choose test with lowest Entropy(Ω+ ,Ω ,Ω?)py( + , - , ?)= highest information gain
0.2F1
0.4 0.2
0 4
0.4F5F2
F6
35
0.4
4 Planning Under Uncertainty4. Planning Under Uncertainty
• Outcome of test is unknownwe would like to minimize expected cost
• Formulate as Markov Decision Process (MDP)– States: set of performed tests and their outcomesStates: set of performed tests and their outcomes– Actions: Possible tests – Transition: Pr(Ω ) Pr(Ω ) and Pr(Ω?)Transition: Pr(Ω+), Pr(Ω-) and Pr(Ω?)– Reward (cost): Cost of performed tests
• Current solver uses myopic sampling• Current solver uses myopic sampling
36
Preliminary Experimental ResultsPreliminary Experimental Results
• Synthetic code• SetupSetup
– Generated random call graphs with 300 nodesG d f h d ll h– Generate code from the random call graphs
– Injected 10/20 random bugs – Generate 15 random initial tests
• Run until a diagnosis with prob >0 9 is found• Run until a diagnosis with prob.>0.9 is found– Results on 100 instances
37
Preliminary Experimental ResultsPreliminary Experimental Results
• HP and Entropy perform worse than (LC) and MDP• probability based (HP and Entropy) perform worse than the cost based approach. • MDP which considers cost and probability outperforms the others
38
• MDP which considers cost and probability outperforms the others• 20 bugs is more costly for all algorithms than 10 bugs
Preliminary Experimental Results NUnitsPreliminary Experimental Results - NUnits
• Real code• Well-known testing framework for C#Well known testing framework for C#• Setup
– Generated call graphs with 302 nodes from NUnits– Injected 10,15,20,25,30 random bugs j g– Generate 15 random initial tests
• Run until a diagnosis with prob >0 9 is found• Run until a diagnosis with prob.>0.9 is found– Results on 110 instances
39
Preliminary Experimental Results NUnitsPreliminary Experimental Results - NUnits
• Similar results• The cost increases with the probability bound
40
p y• MDP which considers cost and probability outperforms the others
ConclusionConclusion
• The test, diagnose and plan (TDP) paradigm:– AI diagnoses observed bug with MBD algorithmg g g– AI plans further tests to find correct diagnosis
E t t i AI• Empowers tester using AI– Bug diagnosis done by tester+AI
41
SCM Server Logs Source CodeSCM Server Logs Source Code
iAI Engine
QA Tester Developer
42
Recommended