Software Testing: Models, Patterns, Tools
Guest Lecture, UIC CS 540
November 16, 2010
Robert V. Binder
Overview
• Test design pattern fly-by
• Levels of testing
• Case study: CBOE Direct
• Q & A
TEST DESIGN PATTERNS
Test Design Patterns

• Software testing, c. 1995
– A large and fragmented body of knowledge
– Few ideas about testing OO software
• Challenges
– Re-interpret a wealth of knowledge for OO
– Address unique OO considerations
– Systematic presentation
– Uniform analytical framework
• Patterns looked like a useful schema
– Existing templates didn't address unique testing issues
Some Footprints
• 1995 Design Patterns
• 1995 Beizer, Black Box Testing
• 1995 Firesmith PLOOT
• 1995 McGregor
• 1999 TOOSMPT
• 2000 Tutorial Experiment
• 2001 POST Workshops (4)
• 2003 Briand's Experiments
• 2003 Dot Net Test Objects
• 2003 Microsoft Patterns Group
• 2004 Java Testing Patterns
• 2005 JUnit Anti Patterns
• 2007 Test Object Anti Patterns
• 2007 Software QS-TAG
Test Design Patterns
• Pattern schema for test design
– Methods
– Classes
– Package and System Integration
– Regression
– Test Automation
– Oracles
Test Design Patterns

• Pattern schema for test design:
– Name/Intent
– Context
– Fault Model
– Strategy
– Entry Criteria
– Exit Criteria
– Consequences
– Known Uses
– Test Model
– Test Procedure
– Oracle
– Automation
Test Design Patterns
• Method Scope
– Category-Partition
– Combinational Function
– Recursive Function
– Polymorphic Message
• Class/Cluster Scope
– Invariant Boundaries
– Modal Class
– Quasi-Modal Class
– Polymorphic Server
– Modal Hierarchy
Modal Class: Implementation and Test Models
[Figure: UML class diagram and state models. ThreePlayerGame extends TwoPlayerGame. TwoPlayerGame exposes p1_Start( ), p1_WinsVolley( ), p1_IsWinner( ), p1_IsServer( ), p1_Points( ), the corresponding p2_ methods, and constructor/destructor; ThreePlayerGame adds the p3_ equivalents. The statechart runs alpha -> Game Started -> Player 1/2/3 Served -> Player 1/2/3 Won -> omega. While the server's score is below 20, a winning volley adds a point and keeps play going (e.g. p1_WinsVolley( ) [this.p1_Score( ) < 20] / this.p1AddPoint( ) simulateVolley( )); at exactly 20 the transition adds the final point and enters the Won state, where pN_IsWinner( ) returns TRUE.]
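To make the Modal Class pattern concrete, here is a minimal JUnit 4 sketch that drives one round-trip path through the TwoPlayerGame state model above, from alpha to Player 1 Won. It assumes a TwoPlayerGame class with exactly the interface in the diagram; per the guards shown, a score below 20 accumulates points and the win at 20 moves to the Won state, so 21 winning volleys end the game.

import static org.junit.Assert.*;
import org.junit.Test;

// Sketch only: assumes a TwoPlayerGame implementation with the
// interface shown in the diagram above.
public class TwoPlayerGameStateTest {

    @Test
    public void player1WinsAfter21Volleys() {
        TwoPlayerGame game = new TwoPlayerGame(); // alpha -> Game Started
        game.p1_Start();                          // -> Player 1 Served
        for (int i = 0; i < 21; i++) {
            game.p1_WinsVolley();                 // score < 20 adds a point;
        }                                         // at score == 20 -> Player 1 Won
        assertTrue(game.p1_IsWinner());
        assertEquals(21, game.p1_Points());
        assertFalse(game.p2_IsWinner());
    }
}

A full Modal Class suite would cover every transition in the diagram, including the guarded pairs and the illegal-event cases, not just this single path.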
Test Plan and Test Size

• K events
• N states
• With LSIFs: K × N tests
• Without LSIFs: K × N³ tests
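For a sense of scale (a hypothetical reading, assuming the ThreePlayerGame model below with roughly K = 17 events and N = 9 states: alpha, Game Started, three Served, three Won, and omega), K × N comes to about 150 tests, while K × N³ is over 12,000.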
[Figure: transition tree for the ThreePlayerGame state model, expanding paths from alpha/Game Started through the Served states to Player 1/2/3 Won and omega. Numbered test events: 1 ThreePlayerGame( ), 2 p1_Start( ), 3 p2_Start( ), 4 p3_Start( ), 5 p1_WinsVolley( ), 6 p1_WinsVolley( ) [this.p1_Score( ) < 20], 7 p1_WinsVolley( ) [this.p1_Score( ) == 20], 8 p2_WinsVolley( ), 9 p2_WinsVolley( ) [this.p2_Score( ) < 20], 10 p2_WinsVolley( ) [this.p2_Score( ) == 20], 11 p3_WinsVolley( ), 12 p3_WinsVolley( ) [this.p3_Score( ) < 20], 13 p3_WinsVolley( ) [this.p3_Score( ) == 20], 14 p1_IsWinner( ), 15 p2_IsWinner( ), 16 p3_IsWinner( ), 17 ~( ).]
Test Design Patterns

• Subsystem Scope
– Class Associations
– Round-Trip Scenarios
– Mode Machine
– Controlled Exceptions
• Reusable Components
– Abstract Class
– Generic Class
– New Framework
– Popular Framework
Test Design Patterns

• Intra-class Integration
– Small Pop
– Alpha-Omega Cycle
• Integration Strategy
– Big Bang
– Bottom up
– Top Down
– Collaborations
– Backbone
– Layers
– Client/Server
– Distributed Services
– High Frequency
Test Design Patterns

• System Scope
– Extended Use Cases
– Covered in CRUD
– Allocate by Profile
• Regression Testing
– Retest All
– Retest Risky Use Cases
– Retest Profile
– Retest Changed Code
– Retest Within Firewall
Test Oracle Patterns

• Smoke Test
• Judging
– Testing By Poking Around
– Code-Based Testing
– Post Test Analysis
• Pre-Production
• Built-in Test
• Gold Standard
– Custom Test Suite
– Random Input Generation
– Live Input
– Parallel System
• Reversing
• Simulation
• Approximation
• Regression
• Voting
• Substitution
• Equivalency
Test Automation Patterns

• Test Case Implementation
– Test Case/Test Suite Method
– Test Case/Test Suite Class
– Catch All Exceptions
• Test Control
– Server Stub
– Server Proxy
• Test Drivers
– TestDriver Super Class
– Percolate the Object Under Test
– Symmetric Driver
– Subclass Driver
– Private Access Driver
– Test Control Interface
– Drone
– Built-in Test Driver
Test Automation Patterns

• Test Execution
– Command Line Test Bundle
– Incremental Testing Framework (e.g. JUnit)
– Fresh Objects (see the sketch after this list)
• Built-in Test
– Coherence idiom
– Percolation
– Built-in Test Driver
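As a concrete illustration of Fresh Objects the way an incremental testing framework such as JUnit provides it: the framework builds a new fixture for every test method, so tests cannot couple through leftover state. The Account class and its methods below are placeholders, not from the lecture.

import static org.junit.Assert.*;
import org.junit.Before;
import org.junit.Test;

// Sketch: JUnit instantiates this class once per test method and
// re-runs setUp(), so each test gets fresh objects.
// Account is a hypothetical class used only for illustration.
public class AccountTest {

    private Account account;

    @Before
    public void setUp() {
        account = new Account(100); // fresh fixture before every test
    }

    @Test
    public void depositIncreasesBalance() {
        account.deposit(50);
        assertEquals(150, account.balance());
    }

    @Test
    public void withdrawDecreasesBalance() {
        account.withdraw(30);
        assertEquals(70, account.balance()); // unaffected by the deposit test
    }
}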
Percolation Pattern
• Enforces Liskov substitutability (see the sketch below)
• Implement with No Code Left Behind
[Figure: UML class diagram for Percolation. Base declares public Base( ), ~Base( ), foo( ), bar( ) and protected invariant( ), fooPre( ), fooPost( ), barPre( ), barPost( ). Derived1 adds fum( ) with fumPre( )/fumPost( ); Derived2 adds fee( ) with feePre( )/feePost( ). Each subclass overrides invariant( ) and the inherited pre/post checks.]
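A minimal Java sketch of one way to implement Percolation, assuming the Base/Derived1 structure in the diagram above (method names and local checks are placeholders). Each override ORs its precondition with the inherited one (it may only weaken) and ANDs its postcondition and invariant with the inherited ones (they may only strengthen), so the contract checks percolate up the hierarchy.

// Sketch of the Percolation built-in test pattern (assumed names).
abstract class Base {

    public final void foo() {
        assert fooPre() : "foo() precondition violated";
        fooBody();
        assert fooPost() && invariant() : "foo() postcondition/invariant violated";
    }

    protected abstract void fooBody();

    protected boolean fooPre()    { return true; } // base precondition
    protected boolean fooPost()   { return true; } // base postcondition
    protected boolean invariant() { return true; } // base class invariant
}

class Derived1 extends Base {

    @Override protected void fooBody() { /* subclass behavior */ }

    // Percolation: OR preconditions (may weaken), AND postconditions
    // and invariants (may strengthen).
    @Override protected boolean fooPre()    { return super.fooPre() || localFooPre(); }
    @Override protected boolean fooPost()   { return super.fooPost() && localFooPost(); }
    @Override protected boolean invariant() { return super.invariant() && localInvariant(); }

    private boolean localFooPre()    { return true; }
    private boolean localFooPost()   { return true; }
    private boolean localInvariant() { return true; }
}

With assertions enabled (java -ea), a subclass that strengthens a precondition or weakens a postcondition fails fast, which is exactly the Liskov violation the pattern is meant to expose.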
Ten Years After …
• Many new design patterns for hand-crafted test automation
– Elaborations of the Incremental Testing Framework (e.g. JUnit)
– Platform-specific or application-specific
– Narrow scope
• Few new test design patterns
• No new oracle patterns
• Attempts to generate tests from design patterns
• To date 10,000+ copies of TOOSMPT
What Have We Learned?

• Test patterns are effective for articulating insight and practice
– Require discipline to develop
– Support research and tool implementation
• They do not "work out of the box"
– Require discipline in application
– Enabling factors
• Irrelevant to the uninterested and undisciplined
– Low incremental benefit
– Readily available substitutes
• Broadly influential, but not compelling
TEST AUTOMATION LEVELS
What is good testing?
• Value creation (not technical merit)
– Effectiveness (reliability/quality increase)
– Efficiency (average cost per test)
• Levels
– 1: Testing by poking around
– 2: Manual Testing
– 3: Automated Test Script/Test Objects
– 4: Model-based
– 5: Full Test Automation
Each level is roughly a 10x improvement
Level 1: Testing by Poking Around

Manual "exploratory" testing:
• Low coverage
• Not repeatable
• Can't scale
• Inconsistent system under test
Level 2: Manual Testing

[Diagram: manual test design/generation feeds test setup and manual test input into the System Under Test; test results evaluation follows.]
• 1 test per hour
• Not repeatable
Level 3: Automated Test Script

[Diagram: manual test design/generation feeds test setup and test script programming; the scripts drive the System Under Test; test results evaluation follows.]
• 10+ tests per hour
• Repeatable
• High change cost
Level 4: Automated Model-based

[Diagram: model-based test design/generation feeds test setup and automatic test execution against the System Under Test; test results evaluation follows.]
• 1000+ tests per hour
• High fidelity
Level 5: Total Automation

[Diagram: model-based test design/generation, automated test setup, automatic test execution, and automated test results evaluation surround the System Under Test.]
• 10,000 tests per hour
MODEL-BASED TESTING OF CBOE DIRECT
CBOE Direct®

• Electronic technology platform built and maintained in-house by the Chicago Board Options Exchange (CBOE)
– Multiple trading models configurable by product
– Multiple matching algorithms (options, futures, stocks, warrants, single stock futures)
– Best features of screen-based trading and floor-based markets
• Electronic trading on CBOE, the CBOE Futures Exchange (CFX), the CBOE Stock Exchange (CBSX), and others
• As of April 2008:
– More than 188,000 listed products
– More than 3.8 billion industry quotes handled from OPRA on a peak day
– More than two billion quotes on a peak day
– More than 684,000 orders on a peak day
– More than 124,000 peak quotes per second
– Less than 5 ms response time for quotes
Development

• Rational Unified Process
– Six development increments
– 3 to 5 months each
– Test design/implementation in parallel with app dev
• Three+ years; version 1.0 live Q4 2001
• About 90 use cases, 650 KLOC of Java
• CORBA/IDL distributed objects
• Java (services and GUI), some XML
• Oracle DBMS
• HA Sun server farm
• Many legacy interfaces
Test Models Used
• Extended Use Case
– Defines feature usage profile
– Input conditions, output actions
• Mode Machine
– Use case sequencing
• Invariant Boundaries
Stealth Requirements Engineering
Behavior Model
• Extended Use Case pattern
[Table: Extended Use Case decision table with five test variants (columns 1–5). Condition rows cover widget selections (Query, Set Time, DEL) and Host Name entries (Pick Valid: T F T F DC; Enter Host Name). Action rows cover the Host Name display (No Change/Deleted/Added), the Host Time display, the CE Time display (Last Local Time vs. Host Time), and the Error Message row (F T F T F). The Relative Frequency row weights the variants 0.35, 0.20, 0.30, 0.10, and 0.05.]
• Test input conditions drive automatic test input generation
• Required actions drive automatic result checking
• Usage profile controls the statistical distribution of test cases (see the sketch below)
• Logic combinations control test input data selection
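To illustrate how a usage profile can control the statistical distribution of generated tests, here is a small Java sketch that samples one of the five variants above according to the relative frequencies 0.35, 0.20, 0.30, 0.10, 0.05. It illustrates the idea only; the lecture's actual generator was a Prolog simulator.

import java.util.Random;

// Sketch: weighted sampling of Extended Use Case variants under the
// usage profile above. Variant numbers 1..5 match the table columns.
public class ProfileSampler {

    private static final double[] FREQ = {0.35, 0.20, 0.30, 0.10, 0.05};
    private final Random rng = new Random();

    public int nextVariant() {
        double r = rng.nextDouble();
        double cum = 0.0;
        for (int i = 0; i < FREQ.length; i++) {
            cum += FREQ[i];
            if (r < cum) {
                return i + 1;
            }
        }
        return FREQ.length; // guard against floating-point rounding
    }
}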
Load Model
• Vary input rate, any quantifiable pattern – Arc
– Flat
– Internet Fractal
– Negative ramp
– Positive ramp
– Random
– Spikes
– Square wave
– Waves
[Figure: actual "Waves" load profile.]
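As a sketch of how one of the load shapes in the list above might be parameterized, the snippet below computes a "waves" input rate as a sinusoid around a base rate. The base rate, amplitude, and period are illustrative values, not figures from the lecture.

// Sketch: a "waves" load profile as events per second over time.
public final class WavesProfile {

    private static final double BASE_RATE = 30.0;  // events/second (assumed)
    private static final double AMPLITUDE = 20.0;  // swing around the base
    private static final double PERIOD_S  = 600.0; // one wave per 10 minutes

    public static double ratePerSecond(double tSeconds) {
        return BASE_RATE + AMPLITUDE * Math.sin(2.0 * Math.PI * tSeconds / PERIOD_S);
    }

    private WavesProfile() { }
}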
MBT Challenges/Solutions

• One-time sample not effective, but fresh test suites too expensive
– Solution: the simulator generates a fresh, accurate sample on demand
• Too expensive to develop expected results
– Solution: the oracle generates expected results on demand
• Too many test cases to evaluate
– Solution: the comparator automates checking
• Profile/requirements change
– Solution: incremental changes to the rule base
• SUT interfaces change
– Solution: a common agent interface (see the sketch below)
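One way to read the "common agent interface" solution is as a single contract that every test agent implements, so the simulator and comparator are insulated from changes in individual SUT interfaces. A hypothetical Java sketch (method names assumed, not from the lecture):

// Hypothetical common agent interface: each adapter wraps one SUT
// interface behind the same contract.
public interface TestAgent {

    /** Connect the agent to its SUT interface (endpoint format assumed). */
    void connect(String endpoint) throws Exception;

    /** Submit one simulator-generated event at its scheduled time. */
    void submit(String event) throws Exception;

    /** Return raw results for post-run checking by the oracle/comparator. */
    String drainResults();
}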
Simulator
• Discrete event simulation of user behavior
• 25 KLOC, Prolog
– Rule inversion
– “Speaks”
• Load Profile
– Time domain variation
– Orthogonal to operational profile
• Each event assigned a "port" and submit time
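A minimal sketch of the event-scheduling idea just described, assuming each generated event carries a port (test agent) and a submit time. The real simulator was 25 KLOC of Prolog, so this Java rendering is purely illustrative.

import java.util.Comparator;
import java.util.PriorityQueue;

// Sketch: order simulator events by submit time for dispatch to agents.
public class EventSchedule {

    public record Event(long submitMillis, int port, String payload) { }

    private final PriorityQueue<Event> queue =
            new PriorityQueue<>(Comparator.comparingLong(Event::submitMillis));

    public void add(Event e) {
        queue.add(e);
    }

    public Event next() {
        return queue.poll(); // earliest submit time first, or null if empty
    }
}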
Test Environment
• Simulator, etc. on typical desktop
• Dedicated, but reduced server farm
• Live data links
• ~10 client workstations for automatic test agents
– One adapter per System Under Test (SUT) interface
– Test Agents execute independently
• Distributed processing/serialization challenges
– Loosely coupled, best-effort strategy
– Embed server-side serialization monitor
Automated Run Evaluation

• Post-process evaluation
• Oracle accepts output of the simulator
• About 500 unique rules (20 KLOC of Prolog)
• Verification
– Splainer: result/rule backtracking tool (Prolog, 5 KLOC)
– Rule/run coverage analyzer
• Comparator (Prolog, 3 KLOC)
– Extracted transaction log
– Post-run database state
– End-to-end invariants
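The comparator step can be pictured as a set difference between the oracle's expected events and the SUT's transaction log. A toy Java sketch of that core check (the production comparator was 3 KLOC of Prolog; names here are assumptions):

import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy sketch of post-run comparison: expected vs. actual transactions.
public class RunComparator {

    public static List<String> diff(Set<String> expected, Set<String> actual) {
        List<String> report = new ArrayList<>();
        for (String e : expected) {
            if (!actual.contains(e)) {
                report.add("MISSING: " + e);
            }
        }
        for (String a : actual) {
            if (!expected.contains(a)) {
                report.add("UNEXPECTED: " + a);
            }
        }
        return report; // an empty report means the run passed
    }
}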
Daily Test Process
• Plan each day's test run
– Load profile, total volume
– Configuration/operational scenarios
• Run Simulator
– 100,000 events per hour
– FTP event files to test agents
• Test agents submit events
• Run Oracle/Comparator
• Prepare bug reports
1,000 to 750,000 unique tests per day
Technical Achievements
• AI-based user simulation generates test suites
• All inputs generated under an operational profile
• High-volume oracle and evaluation
• Every test run unique and realistic (about 200)
• Evaluated functionality and load response with fresh tests
• Effective control of many different test agents (COTS/custom; Java/4Test/Perl/SQL/proprietary)
Technical Problems
• Stamp coupling
– Simulator, Agents, Oracle, Comparator
• Refactoring rule relationships; Prolog limitations
• Configuration hassles
• Scale-up constraints
• Distributed schedule brittleness
• Horn Clause Shock Syndrome
Results
• Revealed about 1,500 bugs over two years
– ~ 5% showstoppers
• Five-person team, huge productivity increase
• Achieved proven high reliability
– Last pre-release test run: 500,000 events in two hours, no failures detected
– No production failures
Q & A