Kai Pan, Xintao Wu University of North Carolina at Charlotte Generating Program Inputs for Database...
- Slide 1
- Generating Program Inputs for Database Application Testing.
Kai Pan, Xintao Wu (University of North Carolina at Charlotte);
Tao Xie (North Carolina State University). 26th IEEE/ACM
International Conference on Automated Software Engineering,
Nov 11, 2011, Lawrence, Kansas
- Slide 2
- Background: functional testing; test generation; program inputs
- Slide 3
- Background: functional testing; test generation; program inputs;
database states
- Slide 4
- An example: program inputs and database
- Slide 5
- Motivation
- Slide 6
- Benefits of using an existing database state: it represents the
characteristics of real-world objects, helping detect faults that
could cause failures in real-world settings; it reduces the cost of
generating new database records.
- Slide 7
- Dynamic Symbolic Execution (DSE), also called concolic testing:
execute the program both concretely and symbolically. Collect the
constraints along the executed path as the path condition. Negate
part of the path condition and solve the new path condition to
steer execution down a new path. DSE tools exist for various
programming languages, e.g., Pex for .NET from Microsoft Research.
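The negate-and-solve loop described on this slide can be illustrated with a toy sketch. It is not how Pex works internally: the "program" below only records its branch outcomes, and a brute-force search over a small, hypothetical input domain stands in for a real constraint solver. The names `program`, `solve`, and `explore` are made up for the illustration.

```python
from itertools import product

def program(type_, zip_):
    """Toy program under test; returns the branch outcomes it took."""
    return (
        ("type == 0", type_ == 0),
        ("zip + 1 == 27695", zip_ + 1 == 27695),
    )

def solve(target_prefix, domain):
    """Stand-in for a constraint solver: brute-force search of a small domain."""
    for candidate in domain:
        if program(*candidate)[:len(target_prefix)] == target_prefix:
            return candidate
    return None

def explore(seed, domain):
    """Run an input, then negate each branch of its path to reach new paths."""
    seen, worklist, inputs = set(), [seed], []
    while worklist:
        current = worklist.pop()
        path = program(*current)
        if path in seen:
            continue
        seen.add(path)
        inputs.append(current)
        for i, (cond, outcome) in enumerate(path):
            # Keep the prefix before branch i, flip branch i, drop the rest.
            flipped = path[:i] + ((cond, not outcome),)
            candidate = solve(flipped, domain)
            if candidate is not None:
                worklist.append(candidate)
    return inputs, seen

# Hypothetical input domain, kept tiny so brute force is feasible.
domain = list(product([0, 1], range(27690, 27700)))
inputs, paths = explore((0, 27690), domain)
print(inputs)
```

A real solver reasons about the symbolic constraints instead of enumerating inputs, which is what makes the approach scale beyond toy domains.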
- Slide 8
- Motivation. Path condition: C1: query construction constraints
- Slide 9
- Motivation. Path condition: C1: query construction constraints;
C2: query/DB constraints
- Slide 10
- Motivation. Path condition: C1: query construction constraints;
C2: query/DB constraints; C3: result manipulation constraints
- Slide 11
- Motivation. Path condition: C1: query construction constraints;
C2: query/DB constraints; C3: result manipulation constraints.
Combined: C1 ∧ C2 ∧ C3
- Slide 12
- Motivation. Path condition: C1: query construction constraints;
C2: query/DB constraints; C3: result manipulation constraints.
Combined: C1 ∧ C2 ∧ C3 (a hard part)
- Slide 13
- Motivation: how to derive high-covering program input values
based on a given database state?
- Slide 14
- Outline: Background; Approach; Evaluation; Conclusion and future
work
- Slide 15
- SQL query forms. Fundamental structure: SELECT, FROM, WHERE,
GROUP BY, and HAVING clauses. SELECT select-list FROM from-list
WHERE qualification (GROUP BY grouping-list) (HAVING
group-qualification)
- Slide 16
- SQL query forms (contd). Nested query: a query with another
query embedded within it. A nested query can be unnested into an
equivalent single-level canonical query by transformation rules.
A nested query: SELECT S.sname FROM Sailors S WHERE EXISTS (
SELECT * FROM Reserves R WHERE R.bid=103 AND R.sid=S.sid).
Its canonical form: SELECT S.sname FROM Sailors S, Reserves R
WHERE R.sid=S.sid AND R.bid=103
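The equivalence between the nested query and its canonical form can be checked on a small example. The sketch below uses Python's built-in `sqlite3` module; the table contents are made up for the illustration. The results are compared as sets because the join form can return a sailor more than once when that sailor has several reservations for boat 103.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
    -- Hypothetical sample rows, not taken from the slides.
    INSERT INTO Sailors VALUES (1, 'Dustin'), (2, 'Lubber'), (3, 'Horatio');
    INSERT INTO Reserves VALUES (1, 103), (2, 102), (3, 103);
""")

nested = """
    SELECT S.sname FROM Sailors S
    WHERE EXISTS (SELECT * FROM Reserves R
                  WHERE R.bid = 103 AND R.sid = S.sid)
"""
canonical = """
    SELECT S.sname FROM Sailors S, Reserves R
    WHERE R.sid = S.sid AND R.bid = 103
"""

nested_rows = set(conn.execute(nested).fetchall())
canonical_rows = set(conn.execute(canonical).fetchall())
print(nested_rows == canonical_rows)
```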
- Slide 17
- SQL query forms of focus: a WHERE clause consisting of a
disjunction of conjunctions. SELECT C1, C2, ..., Ch FROM from-list
WHERE (A11 AND ... AND A1n) OR ... OR (Am1 AND ... AND Amn)
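As a small illustration of this query form, the sketch below renders a WHERE clause from a list of conjunctions. The helper name `dnf_where` and the sample predicates are assumptions made for the example, not part of the approach.

```python
def dnf_where(disjuncts):
    """Render a list of conjunctions (each a list of atoms) as a WHERE clause."""
    groups = ["(" + " AND ".join(atoms) + ")" for atoms in disjuncts]
    return "WHERE " + " OR ".join(groups)

clause = dnf_where([
    ["M.year = 15", "C.SSN = M.SSN"],
    ["M.year = 30", "C.SSN = M.SSN"],
])
print(clause)
```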
- Slide 18
- Outline: Background; Approach; Evaluation; Conclusion and future
work
- Slide 19
- Illustrative example
- Slide 20
- Apply DSE on the existing database. Step 1: DSE chooses type=0,
zip=0. Executed query Q1: SELECT C.SSN, C.income, M.balance
FROM customer C, mortgage M WHERE M.year=15 AND C.zipcode=1 AND
C.SSN=M.SSN. Executing Q1 returns zero records, so the loop body is
not covered.
- Slide 21
- Apply DSE on the existing database (contd). Step 2: DSE flips
type == 0 to type != 0, giving type=1, zip=0. Executed query Q2:
SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHERE
M.year=30 AND C.zipcode=1 AND C.SSN=M.SSN. Executing Q2 returns
zero records, so the loop body is still not covered.
- Slide 22
- Apply DSE on the existing database (contd). However, an input
like type=0, zip=27694 leads to executed query Q3: SELECT C.SSN,
C.income, M.balance FROM customer C, mortgage M WHERE M.year=15 AND
C.zipcode=27695 AND C.SSN=M.SSN. Executing Q3 returns one record
{C.SSN = 001, C.income = 50000, M.balance = 20000}, covering
Line14=true and Line18=false.
- Slide 23
- Apply DSE on the existing database (contd). Furthermore, an
input like type=0, zip=28222 leads to executed query Q4: SELECT
C.SSN, C.income, M.balance FROM customer C, mortgage M WHERE
M.year=15 AND C.zipcode=28223 AND C.SSN=M.SSN. Executing Q4 returns
one record {C.SSN = 002, C.income = 150000, M.balance = 30000},
covering Line14=true and Line18=true.
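The walkthrough of Q1 to Q4 can be replayed with a hypothetical reconstruction of the program under test. The slide's code is not part of this transcript, so the function below is an assumption pieced together from the queries and branch conditions mentioned in the deck (fzip = zip + 1, and a diff > 100000 check on income - 1.5 * balance). The two records are the ones reported for Q3 and Q4; the schema and the branch labels returned are illustrative.

```python
import sqlite3

def make_db():
    """In-memory database holding the two records named in the walkthrough."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (SSN TEXT, zipcode INTEGER, income INTEGER);
        CREATE TABLE mortgage (SSN TEXT, year INTEGER, balance INTEGER);
        INSERT INTO customer VALUES ('001', 27695, 50000), ('002', 28223, 150000);
        INSERT INTO mortgage VALUES ('001', 15, 20000), ('002', 15, 30000);
    """)
    return conn

def program_under_test(conn, type_, zip_):
    """Assumed shape of the program; returns the Line18 outcomes it covered."""
    covered = []
    year = 15 if type_ == 0 else 30      # branch on type == 0
    fzip = zip_ + 1                      # query construction constraint
    rows = conn.execute(
        "SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M "
        "WHERE M.year = ? AND C.zipcode = ? AND C.SSN = M.SSN",
        (year, fzip),
    ).fetchall()
    for ssn, income, balance in rows:    # loop body runs once per record
        diff = income - 1.5 * balance    # result manipulation constraint
        covered.append("Line18=true" if diff > 100000 else "Line18=false")
    return covered

conn = make_db()
for inputs in [(0, 0), (1, 0), (0, 27694), (0, 28222)]:
    print(inputs, program_under_test(conn, *inputs))
```

An empty list means the query returned no record and the loop body was never entered, which is the situation DSE runs into with its default inputs.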
- Slide 24
- Assist DSE to generate program inputs: how to derive
high-covering program input values based on a given database
state?
- Slide 25
- Our idea: construct auxiliary queries. Auxiliary query:
SELECT C.zipcode FROM customer C, mortgage M WHERE M.year=15 AND
C.SSN=M.SSN. For example, the result set includes fzip=27695. From
fzip=zip+1, we derive zip=27694!
- Slide 26
- Our idea: construct auxiliary queries (contd). Auxiliary
query: SELECT C.zipcode FROM customer C, mortgage M WHERE
M.year=15 AND C.SSN=M.SSN. For example, the result set includes
fzip=27695. From fzip=zip+1, we derive zip=27694! This covers
Line14=true and Line18=false!
- Slide 27
- Our idea: construct auxiliary queries (contd). Auxiliary
query: SELECT C.zipcode FROM customer C, mortgage M WHERE
M.year=15 AND C.SSN=M.SSN. For example, the result set includes
fzip=27695. From fzip=zip+1, we derive zip=27694! This covers
Line14=true and Line18=false! The auxiliary query acts like a
constraint solver for program constraints + DB state constraints.
- Slide 28
- Approach. Collect query construction constraints on program
variables used in the executed queries from the program code.
- Slide 29
- Approach (contd). Collect query construction constraints on
program variables used in the executed queries from the program
code. Collect result manipulation constraints that compare against
record values in the query's result set (such as
if (diff>100000)).
- Slide 30
- Construct auxiliary queries. For path Line04=true, Line14=true,
construct the abstract query: SELECT C.SSN, C.income,
M.balance FROM customer C, mortgage M WHERE M.year=15 AND
C.zipcode=fzip AND C.SSN=M.SSN
- Slide 31
- Construct auxiliary queries. For path Line04=true, Line14=true,
construct the abstract query: SELECT C.SSN, C.income,
M.balance FROM customer C, mortgage M WHERE M.year=15 AND
C.zipcode=fzip AND C.SSN=M.SSN. Our target: fzip
- Slide 32
- Construct auxiliary queries. Abstract query: SELECT C.SSN,
C.income, M.balance FROM customer C, mortgage M WHERE M.year=15 AND
C.zipcode=fzip AND C.SSN=M.SSN. Auxiliary query under construction:
SELECT C.zipcode
- Slide 33
- Construct auxiliary queries. Abstract query: SELECT C.SSN,
C.income, M.balance FROM customer C, mortgage M WHERE M.year=15 AND
C.zipcode=fzip AND C.SSN=M.SSN. Auxiliary query under construction:
SELECT C.zipcode FROM customer C, mortgage M
- Slide 34
- Construct auxiliary queries. Abstract query: SELECT C.SSN,
C.income, M.balance FROM customer C, mortgage M WHERE M.year=15 AND
C.zipcode=fzip AND C.SSN=M.SSN. Constructed auxiliary query:
SELECT C.zipcode FROM customer C, mortgage M WHERE M.year=15 AND
C.SSN=M.SSN
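The construction steps shown across these slides (select the column compared against the program variable, keep the FROM list, and keep the remaining WHERE predicates) can be sketched as a small function. This is a simplification under stated assumptions: predicates are given as already-parsed `(lhs, op, rhs)` triples, and `build_auxiliary_query` is a hypothetical helper, not the authors' implementation.

```python
def build_auxiliary_query(from_list, predicates, program_vars):
    """Build an auxiliary query from an abstract query's FROM and WHERE parts.

    predicates: list of (lhs, op, rhs) triples from the WHERE clause.
    program_vars: names of program variables appearing in the query.
    """
    select_cols, kept = [], []
    for lhs, op, rhs in predicates:
        if rhs in program_vars:
            # The column compared with a program variable becomes the target.
            select_cols.append(lhs)
        else:
            # Predicates without program variables stay in the WHERE clause.
            kept.append(f"{lhs}{op}{rhs}")
    return (
        f"SELECT {', '.join(select_cols)} "
        f"FROM {', '.join(from_list)} "
        f"WHERE {' AND '.join(kept)}"
    )

aux = build_auxiliary_query(
    from_list=["customer C", "mortgage M"],
    predicates=[
        ("M.year", "=", "15"),
        ("C.zipcode", "=", "fzip"),
        ("C.SSN", "=", "M.SSN"),
    ],
    program_vars={"fzip"},
)
print(aux)
```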
- Slide 35
- Generate program input values. Run the auxiliary query: SELECT
C.zipcode FROM customer C, mortgage M WHERE M.year=15 AND
C.SSN=M.SSN. Result: fzip is 27695 or 28223
- Slide 36
- Generate program input values. Run the auxiliary query: SELECT
C.zipcode FROM customer C, mortgage M WHERE M.year=15 AND
C.SSN=M.SSN. Result: fzip is 27695 or 28223, so zip is 27694 or
28222
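Running the auxiliary query and inverting fzip = zip + 1 can be sketched as follows. The schema is an assumption; the two records are the ones reported in the walkthrough of Q3 and Q4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (SSN TEXT, zipcode INTEGER, income INTEGER);
    CREATE TABLE mortgage (SSN TEXT, year INTEGER, balance INTEGER);
    INSERT INTO customer VALUES ('001', 27695, 50000), ('002', 28223, 150000);
    INSERT INTO mortgage VALUES ('001', 15, 20000), ('002', 15, 30000);
""")

aux = """
    SELECT C.zipcode FROM customer C, mortgage M
    WHERE M.year = 15 AND C.SSN = M.SSN
"""
# Each returned zipcode is a value of fzip that makes the original query
# return at least one record.
fzips = sorted(row[0] for row in conn.execute(aux))
# Invert the query construction constraint fzip = zip + 1.
zips = [fzip - 1 for fzip in fzips]
print(fzips, zips)
```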
- Slide 37
- Generate program input values. Input combinations: type (0 or
!0) × zip (27694 or 28222). For example, type=0, zip=27694 covers
Line04=true and Line14=true, but Line18=false.
- Slide 38
- Approach (contd). Not enough! Program variables in a branch
condition after the query is executed may be data-dependent on the
returned record values. How to cover the Line18 true branch?
- Slide 39
- Approach (contd). To cover path Line04=true, Line14=true,
Line18=true, we need to extend the previous auxiliary query.
- Slide 40
- Construct auxiliary queries. SELECT C.zipcode FROM customer
C, mortgage M WHERE M.year=15 AND C.SSN=M.SSN (----how to
extend?----) We extend the WHERE clause.
- Slide 41
- Construct auxiliary queries. SELECT C.zipcode FROM customer
C, mortgage M WHERE M.year=15 AND C.SSN=M.SSN (----how to
extend?----) We extend the WHERE clause.
- Slide 42
- Construct auxiliary queries. SELECT C.zipcode FROM customer
C, mortgage M WHERE M.year=15 AND C.SSN=M.SSN AND C.income - 1.5 *
M.balance > 100000. We extend the WHERE clause with the result
manipulation constraint.
- Slide 43
- Generate program input values. Run the auxiliary query: SELECT
C.zipcode FROM customer C, mortgage M WHERE M.year=15 AND
C.SSN=M.SSN AND C.income - 1.5 * M.balance > 100000. Result:
fzip=28223
- Slide 44
- Generate program input values. Run the auxiliary query: SELECT
C.zipcode FROM customer C, mortgage M WHERE M.year=15 AND
C.SSN=M.SSN AND C.income - 1.5 * M.balance > 100000. Result:
fzip=28223, so zip=28222
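The extended auxiliary query can be run against the same two records to confirm that only one zipcode satisfies the added result manipulation constraint. The schema is again an assumption; the record values come from the walkthrough of Q3 and Q4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (SSN TEXT, zipcode INTEGER, income INTEGER);
    CREATE TABLE mortgage (SSN TEXT, year INTEGER, balance INTEGER);
    INSERT INTO customer VALUES ('001', 27695, 50000), ('002', 28223, 150000);
    INSERT INTO mortgage VALUES ('001', 15, 20000), ('002', 15, 30000);
""")

extended_aux = """
    SELECT C.zipcode FROM customer C, mortgage M
    WHERE M.year = 15 AND C.SSN = M.SSN
      AND C.income - 1.5 * M.balance > 100000
"""
# Record 001: 50000 - 1.5 * 20000 = 20000, filtered out.
# Record 002: 150000 - 1.5 * 30000 = 105000, kept.
fzips = [row[0] for row in conn.execute(extended_aux)]
zips = [fzip - 1 for fzip in fzips]
print(fzips, zips)
```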
- Slide 45
- Other issues (aggregate calculation). Aggregate calculations
involve multiple records: extend the auxiliary query with GROUP BY
and HAVING clauses.
- Slide 46
- Other issues (aggregate calculation). SELECT C.zipcode,
sum(M.balance) FROM customer C, mortgage M WHERE M.year=15 AND
C.SSN=M.SSN AND C.income - 1.5 * M.balance > 100000 GROUP BY
C.zipcode HAVING sum(M.balance) > 500000
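The aggregate form of the auxiliary query can be tried on a small data set. The rows below are hypothetical and chosen only so that one zipcode exceeds the 500000 threshold while the other does not; they are not from the evaluation subjects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (SSN TEXT, zipcode INTEGER, income INTEGER);
    CREATE TABLE mortgage (SSN TEXT, year INTEGER, balance INTEGER);
    -- Hypothetical rows: two qualifying customers in 28223, one in 27695.
    INSERT INTO customer VALUES
        ('101', 28223, 1000000), ('102', 28223, 1000000), ('103', 27695, 1000000);
    INSERT INTO mortgage VALUES
        ('101', 15, 300000), ('102', 15, 300000), ('103', 15, 300000);
""")

aggregate_aux = """
    SELECT C.zipcode, sum(M.balance) FROM customer C, mortgage M
    WHERE M.year = 15 AND C.SSN = M.SSN
      AND C.income - 1.5 * M.balance > 100000
    GROUP BY C.zipcode
    HAVING sum(M.balance) > 500000
"""
rows = conn.execute(aggregate_aux).fetchall()
print(rows)
```

Only zipcodes whose qualifying records together exceed the threshold are returned, so each returned value is a candidate for fzip on the path that depends on the aggregate.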
- Slide 47
- Other issues (cardinality constraints). SELECT C.zipcode FROM
customer C, mortgage M WHERE M.year=15 AND C.SSN=M.SSN AND C.income
- 1.5 * M.balance > 100000 GROUP BY C.zipcode HAVING COUNT(*)
>= 3. Use a special DSE technique for dealing with
input-dependent loops: P. Godefroid and D. Luchaup. Automatic
partial loop summarization in dynamic test generation. In ISSTA,
2011.
- Slide 48
- Outline: Background; Approach; Evaluation; Conclusion and future
work
- Slide 49
- Research questions. RQ1 (Effectiveness): What is the percentage
increase in code coverage by the program inputs generated by Pex
with our approach's assistance? RQ2 (Cost): What is the cost of our
approach's assistance?
- Slide 50
- Evaluation subjects: two open source database applications.
RiskIt: 4.3K LOC; database with 13 tables, 57 attributes, and >1.2
million records; 17 DB-interacting methods selected for testing.
UnixUsage: 2.8K LOC; database with 8 tables, 31 attributes, and
>0.25 million records; 28 DB-interacting methods selected for
testing.
- Slide 51
- Evaluation setup. Measurements: test generation effectiveness
(code coverage) and cost (number of runs/paths, execution time).
Procedure: run Pex without our approach's assistance, then perform
our algorithms to generate additional test inputs.
- Slide 52
- Evaluation results: RiskIt. Higher code coverage.
- Slide 53
- Evaluation results: RiskIt. Low additional cost. Pex (only)
timeout: 120 seconds. Even given longer time, no new coverage was
observed for Pex (only).
- Slide 54
- Evaluation results: RiskIt. Pex (only) timeout: 120 seconds.
Even given longer time, no new coverage was observed for Pex
(only).
- Slide 55
- Preliminary evaluation (contd). Evaluation results:
UnixUsage
- Slide 56
- Summary of evaluation results. RQ1 (Effectiveness): RiskIt: 26%
higher block coverage over Pex only; UnixUsage: 35% higher block
coverage over Pex only. RQ2 (Cost): RiskIt: #runs/paths: 131 more
over 1135 (Pex); execution time: 517 secs more over 1781 (Pex).
UnixUsage: #runs/paths: 93 more over 1197 (Pex); execution time:
580 secs more over 1718 (Pex).
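To put the cost figures in relative terms, the sketch below computes the additional runs and time as a percentage of the Pex-only baseline. It assumes that "N more over M (Pex)" means N additional units on top of a Pex-only total of M; the numbers themselves are the ones listed in the summary.

```python
# (extra, Pex-only baseline) pairs taken from the summary above.
results = {
    "RiskIt": {"runs": (131, 1135), "seconds": (517, 1781)},
    "UnixUsage": {"runs": (93, 1197), "seconds": (580, 1718)},
}

# Relative overhead in percent, rounded to one decimal place.
overhead = {
    app: {
        metric: round(100 * extra / base, 1)
        for metric, (extra, base) in metrics.items()
    }
    for app, metrics in results.items()
}

for app, metrics in overhead.items():
    print(app, metrics)
```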
- Slide 57
- Outline: Background; Approach; Evaluation; Conclusion
- Slide 58
- Conclusion. A new approach that formulates auxiliary queries to
bridge the gap between program constraints and DB constraints,
acting like a constraint solver for program constraints + DB
constraints. Empirical evaluations on 2 open source DB apps show
that our approach can assist DSE in generating program inputs
effectively, achieving higher code coverage with low additional
cost.
- Slide 59
- Future work. Construct auxiliary queries directly from
embedded complex queries (e.g., nested queries), rather than from
their transformed normal forms. Handle complex program contexts
such as multiple queries.
- Slide 60
- Acknowledgment: This work was supported in part by the U.S.
National Science Foundation under CCF-0915059 for Kai Pan and
Xintao Wu, and under CCF-0915400 for Tao Xie. Thank you! Questions?
- Slide 61
- Related work. All previous related work addresses a different
problem: constructing both program inputs and database states (from
scratch). M. Emmi, R. Majumdar, and K. Sen. Dynamic test input
generation for database applications. In ISSTA, 2007. K. Taneja, Y.
Zhang, and T. Xie. MODA: Automated test generation for database
applications via mock objects. In ASE, 2010.