SnE Approach for Reducing Cycle Time & Improving
Quality of Performance Tests
QAI – STC 2013
Parag Chandekar
Performance Test Lead
Email: paragchandekar@eaton.com
Eaton Technologies
EON Free Zone, Cluster B, Wing 3, Plot no. 1, Survey
no. 77, MIDC, Kharadi Knowledge Park, Pune,
Maharashtra 411014, India
Introduction
The software testing life cycle (STLC) consists of unit, system, integration, regression, user acceptance and
performance testing (PT). PT is often treated as the end of this cycle. There are occasions when the timeframe
allocated for PT is far too short to accomplish any meaningful testing. This could be because PT was completely
neglected during the planning phase and added much later, because of planning flaws, or because of delays in
the phases prior to PT (such as software development or functional testing), which push PT even further out. This
short duration is often insufficient for thorough performance testing. Many a time, ad hoc approaches are followed to
cover the test scenarios, and performance issues can leak into production.
The short duration is insufficient for executing the current PT approach, and it certainly leaves no room for
retesting performance bugs or retesting after tuning changes made to fix issues reported by PT. If
PT issues are reported, they too must be fixed within the same timeframe, putting excessive pressure on the
resources to deliver.
The current performance testing life cycle (PTLC), as described in the next section, might not fit well into
such situations and requires modification in order to address these issues.
This paper discusses a structured approach that can be followed while conducting PT within such short or stringent
timelines. It additionally provides insights into the possibilities of reducing the duration of various phases, thus
optimizing the overall performance testing cycle time and helping in early detection and fixing of issues.
Current PT Process
The current PT approach, which is widely followed in the industry, mainly comprises six steps:
Requirement gathering
Planning/scenario design
Scripting
Execution
Analysis and reporting
Tuning (application/database/operating system)
Figure 1: Current Performance Testing process
Effort Distribution across Phases
Effort distribution data was collected from previous performance testing projects within Eaton and from
prior work experience with other companies. The percentage of effort spent across the various PT phases was
computed and plotted in the graph below.
Figure 2: Effort/Time for various PT phases
Looking at the percentage of time spent across the various phases, one can see that requirement gathering and planning
take around 26% of the overall effort. However, optimizing requirement gathering and test design preparation offers
little benefit to the execution cycle time, as these can be done earlier in the project, in parallel with
development and functional testing.
Then comes the test scripting phase, where the in-scope call flows are scripted. The scripting phase
consumes close to 20-30% of the entire PT effort. If we exclude test planning and requirement gathering
time, scripting and execution together take 75% of the overall project time. These two phases do provide an
opportunity to reduce the cycle time.
Shortcomings of Current process
Given shorter timelines for performance testing, the current approach would fail, as one might end up spending all
the time in planning and scripting rather than actually executing tests.
Adding to the woes, the scripting phase also suffers from delays while waiting for the call flow details of scripts,
without which scripting cannot commence. Execution, too, sometimes gets delayed because volumetric
confirmation or test data is unavailable for some or all of the scripts. Moreover, in the current process (in most
cases) the execution phase is initiated only once all the scripts are prepared. As the scripting phase itself is considerably
long, it may be quite some time before the first test is run. And as we know, it is only in the execution phase that one
can find defects related to performance and capacity. This delay in execution could have serious
effects on the project.
Another pitfall of the current process is that the performance defects detected during the execution phase need to
be fixed and retested. Sometimes these retesting efforts are not estimated, which pushes the timelines further.
Additionally, it is important to allocate sufficient time for analysis. A huge amount of data can be returned
when executing nonfunctional tests, and analyzing it takes considerable time. As the analysis process is technical, it
should not be rushed: rushing increases the risk of misdiagnosis, and the resulting recommendations may be
misleading. However, in the current performance testing process this activity is often compromised due to lack of
time.
Opportunities for improvement
However, there are many places where the current process can be optimized. In the current process, issues get
highlighted only during the execution phase. For example, a load balancer (LB) misconfiguration that
directs traffic to one server instead of distributing it equally across all the servers gets highlighted only during
execution.
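The load balancer example above lends itself to a quick check once per-server request counts are available from a single-script run. The following is a minimal sketch, not part of the original process: the function name, the server names, the sample counts and the 20% tolerance are all assumptions chosen for illustration.

```python
def find_lb_imbalance(requests_per_server, tolerance=0.20):
    """Return servers whose traffic share deviates from an even split.

    requests_per_server: dict of server name -> request count (hypothetical input,
        e.g. taken from access logs after a short single-script run).
    tolerance: allowed relative deviation from the even share (assumed value).
    """
    total = sum(requests_per_server.values())
    if total == 0:
        return {}
    even_share = 1.0 / len(requests_per_server)
    skewed = {}
    for server, count in requests_per_server.items():
        share = count / total
        # Flag the server if its share is too far from an equal distribution.
        if abs(share - even_share) > tolerance * even_share:
            skewed[server] = round(share, 3)
    return skewed


# Illustrative counts only: nearly all traffic lands on one server.
sample = {"app01": 9800, "app02": 120, "app03": 80}
print(find_lb_imbalance(sample))
```

An empty result means the observed distribution is within the chosen tolerance of an even split; any other result lists the servers to investigate.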
From static testing, which is done earlier in the PT lifecycle, we can find a lot of bugs in the system architecture
that affect performance; however, we can only assess the system performance once we start testing. Additionally,
testers get a feel for the system only after they have executed a few tests themselves. Moreover, in projects
where performance testing is conducted for the first time, many tests get added after seeing the results of the first few
tests, as the tester gains a better understanding of the application while testing it. This additional effort puts a
further burden on the already loaded team to complete the work in the stipulated time.
Proposed Solution
An approach is proposed wherein execution starts as soon as the first script is ready. This philosophy, “Script n
Execute” (SnE), helps in identifying issues in the system before the actual planned execution cycle, in which all the
scripts are run together, begins. The SnE approach also suggests some restructuring of tests, which helps in
finding defects more quickly. The approach categorizes tests into 3 different types.
Type-A: Tests that can be conducted with even a single working script; this helps in finding issues
much earlier. For instance, if a single script execution finds issues in the login event, these issues will
invariably be encountered by other scripts too. Early defect fixing helps in reducing the overall test
cycle. For Type-A tests formal reports might not be generated; however, statistics can be compiled in a sheet.
Time spent executing a single script in the SnE approach is limited to 20% of the time spent creating
the script.
Example of tests that fall under Type-A:
Test Environment Configuration and limits tests
LB configuration checks: test-environment-related issues get highlighted on the first day
of the Script & Execute phase
LoadRunner server utilization checks: the load generation capacity of the
LoadRunner servers can be benchmarked with a single script
Application Specific Tests
Application configuration checks (e.g., threshold checks)
Application capacity
Breaking point of that call flow
Checking effects of caching and whether the response times rise or fall
Tests to identify if a particular layer is not a bottleneck
Benchmark response times for that call flow
Test with gradual ramp-up for a very long time
Concurrency tests (tests placing rendezvous points on all the transaction checkpoints)
End User Tests
Tests with user load emulation with various bandwidths
Memory Leaks
Running long duration tests overnight will help in finding out potential issues, like
memory leaks, disk space issues etc.
Type-B: Tests that are run with realistic volumes; all the scripts are required here. This emulates the real traffic
expected in production. These tests are designed in the load test tool based on the inputs from Type-A
tests:
o Mixed tests having volumes of all the scripts/scenarios sent simultaneously
o Tests with accurate volumetric input from Type-A tests; response times per call flow can
help in designing realistic load
E.g., if we run a breakpoint test with our first script, from the graphs we would know the concurrent user
load at which the throughput begins to drop. The Type-B test for finding the peak performance point can take
this value and ramp up by 20% more load.
Example of tests that fall under Type-B:
Mixed tests having volumes of all the scripts/scenarios sent simultaneously
Mixed Normal load test
Mixed Peak load Test
Mixed Stress Test
Mixed Soak Test
Tests which require inputs from Type-A tests
Type-C: Tests which are not planned and need to be done on the basis of inputs from Type-B tests.
Example of tests that fall under Type-C:
Dynamic tests designed on basis of inputs from the system
Tests which might require inputs from Type-B tests
Figure 3: “Script n Execute” Process Diagram
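The Type-B sizing rule described above (take the load at which throughput begins to drop in the Type-A breakpoint test and add 20%) can be sketched as follows. This is one possible reading of the rule rather than a definitive implementation: the function names are hypothetical, the "drop" is taken to start at the last load level before throughput falls, and the sample figures are synthetic, not results from this paper.

```python
def load_at_throughput_drop(samples):
    """Return the last user load before throughput starts to fall, or None.

    samples: list of (concurrent_users, throughput) ordered by increasing load.
    """
    for (prev_users, prev_tps), (_, tps) in zip(samples, samples[1:]):
        if tps < prev_tps:
            return prev_users
    return None


def type_b_peak_target(drop_load, extra_percent=20):
    """Return the maximum load for the Type-B peak test (drop load plus 20%).

    Integer arithmetic with rounding up avoids floating-point surprises.
    """
    return (drop_load * (100 + extra_percent) + 99) // 100


# Synthetic breakpoint-test readings, for illustration only.
breakpoint_samples = [(40, 100.0), (80, 190.0), (120, 260.0), (160, 255.0), (200, 230.0)]
drop_load = load_at_throughput_drop(breakpoint_samples)
print(drop_load, type_b_peak_target(drop_load))
```

With these synthetic readings the throughput starts to fall after 120 users, so the Type-B peak test would ramp up to 144 users.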
Finalizing the order of script creation
As we all know, Risk Based Testing (RBT) helps in prioritizing the E2E call flows that need performance
testing. SnE, in turn, helps in prioritizing the order in which the scripts are prepared, to get the maximum
out of this approach.
Scripting and testing the call flow with the highest volume of transactions and concurrency makes more sense than
one with lower concurrency and volumes. Below is a simple flow chart that helps in selecting which call flows
to script first. During the scripting phase, a decision might need to be taken at run time to script the call flow that has
the highest priority and for which test data is available for test execution.
Figure 4: Script prioritization flowchart
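The selection rule described above can be sketched in a few lines. The flowchart itself is not reproduced here, so this is only an approximation of it: call flows with test data available are scripted first, and within each group the flows are ordered by transaction volume and then by concurrency. The `CallFlow` fields and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class CallFlow:
    name: str
    hourly_volume: int       # expected transactions per hour (assumed unit)
    peak_concurrency: int    # expected concurrent users
    test_data_ready: bool    # is test data available for execution?


def scripting_order(call_flows):
    """Order call flows for scripting: data-ready flows first, busiest first."""
    def busiest_first(flows):
        return sorted(
            flows,
            key=lambda cf: (cf.hourly_volume, cf.peak_concurrency),
            reverse=True,
        )

    ready = [cf for cf in call_flows if cf.test_data_ready]
    blocked = [cf for cf in call_flows if not cf.test_data_ready]
    return busiest_first(ready) + busiest_first(blocked)


# Hypothetical call flows, for illustration only.
flows = [
    CallFlow("login", 5000, 200, True),
    CallFlow("search", 8000, 300, False),
    CallFlow("checkout", 1200, 60, True),
]
print([cf.name for cf in scripting_order(flows)])
```

Here "search" has the highest volume but is scripted last because its test data is not yet available, which mirrors the run-time decision mentioned above.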
Benefits
Below are various benefits of this approach:
Defects are detected earlier: As can be seen from the diagram below, defects can be found
earlier in the SnE approach than in the current process. Because performance defects are detected
and raised earlier, there is enough time to fix and retest them.
Figure 5: Current and SnE process comparison
Confidence is gained as soon as tests are started, and tests are completed earlier:
More than 50% confidence is gained with the SnE approach by the time all the scripts are ready
for the mixed tests. The dynamic testing scenarios (Type-C) help in gaining that extra bit of
confidence in the application as compared to the current process. Additionally, as most of the tests
get executed during the SnE phase itself, the duration of testing is reduced by almost 25%.
Figure 6: Confidence level comparison between current process and SnE approach
Build enough data on the performance test environment: These tests might also help in building up data
or volumes on the servers before the actual Type-B and Type-C tests (e.g., mixed load testing) are
conducted on the system. System monitoring throughout Type-A test execution can help in
identifying additional monitoring needs before the realistic-volume Type-B tests are conducted.
Test environment issues get highlighted earlier: Issues such as load balancer misconfigurations can
get highlighted earlier. Monitoring on the servers should be enabled before the SnE phase to get the
maximum benefit from it. A lot of performance data can be collected from the system under test during
the SnE phase, and this helps in finding issues related to memory, CPU, disk, etc. while Type-A
tests are executed.
Testers get more understanding of the environment: The more you test, the
better you understand the test environment. While running Type-A tests, the testers
gain further understanding of the test parameters.
More time spent on tuning: If the issues are highlighted earlier, tuning can also start earlier.
Tests are done right the first time and time is not wasted on retests: Type-A tests provide inputs
for Type-B and Type-C tests, reducing the chances of errors in configuring the tests; this avoids retests
and saves time. From the Type-A tests, one can get a lot of insight into the response times for the various
call flows. This input helps in designing the complex Type-B tests for accurate emulation of real-life workload
(for example, by using the right pacing), and saves much of the effort otherwise spent on retests when tests
do not go as designed the first time.
Unit testing the PT scripts: Type-A execution also helps in finding issues in the performance test
scripts themselves earlier. We do not expect our Type-B tests to fail due to issues in the scripts
(memory leaks, for example), since these get highlighted during Type-A test execution itself.
Demonstration of SnE Approach
Let’s take an example of a Type-A test and demonstrate its benefits.
Breaking point test: Contrary to normal practice, SnE puts the breaking point test in the Type-A category, as it
can be executed using a single script. Additionally, executing this worst-case scenario first gives higher
confidence in the system and reveals its capacity upfront rather than on the last day of testing.
The breaking point test is executed with the objective of finding the point at which the application produces
noticeable exceptions instead of providing valid responses to requests. The coarse-grained test was run with 40
concurrent connections added every five minutes up to 480 concurrent connections; the full load was then held for
five minutes before all concurrent connections were stopped simultaneously.
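The stepped ramp-up described above can be written out as a schedule, which makes it easy to map a timestamp in the logs to the load that was on the system at that moment. This is a small sketch under one assumption that the text does not state explicitly: the first 40 connections start at minute 0. The function name is hypothetical.

```python
def ramp_schedule(step_users, step_minutes, max_users, hold_minutes):
    """Return ([(minute, concurrent_connections), ...], end_minute).

    Assumes the first step starts at minute 0 and the final load level is
    held for hold_minutes before all connections are stopped.
    """
    if step_users <= 0 or max_users < step_users:
        raise ValueError("max_users must be at least one positive step")
    schedule = []
    minute, users = 0, step_users
    while users <= max_users:
        schedule.append((minute, users))
        minute += step_minutes
        users += step_users
    end_minute = schedule[-1][0] + hold_minutes
    return schedule, end_minute


# Coarse-grained breaking point test: 40 connections every 5 minutes up to 480.
schedule, end_minute = ramp_schedule(40, 5, 480, 5)
load_by_minute = dict(schedule)
print(load_by_minute[45], schedule[-1], end_minute)
```

Under this assumption the load at the 45-minute mark is 400 connections, the full 480 connections are reached at minute 55, and the test ends at minute 60.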
Figure 7: Errors per Second Vs Concurrent Connections graph
As seen in the graph above, “Queue exceed maximum capacity of: '256' elements” errors begin appearing at the
45-minute mark in the Performance Center logs and continue through the remainder of the test. At this point, there are
400 concurrent connections on the system. This load level is the breaking point of the application for this scenario.
Figure 8: Transactions per Second vs. Concurrent Connections graph (annotation: application breaking point)
In the Transactions per Second graph above, the TPS rate peaks early in the test (at the six-minute mark) with 50
concurrent connections. After that point, the level remains relatively steady until the breaking point, where some
drop in TPS is seen; TPS then remains constant for the remainder of the test.
So from this test we know that the system cannot handle more than about 400 concurrent users, the load at which
errors began to appear. The actual figure from the Type-B test, when run with all the scripts together, is expected
to be very close to this figure. By executing this test first, we have found an approximate value of the system
capacity with the very first test.
Additionally, this test has also provided input on the ramp-up setting to be used for the Type-B peak performance
point test. We roughly know that the peak performance point lies somewhere between 50 and 60 concurrent connections.
However, a more granular Type-B test needs to be done to confirm it.
Below is the result of the Type-B peak point test.
The objective of the peak point test is to find an accurate value for the point of peak performance and the SLA point. As
we already know that the peak point lies roughly between 50 and 60 concurrent connections, we ramp up by only 5
connections every 5 minutes, up to 100 concurrent connections.
Figure 9: Transactions per Second vs. Concurrent Connections graph (annotation: at roughly 6 minutes into the test the throughput peaks)
The graph indicates that TPS increased with the number of concurrent connections put on the system. The optimal
point occurred at around 50 concurrent connections (at the 45-minute mark); after this point the TPS remained constant in
spite of the increase in concurrent connections.
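Reading the optimal point off a graph can also be approximated programmatically from exported (connections, TPS) pairs. The sketch below is a simplification, not the method used in this paper: it reports the first load level after which the relative TPS gain falls below a threshold. The function name, the 2% threshold and the sample readings are assumptions, and the data is synthetic.

```python
def find_plateau_start(samples, min_gain=0.02):
    """Return the load after which TPS stops growing meaningfully, or None.

    samples: list of (concurrent_connections, tps) ordered by increasing load.
    min_gain: minimum relative TPS gain that still counts as growth (assumed).
    """
    for (users, tps), (_, next_tps) in zip(samples, samples[1:]):
        if tps > 0 and (next_tps - tps) / tps < min_gain:
            return users
    return None


# Synthetic readings for illustration only: TPS flattens out from 50 connections.
peak_test_samples = [(40, 80.0), (45, 90.0), (50, 100.0), (55, 100.5), (60, 100.2)]
print(find_plateau_start(peak_test_samples))
```

A single noisy reading can trigger this check early, so in practice the result would be confirmed against the graph or smoothed over several steps.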
Type-A test inputs helped us design the peak point test accurately, and we were able to determine that 50
concurrent connections is the optimal point the very first time we ran the test. Additionally, we learned
the breaking point of the application with our first script itself; if this was one of the objectives of the testing
exercise, it would also have resulted in good test coverage.
Prerequisites for implementing SnE approach successfully
Listed below are a few of the prerequisites needed to implement the SnE approach successfully:
Test data needs to be available for at least a few scripts to start test execution
Volumetric information should be available, as this is the basis for prioritizing the call flows which need
to be scripted first
Access to servers for monitoring is needed during the scripting phase itself
LB or infra/DBA support needs to be agreed beforehand
The system should be functionally stable while scripting or testing is being done on it
Testers should have the liberty to attempt to break the system, and infra/DBA support should be
available to bring the system back into shape
A shared environment cannot be used for performance testing and scripting
Overhead
The SnE approach might add a bit of overhead to the phases prior to scripting. Below are the various phases and
the additional tasks that might need to be done in each.
Planning phase: While defining the strategy and making the test plan, some thought needs to be given to the
kinds of tests that can be done as part of Type-A and Type-B. The prerequisites for the SnE phase should be readied before the
phase itself. Test data requirements, too, should be planned, finalized and requested in this phase.
Test environment: The practice of scripting on one environment and testing on another cannot be used with this
approach. That is, a separate or isolated test environment is required for testing.
Scripting phase: This approach can add a bit of overhead to the scripting process, as the scripter needs to put
some effort into executing the scripts. There are a few options available by which this overhead can be limited to
around 20% of the scripting effort. In effect, we save up to 50% of the time taken to identify
defects, while spending a little over 20% additional scripting time to achieve it.
If the PT project is being tracked for scripting progress, we might see a lower scripting completion percentage
compared to the current process. However, we can see progress in
execution during the scripting phase itself.
Types of projects which can benefit from this approach
Almost all projects can benefit from the SnE approach; however, the following projects will benefit the most:
Projects which have a very small execution window
Projects where testing is attempted for the first time
Projects where only a few resources are working for the entire PT cycle, eventually leading to
elongated test scripting duration
Conclusion
The SnE approach is the next level of “Fail fast with Agility”, where in addition to prioritizing tests, tests are executed
earlier to identify and fix defects as early as possible. That is, instead of waiting for all the scripts to be completed,
performance testing is started in a structured way right from the time the first performance testing script is ready.
This approach also realizes the power of one, i.e., the things that can be achieved with a single performance script.
Author’s biography
Parag leads the Performance Testing vertical within QA-COE at Eaton Technologies. He
has close to 10 years of extensive experience in performance consultancy, management,
testing and tuning, and has worked on all major industry-standard and open-source
performance testing/engineering tools. Prior to Eaton he worked as a project
manager/consultant with the Performance Engineering group within Tech Mahindra and
handled many critical telecom and banking projects for varied clients across the globe.
He completed his BE in computer science from RGPV University, Bhopal and has a Master’s
degree in Telecommunication and Software Engineering from Birla Institute of Technology and Science, Pilani.