P E R F O R M A N C E S U S P E C T S H O W ... - PerfTestPlus

HOW TO IDENTIFY THE USUAL PERFORMANCE SUSPECTS

Photo Illustration by The Design Diva

So you found an odd pattern in your scatter chart that appears to be a bottleneck. What do you do now? How do you gather enough information to refute the inevitable response, “The application is fine, your tool/test is wrong”? And how do you present that information conclusively up front so you can get right down to working collaboratively with the development team to solve the problem?

Through years of experience as a performance testing consultant, I’ve learned something very important. It’s not safe to assume that most people really know what a bottleneck is, even if you’ve been doing performance testing for a long time and use the term “bottleneck” in everyday speech, particularly when it is contrasted with a system failure or a slow spot. To provide a teaching tool, I coined “Scott’s Eight Basic Rules for Software Bottlenecks,” which I’ll share with you below.

“Webster’s Millennium Dictionary of English” (Lexico Publishing Group, 2003) defines bottleneck as: “n: a narrowing that reduces the flow through a channel; v 1: slow down or impede by creating an obstruction; 2: become narrow, like a bottleneck….”

This definition is understandable and hints at the origin of the term (referring to the narrow part of a jar or bottle), but it is most useful to us if we note what it doesn’t say as well as what it does say:

• “reduces the flow,” not “ceases the flow”
• “slow down or impede by creating an obstruction,” not “stop by creating an obstruction”
• “become narrow,” not “become impassable”

I summarize this as follows:

Rule 1: A bottleneck is a slowdown, not a stoppage. A stoppage is a failure.

Bottlenecks are likely to be lurking in your application. Here’s how you as a performance tester can find them. By Scott Barber

When You’re Out to Fix Bottlenecks, Be Sure You’re Able to Distinguish Them From System Failures and Slow Spots

Scott Barber is chief technology officer at PerfTestPlus, Inc. His specialty is context-driven performance testing and analysis for distributed multiuser systems. Contact him at [email protected].

PERFORMANCE TUNING • Software Test & Performance • MAY 2005 • www.stpmag.com

Something else the dictionary definition doesn’t mention that will become relevant to us is “load” or “volume.” The fact that the definition says nothing about the cause of a bottleneck suggests that it shouldn’t be assumed that a bottleneck exists only under load or volume. In my experience, most folks assume that bottlenecks don’t exist unless a certain “trigger load” is applied to a system. This is both contrary to the definition of bottleneck and often untrue when applied to software systems, thus:

Rule 2: Bottlenecks don’t only exist under load.

A useful comparison can be made, at least conceptually, between the flow of water through a pipe system and the flow of activity through a software system. To that end, let’s take a look at bottlenecks from a hydrodynamics perspective.

Figure A shows the simplest possible version of a bottleneck in a pipe (you may notice that the drawing at first glance resembles a bottle). One can see that more water could flow through the section of pipe on the left than on the right, if they existed in isolation. The arrows in this diagram depict velocity, or the speed at which the water is actually moving through the pipe; the shorter the arrow, the slower the flow. Notice that the water is moving more slowly in the larger-diameter side of the pipe. This is a classic example of a queue.
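The velocities in Figure A can be checked with a little arithmetic. What follows is a minimal sketch, assuming idealized flow in which the same volume of water passes through both sections each second, so that area times velocity is equal on both sides. The areas are the labels from Figure A; the function name and the pairing of the lower velocity with the wider section are my own assumptions for illustration.

```python
def narrow_section_velocity(wide_area, wide_velocity, narrow_area):
    """Velocity in the narrow section, assuming area * velocity is the
    same in both sections (the same volume passes through each second)."""
    flow = wide_area * wide_velocity
    return flow / narrow_area


wide_area = 16.0      # sq. in., label from Figure A
narrow_area = 4.0     # sq. in., label from Figure A
wide_velocity = 1.0   # relative units, assumed to belong to the wide section

narrow_velocity = narrow_section_velocity(wide_area, wide_velocity, narrow_area)
print(narrow_velocity)
```

Under these assumptions the water in the narrow section moves four times as fast as the water waiting in the wide section, which is why the slowest flow shows up just ahead of the narrowing.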

In terms of the flow rate, the water in the wider part of the pipe must virtually stop and wait for an opening in the narrower section of the pipe, and thus it flows very slowly until it reaches the “release point,” roughly where the narrower section of the pipe begins. Once it reaches the release point, the water starts flowing much faster. That “stop and wait” period is a queue. The bottleneck is the cause of the queue; it is not the queue itself. What’s important to note here is that the place where the pipe narrows is the bottleneck, but in fact the water flows most slowly just before the pipe begins to narrow. This brings us to:

Rule 3: The symptoms of the bottleneck are almost never observed at the actual location of the bottleneck.

In hydraulics, there’s another useful concept: the “critical” bottleneck, defined as the one bottleneck that, unless resolved, will dictate the flow characteristics of a system. In Figure B, you can see three sets of obstacles restricting the flow of water through the pipe. It’s easy to see that obstacle 2 is restricting the flow the most. In this case, obstacle 2 is the critical bottleneck, meaning that removing obstacles 1 and 3 won’t improve the flow of water through the pipe. More simply put:

Rule 4: The critical bottleneck is the one bottleneck along a particular user path, the removal of which will improve both performance and the ability to find other bottlenecks.

Exploring critical bottlenecks introduces us to the concept of multiple paths through a system. When you extend a system beyond a single pipe into a closed system, you often add alternative paths through that system. Figure C is an example of a closed hydraulic system.
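Rule 4 can be illustrated with a small model. This is only a sketch, assuming a single path made of stages in series, where the throughput of the whole path is capped by the stage with the smallest capacity. The stage names and capacity numbers are hypothetical; they are not measurements from Figure B.

```python
def path_throughput(capacities):
    """Throughput of stages in series is capped by the smallest capacity."""
    return min(capacities.values())


def critical_bottleneck(capacities):
    """Name of the stage with the smallest capacity."""
    return min(capacities, key=capacities.get)


# Hypothetical capacities (requests per second) for three stages on one path.
stages = {"obstacle_1": 80, "obstacle_2": 20, "obstacle_3": 50}
print(critical_bottleneck(stages), path_throughput(stages))

# Removing obstacles 1 and 3 leaves the throughput unchanged.
widened_others = {"obstacle_1": 1000, "obstacle_2": 20, "obstacle_3": 1000}
print(path_throughput(widened_others))

# Removing obstacle 2 improves throughput and exposes the next bottleneck.
widened_critical = {"obstacle_1": 80, "obstacle_2": 1000, "obstacle_3": 50}
print(critical_bottleneck(widened_critical), path_throughput(widened_critical))
```

In this toy model, fixing anything other than the critical bottleneck changes nothing, while fixing the critical bottleneck both raises throughput and reveals which obstacle to look at next.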

The difficulty of detecting bottlenecks in a system increases nearly exponentially with the number of possible paths through that system. In Figure C you can see that if there were a bottleneck in the pipe on the right side, the water could flow through the pipe in the center instead. This could lead to the appearance of a bottleneck in the center pipe, even though the bottleneck isn’t there (see Rule 3). Thus it’s important to remember:

Rule 5: If you have multiple paths through a system and think there’s a bottleneck, you should isolate each path and evaluate it separately.

In the system depicted in Figure C, you can see some items other than pipes: a pump, valves and a reservoir. If you think of the pipes as your network and the other items as your hardware (Web server, routers and so forth), you quickly come up with:

Rule 6: The bottleneck is more likely to be found in the hardware than in the network, but the network is easier to check.

When people started using the term “bottleneck,” the concept of software hadn’t even been invented. That fact alone should make us realize that the term probably has a special meaning when applied to software. Often, the term is used to refer to anything perceived to be slow in a software system, but this use is imprecise and should be avoided. Consider this scenario: Suppose one page on a Web site has several large images on it. When a user requests this page, it may take a long time to load. But unless downloading the graphics causes some other activity in the system to slow down, it’s not a bottleneck; it’s just a slow page, or what I call a “slow spot.”

[Figure A: A simple pipe bottleneck. Cross-section labels: 16 sq. in. and 4 sq. in.; velocity labels: VEL=4 and VEL=1.]

[Figure B: Multiple bottlenecks. Obstacles labeled 1, 2 and 3.]

A definition in “Load Testing for eConfidence” (a free book from Segue Software authored by Stefan Asböck) describes a bottleneck as “a point in a Web application where congestion and delay occur, slowing down the processing of requests and causing users to experience unacceptable service delays.”

The key to this definition is the word “congestion.” Unless an observed slowness causes some other activity in the system to slow down, it’s not a bottleneck; it’s just a slow spot. The next rule summarizes this:

Rule 7: Unless other activities and/or users are affected by the observed slowness or its cause, it’s not a bottleneck but a slow spot.

In the course of defining “bottleneck,” we’ve made some distinctions that I’d like to spend a little more time on. The first is the distinction between a bottleneck and a failure. I think I’ve made it clear that a bottleneck is a slowdown, not a stoppage, meaning that the expected outcome is eventually achieved, regardless of how long it takes. For example, if you wait a long time but the requested Web page does eventually display properly, you’ve encountered either a slow spot or a bottleneck. If, however, you wait and eventually are presented with an error page instead of the requested Web page, this is a failure.

The interesting twist to this distinction is that sometimes a very minor change can transform the failure back into a bottleneck. Consider the example above. It’s entirely possible that in the second situation above, a timeout (failure) occurred due to a Web server setting. Changing that setting and re-executing your test may result in all activities being completed successfully, but taking an unacceptable amount of time and slowing down all users; in other words, a bottleneck.

For our purposes, any time an error occurs, whether caused by a bottleneck or not, that error is a failure (you may prefer to call it a bug, a defect, an issue or an area of interest) and should be reported as such. When that failure prevents other users from completing their tasks in the expected manner, it’s a critical failure.

The main difference between a bottleneck and a slow spot is that a bottleneck has widely felt performance effects. A single large graphic can cause an annoying slow spot that may need to be resolved, but unless there are just tons of people downloading that graphic (a bottleneck caused by a popular activity) or your Web server is underpowered (a bottleneck caused by insufficient infrastructure), it’s just a slow spot that has no real effect on the rest of the system.

I’m making a big deal of these distinctions because as we continue our discussion about bottlenecks, we’ll see that we will often find failures and slow spots while we’re chasing bottlenecks, and we’ll need to distinguish among them in order to be able to take appropriate action.
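The distinctions drawn above can be written down as a small decision function. This is a minimal sketch of the definitions in this article, not a feature of any testing tool; the function name and its three yes/no inputs are hypothetical, and in practice each input is a judgment you make from your test results.

```python
def classify(completed_ok, slower_than_acceptable, others_affected):
    """Classify an observation using the article's definitions.

    completed_ok: the expected outcome was eventually achieved (no error).
    slower_than_acceptable: the activity took longer than is acceptable.
    others_affected: other users or activities were affected as well.
    """
    if not completed_ok:
        # Any error is a failure; it is critical if it blocks other users.
        return "critical failure" if others_affected else "failure"
    if not slower_than_acceptable:
        return "no issue observed"
    # A slowdown is a bottleneck only if its effects are felt elsewhere.
    return "bottleneck" if others_affected else "slow spot"


print(classify(completed_ok=False, slower_than_acceptable=True, others_affected=False))
print(classify(completed_ok=True, slower_than_acceptable=True, others_affected=False))
print(classify(completed_ok=True, slower_than_acceptable=True, others_affected=True))
```

The three calls correspond to the error page, the single large graphic, and the slowdown that drags other users down with it.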

Identifying Bottleneck Suspects

There are as many ways of identifying bottleneck suspects as there are people who’ve observed a slowdown when working on a computer. It’s our job as performance testers to identify as many of those suspects as possible and then sort, categorize, verify, test, exploit and try to help resolve them.

Let’s discuss some common ways to identify bottleneck suspects. For now, we won’t worry about whether these suspects might turn out to be failures or slow spots instead of bottlenecks.

Most performance testers like to identify bottleneck suspects visually, using graphs and charts. To that end, let’s take a look at some common ones.

Figure D is an example of a “response time by test execution” chart. Looking at this chart, you’ll immediately be able to see that there is a dramatic slowdown in response times between 150 and 200 users. While this doesn’t tell us what the bottleneck is, it does give us a good indication of where to focus our search for one.
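The same observation can be made numerically. The sketch below uses the LAN-user response times shown in Figure D and looks for the pair of consecutive load levels with the largest relative increase in response time. The function name and the choice of a simple ratio are my own assumptions; other measures of "dramatic" would work just as well.

```python
# Response times in seconds for the LAN test executions in Figure D.
lan_users = [1, 50, 100, 150, 200]
home_page = [5.59, 9.08, 10.14, 10.11, 48.21]
page_1 = [0.75, 0.74, 1.49, 2.37, 41.22]


def largest_jump(loads, times):
    """Return (lower_load, higher_load, ratio) for the consecutive pair of
    test executions with the largest relative increase in response time."""
    best = None
    for i in range(1, len(loads)):
        ratio = times[i] / times[i - 1]
        if best is None or ratio > best[2]:
            best = (loads[i - 1], loads[i], ratio)
    return best


for name, times in (("Home Page", home_page), ("Page 1", page_1)):
    low, high, ratio = largest_jump(lan_users, times)
    print(f"{name}: largest slowdown between {low} and {high} users ({ratio:.1f}x)")
```

For both pages the largest jump falls between 150 and 200 users, which matches what the chart shows at a glance.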

Scatter charts, such as the one shown in Figure E, are my favorite analysis tool, because of the ease with which you can identify bottleneck suspects using them. Simply put, any pattern that shows more than one dot (outlier) outside of your predefined acceptable performance levels reveals a potential bottleneck.
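That rule of thumb is easy to apply to raw results as well as to a chart. Here is a minimal sketch, assuming each sample is a pair of (seconds into the test, response time in seconds) and that you have already agreed on an acceptable response time. The sample values and the 8-second threshold are hypothetical.

```python
def outliers(samples, threshold_seconds):
    """Samples whose response time exceeds the acceptable threshold."""
    return [(t, response) for t, response in samples if response > threshold_seconds]


def is_bottleneck_suspect(samples, threshold_seconds):
    """More than one dot outside the acceptable level marks a suspect."""
    return len(outliers(samples, threshold_seconds)) > 1


# Hypothetical samples: (seconds into test, response time in seconds).
samples = [(10, 2.1), (20, 2.4), (30, 9.7), (40, 2.2), (50, 11.3), (60, 12.8)]

print(outliers(samples, threshold_seconds=8.0))
print(is_bottleneck_suspect(samples, threshold_seconds=8.0))
```

A check like this only points at a suspect. Looking at the pattern of the dots on the chart is still what tells you where to dig.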

In addition to charts and graphs, user observations are another key method of identifying bottleneck suspects. Personal observation is one of your best tools for identifying suspected bottlenecks. As you’re creating scripts, you’re using the application. You get to “feel” what the performance is like, and you get a good idea of what types of activities cause the application to perform differently (better or worse) before you ever execute your first load test. These observations are extremely valuable, not only as a method of validating your scripts, but also as a means of identifying bottleneck suspects. Don’t assume that your tool is better at detecting bottlenecks than you are. The ultimate users of the system are people, not load-generation tools, and that alone makes your opinion (based on observation) more valuable than the numbers the tool reports.

[Figure C: A closed hydraulic system. Labeled components: reservoir, release valve, pump, directional control valves.]

Ultimately, other people will start using the system, generally while testing it. These folks will find all kinds of failures, slow spots and bottlenecks, whether they realize it or not. It’s important to talk to them and even observe them periodically to see and hear what they think of the system from a performance perspective. A simple comment such as “I don’t remember that search taking that long in the last version” is a big red flag that a bottleneck may be hiding somewhere in the search activity. The best part about that flag is that the search could have seemed pretty fast to you if you’d never used the preceding version. Don’t assume that your personal observations will reveal the same suspects as the observations of a casual user.

Confirming Suspects

After you identify a list of bottleneck suspects, the next step is to confirm them. Confirming a suspect won’t necessarily confirm that the suspect is a bottleneck; rather, it will only verify that you’ve encountered some kind of performance issue that warrants further research.

The key to confirming a suspect is the ability to reproduce it. Until you can do that, the suspect is unconfirmed; and although unconfirmed suspects aren’t necessarily invalid, they’re generally given lower priority than confirmed suspects.

The single most important requirement for confirming a bottleneck suspect is the ability to reproduce the results that you or others have identified as indicating the suspect (that is, the symptoms). The easiest way to do this is typically to repeat the exact test that displayed the results in the first place. If the symptoms can’t be reproduced, it’s often the case that the observed condition that led to identifying the suspect was caused by something unrelated to the test.

If the suspected bottlenecks were first observed while you were using a tool, you should try to do whatever you can to reproduce the symptoms manually. It’s always possible that your test itself is causing a symptom that real users won’t encounter. Always validate the accuracy of your scripts before considering a suspected bottleneck that was detected by a script to have been confirmed. Even then, you’ll want to try to reproduce that suspected bottleneck manually, both while no one else is on the system and while the test is executing with the load at which the suspect bottleneck was first detected. The ability to observe the symptoms under one or both of these scenarios confirms a bottleneck suspect.

Additionally, if you observe symptoms of a bottleneck using tools, be sure you can reproduce those symptoms with similar tests, preferably with some variances, such as time of day, load, varying data or additional activities that seem to be unrelated to the symptoms. The ability to reproduce the symptoms in similar situations is a strong indicator that the issue deserves further research and is therefore a confirmed bottleneck suspect. These tests also tend to provide extremely useful information to the developers, who ultimately will be tuning the performance.

While you’re confirming your suspects, you should try to reproduce the symptoms with the simplest test (manual or automated) possible. For instance, try to reproduce the symptoms without load, or without performing any other activities while logged in as that user. It’s not absolutely necessary to be able to recreate the symptoms of the suspected bottleneck with a minimalist test, but it will answer one of the first questions that the stakeholders are bound to ask and will aid in your ability to demonstrate the suspect.

Just as with reproducing results using minimalist tests, reproducing results or symptoms using not-so-similar tests will help you demonstrate the existence of the suspect. It’s also a big step toward identifying the differences among a slow spot, a failure and a bottleneck. For example, a not-so-similar test may show that searching for a book and searching for a store near you on a retail site are both slow. If only searching for a store near you were slow, you might be tempted to think that something specific to that search was slow (a slow spot), but knowing that both types of searches are slow may lead you to think that the database is poorly tuned (that is, a bottleneck).

Figure D: Response time by test execution (response time in seconds)

Test execution          Home Page   Page 1
1 LAN User                   5.59     0.75
50 LAN Users                 9.08     0.74
100 LAN Users               10.14     1.49
150 LAN Users               10.11     2.37
200 LAN Users               48.21    41.22
100 128Kbs Users            11.42     6.80
100 56.6Kbs Users           24.85    14.10
100 28.8Kbs Users           33.89    27.09

Reporting Confirmed Suspects

Effective reporting of confirmed suspects is extremely important, and tricky. You’re often met with skepticism, disbelief, defensiveness or dismissiveness. The first thing to remember is that you’re not alone. Every performance tester who’s ever reported a suspected issue has faced this. The second thing to remember is that if you’ve followed the approach outlined above, you have a confirmed, reproducible suspect to report. If you report it well, no one will be able to refute that it’s a valid suspect. On the other hand, if you present your suspects poorly, overstate or understate them, or don’t report them at all, they may never get addressed. It’s our job as performance testers to ensure that these suspects get taken seriously and addressed appropriately. Above all, if you want to be taken seriously, remember this:

Rule 8: When reporting bottleneck suspects, don’t assume that you know the cause; just report the symptoms.

More specifically, don’t report suspected bottlenecks in a way that implies fault. Do describe all of the symptoms you’ve identified, not just the one you think is most relevant. Don’t speculate as to the cause of the bottleneck, even if you think you know what it is. Do describe all of the ways you’ve found to cause the symptoms. Don’t get defensive when challenged (it might really be the fault of your test), and always be prepared to support your claims.

Some other hints that I’ve found useful when reporting suspected bottlenecks include:

• Report verbally. Nothing improves your credibility or intra-team communication more than taking the time to discuss the symptoms rather than just e-mailing an observation.

• Report visually. As the old adage goes, “a picture is worth a thousand words.” Just make sure that the picture tells the story you want it to.

• Report via demonstration. Virtually everyone will want to “see and feel” the bottleneck, especially managers, executives and end users, none of whom have a great interest in technospeak.

What the Development Team Needs to Know

After you report the symptoms of suspected performance issues you’ve identified, the developers may recognize the symptoms and be able to resolve them in short order. If not, they’re going to need more information. Let’s review some of the most common questions the development team is likely to have.

How did you get that to happen? See the section on reproducing results, above.

Which related user activities produce the same symptoms? Related activities can help developers determine what object, function, module or hardware component is causing the symptoms.

Which other activities are affected by the bottleneck? This also helps the developers narrow down the potential causes.

How many (or how few) users were performing the activity before and during the appearance of the symptoms? Bottlenecks don’t only exist under load, but knowing the load at which they become noticeable is critical to the tuning process.

What was the distribution in time (arrival rate) of the symptoms? Unrealistic user models can easily cause symptoms that simply would not occur in production. Arrival rates are the most commonly exaggerated part of user models.

What were other users doing before and during the appearance of the symptoms? Often there is a single trigger event that causes the bottleneck, and finding it can be quite challenging.

What data did you use to create the symptoms? Sometimes something as seemingly innocent as the distribution of users can cause bottlenecks.

What was the value of a particular measurement? No matter how much information you have, there is always something else that will help the developers.

Can you re-execute that test and tell me if…? Always be ready to modify your parameters and run the test again.

After presenting your findings and determining what other information the developers will need to help them resolve the issue, the next step is to design tests that will produce the information the developers are looking for.

When you’re designing the tests, don’t worry about how you’ll develop them with the tools you have available. Thinking about the capabilities of the tools at your disposal while designing will almost always lead you down the road of designing tests that are easy to implement, instead of tests that will provide immediate value. Instead, ask yourself “what if” questions. For example: “What would happen to these symptoms if I eliminated all other user activity? If I added more user activity? If I used different data? If I changed the load characteristics? If I tested from multiple IP addresses? If I used a different navigation path to get to this activity?”

It is also generally useful to ask developers to speculate as to what kind of test they think will generate results that will be most useful to them. Of course, while you are asking yourself and the developers these questions, remember to continue evaluating objects with slow responses, and think in terms of distinguishing failures, slow spots and bottlenecks from one another.

[Figure E: Response vs. time into test scatter chart.]

Armed with the answers to these questions, we can now get to the heart of the matter. How do you build tests to collect this next level of information using the tools you have? Unfortunately, there’s no cookbook answer. Every piece of information is found in a different way, and even that changes from application to application, platform to platform, and development style to development style. As it turns out, there are several directions you can follow. You can modify existing tests, create new tests with the same tool, create new tests with a different tool, or use test harnesses in combination with either the same or different tools.

I’ve seen and taken part in lots of debates about just what a test harness is. Let’s agree that for our purposes, the term “test harness” means any helper application or application modification created for the purpose of making it easier to use the tool or tools at your disposal to collect information about a performance issue.

Test harnesses can be used in many situations. For instance, a simple Web page with an input box and a Submit button that connects to the database, but bypasses the rest of the application, is a powerful tool. In this case, you could record a script that passes in various SQL statements and record the response time. This test harness will allow you to quickly eliminate the database as the cause of the issue without having to go through the whole battery of tests.
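The core of such a harness is small: run a statement directly against the database and time it. The sketch below is an illustration only, assuming an in-memory SQLite database as a stand-in for the real one; the `books` table, its contents and the `time_statement` helper are hypothetical, and a real harness would connect to your application’s actual database.

```python
import sqlite3
import time


def time_statement(connection, sql, parameters=()):
    """Run one SQL statement directly against the database, bypassing the
    application, and return (elapsed seconds, number of rows returned)."""
    start = time.perf_counter()
    rows = connection.execute(sql, parameters).fetchall()
    elapsed = time.perf_counter() - start
    return elapsed, len(rows)


# Stand-in database with a hypothetical table, for illustration only.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
connection.executemany(
    "INSERT INTO books (title) VALUES (?)",
    [(f"Book {n}",) for n in range(1000)],
)

elapsed, row_count = time_statement(
    connection, "SELECT * FROM books WHERE title LIKE ?", ("Book 9%",)
)
print(f"{row_count} rows in {elapsed:.6f} seconds")
```

If the same statement is fast here but the page that issues it is slow, the database becomes a less likely suspect; if it is slow here too, you have something concrete to show the developers.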

It’s unlikely that you’ll be the one developing test harnesses. You’ll have to work closely with your development staff to create them. You should consider having test harnesses built whenever you can’t find another way to isolate a piece of information, even if it seems like you should be able to do this using your other tools. Often, once you start discussing test harnesses with your developers, they’ll come up with ideas for test harnesses that will provide performance information they otherwise wouldn’t be able to obtain easily.

Is It Time to Tune?

More times than I would have thought, when I report bottleneck suspects, someone in the room responds, “Oops. I know what that is. Scott, I’ll call you in a few hours and ask you to rerun your tests. I’m pretty sure this will be fixed.”

As far as I’m concerned as a performance tester, this is the ideal situation. Nine times out of ten I go back to my desk and work on something else for an hour or two, the developer calls, I rerun the test, and that suspected performance issue is gone.

The truth is, while you’re reporting the symptoms of suspected performance issues, you’ll generally find yourself engaged in conversations about the cause of the symptoms. If there’s consensus as to both the cause of the symptoms and how the problem might be resolved, the attempt should be made to resolve it (that is, to tune) immediately. If either the cause or the resolution is unclear, you’ll want to continue modifying tests, exploiting and researching.

Teamwork Pays Off

Now that we’ve discussed detecting performance suspects, distinguishing among failures, slow spots and bottlenecks, and how to start tracking down performance bottlenecks to a level of detail great enough for developers to tune them, I’d like to reemphasize the importance of increasing your level of interaction with the development team. Without a good relationship with the development team, it’s unlikely that you’ll ever be certain of anything more concrete than suspects, symptoms and hunches. By working closely together, you and your development team should be able to detect and tune bottlenecks quickly and efficiently.

In the next issue of Software Test & Performance, we’ll go on to discuss the activity that occurs between the time you determine that there is a performance bottleneck (bottleneck detection) and the time it gets tuned (bottleneck resolution). In this activity, our goal is to work with the tuners to provide them with the information they need to do their job effectively, thus minimizing the often agonizing trial-and-error process that is commonly a huge time sink during the testing and tuning process.
