Patterns for Performance and Operability

Building and Testing Enterprise Software

Chris Ford

Ido Gileadi

Sanjiv Purba

Mike Moerman

Auerbach Publications, Taylor & Francis Group

Boca Raton New York

Auerbach Publications is an imprint of the Taylor & Francis Group, an informa business

Contents

Dedications
The Purpose of This Book
Acknowledgments
About the Authors

1 Introduction
Production Systems in the Real World
Case 1: The Case of the Puzzlingly Poor Performance
Case 2: The Case of the Disappearing Database
Why Should I Read This Book?
The Non-Functional Systems Challenge
What Is Covered by Non-Functional Testing
Planning for the Unexpected
Patterns for Operability in Application Design
Ensuring Data and Transaction Integrity
Capturing and Reporting Exception Conditions in a Consistent Fashion
Automated Recovery from Exception Conditions
Application Availability and Health
Summary

2 Planning and Project Initiation
The Business Case for Non-Functional Testing
What Should Be Tested
How Far Should the System Be Tested?
Justifying the Investment
Negative Reasoning
Scoping and Estimating
Determining the Scope of Non-Functional Testing
Estimating Effort and Resource
Estimating the Delivery Timeline
Test and Resource Planning
Test Types and Base Requirements
Test Environments
The Test Team
Communication Planning
Setting Expectations
Summary

3 Non-Functional Requirements
What Are Non-Functional Requirements?
Do I Need Non-Functional Requirements?
Roles and Responsibilities
Challenging Requirements
Establishing a Business Usage Model
Quantifying Human and Machine Inputs
Expressing Load Scenarios
Non-Functional Requirements
An Important Clarification
Performance Requirements
Operability Requirements
Availability Requirements
Archive Requirements
Summary

4 Designing for Operability
Error Categorization
Design Patterns
Retry for Fault Tolerance
Software Fuses
Software Valves
System Health Checks
The Characteristics of a Robust System
Simple Is Better
Application Logging
Transparency: Visibility into System State
Traceability and Reconciliation
Resume versus Abort
Exception Handling
Infrastructure Services
Design Reviews
The Design Checklist
The Operability Review
Summary


5 Designing for Performance
Requirements
The "Ilities"
Architecture
Hotspots
Patterns
Divide and Conquer
Load Balancing
Parallelism
Synchronous versus Asynchronous Execution
Caching
Antipatterns
Overdesign
Overserialization
Oversynchronization
User Session Memory Consumption
Algorithms
Technology
Programming Languages
Distributed Processing
XML
Software
Databases
Application Servers
Messaging Middleware
ETLs
Hardware Infrastructure
Resources
Summary

6 Test Planning
Defining Your Scope
System Boundaries
Scope of Operability
Scope of Performance
Load Testing Software
Product Features
Vendor Products
Additional Testing Apparatus
Test Beds
Test-Case Data
Test Environments
Isolation
Capacity
Change Management
Historical Data
Summary

7 Test Preparation and Execution
Preparation Activities
Script Development
Validating the Test Environment
Establishing Mixed Load
Seeding the Test Bed
Tuning the Load
Performance Testing
Priming Effects
Performance Acceptance
Reporting Performance Results
Performance Regression: Baselining
Stress Testing
Operability Testing
Boundary Condition Testing
Failover Testing
Fault Tolerance Testing
Sustainability Testing
Challenges
Repeatable Results
Limitations
Summary

8 Deployment Strategies
Procedure Characteristics
Packaging
Configuration
Deployment Rehearsal
Rollout Strategies
The Pilot Strategy
The Phased Rollout Strategy
The Big Bang Strategy
The Leapfrog Strategy
Case Study: Online Banking
Case Study: The Banking Front Office
Back-Out Strategies
Complete Back-Out
Partial Back-Out
Logical Back-Out
Summary

9 Resisting Pressure from the Functional Requirements Stream
A Question of Degree
Pressures from the Functional Requirements Stream
Attention
Human Resources
Hardware Resources
Software Resources
Issue Resolution
Defining Success
Setting the Stage for Success
Framework
Roles and Responsibilities
Raw Resources Required by the Non-Functional Requirements Stream
Performance Metrics
Setting Expectations
Controls
The Impact of Not Acting
Summary

10 Operations Trending and Monitoring
Monitoring
Attributes of Effective Monitoring
Monitoring Scope
Infrastructure Monitoring
Container Monitoring
Application Monitoring
End-User Monitoring
Trending and Reporting
Historical Reporting
Performance Trending
Error Reporting
Reconciliation
Business Usage Reporting
Capacity Planning
Planning Inputs
Best Practice
Case Study: Online Dating
Maintaining the Model
Completing a Capacity Plan
Summary

11 Troubleshooting and Crisis Management
Reproducing the Issue
Determining Root Cause
Troubleshooting Strategies
Understanding Changes in the Environment
Gathering All Possible Inputs
Approach Based on Type of Failure
Predicting Related Failures
Discouraging Bias
Pursuing Parallel Paths
Considering System Age
Working Around the Problem
Applying a Fix
Fix versus Mitigation versus Tolerance
Assessing Level of Testing
Post-Mortem Review
Reviewing the Root Cause
Reviewing Monitoring
Summary

12 Common Impediments to Good Design
Design Dependencies
What Is the Definition of Good Design?
What Are the Objectives of Design Activities?
Rating a Design
Testing a Design
Contributors to Bad Design
Common Impediments to Good Design
Confusing Architecture with Design
Insufficient Time/Tight Timeframes
Missing Design Skills on the Project Team
Lack of Design Standards
Personal Design Preferences
Insufficient Information
Constantly Changing Technology
Fad Designs
Trying to Do Too Much
The 80/20 Rule
Minimalistic Viewpoint
Lack of Consensus
Constantly Changing Requirements
Bad Decisions/Incorrect Decisions
Lack of Facts
External Impacts
Insufficient Testing
Lack of Design Tools
Design Patterns Matter
Lack of Financial Resources
Design Principles
Summary

References

Index
