Why does my perfectly working App Crash and Burn in Production? Matt Kramer Project Manager, STL Boeing Scalability Test Lab 206-240-4260 cell


Why does my perfectly working App Crash and Burn in Production?

Matt Kramer

Project Manager, STL

Boeing Scalability Test Lab

206-240-4260 cell


The Business Need for Load & Stress

Time, Money, & Chaos impacts on core businesses

Lost productivity if the product is slow

Impacts on customer retention if your product becomes unstable or slows to a crawl

Life span of products

Getting product support from 3rd party solution providers while they are still under contract and before the big checks have been written

Support costs

Can the application support the users it needs to?

Will it slow down when user counts get too high?

What kind of response times can we expect in production?

How many users can the system support? Do we need more servers?

Did the partner deliver on the contracted deliverable and should we pay them?

Will it still take 30 seconds to log in if we have 300 people on the system?


Key Challenges

Getting good End User information- Ask a lot of questions, dig, mine the database for session info

Developing a good load profile- How many users, how long will they be on the system, how often will they complete tasks, are there peaks caused by work hours?

Getting a good, production-like environment- Is it networked correctly? If it is not a full prod copy, then at least have everything to scale.

Getting workable code- in enough time to develop results before a release. The age-old balance between functionality and access.

Under the hood of the Database- Getting a good view into what is happening in the database. I3 for Oracle solutions is great.
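
The end-user and load-profile questions above can be roughed out from session records. The sketch below is only an illustration, not a tool from the lab: it assumes login/logout timestamps have already been pulled from the application database, and the function names, the sample sessions, and the figure of 12 tasks per user per hour are all made up for the example (the 300 users echo the login question earlier in the deck).

```python
from datetime import datetime


def peak_concurrency(sessions):
    """Return the largest number of sessions open at the same moment.

    `sessions` is an iterable of (login, logout) datetime pairs.
    """
    events = []
    for login, logout in sessions:
        events.append((login, 1))    # a session opens
        events.append((logout, -1))  # a session closes
    # Sort by time; at identical timestamps closes (-1) sort before opens (+1),
    # so a user leaving at 09:00 is not counted together with one arriving at 09:00.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def tasks_per_second(users, tasks_per_user_per_hour):
    """Convert a per-user task rate into the overall rate the test must drive."""
    return users * tasks_per_user_per_hour / 3600.0


# Hypothetical session records (assumed data, not from a real system).
day = "2024-01-15 "
fmt = "%Y-%m-%d %H:%M"
sessions = [
    (datetime.strptime(day + "08:00", fmt), datetime.strptime(day + "09:00", fmt)),
    (datetime.strptime(day + "08:30", fmt), datetime.strptime(day + "10:00", fmt)),
    (datetime.strptime(day + "08:45", fmt), datetime.strptime(day + "08:50", fmt)),
    (datetime.strptime(day + "09:00", fmt), datetime.strptime(day + "09:30", fmt)),
]

peak = peak_concurrency(sessions)
rate = tasks_per_second(users=300, tasks_per_user_per_hour=12)
print("peak concurrent sessions:", peak)
print("target tasks per second:", rate)
```

With these sample sessions the peak is 3 concurrent users, and 300 users at 12 tasks per hour works out to 1 task per second. Real numbers should come from the end users and the production session data.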


Product Pre-Screening (if possible)

Code Quality

Product Overall Maturity

Coding standards

SQL Quality

Architecture


Risk of OutSourced and 3rd Party

Revenue - Cost = Profit. Capitalism as a system is always looking for ways to reduce cost. Even if a company has a product

Good Talent makes good Products

Good Talent costs $ - Ranked by cost (not by value): 1st world Talent, 1st world mediocrity, 3rd world Talent, 3rd world mediocrity

Talent Vacuum- Industry-wide issue with more demand than talent

History- India as a traditional supply of individual talent, not products.

Talent market in India- short supply, workers are more likely to switch jobs for promotions, the largest and most aggressive customers tend to get the best resources for as long as they are noisy.

Communication Gaps caused by: Language, Culture, Time zones, Distance

Dirty Laundry- Technical Salespeople don’t know, or won’t share the dirty secrets

The Deal vs the Real- Technologically illiterate executives commonly make purchasing decisions

Lack of Access- Getting access to fixes, answers, or details is next to impossible


Load & Performance Risk Factors

Code Maturity

Overall architecture

Has the product been migrated from another OS, platform, or from client server?

Has there been any re-architecture of the product recently?

Degree of Product Customization possible

Is the product cross platform?

Are there implementations in use of the type and size that your company needs (users & transaction size)?

Are there implementations in use with similar data set, database size, and data growth curve?

Stability of the underlying technologies being used

The number and complexity of integrations with different products

Degree of change in recent releases

High turnover of the development and support staff writing and supporting the application

Quality of the Database schema, normalization, upkeep, best practices

Degree that the product is being customized for your company


Information Needed (what will Production look like?)

Architectural understanding

User types- Batch jobs or other system impacts

Integrations

Networking

Any load balancing


Typical Issues/Bottlenecks

SQL statements without any indexes that quietly increase response times.

Memory not being released by processes- shows up in longer test runs

De-normalized databases which cause lots of large multi-table joins and slow response times

Load Balancing or Clustering solutions not fit for the volume of data they are supporting- common with Master/slave configurations or applications not meant to support clustering. Check the resource usage on the different servers.

Un-tuned Servers or services- memory allocation, buffer sizes

Code touching Technology Solution weaknesses- windows hot-folders with extreme amounts of data

Poor Architecture

Chatty applications- How large and how many round trips does a transaction take? Lots of round trips for an application act as a multiplier on response times

Reports- At large Companies Executive reports or reporting applications can have huge impacts on the database.
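
Two of the bottlenecks above, unindexed SQL and chatty applications, are easy to illustrate. The sketch below is a rough example under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for the production database (the deck mentions Oracle, whose plan output looks different), and the `orders` table, the index name, and the latency figures are hypothetical values chosen only for the demonstration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail text of SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the plan description


def estimated_response_ms(round_trips, latency_ms, server_ms):
    """Every round trip pays the network latency once, on top of server work."""
    return round_trips * latency_ms + server_ms


# Part 1: the same query before and after adding an index (hypothetical table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10000)],
)

sql = "SELECT total FROM orders WHERE customer_id = ?"
plan_before = query_plan(conn, sql, (42,))  # expected to be a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))   # expected to search via the index

print("before index:", plan_before)
print("after index: ", plan_after)

# Part 2: round trips as a multiplier (assumed figures: 50 ms latency, 500 ms server work).
chatty = estimated_response_ms(round_trips=200, latency_ms=50, server_ms=500)
batched = estimated_response_ms(round_trips=5, latency_ms=50, server_ms=500)
print("chatty transaction ms: ", chatty)
print("batched transaction ms:", batched)
```

With the assumed figures, the chatty transaction comes to 10500 ms against 750 ms for the batched one, with identical server work. On the database side, the plan changes from a scan of the whole table to a search through the index, which is why the missing index only hurts once the table has grown.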