
Deploying Your Application in the Cloud: Strategies to Proactively Mitigate Performance Risk

A Shunra Software Best Practices White Paper By Marty Brandwin

WAN. Web. Mobile. Cloud. Confidence in Application Performance™


© 2011 Shunra Software Ltd. All rights reserved. Shunra is a registered trademark of Shunra Software.

A Shunra Software White Paper

Corporations worldwide are shifting technology resources and infrastructure to the Cloud. These businesses expect to realize gains in operational efficiency and scalability as a result of the Cloud’s elasticity, and they expect to reduce capital expenditures on IT infrastructure as they migrate to an operational, pay-as-you-go expense model and offload typical infrastructure management responsibilities (and costs) to the Cloud provider.

Today, organizations recognize the value and significant gains that Cloud computing offers. They are also knowledgeable enough to recognize the risks involved with Cloud deployments, such as the potential bottlenecks and points of failure that are introduced as application topology and dependencies now include extra hops to the Cloud. Other risks include network latency, data security, bandwidth limitations, reliance on third party content delivery networks, and potential development costs if application architecture or components require refactoring. The end result of all of these possible impairments is reduced application performance and a poor user experience.

Cloud computing, therefore, is not an instant “win”. It is critical to analyze the potential tradeoffs that may be necessary when moving an application, or some of its components, to the Cloud. It is also vital to be proactive in determining the impact these changes will have on application performance and, most importantly, user experience.

Is my application Cloud-ready?

When analyzing an existing application for its Cloud-readiness, it is imperative to break down the application into its core dependencies, components and functionality. With each “piece” of the application, organizations must weigh the unique benefits and risks to determine whether the Cloud paradigm is the best option – whether each component will function as expected in the Cloud, whether it is scalable, what costs will be incurred to maintain the component in the Cloud, and how end users will experience it.

[Figure: two identical client/server login sequences (session initiation, login request and reply, login page request and download, sporadic acknowledgements, session teardown). With 1 msec latency for a LOCAL user the sequence completes in 3 seconds; with 50 msec latency for a REMOTE user the same sequence takes 30 seconds. Additional latency introduced by extra hops to the Cloud has an additive effect that can impair end user experience.]
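The additive effect illustrated by the local-versus-remote comparison can be captured in a simple back-of-the-envelope model: each application “turn” (a request/reply exchange) costs one full round trip, so total response time grows linearly with latency. The turn count and base time below are illustrative assumptions, not measurements taken from the diagram:

```python
def response_time_s(turns: int, one_way_latency_s: float, base_time_s: float) -> float:
    """Rough turns model: each request/reply turn costs one round trip
    (twice the one-way latency) on top of fixed server/transfer time."""
    return base_time_s + turns * 2 * one_way_latency_s

# Hypothetical chatty login sequence with 300 application turns:
local = response_time_s(300, 0.001, 0.5)    # ~1.1 s at 1 msec latency
remote = response_time_s(300, 0.050, 0.5)   # ~30.5 s at 50 msec latency
```

Note how a 50x increase in latency dominates the total: the fixed work stays constant while the per-turn cost multiplies.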

Typically, preparing an application for the Cloud requires one of two application development efforts: re-architecting application components with a SaaS-like infrastructure, or building new components and applications that leverage Cloud APIs for design, process and workflow. Both situations introduce costs and performance risk to the application.



Cloud infrastructure changes mean existing investments in architecture, data structure and performance engineering may not carry over. Re-architecting the middleware and back-end tiers of an application to leverage Cloud APIs can be a significant undertaking. Application development and management platforms must be capable of supporting the Cloud model throughout all stages of the application development lifecycle. Without appropriate planning for the development, refactoring and management of applications deployed to the Cloud, organizations may be forced to seek out ad hoc solutions that represent additional costs and corporate investment, offsetting at least some of the expected gains from a Cloud migration.

Most importantly, all of these changes put a burden on the QA/Testing team. Not only does application functionality in the Cloud need to be validated; so do performance and adherence to service level objectives (SLOs). An application that performs well in the traditional datacenter faces new performance risk from the variability of Cloud hosting.

Complicating the migration, and critical to accurately assessing application topology changes, is the requirement to have a thorough understanding of the services and architecture offered by the Cloud provider and the role of third-party vendors that may be working with the provider (content delivery networks, for example). Service level guarantees and other performance metrics are increasingly easy to establish and monitor, though it is much more difficult to anticipate unplanned outages, and resulting application behavior, in the Cloud as opposed to the traditional data center.

Moving from the traditional datacenter and into the Cloud paradigm necessitates a hand-off of control – control of data, control of centralized IT functionality. Best practices, therefore, dictate a well-choreographed and thorough performance assessment of the application in advance of deployment to the Cloud. While management and maintenance control is largely relinquished, preparedness and validation of application performance provides the assurance IT organizations need to confidently deploy to the Cloud.

Proactively testing (and validating) end user experience

Now that you have thoroughly assessed Cloud provider capabilities and applied that knowledge to your application development and hosting plans, there is one more requirement to complete your proactive strategy: validate and ensure end user experience.

The best-laid plans cannot fully anticipate and account for the performance and experience risks associated with deploying applications in the Cloud. In fact, application issues within the Cloud environment can not only resurface, as they did in the datacenter, but also be magnified. Take for example the latency implications of a chatty application – the introduction of minimal additional latency can create significant performance bottlenecks when a large number of application calls are occurring. In addition, multi-tenancy and shared Cloud resources mean that some applications can be negatively impacted by high load and resource requirements from other applications.

Pre-deployment performance testing is essential.

The current Cloud performance testing paradigm requires a pre-deployment migration of application components and data to a Cloud-based staging area in order to test functionality, establish benchmarks and set expectations. Copying over virtual machines and other components to the Cloud from the datacenter introduces its own performance and resiliency risks that need to be understood.

Once application components or a reference system are deployed, which can be time-intensive, additional testing code may be required and the application may be placed in a debug state. From there, the application or its components can be stress tested and the interaction of both the Cloud-based and datacenter-based components can be analyzed. What-if scenarios, times of peak load, scalability, etc. are all conditions that can then be tested. While this high-level view of testing is consistent with what QA and Performance Engineers have come to expect in traditional datacenters, the pay-as-you-go model of the Cloud makes this a costly proposition.

Rather, pre-deployment testing in the datacenter, with real Cloud-based simulation, is a more cost-effective and flexible means for testing applications. By precisely emulating Cloud conditions and services prior to deployment, organizations are able to test more scenarios at less cost and be certain of end user experience.

Organizations must be able to:

• Collect real-world Cloud network information over time, including latency, jitter, packet loss, and bandwidth constraints
• Replay these real-world impairments in a test lab
• Understand datacenter location and end user location(s)
• Automatically recreate multiple network scenarios, including best- and worst-case conditions

This approach to pre-deployment testing empowers organizations to proactively plan for and successfully deploy applications to the Cloud.
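The collect-and-replay workflow described above can be sketched in miniature: reduce recorded round-trip samples to the parameters a network emulator needs. The function and field names here are illustrative, not Shunra APIs:

```python
import statistics

def emulation_params(samples_ms):
    """Reduce recorded round-trip samples (None marks a lost probe)
    to emulator settings: mean delay, jitter, and loss rate."""
    received = [s for s in samples_ms if s is not None]
    return {
        "delay_ms": statistics.mean(received),
        "jitter_ms": statistics.stdev(received),
        "loss_pct": 100.0 * (len(samples_ms) - len(received)) / len(samples_ms),
    }

# Five probes recorded against a Cloud endpoint, one lost:
params = emulation_params([40.0, 60.0, None, 50.0, 50.0])
# delay_ms = 50.0, loss_pct = 20.0
```

Samples collected at different times of day can be summarized separately to produce the best- and worst-case scenarios the list above calls for.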



In addition, emulating Cloud conditions and simulating real-world usage scenarios, like outages and peak loads, early in the Cloud deployment/development lifecycle allows organizations to better anticipate and plan for capacity and resource requirements. Analysis of application behavior in the datacenter under Cloud conditions and what-if scenarios can also help organizations determine which application components are best suited for, or are even capable of being deployed to, the Cloud.

A Practical Example with Shunra’s PerformanceSuite

To realize value and the fastest return on your Cloud migration investment, best practices dictate proactive pre-deployment testing with solutions like Shunra’s PerformanceSuite. As the leading application performance engineering provider, Shunra has helped thousands of companies worldwide build performance into their applications, whether WAN, Web, Mobile or Cloud.

When a multinational entertainment company decided to migrate its online communities and social media properties to a private IBM-hosted Cloud, it turned to Shunra to proactively determine and validate its migration strategy. The company had several load generation tools available and functionality testing experience in the lab, but recognized the potential impact of the move on its end users and wanted to ensure optimal application performance based on network conditions.

The company knew that latency would be introduced to the online applications by the physics of a geographic move alone. However, they also needed to understand how additional gateways, network queues and conditions requiring packets to be re-sent could multiply this delay.

In order to test the impact of latency and other real-world network constraints, Shunra’s NetworkCatcher was deployed to the private Cloud to capture real-life latency, jitter and packet loss values. This data was then replayed in a test lab using Shunra’s PerformanceSuite and Shunra’s seamless integration with HP LoadRunner and Performance Center. The data was played in sequential order, and again in random order, with various factors imposed to change parameters in order to test performance and scalability under the breadth of real-life conditions.

The company was able to precisely recreate the conditions of the private Cloud and accurately simulate multiple test scenarios in the company’s on-site lab. As a result of an extensive and thorough pre-deployment performance test, Shunra helped the company validate the performance and associated requirements of the online communities prior to deployment. This was of utmost importance as the company operates one of the most popular family-focused communities on the Web, and user experience could not be compromised. Shunra was also able to quantify the potential gains in efficiency, providing a cost justification for the migration.

As a result of supporting this migration project, the company now employs Shunra for performance validation and needs analysis on dozens of online application releases annually.

Key Impairments and Risks

As we mentioned, network impairments that are experienced in the data center can be magnified within a Cloud architecture. Assessing performance under varying Cloud network conditions is essential. Impairments to consider include:

Latency

Latency is the amount of time required for a packet to reach its destination across a given physical link. It is also, more often than not, a primary source of performance problems. One way to think about latency is through a simple analogy: the driving distance between two points. How long a car takes to get from point A to point B depends on factors like distance, speed limits, and traffic congestion. If points A and B are close in proximity, then latency is negligible. As the distance becomes greater, however, as it does when you introduce a Cloud topology and the multiple gateways that must be traversed in a typical transaction, greater performance risk is introduced.

Factors contributing to latency include:

Geographic distance – increasing the distance between links introduces a delay based on the physics of sending data packets from one location to another; this delay is magnified by the potential need for additional “turns” or the need to re-send packets when they become corrupt or fragmented; a vicious cycle can result as the increased distance also increases the risk of packet corruption or loss.

Network queues – when traversing a network consisting of multiple intermediate networks, packets tend to “queue up” at busy routers, much as traffic accumulates at busy intersections; overloading these routes increases latency; and, if packets need to be re-sent, additional traffic, and thus latency, is created.

Before migrating an application to the Cloud, it is essential to understand the combined impact of real-world network latencies and application “turns” on the performance of critical business services to the end user.

NetworkCatcher enables capture and playback of real-world network behavior.

Jitter

Jitter is a measure of the variability of latency. It describes the variation in time (or delay) experienced between sending and receiving data packets. The result of jitter can be packet loss or re-ordering, which can have a dramatic impact on the performance of video or audio streams.
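As a concrete illustration, the interarrival jitter estimator used by RTP (RFC 3550) tracks exactly this variation as a smoothed running average; a minimal sketch:

```python
def interarrival_jitter(transit_times_ms):
    """RFC 3550-style jitter: a running average of the absolute
    difference in transit time between consecutive packets,
    smoothed with a gain of 1/16."""
    j = 0.0
    for prev, cur in zip(transit_times_ms, transit_times_ms[1:]):
        j += (abs(cur - prev) - j) / 16.0
    return j

# A perfectly steady stream has zero jitter:
interarrival_jitter([20.0, 20.0, 20.0, 20.0])   # -> 0.0
```

Media receivers use this running estimate to size their de-jitter buffers: the larger the jitter, the more buffering (and thus delay) is needed to play the stream smoothly.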

Bandwidth Availability

Bandwidth describes the amount of information that can travel over a link per unit of time. Data cannot be sent or received faster than the underlying media allows. Bandwidth considerations, however, are more complicated than just the raw speed at which data can be transmitted, known as theoretical bandwidth. Rather, when considering bandwidth and its impact on performance, we must consider other factors that affect how much of the available bandwidth can actually be used:

Bottlenecks – a network is only as fast as its slowest link; if users connect to a 1.5 Mbps WAN through a 56 Kbps dial-up link, real bandwidth is 56 Kbps.

Utilization – as with any channel, the more traffic there is (think about cars on the highway), the slower the speed.

Protocol overhead (bandwidth allocation) – different protocols impose different bandwidth penalties – i.e., the percentage of the data stream allocated to addressing and other control functions; for example, ATM has an overhead of roughly 10% (5 header bytes in every 53-byte cell), effectively lowering the network bandwidth available for data transfer by that amount.

Quality of Service (QoS) – many network providers allocate bandwidth based on the type of traffic or destination; for example, video may get a higher priority than email because of greater potential performance problems with video; similarly, traffic going to a corporate customer may be prioritized over traffic to a residential customer.

Asymmetric bandwidth – another complication occurs when downloaded data is received much faster than uploaded data, as with a Digital Subscriber Line (DSL) network; typically used in residential settings, when DSL is used in a business environment, even a small upload can temporarily slow or stop other data traffic.

In Cloud environments, the impact of network connections and the amount of data that can be carried is an essential consideration, especially since bandwidth is subject to contention by multiple applications. In a public Cloud environment, in particular, the performance of any given application is subject to the volume of traffic generated by all the other applications utilizing the same infrastructure.
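The bandwidth factors above combine: overhead and competing traffic each shave a fraction off the raw link rate, and the slowest hop caps the whole path. A small sketch, with the overhead and utilization figures as illustrative assumptions:

```python
def bottleneck_bps(*links_bps):
    """A path is only as fast as its slowest link."""
    return min(links_bps)

def transfer_time_s(payload_bytes, link_bps, overhead_pct=0.0, utilization_pct=0.0):
    """Transfer time once protocol overhead and competing traffic
    have eaten into the raw (theoretical) bandwidth."""
    effective_bps = link_bps * (1 - overhead_pct / 100) * (1 - utilization_pct / 100)
    return payload_bytes * 8 / effective_bps

# The 1.5 Mbps WAN behind a 56 Kbps dial-up link from the example above:
link = bottleneck_bps(1_500_000, 56_000)   # -> 56000
transfer_time_s(7_000, link)               # 7 KB takes 1.0 s on a clean link
```

In a multi-tenant Cloud, utilization_pct is the variable an organization controls least, which is exactly why it belongs in pre-deployment what-if testing.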

Packet Loss

In general, when data carried across a network is lost or corrupted, the affected packets must be resent. As discussed, this can compound network impairments like latency and jitter, causing significant performance degradation. This degradation is due not so much to the packet loss itself as to the time it takes for applications to respond to it. The most significant effect of packet loss is application timeouts – the length of time a network host is programmed to wait for a reply before resending. Each time a packet must be resent, the resulting timeout can severely reduce the quality of the end user experience.
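Under simple assumptions (independent loss, a fixed retransmission timeout), the expected cost of loss-driven timeouts can be estimated; the RTT, RTO and loss figures below are illustrative:

```python
def expected_response_time_s(rtt_s, loss_rate, rto_s):
    """Expected one-request time when each lost transmission costs a
    full retransmission timeout (RTO). With independent loss probability
    p, the expected number of timeouts is p / (1 - p) (geometric)."""
    expected_timeouts = loss_rate / (1.0 - loss_rate)
    return rtt_s + expected_timeouts * rto_s

# 100 ms RTT, 1 s RTO: even 2% loss adds ~20 ms on average, but the
# tail case (a full 1 s stall on a lost packet) is what users notice.
expected_response_time_s(0.100, 0.02, 1.0)
```

This is why the text emphasizes timeouts over the loss itself: the average barely moves, while the occasional full-RTO stall dominates perceived responsiveness.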

Packet loss can occur for several reasons:

Hardware or software bugs – packets can be assembled or disassembled incorrectly due to infrastructure or software defects.

Electrical problems – high power lines, inadequate noise isolation, air conditioners and other electrical sources can disrupt data transmission.

Network loads – when traffic arriving at a router exceeds its processing capacity, an overflow condition results; the router may handle this automatically by proactively dropping packets before the overflow occurs.

IP header corruption – when packet header information is corrupted, a router may misinterpret the packet as being invalid and drop it; header corruption typically occurs because of errors at the physical network layer which cause data bits to toggle.

Fragmentation – when a data packet exceeds the maximum size the network allows (the MTU), it may be broken into smaller packets before being sent on its way; this fragmentation takes time, increases the aggregate processing required (because there are more packets to process), and raises the risk of lost packets.

Networks are imperfect. Network conditions change. With a huge number of data packets flying in many different directions, across complex network infrastructures that incorporate multiple technologies from multiple vendors, not every 0 and 1 will travel from endpoint to endpoint exactly as expected.

Cloud migrations introduce performance risk that can and must be mitigated to maintain user satisfaction, productivity and/or revenue streams. A proactive approach to performance engineering empowers organizations to see how their code will behave under variable and worst-case conditions. By incorporating the realities of the network environment into the test cycle, organizations gain valuable insight into the vulnerabilities that can adversely affect application performance. And, they are best equipped to resolve issues before end users are affected – saving considerable time and money.


About Shunra

When deploying applications across WAN, Web, Mobile or Cloud-based networks, risk mitigation and cost avoidance is paramount. Today, 80% of the costs associated with application development occur in remediating failed or underperforming applications after deployment, where the ineffective application has already had a negative impact on the end user or customer experience. Shunra offers a proactive approach to application performance engineering (APE). When implemented at the policy level and as a best practice across the Application Lifecycle, the Shunra PerformanceSuite™ builds real-world application performance testing (latency, packet loss, bandwidth optimization, jitter) into all business and mission-critical applications, all prior to deployment. The Shunra solution discovers, predicts, emulates and analyzes the performance of applications over real-world networks – all within an offline, pre-production, test lab or COE environment. The results? Shunra provides customized performance results, enabling pre-production remediation and optimization, and confidence in application performance prior to deployment.

Shunra is the industry-recognized leader in Application Performance Engineering (APE), offering over a decade of experience with some of the most complex and sophisticated networks in the world. Customers include WalMart, McDonalds, Bank of America, Apple Computer, Cisco, Verizon, FedEx, GE, Walt Disney, TJX, Best Buy, eBay, Siemens, Motorola, Marriott, Merrill Lynch, AT&T, ADP, ING Direct, Citibank, Thomson Reuters, MasterCard, IBM, Boeing, HP, Pfizer, Intel, and the Federal Reserve Bank.

Shunra is based in Philadelphia, PA and is privately held. For more information, call 1.877.474.8672 or visit www.shunra.com.


North America, Headquarters
1800 J.F. Kennedy Blvd. Ste 601
Philadelphia, PA USA
Tel: 215 564 4046
Toll Free: 1 877 474 8672
Fax: 215 564
[email protected]

Israel Office
6B Hanagar Street
Neve Neeman B Hod Hasharon
45240, Israel
Tel: +972 9 764 3743
Fax: +972 9 764
[email protected]

European Office
73 Watling Street
London
EC4M 9BJ
Tel: +44 207 153 9835
Fax: +44 207 285
[email protected]

Call your local office TODAY to find out more! For a complete list of our channel partners, please visit our website, www.shunra.com


www.shunra.com

Ask Shunra About Our Proactive Strategies for Deploying Your Application in the Cloud Today!

Visit www.shunra.com and request to be contacted. Or contact Shunra directly at 1.877.474.8672 or 1.215.564.4046 (worldwide offices listed below)

Application Performance Engineering


WAN. Web. Mobile. Cloud. Confidence in Application Performance™