Copyrighted material John Tullis 10/20/2015 page 1 Scalability & Capacity John Tullis DePaul Instructor [email protected]

Copyrighted material John Tullis

04/21/23 page 1

Scalability & Capacity

John Tullis DePaul [email protected]



Scalability/Capacity

Where are the bottlenecks?




Data flow bottleneck details




Latency Defined

• Latency is defined here as the round-trip time between two systems: the sum of the delays in the network components traversed. Latency has two important measurements: the average round-trip time and the variance observed across multiple round trips. Latency is bounded on the low end by the minimum time to transit the electronic components in the path, plus the time required to serialize the data onto the media in the path, plus the propagation time required to cross the media.




Latency Defined

• The minimum time required to open a new TCP/IP connection and obtain data from a source is two round trips across the network. Therefore the best possible response to the user is two times the network’s round trip latency.

• The number of objects on a page can greatly impact the user’s perception of response time because of the way they magnify the network latency during the retrieval of those objects.
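The two-round-trip floor, and the way page objects magnify it, can be put into numbers. The following is a minimal sketch, assuming each object is fetched over its own new TCP connection and that a fixed number of connections run in parallel. The function name and the `parallel` parameter are illustrative and do not come from the slides.

```python
import math

def min_page_time_ms(rtt_ms: float, objects: int, parallel: int = 1) -> float:
    """Lower bound on page load time from network latency alone.

    Assumes (hypothetically) one new TCP connection per object, costing
    two round trips each, with `parallel` connections open at a time.
    Server processing and transfer time are ignored.
    """
    if objects < 1 or parallel < 1:
        raise ValueError("objects and parallel must be >= 1")
    batches = math.ceil(objects / parallel)  # sequential waves of fetches
    return 2 * rtt_ms * batches

# A single object over an 80 ms round trip: 2 x 80 = 160 ms at best.
print(min_page_time_ms(80, 1))              # 160
# Five objects fetched one after another: 5 x 160 = 800 ms.
print(min_page_time_ms(80, 5))              # 800
# The same five objects over two parallel connections: 3 waves = 480 ms.
print(min_page_time_ms(80, 5, parallel=2))  # 480
```

Even with a modest round trip, a page with several objects spends most of its response time waiting on the network rather than on the server.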




“Latency” perspective




Firewall impact on data flow bottlenecks

• Firewall impacts: compared to normal packet forwarding, firewalls with logging and proxy enabled can reduce throughput by as much as 50%. Note also that today’s firewall systems employ more than just proxy and packet filtering.

• A comprehensive firewall will do stateful filtering / inspection, Network Address Translation (NAT), extensive logging, IP spoofing detection, SYN attack detection and anti-virus scanning. Each of these elements has a cost and must be considered in the overall performance budget.




Caching

• The addition of memory to a system almost always improves performance, because physical I/O is a relatively expensive operation in terms of latency. It makes intuitive sense, then, that dedicating memory in a web server to store frequently accessed HTML pages and images will improve performance. As a rule of thumb, your web server should have enough RAM to accommodate all network buffers, frequently used applications, images and HTML, including those mounted via DFS.

• This is especially important for dynamically generated pages.




Caching as a solution approach

• Caching Approach Benefits:

• Share the Cache (Cache hit)

• Reduce delay

• Reduce WAN traffic

• Security Issues:

• (1) Cache consistency

• (2) Scalability

• Non-cached items

• SSL, Cookies, "?"...

• Greater delay (Extra Hop)




Caching as a solution approach

• Terminology

• Hit - item is in cache

• Miss - item is not in cache

• Stale hit - item is cached but out of date

• Check the cache!

• Administratively configured

• HTTP 1.1 Cache-Control header

• On Command

• Prefetching (N most active, etc.)

• Cache is effective when it is large with a "large" user base.
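The hit, miss, and stale-hit terms above can be illustrated with a toy cache. This is a sketch only: it assumes freshness is decided by a single fixed time-to-live, whereas the slide lists several ways of checking the cache. The class name `TtlCache` and its methods are hypothetical.

```python
import time
from typing import Dict, Optional, Tuple

class TtlCache:
    """Toy cache that classifies lookups as hit, miss, or stale hit."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        # key -> (value, time the value was stored)
        self._items: Dict[str, Tuple[str, float]] = {}

    def put(self, key: str, value: str, now: Optional[float] = None) -> None:
        self._items[key] = (value, time.time() if now is None else now)

    def lookup(self, key: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        entry = self._items.get(key)
        if entry is None:
            return "miss"        # item is not in cache
        _, stored_at = entry
        if now - stored_at > self.ttl:
            return "stale hit"   # item is cached but out of date
        return "hit"             # item is in cache and still fresh

cache = TtlCache(ttl_seconds=60)
cache.put("/index.html", "<html>...</html>", now=0)
print(cache.lookup("/index.html", now=30))   # hit
print(cache.lookup("/index.html", now=90))   # stale hit
print(cache.lookup("/missing.gif", now=90))  # miss
```

Passing `now` explicitly keeps the example deterministic; a real cache would rely on the clock and on whatever freshness policy the administrator configures.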




Cache details




Planning for latency & processing time




Web server logging

• Web server logs, in particular the Access Log, record important information about the use of the web server. However, these logs can become very large and should be pruned and/or archived regularly. Logging can have a surprisingly large impact on response time and throughput. For this reason, you will often find that vendors turn logging off for benchmarking purposes.

• Production sites don’t turn logging off. Because high-use web sites have correspondingly high logging activity, logs should be placed on the fastest devices available. Striping and mirroring of logs can also help to improve performance by streamlining I/O operations.




Web server logging

• Use the logs to identify your peak processing periods. Then collect information from your peak periods to use for trend analysis. You will particularly want to look at statistics such as the number of connections per second and the number of hits per second to understand your workload. This will help you to anticipate the need to add servers or to upgrade existing server resources.

• By actively monitoring your workload, you can become proactive in managing it to minimize response time and maximize throughput.
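Finding the peak from an access log can be sketched in a few lines. The example assumes the log uses Common Log Format timestamps such as `[10/Oct/2000:13:55:36 -0700]`; the sample lines are made up for illustration, and the function name `peak_second` is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Matches the bracketed timestamp of a Common Log Format line,
# e.g. [10/Oct/2000:13:55:36 -0700]; captures it to one-second resolution.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}) [+-]\d{4}\]")

def peak_second(lines: Iterable[str]) -> Tuple[str, int]:
    """Return (timestamp, hits) for the busiest second in the log."""
    hits: Counter = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:
            hits[match.group(1)] += 1
    if not hits:
        raise ValueError("no timestamped lines found")
    return hits.most_common(1)[0]

# Made-up sample lines, for illustration only.
sample = [
    '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '10.0.0.2 - - [10/Oct/2000:13:55:36 -0700] "GET /logo.gif HTTP/1.0" 200 512',
    '10.0.0.1 - - [10/Oct/2000:13:55:37 -0700] "GET /cart HTTP/1.0" 200 1045',
]
print(peak_second(sample))  # ('10/Oct/2000:13:55:36', 2)
```

Run against a full day of logs, the same count per second (or per hour, by truncating the captured timestamp) gives the hits-per-second figure used in the capacity planning slides.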




Network issues

• If there is no contention for either CPU or memory resources on your system, and you are experiencing performance problems, you may have a network issue to resolve. After all, while incoming web requests may be relatively small, outgoing web responses can contain large graphics, applets, video or audio files. It is important to make sure that the number and size of the TCP buffers are tuned appropriately on the web server platform.

• Because the size of the requests coming into the server is so different from the size of the responses going out, many sites have routed incoming requests along relatively “thin” network pipes such as Token Ring while routing their outgoing responses along relatively “fat” network pipes such as FDDI.




Use multiple servers

Note: multiple content sources are a problem. Remember the content management presentations!




Capacity planning - Facts & Figures

• Assume $5 million revenue / year.

• Assume 1/2 million orders / year ($10 avg. order size).

• Assume 5% conversion rate, therefore 10M visitors / year, which means 10M / 365 = 27,397 visitors / day. Round up to 30K.

• Assume all 30K appear in an 8 hour period for peak, so 3750 / hour. Round up to 4K.

• Assume in any given second, 1/4 of visitors are hitting the system. So 1000 / second.

• Assume average page has 5 HTTP hits. Therefore, 5K hits / second.
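The chain of assumptions above can be reproduced as a short calculation. This is a sketch that follows the slide's steps literally; the rounding steps (to the next 10,000 visitors per day and the next 1,000 per hour) are an assumption chosen to match the slide's "round up to 30K" and "round up to 4K", and the function names are illustrative.

```python
import math

def round_up(value: float, step: int) -> int:
    """Round value up to the next multiple of step."""
    return math.ceil(value / step) * step

def peak_hits_per_second(orders_per_year: int = 500_000,
                         conversion_rate: float = 0.05,
                         peak_hours: int = 8,
                         active_fraction: float = 0.25,
                         hits_per_page: int = 5) -> int:
    visitors_per_year = orders_per_year / conversion_rate              # 10M
    visitors_per_day = round_up(visitors_per_year / 365, 10_000)       # 27,397 -> 30K
    visitors_per_hour = round_up(visitors_per_day / peak_hours, 1_000) # 3,750 -> 4K
    active_per_second = visitors_per_hour * active_fraction            # 1,000
    return int(active_per_second * hits_per_page)                      # 5,000

print(peak_hits_per_second())  # 5000
```

Keeping the assumptions as named parameters makes it easy to see how sensitive the result is: halving the conversion rate, for example, doubles the visitor count that feeds every later step.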





• Go to www.spec.org.

• Get the SPECweb99 benchmark. SPECweb99 is the next-generation SPEC benchmark for evaluating the performance of World Wide Web Servers. This is based on:

• Standardized workload, agreed to by major players in WWW market

• Stable implementation with no incomparable versions

• Measurement of simultaneous connections rather than HTTP operations

• Simulation of connections at a limited line speed

• Dynamic GETs, as well as static GETs; POST operations.

• Keepalives (HTTP 1.0) and persistent connections (HTTP 1.1).

• Dynamic ad rotation using cookies and table lookups.

• File accesses more closely matching today's real-world web server access patterns.

• Inter-client communication using sockets.





• However, right now very few vendors have published SPECweb99 results, so you will probably have to use the SPECweb96 results.

• Get the SPECweb96 benchmark for a platform, say: IBM F50.

• The IBM F50 has a SPECweb96 rating of 6716.

• 5000 / 6716 = 0.74, or roughly 75% load.

• Because this host will also run the ecommerce engine and other software (personalization, advertising, taxation, etc.), assume that there will be 2 F50 boxes.
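The load figure is a simple ratio of peak hits per second to the benchmark rating. The sketch below treats the SPECweb96 rating as sustainable hits per second, as the slide does. The `servers` parameter assumes, purely for illustration, that hits are spread evenly across identical boxes.

```python
def web_server_load(peak_hits_per_second: float,
                    rating_ops_per_second: float,
                    servers: int = 1) -> float:
    """Fraction of rated capacity consumed at peak."""
    if servers < 1 or rating_ops_per_second <= 0:
        raise ValueError("need at least one server and a positive rating")
    return peak_hits_per_second / (rating_ops_per_second * servers)

# One F50 (rating 6716) facing 5,000 hits/second.
print(f"{web_server_load(5000, 6716):.1%}")             # 74.4%
# Two F50 boxes sharing the same hits evenly (an assumption).
print(f"{web_server_load(5000, 6716, servers=2):.1%}")  # 37.2%
```

The second figure shows the headroom the two-box configuration leaves for the ecommerce engine and the other software sharing the hosts.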




Capacity planning - Facts & Figures continued

• Assume 30% of Web pages are directed to the commerce engine. A page directed to the ecommerce engine generates not 5 HTTP hits but one. So 5000 / 5 = 1000 pages / second.

• 1000 x 30% = 300 hits / second. Use this to calculate the database impact.

• Assume 1/2 of that 30% is general processing (non-cached product pages, non-cached category access, searches, etc.): 150 hits/sec.

• Assume 1/3 of that 30% is updating orders (shop cart viewing and changes): 100 hits/sec.

• Assume 1/6 of that 30% is submitting orders: 50 hits/sec.

• Assume 2 DB updates per order update: (100 x 2) 200 DB writes/sec.

• Assume 5 DB updates per order submission: (50 x 5) 250 DB writes/sec.

• Therefore total DB writes: 450 DB writes/second.

• Therefore TPM-C equals: (450 x 60) = 27,000 DB writes/minute.
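The write-side arithmetic above can be checked with a short script. This is a sketch of the slide's own steps; exact fractions are used so the 1/2, 1/3 and 1/6 shares do not pick up floating-point error, and the function name is illustrative.

```python
from fractions import Fraction

def db_writes_per_minute(hits_per_second: int = 5000,
                         hits_per_page: int = 5,
                         commerce_share: Fraction = Fraction(30, 100)) -> int:
    pages_per_second = hits_per_second // hits_per_page   # 1,000
    commerce_hits = pages_per_second * commerce_share     # 300 hits/sec

    order_updates = commerce_hits * Fraction(1, 3)        # 100 hits/sec
    order_submits = commerce_hits * Fraction(1, 6)        # 50 hits/sec
    # The remaining 1/2 (150 hits/sec) is general processing with no writes.

    writes_per_second = order_updates * 2 + order_submits * 5  # 200 + 250
    return int(writes_per_second * 60)

print(db_writes_per_minute())  # 27000
```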




Standards & Benchmarks

• The Transaction Processing Performance Council™ (TPC) was founded in August 1988 by eight leading hardware and software companies as a non-profit organization with the objective to produce common benchmarks to measure database system performance.

• More than 50 companies are currently members of the council. There are two active benchmarks (TPC-C and TPC-D) for which results can be published. They can be used to calculate database performance.




Capacity planning - Facts & Figures continued

• Of the 300 hits/second directed @ ecommerce engine, assume that each generates 5 database reads: (5 x 300) = 1,500 DB reads/sec.

• This generates 90,000 DB reads/minute.

• Total DB reads & writes: (27,000 + 90,000) = 117,000 database transactions/minute. This is your calculated TPC-C.

• Check the TPC-C number of your machine @ www.tpc.org.

• Assume this is: 135,815. (IBM Enterprise Server S80 with Oracle 8i as the DB server)

• Are you okay? If your calculated TPC-C is less than the number rated for your machine, you are okay. Here, 117,000 / 135,815 = .86. But if this were greater than 1, your database machine would be undersized.

• Note also we rounded up the original numbers.
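The read-side total and the sizing check can be sketched the same way. The defaults below are the slide's figures; the function name and return shape are illustrative.

```python
from typing import Tuple

def db_utilization(commerce_hits_per_second: int = 300,
                   reads_per_hit: int = 5,
                   writes_per_minute: int = 27_000,
                   rated_tpmc: int = 135_815) -> Tuple[int, float]:
    """Return (calculated transactions/minute, fraction of rated capacity)."""
    reads_per_minute = commerce_hits_per_second * reads_per_hit * 60  # 90,000
    total_per_minute = reads_per_minute + writes_per_minute           # 117,000
    return total_per_minute, total_per_minute / rated_tpmc

total, ratio = db_utilization()
print(total, round(ratio, 2))  # 117000 0.86
print("undersized" if ratio > 1 else "okay")  # okay
```

A ratio near 1 leaves little room for the traffic spikes discussed later, so the 0.86 here is adequate but not generous.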




Capacity planning - calculate WS/EC RAM

• Calculate machine RAM.

• Assume you run 20 ecommerce “daemons” @ 25 Mbytes each: 500 Mbytes. Note that you can increase the number of server processes as long as the CPU activity is less than 80%, paging is very low, and CPU wait is very low (less than 5%).

• Assume Web server takes 50 Mbytes.

• Assume OS takes 40 Mbytes.

• Therefore, for Webserver & ecommerce host, you need at least 590 Mbytes of RAM.

• But additional software will require additional space. For example: Taxware, Net.Perceptions, Double-Click Adserver, etc.
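The RAM estimate for the Web server and ecommerce host is a straight sum, sketched below with the slide's figures as defaults. The `extra_mb` parameter is a hypothetical placeholder for the additional software the slide mentions; its sizes are not given in the source.

```python
def web_host_ram_mb(daemons: int = 20,
                    mb_per_daemon: int = 25,
                    web_server_mb: int = 50,
                    os_mb: int = 40,
                    extra_mb: int = 0) -> int:
    """Minimum RAM in Mbytes for the Web server and ecommerce host."""
    return daemons * mb_per_daemon + web_server_mb + os_mb + extra_mb

print(web_host_ram_mb())            # 590
print(web_host_ram_mb(daemons=30))  # 840
```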




Capacity planning - calculate DB host RAM

• Assume 1st DB connection takes 5 Mbytes, each additional connection takes 1/2 Mbyte.

• Assume 2 connections per ecommerce “daemon”.

• This means (20 x 2) = 40 connections, so (39 x 0.5 + 5) = 19.5 + 5 = 24.5 Mbytes.

• Assume database server requires: 150 Mbytes.

• Assume indexing & querying takes 200 Mbytes.

• Assume indexes held in memory take: 500 Mbytes.

• Assume OS requires 40 Mbytes.

• Database server total: (25 (rounded) + 150 + 200 + 500 + 40) = 915 Mbytes of RAM. Round this up to 1 Gbyte of RAM.

Note: there may be other programs running - MQSeries, etc. Those also should be factored in here.

Note: more than 4 Gbytes of RAM provides no additional benefit.
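The database host estimate can be sketched as follows, using the slide's figures as defaults. Parameter names are illustrative.

```python
import math

def db_host_ram_mb(daemons: int = 20,
                   connections_per_daemon: int = 2,
                   first_connection_mb: float = 5.0,
                   extra_connection_mb: float = 0.5,
                   db_server_mb: int = 150,
                   query_mb: int = 200,
                   index_mb: int = 500,
                   os_mb: int = 40) -> int:
    """RAM in Mbytes for the database host, before any other programs."""
    connections = daemons * connections_per_daemon            # 40
    if connections < 1:
        raise ValueError("need at least one connection")
    # First connection is expensive; each additional one is cheaper.
    connection_mb = first_connection_mb + (connections - 1) * extra_connection_mb  # 24.5
    return math.ceil(connection_mb) + db_server_mb + query_mb + index_mb + os_mb

print(db_host_ram_mb())  # 915
```

Connection memory is a small part of the total here; the in-memory indexes dominate, so the index size assumption deserves the most scrutiny.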




Capacity planning - calculate DB host RAM

• However, note that there is a separate and simpler way to calculate RAM for the database server host. Simply assume that 512 Mbytes of RAM are required for each CPU. Thus:

• 1 CPU means 512 Mbytes RAM

• 2 CPUs means 1 Gbyte RAM

• 4 CPUs means 2 Gbytes RAM

• 8 CPUs means 4 Gbytes RAM

• Note that for the typical WebSphere Commerce implementation, the IBM lab recommends 4 CPUs, which implies 2 Gbytes of RAM.
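The per-CPU rule of thumb is a single multiplication; the sketch below simply tabulates it. The constant and function names are illustrative.

```python
MB_PER_CPU = 512  # rule of thumb from the slide

def db_ram_mb_for_cpus(cpus: int) -> int:
    """RAM in Mbytes implied by the 512-Mbytes-per-CPU rule."""
    if cpus < 1:
        raise ValueError("cpus must be >= 1")
    return cpus * MB_PER_CPU

for cpus in (1, 2, 4, 8):
    print(cpus, "CPU(s):", db_ram_mb_for_cpus(cpus), "Mbytes")
# 1 CPU(s): 512 Mbytes
# 2 CPU(s): 1024 Mbytes
# 4 CPU(s): 2048 Mbytes
# 8 CPU(s): 4096 Mbytes
```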




Capacity planning - calculate DB server Disk

• For database server hard disk, assume: 50,000 products.

• Assume 4 Kbytes/product: (50,000 x 4,000) = 200 Mbytes.

• Assume index takes the same amount of space = 200 Mbytes.

• Assume that text extender indexing is x10: 2,000 Mbytes.

• Total database disk space for products: 2.4 Gbytes.

• However, you need to calculate customer profile space also. If you keep 2 years’ worth of profiles online, each profile being 2K, @ 500,000 orders per year: (2K x 500,000 x 2) = 2 Gbytes.

• Note that this does not count profiles for customers who do not order. Use the 5% conversion rate for this, but assume that you store 1 year’s worth and profiles are only 1K: (1K x 500,000/.05 x 1) = 10 Gbytes.




Capacity planning - calculate DB server Disk

• Total is now about 14.5 Gbytes (2.4 + 2 + 10 = 14.4).

• Assume disk space for each order is 2K.

• 2 years of orders = (500,000 x 2 x 2K) = 2 Gbytes.

• Add disk space for OS, software, message queues. This may be 5 Gbytes, depending especially on message queue size. Note that OS swap space should be equal to twice RAM for AIX, 2.5 times RAM for Solaris. With 2 Gbytes of RAM, this works out to 4 - 5 Gbytes.

• Total is now (14.5 + 2 + 5) = 21.5, or about 20 Gbytes.

• If the hosting vendor backs the database server up to local drives 1st, then size to 40 Gbytes.

• Finally, use 6 or more drives for each 3 Gbytes for performance.
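The disk sizing across the two slides can be gathered into one calculation. This is a sketch using the slides' figures and decimal units (1 Gbyte = 1,000,000 Kbytes). The exact sum comes to 21.4 Gbytes because the intermediate total is 14.4 rather than the rounded 14.5. Function and key names are illustrative.

```python
from typing import Dict

def db_disk_gb(products: int = 50_000,
               kb_per_product: int = 4,
               text_index_factor: int = 10,
               orders_per_year: int = 500_000,
               conversion_percent: int = 5,
               system_gb: float = 5.0) -> Dict[str, float]:
    kb_per_gb = 1_000_000  # decimal units, as in the slides

    product_kb = products * kb_per_product              # 200 Mbytes
    index_kb = product_kb                               # same size as the data
    text_index_kb = product_kb * text_index_factor      # 2,000 Mbytes

    buyer_profiles_kb = 2 * orders_per_year * 2         # 2K each, 2 years
    visitors_per_year = orders_per_year * 100 // conversion_percent  # 10M
    visitor_profiles_kb = 1 * visitors_per_year * 1     # 1K each, 1 year
    orders_kb = 2 * orders_per_year * 2                 # 2K each, 2 years

    sizes = {
        "products": (product_kb + index_kb + text_index_kb) / kb_per_gb,
        "buyer_profiles": buyer_profiles_kb / kb_per_gb,
        "visitor_profiles": visitor_profiles_kb / kb_per_gb,
        "orders": orders_kb / kb_per_gb,
        "system": system_gb,  # OS, software, message queues
    }
    sizes["total"] = round(sum(sizes.values()), 1)
    return sizes

sizes = db_disk_gb()
print(sizes["products"], sizes["total"])  # 2.4 21.4
```

Visitor profiles account for nearly half of the total, so the retention period for non-buyers is the assumption with the most leverage on disk cost.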




General suggestions

• Be sure to account for daily and seasonal variances in expected traffic to your site. Be aware that promotions and other events can often increase traffic to a site by a factor of 5 or more. Configurations with separate application and database servers may be able to run with less hardware during average months, and connect more hardware as needed during high season.

• Keep in mind that it is easier to ramp up to increased site demand if your database server has extra unused capacity. Extra capacity can be achieved at the database via the use of SMP systems, which scale vertically with additional CPU boards and memory.




Recommendations for Web servers

• Analyze your application workload, segment it into informational and transactional types, and deploy each on separate systems of appropriate strength.

• Plan for peak access and design a ‘load shedding’ option.

• Set the minimum number of web server threads at a high enough level to accommodate daily "peak hour" loads, to avoid synchronous allocation of threads at peak request time.

• Monitor server statistics, or set alerts to determine when normal settings have been exceeded. If settings are exceeded on a frequent basis, consider increasing the minimum number of threads on your server.




Recommendations for Web servers

• Use fast devices for Web Server logs. Stripe your logs across devices when possible.

• Locally cache all objects which will be referenced frequently.

• Java servlets are much faster than CGI-BIN programs and run faster if they are preloaded.




Recommendations for MQSeries

• An application must establish a connection to a message server prior to accessing its message queues. To minimize the time and resources consumed by connection handling, strive to minimize the number of "connect" operations. If possible, design an application to connect only once during the period in which queue access is needed.

• A message queue must be opened before an application can "get" or "put" a message to the queue. The queue "open" operation is relatively expensive compared with the "get" or "put" of a message. Therefore, strive to minimize the number of "open" operations to a queue.

• Use persistence only when necessary. Many messages can be recreated or resent if necessary.

• Use very fast log devices, and stripe log datasets when possible. Logging occurs even without persistent messages.
