Flash: AnEfficientandPortable WebServerWeb Servers 2/20 Fulfill client (e.g. web browser) request...

Flash: An Efficient and PortableWeb Server

Presentation by Nikolaos Giannarakis

Vivek S. Pai, Peter Druschel, Willy Zwaenepoel

November 18, 2015

Web Servers

Fulfill client (e.g. web browser) request for web content.To achieve high throughput web servers rely on cachingpopular content.If the requested content is not found in the cache then theserver must fetch it from the disk.Key idea: Overlap fetching (and other blocking actions) withserving requests for cached content.

Handling a request

Each task involves either network communication or disk I/Oand hence can block.Server must interleave the tasks of many requests to improveperformance.Different ways to achieve interleaving of tasks.

Multi-process architecture

Each process handles the tasks of a request sequentially.Multiple processes are used to serve many requestsconcurrently.

+ Interleaving in this model occurs naturally, as the OS willcontext switch when a process blocks.

− High overhead from multiple processes, less aggressiveoptimizations.

Multi-threaded architecture

Multiple threads are used to serve many requests concurrently.+ Less overhead compared to MP.+ Shared address space allows more aggressive optimizations.− Requires synchronization for accessing shared data.− Only feasible with OS that supports kernel threads.

Single-process event-driven architecture

Single process, interleaving of tasks is achieved usingnon-blocking I/O.Issues I/O and proceeds to next request. Uses polling tocheck for completion of I/O.

+ Minimal overhead, no context switches or synchronizationrequired.

− Missing abstraction of sequential execution of the tasks,control flow and reasoning become more complicated.

− In practice I/O operations may block.

Asymmetric multi-process event-driven architecture

Combines the previous approaches.A single process runs as an event loop but delegates blockingactions to helper threads/processes.

+ Less overhead compared to MP/MT, performance of singleprocess on cached requests.

+ Admits the same optimizations as SPED.− Only provides I/O concurrency, not CPU concurrency.

Design comparison

Disk BlockingThe cost of disk operations differ between the architecturesdepending on whether they can cause the system to block.In MP, MT and AMPED only one request gets blocked, thesystem can serve other requests.In SPED if one task blocks the whole server blocks as well.

Design comparison

Memory ConsumptionThe memory consumption of the server is important becausethe server caches requests. The more memory, the morerequests it can cache.SPED has very low memory consumption. It doesn’t requireany memory to keep track of children processes/threads.MT has some additional cost due to per-request threads.AMPED also incurs some overhead, but notice that AMPED’shelpers are per concurrent I/O task and not per request.MP has the highest memory overhead due to per-requestprocesses.

Design comparison

Disk UtilizationMultiple disks may improve performance if the server cangenerate multiple I/O operations concurrently.SPED can only generate one I/O operation hence it cannotbenefit from multiple disks (assumes it blocks?).MT, MP and AMPED can generate concurrent I/Ooperations.

Design comparison

Information GatheringServers perform profiling to uncover possibilities foroptimizations.Trivial in SPED and AMPED because of single thread.MP and MT require communication/synchronization.

Application-level CachingCaching of frequently accessed data (request responses, filemappings, etc.)Single cache for AMPED and SPED.

Flash: AMPED implementation

Implements helpers as processes for portability.Three types of application-level caching (filename translations,response headers, file mappings)Smart trick to keep data aligned by padding response headersaccordingly.

Performance Evaluation

Comparing architecturesCompare four servers based on Flash: Flash, Flash-MT,Flash-MP, Flash-SPED.

Comparing web serversCompare Flash and two state-of-the-art web servers Apacheand Zeus.

Evaluating optimizationsEvaluate the impact of the various optimizations implementedin Flash.

Evaluation environment

Same hardware (Pentium II, 128MB RAM).Two different operating systems (Solari, FreeBSD).100Mbit/s ethernet connections.

Synthetic workload

Cache-bound workload on Solaris

Same workload on FreeBSD

Real workload

Real workload on FreeBSD

Evaluating in a WAN

Previous evaluations were done in a LAN setting.This does not accurately model the increased connectiontimes due to packet losses and limited bandwidth that occurin a WAN.Idea: Use persistent connections

Impact of concurrent HTTP connections

Optimizations impact

Impact of optimizations

Each optimization (or cache hit) avoids one request.Performance may double thanks to optimizations.

Optimizations impact

Impact of optimizations

Each optimization (or cache hit) avoid one request.Performance may double thanks to optimizations.

Conclusions

Choice of server architecture is really important forperformance.The evaluation proved that the choice of OS is/was important.Flash matches the performance of SPED architectures oncache-bound workloads and exceeds the performance ofMP/MT architectures on disk-bound workloads.

Flash exceeds the performance of Zeus and Apache by agood margin.

Only focuses on I/O concurrency.

Flash: AnEfficientandPortable WebServerWeb Servers 2/20 Fulfill client (e.g. web browser) request...

Documents

1 11 Web Caching Web Protocols and Practice. 2 Topics Web Protocols and Practice WEB CACHING Cache Definition Goals of Web Caching Motivations for

Caching Technologies for Web Applications

An Evaluation of Caching Strategies for Clustered Thomas ... · An Evaluation of Caching Strategies for Clustered Web Servers Thomas Larkin A dissertation submitted to the University

Web Cache. Introduction what is web cache? Introducing proxy servers at certain points in the network that serve in caching Web documents for faster

1 Implementing ISA Server Caching. 2 Basics of Web Proxy Caching Definition: A Web cache sits between one or more Web servers (also known as origin servers)

Hyperbolic Caching: Flexible Caching for Web Applications · Hyperbolic Caching: Flexible Caching for Web Applications Aaron Blankstein, Siddhartha Sen†, and Michael J. Freedman

Web Caching and Data Compression

cse403 scale - courses.cs.washington.educourses.cs.washington.edu/courses/cse403/12au/lectures/cse403_s… · Scaling up through caching database server/ database web servers application

Web Caching Schemes1 A Survey of Web Caching Schemes for the Internet Jia Wang

Resilient Microservices with Kubernetes...Internet App Servers Web Servers Databases Microservices The Monolith Object Oriented Programming Version Control Caching Cloud Computing

The RAM based Web Proxy Servers - WSEAS€¦ · Key-Words: - Web Proxy, Web caching, Web Traffic, Storage Design, Performance Evaluation 1 Introduction A web cache server [1] (also

Hierarchical Caching and Prefetching for Continuous Media Servers with Smart Disks

High Performance P2P Web Caching

Shark: Scaling File Servers via Cooperative Caching

Hierarchical Caching and Prefetching for Continuous Media Servers with Smart Disks By:Amandeep Singh Parth Kushwaha

Internet: Authoritive DNS Servers Resolver: gethostbyname() Server: is 1.2.3.4 Client Caching DNS Server

INTEGRATING INTELLIGENT PREDICTIVE CACHING AND … · INTEGRATING INTELLIGENT PREDICTIVE CACHING AND STATIC PREFETCHING IN WEB PROXY SERVERS J. B. PATIL Department of Computer Engineering

Web Caching and Content Delivery. Caching for a Better Web Performance is a major concern in the Web Proxy caching is the most widely used method to improve

Large-Scale Web Caching and Content Deliverychase/cps212-f00/slides/webcache.pdf · Caching for a Better Web Performance is a major concern in the Web Proxy caching is the most widely

Web Caching - University of California, San Diego · Web Caching Page 1 of 24 Web Caching based on: Web Caching , Geoff Huston Web Caching and Zipf-like Distributions: Evidence and