API Caching, why your server needs some rest


@lfcipriani, 2013-08-30

RubyConf Brazil 2013

Who? @lfcipriani

Scope of this presentation

• Caching in a Distributed System
• The flows of HTTP Cache and how to control them
• Good and Bad Practices

This talk is for you if you need a friendly way to understand the caching part of RFC 2616.

Source: http://www.slideshare.net/lfcipriani/fearless-http-requests-abuse


Definitions and Motivations

Analogy

Memorizing phone numbers versus checking the phone book every time.

Network Effect

Welcome to the first year of Software Engineering...

...where every request delivers a response without failure and every network is reliable and fast.

Source: First day on Internet Kid (know your meme)

What problems does caching help to solve?

• redundant and unnecessary data traffic
• network bottlenecks
• heavy load (or spikes) on the origin server
• long network latency

Motivations: HTTP Archive

[Figure: HTTP Archive trends, All sites vs. Top 1000]

Source: http://httparchive.org/trends.php?s=All&minlabel=Jan+20+2011&maxlabel=Aug+15+2013

Motivations: HTTP Archive cache lifetime, All Sites vs. Top 100

[Figure: cache lifetime distribution, All Sites vs. Top 100]

Source: http://httparchive.org/interesting.php?a=All&l=Aug%2015%202013&s=Top100

HTTP Caching Protocol


HTTP Caching flows

https://vine.co/v/hOuAXTOetuz
https://vine.co/v/hOuMHbTzp6h
https://vine.co/v/hOu5g9FVDa5
https://vine.co/v/hOuvzinwrt6

bit.ly/vinecaching

The Cache headers zoo

Source: http://www.slideshare.net/lfcipriani/fearless-http-requests-abuse

Cache Coherency

What’s cache coherency in a distributed system?

Since only the origin server knows the state of a resource with certainty, caches and other components must ensure that a cached response is still fresh before returning it to the client.

Because of this complexity, keeping caches coherent in a distributed system is costly.

Strong consistency: better safe than sorry

Maintain coherency by revalidating every request with the origin server.

Weak consistency: living dangerously

The cache has autonomy to use a heuristic to decide whether the cached response is still fresh, without consulting the origin server.

Basically, there are 2 types of weak consistency.

Weak consistency - Invalidation


Weak consistency - Invalidation is bad!

• the approach does not scale
• the server needs to coordinate with an unknown network of caches
• choose 2: immediacy, scalability, reliability
• “There are only two hard things in Computer Science: cache invalidation and naming things” - Phil Karlton
• Two Generals' Problem

http://www.subbu.org/blog/2010/01/cache-invalidation
http://en.wikipedia.org/wiki/Two_Generals'_Problem

Weak consistency - When to do Invalidation

When your network is similar to the one below ;-)


Weak consistency - TTL approach

Taming Cache


Topology considerations


Protocol-specific considerations: controlling cacheability

Cache-Control directive   may I cache locally?   may I cache anywhere?   should I revalidate, even when fresh?
no-store                  no                     no                      n/a
private                   yes                    no                      no
no-cache                  yes                    yes                     yes
public                    yes                    yes                     no

1. "locally" means a cache that serves only one consumer
2. these directives override any configuration of the cache
3. by default, a cache may store responses to unauthenticated requests that use the safe methods GET and HEAD and that return status code 200, 203, 206, 300, 301 or 410
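The table above can be read as a small lookup. The sketch below is only an illustration of that reading, not a full Cache-Control parser: the names `Policy`, `POLICIES`, `directives`, `may_store` and `must_revalidate_when_fresh` are hypothetical, and a header with none of the four directives is assumed to fall back to "cacheable".

```python
from typing import NamedTuple, Optional


class Policy(NamedTuple):
    local: bool                 # may a single-consumer (local) cache store the response?
    anywhere: bool              # may any cache, shared ones included, store it?
    revalidate: Optional[bool]  # revalidate even when fresh? None means n/a


# One entry per row of the table.
POLICIES = {
    "no-store": Policy(False, False, None),
    "private": Policy(True, False, False),
    "no-cache": Policy(True, True, True),
    "public": Policy(True, True, False),
}


def directives(cache_control: str) -> list:
    """Return the table directives present in a Cache-Control header value."""
    names = [part.split("=", 1)[0].strip().lower() for part in cache_control.split(",")]
    return [name for name in names if name in POLICIES]


def may_store(cache_control: str, shared: bool) -> bool:
    """True if every listed directive lets this kind of cache store the response.

    Assumption: with none of the four directives present, the response is
    treated as cacheable (the default case is not modelled here).
    """
    return all(
        POLICIES[name].anywhere if shared else POLICIES[name].local
        for name in directives(cache_control)
    )


def must_revalidate_when_fresh(cache_control: str) -> bool:
    """True if any listed directive demands revalidation of a fresh response."""
    return any(POLICIES[name].revalidate is True for name in directives(cache_control))


print(may_store("private, max-age=60", shared=False))  # True
print(may_store("private, max-age=60", shared=True))   # False
print(must_revalidate_when_fresh("no-cache"))          # True
```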


Protocol-specific considerations: controlling cacheability

Be aware of the Vary header: if it names a request header whose values are highly diverse, the cache keeps a separate copy for each value and its storage can fill up quickly.
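To make the Vary warning concrete, here is a minimal sketch of how a cache might build its storage key. The function `cache_key` and the simulated requests are hypothetical; real caches normalize header values in more elaborate ways.

```python
def cache_key(method: str, url: str, request_headers: dict, vary: str) -> tuple:
    """Build a cache key from the method, the URL and the request headers named in Vary."""
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    selected = tuple((name, lowered.get(name, "")) for name in names)
    return (method, url, selected)


# 100 simulated clients: same Accept-Encoding, but a different User-Agent each.
requests = [
    {"Accept-Encoding": "gzip", "User-Agent": f"client/{i}"}
    for i in range(100)
]

by_encoding = {cache_key("GET", "/users/1", r, "Accept-Encoding") for r in requests}
by_agent = {cache_key("GET", "/users/1", r, "User-Agent") for r in requests}

print(len(by_encoding))  # 1 stored variant
print(len(by_agent))     # 100 stored variants
```

With `Vary: Accept-Encoding` all simulated clients share one stored copy, while `Vary: User-Agent` stores one copy per client, which is the storage growth the slide warns about.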


Protocol-specific considerations: controlling revalidation

Revalidation is done with conditional requests.

If-Modified-Since != Last-Modified → 200
If-Modified-Since == Last-Modified → 304
If-None-Match != ETag → 200
If-None-Match == ETag → 304

You can even decide how revalidation is done.
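The conditional-request rules above could be sketched on the server side as follows. This is an illustrative simplification, not a complete implementation: `revalidate` is a hypothetical helper, weak ETags (`W/` prefix) are not handled, and the date check uses "not modified since" (`<=`) rather than strict equality, which is how servers commonly evaluate it.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def revalidate(request_headers: dict, etag: str, last_modified: datetime) -> int:
    """Return 304 if the client's validator still matches the resource, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # The header may carry several comma-separated entity tags, or "*".
        candidates = {tag.strip() for tag in if_none_match.split(",")}
        return 304 if etag in candidates or "*" in candidates else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        return 304 if last_modified <= since else 200

    # No validator sent: this is an ordinary request, send the full response.
    return 200


modified = datetime(2013, 8, 30, 12, 0, 0, tzinfo=timezone.utc)
print(revalidate({"If-None-Match": '"v1"'}, '"v1"', modified))  # 304
print(revalidate({"If-None-Match": '"v0"'}, '"v1"', modified))  # 200
print(revalidate({"If-Modified-Since": "Fri, 30 Aug 2013 12:00:00 GMT"}, '"v1"', modified))  # 304
```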


Content specific considerations

Careful with cookies

Be aware of how privacy policy influences what’s cacheable


Content life cycle considerations

TL;DR

Know the rates of change of your resources and establish a time to live for them.

Expires: [date]
Cache-Control: max-age=[seconds]
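As a rough sketch of how a cache could turn these two headers into a time to live: `max-age` takes precedence over `Expires` when both are present, and `Expires` is measured against the response's `Date` header. The function name `freshness_lifetime` is hypothetical, and headers are assumed to arrive as a plain dict with canonical capitalization.

```python
import re
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers: dict):
    """Return the explicit TTL in seconds, or None if the response declares none."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"(?:^|[\s,])max-age=(\d+)", cache_control, re.IGNORECASE)
    if match:
        # max-age wins over Expires when both are present.
        return int(match.group(1))

    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        # An Expires value in the past means "already stale".
        return max(0, int((expires - date).total_seconds()))

    return None


print(freshness_lifetime({"Cache-Control": "public, max-age=300"}))  # 300
print(freshness_lifetime({
    "Date": "Fri, 30 Aug 2013 12:00:00 GMT",
    "Expires": "Fri, 30 Aug 2013 13:00:00 GMT",
}))  # 3600
```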


• TTLs that are too short (seconds) or too long (days) smell bad
• a TTL can vary over time; don’t treat it as a constant value
• don’t be afraid to get sophisticated, if needed:
  • L-Factor heuristic: (date - last modified) * factor
  • Prediction Models: http://www.slideshare.net/jseidman/real-world-machine-learning-at-orbitz-strata-2011
• Control your cache strategy!
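The L-Factor heuristic from the list above can be sketched in a few lines. The function name `heuristic_ttl`, the default factor of 0.1 and the one-day cap are assumptions chosen for illustration; the slide only gives the formula `(date - last modified) * factor`.

```python
from datetime import datetime, timedelta, timezone


def heuristic_ttl(
    date: datetime,
    last_modified: datetime,
    factor: float = 0.1,               # assumed default, tune per resource
    cap: timedelta = timedelta(days=1),  # assumed upper bound, not from the slide
) -> timedelta:
    """Estimate a TTL as a fraction of the time since the resource last changed."""
    age_since_change = date - last_modified
    if age_since_change <= timedelta(0):
        # Last-Modified in the future (clock skew): do not guess a TTL.
        return timedelta(0)
    return min(age_since_change * factor, cap)


now = datetime(2013, 8, 30, 12, 0, 0, tzinfo=timezone.utc)
print(heuristic_ttl(now, now - timedelta(hours=10), factor=0.5))  # 5:00:00
print(heuristic_ttl(now, now - timedelta(days=100), factor=0.5))  # 1 day, 0:00:00
```

The idea is that a resource that has not changed for a long time is unlikely to change soon, so it earns a longer TTL; the cap keeps very old resources from being cached indefinitely.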


General considerations

Deciding to have NO cache is part of the strategy.

Your cache strategy might not be honored by an intermediary cache. No hard feelings about it: this is more common than you think.

Measuring efficiency


Measuring Cache efficiency

Hit Rate = cache hits / total requests

This will depend on:
• how big your cache is
• how similar the interests of the cache users are
• the data rate of change
• how caches are configured

Byte Hit Rate = bytes transferred from cache hits / bytes transferred by all requests
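Both metrics can be tracked with a pair of counters each. The sketch below assumes a hypothetical `CacheStats` class whose `record` method is called once per request; how hits and response sizes are observed depends on your cache.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    hits: int = 0
    requests: int = 0
    hit_bytes: int = 0
    total_bytes: int = 0

    def record(self, hit: bool, size: int) -> None:
        """Register one request and the number of bytes sent in its response."""
        self.requests += 1
        self.total_bytes += size
        if hit:
            self.hits += 1
            self.hit_bytes += size

    @property
    def hit_rate(self) -> float:
        """Cache hits divided by total requests (0.0 before any request)."""
        return self.hits / self.requests if self.requests else 0.0

    @property
    def byte_hit_rate(self) -> float:
        """Bytes served from cache divided by all bytes served."""
        return self.hit_bytes / self.total_bytes if self.total_bytes else 0.0


stats = CacheStats()
stats.record(hit=True, size=100)
stats.record(hit=False, size=900)  # one large miss
stats.record(hit=True, size=100)
stats.record(hit=True, size=100)

print(stats.hit_rate)       # 0.75
print(stats.byte_hit_rate)  # 0.25
```

The example shows why the two numbers are worth tracking separately: three out of four requests are hits, yet only a quarter of the bytes come from the cache because the single miss is large.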


• the same metrics can be applied to revalidations
• measure per resource
• measure continuously and monitor, so you can improve the strategy

Validate your strategy on redbot.org

Final considerations


• It is important to know the topology of your application and the constraints of distributed systems.
• Think through and build a good strategy; don’t rely on default heuristics.
• Measure, monitor and improve. Strategies are dynamic, and change is part of the process.
• All this can be done incrementally; focus on the relevant resources.
• Be careful not to turn the cache into overhead.

References

Web Protocols and Practice: HTTP/1.1, Networking Protocols, Caching, and Traffic Measurement (Balachander Krishnamurthy and Jennifer Rexford)
HTTP: The Definitive Guide (David Gourley, Brian Totty, Marjorie Sayer and Anshu Aggarwal)

http://www.w3.org/Protocols/rfc2616/rfc2616.html (HTTP RFC)
http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13 (Caching in HTTP)
http://stevesouders.com/
http://talleye.com/
https://dev.twitter.com/
bit.ly/vinecaching

Thank you!
