Strong Consistency Models and HTTP

In a wonderfully written article, Kyle Kingsbury explores different models of strong consistency. I thought it would be interesting to consider how some of these concepts apply in the context of HTTP, given the prevalence of web APIs in modern architectures. HTTP is an amazingly expressive protocol, especially in its caching semantics, yet in my experience many protocol features are not widely used or understood.

In this article, we’ll explore how HTTP clients and servers can cooperate to achieve various consistency models. Remember that clients are part of a distributed system. In fact, we’ll see that many of these models cannot be achieved unless both clients and servers fulfill certain responsibilities that the HTTP specification itself does not necessarily require.

Now, we have an important caveat here: in order for a system to actually satisfy any of the consistency models we discuss, we have to assume that all of the participants implement the HTTP protocol correctly–at least to the point of being conditionally compliant with the specification. In practice, this is a pretty tall order; many very solid software components make practical tradeoffs all the time in the name of efficiency or interoperability with other noncompliant (but popular or necessary) components.


Linearizability

Linearizability is a consistency constraint requiring that each operation (whether read or write) appear to take effect atomically at some point between its invocation and completion. Of course, if the origin server does not itself satisfy linearizability, it goes without saying that we’re out of luck. Thus, for the rest of this section, let’s assume the origin does satisfy linearizability.

As an important side note, many modern web services that are served out of multiple datacenters do not provide global linearizability; indeed, many modern “AP” datastores don’t provide linearizability across datacenters either, usually by design to promote higher availability and/or lower latency. Any datastore using asynchronous inter-datacenter replication isn’t linearizable.

Now, even with a linearizable origin service, caching presents a problem for linearizability in an HTTP context right off the bat. Suppose a client performs a GET that populates a cache entry, then another client completes a write to the same location (via a PUT or POST to the same URL, but without passing through the same cache), and then the first client issues another GET to that resource. If that response is served out of the cache, we’ve just violated linearizability: the second read began after the write completed, yet it returned the older value.
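To make the failure concrete, here is a minimal Python sketch of that sequence. The Origin and NaiveCache classes, the /account URL, and the balance values are all made up for illustration; a real cache would also apply freshness and validation rules, which this toy deliberately ignores.

```python
class Origin:
    """A trivially linearizable origin: one value per URL, updated in place."""

    def __init__(self):
        self.resources = {}

    def get(self, url):
        return self.resources.get(url)

    def put(self, url, value):
        self.resources[url] = value


class NaiveCache:
    """A cache that stores GET responses and reuses them without validation."""

    def __init__(self, origin):
        self.origin = origin
        self.entries = {}

    def get(self, url):
        if url not in self.entries:
            self.entries[url] = self.origin.get(url)
        return self.entries[url]


origin = Origin()
origin.put("/account", "balance=10")
cache = NaiveCache(origin)

first_read = cache.get("/account")     # client A populates the cache entry
origin.put("/account", "balance=20")   # client B writes, bypassing the cache
second_read = cache.get("/account")    # client A reads again through the cache

# The second read started after the write completed but still sees the old value.
print(first_read, second_read, origin.get("/account"))
```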

Therefore, for a linearizable system, we have to ensure that a request cannot be served by a cache. There are actually two strategies here: the first is to ensure any intervening caches do not create a cache entry in the first place, and the second is to ensure that if a cache entry exists, it is not used to generate a response.

Preventing cache entries from being created

If we want to avoid creating cache entries in the first place, we can look to Section 3 of RFC 7234 (for HTTP/1.1), which lists several conditions that must be met before a cache entry can be created.

Use uncacheable request methods

First, the cache must understand the request method, and it must be defined as being cacheable. Therefore, to avoid caching for linearizability purposes, we could stick to non-cacheable request methods. We all know GET requests are cacheable, of course, and perhaps some of us know that HEAD requests are cacheable too. Indeed, the vast majority of cache implementations only support caching for these methods. Did you know that POST requests are potentially cacheable, though? They are! Your XMLRPC or SOAP-based web service isn’t enough to save you here….

The standard HTTP request methods which are not cacheable are:

  • PUT
  • CONNECT (although this is used to establish tunnels rather than for semantic application requests)
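As a rough sketch of this first check, the snippet below classifies the request methods defined by RFC 7231, which marks only GET, HEAD, and POST as cacheable. The constant and function names are mine, and a real cache would additionally need to actually implement caching for the method in question.

```python
# Request methods defined by RFC 7231 (HTTP/1.1 semantics).
STANDARD_METHODS = ["GET", "HEAD", "POST", "PUT", "DELETE",
                    "CONNECT", "OPTIONS", "TRACE"]

# RFC 7231 section 4.2.3 defines only these three as cacheable.
CACHEABLE_BY_DEFINITION = {"GET", "HEAD", "POST"}


def is_cacheable_method(method):
    """Return True if the method is defined as cacheable.

    HTTP method names are case-sensitive, so no case folding is done here.
    """
    return method in CACHEABLE_BY_DEFINITION


# Standard methods whose responses a compliant cache will not store.
uncacheable = [m for m in STANDARD_METHODS if not is_cacheable_method(m)]
print(uncacheable)
```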

But there are also a whole bunch of extension request methods captured in IANA’s HTTP Method Registry, and their cacheability varies. In my research for this article, I discovered that some of these methods are explicitly defined to be uncacheable, usually with language that says “responses to this method MUST NOT be cached”. Explicitly uncacheable extension methods include:

Some methods are implied to be uncacheable because they require that responses to the method MUST include Cache-Control: no-cache, without explicitly defining the cacheability of the actual method. Caches that do not understand/implement a method are not allowed to cache responses to those methods, so it isn’t clear what is gained here over defining the method to be non-cacheable. Since I don’t have the context that existed while these methods were being designed, we’ll assume there was a good reason. Methods that fall into this category include:

Some methods do not have explicit requirements around the cacheability of their responses. In the absence of a positive definition of being cacheable, we can assume they aren’t, and that a compliant cache won’t cache them, either because it understands them and knows they aren’t explicitly cacheable, or because it doesn’t understand them and hence won’t cache them. Many of these are defined to be unsafe and/or non-idempotent, which is corroborating evidence, although some simply describe unsafe semantics (they describe a modification of the targeted resource). These methods include:

Finally, some methods are not explicitly defined to be cacheable or non-cacheable, but are defined to be safe or describe safe semantics (that is, they are “reads” of some sort). Technically, caches should not be caching them, but it might be easy to understand a cache implementer deciding to cache them, especially if the origin set cache-related headers like Expires or Cache-Control: public. These methods include:

  • SEARCH. Actually, the spec says successful responses to a SEARCH request SHOULD NOT be cached, which means a conditionally compliant cache is still allowed to cache them.


Linearizable Strategy 1a: Restrict the API to use the following HTTP methods: COPY, DELETE, LINK, LOCK, MKCOL, MKREDIRECTREF, MOVE, OPTIONS, PROPPATCH, PUT, TRACE, UNLINK, UNLOCK, UPDATEREDIRECTREF. These methods are explicitly defined to be uncacheable.

Linearizable Strategy 1b (“Do you feel lucky, punk?” edition): Restrict your API to using the methods from Linearizable Strategy 1a plus: ACL, BASELINE-CONTROL, BIND, CHECKIN, CHECKOUT, LABEL, MERGE, MKACTIVITY, MKCALENDAR, MKWORKSPACE, ORDERPATCH, REBIND, UNBIND, UNCHECKOUT, UPDATE, VERSION-CONTROL. These methods are not explicitly cacheable and have unsafe semantics and hence will likely not be cached by a reasonably correct cache implementation.

Now, most application APIs I can imagine will want to do some kind of reads, writes, or generalized processing, so the lack of GET, HEAD, or POST will be problematic in practice. Therefore, while these two strategies are sufficient for linearizability in conjunction with a linearizable origin, they aren’t really practical.

Use nonstandard response codes

If a cache does not understand the response status code, it is not allowed to store the response. This leads, again, to a strategy of questionable practicality, which is to use completely custom status codes for responses. IANA also maintains a registry of response status codes, so you’d have to make sure you were not using something on this list.

However, there are some subtleties here: just because a status code is not currently on the registry does not mean someone won’t register it later. In addition, the HTTP protocol spec allows implementations to treat unknown status codes equivalently to the x00 status of that class. For example, an implementation receiving a 275 status code could otherwise treat it as a 200 response; again, while it should technically not be caching it since it doesn’t understand the 275, an error in implementation might lead to accidental caching if it is treated as a 200.
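The x00 fallback can be sketched in a few lines. The function name and the set of “understood” codes below are hypothetical; the point is only that the fallback code and the “did we understand it?” flag are separate pieces of information, and a cache that conflates them could end up storing a response it should not.

```python
def effective_status(code, understood_codes):
    """Return (code_to_act_on, understood) for a received status code.

    An unrecognized code is treated like the x00 code of its class, but the
    second element records that the recipient did not actually understand it,
    which is what should stop a cache from storing the response.
    """
    if code in understood_codes:
        return code, True
    return (code // 100) * 100, False


# A recipient that only knows a handful of codes receives a custom 275.
print(effective_status(275, {200, 304, 404}))
```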

There are similar problems even with using undefined response classes, like 6XX codes (a future version of HTTP might define those).

In any event, due to the lack of solid options here (not to mention the dubious engineering benefit of using explicitly non-standard status codes), we’ll conclude there are no good linearizability strategies to be had by pursuing this angle.

Use the no-store cache directive

Aha! Here’s a very simple one: if Cache-Control: no-store appears on either the request or the response, a cache may not create a cache entry out of the response.

Linearizability Strategy 2a: Add Cache-Control: no-store to all requests from clients.

Linearizability Strategy 2b: Add Cache-Control: no-store to all responses from the origin.
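From a cache’s point of view, the no-store check might look something like the sketch below. The helper names are my own, and the directive parsing is simplified: it assumes no commas appear inside quoted directive arguments.

```python
def directive_names(cache_control):
    """Return the lowercased directive names in a Cache-Control value.

    Simplification: assumes no commas inside quoted directive arguments.
    """
    names = set()
    for part in cache_control.split(","):
        name = part.split("=", 1)[0].strip().lower()
        if name:
            names.add(name)
    return names


def no_store_forbids_storing(request_cache_control="", response_cache_control=""):
    """True if no-store on either the request or the response forbids storing."""
    return ("no-store" in directive_names(request_cache_control)
            or "no-store" in directive_names(response_cache_control))


# Strategy 2a (client side) and Strategy 2b (origin side) each suffice alone.
print(no_store_forbids_storing(request_cache_control="no-store"))
print(no_store_forbids_storing(response_cache_control="no-store, max-age=60"))
```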

Use the private directive

If all of the caches in the architecture are “shared” caches (multiple clients’ requests pass through them), then adding Cache-Control: private to all of the origin responses will prevent cache entries from being created. Given that the most common form of non-shared cache is one attached directly to the client (e.g. browser caches), and that many HTTP client libraries support caching, this strategy is likely only useful where the overall architecture is under a single administrative domain with sufficient governance to ensure the absence of non-shared caches. That may not be entirely common, but it isn’t entirely far-fetched either.

Linearizability Strategy 3: Ensure the absence of nonshared caches and include Cache-Control: private on all origin responses.
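A minimal sketch of why this strategy depends on the kind of cache: the same response header stops a shared cache but not a private one. The function names are hypothetical, and only the unqualified form of the directive (private with no field-name argument) is considered.

```python
def has_unqualified_private(cache_control):
    """True if the header value contains `private` without an argument."""
    for part in cache_control.split(","):
        name, separator, _argument = part.strip().partition("=")
        if name.strip().lower() == "private" and not separator:
            return True
    return False


def private_forbids_storing(response_cache_control, cache_is_shared):
    """Only shared caches are barred from storing a private response."""
    return cache_is_shared and has_unqualified_private(response_cache_control)


print(private_forbids_storing("private", cache_is_shared=True))
print(private_forbids_storing("private", cache_is_shared=False))
```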

Authorization headers

For authenticated requests that add the Authorization header, shared caches cannot cache the responses unless explicitly allowed by the presence of a must-revalidate, public, or s-maxage directive on the response Cache-Control header. This says nothing of non-shared caches, however, so as with the last strategy, we’d have to be able to ensure there aren’t any non-shared caches in the architecture.

Linearizability Strategy 4: Ensure the absence of nonshared caches, have clients supply Authorization headers on all requests, and ensure the origin avoids using must-revalidate, public, or s-maxage.
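Here is one way the Authorization rule could be expressed for a shared cache. This is a sketch with made-up function names, and it checks only this one rule; a True result means the rule does not stand in the way, not that the response is otherwise storable.

```python
# Response directives that re-enable storing by shared caches.
OVERRIDING_DIRECTIVES = {"must-revalidate", "public", "s-maxage"}


def directive_names(cache_control):
    """Return the lowercased directive names in a Cache-Control value."""
    names = set()
    for part in cache_control.split(","):
        name = part.split("=", 1)[0].strip().lower()
        if name:
            names.add(name)
    return names


def authorization_rule_permits_storing(request_headers, response_cache_control):
    """Apply only the Authorization rule, as seen by a shared cache."""
    has_authorization = any(k.lower() == "authorization" for k in request_headers)
    if not has_authorization:
        return True
    return bool(directive_names(response_cache_control) & OVERRIDING_DIRECTIVES)


authenticated = {"Authorization": "Bearer example-token"}  # placeholder credential
print(authorization_rule_permits_storing(authenticated, "max-age=60"))
print(authorization_rule_permits_storing(authenticated, "public, max-age=60"))
```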

This is starting to get convoluted, and certainly we’ve already encountered simpler solutions (like the use of no-store), but we’ll include it for completeness.

Avoid explicit cacheability

Finally, there are several things that explicitly signify cacheability on a response; if we avoid using any of these, then no cache entries will get created, either. Some of these are:

  • the presence of an Expires header
  • the presence of max-age, s-maxage, or public in the Cache-Control header
  • the presence of a cache control extension that allows caching
  • the response having a status code that is cacheable by default

These latter two are tricky in the context of an extensible protocol, since we can’t know a priori whether a cache control extension allows caching or not, or whether a future (to-be-registered) status code might be cacheable by default or not. Therefore, we’ll have to restrict ourselves to known cache control extensions as well as known status codes. IANA’s cache control directive registry lists two extensions, stale-while-revalidate and stale-if-error, which do not explicitly allow the creation of cache entries themselves but do allow existing cache entries to be used in certain instances. I believe they can technically be used as part of this strategy, therefore, although it’s perhaps safer not to include them.

I did consult IANA’s status code registry to see if there were any which were defined as cacheable by default; according to the HTTP spec, additional status codes are not cacheable by default, so if they have been registered and their RFC does not indicate cacheability, then I have included them in the “safe” list below:

Linearizability Strategy 5: Do not include Expires headers on responses; restrict Cache-Control headers to include only the max-stale, min-fresh, no-cache, no-store, no-transform, only-if-cached, must-revalidate, private, proxy-revalidate, stale-while-revalidate, or stale-if-error directives; and ensure only the 100, 101, 102, 201, 202, 205, 207, 208, 302, 303, 304, 305, 307, 400, 401, 402, 403, 406, 407, 408, 409, 411, 412, 413, 415, 416, 417, 422, 423, 424, 426, 428, 429, 431, 500, 502, 503, 504, 505, 506, 507, 508, 510, or 511 status codes are used for responses.
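A strategy with this many moving parts is easier to enforce with a check than by eye. The sketch below is a hypothetical origin-side lint, with the directive and status code lists copied from the strategy text above; the function and constant names are my own.

```python
ALLOWED_DIRECTIVES = {
    "max-stale", "min-fresh", "no-cache", "no-store", "no-transform",
    "only-if-cached", "must-revalidate", "private", "proxy-revalidate",
    "stale-while-revalidate", "stale-if-error",
}

ALLOWED_STATUS_CODES = {
    100, 101, 102, 201, 202, 205, 207, 208, 302, 303, 304, 305, 307,
    400, 401, 402, 403, 406, 407, 408, 409, 411, 412, 413, 415, 416, 417,
    422, 423, 424, 426, 428, 429, 431,
    500, 502, 503, 504, 505, 506, 507, 508, 510, 511,
}


def strategy_violations(status, headers):
    """Return a list of reasons a response breaks the strategy (empty if none)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    problems = []
    if "expires" in lowered:
        problems.append("Expires header present")
    for part in lowered.get("cache-control", "").split(","):
        name = part.split("=", 1)[0].strip().lower()
        if name and name not in ALLOWED_DIRECTIVES:
            problems.append("directive not allowed: " + name)
    if status not in ALLOWED_STATUS_CODES:
        problems.append("status code not allowed: " + str(status))
    return problems


print(strategy_violations(201, {"Cache-Control": "no-cache"}))
print(strategy_violations(200, {"Cache-Control": "max-age=60"}))
```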

This is very convoluted, and in particular, doesn’t give you any reasonable status codes to use for successful “reads”.

Preventing cache entries from being served

The other high-level linearizability strategy is to prevent existing cache entries from being served without some form of coordination with the origin. This is governed by Section 4 of RFC 7234, which lists several conditions that must be satisfied. Therefore, we can choose strategies that cause one or more of the conditions not to be satisfied.

The first condition is that the URI for the request and the stored response must match; this isn’t really very useful as if we were to make this untrue, we would have a very buggy cache implementation indeed!

The second condition requires that the request method associated with the stored response allows it to be used for the given request; this would seem to be another way of stating that the request method is cacheable, which we have covered above. The RFC doesn’t have a further reference at this point, though, and it may imply something beyond this. For example, the GET method’s definition says responses to it may be used to satisfy subsequent GET and HEAD requests, but presumably not other methods. This possibly suggests a strategy that keeps track of the sequence of methods used on each resource and interleaves particular methods with one another so that no stored response can be reused. That strikes me as very difficult not only to research and describe but also to implement, so while I’ll identify this as a possible “missing” strategy, we’ll pursue some of the other requirements instead.

Ensure variants don’t match

The third condition applies to stored responses that carry a Vary header: a cache cannot serve such a variant unless the request headers named by Vary have the same values on the incoming request as on the request that created the variant. This suggests a more unusual strategy, which is to set the Vary header on every response to include a particular request header for which clients generate a globally-unique value with each request.

For example, if the origin always set Vary: Request-GUID and a client always set a Request-GUID header containing an actual globally unique identifier (GUID), then cache entries would never be used to satisfy those requests.

Linearizable Strategy 5: Set the Vary header on every origin response and ensure clients generate globally-unique values for the listed header(s) with each request.
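The variant check and the client’s side of the bargain might be sketched as follows. Request-GUID is the illustrative header name from the example above, not a registered header, and the function names are my own. Header dictionaries are assumed to use lowercase keys.

```python
import uuid


def variant_matches(vary, stored_request_headers, incoming_request_headers):
    """True if every header named in Vary has the same value in both requests.

    Both header dicts are assumed to use lowercase keys.
    """
    if vary.strip() == "*":
        return False  # "Vary: *" never matches a later request
    for name in vary.split(","):
        key = name.strip().lower()
        if not key:
            continue
        if stored_request_headers.get(key) != incoming_request_headers.get(key):
            return False
    return True


def fresh_request_headers():
    """Client side: attach a new globally unique value to every request."""
    return {"request-guid": str(uuid.uuid4())}


first = fresh_request_headers()    # request that populated the cache entry
second = fresh_request_headers()   # a later request for the same URL
print(variant_matches("Request-GUID", first, second))
```

Because each request carries a different GUID, the stored variant never matches and the cache has to go back to the origin.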

Use no-cache

There are multiple ways to do this, but what is interesting is that the no-cache directive allows a cache to serve a response from a cache entry as long as that entry is successfully validated with the origin. According to the way revalidation works, it is possible for a cache to receive a validation response from another intermediate cache that might have a more up-to-date cache entry. However, Cache-Control is defined as an end-to-end header and so must both be forwarded on requests by intermediaries and be stored as part of cached responses.

Therefore, a Cache-Control: no-cache on a request must eventually be forwarded all the way to an origin on a revalidation chain, leading to two possibilities:

Linearizable Strategy 6a: Set Cache-Control: no-cache on all requests.

Linearizable Strategy 6b: Set Pragma: no-cache on all requests. It’s worth noting that this will work with HTTP/1.0 participants; the Cache-Control header was introduced in HTTP/1.1.

Similarly, if the origin sets Cache-Control: no-cache on all its responses, then this header will be present on all cache entries as well as on any responses generated from them.

Linearizable Strategy 6c: Set Cache-Control: no-cache on all origin responses.
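On the client side, Strategies 6a and 6b amount to adding two request headers. Here is a small sketch using Python’s standard urllib; the helper name and the example.com URL are placeholders, and the request is only constructed, never sent.

```python
import urllib.request


def build_uncached_request(url):
    """Build a GET request carrying both no-cache request headers.

    Cache-Control covers HTTP/1.1 caches; Pragma covers HTTP/1.0 ones.
    """
    return urllib.request.Request(
        url,
        headers={"Cache-Control": "no-cache", "Pragma": "no-cache"},
    )


request = build_uncached_request("http://example.com/resource")  # placeholder URL

# Normalize header names to lowercase for display; no network I/O happens here.
sent_headers = {name.lower(): value for name, value in request.header_items()}
print(sent_headers)
```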

Requiring validation

The last requirement is that a cached response, to be eligible to satisfy a request, must be one of the following:

  • fresh;
  • allowed to be served stale; or
  • successfully validated.

Therefore, the negation here requires that we fail to satisfy all of these possibilities.

In order for a stored response never to be fresh, it needs to have a freshness lifetime of zero seconds, and this needs to be set explicitly; if it is not set, then a cache may calculate a heuristic freshness lifetime. There are several ways to accomplish this: with Cache-Control: max-age=0, with Cache-Control: s-maxage=0 (if all the caches are shared caches), or by setting an Expires header to something prior to the Date header. This trick with a past-dated Expires header will also work with HTTP/1.0 implementations.
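As a sketch of how a cache might arrive at that zero, the function below computes an explicit freshness lifetime using the precedence s-maxage (shared caches only), then max-age, then Expires minus Date. The names are my own, and the parsing is simplified: it assumes well-formed numeric arguments and treats a missing or unparseable date as already expired.

```python
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        name, _, argument = part.strip().partition("=")
        if name:
            directives[name.strip().lower()] = argument.strip().strip('"')
    return directives


def explicit_freshness_lifetime(headers, shared_cache=False):
    """Return the explicit freshness lifetime in seconds, or None.

    None means nothing explicit was set, so a cache may fall back to a
    heuristic lifetime, which is exactly what we want to rule out.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    directives = parse_cache_control(lowered.get("cache-control", ""))
    if shared_cache and "s-maxage" in directives:
        return int(directives["s-maxage"])
    if "max-age" in directives:
        return int(directives["max-age"])
    if "expires" in lowered:
        try:
            expires = parsedate_to_datetime(lowered["expires"])
            date = parsedate_to_datetime(lowered["date"])
        except (KeyError, TypeError, ValueError):
            return 0  # simplification: bad or missing dates count as expired
        return max(0, int((expires - date).total_seconds()))
    return None


past_dated = {
    "Date": "Tue, 15 Nov 1994 08:12:31 GMT",
    "Expires": "Tue, 15 Nov 1994 08:12:30 GMT",
}
print(explicit_freshness_lifetime({"Cache-Control": "max-age=0"}))
print(explicit_freshness_lifetime(past_dated))
print(explicit_freshness_lifetime({}))
```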

Generally speaking, caches are not supposed to generate stale responses unless they are disconnected from the origin or if they are explicitly allowed by something like the max-stale request directive. Setting Cache-Control: max-stale=0 on all requests could help, as could Cache-Control: min-fresh=1 if the origin is always setting max-age=0 as above. However, a disconnected cache could still generate stale responses even with these request directives.

In order to require validation, the origin can set Cache-Control: must-revalidate; a disconnected cache will generate a 504 error response rather than a stale cached response in this case. If all of the caches are shared (proxy) caches, then Cache-Control: proxy-revalidate has the same meaning for them, and Cache-Control: s-maxage=0 implies it as well.
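The difference these directives make shows up only when a cache holds a stale entry and cannot reach the origin. The following sketch models that decision; the function name and the string return values are invented for illustration, and "serve-stale" stands for what a cache is permitted to do rather than what it must do.

```python
def stale_entry_action(response_cache_control, origin_reachable, shared_cache=False):
    """Decide what a cache does with a stale entry.

    Returns "revalidate", "504", or "serve-stale" (labels are illustrative).
    """
    names = set()
    for part in response_cache_control.split(","):
        name = part.split("=", 1)[0].strip().lower()
        if name:
            names.add(name)

    if origin_reachable:
        return "revalidate"

    strict = "must-revalidate" in names
    if shared_cache and names & {"proxy-revalidate", "s-maxage"}:
        strict = True  # s-maxage implies proxy-revalidate for shared caches
    return "504" if strict else "serve-stale"


print(stale_entry_action("max-age=0", origin_reachable=False))
print(stale_entry_action("max-age=0, must-revalidate", origin_reachable=False))
```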

Finally, we have successful validation. Really, if we can force a cache to revalidate its cache entries with the origin, this is just a more efficient way to proxy an origin response through, and that is probably good enough for preserving the linearizability guarantees of the origin. If we really want to prevent it, though, we can do so by not supplying any cache validators on responses generated by the origin–i.e. we never set Last-Modified or ETag headers on the response.

We can mix and match these to produce a family of possible strategies:

Linearizable Strategy 7a: Set Cache-Control: max-age=0, must-revalidate on all origin responses.

Linearizable Strategy 7b: Set Cache-Control: max-age=0, must-revalidate and avoid setting Last-Modified or ETag headers on all origin responses.

Linearizable Strategy 7c: Set Cache-Control: must-revalidate and set Expires to a value prior to the Date header on all origin responses.

Linearizable Strategy 7d: Set Cache-Control: must-revalidate and set Expires to a value prior to the Date header, while avoiding setting Last-Modified or ETag headers on all origin responses.

Linearizable Strategy 7e: Ensure all caches in the architecture are shared caches, and set Cache-Control: max-age=0, proxy-revalidate on all origin responses.

Linearizable Strategy 7f: Ensure all caches in the architecture are shared caches, set Cache-Control: max-age=0, proxy-revalidate, and avoid setting Last-Modified or ETag headers on all origin responses.

Linearizable Strategy 7g: Ensure all caches in the architecture are shared caches, and set Cache-Control: s-maxage=0 on all origin responses.

Linearizable Strategy 7h: Ensure all caches in the architecture are shared caches, set Cache-Control: s-maxage=0, and avoid setting Last-Modified or ETag headers on all origin responses.

Linearizable Strategy 7i: Ensure all caches in the architecture are shared caches, set Cache-Control: proxy-revalidate, and set Expires to a value prior to Date on all origin responses.

Linearizable Strategy 7j: Ensure all caches in the architecture are shared caches, set Cache-Control: proxy-revalidate, set Expires to a value prior to Date, and avoid setting Last-Modified or ETag headers on all origin responses.

Summary: Linearizability

It seems like the most concise solution that covers a general architecture with a mix of HTTP/1.0 and HTTP/1.1 participants is:

Set Pragma: no-cache on all requests, and set Cache-Control: no-store on all origin responses.

If we can’t rely on the clients to be consistent about their part, then we can update this to an origin-only strategy:

Set Cache-Control: no-store and set Expires to a value prior to the Date header on all origin responses.
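For an origin written in Python, the origin-only strategy could be applied in one place with a small WSGI middleware. This is a sketch under assumptions: the middleware and demo application names are hypothetical, and Expires is pinned to the Unix epoch, which is prior to any Date header a server will plausibly emit.

```python
from email.utils import formatdate

# An HTTP-date far in the past: "Thu, 01 Jan 1970 00:00:00 GMT".
PAST_EXPIRES = formatdate(0, usegmt=True)


def no_store_middleware(app):
    """Wrap a WSGI app so every response carries the origin-only headers."""
    replaced = {"cache-control", "expires"}

    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Drop whatever caching headers the app set, then add ours.
            kept = [(k, v) for k, v in headers if k.lower() not in replaced]
            kept.append(("Cache-Control", "no-store"))
            kept.append(("Expires", PAST_EXPIRES))
            return start_response(status, kept, exc_info)

        return app(environ, patched_start_response)

    return wrapped


def demo_app(environ, start_response):
    """A stand-in origin that (wrongly, for our purposes) allows caching."""
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Cache-Control", "max-age=60")])
    return [b"hello"]


captured = {}


def recording_start_response(status, headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = dict(headers)


body = no_store_middleware(demo_app)({}, recording_start_response)
print(captured["headers"])
```

Putting the headers in middleware rather than in each handler means a forgotten handler cannot quietly reintroduce cacheable responses.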