Author Topic: HOWTO: Advice to improve object caching on the internet  (Read 870 times)

Offline TheOracle

« on: March 05, 2008, 03:20:41 PM »
Here are a few tricks I've picked up while deploying NS configurations that help ensure content is cacheable by client browsers and intermediate caches.

1.  If you use cookie persistence, make sure the timeout is set to 0.
If there is a Set-Cookie header on an object's response, caches generally won't cache the object.  If you set a non-zero timeout value, there will be a Set-Cookie on all objects, static or otherwise, resulting in lower cache hit rates.
2.  Make sure your servers use the same ETag value.
When servers generate an ETag value, they generally don't do so in a way that is consistent across several servers.  As a result, if a conditional request carrying the ETag value lands on a different server, that server will think the content has changed and deliver a new copy.  In general, though, if an ETag is present then a Last-Modified header is too, and as long as the file timestamps are kept in sync on all servers, Last-Modified can be used for conditional (If-Modified-Since) requests instead.  So if you are load balancing servers that generate unique ETag values, you can use the following to strip the ETag whenever a Last-Modified header is present:

add rewrite action remove_etag delete_http_header Etag
add rewrite policy remove_etag "http.RES.HEADER(\"Last-Modified\").EXISTS&&http.RES.HEADER(\"Etag\").EXISTS" remove_etag
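
To make the ETag problem concrete, here is a rough Python sketch (not NS config).  It assumes a hypothetical server that builds its ETag from inode, size and mtime, as Apache has historically done by default; make_etag, respond and the sample values are made up for illustration.

```python
def make_etag(inode, size, mtime):
    # Hypothetical validator built from inode, size and mtime.
    # The inode differs per server even when the file is identical.
    return '"%x-%x-%x"' % (inode, size, mtime)


def respond(server_etag, server_mtime, if_none_match=None, if_modified_since=None):
    """Return 304 if the client's validator still matches, else 200."""
    if if_none_match is not None:
        # An ETag validator takes precedence when the client sends one.
        return 304 if if_none_match == server_etag else 200
    if if_modified_since is not None:
        return 304 if server_mtime <= if_modified_since else 200
    return 200


# The same file (same size, same timestamp) on two load-balanced servers.
MTIME = 1204730441
etag_a = make_etag(inode=0x1A2B, size=5120, mtime=MTIME)
etag_b = make_etag(inode=0x9F00, size=5120, mtime=MTIME)

# The client fetched from server A and now revalidates against server B.
with_etag = respond(etag_b, MTIME, if_none_match=etag_a)
without_etag = respond(etag_b, MTIME, if_modified_since=MTIME)

print(with_etag, without_etag)  # prints: 200 304
```

With the ETag in play the second server resends the whole object (200); with the ETag stripped, the synced timestamp is enough for a 304.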

3.  Don't have the server send oddball values for headers.  Some examples (taken from live websites):

Vary: Host
Etag: "us, us"
Cache-Control: max-age=0  # unless you want it NOT to be cached
Vary: *

Including a Vary header other than "Vary: User-Agent" is bad news:

Some hyperlinks may fail when the server sends a VARY header response

IE (even through version 7) will ONLY cache an object if the Vary header names User-Agent and nothing else.  Nada.  Zip.  As such, even something like:

"Vary: User-Agent, Accept-Encoding"

will cause the object to not be cached on the client with IE.  Some proxies only accept "Vary: Accept-Encoding", however, so you may have to choose which of the two you want.  As a best practice, besides using
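
If you want to spot-check responses against rules 1 and 3, something like the following Python sketch could help.  It is only an illustration: cache_warnings is a made-up helper, the header pairs are sample data, and the Vary check simply encodes the IE behavior described above (adjust it if you decide to favor proxies that want "Vary: Accept-Encoding").

```python
def cache_warnings(headers):
    """Return a list of warnings for response headers given as (name, value) pairs."""
    warnings = []
    for name, value in headers:
        key = name.lower()
        val = value.strip()
        if key == "set-cookie":
            # Rule 1: a Set-Cookie on the response hurts cacheability.
            warnings.append("Set-Cookie present")
        elif key == "vary" and val.lower() != "user-agent":
            # Rule 3: anything other than exactly "User-Agent" is flagged.
            warnings.append("Vary is not exactly User-Agent: " + val)
        elif key == "cache-control":
            directives = [d.strip().lower() for d in val.split(",")]
            if "max-age=0" in directives:
                warnings.append("max-age=0 (object will not be reused without a check)")
    return warnings


sample = [
    ("Set-Cookie", "persist=abc123"),
    ("Vary", "User-Agent, Accept-Encoding"),
    ("Cache-Control", "max-age=0"),
]
for warning in cache_warnings(sample):
    print(warning)
```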

4.  If possible, modify the content management system so that all static objects are versioned.  By this, I mean a URL should look something like "/static/01-01-2008-v1/object.jpg".  If the object is changed, its version changes, and every place that references the object must be updated to the new URL as well.  This is very useful for the next rule...
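
One way a CMS could generate such URLs is sketched below in Python.  Note this is a variation on the date-based label above, not the same scheme: it derives the version from a hash of the content so the URL changes whenever the bytes change.  The versioned_url helper and the "/static" prefix are assumptions for illustration.

```python
import hashlib


def versioned_url(name, content, prefix="/static"):
    """Build a URL whose version segment is derived from the object's bytes."""
    # A short content digest stands in for a hand-maintained version label.
    digest = hashlib.sha1(content).hexdigest()[:8]
    return "%s/%s/%s" % (prefix, digest, name)


old_url = versioned_url("object.jpg", b"original image bytes")
new_url = versioned_url("object.jpg", b"edited image bytes")

# Different content yields a different URL, so caches treat it as a new object.
print(old_url)
print(new_url)
```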

5.  Ensure a header such as "Cache-Control: max-age=<x>" is served with every static object.  The value should be picked based on how likely the object is to change.  For example, if you have followed rule 4, then you can use an action such as the following:

add rewrite action insert_cc insert_http_header "Cache-Control" "max-age=31536000, public"

This ensures that intermediate caches won't even bother checking back while the object is fresh (one year here), and explicitly states that the object can be cached unless some other property of the response prevents caching.  Browsers will honor this as well, and they won't send If-Modified-Since requests for the object either.

A more conservative value:

add rewrite action insert_cc insert_http_header "Cache-Control" "max-age=3600, public"

A word of caution: if you do expect to change content at a cached URL, never use a high max-age value, as clients and proxies will NEVER EVEN ASK whether the object has changed once they know they can hold onto it for a set amount of time.
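
To illustrate what "never even ask" means, here is a simplified Python sketch of the freshness check a cache performs.  The function names are made up, and real caches also weigh other headers (Expires, Age and so on), so treat it as a rough model only.

```python
def max_age(cache_control):
    """Extract the max-age value in seconds, or None if it is absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def needs_request(cache_control, age_seconds):
    """True if the cache must go back to the server before reusing the object."""
    limit = max_age(cache_control)
    # With no max-age we assume the cache has to ask; otherwise the object
    # is reused silently until its age reaches the limit.
    return limit is None or age_seconds >= limit


DAY = 24 * 3600
print(365 * DAY)                                               # prints: 31536000
print(needs_request("max-age=31536000, public", 30 * DAY))     # prints: False
print(needs_request("max-age=3600, public", 2 * 3600))         # prints: True
```

With the one-year value, a copy that is 30 days old is still served without any request to the origin, which is exactly why it should only be paired with versioned URLs.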

Some followup reading on cache behaviors (I have no relation to this guy or his blog):

mnot’s Web log: The State of Browser Caching
mnot’s Web log: HTTP entries

Offline evildani

For some strange cosmic reason I missed this thread... OH MY GOD!!!! It's good!!