HTTP ETag

The ETag or entity tag is part of HTTP, the protocol for the World Wide Web. It is one of several mechanisms that HTTP provides for Web cache validation, which allows a client to make conditional requests. This mechanism allows caches to be more efficient and saves bandwidth, as a Web server does not need to send a full response if the content has not changed. ETags can also be used for optimistic concurrency control^[1] to help prevent simultaneous updates of a resource from overwriting each other.
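The two uses described above, cache validation and optimistic concurrency control, can be sketched as a pair of server-side checks. This is a minimal illustration rather than a full implementation: the function names (parse_tag_list, conditional_get, conditional_put) are hypothetical, the resource is assumed to exist already, and the header parsing is deliberately simplified.

```python
from typing import List, Optional, Tuple


def parse_tag_list(header_value: str) -> List[str]:
    """Split an If-None-Match / If-Match value into individual entity tags."""
    # Simplification: splitting on commas does not handle opaque tags
    # that themselves contain a comma.
    return [t.strip() for t in header_value.split(",") if t.strip()]


def opaque(tag: str) -> str:
    """Drop an optional W/ prefix, leaving the quoted opaque tag."""
    return tag[2:] if tag.startswith("W/") else tag


def conditional_get(if_none_match: Optional[str],
                    current_etag: str,
                    body: bytes) -> Tuple[int, bytes]:
    """Return (status, payload) for a GET on an existing resource."""
    if if_none_match is not None:
        tags = parse_tag_list(if_none_match)
        # If-None-Match ignores the W/ prefix when comparing tags.
        if "*" in tags or any(opaque(t) == opaque(current_etag) for t in tags):
            return 304, b""  # Not Modified: the client's cached copy is current
    return 200, body


def conditional_put(if_match: Optional[str], current_etag: str) -> int:
    """Return 412 if the client's copy is out of date, else 204."""
    if if_match is not None:
        tags = parse_tag_list(if_match)
        # If-Match requires an exact match between two strong tags.
        is_weak = current_etag.startswith("W/")
        if "*" not in tags and (is_weak or current_etag not in tags):
            return 412  # Precondition Failed: someone else updated the resource
    return 204
```

A client that cached the representation tagged "v1" would send If-None-Match: "v1" and receive 304 with no body while the tag is unchanged. A client that edits that copy and sends If-Match: "v1" after the resource has moved on to "v2" would receive 412 instead of overwriting the newer version.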

An ETag is an opaque identifier assigned by a Web server to a specific version of a resource found at a URL.^[2] If the resource representation at that URL ever changes, a new and different ETag is assigned. Used in this manner, ETags are similar to fingerprints and can quickly be compared to determine whether two representations of a resource are the same.

ETag generation

The ETag header field is optional; unlike some other HTTP/1.1 header fields, a server is not required to send it. The HTTP specification has never defined how ETags are to be generated.

Common methods of ETag generation include using a collision-resistant hash function of the resource's content, a hash of the last modification timestamp, or even just a revision number.
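The three approaches listed above might look like the following sketch. The function names and the choice of SHA-256 are assumptions made for illustration; nothing in HTTP mandates a particular algorithm or tag format beyond the surrounding double quotes.

```python
import hashlib


def etag_from_content(content: bytes) -> str:
    """ETag from a collision-resistant hash of the representation's bytes."""
    return '"' + hashlib.sha256(content).hexdigest() + '"'


def etag_from_mtime(mtime_ns: int) -> str:
    """ETag from a hash of the last modification timestamp (nanoseconds)."""
    digest = hashlib.sha256(str(mtime_ns).encode("ascii")).hexdigest()
    return '"' + digest[:16] + '"'  # truncated here only to keep the tag short


def etag_from_revision(revision: int) -> str:
    """ETag that is simply a revision counter kept by the application."""
    return f'"{revision}"'
```

Hashing the content requires reading the whole representation, whereas the timestamp and revision variants are cheap but depend on the server updating that metadata reliably on every change.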

To avoid serving stale cached data, an ETag-generation method should guarantee, as far as is practical, that each ETag is unique. A method can still be considered usable if it can be shown mathematically that duplicate ETags are acceptably rare, even though they may occasionally occur.

RFC 7232 explicitly states that ETags should be content-coding aware, for example:

ETag: "123-a" – for no Content-Encoding
ETag: "123-b" – for Content-Encoding: gzip
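One way to obtain distinct tags per content coding is to append a suffix to a base tag, as in the sketch below. The suffix scheme and the function name etag_for_encoding are assumptions chosen for illustration; the example above uses "-a" and "-b", and any scheme that yields a different tag for each coding would serve.

```python
from typing import Optional


def etag_for_encoding(base: str, content_encoding: Optional[str]) -> str:
    """Build a quoted ETag that differs for each Content-Encoding."""
    if not content_encoding or content_encoding == "identity":
        return f'"{base}"'
    # Hypothetical scheme: suffix the base tag with the coding name.
    return f'"{base}-{content_encoding}"'
```

With this scheme, the uncompressed representation is tagged "123" and the gzip-encoded one "123-gzip", so a cache never treats one as a valid copy of the other.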

Some earlier checksum functions that were weaker than CRC32 or CRC64 are known to suffer from hash collisions and are therefore poor candidates for ETag generation.

Strong and weak validation

The ETag mechanism supports both strong validation and weak validation. A weak validator is distinguished by an initial "W/" in the ETag identifier:

"123456789"   – A strong ETag validator
W/"123456789" – A weak ETag validator

A strongly validating ETag match indicates that the content of the two resource representations is byte-for-byte identical and that all other entity fields (such as Content-Language) are also unchanged. Strong ETags permit the caching and reassembly of partial responses, as with byte-range requests.

A weakly validating ETag match only indicates that the two representations are semantically equivalent, meaning that for practical purposes they are interchangeable and that cached copies can be used. However, the resource representations are not necessarily byte-for-byte identical, and thus weak ETags are not suitable for byte-range requests. Weak ETags may be useful for cases in which strong ETags are impractical for a Web server to generate, such as with dynamically generated content.
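The difference between the two kinds of match can be expressed as two comparison functions, following the comparison rules in RFC 7232. The function names are illustrative, and the tags are assumed to be already well-formed strings such as "1" or W/"1", including the double quotes.

```python
def is_weak(tag: str) -> bool:
    """True if the entity tag carries the W/ weakness indicator."""
    return tag.startswith("W/")


def opaque(tag: str) -> str:
    """Return the quoted opaque part of the tag, without any W/ prefix."""
    return tag[2:] if is_weak(tag) else tag


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: both tags must be strong and identical."""
    return not is_weak(a) and not is_weak(b) and a == b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: opaque parts must be identical; W/ is ignored."""
    return opaque(a) == opaque(b)
```

Under these rules W/"1" and "1" match weakly but not strongly, which is why a weak tag can validate a cached copy but cannot be used to combine byte ranges.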


References

[1] "Editing the Web – Detecting the Lost Update Problem Using Unreserved Checkout". W3C Note. 10 May 1999.
[2] "ETag – HTTP | MDN". developer.mozilla.org. Retrieved 10 October 2021.

External links

Apache HTTP Server Documentation – FileETag Directive
Editing the Web: Detecting the Lost Update Problem Using Unreserved Checkout, W3C Note, 10 May 1999.
Live demo of zombie cookie using ETags
Old SQUID Development projects – ETag support Archived 23 September 2012 at the Wayback Machine (completed in 2001)
Using ETags to Reduce Bandwidth & Workload with Spring & Hibernate
ETag in HTTP/1.1 specification
Concerning Etags and Datestamps by Lars R. Clausen (2004)
