Tracking is the holy grail of online marketers and businesses that rely on accurate information about their users. The motivation for tracking is hardly ever as suspicious as some privacy advocates would have us believe. Companies use the information to provide better services. Nevertheless, users should be able to decide for themselves whether to allow their online activity to be tracked. That many install tracking blockers and refuse third-party cookies is evidence that a proportion of Internet users dislike the idea of being tracked.
ETags are a method used by some site owners to circumvent user choice where tracking is concerned, and they are an interesting illustration of how tracking works.
To track a user, a site needs to place a piece of information on that user’s system to identify them. It can be as simple as a unique number keyed to a database entry on the site’s servers or those of a third party (hence third-party cookies). When the user first visits a page, the server sends the requested data along with a unique identifying number, often in a cookie, which is just a small piece of text that the browser stores.
When the user visits that site again, or another site within the tracking company’s network, the browser automatically sends the cookie back with its request, and because the cookie contains a unique number, the visit can be recorded in a database under that number. Over time, the tracking service can build a profile from the sites a user visits. The information is generally innocuous and doesn’t identify an individual person, but some are uncomfortable about even that level of tracking and seek to block it.
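The cookie round trip described above can be sketched in a few lines of Python. This is a minimal illustration and not any real tracker's code: the `profiles` dictionary stands in for the tracking database, and the `handle_request` function and the `visitor_id` cookie name are hypothetical names chosen for the example.

```python
import uuid
from http.cookies import SimpleCookie

# Hypothetical in-memory "database": visitor ID -> list of pages seen.
profiles = {}

def handle_request(page, cookie_header=None):
    """Record one page view; return (visitor_id, Set-Cookie value or None)."""
    cookie = SimpleCookie(cookie_header or "")
    if "visitor_id" in cookie:
        # Returning visitor: the browser sent the cookie back.
        visitor_id = cookie["visitor_id"].value
        set_cookie = None
    else:
        # First visit: mint a unique number and ask the browser to store it.
        visitor_id = uuid.uuid4().hex
        out = SimpleCookie()
        out["visitor_id"] = visitor_id
        set_cookie = out["visitor_id"].OutputString()
    profiles.setdefault(visitor_id, []).append(page)
    return visitor_id, set_cookie

# First visit sets the cookie; the second visit presents it again.
vid, set_cookie = handle_request("/news")
handle_request("/sports", cookie_header=set_cookie)
print(profiles[vid])
```

Both page views end up under the same identifier, which is all a profile needs to grow.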
ETags are part of HTTP, the protocol by which information is communicated from server to browser. They exist to help browsers avoid loading the same data repeatedly. If you visit a webpage with an image, it will be downloaded and stored on your computer. If that image never changes, there’s no need to download it again, but the browser needs some way of determining if the image has changed before it downloads it. When the image is sent the first time, the server can send an ETag along with it, which the browser stores. The ETag is simply a short string in an HTTP header field that identifies a particular version of the image; the server chooses the value, and the browser treats it as opaque. If the image stays the same, so does the ETag. When the visitor returns to the page, the browser can send the stored ETag back in an If-None-Match header, asking in effect whether the current ETag on the server still matches. If it does, the server responds with a 304 “Not Modified” status and no image data, and the browser uses the image it downloaded previously. If the ETag is different, the server sends the new version of the image along with its new ETag. Not having to download the same data more than once saves bandwidth and makes sites load faster.
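As a rough sketch of that exchange, the server side might look like the following. It assumes the ETag is derived from a hash of the content, which is one common choice but not required by HTTP; the function names `etag_for` and `serve` are made up for this example, and real servers handle more cases (multiple ETags in one header, weak validators, and so on).

```python
import hashlib

def etag_for(body: bytes) -> str:
    # Assumption: derive the ETag from a content hash. ETag values are quoted strings.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def serve(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a request for `body`."""
    tag = etag_for(body)
    if if_none_match == tag:
        # The browser's cached copy is current: send no data.
        return 304, {"ETag": tag}, b""
    # No cached copy, or the resource changed: send the full body.
    return 200, {"ETag": tag}, body

# First request downloads the image and remembers its ETag.
status, headers, payload = serve(b"image-v1")
# The repeat request presents the ETag and gets an empty 304 response.
status_again, _, payload_again = serve(b"image-v1", headers["ETag"])
print(status, status_again, len(payload_again))
```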
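If you followed the earlier description of cookie tracking, it should be clear how ETag tracking works. Instead of giving every visitor the same ETag for a given version of a resource, the server gives each visitor a different one. That value then works as a unique number identifying the browser, and the browser automatically sends it back to the server whenever it requests that specific resource again, which can be a tiny image. The ETag can be keyed to a database entry containing previous browsing habits in the same way as the information contained in a cookie, and because it lives in the browser’s cache rather than in its cookie store, blocking or deleting cookies does not remove it.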
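To make the parallel with cookies concrete, here is a minimal sketch of a tracking endpoint that misuses ETags this way. Everything in it is hypothetical: the `visits` dictionary stands in for the tracker's database, `serve_pixel` is an invented name, and the `referring_page` argument is an assumption about how the server learns which page embedded the image.

```python
import uuid

# Hypothetical store: ETag value -> pages on which the tracking image was requested.
visits = {}

def serve_pixel(referring_page, if_none_match=None):
    """Return (status, headers) for a request for the tracking image."""
    if if_none_match in visits:
        # Known browser: its cached ETag doubles as its identifier.
        tag = if_none_match
        status = 304
    else:
        # New browser: issue an ETag unique to this visitor, not to the image.
        tag = '"' + uuid.uuid4().hex + '"'
        visits[tag] = []
        status = 200
    visits[tag].append(referring_page)
    return status, {"ETag": tag}

# The same browser is recognised on its second request without any cookie.
_, headers = serve_pixel("/home")
serve_pixel("/pricing", if_none_match=headers["ETag"])
print(visits[headers["ETag"]])
```

Answering with 304 keeps the image in the browser's cache, so the same ETag keeps coming back on later visits until the cache is cleared.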
ETags are an interesting example of the way that technology invented with one thing in mind can be put to work in a wholly unintended way.