Double-keyed caching: Browser cache partitioning (addyosmani.com)

69 points by feross 3 days ago | 27 comments

ssttoo 2 days ago

A random fact from before the “before” in the article: the cache for all resource types used to be the same. We (ab)used this to do preloading before it was available in the web platform, e.g. downloading a JavaScript file without executing it:

  // Request the script via an Image: it won't decode as an image (so
  // onerror fires), but the response lands in the shared cache for a
  // later real <script src="script.js"> to reuse.
  var js = new Image();
  js.src = 'script.js';
  js.onerror = function () { /* script is now cached */ };
Then some browsers started keeping a separate image cache, and the trick stopped working.
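
For comparison, the platform now supports this natively; a minimal sketch of the equivalent (same placeholder 'script.js' URL):

  // Ask the browser to fetch and cache the script without executing it.
  var link = document.createElement('link');
  link.rel = 'preload';
  link.as = 'script';
  link.href = 'script.js';
  document.head.appendChild(link);
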
zerof1l 2 days ago

As a user, you can use an extension like LocalCDN. Instead of having your browser download common libraries and fonts from third-party CDNs, the extension intercepts the request and serves a local copy. Even better in terms of privacy and security.
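
The rough mechanism looks something like this (a sketch only, assuming a Manifest V2 WebExtension with webRequest/webRequestBlocking and host permissions; the jQuery URL and bundled path are just examples, and the bundled file would also need to be listed as a web-accessible resource):

  // Intercept a request for a well-known CDN asset and answer it with a
  // copy bundled inside the extension, so nothing leaves the browser.
  browser.webRequest.onBeforeRequest.addListener(
    function (details) {
      return { redirectUrl: browser.runtime.getURL('resources/jquery.min.js') };
    },
    { urls: ['*://code.jquery.com/jquery-3.*.min.js'] },
    ['blocking']
  );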

  • heinrich5991 2 days ago

    This also dramatically increases your fingerprintability, I think. Fingerprinters probably aren't using it yet though, because it's such a small target audience. But if it were to be implemented into a major browser, I'm sure they'd start using it.

    • Retr0id 2 days ago

      To elaborate, JS running on a page can probe your cache (via timing) to discern which URLs are already in it or not. If the cache is shared cross-origin, then it can be used to leak information cross-origin.
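
      A stripped-down sketch of the idea (the URL and the 10 ms threshold are made up; real probes have to be much more careful about noise):

        // Fetch a cross-origin resource opaquely and time it: a near-instant
        // response suggests it was already in the cache, i.e. the user's
        // browser has loaded it before.
        async function probeCache(url) {
          const t0 = performance.now();
          await fetch(url, { mode: 'no-cors', cache: 'force-cache' });
          return (performance.now() - t0) < 10; // true => likely a cache hit
        }

      With a partitioned cache this only ever sees the current site's own partition, which is exactly what the change is for.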

  • jayd16 2 days ago

    Why does this do anything? The attack described is a timing attack, so a local proxy doesn't really help.

stuaxo 2 days ago

This is annoying. Can't we pretend to download the resources we already have, instead of just not downloading them?

  • RunningDroid a day ago

    It'd be nice, but doing it without giving away that that's what's happening would probably get complicated.

bilekas 2 days ago

The author recommends Domain Consolidation, however this seems like bad advice in certain cases due to browsers' per-domain connection limits, at least in HTTP/1.x. Or am I mistaken?

  • csswizardry 2 days ago

    You are correct that this is an optimisation only available over H2+, but the H/1.x case isn't worth optimising for: if one cared about web performance that much, they wouldn't be running H/1.x in the first place.

    Most connections on the web nowadays are over H2+: https://almanac.httparchive.org/en/2024/http#http-version-ad...

    • giantrobot 2 days ago

      > if one cared about web performance that much, they wouldn't be running H/1.x in the first place.

      You may not have intended it this way but this statement very much reads as "just use Chrome". There's lots of microbrowsers in the world generating link previews, users stuck with proxies, web spiders, and people stuck with old browsers that don't necessarily have H2 connectivity.

      That doesn't mean over-optimizing for HTTP/1.x, but decent performance in that case shouldn't be ignored. If you can make HTTP/1.x performant, then H2 connections will be as well by default.

      Far too many pages download gobs of unnecessary resources just because nobody bothered to tree-shake and minify them. Huge populations of web users at any given moment are stuck on 2G- and 3G-equivalent connections. Depending on where I am in town, my 5G phone can barely manage to load the typical news website because of poor signal quality.

  • shanemhansen 2 days ago

    In my opinion the benefits of domain consolidation outweigh the costs.

    Using a bunch of domains like a.static.com, b.static.com, etc. was really only helpful when the per-domain connection limit was something like 2. Depending on the browser, those limits have been higher for a while.

    For http/2 it's less helpful.

    But honestly, there's not really one theoretically right answer. Multiple domains add the fixed overhead of extra DNS lookups, TCP connects, and TLS handshakes, but they offer parallelism that doesn't suffer from head-of-line blocking.

    You can multiplex a bunch of requests/responses in parallel over a single HTTP/2 connection... until you drop a packet and remember that HTTP/2 is still TCP-based.

    UDP-based transports like HTTP/3 and QUIC don't have this problem.

fergie 2 days ago

This seems to be quite a drastic change that could only have been warranted by a bigger problem than the one outlined in the article. It's also strange the way Osmani preemptively shuts down criticism of the change.

What is the actual problem that's being solved here? Is cache-sniffing being actively used for fingerprinting or something?

What's going on here?

  • simonw 2 days ago

    The privacy problems were really bad.

    Want to know if your user has previously visited a specific website? Time how long it takes to load that site's CSS file: if it's instant, then you know where they have been.

    You can tell if they have an account on that site by timing the load of an asset that's usually only shown to logged-in users.

    All of the browser manufacturers decided to make this change (most of them over 5 years ago now) for privacy reasons, even though it made their products worse in that they would be slower for their users.

  • Retr0id 2 days ago

    We should also be asking if the benefits of a non-partitioned cache are truly there in the first place. I think the main claim is that if (for example) the same JS library is on a CDN and used across multiple sites, your browser only has to cache it once. But how often does this really happen in practice?

    • sroussey 2 days ago

      In the days of jQuery… all the time.

      In the days of webpack, not so much.

    • shanemhansen 2 days ago

      Well, I know it's not directly an answer to your question, but the article mentions that in practice a partitioned cache increases bandwidth usage by 4%, which at the scale of billions of web browsers is actually pretty bad.

  • kibwen 2 days ago

    The privacy implications are discussed in the article.

    • fergie 2 days ago

      You don't seem to have properly read the previous comment or the article.

      The question (not addressed in the article) is: is there in fact a gigantic, widespread problem of using cache-sniffing to fingerprint users?

      We have always known that this was possible, yet continued to include cross-site caching anyway because of the enormous benefits. Presumably something fairly major happened recently for this functionality to be removed; if so, what?