Open Bug 1829240 Opened 1 year ago Updated 21 days ago

Difference between how Firefox and Chrome cache resources intercepted with a service worker can affect gmail icon and image loading

Categories

(Core :: Graphics: ImageLib, defect, P1)

People

(Reporter: ksenia, Assigned: tnikkel)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [necko-triaged][necko-priority-review])

Attachments

(2 files, 1 obsolete file)

Attached file gmail.zip (obsolete)

We've investigated the Gmail Offline feature to determine what is missing in Firefox for it to be enabled. The functionality appears to work as expected (while spoofing the UA as Chrome); however, most of the icons fail to load once the browser goes offline.

This is primarily due to a difference in how Chrome and Firefox cache resources (png images in this case) that were intercepted with a service worker.

On the first load, requests for images are intercepted and fetched by a service worker in both browsers. On all subsequent page reloads in Chrome the memory cache takes precedence over the service worker - the icons are served directly from the memory cache, bypassing the service worker. In contrast, Firefox prioritizes the service worker over the memory cache.

I've attached an archive with sample code. To reproduce:

  1. Start a server with python3 -m http.server 8000 in the unzipped folder.
  2. Open gmail.html (http://localhost:8000/gmail.html).
  3. Open "Network" tab in the devtools panel and make sure that "Disable Cache" checkbox is unchecked.
  4. Reload the page and observe the Transferred column.

In Firefox it has "service worker" in the column, while Chrome has "memory cache" (on all subsequent loads). Also if I set a breakpoint in the SW code, it never reaches it in Chrome since the request is returned from the memory cache.
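For illustration, the lookup order described above can be sketched as a toy Python model. This is not browser code: the `Browser` class, the `memory_cache_first` flag and the `/icon.png` URL are hypothetical names, and the model only assumes what this report states (Chrome consults the memory cache before the service worker on reload, Firefox dispatches to the service worker every time).

```python
from dataclasses import dataclass, field


@dataclass
class Browser:
    """Toy model of where an image request is answered."""
    memory_cache_first: bool  # True models Chrome, False models Firefox (per this report)
    memory_cache: set = field(default_factory=set)
    sw_fetches: list = field(default_factory=list)

    def load(self, url: str) -> str:
        """Return the label devtools would show in the Transferred column."""
        if self.memory_cache_first and url in self.memory_cache:
            return "memory cache"      # the service worker is bypassed entirely
        self.sw_fetches.append(url)    # a fetch event reaches the service worker
        self.memory_cache.add(url)
        return "service worker"


chrome = Browser(memory_cache_first=True)
firefox = Browser(memory_cache_first=False)

# First load: both browsers go through the service worker.
first_loads = [chrome.load("/icon.png"), firefox.load("/icon.png")]

# Reload: only the Chrome model answers from the memory cache.
reloads = {"chrome": chrome.load("/icon.png"), "firefox": firefox.load("/icon.png")}
print(reloads)
```

In this model a breakpoint in the service worker would be hit once for the Chrome case and twice for the Firefox case, which matches the observation above.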

(P.S It is worth mentioning that gmail is using CacheStorage api to check whether the request has been cached, but not actually putting it in caches for unknown reason, so this code has been commented out as it doesn't affect the behaviour described here)

This looks like a difference in caching strategy between the browsers, and Gmail offline is relying on images served from the memory cache, so I wonder if we want to consider the possibility of aligning the behaviour with Chrome?

@valentin - do you know anything about cache-vs-serviceworker priority?
@asuth - do you know if this behavior is speced?

Severity: -- → S2
Flags: needinfo?(valentin.gosu)
Flags: needinfo?(bugmail)
Priority: -- → P2
Whiteboard: [necko-triaged][necko-priority-review]

> On all subsequent page reloads in Chrome the memory cache takes precedence over the service worker - the icons are served directly from the memory cache, bypassing the service worker. In contrast, Firefox prioritizes the service worker over the memory cache.

Not sure if the memory cache in Chrome is similar to the image cache in Firefox. I assume so.
In any case, I'm seeing pretty much the same behaviour in both Firefox and Chrome - once you turn off the network connection, the icons still load - the main difference is that in Firefox they don't show up in devtools precisely because they're being loaded from the image cache.

Are you seeing something different once you turn off the network?

Flags: needinfo?(valentin.gosu) → needinfo?(kberezina)

I think this suggests something regressed with the ImageCacheKey's explicit ServiceWorker awareness. I saw this bug the other day and it looks like we may also be failing some related WPT tests but I didn't have time to dig in too deeply.

There is an explicit check for the ServiceWorkerManager in ImageCacheKey::GetSpecialCaseDocumentToken that should be fine (we will create a ServiceWorkerManager in a content process), but which should be removed in favor of just checking whether there is a controller.

Flags: needinfo?(bugmail)

> In any case, I'm seeing pretty much the same behaviour in both Firefox and Chrome - once you turn off the network connection, the icons still load - the main difference is that in Firefox they don't show up in devtools precisely because they're being loaded from the image cache.
>
> Are you seeing something different once you turn off the network?

Yeah, with network off I'm seeing the same behaviour with my testcase, but not on the actual site.

On gmail it goes through the service worker once offline for all image requests in Firefox (as in the testcase with network on). I've tried to recreate this locally, but no luck so far :( Will update the testcase if I manage to match the behaviour I'm seeing on the site.

Flags: needinfo?(kberezina)
Attached file gmail.zip

I was able to replicate the behaviour in a new testcase. The main difference from the previous one is that the request for the html page is saved in the CacheStorage and then, if found, retrieved from CacheStorage without calling the network again.
I've also deployed the testcase to https://ksy36.github.io/ to hopefully make it easier to run without starting the server locally.

Attachment #9329600 - Attachment is obsolete: true

So, if I understand the issue correctly, the problem is that Chrome at least attempts to do a network load, whereas Firefox just loads from the image cache always and never calls the service worker.
Let me know if that's not accurate.

Component: Networking: Cache → Graphics: ImageLib

(In reply to Valentin Gosu [:valentin] (he/him) from comment #6)

> So, if I understand the issue correctly, the problem is that Chrome at least attempts to do a network load, whereas Firefox just loads from the image cache always and never calls the service worker.
> Let me know if that's not accurate.

Sorry, given my first testcase it might have been confusing, but the second one on https://ksy36.github.io/ should demonstrate the exact behaviour from gmail offline. It's actually the opposite, Chrome uses memory cache and Firefox always calls the service worker (as long as the request for html page was also intercepted/saved in the cache storage with the service worker).

There are several things that prevent us from using an image from the image cache on reload of this document.

When the script global object is cleared from a document (which I think correlates roughly to when it stops being an active document displayed in a tab), we also clear from the image cache every image that the document loaded, if the document was controlled by a service worker:

https://searchfox.org/mozilla-central/rev/4e6970cd336f1b642c0be6c9b697b4db5f7b6aeb/dom/base/Document.cpp#7724

So the images are cleared from the image cache when the document before the reload goes away.

Fixing that is not enough because, as comment 3 notes, the controlled document is part of the image cache key, and it needs to be equal to get a hit in the image cache, as it is checked in operator==

https://searchfox.org/mozilla-central/rev/4e6970cd336f1b642c0be6c9b697b4db5f7b6aeb/image/ImageCacheKey.cpp#63

and it's in the hash

https://searchfox.org/mozilla-central/rev/4e6970cd336f1b642c0be6c9b697b4db5f7b6aeb/image/ImageCacheKey.cpp#90

and when you reload a document you get a new document.

Fixing that is still not enough because the expiration time on the image cache entry gets set to the past here

https://searchfox.org/mozilla-central/rev/4e6970cd336f1b642c0be6c9b697b4db5f7b6aeb/image/imgRequest.cpp#606

because it doesn't get any cache expiration time back from the network channel. If I load the images by themselves the expiration is one year in the future, so I guess that info doesn't make it through when they are loaded via a service worker?

What I don't understand is why loading from the image cache is necessary for this to work, shouldn't the images still display when they hit the cache in the service worker?

ni for the last question in comment 8.

Flags: needinfo?(kberezina)

ni for next steps here

Flags: needinfo?(valentin.gosu)
Flags: needinfo?(bugmail)

(In reply to Timothy Nikkel (:tnikkel) from comment #8)

> What I don't understand is why loading from the image cache is necessary for this to work, shouldn't the images still display when they hit the cache in the service worker?

I think this may be answered by comment 0:

(In reply to Ksenia Berezina [:ksenia] from comment #0)

> (P.S It is worth mentioning that gmail is using CacheStorage api to check whether the request has been cached, but not actually putting it in caches for unknown reason, so this code has been commented out as it doesn't affect the behaviour described here)

So from the above it sounds like gmail is not caching the images, if I understand correctly. That means that gmail would experience the same problem on chrome if the memory cache was purged or no longer available, presumably that would happen if the browser was restarted.

Aside: Thanks for doing this investigation!

(In reply to Ksenia Berezina [:ksenia] from comment #0)

> This looks like a difference in caching strategy between the browsers, and Gmail offline is relying on images served from the memory cache, so I wonder if we want to consider the possibility of aligning the behaviour with Chrome?

I think we can probably relax the very restrictive choices we're currently using for ServiceWorkers. This will likely result in a bunch of potential performance benefits for when ServiceWorkers are being used, so this would be desirable.

Spec-wise: https://github.com/w3c/ServiceWorker/issues/962 is a somewhat abandoned conversation last touched in 2017 dealing with the cache-ability of ServiceWorker returned resources, but as noted there and in the more recent (2022!) and very relevant https://github.com/whatwg/fetch/issues/1400 (Define the "memory cache"), the HTML spec defines The list of available images which seems to be the image cache. The normative text is fairly short so I'll quote it here:

Each Document object must have a list of available images. Each image in this list is identified by a tuple consisting of an absolute URL, a CORS settings attribute mode, and, if the mode is not No CORS, an origin. Each image furthermore has an ignore higher-layer caching flag. User agents may copy entries from one Document object's list of available images to another at any time (e.g. when the Document is created, user agents can add to it all the images that are loaded in other Documents), but must not change the keys of entries copied in this way when doing so, and must unset the ignore higher-layer caching flag for the copied entry. User agents may also remove images from such lists at any time (e.g. to save memory). User agents must remove entries in the list of available images as appropriate given higher-layer caching semantics for the resource (e.g. the HTTP Cache-Control response header) when the ignore higher-layer caching flag is unset.

Step 6.2 of Updating the image data populates the key via:

Let key be a tuple consisting of urlString, the img element's crossorigin attribute's mode, and, if that mode is not No CORS, the node document's origin.
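As a rough sketch of the key construction quoted above, the tuple can be written out in Python. The function name `available_image_key` and the string labels used for the CORS modes are hypothetical; only the shape of the tuple comes from the quoted spec text.

```python
from typing import Tuple

# Hypothetical labels for the crossorigin attribute's modes.
NO_CORS = "no-cors"
ANONYMOUS = "anonymous"


def available_image_key(url_string: str, cors_mode: str, document_origin: str) -> Tuple[str, ...]:
    """Build the key: URL, CORS mode, and the origin unless the mode is No CORS."""
    if cors_mode == NO_CORS:
        return (url_string, cors_mode)
    return (url_string, cors_mode, document_origin)


url = "https://example.com/icon.png"
key_no_cors = available_image_key(url, NO_CORS, "https://example.com")
key_anonymous = available_image_key(url, ANONYMOUS, "https://example.com")
print(key_no_cors)
print(key_anonymous)
```

Nothing in this tuple identifies a particular Document instance or a service worker, so two documents of the same origin would compute equal keys for the same image.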

I see no indications that the Service Worker spec monkey-patches anything about the above definition, so spec-wise there's no reason for us to bind the image cache keys directly to the controlled document instance. That said, it seems like bug 1202085 which added most of the current implementation approach was concerned about ServiceWorkers potentially polluting the image cache for pages that are not controlled by the ServiceWorker.

It seems like the original approach proposed in bug 1202085 of "add the URL of the controller of the document to ImageCacheKey" was probably a better approach than what ended up being used. A more modern version of that would be to use the uint64_t ServiceWorkerDescriptor::Id() available via aDocument->GetController()->Id() (after making sure it isSome()) as the extra portion of the key. This would ensure that:

  • A ServiceWorker can't pollute the image cache for uncontrolled pages.
  • Any request from a given installed version of a ServiceWorker will have the same key across document reloads (and even in different processes, etc.)
  • For the sanity of developers, if the ServiceWorker (SW) gets updated because there was a new version on the server, that new version will have a different id and so a different cache key. So developers won't get frustrated by seeing cached values as they iterate on their SW.
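The three properties listed above can be illustrated with a toy Python model that compares a document-bound key with a controller-id key. This is a sketch and not the ImageCacheKey implementation: `Document`, `DocumentBoundKey`, `ControllerIdKey` and the id values 7 and 8 are hypothetical, and `controller_id` merely stands in for the value that ServiceWorkerDescriptor::Id() would provide.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional

_next_token = itertools.count(1)


@dataclass
class Document:
    """Each instance gets a fresh token, so a reload yields a different document."""
    controller_id: Optional[int] = None  # None models an uncontrolled document
    token: int = field(default_factory=lambda: next(_next_token))


@dataclass(frozen=True)
class DocumentBoundKey:
    """Models the current behaviour: the controlled document is part of the key."""
    url: str
    document_token: Optional[int]


@dataclass(frozen=True)
class ControllerIdKey:
    """Models the proposal: the controlling ServiceWorker's id is part of the key."""
    url: str
    controller_id: Optional[int]


def document_bound_key(url: str, doc: Document) -> DocumentBoundKey:
    token = doc.token if doc.controller_id is not None else None
    return DocumentBoundKey(url, token)


def controller_id_key(url: str, doc: Document) -> ControllerIdKey:
    return ControllerIdKey(url, doc.controller_id)


url = "https://example.com/icon.png"
before_reload = Document(controller_id=7)
after_reload = Document(controller_id=7)     # same installed ServiceWorker version
uncontrolled = Document()                    # no controller
after_sw_update = Document(controller_id=8)  # a new ServiceWorker version took over

print(document_bound_key(url, before_reload) == document_bound_key(url, after_reload))
print(controller_id_key(url, before_reload) == controller_id_key(url, after_reload))
```

With the document-bound key a reload can never hit the old entry, while the controller-id key survives the reload and still keeps uncontrolled documents and updated ServiceWorker versions separate.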

Implementation-wise, it seems like that might be fairly straightforward? Note that we should remove the parts that do ServiceWorkerManager::GetInstance() to see if ServiceWorkers are enabled. It's sufficient and much cleaner to just see if there's a controller or not.

Flags: needinfo?(bugmail)

(In reply to Timothy Nikkel (:tnikkel) from comment #8)

> Fixing that is still not enough because the expiration time on the image cache entry gets set to the past here
>
> https://searchfox.org/mozilla-central/rev/4e6970cd336f1b642c0be6c9b697b4db5f7b6aeb/image/imgRequest.cpp#606

I should expand on this. It's a little more complicated. We do still get a hit on the image cache and we return that image, but we issue a network request to check if the image has been updated on the server

https://searchfox.org/mozilla-central/rev/4a5c56f4aca291802ce27320cd9a752dd5dd955e/image/imgLoader.cpp#2087

So in devtools this does not get marked as hitting the image cache. I believe we won't draw the returned image until the network request resolves one way or the other, but I'd have to double check that.
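A minimal Python sketch of the behaviour described here, under stated assumptions: `CacheEntry`, `entry_expiry` and `lookup` are hypothetical names, and the "one second in the past" value is an arbitrary stand-in, since the exact value set in imgRequest.cpp is not modelled.

```python
from dataclasses import dataclass
from typing import Optional

ONE_YEAR = 365 * 24 * 3600  # seconds


@dataclass
class CacheEntry:
    expiry_time: float  # seconds since the epoch


def entry_expiry(now: float, channel_expiration: Optional[float]) -> float:
    """With no expiration info from the channel, the entry is treated as already expired."""
    if channel_expiration is None:
        return now - 1  # arbitrary time in the past (assumption, see lead-in)
    return channel_expiration


def lookup(entry: Optional[CacheEntry], now: float) -> str:
    if entry is None:
        return "miss"
    if entry.expiry_time <= now:
        # The entry is found, but a network request is issued to check for updates.
        return "hit, needs validation"
    return "hit"


now = 1_000_000.0
via_service_worker = CacheEntry(entry_expiry(now, None))
loaded_directly = CacheEntry(entry_expiry(now, now + ONE_YEAR))
print(lookup(via_service_worker, now))
print(lookup(loaded_directly, now))
```

The first case corresponds to images loaded through the service worker, where no expiration time comes back from the channel; the second to loading the image by itself, where the expiration was observed to be one year in the future.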

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #11)

> It seems like the original approach proposed in bug 1202085 of "add the URL of the controller of the document to ImageCacheKey" was probably a better approach than what ended up being used. A more modern version of that would be to use the uint64_t ServiceWorkerDescriptor::Id() available via aDocument->GetController()->Id() (after making sure it isSome()) as the extra portion of the key. This would ensure that:

That seems like it would fix the problem, except what do we do about the expiry time being in the past? Do we just not have an expiry time for image cache entries that come from a service worker?

Attached image gmail-icons.png

(In reply to Timothy Nikkel (:tnikkel) from comment #8)

> What I don't understand is why loading from the image cache is necessary for this to work, shouldn't the images still display when they hit the cache in the service worker?

Yeah, that's correct - the images should load from the cache storage after hitting the service worker, but from my tests with gmail they weren't in the cache storage, so all requests for icons failed in Firefox. The images did load in Chrome, but only because they were stored in memory cache. As Andrew mentioned, purging the memory cache in Chrome results in the same problem as in Firefox (could be done by enabling "Disable cache" checkbox in the Network tab).

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #11)

> (In reply to Ksenia Berezina [:ksenia] from comment #0)
>
> > (P.S It is worth mentioning that gmail is using CacheStorage api to check whether the request has been cached, but not actually putting it in caches for unknown reason, so this code has been commented out as it doesn't affect the behaviour described here)
>
> So from the above it sounds like gmail is not caching the images, if I understand correctly. That means that gmail would experience the same problem on chrome if the memory cache was purged or no longer available, presumably that would happen if the browser was restarted.

To follow up on this, I did some more investigation today to figure out why image requests are not saved in the cache storage, and was able to trigger saving by disabling Offline mode and enabling it again in gmail settings. That made gmail store most of the requests for offline resources in cache storage under the offline-asset-cache cache name. This improved things, and a lot of the icons appear now while offline, with a few still missing (they don't seem to be storing the missing ones in the cache storage, unclear why).

To be fair, I'm not quite sure what caused the offline-asset-cache cache storage to become empty in my previous tests - they don't clear it anywhere in the code and I'm not able to trigger the behaviour again, apart from deleting it manually.

I've attached a screenshot of the main page showing the difference between browsers. While most of the icons are showing in Firefox, there are some that are missing, for example the logo icon - since it's not stored in the cache storage, the request fails in Firefox, but it's showing in Chrome because it's cached in memory.
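The offline outcome described above can be summarised in a small Python sketch. It is a toy model and not Gmail's service worker: the function name `load_icon_offline` and the icon URLs are hypothetical, and it assumes, as reported in this bug, that the service worker answers only from cache storage while offline and that only Chrome consults the memory cache first.

```python
def load_icon_offline(url: str, cache_storage: set, memory_cache: set,
                      memory_cache_first: bool) -> str:
    """Return where an icon comes from while the browser is offline."""
    if memory_cache_first and url in memory_cache:
        return "memory cache"   # Chrome model: the service worker is never asked
    if url in cache_storage:
        return "cache storage"  # the service worker finds a stored response
    return "failed"             # offline, so the network fallback cannot succeed


# Hypothetical state: both icons were displayed earlier, but only one was stored.
cache_storage = {"/icons/inbox.png"}
memory_cache = {"/icons/inbox.png", "/icons/logo.png"}

chrome_logo = load_icon_offline("/icons/logo.png", cache_storage, memory_cache, True)
firefox_logo = load_icon_offline("/icons/logo.png", cache_storage, memory_cache, False)
firefox_inbox = load_icon_offline("/icons/inbox.png", cache_storage, memory_cache, False)
print(chrome_logo, firefox_logo, firefox_inbox)
```

Emptying `memory_cache` in the Chrome case makes the logo fail there too, which corresponds to the "Disable cache" observation earlier in this bug.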

Thanks for investigating this! I'm wondering if it might be a good idea to contact Gmail first to see if we can get their support on this before we make changes, since the offline mode is enabled based on Chrome UA?

The difference described in this bug is surfacing when the cache storage is empty, or certain resources are not in the cache. Once requests are stored in the cache storage, and the cache is not cleared along the way, the icons load as expected in the offline mode, with the exception of images that don't get saved in cache (but that might be an easier fix on their side).

(In reply to Timothy Nikkel (:tnikkel) from comment #13)

> (In reply to Andrew Sutherland [:asuth] (he/him) from comment #11)
>
> That seems like it would fix the problem, except what do we do about the expiry time being in the past? Do we just not have an expiry time for image cache entries that come from a service worker?

Ah, yeah, we have a gap right now in terms of (being verbose for my own future reference here):

This gap in the ServiceWorkers implementation is twofold in terms of:

  1. Failure to propagate for pass-through fetch
  2. Failure to mint for resources that were matched in the DOM Cache API storage.

Note that there is a related bug 1336199 about supporting alternate data stream storage in the DOM Cache API and Eden had patches which also dealt with some nsICacheInfoChannel aspects, in addition to adding a minimal (and delightfully named) CachedCacheInfoChannel.

I think it likely makes sense for us to:

  • Have this bug address the image cache key issue.
  • I'll file or reference existing bugs corresponding to InterceptedHttpChannel providing at least cache expiration time information that the image cache can then use. Setting a needinfo on myself because I have to step out for something.
Flags: needinfo?(bugmail)

(In reply to Ksenia Berezina [:ksenia] from comment #15)

> Thanks for investigating this! I'm wondering if it might be a good idea to contact Gmail first to see if we can get their support on this before we make changes, since the offline mode is enabled based on Chrome UA?

You've identified an important gap in our current Service Worker implementation and its interaction with image caching that I do think we want to address, so I think we care about this and want to address the image caching problems no matter what.

It's always a reasonable question as to whether we should implement something like this where there is potentially more spec work to be done before or after performing the spec work, but I don't think there is actually any blocking spec work here, just gaps in our (ServiceWorker) implementation.

I think the needinfo's I set have been answered, so clearing those.

Flags: needinfo?(valentin.gosu)
Flags: needinfo?(kberezina)

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #17)

> You've identified an important gap in our current Service Worker implementation and its interaction with image caching that I do think we want to address, so I think we care about this and want to address the image caching problems no matter what.
>
> It's always a reasonable question as to whether we should implement something like this where there is potentially more spec work to be done before or after performing the spec work, but I don't think there is actually any blocking spec work here, just gaps in our (ServiceWorker) implementation.

This makes sense, thanks!

Tim, is it fair to keep this as an S2?

Flags: needinfo?(tnikkel)

It's an S2 because it blocks offline gmail use, I don't feel like I know enough about the importance of offline gmail to make that call, so I'm bouncing the needinfo back to yourself.

Flags: needinfo?(tnikkel) → needinfo?(bhood)
Flags: needinfo?(bhood)

My guess here is that this does not have a "high impact" (offline gmail is probably not used broadly), so I'm dropping the severity.

Severity: S2 → S3

(updating the subject to make it easier to find this bug)

Flags: needinfo?(bugmail)
Summary: Difference between how Firefox and Chrome cache resources intercepted with a service worker → Difference between how Firefox and Chrome cache resources intercepted with a service worker can affect gmail icon and image loading
Blocks: 1885871

Bumping Priority and Severity, and assigning Tim to do some research.

Assignee: nobody → tnikkel
Severity: S3 → S2
Priority: P2 → P1