Open Bug 1873204 Opened 6 months ago Updated 1 month ago

Missing gmail icons as served by ServiceWorker due to apparent 0-length persisted Cache API storage; question of how 0-length records were persisted

Categories

(Core :: Storage: Cache API, defect, P2)

Tracking

()

ASSIGNED
Tracking Status
firefox-esr115 --- unaffected
firefox121 --- unaffected
firefox122 --- wontfix
firefox123 --- wontfix
firefox124 --- wontfix
firefox125 --- wontfix
firefox126 --- wontfix

People

(Reporter: mayankleoboy1, Assigned: asuth)

References

(Regression)

Details

(Keywords: regression)

Attachments

(5 files)

  1. Use my daily profile
  2. Go to gmail (i am already logged in)

AR: The icons in front of the folders are missing, and the icons at the top (the "select all", "refresh" and "3-dot" menu icons) are missing.

  3. If you now do a Ctrl+Shift+R (i.e. force reload), then the icons appear. But the next time you log in, they disappear again.

Regressed by:
Bug 1866240 - Maintain usage information in the database; r=dom-storage-reviewers,asuth

Differential Revision: https://phabricator.services.mozilla.com/D195081

This also repros if:

  1. Open gmail, force-refresh, open gmail in another tab
  2. Open gmail, force-refresh, close gmail tab, open gmail in another tab
  3. Open gmail, force-refresh, simple refresh (click on reload button)

Surprised that no one else has filed such a bug yet!

Attached file about:support
Attached image Bad3.png
Summary: Gmail has icons missing in front of folders the first time you open it → Gmail has icons missing in front of folders unless you force-refresh
Flags: needinfo?(jvarga)

Happy to share any logs, output, etc. privately with the devs.

Just to be sure your profile (specifically <profile>/storage) is ok, can you go to the browser console and check if there are any errors mentioning QM_TRY ?

Please check status of individual storage APIs at https://firefox-storage-test.glitch.me as well.

Thanks

Flags: needinfo?(jvarga)

Set release status flags based on info from the regressing bug 1866240

(In reply to Jan Varga [:janv] from comment #4)

Just to be sure your profile (specifically <profile>/storage) is ok, can you go to the browser console and check if there are any errors mentioning QM_TRY ?

I couldn't see any messages when searching for "QM_TRY". Will share the full log with you on Matrix chat.

Please check status of individual storage APIs at https://firefox-storage-test.glitch.me as well.

Overview:
Storage is working. This is the same version (123) as the last time you loaded this page.
Specific Subsystem Statuses:

LocalStorage
Good: Totally Working. (fullyOperational)
QuotaManager
Good: Totally Working. (fullyOperational)
IndexedDB
Good: Totally Working. (fullyOperational)
Cache API
Good: Totally Working. (fullyOperational)

Debug Info:

{
  "v": 1,
  "curVersion": 123,
  "prevVersion": 123,
  "ls": {},
  "qm": {
    "lastWorkedIn": 123
  },
  "idb": {
    "persistentCreatedIn": 123,
    "persistentLastOpenedIn": 123,
    "clearDetectedIn": 0
  },
  "cache": {
    "firstCacheCreatedIn": 123,
    "unpaddedOpaqueCreatedIn": 0,
    "paddedOpaqueCreatedIn": 123
  }
}

See Also: → 1806264

(copy and paste error on my part; wrong initial see also)

See Also: 1806264 → 1829240
Severity: -- → S3
Priority: -- → P2

Firefox 122 | Regression Engineering Owner (REO)


Hi :jstutte,

I see that this was triaged as S3/P2.

Do we believe that this is a more widespread issue beyond something potentially anomalous in the reporter's profile?

I was unable to reproduce this in Firefox 122 using my own gmail account.

Flags: needinfo?(jstutte)

Forwarding the question to :asuth who has more context. The P2 mostly wanted to say "let's ensure we get to it soon to confirm it is not affecting many users".

Flags: needinfo?(jstutte) → needinfo?(bugmail)

(In reply to Erik Nordin [:nordzilla] from comment #8)

Do we believe that this is a more widespread issue beyond something potentially anomalous in the reporter's profile?

We don't believe this is widespread. We do believe this is likely related to ServiceWorkers, but it's unclear how much of it is due to potential corruption in the user's specific gmail Cache API storage or a broken install by the SW script, versus how much might be due to edge cases or bugs in the gmail SW script like we saw in bug 1829240. I was hoping to dig into this more deeply last week and have a more conclusive answer, but ran out of time; I'm going to take a more triage-y approach to this for now.

We definitely don't think there would be any new regressions[1] related to ServiceWorkers that would be related to this. If we do see any increase in reports of something like this, I would expect it to be related to Gmail performing a rollout of ServiceWorkers.

1: I'm going to needinfo the reporter more about the process of identifying bug 1866240 as a regression, as we would expect any regressions related to this to manifest as a comprehensive QuotaManager failure, but the reporter's feedback from comment 6 suggests this is not occurring.

Flags: needinfo?(bugmail)

Thanks very much for reporting this!

Can you clarify:

  1. Is this still happening for you that you need to use ctrl-shift-r to bypass the ServiceWorker to see the icons? (It's possible if gmail rolled out a new release that the problem might have gone away.)
  2. If you open a new tab, open devtools, bring up the network panel, then use the URL bar to navigate to gmail with the network panel open... do you see any errors in the network panel?
    • In particular, if I use the DOM inspector for my gmail session, I see that the email inbox icon comes from a URL of https://ssl.gstatic.com/ui/v1/icons/mail/gm3/1x/inbox_fill_baseline_n900_20dp.png and if I use the network panel filter to search for "inbox_fill" I see what you can see in the attached image. (Note that I had used the gear icon on the far right of the network panel to check "persist logs" and I opened the network panel on an already loaded gmail session and hit ctrl-r; I only see the 2 ServiceWorker hits if I disable persist logs and hit ctrl-r.)
  3. What was the process you used to identify the regressor? Was that mozregression bisection, checking what landed, or something else? If doing bisection, were you creating a fresh profile each time or using your existing profile? If using your existing profile, did you restore from a backup first?

In terms of mitigations / correcting the problem if it's only a case of an "unhappy" Cache API storage[1]: if you delete the Cache API directory for your gmail origin while Firefox is not running, that should likely fix the problem. Specifically, for example, I have directories like the following in my profile, as found from the root of my profile directory:

  • storage/default/https+++mail.google.com/cache
  • storage/default/https+++mail.google.com^userContextId=10/cache
  • storage/default/https+++mail.google.com^userContextId=2/cache

The one without a userContextId is my default container, and the other 2 are for other containers. If I completely remove the directory when Firefox is not running, at next startup any attempt to load the ServiceWorker should initially fail as it realizes the storage is gone, removing the ServiceWorker. It's also possible to remove the ServiceWorkers from about:serviceworkers either before or after removing the directory.
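To see which of these directories exist before touching anything, a small read-only sketch along the following lines may help. It assumes the `storage/default/<origin>/cache` layout shown above; the function name and the default origin directory name are only illustrative, and the profile root has to be supplied by the caller. It only lists directories and never deletes them.

```python
from pathlib import Path


def find_cache_dirs(profile_root, origin_dir_name="https+++mail.google.com"):
    """List the Cache API directories for one origin, across all containers.

    Assumes the storage/default/<origin>/cache layout described above.
    This only reads the directory tree; it does not modify anything.
    """
    storage = Path(profile_root) / "storage" / "default"
    if not storage.is_dir():
        return []
    matches = []
    for origin_dir in sorted(storage.iterdir()):
        name = origin_dir.name
        # Match the default container exactly, or the same origin followed
        # by a "^userContextId=N" style suffix for other containers.
        if name == origin_dir_name or name.startswith(origin_dir_name + "^"):
            cache_dir = origin_dir / "cache"
            if cache_dir.is_dir():
                matches.append(cache_dir)
    return matches
```

Calling `find_cache_dirs("/path/to/profile")` with the real profile root should return one `cache` path per container that has Cache API data for the origin.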

1: Unfortunately right now it's hard to differentiate whether the Cache API storage is unhappy because of a failure to cache things from the ServiceWorker script (which might not be related to a storage bug in Firefox but could be correlated with our behavior around lifecycle events differing from other browsers) or from a database corruption problem. One thing you can do to help rule out database corruption is, if you change into one of the above "cache" directories and you have a modern SQLite3 binary installed, you can run sqlite3 caches.sqlite "PRAGMA integrity_check" and it will print out ok if the database is okay. This will not work if the database is open by Firefox because it is running and you have a gmail tab corresponding to the origin open.
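If no sqlite3 command-line binary is available, the same check can be sketched with Python's standard `sqlite3` module. The function name is hypothetical, and the caveat above still applies: run it only while Firefox is not running. The database is opened read-only through a URI so that the check itself cannot create or alter the file.

```python
import sqlite3
from pathlib import Path


def integrity_check(db_path):
    """Run PRAGMA integrity_check and return the rows SQLite reports.

    A healthy database yields the single row "ok".
    """
    # mode=ro opens the file read-only and fails if it does not exist.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return [row[0] for row in conn.execute("PRAGMA integrity_check")]
    finally:
        conn.close()
```

For example, `integrity_check("caches.sqlite")` run from inside one of the "cache" directories should return `["ok"]` when the database is intact; anything else is a list of problems reported by SQLite.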

Flags: needinfo?(mayankleoboy1)
Attached image gmail_network panel.png

network panel detail for gmail. All the "302" icons were green for this checkbox icon.

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #11)

  1. Yes, this is still happening. When I load gmail, the icons are missing. I need to do a force refresh (ctrl+shift+r) to see the icons.

  2. I have attached a screenshot of the network panel with "persist logs" enabled. The color is green but the size is 0kb. Make what you will of this :)

I have also saved a HAR file of the network log where I open gmail (and icons are missing) and then do a force refresh (icons appear). I can share the HAR privately with the devs if that is useful. Alternatively, I'm also happy to run test builds, share logs, share live screen etc.

  3. When doing the regression search, I reused my existing profile. That is, I pointed mozregression to my daily profile and told it to "reuse" it. (I.e., I did not create a clone of my daily profile and point mozregression to the clone.)

  4. I am not too bothered with fixing this issue :). I am more interested in supporting the devs to fix the root cause of this issue. I have shared some files with :janv on Matrix. Let me know if there is anything else that I can help with.
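The 0kb entries seen in the network panel could also be pulled out of a saved HAR file. The following is a rough sketch only: it assumes the export records the body size in the standard HAR `response.content.size` field, and the function names and the `gmail.har` file name are made up for illustration.

```python
import json


def load_har(path):
    """Load a HAR file (HAR is plain JSON) from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def zero_length_hits(har, url_substring=""):
    """Return URLs of 2xx responses whose recorded content size is 0."""
    hits = []
    for entry in har.get("log", {}).get("entries", []):
        url = entry.get("request", {}).get("url", "")
        response = entry.get("response", {})
        status = response.get("status", 0)
        # A missing size is treated as unknown (-1), not as empty.
        size = response.get("content", {}).get("size", -1)
        if 200 <= status < 300 and size == 0 and url_substring in url:
            hits.append(url)
    return hits
```

Something like `zero_length_hits(load_har("gmail.har"), "ssl.gstatic.com")` would then list the icon requests that were answered successfully but with an empty body.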

An additional detail: my internet service provider has terrible latency issues. So could it be that the SW's requests time out, the SW thinks gmail is in offline mode, and it then exhibits the behaviour of bug 1829240?

(ni you back so that you don't miss my response)

Flags: needinfo?(mayankleoboy1) → needinfo?(bugmail)

Set release status flags based on info from the regressing bug 1866240

:asuth any further investigation planned here? Wondering if we should still be tracking this for a potential fix in 122/123
I have not seen other reports of this since Fx122 was released.

(In reply to Donal Meehan [:dmeehan] from comment #16)

:asuth any further investigation planned here? Wondering if we should still be tracking this for a potential fix in 122/123
I have not seen other reports of this since Fx122 was released.

Based on comment 13 mentioning network latency/timeout issues, I am currently hoping to look into/validate our behavior around edge cases that could involve us somehow writing Responses into the Cache API that are truncated.

I don't think most users should experience this problem, and if they do, clearing data for the site should fix the problem, but I do want to understand what's happening here, in particular around stream closure. Specifically, since it seems like we must be in the steady state processing of the response, I think the termination of the fetch controller on failure likely applies. The images are served with a content-length header, so I also need to understand what the HTTP spec says about that and how our http channel maps that.
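As a rough illustration of the invariant in question (not Firefox's actual logic), a truncated or 0-length stored body can be recognized by comparing it against the declared Content-Length. The function name is hypothetical, the headers are assumed to be a plain name-to-value mapping, and the body is assumed to be the bytes as transferred, before any content decoding.

```python
def body_matches_content_length(headers, body):
    """Compare a body's length with its declared Content-Length.

    Returns True or False when a usable Content-Length header is present,
    and None when the header is absent or not a plain decimal number.
    """
    declared = None
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "content-length":
            declared = value.strip()
    if declared is None or not (declared.isascii() and declared.isdigit()):
        return None
    return int(declared) == len(body)
```

A 0-length record persisted for an image that was served with a non-zero Content-Length would make this return False, which is the kind of case the timeout test coverage would need to exercise.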

Assignee: nobody → bugmail
Status: NEW → ASSIGNED
Flags: needinfo?(bugmail)
Summary: Gmail has icons missing in front of folders unless you force-refresh → Missing gmail icons as served by ServiceWorker due to apparent 0-length persisted Cache API storage; question of how 0-length records were persisted

Since yesterday (or 1-2 days back), gmail now gives me all the icons. I can monitor this for a few days more if needed.
:asuth, up to you if you want to keep this bug open. Else I can close this as WORKSFORME.

Flags: needinfo?(bugmail)

Thanks for the update! I would still like to finish up ensuring we have test coverage for the timeout cases, so I'd like to keep this bug open.

Flags: needinfo?(bugmail)