Open Bug 1673749 Opened 4 years ago Updated 2 years ago

Firefox 82.0 tab crash [@ mozilla::extensions::StreamFilterParent::Init ]

Categories

(WebExtensions :: Request Handling, defect, P3)

Firefox 82
x86_64
Windows 10
defect

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: programming.alinpasol, Assigned: robwu)

References

Details

Crash Data

Hi. I am the developer of an add-on called Twitter Link Deobfuscator, which uses webRequest.StreamFilter.ondata to intercept and modify response data sent from the Twitter servers.
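
For context, here is a minimal sketch of that ondata pattern. The URL pattern, content types and rewriting step are placeholders rather than the add-on's actual logic, and it assumes a background script with the "webRequest", "webRequestBlocking" and matching host permissions:

  browser.webRequest.onBeforeRequest.addListener(
    (details) => {
      // Ask Firefox to route this response body through a StreamFilter.
      const filter = browser.webRequest.filterResponseData(details.requestId);
      const decoder = new TextDecoder("utf-8");
      const encoder = new TextEncoder();
      let body = "";

      // ondata fires once per chunk of the response body.
      filter.ondata = (event) => {
        body += decoder.decode(event.data, { stream: true });
      };

      // When the response is complete, rewrite it and hand it back to the page.
      filter.onstop = () => {
        body += decoder.decode();
        const rewritten = body; // placeholder for the link-deobfuscation step
        filter.write(encoder.encode(rewritten));
        filter.close();
      };
    },
    { urls: ["https://twitter.com/*"], types: ["xmlhttprequest"] }
  );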

One of my users told me that he experienced multiple tab crashes. He submitted 2 crash reports: c0537db9-aa83-4558-9013-707e50201026 and 6c6a3264-b361-45a3-8848-5d9f20201027. The first crash (c0537db9-aa83-4558-9013-707e50201026) happened in a tab with a Twitter page open and the second (6c6a3264-b361-45a3-8848-5d9f20201027) happened on a YouTube page. In both cases, only the tabs crashed, not the browser.

We suspect that this bug is related to bugs 1403546 and 1648132.

Another one of the user's tabs crashed. This time he used Firefox 82.0.1 build 20201026153733. The crash report's ID is 56aeb62e-58bd-426c-8635-6319f0201028.

Component: General → Request Handling
Product: Firefox → WebExtensions

Hello,

I’ve attempted to reproduce the issue on the latest Nightly (84.0a1/20201101092255) and Release (82.0.2/20201027185343) under Windows 10 Pro 64-bit, but without success. Several tabs with Twitter and YouTube (playing a video) were kept open for approximately 2 hours; none of the tabs crashed.

Did you install the add-on beforehand? It might be that the API call is the cause. You can install it from the AMO page or from the GitHub repository as a temporary add-on.

(In reply to Alin Pasol from comment #3)

Did you install the add-on beforehand? It might be that the API call is the cause. You can install it from the AMO page or from the GitHub repository as a temporary add-on.

Yes, of course.

(In reply to Alex Cornestean from comment #4)

(In reply to Alin Pasol from comment #3)

Did you install the add-on beforehand? It might be that the API call is the cause. You can install it from the AMO page or from the GitHub repository as a temporary add-on.

Yes, of course.

OK, then. I will contact the user to see if he can create an account to provide more information.

(In reply to Alex Cornestean from comment #2)

I’ve attempted to reproduce the issue on the latest Nightly (84.0a1/20201101092255) and Release (82.0.2/20201027185343) under Windows 10 Pro 64-bit, but without success. Several tabs with Twitter and YouTube (playing a video) were kept open for approximately 2 hours; none of the tabs crashed.

Just two hours won't cut it; my two reports that Alin mentioned, for example, have 5 and 22 hours of browser uptime, and on a fresh install it may take some days for a crash to show up. Having a twitter.com tab open should be sufficient; I don't think that youtube.com (or kicker.de, which also crashed some time later) has a direct correlation with this situation. These tab crashes only really came up after Twitter Link Deobfuscator started to monitor the network traffic with the aforementioned "StreamFilter" function.

I haven't had new tab crashes for the last couple of days, but they may not have had enough time to occur, since I had to reboot my system repeatedly due to driver and Windows updates. I will submit the next crash report and mention it here if this bug still persists.

I was starting to hope that the "StreamFilter"-related crashes were gone for good after a week or so without incidents, but yet another twitter.com tab croaked on my end just an hour ago: https://crash-stats.mozilla.org/report/index/8cd29b32-4c37-456f-aeda-550020201107

Severity: -- → S2
Status: UNCONFIRMED → NEW
Crash Signature: [@ mozilla::extensions::StreamFilterParent::Init ]
Ever confirmed: true
Priority: -- → P2
Flags: needinfo?(mixedpuppy)

Can someone tell me if there is a Linux tool that can open a minidump created from Windows? I would like to take a look at the minidumps @spodermenpls provided me.

What are you hoping to find in the minidump? The relevant call stack is already shown on the crash-stats page.

I'm not aware of a way to debug Windows minidump files on Linux, other than using a Windows VM. Microsoft conveniently provides VM images with development tools included at https://developer.microsoft.com/en-us/windows/downloads/virtual-machines/

If you manage to find a dmp file from Linux, then you can read https://firefox-source-docs.mozilla.org/contributing/debugging/debugging_a_minidump.html . In the past I have used minidump-2-core to create a coredump to inspect in gdb.

See Also: → 1678734

(In reply to Rob Wu [:robwu] from comment #9)

What are you hoping to find in the minidump? The relevant call stack is already shown on the crash-stats page.

I'm not aware of a way to debug Windows minidump files on Linux, other than using a Windows VM. Microsoft conveniently provides VM images with development tools included at https://developer.microsoft.com/en-us/windows/downloads/virtual-machines/

If you manage to find a dmp file from Linux, then you can read https://firefox-source-docs.mozilla.org/contributing/debugging/debugging_a_minidump.html . In the past I have used minidump-2-core to create a coredump to inspect in gdb.

Does the crash-stats page contain all the relevant info? It only shows the public data; does the private data contain anything useful?

I have downloaded other VMs from Microsoft in the past, but I was not aware of those, so thank you very much for pointing them out. I will try them.

I already read the article on how to debug a minidump and built Breakpad, but when I tried to open one with minidump-2-core it exited with an error message: "This minidump was not generated by Linux or NaCl." I read in the documentation that it can be used to open Linux minidumps, but I thought it could open Windows minidumps too. I figured there had to be a way to open Windows minidumps from Linux and that this must be it.

A few hours later, one twitter.com and one youtube.com tab crashed again, but only one crash report was generated (perhaps because they happened simultaneously?): https://crash-stats.mozilla.org/report/index/c6b8b368-8e0f-42c1-a8d2-ded5e0201122

I should mention that I am now using Firefox 83.0, so the bug was, as expected, carried over from the 82.0 branch.

It's very interesting that you're hitting this crash so often; I wonder whether there is any relation to the number of add-ons that use the webRequest extension API.
Could you look in your profile directory (see about:support) and share addonStartup.json.lz4? I am primarily interested in the webRequest listeners + add-on ID + add-on version, so if you would like to redact its contents, you can run mozlz4a.py -d addonStartup.json.lz4 output.json using mozlz4a.py from https://gist.github.com/kaefer3000/73febe1eec898cd50ce4de1af79a332a/a266410033455d6b4af515d7a9d34f5afd35beec to decompress it, and redact the parts that you don't want to share.

If you don't want to share the file in public, mail it to me.

(In reply to Rob Wu [:robwu] from comment #13)

It's very interesting that you're hitting this crash so often; I wonder whether there is any relation to the number of add-ons that use the webRequest extension API.
Could you look in your profile directory (see about:support) and share addonStartup.json.lz4? I am primarily interested in the webRequest listeners + add-on ID + add-on version, so if you would like to redact its contents, you can run mozlz4a.py -d addonStartup.json.lz4 output.json using mozlz4a.py from https://gist.github.com/kaefer3000/73febe1eec898cd50ce4de1af79a332a/a266410033455d6b4af515d7a9d34f5afd35beec to decompress it, and redact the parts that you don't want to share.

If you don't want to share the file in public, mail it to me.

I've just sent you the file via e-mail. I don't have a Python environment set up on my system, and the contents of said file don't seem to be overly sensitive from what I've read, so I haven't touched it. Please tell me if you've got it and if it contains the information you mentioned.

Out of the list of add-ons from comment 14, the following extensions use webRequest.filterResponseData (there are more add-ons that use (non-)blocking webRequest for some or all URLs, but they are likely not interesting):

The Twitter Link Deobfuscator extension uses the filterResponseData API in an unreliable and non-deterministic way. The onBeforeRequest listener is non-blocking, and the logic that determines whether or not to attach the filter is asynchronous, with a significant number of async steps (querying extension storage + querying tabs). If the latter operations are slow, then it is theoretically possible for a race between the start of the response and the attachment of a StreamFilter (in response to the webRequest.filterResponseData call).
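
A rough sketch of that shape (the storage key and the exact checks are hypothetical; only the structure matters): the listener is not registered with ["blocking"], and filterResponseData is only called after several awaits, so by then the response may already have started.

  browser.webRequest.onBeforeRequest.addListener(
    async (details) => {
      // Hypothetical async checks standing in for the add-on's storage and tabs queries.
      const { enabled } = await browser.storage.local.get("enabled");
      if (!enabled || details.tabId === -1) {
        return;
      }
      const tab = await browser.tabs.get(details.tabId);
      if (!tab) {
        return;
      }
      // If the awaits above were slow, OnStartRequest may already have run in the
      // parent process; attaching a filter at this point is what can hit the race below.
      const filter = browser.webRequest.filterResponseData(details.requestId);
      filter.ondata = (event) => filter.write(event.data);
      filter.onstop = () => filter.close();
    },
    { urls: ["https://twitter.com/*"] }
    // Note: no ["blocking"], so the request is not paused while the checks run.
  );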

In that case, the following sequence of events could lead to a crash (by assertion failure):

  1. [content process] HttpChannelChild initialized.
  2. [main process] "http-on-modify-request": webRequest.onBeforeRequest is dispatched to the extension (non-blocking); the request starts.
  3. [extension process] Extension starts some async stuff and eventually calls webRequest.filterResponseData. (non-blocking!)
  4. [main process] StreamFilter creation request received.
  5. [main process] Meanwhile, the response is received and OnStartRequest is called.
    • This clears the reference to the traceable channel, so that extensions cannot attach new stream filters.
    • OnStartRequest forwarded to content child.
  6. [content process] HttpChannelChild::DoOnStartRequest called, sets mTracingEnabled = false; @ https://searchfox.org/mozilla-central/rev/6bb59b783b193f06d6744c5ccaac69a992e9ee7b/netwerk/protocol/http/HttpChannelChild.cpp#547,556
  7. [content process] Process AttachStreamFilter request from end of step 5 @ https://searchfox.org/mozilla-central/rev/6bb59b783b193f06d6744c5ccaac69a992e9ee7b/netwerk/protocol/http/HttpChannelChild.cpp#3021-3022,3031,3036

I tried to write a minimal example to reproduce this, but couldn't. However, I was able to verify that StreamFilters can be created right up until the parent's OnStartRequest notification (after the first sub-step of step 5, StreamFilter.create calls are rejected).

If step 7 runs earlier (e.g. before 6 or 5), then the crash wouldn't happen. Since HTTP requests are often slower than extension functions (and the window for the race condition is relatively narrow anyway), this assertion/crash isn't triggered that often in practice.

To fix this, I propose to check whether the channel has started (and drop/close the stream filter) in AttachStreamFilterEvent::Run (before the call to StreamFilterParent::Attach) @ https://searchfox.org/mozilla-central/rev/6bb59b783b193f06d6744c5ccaac69a992e9ee7b/netwerk/protocol/http/HttpChannelChild.cpp#3022

For extensions, a workaround to avoid this problem is to only call browser.webRequest.filterResponseData from a blocking webRequest listener, as sketched below.
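
A hedged sketch of that workaround (the URL pattern and the async check are placeholders): with ["blocking"], the request is suspended until the listener's returned promise settles, so the filter is attached before the response can start.

  browser.webRequest.onBeforeRequest.addListener(
    (details) => {
      // Attach the filter synchronously while the request is still suspended,
      // so it cannot race against OnStartRequest in the parent process.
      const filter = browser.webRequest.filterResponseData(details.requestId);
      filter.ondata = (event) => filter.write(event.data);
      filter.onstop = () => filter.close();

      // Any asynchronous decision (placeholder check) can still happen afterwards;
      // if the filter turns out to be unnecessary, disconnect it and let the
      // response flow through unmodified.
      return browser.storage.local.get("enabled").then(({ enabled }) => {
        if (!enabled) {
          filter.disconnect();
        }
      });
    },
    { urls: ["https://twitter.com/*"] },
    ["blocking"]
  );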

I just looked through the stacks of recent crash reports associated with this bug: https://crash-stats.mozilla.org/signature/?moz_crash_reason=~rv&signature=mozilla%3A%3Aextensions%3A%3AStreamFilterParent%3A%3AInit&date=%3E%3D2020-11-26T23%3A07%3A00.000Z&date=%3C2020-12-03T23%3A07%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=proto_signature&_sort=-date&page=1

It seems that my analysis covers all of these reports.
ESR78 has a different stack, but it still falls under my analysis (step 4 is just a bit shorter), due to changes from bug 1646592.

Assignee: nobody → rob
Status: NEW → ASSIGNED
Flags: needinfo?(mixedpuppy)

Hey Rob, based on a very quick look it seems that the most recent crash reports being tracked by Bug 1403546 now overlap with the ones tracked by this one (the number of crashes correlated with the two issues looks exactly the same, and I opened a couple and they definitely had the same crash identifiers).

Would you mind double-checking whether we really need two separate issues (both P2s but with slightly different severity) to track the remaining scenarios that are triggering a crash from StreamFilterParent::Init?

Flags: needinfo?(rob)
See Also: → 1403546

I closed the other bug. I'll drop the priority here since it's a low-volume crash with no clear STR. I described my hypothesis in comment 15, but was unable to create a unit test so I moved to other stuff.

Flags: needinfo?(rob)
Priority: P2 → P3

Since the crash volume is low (less than 15 per week), the severity is downgraded to S3. Feel free to change it back if you think the bug is still critical.

For more information, please visit auto_nag documentation.

Severity: S2 → S3