Closed Bug 1728331 Opened 3 years ago Closed 8 hours ago

[Fission] same domain tabs are loaded in the same process when reloaded in bulk

Categories

(Core :: DOM: Content Processes, defect, P3)

Desktop
All
defect

Tracking

()

RESOLVED FIXED
129 Branch
Fission Milestone Future
Tracking Status
firefox-esr78 --- disabled
firefox-esr91 --- disabled
firefox91 --- disabled
firefox92 --- wontfix
firefox93 --- wontfix
firefox94 --- wontfix
firefox129 --- fixed

People

(Reporter: cmuresan, Assigned: nika)

References

Details

Crash Data

Attachments

(5 files, 1 obsolete file)

[Affected versions]:

  • Firefox Nightly 93.0a1 - Build ID: 20210830162701

[Affected Platforms]:

  • Windows 10
  • macOS 10.15.7
  • Linux MX 4.19

[Prerequisites]:

  • Have the following prefs set on a new profile:
    • dom.ipc.processCount.webIsolated set to 4.
    • fission.autostart set to true.

[Steps to reproduce]:

  1. Open 10 webpages from the same domain (eg. facebook, google, wikipedia) each in a new tab.
  2. Open the about:processes page in a new tab.
  3. Restart the browser using the Multiprocess Browser Console (ctrl/cmd+shift+j followed by ctrl/cmd+alt/option+r).
  4. Right click the about:processes tab and choose the "Select all Tabs" option.
  5. Right click the same tab and select the "Reload Tabs" option.
  6. Wait for all tabs to load and observe the process list.

[Expected result]:

  • The tabs with the same domain are split into 4 processes.

[Actual result]:

  • The tabs with the same domain are all in the same process.

[Notes]:

  • The issue is also reproducible when selecting multiple tabs and reloading them simultaneously, but in this case they can end up in two processes.
  • The issue is also reproducible if the tabs are unloaded via Session Restore.
  • The issue is NOT reproducible if the tabs are loaded one at a time.
  • The issue is NOT reproducible if the tabs are opened at the same time from a Bookmark folder or from the Synced Tabs sidebar.
  • Attached a screen recording of the issue: link.

Took a second to look into this. I believe this is caused because we always choose not to create a new process for a remote type if an existing process doesn't have any tabs loaded in it. This is done here: https://searchfox.org/mozilla-central/rev/ad2ffab089e4e0c0fe99a1a046ab2b1c45546bdb/dom/base/ProcessSelector.jsm#49-53. With normal tab opening we would create the browser in each new process synchronously, due to the blocking process launch, but for navigation process switches with Fission, we create it async, and there's a short time period during the process switch where we have the new process launched, but it doesn't have any tabs technically loaded in it yet. During this time if we do another process switching navigation, we'll end up re-using the same process, as we don't see the process as having any tabs loaded in it.

If we want to avoid this behaviour, we'll need to track pending navigations into a content process in addition to existing tabs, and handle them here.

This bug will only affect users if we decide to increase the process count to greater than 1 process (bug 1727158), depending on the results of this experiment.

Blocks: 1727158
See Also: → 1728332

Tracking for Fission MVP.

Tentatively assigning to Nika because she knows the details of this bug.

Assignee: nobody → nika
Fission Milestone: ? → MVP
Priority: -- → P3
Whiteboard: fission-soft-blocker

We'd like to fix this bug, but it doesn't need to block Fission MVP.

Fission Milestone: MVP → Future
Whiteboard: fission-soft-blocker

The changes I currently have posted in bug 1731792 should also fix this issue.

(In reply to Nika Layzell [:nika] (ni? for response) from comment #5)

The changes I currently have posted in bug 1731792 should also fix this issue.

Hi Nika, in bug 1731792 there are still two patches ready to land, do we know if this bug here is already fixed?

In case this bug is still valid instead, were you still planning to look into this?

Flags: needinfo?(nika)

This patch replaces the previous process selection infrastructure with a
new setup implemented entirely in C++, which should more accurately
track the set of processes in use, and should encourage re-use of the
existing content process when navigating by not counting the current
tab.

This approach intentionally allows for process switching to another
process during navigation if there is uneven load between processes to
encourage balanced process use.

I think this may also fix some of the session restore issues with many
tabs using the same process, rather than being spread over 4, as we now
track a tab earlier in its lifecycle before the BrowserParent instance
is created.

This attribute is not used in Gecko, but exists for use by other
applications. Specifically, the APP_TYPE_EDITOR type is given permission
to load privileged images as tested by browser_docshell_type_editor.js.
Before these changes, that test passed because the docshell was loaded
in a different process, so the cache was empty when each load occurred,
but after my changes the process ends up being re-used, so the image
cache bypasses this check.

This changes the image cache key to also include the app type
information so that it will be compared before re-using the entry.

The patch in that bug which landed didn't fix this issue, only the other 2 patches which bounced due to a devtools failure actually fixed this bug.

It appears that the devtools team did some work fixing an intermittent issue around this in bug 1733272, so it may be worth rebasing those patches and seeing if they're landable now.

Flags: needinfo?(nika)

I've moved the patches over to this bug, so that they're not on a closed bug anymore.

Pushed by nlayzell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e9242af1224d
Part 1: Avoid cycling between processes when navigating within a tab, r=smaug
https://hg.mozilla.org/integration/autoland/rev/b6649f0253c5
Part 2: Respect AppType in ImageCacheKey, r=emilio

There are some r+ patches which didn't land and no activity in this bug for 2 weeks.
:nika, could you have a look please?
If you still have some work to do, you can add an action "Plan Changes" in Phabricator.
For more information, please visit auto_nag documentation.

Flags: needinfo?(nika)
Flags: needinfo?(emilio)
Flags: needinfo?(emilio)

These are fairly low-importance patches with confusing test failures which I haven't had the time to investigate yet.

Flags: needinfo?(nika)

This logic did nothing in Fission, however it did have an effect for
e10s-Android, which will no longer have process recycling. I expect the impact
of this to be minimal given Android's tendency to kill processes already.

This feature is being removed because supporting the recycling approach with
the new KeepAlive system being added in future patches would've added
complexity for a largely-unsupported configuration.

Depends on D213334

This shouldn't change behaviour, as the default JS MinTabSelector's behaviour
should roughly match the C++ one. The selector will be updated in a later
patch.

A new pref has been added to disable content process re-use which to replace
the test-only use-case for replacing the process selector.

Depends on D213335

This is a fairly significant patch, however it would be difficult to break it
down into smaller patches:

  1. The various mechanisms used to manage ContentParent lifecycles have been
    merged together into a common "KeepAlive" system. A process will begin shutdown
    when its keepalive count reaches 0. (though it will still wait for all
    BrowserParents to also be dead before sending the actual shutdown message as
    before).

This replaces a number of bespoke systems for tracking BrowserParent instances
in different lifecycle states, remote workers, ongoing process switches, and
preallocated processes.

  1. KeepAlives are now managed automatically by a UniquePtr variant
    (Unique[Threadsafe]ContentParentKeepAlive). This makes the hand-off over
    KeepAlive lifecycles explicit, even for workers.

  2. All KeepAlives are now keyed by a BrowserId, which will be 0 for keepalives
    not associated with a specific tab. This allows the new process selection logic
    to count all tabs other than the one being navigated when deciding which
    process to use.

  3. The process switching logic now tracks it's KeepAlive with a BrowserId,
    meaning that ongoing process switches are considered when performing process
    selection, even if the BrowserParent hasn't been created yet.

Depends on D213337

Attachment #9267920 - Attachment is obsolete: true
Attachment #9267921 - Attachment is obsolete: true
Summary: [Experiment] [process-count-4] Same domain tabs are loaded in the same process when reloaded in bulk → [Fission] same domain tabs are loaded in the same process when reloaded in bulk

(In reply to Nika Layzell [:nika] (ni? for response) from comment #18)

Created attachment 9406847 [details]
Bug 1728331 - Part 4: Make ContentParent KeepAlives explicit with RAII references, r=smaug!,#dom-worker-reviewers!

Just a note that as a side effect this patch reverts the changes from bug 1809134. There seems to be no real need to know about the impending shutdown before the RecvDestroy starts running on the main thread of the content process and in any case no existing code path in the content process was ever based on this sequence.

See Also: → 1809134
Pushed by nlayzell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/45735575df21
Part 1: Remove e10s process recycling, r=smaug
https://hg.mozilla.org/integration/autoland/rev/9e5bd0186aa6
Part 2: Remove JS process selectors, r=smaug
https://hg.mozilla.org/integration/autoland/rev/d920c2d72d00
Part 3: Remove unused EnsureContentProcess method from nsIXULRuntime, r=smaug
https://hg.mozilla.org/integration/autoland/rev/2cf5cad90099
Part 4: Make ContentParent KeepAlives explicit with RAII references, r=smaug,dom-worker-reviewers,asuth

Backed out for causing bc failures in browser_docshell_type_editor.js

  • Backout link
  • Push with failures
  • Failure Log
  • Failure line: TEST-UNEXPECTED-FAIL | image/test/browser/browser_docshell_type_editor.js | APP_TYPE_UNKNOWN is not allowed to access privileged image - false == true - got false, expected true (operator ==)
Flags: needinfo?(nika)
Attachment #9267921 - Attachment description: Bug 1728331 - Part 2: Respect AppType in ImageCacheKey, r=emilio → Bug 1728331 - Part 5: Respect AppType in ImageCacheKey, r=emilio
Attachment #9267921 - Attachment is obsolete: false
Pushed by nlayzell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1d413fe340a0
Part 1: Remove e10s process recycling, r=smaug
https://hg.mozilla.org/integration/autoland/rev/779ac735736c
Part 2: Remove JS process selectors, r=smaug
https://hg.mozilla.org/integration/autoland/rev/9e04f49c926e
Part 3: Remove unused EnsureContentProcess method from nsIXULRuntime, r=smaug
https://hg.mozilla.org/integration/autoland/rev/dabd7d6836c8
Part 4: Make ContentParent KeepAlives explicit with RAII references, r=smaug,dom-worker-reviewers,asuth
https://hg.mozilla.org/integration/autoland/rev/30bbda0eb197
Part 5: Respect AppType in ImageCacheKey, r=emilio
Flags: needinfo?(nika)
Crash Signature: [@ mozilla::dom::ContentParent::GetNewOrUsedLaunchingBrowserProcess]
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 129 Branch → ---
Pushed by nlayzell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/80193c01e94b
Part 1: Remove e10s process recycling, r=smaug
https://hg.mozilla.org/integration/autoland/rev/692426e4d4a1
Part 2: Remove JS process selectors, r=smaug
https://hg.mozilla.org/integration/autoland/rev/3dd07ed49638
Part 3: Remove unused EnsureContentProcess method from nsIXULRuntime, r=smaug
https://hg.mozilla.org/integration/autoland/rev/b55aacf910e7
Part 4: Make ContentParent KeepAlives explicit with RAII references, r=smaug,dom-worker-reviewers,asuth
https://hg.mozilla.org/integration/autoland/rev/0f60c8fbc8db
Part 5: Respect AppType in ImageCacheKey, r=emilio
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: