Bug 1824669 (Open) - Opened 1 year ago, Updated 1 year ago

1340.3 - 84.67% target:parent-process objects-with-no-stacks / reload-inspector:parent-process objects-with-no-stacks + 9 more (Linux) regression on Mon March 20 2023

Categories

(DevTools :: General, defect, P3)

defect

Tracking

(firefox-esr102 unaffected, firefox111 unaffected, firefox112 unaffected, firefox113 fix-optional)

Tracking Status
firefox-esr102 --- unaffected
firefox111 --- unaffected
firefox112 --- unaffected
firefox113 --- fix-optional

People

(Reporter: jdescottes, Unassigned)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Perfherder has detected a devtools performance regression from push 1371c18ce24fdc3fd80d94aff58434d5875079fb. You are the author of one of the patches included in that push, and we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new)
1340% target:parent-process objects-with-no-stacks linux1804-64-tsan-qr 231.58 -> 3,335.50
117% reload-netmonitor:parent-process objects-with-no-stacks linux1804-64-qr 2,878.08 -> 6,249.83
110% reload-no-devtools:parent-process objects-with-no-stacks linux1804-64-qr 3,096.83 -> 6,498.67
108% reload-no-devtools:parent-process objects-with-no-stacks linux1804-64-shippable-qr 3,148.46 -> 6,540.42
107% reload-netmonitor:parent-process objects-with-no-stacks linux1804-64-shippable-qr 3,160.80 -> 6,536.00
107% reload-debugger:parent-process objects-with-no-stacks linux1804-64-qr 3,092.88 -> 6,395.08
103% reload-debugger:parent-process objects-with-no-stacks linux1804-64-shippable-qr 3,292.83 -> 6,696.25
99% reload-webconsole:parent-process objects-with-no-stacks linux1804-64-qr 3,449.63 -> 6,870.25
91% reload-webconsole:parent-process objects-with-no-stacks linux1804-64-shippable-qr 3,721.69 -> 7,099.17
88% reload-inspector:parent-process objects-with-no-stacks linux1804-64-qr 3,780.96 -> 7,117.25
85% reload-inspector:parent-process objects-with-no-stacks linux1804-64-shippable-qr 4,084.69 -> 7,543.08
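
For reference, the "Ratio" column appears to be the relative increase over the old value; this small sketch (an assumption on my part, but it matches the 1340.3% and 84.67% figures in the bug title) reproduces the first and last rows above:

```ts
// Sketch of how the "Ratio" column can be reproduced, assuming Perfherder
// reports the relative increase (new - old) / old as a percentage.
function regressionRatio(oldValue: number, newValue: number): number {
  return ((newValue - oldValue) / oldValue) * 100;
}

console.log(regressionRatio(231.58, 3335.5).toFixed(1));   // "1340.3" (target:parent-process, tsan)
console.log(regressionRatio(4084.69, 7543.08).toFixed(1)); // "84.7"   (reload-inspector, shippable)
```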

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

For more information on performance sheriffing please see our FAQ.

This specific push doesn't look like it would have this kind of impact.
From the graph, I was able to see that this push range would contain the troublemaker: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=b477ec718591a08f2251bbc0499e487597a483cf&tochange=1371c18ce24fdc3fd80d94aff58434d5875079fb

We have a few patches for DevTools (Bug 1823399, Bug 1823335, Bug 1815937) and nothing else in the push seems to be related, so I'll have a look.

(In reply to Nicolas Chevobbe [:nchevobbe] from comment #1)

This specific push doesn't look like it would have this kind of impact.
From the graph, I was able to see that this push range would contain the troublemaker: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=b477ec718591a08f2251bbc0499e487597a483cf&tochange=1371c18ce24fdc3fd80d94aff58434d5875079fb

We have a few patches for DevTools (Bug 1823399, Bug 1823335, Bug 1815937) and nothing else in the push seems to be related, so I'll have a look.

Thanks! It would be great to be able to fix the pushlogs on existing alerts :)

I reverted the patch on DevTools and couldn't see much impact, but I also get some pretty unstable results :/

Looking at one graph (https://treeherder.mozilla.org/perfherder/graphs?highlightAlerts=1&highlightChangelogData=1&highlightCommonAlerts=0&selected=3987732,1664848258&series=autoland,3987732,1,12&timerange=1209600&zoom=1679345627097,1679346348503,3365.176560437021,7469.62772346326), I was able to see a smaller push, corresponding to Bug 1819128.

Florian, I'm not sure I understand what's done in your patch, but could it introduce some delay in GC that would explain the regression we saw here?

Flags: needinfo?(florian)

(In reply to Nicolas Chevobbe [:nchevobbe] from comment #3)

I reverted the patch on DevTools and couldn't see much impact, but I also get some pretty unstable results :/

Looking at one graph (https://treeherder.mozilla.org/perfherder/graphs?highlightAlerts=1&highlightChangelogData=1&highlightCommonAlerts=0&selected=3987732,1664848258&series=autoland,3987732,1,12&timerange=1209600&zoom=1679345627097,1679346348503,3365.176560437021,7469.62772346326), I was able to see a smaller push, corresponding to Bug 1819128.

Florian, I'm not sure I understand what's done in your patch, but could it introduce some delay in GC that would explain the regression we saw here?

Do you have profiles? I added markers for update timers, so it should be easy to see what's going on if update timers are messing with your tests.

The behavior before bug 1819128 was that we would never fire more than one update timer at once. If two or more were scheduled to fire at the same time, we would fire one, then wait two minutes, fire the next one, etc... From a power use perspective, it's better to run everything at once.
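
A minimal sketch, assuming the behavior described above; this is not the real update timer manager code, and the names, the scheduler callback, and the 2-minute spacing constant are illustrative:

```ts
interface DueTimer {
  id: string;
  callback: () => void;
}

const SPACING_MS = 2 * 60 * 1000; // the "wait two minutes" gap described above

// Before bug 1819128: fire one due timer, then space the rest two minutes apart.
function fireSequentially(due: DueTimer[], schedule: (cb: () => void, delayMs: number) => void) {
  due.forEach((timer, index) => schedule(timer.callback, index * SPACING_MS));
}

// After bug 1819128: everything that is due runs in the same wake-up, which is
// better for power use but concentrates all the timer work in one burst.
function fireTogether(due: DueTimer[], schedule: (cb: () => void, delayMs: number) => void) {
  due.forEach((timer) => schedule(timer.callback, 0));
}

// Example usage with setTimeout as the scheduler:
const timers: DueTimer[] = [
  { id: "a", callback: () => console.log("a fired") },
  { id: "b", callback: () => console.log("b fired") },
];
fireTogether(timers, (cb, delayMs) => setTimeout(cb, delayMs));
```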

Attempting to land the patch uncovered non-deterministic behavior in our tests, because our tests start with new profiles where no update timer has ever fired, so all of them are scheduled to fire 30s after startup. Before the patch, that meant a first timer would fire after 30s, then we would have one firing every 2 minutes. After the patch, they all run at once, 30s after startup. Some of these update timer callbacks do almost nothing and are very cheap, but some trigger a lot more work (triggering background updates that actually use CPU time). If your performance tests typically take more than 30s (likely), and some update timers are not disabled (very likely), then the test run that happens to fall 30s after startup is slower, because update timers take time away from the tests.
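
To make the timing interaction concrete, here is a toy simulation under the assumptions above (fresh profile, every update timer first scheduled 30 seconds after startup, a test window active around that time); the timer count and test window are illustrative, not measured:

```ts
const STARTUP_DELAY_MS = 30_000;   // first firing time for timers in a fresh profile
const SPACING_MS = 2 * 60 * 1000;  // old behavior: one timer every two minutes
const TIMER_COUNT = 6;             // illustrative number of registered update timers

// Old behavior: 30s, 2m30s, 4m30s, ... so at most one timer lands in a short test window.
const oldFireTimes = Array.from({ length: TIMER_COUNT }, (_, i) => STARTUP_DELAY_MS + i * SPACING_MS);

// New behavior: all timers fire together 30s after startup.
const newFireTimes = Array.from({ length: TIMER_COUNT }, () => STARTUP_DELAY_MS);

// Hypothetical test run that happens to be active around the 30s mark.
const testWindow = { startMs: 25_000, endMs: 60_000 };
const inWindow = (t: number) => t >= testWindow.startMs && t <= testWindow.endMs;

console.log("timers firing during the test (old):", oldFireTimes.filter(inWindow).length); // 1
console.log("timers firing during the test (new):", newFireTimes.filter(inWindow).length); // 6
```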

Flags: needinfo?(florian)

Those specific tests assert the number of objects created during the test, so indeed if new update timers happen during those tests, it's very likely they create additional objects. The tests can't really differentiate between devtools and non-devtools objects either, so if that's the case then it seems we can accept this as a new baseline?
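
As a toy illustration (not the real DAMP harness) of why such a process-wide count cannot be attributed to DevTools:

```ts
// A global counter only sees totals; it cannot tell which code allocated.
let objectsWithNoStacks = 0;
const recordAllocation = (_source: string) => { objectsWithNoStacks++; };

recordAllocation("devtools code under test");
recordAllocation("unrelated update-timer callback");
console.log(objectsWithNoStacks); // 2, with no way to split the count per source
```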

(In reply to Julian Descottes [:jdescottes] from comment #5)

The tests can't really differentiate between devtools and non-devtools objects either, so if that's the case then it seems we can accept this as a new baseline?

It would be good to verify with a profile (or some debug logging) that update timers are indeed the cause. Accepting this as a new baseline seems risky, because it means any change to the code run by any update timer could regress (or improve) these tests again.

Setting P3/S3; we can monitor this in the coming weeks. The numbers look very bad, but as explained above, we count all objects, so those tests are by nature very flaky.

Severity: -- → S3
Priority: -- → P3

Backing out the patches from Bug 1819128, I can confirm that it is the regressor, as I'm getting back to the previous baseline:

https://treeherder.mozilla.org/perfherder/compare?originalProject=try&originalRevision=4a3adb1c22eeac3a4416df90ddc7ea312da0424e&newProject=try&newRevision=297b98703c5a0061174b14cdf6b5db40e723dc3f&framework=12&page=1&filter=reload-webconsole%3Aparent-process+objects-with-no-stacks

I'm setting the regressed-by field, but that's not on Florian to fix; we need to check whether the numbers are stable, and either accept them or find a way to properly wait for stable results.

Regressed by: 1819128

Set release status flags based on info from the regressing bug 1819128
