Implement more CacheIR ops in PBL
Categories
(Core :: JavaScript Engine, task, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox125 | --- | fixed |
People
(Reporter: jsharp, Assigned: jsharp)
Details
Attachments
(10 files)
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review |
I've been working to extend cfallin's "portable baseline" (PBL) tier with more CacheIR op fast-paths. I'm opening this issue to attach a Phabricator patch stack with the current state of that work.
Assignee | ||
Comment 1•8 months ago
|
||
If you've declared that a value doesn't actually need to be a GC root,
then you're opting out of the usual safety guarantees and might as well
allow constructing a Handle to refer to that value too.
Assignee | ||
Comment 2•8 months ago
|
||
The baseline tier sometimes checks whether any of a bit-set of flags are
present on an ArgumentsObject, so PBL needs the ability to do the same.
As a bonus, factoring this pattern to a separate function allows
simplifying the implementation of the existing getters for flags.
Depends on D192030
Assignee | ||
Comment 3•8 months ago
|
||
The ObjectElements header has a "length" field which is mostly only
meaningful for ArrayObject subclasses of NativeObject. However, this
field is read as a hint in NativeObject::growElements, passed to
goodElementsAllocationAmount when deciding how much memory to allocate,
even if the object is not an array.
The baseline tier sometimes increments this field in the CacheIR
StoreDenseElementHole op, even if the object is not an array. So cfallin
and I want to ensure that PBL matches that behavior.
To do so, we need public access to this field, so this commit adds a
public getter and setter.
Depends on D192031
Assignee | ||
Comment 4•8 months ago
|
||
Many CacheIR ops are only ever followed by a ReturnFromIC. Rather than
branching through the dispatch table just to get to a return, we can
package up this special case to return directly.
Depends on D192032
Assignee | ||
Comment 5•8 months ago
|
||
The most important of these cleanups is a bug-fix for
GuardDynamicSlotIsSpecificObject, which was interpreting its "slot"
parameter as a byte-offset rather than as an index into the slots array.
Other changes just make the boilerplate for reading registers and stub
fields more consistent; add comments for incomplete implementations
relative to the baseline tier; or add more helpful tracing.
Depends on D192033
Assignee | ||
Comment 6•8 months ago
|
||
Depends on D192034
Updated•8 months ago
|
Updated•8 months ago
|
Updated•8 months ago
|
Updated•8 months ago
|
Updated•8 months ago
|
Updated•8 months ago
|
Assignee | ||
Comment 8•8 months ago
|
||
One motivation for this work is to get more performance improvements out of PBL. This tier can give correct results even if all CacheIR ops immediately return NextIC
, which causes it to eventually run the non-IC fallback for each bytecode op. However, the more of these ops we implement, the better performance should get, ideally. So here's a rough benchmark of speed-up ratios on Octane v2, before and after the final patch of this stack:
1.22747x RayTrace
1.17575x EarleyBoyer
1.10825x Mandreel
1.07161x Gameboy
1.06072x PdfJS
1.02652x RegExp
1.00976x Splay
1.00975x CodeLoad
1.00685x NavierStokes
0.995134x Crypto
0.975806x DeltaBlue
0.965409x Box2D
0.946667x Richards
The overall score computed by Octane shows a 1.04583x speedup. However this should all only be taken as a rough guide to performance impact as I've seen significant variability in trying to measure this benchmark suite.
Even if we assume these measurements are valid, I don't think it's worth paying attention to the benchmarks which show slowdowns until we have implemented all of the CacheIR ops that those benchmarks use. Implementing only part of an IC chain can lead to slowdowns as we evaluate the newly-implemented ops and then throw away the results due to encountering an unimplemented op, then redo the work in the fallback. If there are still slowdowns at that point, we should profile where the time is going; we may find that there are cases where interpreting the IC chain is slower than just using the full fallback, in which case the "hybrid ICs" logic in PBL may need tuning.
Updated•8 months ago
|
Updated•8 months ago
|
Updated•8 months ago
|
Assignee | ||
Comment 9•8 months ago
|
||
Apparently clang-format wants to apply diffs like this:
- INT32_OP(BitXor, ^, {});
- INT32_OP(BitXor, ^,
-
{
-
});
That's just excessive, and also for some reason wasn't required when pbl
landed about a month ago. I propose that we just ignore clang-format in
this case.
Depends on D192034
Comment 10•7 months ago
|
||
There are some r+ patches which didn't land and no activity in this bug for 2 weeks.
:jsharp, could you have a look please?
If you still have some work to do, you can add an action "Plan Changes" in Phabricator.
For more information, please visit BugBot documentation.
Updated•7 months ago
|
Assignee | ||
Comment 11•7 months ago
|
||
Sorry for the delay. I discovered that the jit-tests suite has been failing when PBL is built in debug mode. Chris didn't previously turn on debug builds in CI so nobody noticed.
I don't think these patches introduce any new test suite failures so I'd be happy to land them as-is, but Chris argued (and I don't blame him) that I should open a separate bug and land the fixes for the existing failures before landing these new patches. However I haven't had time to make much progress on investigating those failures; I have a fix for one issue but I don't know how many more there are.
So as far as I'm concerned, this patch stack could land as-is, but I would totally understand if you agree with Chris that we should set it aside until we can verify that debug-build jit-tests pass too.
Assignee | ||
Comment 12•7 months ago
|
||
The various compilers' implementations of emitCompareStringResult
check several fast cases first and only fall back to a general
comparison if none of those cases apply. The general comparisons include
debug asserts that the fast paths must have already been checked. So PBL
needs to check those fast paths too.
Depends on D192035
Assignee | ||
Comment 13•7 months ago
|
||
Depends on D195677
Assignee | ||
Comment 14•7 months ago
|
||
Depends on D195678
Assignee | ||
Comment 15•7 months ago
|
||
The fixes for PBL to pass the jit-test suite in debug builds turned out to be pretty minor so I've just added them to this patch stack, as well as a patch to run these tests in CI going forward. I don't know anything about Mozilla CI configuration so please double-check that I copy-pasted things correctly.
Could you also do a try run for me? I'm pretty confident in the correctness of these patches at this point but, you know, just in case.
Comment 16•7 months ago
|
||
(In reply to [email protected] from comment #15)
Could you also do a try run for me? I'm pretty confident in the correctness of these patches at this point but, you know, just in case.
Sure, here you go: https://treeherder.mozilla.org/jobs?repo=try&group_state=expanded&revision=1a183b6bf04922f656f07237c259dd685bff908f
Comment 17•7 months ago
|
||
Side note; probably worth asking for Level 1 SCM access at this point if you'll be working on maintaining PBL.
Comment 18•7 months ago
|
||
jsharp, let us know if/when this is ready to land and one of us can do it :)
Assignee | ||
Comment 19•7 months ago
|
||
Alright, I think it's ready to land! Thank you for your help getting it there. My first Mozilla contribution, how exciting :-)
Comment 20•4 months ago
|
||
I've taken over driving this patch-stack and did a rebase and fixed an issue with the added CI config; try builds here for everything but pbl-debug
, and here for a rerun of the latter. :jsharp plans to update his patch-stack from this branch when able and then I'll Lando it when green. It seems there are some test timeouts which we'll take a look at.
Comment 21•4 months ago
|
||
Here is a green try-run of the pbl-debug
CI job; "optimize": false
had caused the timeouts. I think this is ready to land once Jamey is able to upload the above branch as updated patches.
Comment 22•4 months ago
|
||
Pushed by chris@cfallin.org: https://hg.mozilla.org/integration/autoland/rev/99ba26092555 Allow using fake-rooted values as handles, r=jonco https://hg.mozilla.org/integration/autoland/rev/6a531229c73c Allow checking multiple ArgumentsObject flags, r=jandem https://hg.mozilla.org/integration/autoland/rev/d942b1570d05 Ignore "length" for non-arrays, r=jandem https://hg.mozilla.org/integration/autoland/rev/06febe799b24 Specialize PREDICT_NEXT for ReturnFromIC, r=jandem https://hg.mozilla.org/integration/autoland/rev/844d439c9a8b Cleanups for existing PBL CacheIR ops, r=jandem https://hg.mozilla.org/integration/autoland/rev/71e5b4ccf4bd Don't reformat short INT32_OP invocations, r=jandem https://hg.mozilla.org/integration/autoland/rev/72d2892bd9bc Implement more CacheIR ops in PBL, r=jandem https://hg.mozilla.org/integration/autoland/rev/c053ea8e7a45 Fix debug asserts in PBL CompareStringResult, r=jandem https://hg.mozilla.org/integration/autoland/rev/a1da44e0fc12 Fix PBL setting frame debuggee flag, r=jandem https://hg.mozilla.org/integration/autoland/rev/8bfa623afd6f Add PBL debug builds to CI, r=sfink,jmaher
Comment 23•4 months ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/99ba26092555
https://hg.mozilla.org/mozilla-central/rev/6a531229c73c
https://hg.mozilla.org/mozilla-central/rev/d942b1570d05
https://hg.mozilla.org/mozilla-central/rev/06febe799b24
https://hg.mozilla.org/mozilla-central/rev/844d439c9a8b
https://hg.mozilla.org/mozilla-central/rev/71e5b4ccf4bd
https://hg.mozilla.org/mozilla-central/rev/72d2892bd9bc
https://hg.mozilla.org/mozilla-central/rev/c053ea8e7a45
https://hg.mozilla.org/mozilla-central/rev/a1da44e0fc12
https://hg.mozilla.org/mozilla-central/rev/8bfa623afd6f
Description
•