gh-119109: improve `functools.partial` vectorcall with keywords #124584

dg-pb · 2024-09-26T08:40:10Z

(Potentially closes #128050)

IMO this is the best approach to resolving the fallback "issue". It:
a) Eliminates the fallback, and with it any need to switch between implementations after initial construction
b) Delivers performance benefits for vectorcall when the partial has keywords

Benchmark:

# BENCH 2 ARGS
# ------------
S="
from functools import partial
f=lambda a, b: a - b
p1 = partial(f)
p2 = partial(f, b=2)
l = lambda a: f(a, b=2)
"

$PYCMD -c "${S}; print(p1(1, 2))"   # -1     | -1     |
$PYCMD -c "${S}; print(p2(1))"      # -1     | -1     |
                                    # BEFORE | AFTER  | %CHN | LAMBDA LB
$PYCMD -m timeit -s $S 'p1(1, 2)'   #  87 ns |  85 ns |      |
$PYCMD -m timeit -s $S 'p1(1, b=2)' # 100 ns |  96 ns |      |
$PYCMD -m timeit -s $S 'p2(1)'      # 240 ns | 135 ns | -45% |  94 ns
$PYCMD -m timeit -s $S 'p2(a=1)'    # 350 ns | 160 ns | -55% | 110 ns


# BENCH 10 ARGS
# -------------
S="
from functools import partial
func = lambda a, b, c, d, e, f, g, h, i, j: (a + b + c + d + e + f + g + h + i + j)
p = partial(func, f=5, g=6, h=7, i=8, j=9)
l = lambda a, b, c, d, e, f=5, g=6, h=7, i=8, j=9: func(a, b, c, d, e, f=f, g=g, h=h, i=i, j=j)
"

C0="${S}; print(p(0, 1, 2, 3, 4))"
C1='p(0, 1, 2, 3, 4)'
C2='p(a=0, b=1, c=2, d=3, e=4)'                             # disjoint kw and pto_kw
C3='p(a=0, b=1, c=2, d=3, e=4, f=5, g=6)'                   # kw partially overlaps pto_kw
C4='p(a=0, b=1, c=2, d=3, e=4, f=5, g=6, h=7, i=8, j=9)'    # kw overrides pto_kw


$PYCMD -c $C0               #  45     | 45     |
                            #  BEFORE | AFTER  | %CHN | LAMBDA LB
$PYCMD -m timeit -s $S $C1  #  440 ns | 320 ns | -28% | 240 ns
$PYCMD -m timeit -s $S $C2  #  890 ns | 440 ns | -50% | 260 ns
$PYCMD -m timeit -s $S $C3  # 1000 ns | 600 ns | -40% | 270 ns
$PYCMD -m timeit -s $S $C4  # 1250 ns | 700 ns | -44% | 300 ns

# FUNCTION CALL - 210 ms
$PYCMD -m timeit -s $S 'func(a=0, b=1, c=2, d=3, e=4, f=5, g=6, h=7, i=8, j=9)'

No penalty for calls without pto_kwds.
Non-negligible speed improvement for calls with pto_kwds: 27 - 55%
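For readers who want to check what the 10-argument benchmark cases actually exercise, here is a minimal Python sketch of the four call shapes (positional only, disjoint keywords, partial overlap, full override). It reuses `func` and `p` from the setup string above, with a `def` in place of the lambda; the `results` dictionary and the `overridden` call are illustrative additions, not part of the PR.

```python
from functools import partial


def func(a, b, c, d, e, f, g, h, i, j):
    return a + b + c + d + e + f + g + h + i + j


# Same partial as in the benchmark setup: five keywords are stored up front.
p = partial(func, f=5, g=6, h=7, i=8, j=9)

results = {
    # C1: positional only, stored keywords fill the rest
    "positional": p(0, 1, 2, 3, 4),
    # C2: call keywords are disjoint from the stored keywords
    "disjoint": p(a=0, b=1, c=2, d=3, e=4),
    # C3: call keywords partially overlap the stored keywords
    "partial_overlap": p(a=0, b=1, c=2, d=3, e=4, f=5, g=6),
    # C4: call keywords override every stored keyword
    "full_override": p(a=0, b=1, c=2, d=3, e=4, f=5, g=6, h=7, i=8, j=9),
}

# A call-time keyword replaces the stored value for the same name.
overridden = p(0, 1, 2, 3, 4, j=19)

print(results)
print(overridden)
```

All four benchmark shapes produce the same sum, so the timing differences come only from how the keywords are merged, not from different work in `func`.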

Issue: functools.partial does not re-set vector call. #119109

Modules/_functoolsmodule.c

rhettinger · 2024-09-26T17:38:49Z

Perhaps @vstinner has the time and interest in looking at this.

dg-pb · 2024-09-29T07:26:51Z

I think it is a good compromise between simplicity and performance now.

One micro-optimization that I couldn't figure out how to do simply is pre-storing the kwnames tuple so that it doesn't need to be created on every call. It would drop another ~50 ns.
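The kwnames tuple belongs to CPython's vectorcall convention, in which keyword values follow the positional values in one flat array and the keyword names travel in a separate tuple. The following is a simplified pure-Python model of that layout, not the C code from this PR; `to_vectorcall_layout` and `merge_partial_call` are hypothetical helper names chosen for illustration.

```python
def to_vectorcall_layout(args, kwargs):
    """Flatten a call into (stack, kwnames): positional values first,
    then keyword values, with the keyword names in a separate tuple."""
    kwnames = tuple(kwargs)
    stack = list(args) + [kwargs[name] for name in kwnames]
    return stack, kwnames


def merge_partial_call(pto_args, pto_kw, args, kwargs):
    """Model of a partial call: stored positionals come first, and
    call-time keywords override stored keywords of the same name."""
    merged = {**pto_kw, **kwargs}
    return to_vectorcall_layout(tuple(pto_args) + tuple(args), merged)


# No call-time keywords: kwnames depends only on the stored keywords.
stack, kwnames = merge_partial_call((), {"b": 2}, (1,), {})
print(stack, kwnames)

# Call-time keywords: one overrides "b", one is new.
stack2, kwnames2 = merge_partial_call((), {"b": 2}, (1,), {"b": 5, "c": 7})
print(stack2, kwnames2)
```

In this model, a call that passes no keywords of its own always yields the same kwnames tuple, which is presumably why storing it once looks attractive.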

Not sure how much sense it makes yet, but I posted faster-cpython/ideas#699 in relation to this.

Ready for review now.

dg-pb · 2024-10-17T01:52:13Z

Was wondering if it might be worth factoring out the macros for private use.

picnixz · 2025-01-05T09:18:34Z

Is the performance change a direct effect of the simplification or is this the other way around? (namely, can we decouple the performance gains from the simplification?) (not that the PR is too big but for bisecting commits, it's easier when we have atomic changes)

dg-pb · 2025-01-05T09:19:25Z

Your title change is incorrect; this has wider implications.

It removes dynamic switching, which causes issues with free threading, and it generally results in a more linear, straightforward flow, so it is a design improvement as well.

The fact that vectorcall will be used in more cases, and is faster, is only an added benefit. Although from a user's perspective the performance gain is the only thing worth mentioning.

dg-pb · 2025-01-05T09:21:39Z

> Is the performance change a direct effect of the simplification

Yes, it is one atomic change. There is no way to split it.

serhiy-storchaka · 2025-07-07T11:32:14Z

Modules/_functoolsmodule.c

+        }
+        Py_XDECREF(pto_kw_merged);
+
+        /* Resize Stack if the call has keywords */


Is it needed? The stack can be slightly overallocated, but is this a problem?

dg-pb · 2025-07-07

In theory this can grow to moderate size.

E.g.

len(pto_kwds) = 100
len(kwds) = 100
set(pto_kwds) == set(kwds)
len(stack) = 300
len(used_stack) = 100

Say all keys and values are 1-character strings. Then the size of the used objects is 100 * 2 * 42 = 8400 bytes, and the over-allocated memory is 200 * 8 = 1600 bytes.

So that is about 20% extra memory.

Of course, this is both an overestimate and an extreme case.

But still, maybe it is a good idea to eliminate 0.1% of extreme cases.
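The estimate above can be reproduced with a few lines of arithmetic. This sketch only restates the numbers from the comment; the variable names are illustrative, and the 42-byte string size and 8-byte slot size are taken from the comment as given rather than measured.

```python
# Figures from the comment above (not measured here).
n_kwds = 100          # keywords passed at call time
stack_slots = 300     # slots allocated up front
used_slots = 100      # slots still in use after merging
str_size = 42         # assumed bytes per 1-character str object
slot_size = 8         # assumed bytes per stack slot (one pointer)

# Each keyword contributes a key and a value, both 1-character strings.
used_bytes = n_kwds * 2 * str_size
overalloc_bytes = (stack_slots - used_slots) * slot_size
ratio = overalloc_bytes / used_bytes

print(used_bytes, overalloc_bytes, round(ratio, 3))
```

The ratio comes out slightly under one fifth, which matches the rounded figure quoted in the comment.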

github search:

/\bpartial\(/ - 900K files

/\bpartial\((.*=){7,}.*/ - 1K files
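As a rough check of what the second search pattern selects, the sketch below runs the same regular expression with Python's `re` module against synthetic one-line calls. `make_call` is a hypothetical helper that builds a `partial(...)` call with a given number of keyword arguments. The pattern counts `=` characters on a single line, so it is a crude proxy for "seven or more keywords" rather than an exact count.

```python
import re

# Same pattern as in the search above: at least seven '=' after "partial(".
pattern = re.compile(r"\bpartial\((.*=){7,}.*")


def make_call(n_keywords):
    """Build a one-line partial(...) call with n_keywords keyword arguments."""
    kwargs = ", ".join(f"k{i}={i}" for i in range(n_keywords))
    return f"p = partial(f, {kwargs})"


matches = {n: bool(pattern.search(make_call(n))) for n in (1, 6, 7, 10)}
print(matches)
```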

So maybe, instead of removing it completely, we could change if (nkwds && ...) to if (nkwds > 6 && ...).

This would eliminate the performance overhead for 99% of cases and would safeguard against unconventional usage.

Also, for say 1 keyword argument, reallocation can have a visible performance impact, but for 7 or more, it will be a negligible percentage of the total.

serhiy-storchaka · 2025-07-07

I am more worried about the code complexity than about performance and memory consumption. Although resizing the stack does take time.

If you want to keep it and make it conditional, use something like nkwds + n_merges > init_stack_size/2, so the stack will only be resized if this saves a significant amount of memory.

dg-pb · 2025-07-07

I am in favour of keeping this for now. Given that the whole block can be removed at any time without breaking anything, I think this is more like a "soft complexity". I will look for a somewhat better rule and add a comment.
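The two resize conditions proposed in this thread can be written side by side as plain predicates. This is only a transcription of the expressions quoted above into Python. The thread does not spell out the exact meaning of `n_merges` or `init_stack_size`, so both are treated as opaque inputs, and the function names are hypothetical.

```python
def resize_if_many_keywords(nkwds, threshold=6):
    """First proposal: only resize when the call has more than 6 keywords."""
    return nkwds > threshold


def resize_if_significant(nkwds, n_merges, init_stack_size):
    """Second proposal: only resize when nkwds + n_merges exceeds
    half of the initially allocated stack size."""
    return nkwds + n_merges > init_stack_size / 2


# A typical small call: neither rule triggers a resize.
small = (resize_if_many_keywords(1), resize_if_significant(1, 0, 10))

# The extreme example from the thread, assuming all 100 keywords are merged.
extreme = (resize_if_many_keywords(100), resize_if_significant(100, 100, 300))

print(small, extreme)
```

Under these assumptions both rules skip the resize for small calls and trigger it for the extreme case, so the choice between them is mostly about how the threshold scales with the allocated size.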

initial implementation

69ba0e9

dg-pb requested a review from rhettinger as a code owner September 26, 2024 08:40

bedevere-app bot added the awaiting review label Sep 26, 2024

bedevere-app bot mentioned this pull request Sep 26, 2024

functools.partial does not re-set vector call. #119109

Open

rruuaanng reviewed Sep 26, 2024

dg-pb added 3 commits September 26, 2024 15:29

V2

9a21b55

small fixes

f23021c

V3

d840ad7

rhettinger requested review from vstinner and removed request for rhettinger September 26, 2024 17:33

dg-pb marked this pull request as draft September 27, 2024 06:31

bedevere-app bot removed the awaiting review label Sep 27, 2024

V4

2dd7568

dg-pb marked this pull request as ready for review September 27, 2024 14:54

bedevere-app bot added the awaiting review label Sep 27, 2024

dg-pb added 2 commits September 27, 2024 18:04

fix compiler warnings

862097f

V5 stable

64c889b

add commented fix if merging after pythongh-124652

a7142d5

rruuaanng reviewed Oct 1, 2024

dg-pb and others added 5 commits October 4, 2024 16:36

error check

ba36d01

fix error check

acba269

minor macro edit

898a104

merge to main

10b9f3b

📜🤖 Added by blurb_it.

3647c25

dg-pb mentioned this pull request Oct 17, 2024

functools.partial placeholders #119127

Closed

small edits

f9e3fd4

dg-pb mentioned this pull request Dec 20, 2024

Race between partial_vectorcall_fallback and _PyVectorcall_FunctionInline under free-threading #128050

Open

picnixz changed the title ~~gh-119109: improve performance for functools.partial vectorcall with keywords~~ gh-119109: improve functools.partial vectorcall with keywords Jan 5, 2025

dg-pb added 4 commits January 5, 2025 14:49

small edits and fixes

acd9c56

macros removed, post-resizing instead

cc557e9

regain previous performance

e8fbaf8

small edit

1a8a56c

erlend-aasland reviewed Jan 6, 2025

labels removed

b3ff73d

erlend-aasland reviewed Jan 6, 2025

dg-pb added 2 commits January 6, 2025 15:05

comment edits

00ebb4b

minor fixes and improvements

4575b6c

This was referenced Jan 6, 2025

gh-124652: partialmethod simplifications #124788

Open

functools.partialmethod simplification #124652

Open

dg-pb added 4 commits January 8, 2025 17:21

small stack size doubled + small edits

25a91aa

removed commented fix when trailing placeholders allowed

e326fcf

moved declarations to more sensible place

85d658f

reorder declarations

cefa7d8

dg-pb mentioned this pull request May 8, 2025

gh-125028: Prohibit placeholders in partial keywords #126062

Merged

serhiy-storchaka self-requested a review May 8, 2025 07:58

dg-pb and others added 2 commits May 8, 2025 14:58

Merge remote-tracking branch 'upstream/main' into pythongh-119109-partial_vectorcall_kw

aa3a11f

Merge branch 'main' into pythongh-119109-partial_vectorcall_kw

8730ded

kumaraditya303 reviewed Jun 10, 2025

dg-pb and others added 2 commits June 10, 2025 17:00

null fix

adacecf

Merge branch 'main' into pythongh-119109-partial_vectorcall_kw

a09ce19

serhiy-storchaka reviewed Jul 7, 2025

dg-pb added 3 commits July 7, 2025 18:26

few more brushes based on ss review

192d261

comment

1f21e74

assertion fixes

3070c67

