-
-
Notifications
You must be signed in to change notification settings - Fork 8.3k
py/objcode: Implement PEP626 to add co_lines to code objects on settrace builds. #16989
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #16989 +/- ##
==========================================
+ Coverage 98.54% 98.56% +0.02%
==========================================
Files 169 169
Lines 21877 21945 +68
==========================================
+ Hits 21558 21631 +73
+ Misses 319 314 -5 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Code size report:
|
Related to #16797? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks like a nice addition!
I think we should then deprecate co_lnotab
following CPython.
And it would be good to add another test to this PR, to test the actual numeric values from co_lines()
, to make sure they match the expected bytecode offsets for a simple test function (that test would need a corresponding .py.exp
file because you can't use CPython to work out the right answer).
iter->bc = 0; | ||
iter->source_line = 1; | ||
iter->ci = self->rc->prelude.line_info; | ||
code_colines_next(MP_OBJ_FROM_PTR(iter)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to discard the first line-no entry, is that intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentional; this actually works around a bit of weirdness the test case helped me discover. For whatever reason, the first entry in this table always has a zero-length bytecode segment attributed to line 1.
This might be something to fix in the bytecode emitter, though that could also have knock-on effects for the other consumers of that table.
Skipping that first entry results in co_lines
behavior behavior that matches CPython 3.10, where it only enumerates the body lines of a function --- though either way that's a diff with CPython 3.12, where co_lines
includes the function's declaration line.
Hmm, actually, looking at the compilation result, it looks like |
There we are, having it return the new line_info value instead of an increment value for it manages to get it fully elided: https://godbolt.org/z/bEY6ffTK8 Though, that makes it much less clear what's going on --- and there's similar differences in the other contexts with how incrementing (Also, clang gets the exact same compilation result regardless of whether or which way I refactor it.) I'll leave the PR as-is unless we really feel like it's important to turn the screws as tight as possible just to win back 2 opcodes here. |
The Was initially concerned if there was some sort of overflow making |
Signed-off-by: Anson Mansfield <[email protected]>
Signed-off-by: Anson Mansfield <[email protected]>
Signed-off-by: Anson Mansfield <[email protected]>
Signed-off-by: Anson Mansfield <[email protected]>
Signed-off-by: Anson Mansfield <[email protected]>
Signed-off-by: Anson Mansfield <[email protected]>
5c31dc4
to
c27a6ad
Compare
I've been playing around with these tests; at the moment the test I added also covers the cases where a correct The current outputs from co_lines are still meaningful and correct in some sense, though --- is that summing behavior actually something this needs to implement? Or can these extra coverage cases just be dropped (or made into cpydiffs or something)? Also, should I add coverage for a case where the table needs to skip more than 2048 lines at once? (e.g. to overflow the 11-bit line number increment in the 2-byte format, and require a multi-entry encoding for it like |
Signed-off-by: Anson Mansfield <[email protected]>
c27a6ad
to
45623a4
Compare
Ok, in the most recent variant of the test I've tried, I'm now really confused, and suspect I may have found a bug with the emitter for the underlying line info tables. The function in question, which I've annotated with what seems to be the line-number encodings, is: def f(x, y, obj, obj2, obj3):
a = x + y # line 4: bc+4 line+4
b = x - y # line 5: bc+4 line+1
# line 6
# line 7
# line 8
# line 9
# line 10
# line 11
# line 12
# line 13
# line 14
# line 15
# line 16
# line 17
# line 18
# line 19
c = a * b # line 20: bc+4 line+15
obj.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z.fun() # line 21: bc+31 line+1; bc+27 line+0
# line 22
# line 23
# line 24: bc+0 line+3
# line 25
# line 26
# line 27: bc+0 line+3
# line 28
# line 29
obj2.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z.fun() # line 30: bc+31 line+3; bc+27 line+0
# line 31
# line 32
# line 33: bc+0 line+3
# line 34
# line 35
# line 36
# line 37
# line 38
# line 39
# line 40
# line 41
# line 42
# line 43
# line 44
# line 45
# line 46
# line 47
# line 48
# line 49
# line 50
# line 51
# line 52
# line 53
# line 54
# line 55
# line 56
# line 57
# line 58
# line 59
obj3.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z.fun() # line 60: bc+31 line+27; bc+27 line+0
return c # line 61: bc+2 line+1 I'd expect line 60 to get encoded as I've not dug in to see if this is related to the SyntaxError this causes in the unix minimal and zephyr ports, but it doesn't seem far-fetched that this might also be part of that. |
|
||
for start, end, line_no in f.__code__.co_lines(): | ||
print(f"line {line_no} start: {start}") | ||
print(f"line {line_no} end: {end}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some ports (eg unix minimal, zephyr) don't enable f-strings, so we try to avoid using them in tests. Instead use "...".format()
. Then the SyntaxError will go away on those CI build.s
It's correct. The line number encoding for
And that decodes to the following
|
I'm not sure what you mean by "sum together multiple successive table entries"?
No, I think it's fine without that case. |
Summary
This PR implements PEP 626 – Precise line numbers for debugging and other tools..
This adds one method,
co_lines
, that returns an iterator over(begin, end, line_no)
tuples of line numbers for applicable bytecode ranges.Testing
This PR includes automated tests of all of the currently-implemented code object attributes, including the existing attributes that previously had no test coverage.
The actual values in
co_lines
can't be compared directly to CPython's, but all of the invariants that the method is meant to observe are checked.In addition to the github workflow checks, I've also manually verified that all tests pass when run against a unix
MICROPY_PY_SYS_SETTRACE
build. (In particular, I've checked to make sure they're not skipped).Trade-offs and Alternatives
As in CPython, this is implemented as its own distinct iterator object. Potentially, this function could be functionally replicated by instead eagerly building a list of the tuples and returning an iterator over that list, at a slightly smaller code size; but the iterator approach allows for a tighter behavioral match with CPython (
fun_code_full.py:25
), and code size is not at nearly the same premium in settrace builds.