-
-
Notifications
You must be signed in to change notification settings - Fork 8.3k
py: Add PEP 750 template strings support #17557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
py: Add PEP 750 template strings support #17557
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #17557 +/- ##
==========================================
- Coverage 98.57% 98.53% -0.04%
==========================================
Files 169 172 +3
Lines 21968 22637 +669
==========================================
+ Hits 21654 22306 +652
- Misses 314 331 +17 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Code size report:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution! This will be nice to have for the webassembly port.
I didn't do a review yet, but there will need to be tests to get full coverage of the new code.
@dpgeorge Thank you for the feedback and for taking time to look at this! I'm glad this will be useful for the webassembly port. I'll add more tests to ensure full coverage when I find time. |
@koxudaxi slightly off topic - but I notice you'll be at EuroPython in Prague, as will I. We should look out for each other and have a coffee or lunch together! 🇪🇺 🐍 |
@ntoll Sounds good! See you in Prague. ☕ |
330b05f
to
7a1dc11
Compare
Implements template strings (t-strings) as specified in PEP 750. Includes Template and Interpolation objects, expression parser, string.templatelib module, tests and documentation. Enabled for Unix, Windows, and WebAssembly ports. Signed-off-by: Koudai Aono <[email protected]>
7a1dc11
to
50dd4d1
Compare
I tried really hard to make 100% coverage but some code is never called and I cannot cover it. Do you know how to fix this? |
|
||
A flag for `keys()`, `values()`, `items()` methods to specify that | ||
A flag for :meth:`btree.keys`, :meth:`btree.values`, :meth:`btree.items` methods to specify that | ||
scanning should be inclusive of the end key. | ||
|
||
.. data:: DESC | ||
|
||
A flag for `keys()`, `values()`, `items()` methods to specify that | ||
A flag for :meth:`btree.keys`, :meth:`btree.values`, :meth:`btree.items` methods to specify that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I faced an error when building with Sphinx because it conflicted with string.templatelib.Template.values
. To solve this problem, I renamed the values attribute in btree to be more explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like a useful change, then --- it's just not a change related to PEP-750 t-string support, so it needs to be a separate commit and separate PR.
Git cherry-pick
and rebase -i
are your friends!
@dpgeorge |
@@ -542,7 +542,18 @@ const byte mp_binary_op_method_name[MP_BINARY_OP_NUM_RUNTIME] = { | |||
}; | |||
|
|||
static mp_obj_t instance_binary_op(mp_binary_op_t op, mp_obj_t lhs_in, mp_obj_t rhs_in) { | |||
#if MICROPY_PY_TSTRINGS | |||
if (op == MP_BINARY_OP_ADD || op == MP_BINARY_OP_INPLACE_ADD) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole bit can be removed, it's not needed. There is already a check in template_binary_op()
.
@@ -822,6 +847,356 @@ static mp_obj_t extra_coverage(void) { | |||
MICROPY_STACK_CHECK == 0 || old_stack_limit == new_stack_limit); | |||
} | |||
|
|||
|
|||
#if MICROPY_PY_TSTRINGS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised all of this coverage code is needed here, in C.
You should be able to move most (if not all) of this to pure Python tests. Did you have trouble doing that?
mp_print_t repr_print; | ||
vstr_init_print(&repr_vstr, 16, &repr_print); | ||
mp_obj_print_helper(&repr_print, value, PRINT_REPR); | ||
conv_value = mp_obj_new_str_from_vstr(&repr_vstr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These 3 cases for r/s/a all look the same. Can you factor the code to reduce code duplication?
} | ||
dest[0] = mp_obj_new_tuple(interps->len, values); | ||
} else { | ||
mp_obj_t *values = m_new(mp_obj_t, interps->len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can simplify the code here and not have any temporary memory by the following:
mp_obj_tuple_t *tuple = MP_OBJ_TO_PTR(mp_obj_new_tuple(interps->len, NULL));
for (size_t i = 0; i < interps->len; i++) {
tuple->items[i] = interp->value;
}
dest[0] = MP_OBJ_FROM_PTR(tuple);
return MP_OBJ_FROM_PTR(result); | ||
} | ||
|
||
case MP_BINARY_OP_REVERSE_ADD: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just remove this case
, it's not needed.
est_bytes += est_seg_cnt * sizeof(mp_parse_node_t) * 2; // Assume some growth | ||
est_bytes += est_interp_cnt * sizeof(mp_parse_node_t) * 2; | ||
|
||
if (est_bytes > MICROPY_PY_TSTRING_MAX_BYTES) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this test is needed, what's wrong with the tstring growing large?
I've made a few comments, mostly regarding coverage. About the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR needs to be rebased into several distinct commits so that it can be properly reviewed. This is a very complicated set of changes that adds a lot of potential surface for errors, requiring careful scrutiny --- and the way it mixes concerns does not inspire confidence.
To make this easier to review, please split your commit into multiple commits. If you still have a non-squashed head you can submit instead, that'll probably helpful; but otherwise you're going to need to learn to love git cherry-pick
and git rebase -i
.
Several of the changes seem to be fixes to string parsing generally, beyond just t-strings. It's not surprising that you might discover latent string-parsing bugs while implementing something like this, but those fixes need to be split into their own commit, and need their own matching tests.
Several of the changes are tooling changes: the MICROPY_STACK_CHECK_MARGIN
and related changes that seem to be meant to accommodate running micropython with a sanitizer. Sanitizers are definitely helpful on features like this, and if those are changes that allow some sanitizer that previously didn't work on micropython at all to be used, that's potentially quite useful and should be submitted as its own PR. And if they're specific to bypassing errors introduced by this feature, this needs more discussion here.
I'd also appreciate separate commits for test code vs feature code, so that it's more straightforward to verify test sensitivity --- i.e. that the tests that are supposed to fail without the feature actually do fail without it. Doing it with separate commits makes it easy to just checkout the test commit on its own and run the suite from there.
All of the tests have .exp
files associated with them --- it's understandable since this is a syntax feature not present in most of the CPython versions used for testing, either.
Having those as part of the same commit makes it less convenient for testing, though --- please separate out the .exp
files that don't represent actual micropython-specific behaviors and put them in a separate commit from their tests.
vstr_add_byte(&lex->vstr, '{'); | ||
next_char(lex); | ||
} else { | ||
} else if (is_fstring) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be redundant with the while (is_fstring
condition.
|
||
Note: When building with sanitizers (ASan/UBSan), an 8 KiB stack margin is | ||
automatically reserved unless the port overrides MICROPY_STACK_CHECK_MARGIN. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation change not related to PEP-750.
extern mp_obj_t mp_builtin___template__(mp_obj_t strings, mp_obj_t interpolations); | ||
extern const mp_obj_fun_builtin_fixed_t mp_builtin___template___obj; | ||
extern const mp_obj_type_t mp_type_template; | ||
extern const mp_obj_type_t mp_type_interpolation; | ||
extern mp_obj_t mp_obj_new_interpolation(mp_obj_t value, mp_obj_t expression, mp_obj_t conversion, mp_obj_t format_spec); | ||
|
||
typedef struct _mp_obj_template_t { | ||
mp_obj_base_t base; | ||
mp_obj_t strings; | ||
mp_obj_t interpolations; | ||
} mp_obj_template_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you need all of these symbols just to test this, these symbols probably shouldn't be private.
// Reserve extra C-stack headroom for overflow checks. | ||
// Sanitizer builds enlarge call frames; 8 KiB prevents | ||
// false positives without noticeably shrinking usable stack. | ||
#define MICROPY_STACK_CHECK_MARGIN (8192) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a tooling change that's unrelated to PEP-750.
Though if I'm wrong about that, this needs more than just this comment to justify the claim that the sanitizer errors this bypasses are actually "false" positives.
#if MICROPY_PY_FSTRINGS | ||
static bool is_char_or4(mp_lexer_t *lex, byte c1, byte c2, byte c3, byte c4) { | ||
return lex->chr0 == c1 || lex->chr0 == c2 || lex->chr0 == c3 || lex->chr0 == c4; | ||
} | ||
#endif | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this up next to is_char_or
and is_char_or3
.
Also, this function doesn't need a macro guard -- it already has static
linkage, so the compiler can already elide it if it's not used anywhere else in the file.
typedef char _tstr_assert1[(TSTR_HDR_INT_SHIFT + 12 <= 32) ? 1 : -1]; | ||
typedef char _tstr_assert2[(TSTR_MAX_SEG <= 0xFFF) ? 1 : -1]; | ||
typedef char _tstr_assert3[(TSTR_MAX_INT <= 0xFFF) ? 1 : -1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting idiom, I wasn't aware you could do static asserts like that!
But you should use MP_STATIC_ASSERT
(from misc.h
) instead.
|
||
#if MICROPY_PY_TSTRINGS | ||
Q(_templatelib) | ||
Q(__t) | ||
Q(Template) | ||
Q(Interpolation) | ||
Q(strings) | ||
Q(interpolations) | ||
Q(value) | ||
Q(expression) | ||
Q(conversion) | ||
Q(format_spec) | ||
Q(string_dot_templatelib) | ||
Q(string) | ||
Q(templatelib) | ||
Q(values) | ||
Q({) | ||
Q(}) | ||
Q(__template__) | ||
Q(__lt_tstring_expr_gt_) | ||
Q(__iter__) | ||
Q(template_iterator) | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unnecessary, these should all already be picked up automatically from any MP_QSTR_*
symbols, during the qstr preprocessing build step. (If you're getting IDE errors it's because you haven't set your include path to include the generated includes folder in the build output directory.)
#if MICROPY_PY_SYS_SETTRACE | ||
// Initialize profiling state | ||
ts->prof_callback_is_executing = false; | ||
ts->prof_trace_callback = MP_OBJ_NULL; | ||
ts->current_code_state = NULL; | ||
#endif | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a tooling change unrelated to PEP-750.
# Use larger stack size for desktop platforms to accommodate sanitizers | ||
if sys.platform in ("linux", "darwin", "emscripten", "win32"): | ||
sz = 32 * 1024 | ||
else: | ||
sz = 2 * 1024 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a tooling change unrelated to PEP-750.
// Ensure we have at least 3 characters before accessing | ||
if (lex->fstring_args.len >= 3) { | ||
lex->chr0 = lex->fstring_args.buf[0]; | ||
lex->chr1 = lex->fstring_args.buf[1]; | ||
lex->chr2 = lex->fstring_args.buf[2]; | ||
} else { | ||
// This should not happen with well-formed f/t-strings | ||
lex->chr0 = lex->fstring_args.len > 0 ? (unichar)lex->fstring_args.buf[0] : MP_LEXER_EOF; | ||
lex->chr1 = lex->fstring_args.len > 1 ? (unichar)lex->fstring_args.buf[1] : MP_LEXER_EOF; | ||
lex->chr2 = lex->fstring_args.len > 2 ? (unichar)lex->fstring_args.buf[2] : MP_LEXER_EOF; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be part of a separate commit, and needs a non-tstring test.
Summary
Implements PEP 750 template strings for MicroPython.
Started in discussion #17497. Template strings (t-strings) are new in Python 3.14 - they return Template objects instead of strings, so you can access the literal parts and expressions separately.
Changes:
Usage:
Testing
Tested on unix port with both MICROPY_PY_TSTRINGS enabled and disabled.
Test coverage:
All existing tests continue to pass. New tests added in
tests/basics/string_template*.py
.Ports tested: Unix (other ports need testing)
Trade-offs and Alternatives
Adds ~240 bytes when enabled on unix port. When disabled, no impact.
Worth it because:
Config: Enabled by default when MICROPY_CONFIG_ROM_LEVEL >= EXTRA_FEATURES.
Needs MICROPY_PY_FSTRINGS=1.
To disable:
make CFLAGS_EXTRA=-DMICROPY_PY_TSTRINGS=0