py/formatfloat: Improve accuracy of float formatting code. #17444
base: master
Conversation
Code size report:
Codecov report: all modified and coverable lines are covered by tests ✅

```
@@           Coverage Diff           @@
##           master   #17444   +/-   ##
=======================================
  Coverage   98.56%   98.56%
=======================================
  Files         169      169
  Lines       21946    21948       +2
=======================================
+ Hits        21632    21634       +2
  Misses        314      314
```
(force-pushed from 6efb905 to ef61c67)
Additional results after a few more experiments in double precision:
Thanks for this. I will study it in detail. I also found the corresponding CPython issue and long discussion about this topic: python/cpython#45921. There's a lot of useful information there. It looks like this is a hard thing to get right.
I read that thread, it is indeed an interesting input. But I am afraid that we will not be able to use much of it due to resource constraints in MicroPython.
Out of curiosity, I have been investigating a bit more... Half of these remaining errors are caused by … About 20% of the remaining errors are caused by large negative exponents (exp < -255). There is probably something we could improve there as well, but I am not sure it is a big concern for MicroPython if such seldom-used numbers have accuracy problems in …

I am also looking at cleaning up the code a bit, to reduce its footprint. While doing this, one thing that I noticed is that the function includes lots of checks on …
Yes, that makes a lot of sense.

Related: #6024 has been around for a while and attempts to improve the other side of this, i.e. parsing of floats, using higher-precision ints.

Thanks, I will look at it as well. I have a big update to submit for this PR, which I believe makes things look much better.
(force-pushed from fb265b8 to 9652e45)
I have just force-pushed my new code.

```python
import array, math, time, binascii

seed = 42

def pseudo_randouble():
    global seed
    ddef = []
    for _ in range(8):
        ddef.append(seed & 0xFF)
        seed = binascii.crc32(b'\0', seed)
    arr = array.array('d', bytes(ddef))
    return ddef, arr[0]

# The largest errors come from seldom used very small numbers, near the
# limit of the representation. So we keep them out of this test to keep
# the max relative error display useful.
if float('1e-100') == 0:
    # single-precision
    min_expo = -96  # i.e. not smaller than 1.0e-29
    # Expected results:
    # HIGH_QUALITY_REPR=1: 99.71% exact conversions, relative error < 1e-7
    # HIGH_QUALITY_REPR=0: 94.89% exact conversions, relative error < 1e-6
else:
    # double-precision
    min_expo = -845  # i.e. not smaller than 1.0e-254
    # Expected results:
    # HIGH_QUALITY_REPR=1: 99.83% exact conversions, relative error < 2.7e-16
    # HIGH_QUALITY_REPR=0: 64.01% exact conversions, relative error < 1.1e-15

ttime = 0
stats = 0
N = 10000000
max_err = 0
for _ in range(N):
    (ddef, f) = pseudo_randouble()
    while math.isinf(f) or math.isnan(f) or math.frexp(f)[1] <= min_expo:
        (ddef, f) = pseudo_randouble()
    start = time.time_ns()
    str_f = repr(f)
    ttime += time.time_ns() - start
    f2 = float(str_f)
    if f2 == f:
        stats += 1
    else:
        error = abs(f2 - f) / abs(f)
        if max_err < error:
            max_err = error
            print("{:.19e}: repr='{:s}' err={:.4e}".format(f, str_f, error))
print("{:d} values converted in {:d} [ms]".format(N, int(ttime / 1000000)))
print("{:.2%} exact conversions, max relative error={:.2e}".format(stats / N, max_err))
```

It is similar to the one Damien posted before, but this version tests specifically the … This new code brings the error rate to a really low level, which should be acceptable for MicroPython's intended use.
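A note for readers on the exponent filter used in the loop above: math.frexp splits a float into a normalized mantissa in [0.5, 1) and a binary exponent, so comparing frexp(f)[1] against min_expo discards values whose magnitude is below roughly 2**min_expo. A minimal illustration using only the standard math module:

```python
import math

# frexp(x) returns (m, e) with x == m * 2**e and 0.5 <= abs(m) < 1
print(math.frexp(8.0))   # (0.5, 4), since 8 == 0.5 * 2**4
print(math.frexp(0.75))  # (0.75, 0)

# With min_expo = -96 (single precision), a value is skipped when its
# binary exponent is <= -96, i.e. when its magnitude is below 2**-96.
print(math.frexp(1e-28)[1] <= -96)  # False: 1e-28 passes the filter
```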
(force-pushed from aba60a3 to adcadc2)
(force-pushed from 779f0e9 to a7c4061)
The unix / nanbox variant originally crashed at the end of the accuracy test, due to the randomly generated float numbers that create invalid nanbox objects. I have therefore fixed the nanbox code in …
(force-pushed from 1e547ba to 1647452)
py/mpprint.h (outdated)

```diff
-#define PF_FLAG_SEP_POS (9) // must be above all the above PF_FLAGs
+#define PF_FLAG_USE_OPTIMAL_PREC (0x200)
+#define PF_FLAG_ALWAYS_DECIMAL (0x400)
+#define PF_FLAG_SEP_POS (16) // must be above all the above PF_FLAGs
```
Over in the PR that added PF_FLAG_SEP_POS, we discussed the position of this flag. There is a port with 16-bit ints. It was possible to position PF_FLAG_SEP_POS at 9 because there were enough bits in an unsigned 16-bit value to squeeze in `_` and `,` as shifted values, without widening the type of the flags argument.

It's my mistake that I didn't comment on the requirement here, nor write a check that would trigger during CI. There is an assertion about the situation in objstr.c, but because the pic16bit port is not built during CI (non-free toolchain), the assertion is only checked on platforms with 32-bit ints.
Good point. I saw your (unrelated) PR on mpprint.h yesterday and was wondering about this.

I have just updated the code to move these new flags above the SEP bits. As they are strictly related to the use of floats, which are obviously not enabled in the pic16 port, this should be safe.
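The bit-packing constraint behind this discussion can be checked with quick arithmetic: the separator character itself ('_' or ',') is stored in the flags word shifted left by PF_FLAG_SEP_POS, so the chosen position determines the minimum width of the flags type. A small illustrative Python check (not part of the PR, flag names taken from the diff above):

```python
# '_' (0x5F) and ',' (0x2C) are stored in the flags word shifted left
# by PF_FLAG_SEP_POS; at position 9 they still fit in an unsigned
# 16-bit value, at position 16 they no longer do.
for sep_pos in (9, 16):
    for ch in "_,":
        shifted = ord(ch) << sep_pos
        print("sep {!r} at pos {}: 0x{:X} (fits in 16 bits: {})".format(
            ch, sep_pos, shifted, shifted <= 0xFFFF))
```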
(force-pushed from 535a56a to 153cbb5)
While fixing …
Heads up: I found a trick to get 100% good conversions, significantly faster than CPython, and the code appears to be smaller than my previous version. Working on it...
(force-pushed from 314df34 to 84a001f)
Runtimes with the code above: STM32 (pyboard V1.1) 0.76s, …

So only CC3200 and SAMD21 would be above the 10s limit. That's OK, we might not have to take care of them.

All single float tests fail by exceeding the error limits at …
(force-pushed from 742fb2c to e4f3f8d)
I will test that during the day. Just a quick comparison for the two ends of the range, running your recent accuracy test code against the code I used: MIMXRT 1.4s (your code) vs 0.37s, …

Note that even with your code the ESP8266 needs the changed test, so that the value returned by pseudo_randouble() is a float.
tests/float/float_format_accuracy.py (outdated)

```python
max_err = 0
for _ in range(N):
    f = pseudo_randouble()
    while math.isinf(f) or math.isnan(f) or math.frexp(f)[1] <= min_expo:
```
ESP8266 needs this line to be changed to:

```python
while type(f) is not float or math.isinf(f) or math.isnan(f) or math.frexp(f)[1] <= min_expo:
```
Sorry, I did not realize the problem would not be specific to randbits. I have now force-pushed a new version reusing many of your ideas, including using randbits. I have limited the number of tests to 1500, so it should run on all platforms I guess. Would you give it a try?
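For readers unfamiliar with the technique under discussion, generating random bit patterns with randbits and reinterpreting them as floats can be sketched as follows. This is an illustrative CPython version using struct; the function name pseudo_randfloat mirrors the test's, but the body is only a guess at the general idea, not the PR's actual code:

```python
import math
import random
import struct

random.seed(42)  # deterministic, so failures are reproducible

def pseudo_randfloat():
    # Draw 64 random bits and reinterpret them as an IEEE-754 double.
    bits = random.getrandbits(64)
    return struct.unpack("<d", bits.to_bytes(8, "little"))[0]

# Skip bit patterns that decode to inf/nan before using the value.
f = pseudo_randfloat()
while math.isinf(f) or math.isnan(f):
    f = pseudo_randfloat()
print(repr(f))
```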
(force-pushed from 32a6477 to 98fe95a)
There is not much difference between rnd_buf[] being local or global. The local version is slightly faster, since it does not need a symbol lookup, so for code clarity the local version seems better.

Interestingly, the results for ESP8266 and SAMD21, RP2, STM32 or ESP32 at N=1200 are different. Except for SAMD21, you could go back to N=2000.

SAMD21 at N=1200: …
ESP8266 at N=1200: …
ESP8266 at N=2000: …
RP2 at N=1200: …
RP2 at N=2000: …
On a side note: at single precision the function pseudo_randfloat() is called 2301 times, at double precision it's 2171.

So they all pass fine at N=2000 except for the ESP8266, where the error stays too high. I wonder what's the difference using the same float code in MP. Obviously the compiler & C libs are different.
The problem has the same root cause as the reason why you had to add …

I am looking at fixing the first issue as I did for REPR_D, but I am afraid that this would increase code size significantly given this is an inline function. So maybe we should leave your test in the test case, and I should submit this as a different PR to make the impact on code size more obvious. Regarding the second issue, I guess I can detect REPR_C by the fact that some floats don't exist, and adjust the expectations accordingly.
Yes, that's possible.
(force-pushed from 98fe95a to 86f9af4)
Thanks for the hint. So I have added the exact same test in …

Regarding the problem of hand-crafted REPR_C floats getting interpreted as objects: …
tests/float/float_format_accuracy.py (outdated)

```python
# Deterministic pseudorandom generator. Designed to be uniform
# on mantissa values and exponents, not on the represented number
def pseudo_randfloat():
    global rnd_seed, rnd_buff
```
If at all, this line should be `global float_size`.
Besides that, the test passes for all boards.
Sure :-)
Thank you VERY much for all the time you have spent helping me to improve this PR, this was really helpful.

I have pushed the new version without the useless global (and moved the math.nan fix on REPR_C out, as it was not strictly needed for this PR). No need to rerun the tests for now, nothing else has changed. I will be sending a separate PR for fixing math.nan and crafted nan-floats on REPR_C and REPR_D.

Please move to a different PR, this one is getting big :)
I need to think about this.

I have pushed the code with the new REPR_C tolerance. I think this should now work on all platforms. To keep the test duration short, I have left the number of values to test set to 1200. As demonstrated with REPR_C, 1200 steps appear to be sufficient to detect any platform-specific discrepancy.
(force-pushed from 86f9af4 to 11a8e87)
@dpgeorge I have submitted a PR with the trivial fix for …

I have intentionally kept the REPR_C crafted-floats issue separate, as more time will be needed to find the most efficient implementation. The test code is already available here: edf5cab
(force-pushed from 11a8e87 to 86fbed2)
I have rebased the PR to master and removed nanbox-related changes, now that they have been integrated in master.
py/mpprint.c (outdated)

```c
@@ -353,6 +353,14 @@ int mp_print_float(const mp_print_t *print, mp_float_t f, char fmt, unsigned int

    char *s = buf;

    if ((flags & PF_FLAG_ALWAYS_DECIMAL) && strchr(buf, '.') == NULL && strchr(buf, 'e') == NULL && strchr(buf, 'n') == NULL) {
        if ((size_t)(len + 2) < sizeof(buf)) {
```
Is it possible to remove this buffer size check by allocating 2 more bytes to buf, and then passing sizeof(buf) - 2 to mp_format_float() above, thereby guaranteeing there will always be enough space to add these characters?

Same comment for the PF_FLAG_ADD_PERCENT check below, so add 3 bytes in total to buf and actually pass sizeof(buf) - 3 to mp_format_float().
Excellent idea, I have changed that.
py/mpconfig.h (outdated)

```c
// Float conversion to/from string implementations
#define MICROPY_FLTCONV_IMPL_BASIC (0)  // smallest code, but inexact
#define MICROPY_FLTCONV_IMPL_APPROX (1) // slightly bigger, almost perfect
#define MICROPY_FLTCONV_IMPL_EXACT (2)  // bigger code, but 100% exact repr
```
I would use "and 100% exact repr" (instead of "but") because it's a positive thing that it's an exact representation.
py/mpconfig.h (outdated)

```c
#endif

#ifndef MICROPY_FLTCONV_IMPL
#define MICROPY_FLTCONV_IMPL (MICROPY_FLOAT_DEFAULT_FLTCONV_IMPL)
```
Is this MICROPY_FLOAT_DEFAULT_FLTCONV_IMPL needed? Why not just wrap the above bit of code in #ifndef MICROPY_FLTCONV_IMPL and then define MICROPY_FLTCONV_IMPL directly?

Also, maybe call it MICROPY_FLOAT_FORMAT_IMPL?
OK, done
This is looking very good now! My testing on PYBD-SF2 (single precision) and PYBD-SF6 (double precision) shows that everything is good, all tests pass. And around +250 bytes code size is pretty decent for the benefit provided by the approx algorithm. Definitely …
Following discussions in PR micropython#16666, this commit updates the float formatting code to improve the `repr` reversibility, i.e. the percentage of valid floating point numbers that do parse back to the same number when formatted by `repr`.

This new code offers a choice of 3 float conversion methods, depending on the desired tradeoff between code size and conversion precision:
- BASIC method has the smallest code footprint.
- APPROX method uses an iterative method to approximate the exact representation, which is a bit slower but does not have a big impact on code size. It provides `repr` reversibility in >99.8% of the cases in double precision, and in >98.5% in single precision.
- EXACT method uses higher-precision floats during conversion, which provides the best results but has a higher impact on code size. It is faster than the APPROX method, and faster than the equivalent CPython implementation. It is however not available on all compilers when using FLOAT_IMPL_DOUBLE.

Here is the table comparing the impact of the three conversion methods on code footprint on PYBV10 (using single-precision floats) and reversibility rate for both single-precision and double-precision floats. The table includes the current situation as a baseline for the comparison:

```
          PYBV10    FLOAT     DOUBLE
current = 364136    27.57%    37.90%
basic   = 364188    91.01%    62.18%
approx  = 364396    98.50%    99.84%
exact   = 365608   100.00%   100.00%
```

Signed-off-by: Yoctopuce dev <[email protected]>
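The round-trip criterion behind the APPROX method can be illustrated at Python level with the classic increasing-precision loop: keep trying shorter decimal forms until the string parses back to the exact same float. This is only a conceptual sketch of the idea; the PR's actual implementation is C code inside the float formatter and uses a different iteration:

```python
def shortest_roundtrip(f, max_digits=17):
    # Try increasingly long decimal forms until the string parses back
    # to the exact same float; 17 significant digits always suffice
    # for an IEEE-754 double.
    for prec in range(1, max_digits + 1):
        s = "{:.{}g}".format(f, prec)
        if float(s) == f:
            return s
    return "{:.{}g}".format(f, max_digits)

print(shortest_roundtrip(0.1))  # -> 0.1, not 0.1000000000000000055...
print(float(shortest_roundtrip(2 / 3)) == 2 / 3)  # True: reversible
```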
(force-pushed from 86fbed2 to 7034fad)
The default settings in …
Summary
Following discussions in PR #16666, this pull request updates the float formatting code to reduce the `repr` reversibility error, i.e. the percentage of valid floating point numbers that do not parse back to the same number when formatted by `repr`. The baseline before this commit is an error rate of ~46%, when using double-precision floats.
This new code is available in two flavors, based on a preprocessor definition:
Testing
The new formatting code was tested for reversibility using the code provided by Damien in PR #16666. A variant using formats `{:.7g}`, `{:.8g}` and `{:.9g}` was used for single-precision testing.

Compatibility with CPython on the various float formats was tested by comparing the output using the following code:
The integration tests have also found some corner cases in the new code which have been fixed.
For single-precision floats, some test cases had to be adapted:
- float_format_ints is tapping into an ill-defined partial digit of the mantissa (the 10th), which is not available in single-precision floats with the new code due to integer limitations. So the display range has been updated accordingly.
- float_struct_e uses a 15-digit representation which is meaningless on single-precision floats. A separate version for double-precision has been made instead.
- In float_format_ints, there is one test case specific to single-precision floats which verifies that the largest possible mantissa value 16777215 can be used to store that exact number and retrieve it as-is. Unfortunately, the rounding in the simplest version of the new algorithm makes it display as a slightly different number. This would cause the CI test to fail on single-precision floats when the improved algorithm is not enabled.
It is unclear at this point if the simplest version of this improvement is worth the change:

The full version of the enhancement makes much more difference in terms of precision, both for double-precision and single-precision floats, but it causes about 20% overhead in conversion time, and makes the code a bit bigger.
Looking forward to reading your feedback...
Edit 1: See #17444 (comment) for updates on accuracy results
Edit 2: Updated values in #17444 (comment)