Skip to content

extmod/modlwip.c: Fix crash when calling recv on listening socket #17333

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 26, 2025

Conversation

andrewleech
Copy link
Contributor

@andrewleech andrewleech commented May 21, 2025

Summary

Fix an issue where calling recv() on a listening socket after accept() would cause a hard fault crash on
MicroPython hardware. This happens when user code incorrectly tries to receive data on the listening socket
instead of the accepted connection.

With this fix, MicroPython now properly raises an OSError with errno=107 (ENOTCONN) similar to CPython,
instead of crashing. This makes the behavior consistent with standard socket implementations and improves error
handling.

The bug is demonstrated by the following code pattern:

import socket
s = socket.socket(socket.AF_INET)
s.bind(('0.0.0.0', 8882))
s.listen()
s.accept()   # connection not saved
s.recv(3)  # incorrectly calling recv on listening socket

Once data is sent to this socket from

import socket
s = socket.socket(socket.AF_INET)
s.connect(('127.0.0.1', 8882))
s.send(b'abc')

The first device crashes with a hard fault.

Closes #16697 (the root cause of the wrap_socket() crash is calling recv on a listening socket.)

Testing

Added a new test file tests/multi_net/tcp_accept_recv_listen.py that demonstrates the fix by intentionally
triggering the error condition and checking for the appropriate exception.

Testing was performed against a pyboard-d (WIFI):

./run-multitests.py -i pyb:/dev/ttyACM0 -i cpython multi_net/tcp_accept_recv.py

The test confirms that a proper OSError is raised with the correct errno value (107/ENOTCONN) instead of
crashing.


While this bug was found / fixed manually the unit test and documentation was created with AI support from Claude Code.

Copy link

github-actions bot commented May 21, 2025

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
     mimxrt:    +0 +0.000% TEENSY40
        rp2:    +0 +0.000% RPI_PICO_W
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:    +0 +0.000% VIRT_RV32

@andrewleech andrewleech force-pushed the socket_listed_recv branch 2 times, most recently from 583b17e to 210f5c9 Compare May 21, 2025 02:22
@@ -0,0 +1,39 @@
# Test recv on listening socket after accept without saving connection
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm missing something subtle, this test looks identical to the existing tcp_accept_recv.py test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, I referenced that unit test when adding the one here but for some reason thought it was performing recv on the correct accepted socket rather than directly on the original socket.

I just ran the test between cpython and a rpi pico_2W and it doesn't hard fault reset, but nor does it raise an exception, it never receives anything and hangs.

But then I test again with pyb-d on current master and it passes multi_net/tcp_accept_recv.py but reliably crashes on multi_net/tcp_accept_recv_listen.py

Turns out yeah it's rather subtle

     s.bind(socket.getaddrinfo("0.0.0.0", PORT)[0][-1])
-    s.listen()
+    s.listen(1)
     multitest.next()

With 1 it all passes. Without (or with 2 etc...) it crashes - aka if the original socket is still listening for new incomming connections it crashes.

Copy link
Contributor Author

@andrewleech andrewleech May 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now consolidating the two tests, modding the original to expose the bug and re-tested before and after with pyb-d

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean you combined the two tests into the original? Did you forget to push the changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry yes, turns out git had complained about me missing the -f on the push

@dpgeorge dpgeorge added the extmod Relates to extmod/ directory in source label May 21, 2025
@andrewleech andrewleech force-pushed the socket_listed_recv branch from 210f5c9 to bdb6296 Compare May 21, 2025 03:03
Copy link

codecov bot commented May 21, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.57%. Comparing base (e57aa7e) to head (dd7888a).

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #17333   +/-   ##
=======================================
  Coverage   98.57%   98.57%           
=======================================
  Files         169      169           
  Lines       21968    21968           
=======================================
  Hits        21654    21654           
  Misses        314      314           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@andrewleech andrewleech force-pushed the socket_listed_recv branch from bdb6296 to 22471c6 Compare May 21, 2025 04:00
@@ -11,13 +11,14 @@ def instance0():
s = socket.socket()
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(socket.getaddrinfo("0.0.0.0", PORT)[0][-1])
s.listen(1)
s.listen()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might change the original test with 1 as an argument.

Maybe it's better to do a loop here to perform both tests (with no argument, and with an argument of 1, and then maybe some other values like 0 for good measure).

Also need the client to do a loop as well, and use multitest.wait() and multitest.broadcast() to sync with the server.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've got an updated looping test just pushed that passes on the pyb-d with this patch applied.

@andrewleech andrewleech force-pushed the socket_listed_recv branch from 22471c6 to baaf54f Compare May 21, 2025 05:41
@projectgus projectgus self-requested a review May 21, 2025 23:43
Copy link
Contributor

@projectgus projectgus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM! I ran the new unit test between PYBD_SF2 (fix applies) and ESP32_GENERIC_S3 (fix doesn't apply) and noted no issues.

I believe this also fixes #16697 - I can't reproduce the crash with test code from the issue any more, and it makes sense that this would be the root cause. So that's good news! EDIT: have added this to the description so that it creates the "close on merge" linkage.

@dpgeorge dpgeorge added this to the release-1.26.0 milestone Jun 13, 2025
@dpgeorge
Copy link
Member

I've run this updated test on a PYBD_SF6 and RPI_PICO2_W, and it passes without the fix from this PR.

Not sure what to make of that... @andrewleech can you please rerun the test here without the fix and see if it also passes for you?

@andrewleech
Copy link
Contributor Author

I believe this also fixes #16697

Ah I missed that one, yes I do believe it should be essentially the same bug. The uninitialised pbuf was the same problem I found when C debugging this hard fault.

@andrewleech
Copy link
Contributor Author

I've run this updated test on a PYBD_SF6 and RPI_PICO2_W, and it passes without the fix from this PR.

That's a bit strange, I'll try to test again to ensure the test is hitting the actual problem.

@andrewleech
Copy link
Contributor Author

./run-multitests.py -i pyb:/dev/ttyACM0 -i cpython multi_net/tcp_accept_recv_listen.py

-t was needed for me to figure out what was going one; yes it was "passing" in that it's not really running:

anl@STEP:~/micropython2/tests$ ./run-multitests.py -i pyb:/dev/ttyACM0 -i cpython -t multi_net/tcp_accept_recv.py
multi_net/tcp_accept_recv.py on ttyACM0|python3:
TRACE ttyACM0|python3:
   166 i0 : SET IP = '192.168.0.65'
   167 i0 : Test case 1/4: listen()
   167 i0 : BROADCAST server_ready_0
TRACE python3|python3:
   103 i0 : SET IP = '192.168.0.188'
   103 i0 : Test case 1/4: listen()
   103 i0 : BROADCAST server_ready_0
pass
1 tests performed
1 tests passed

There was a important next missing in the test.
Once that was fixed (pushed here now) running the test from this PR without code patch applied results in

anl@STEP:~/micropython2/tests$ ./run-multitests.py -i pyb:/dev/ttyACM0 -i cpython -t multi_net/tcp_accept_recv.py
multi_net/tcp_accept_recv.py on ttyACM0|python3:
TRACE ttyACM0|python3:
   164 i0 : SET IP = '192.168.0.65'
   165 i0 : NEXT
   267 i1 : NEXT
   267 i0 : Test case 1/4: listen()
   269 i0 : BROADCAST server_ready_0
Traceback (most recent call last):
  File "/home/anl/micropython2/tests/./run-multitests.py", line 632, in main
    all_pass &= run_tests(test_files, instances_truth, instances_test_permutation)
  File "/home/anl/micropython2/tests/./run-multitests.py", line 503, in run_tests
    error, skip, output_test, output_metrics = run_test_on_instances(
  File "/home/anl/micropython2/tests/./run-multitests.py", line 391, in run_test_on_instances
    out, err = instance.readline()
  File "/home/anl/micropython2/tests/./run-multitests.py", line 278, in readline
    if self.pyb.serial.inWaiting() == 0:
  File "/home/anl/.local/lib/python3.10/site-packages/serial/serialutil.py", line 594, in inWaiting
    return self.in_waiting
  File "/home/anl/.local/lib/python3.10/site-packages/serial/serialposix.py", line 549, in in_waiting
    s = fcntl.ioctl(self.fd, TIOCINQ, TIOCM_zero_str)
OSError: [Errno 5] Input/output error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/anl/.local/lib/python3.10/site-packages/serial/serialposix.py", line 621, in write
    n = os.write(self.fd, d)
OSError: [Errno 5] Input/output error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/anl/micropython2/tests/./run-multitests.py", line 645, in <module>
    main()
  File "/home/anl/micropython2/tests/./run-multitests.py", line 638, in main
    i.close()
  File "/home/anl/micropython2/tests/./run-multitests.py", line 254, in close
    self.pyb.exit_raw_repl()
  File "/home/anl/micropython2/tests/../tools/pyboard.py", line 387, in exit_raw_repl
    self.serial.write(b"\r\x02")  # ctrl-B: enter friendly REPL
  File "/home/anl/.local/lib/python3.10/site-packages/serial/serialposix.py", line 655, in write
    raise SerialException('write failed: {}'.format(e))
serial.serialutil.SerialException: write failed: [Errno 5] Input/output error

and the pyboard-d resets / hardfaults.

With the patch applied I get:

anl@STEP:~/micropython2/tests$ ./run-multitests.py -i pyb:/dev/ttyACM0 -i cpython -t multi_net/tcp_accept_recv.py
multi_net/tcp_accept_recv.py on ttyACM0|python3:
TRACE ttyACM0|python3:
   166 i0 : SET IP = '192.168.0.65'
   166 i0 : NEXT
   269 i1 : NEXT
   269 i0 : Test case 1/4: listen()
   270 i0 : BROADCAST server_ready_0
   677 i0 : True
   678 i0 : BROADCAST server_done_0
   680 i0 : Test case 2/4: listen(0)
   681 i0 : BROADCAST server_ready_1
   885 i0 : True
   886 i0 : BROADCAST server_done_1
   888 i0 : Test case 3/4: listen(1)
   889 i0 : BROADCAST server_ready_2
  1093 i0 : True
  1094 i0 : BROADCAST server_done_2
  1095 i0 : Test case 4/4: listen(2)
  1097 i0 : BROADCAST server_ready_3
  1300 i0 : True
  1302 i0 : BROADCAST server_done_3
  1303 i0 : NEXT
TRACE python3|python3:
   103 i0 : SET IP = '192.168.0.188'
   103 i0 : NEXT
   205 i1 : NEXT
   205 i0 : Test case 1/4: listen()
   206 i0 : BROADCAST server_ready_0
   310 i0 : True
   311 i0 : BROADCAST server_done_0
   312 i0 : Test case 2/4: listen(0)
   313 i0 : BROADCAST server_ready_1
   314 i0 : True
   315 i0 : BROADCAST server_done_1
   317 i0 : Test case 3/4: listen(1)
   318 i0 : BROADCAST server_ready_2
   319 i0 : True
   320 i0 : BROADCAST server_done_2
   423 i0 : Test case 4/4: listen(2)
   425 i0 : BROADCAST server_ready_3
   426 i0 : True
   427 i0 : BROADCAST server_done_3
   428 i0 : NEXT
pass
1 tests performed
1 tests passed

@projectgus
Copy link
Contributor

Oops, good catch @dpgeorge . Thanks @andrewleech for finding the root cause.

Copy link
Member

@dpgeorge dpgeorge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this works now. Tested without the fix and the new test crashes PYBD_SF6 and RPI_PICO2_W. With the fix in this PR the test passes.

@dpgeorge dpgeorge force-pushed the socket_listed_recv branch from a8bba8c to cf751cf Compare June 23, 2025 14:49
@dpgeorge
Copy link
Member

I was just about to merge this when I realised that it's not handling the non-blocking case correctly.

The new check added in this PR should go at the top of lwip_tcp_receive(), before checking for socket->incoming.tcp.pbuf == NULL. Otherwise in the case of a non-blocking socket it'll raise EAGAIN instead of ENOTCONN (unix port of MicroPython and CPython both raise ENOTCONN in non-blocking mode).

I think the non-blocking version of the test should also be added to the test, as an additional combination. So, it'll have a loop of 8 different tests (4 in blocking mode, 4 in non-blocking mode). Add s.setblocking(False) right after the s.accept() line.

(@andrewleech sorry but I force pushed to your branch, so be sure to pull --rebase)

@andrewleech
Copy link
Contributor Author

Great catch @dpgeorge I've updated as suggested and re-tested:

corona@NUG:~/mpy/socket_listed_recv/tests$ ./run-multitests.py -t -i pyb:/dev/usb/tty-Pyboard_Virtual_Comm_Port_in_FS_Mode-3254335D3037 -i cpython multi_net/tcp_accept_recv.py

TRACE tty-Pyboard_Virtual_Comm_Port_in_FS_Mode-3254335D3037|python3:
   181 i0 : SET IP = '192.168.0.65'
   181 i0 : NEXT
   284 i1 : NEXT
   285 i0 : Test case 1/8: listen() blocking=True
   287 i0 : BROADCAST server_ready_1
   694 i0 : True
   696 i0 : BROADCAST server_done_1
   698 i0 : Test case 2/8: listen(0) blocking=True
   699 i0 : BROADCAST server_ready_2
   903 i0 : True
   905 i0 : BROADCAST server_done_2
   908 i0 : Test case 3/8: listen(1) blocking=True
   910 i0 : BROADCAST server_ready_3
  1114 i0 : True
  1116 i0 : BROADCAST server_done_3
  1118 i0 : Test case 4/8: listen(2) blocking=True
  1120 i0 : BROADCAST server_ready_4
  1325 i0 : True
  1326 i0 : BROADCAST server_done_4
  1329 i0 : Test case 5/8: listen() blocking=False
  1331 i0 : BROADCAST server_ready_5
  1535 i0 : True
  1537 i0 : BROADCAST server_done_5
  1539 i0 : Test case 6/8: listen(0) blocking=False
  1541 i0 : BROADCAST server_ready_6
  1644 i0 : True
  1646 i0 : BROADCAST server_done_6
  1648 i0 : Test case 7/8: listen(1) blocking=False
  1650 i0 : BROADCAST server_ready_7
  1855 i0 : True
  1857 i0 : BROADCAST server_done_7
  1859 i0 : Test case 8/8: listen(2) blocking=False
  1861 i0 : BROADCAST server_ready_8
  2065 i0 : True
  2067 i0 : BROADCAST server_done_8
TRACE python3|python3:
   103 i0 : SET IP = '192.168.0.197'
   103 i0 : NEXT
   206 i1 : NEXT
   206 i0 : Test case 1/8: listen() blocking=True
   207 i0 : BROADCAST server_ready_1
   311 i0 : True
   312 i0 : BROADCAST server_done_1
   313 i0 : Test case 2/8: listen(0) blocking=True
   314 i0 : BROADCAST server_ready_2
   316 i0 : True
   317 i0 : BROADCAST server_done_2
   318 i0 : Test case 3/8: listen(1) blocking=True
   319 i0 : BROADCAST server_ready_3
   320 i0 : True
   322 i0 : BROADCAST server_done_3
   323 i0 : Test case 4/8: listen(2) blocking=True
   324 i0 : BROADCAST server_ready_4
   325 i0 : True
   326 i0 : BROADCAST server_done_4
   328 i0 : Test case 5/8: listen() blocking=False
   329 i0 : BROADCAST server_ready_5
   330 i0 : True
   331 i0 : BROADCAST server_done_5
   333 i0 : Test case 6/8: listen(0) blocking=False
   334 i0 : BROADCAST server_ready_6
   335 i0 : True
   336 i0 : BROADCAST server_done_6
   337 i0 : Test case 7/8: listen(1) blocking=False
   339 i0 : BROADCAST server_ready_7
   340 i0 : True
   341 i0 : BROADCAST server_done_7
   342 i0 : Test case 8/8: listen(2) blocking=False
   343 i0 : BROADCAST server_ready_8
   345 i0 : True
   346 i0 : BROADCAST server_done_8
pass
1 tests performed
1 tests passed

Add check to prevent calling recv on a socket in the listening state.  This
prevents a crash/hard fault when user code mistakenly tries to recv on the
listening socket instead of on the accepted connection.

Add corresponding test case to demonstrate the bug.

Signed-off-by: Andrew Leech <[email protected]>
@dpgeorge dpgeorge force-pushed the socket_listed_recv branch from dd7888a to 2f04381 Compare June 26, 2025 02:48
Copy link
Member

@dpgeorge dpgeorge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I retested on PYBD_SF6, with the test here, both before and after the fix in this PR. Test fails without the fix and passes with the fix.

@dpgeorge dpgeorge merged commit 2f04381 into micropython:master Jun 26, 2025
65 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extmod Relates to extmod/ directory in source
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Calling wrap_socket on a newly created non-SSL socket causes a hard-fault.
4 participants