multiprocessing.Process.is_alive() can incorrectly return True after join() #130895

Open
@colesbury

Bug report

Bug description:

This came up in #130849 (comment)

The problem is that popen_fork.Popen (and its subclasses popen_spawn_posix.Popen and popen_forkserver.Popen) is not thread-safe:

def poll(self, flag=os.WNOHANG):
    if self.returncode is None:
        try:
            pid, sts = os.waitpid(self.pid, flag)
        except OSError:
            # Child process not yet created. See #1731717
            # e.errno == errno.ECHILD == 10
            return None
        if pid == self.pid:
            self.returncode = os.waitstatus_to_exitcode(sts)
    return self.returncode

The first successful call to os.waitpid() may reap the pid, so that subsequent calls raise an OSError, which poll() catches and reports as None. I've only seen this on macOS (not Linux). The reaping thread may not yet have set self.returncode, since that happens a few statements later, so poll() can return None when both of the following hold:

  1. The process has finished
  2. Another thread called poll(), but hasn't yet set self.returncode
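
The window described above can be sketched deterministically without a real child process. This is only an illustration: `FakePopen`, `_fake_waitpid`, `reaped_event`, `resume_event` and `demonstrate_race` are hypothetical names that do not exist in multiprocessing, and the pause between reaping and storing the return code is inserted artificially to hold the window open.

```python
import errno
import threading


class FakePopen:
    """Hypothetical stand-in for popen_fork.Popen; no real process is involved."""

    def __init__(self, pid=1234):
        self.pid = pid
        self.returncode = None
        self._reaped = False
        self._kernel_lock = threading.Lock()    # protects only the fake "kernel" state
        self.reaped_event = threading.Event()   # set once the fake child is reaped
        self.resume_event = threading.Event()   # lets the reaping thread continue

    def _fake_waitpid(self):
        # Assumed model of os.waitpid(): the first caller reaps the child,
        # every later caller gets ECHILD.
        with self._kernel_lock:
            if self._reaped:
                raise ChildProcessError(errno.ECHILD, "No child processes")
            self._reaped = True
        return self.pid, 0

    def poll(self):
        # Same shape as the poll() quoted in this report.
        if self.returncode is None:
            try:
                pid, sts = self._fake_waitpid()
            except OSError:
                return None
            # Artificial pause: the child is reaped, returncode is not yet set.
            self.reaped_event.set()
            self.resume_event.wait()
            if pid == self.pid:
                self.returncode = sts  # the real code uses os.waitstatus_to_exitcode(sts)
        return self.returncode


def demonstrate_race():
    popen = FakePopen()
    reaper = threading.Thread(target=popen.poll)
    reaper.start()
    popen.reaped_event.wait()      # the other thread has reaped the fake child
    during = popen.poll()          # called inside the window
    popen.resume_event.set()
    reaper.join()
    after = popen.poll()           # called once returncode has been stored
    return during, after
```

In this sketch, `demonstrate_race()` returns `None` for the call made inside the window and the exit status for the call made afterwards, which mirrors the two conditions listed above.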

And then is_alive() can return True:

def is_alive(self):
    '''
    Return whether process is alive
    '''
    self._check_closed()
    if self is _current_process:
        return True
    assert self._parent_pid == os.getpid(), 'can only test a child process'
    if self._popen is None:
        return False
    returncode = self._popen.poll()
    if returncode is None:
        return True
    else:
        _children.discard(self)
        return False

Note that some classes like concurrent.futures.ProcessPoolExecutor use threads internally, so the user may not even know that threads are involved.
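
Because callers may not control which threads touch the Popen object, one possible direction is to serialize the check, reap and store sequence inside poll() itself. The following is a minimal sketch of that idea, not necessarily the fix CPython will adopt. `LockedFakePopen`, `_poll_lock`, `_fake_waitpid` and `poll_from_many_threads` are hypothetical names, and the fake waitpid is an assumed model in which only the first call succeeds.

```python
import errno
import threading


class LockedFakePopen:
    """Hypothetical stand-in for popen_fork.Popen with a lock around poll()."""

    def __init__(self, pid=1234):
        self.pid = pid
        self.returncode = None
        self._reaped = False
        self._poll_lock = threading.Lock()  # assumption: not part of the real class
        self.waitpid_calls = 0

    def _fake_waitpid(self):
        # Only ever called with _poll_lock held, so no extra locking is needed here.
        self.waitpid_calls += 1
        if self._reaped:
            raise ChildProcessError(errno.ECHILD, "No child processes")
        self._reaped = True
        return self.pid, 0

    def poll(self):
        # Reaping and storing returncode happen atomically with respect to
        # other pollers, so no thread can observe the intermediate state.
        with self._poll_lock:
            if self.returncode is None:
                try:
                    pid, sts = self._fake_waitpid()
                except OSError:
                    return None
                if pid == self.pid:
                    self.returncode = sts
            return self.returncode


def poll_from_many_threads(n_threads=8):
    popen = LockedFakePopen()
    results = []

    def worker():
        results.append(popen.poll())

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, popen.waitpid_calls
```

With the lock in place, only one thread reaches the fake waitpid and every poller sees the stored return code. A real fix would have to handle the blocking case separately: holding a lock across a blocking os.waitpid() call, as used by join() without a timeout, would make is_alive() block until the child exits.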

Repro:

repro.py
import os
import multiprocessing as mp
import threading
import time
import sys

original_excepthook = threading.excepthook

def on_except(args):
    original_excepthook(args)
    os._exit(1)

threading.excepthook = on_except

def p1():
    pass

def thread1(p):
    # Poll is_alive() from a second thread to race with p.join() in test()
    while p.is_alive():
        time.sleep(0.00001)

def test():
    for i in range(1000):
        print(i)
        p = mp.Process(target=p1)
        p.start()

        t = threading.Thread(target=thread1, args=(p,))
        t.start()

        p.join()
        assert not p.is_alive()

        t.join()

def main():
    threads = [threading.Thread(target=test) for _ in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

if __name__ == "__main__":
    main()

NOTE:

  • This is unrelated to free threading
  • popen_fork.Popen (and subclasses) are distinct from subprocess.Popen

CPython versions tested on:

CPython main branch

Operating systems tested on:

macOS
