Bug #22183
Fix deprecated fork in FUSE tests (open)
Description
Running the FUSE tests with Python 3.13 emits a deprecation warning:
tests/test_cache.py::CacheTest::test_cache_spill
/usr/local/lib/python3.13/multiprocessing/popen_fork.py:67: DeprecationWarning: This process (pid=583455) is multi-threaded, use of fork() may lead to deadlocks in the child.
self.pid = os.fork()
Updated by Tom Clegg 2 months ago
· Edited
I couldn't get Python 3.13 to work at all (changed version in inventory.yml, ran ansible, and the resulting Python couldn't make a virtualenv).
tom@curve:~/arvados (main)$ WORKSPACE=`pwd` ./build/run-tests.sh --interactive
...
Error: Command '['/tmp/tmp.dYbfnCu5dV/VENV3DIR/bin/python3', '-m', 'ensurepip', '--upgrade', '--default-pip']' returned non-zero exit status 1.
Fatal: virtualenv creation failed (encountered in setup_virtualenv at ./build/run-tests.sh line 364)
tom@curve:~/arvados (main)$ python3 -m venv /tmp/venv
Error: Command '['/tmp/venv/bin/python3', '-m', 'ensurepip', '--upgrade', '--default-pip']' returned non-zero exit status 1.
tom@curve:~/arvados (main)$ /tmp/venv/bin/python3 -m ensurepip --upgrade --default-pip
Traceback (most recent call last):
File "<frozen runpy>", line 189, in _run_module_as_main
File "<frozen runpy>", line 148, in _get_module_details
File "<frozen runpy>", line 112, in _get_module_details
File "/usr/lib/python3.13/ensurepip/__init__.py", line 2, in <module>
import subprocess
File "/usr/lib/python3.13/subprocess.py", line 106, in <module>
from _posixsubprocess import fork_exec as _fork_exec
ModuleNotFoundError: No module named '_posixsubprocess'
I tried fixing the deprecation warning (and, supposedly, risk of deadlock) like this:
diff --git a/services/fuse/tests/integration_test.py b/services/fuse/tests/integration_test.py
index 24ac7baf04..dd02b89d44 100644
--- a/services/fuse/tests/integration_test.py
+++ b/services/fuse/tests/integration_test.py
@@ -20,6 +20,12 @@ import pytest
from . import run_test_server
+# Avoid deadlock risk (and deprecation warning in Python >= 3.12) by
+# using spawn instead of fork. See
+# https://docs.python.org/3.13/library/os.html#os.fork
+if __name__ == '__main__':
+    multiprocessing.set_start_method('spawn')
+
@atexit.register
def _pool_cleanup():
if _pool is None:
Python 3.10 doesn't emit the deprecation warning, so I can't tell whether the patch makes it go away.
However, the patch does seem to cause different problems.
======= test services/fuse
========================================== test session starts ==========================================
platform linux -- Python 3.10.19, pytest-9.0.2, pluggy-1.6.0
rootdir: /home/tom/arvados/services/fuse
configfile: pytest.ini
testpaths: tests
plugins: cwltest-2.6.20251216093331
collected 166 items / 165 deselected / 1 selected
tests/test_cache.py Sent SIGTERM to 387490 (/home/tom/arvados/tmp/keep0.pid)
Sent SIGTERM to 387509 (/home/tom/arvados/tmp/keep1.pid)
!!!!!! _pytest.outcomes.Exit: llfuse thread outlived test - aborting test suite to avoid deadlock !!!!!!!
================================== 165 deselected in 63.37s (0:01:03) ===================================
***** services/fuse tests exited with code 2 -- retrying *****
========================================== test session starts ==========================================
platform linux -- Python 3.10.19, pytest-9.0.2, pluggy-1.6.0
rootdir: /home/tom/arvados/services/fuse
configfile: pytest.ini
testpaths: tests
plugins: cwltest-2.6.20251216093331
collected 166 items / 165 deselected / 1 selected
tests/test_cache.py Sent SIGTERM to 387723 (/home/tom/arvados/tmp/keep0.pid)
Sent SIGTERM to 387742 (/home/tom/arvados/tmp/keep1.pid)
!!!!!! _pytest.outcomes.Exit: llfuse thread outlived test - aborting test suite to avoid deadlock !!!!!!!
================================== 165 deselected in 63.37s (0:01:03) ===================================
***** services/fuse tests exited with code 2 -- retrying *****
========================================== test session starts ==========================================
platform linux -- Python 3.10.19, pytest-9.0.2, pluggy-1.6.0
rootdir: /home/tom/arvados/services/fuse
configfile: pytest.ini
testpaths: tests
plugins: cwltest-2.6.20251216093331
collected 166 items / 165 deselected / 1 selected
tests/test_cache.py Sent SIGTERM to 387877 (/home/tom/arvados/tmp/keep0.pid)
Sent SIGTERM to 387895 (/home/tom/arvados/tmp/keep1.pid)
!!!!!! _pytest.outcomes.Exit: llfuse thread outlived test - aborting test suite to avoid deadlock !!!!!!!
================================== 165 deselected in 63.24s (0:01:03) ===================================
======= services/fuse tests -- FAILED
======= test services/fuse -- 191s
Pass: services/fuse install (4s)
Failures (1):
Fail: services/fuse tests (191s)
Running that single test case seems to deadlock/fail this way about 50% of the time with the 'spawn' patch in place. If I revert the patch, it passes consistently.
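For comparison, the start method can also be chosen per pool instead of globally. multiprocessing.set_start_method() changes the default for the whole process and raises RuntimeError if the method has already been set, whereas multiprocessing.get_context('spawn') returns a context object whose Pool and Process use spawn without touching the default. The following is only a sketch under assumptions: spawn_context and make_pool are hypothetical names, not code from integration_test.py, and it is untested against the FUSE test suite.

```python
import multiprocessing


def spawn_context():
    """Return a context whose children start via 'spawn'.

    Unlike set_start_method(), this leaves the process-wide default alone.
    """
    return multiprocessing.get_context("spawn")


def make_pool(processes=2):
    # Hypothetical helper: build a worker pool from the spawn context.
    return spawn_context().Pool(processes=processes)


if __name__ == "__main__":
    # With spawn, anything that starts processes at import time must sit
    # behind this guard, and worker callables must be picklable.
    with make_pool() as pool:
        print(pool.map(abs, [-1, 2, -3]))
```

With spawn, each child starts a fresh interpreter, so any state the workers relied on inheriting through fork would have to be passed explicitly.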
Updated by Brett Smith 2 months ago
Tom Clegg wrote in #note-1:
I couldn't get Python 3.13 to work at all (changed version in inventory.yml, ran ansible, and the resulting Python couldn't make a virtualenv).
Did you build Python 3.13 from source? (i.e., did you not set arvados_dev_from_pkgs: true?) The backtrace makes it look like it's picking up modules from the Debian install. I wonder if things are getting confused because a Python built from source is picking up modules built for Debian.
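One way to check this would be to ask the interpreter where it finds its standard library and the missing extension module. The following is a minimal sketch using only sys, sysconfig, and importlib; describe_interpreter is a hypothetical name, and whether it runs cleanly under the broken interpreter is an assumption.

```python
import importlib.util
import sys
import sysconfig


def describe_interpreter():
    """Collect the paths that show which installation this Python is using."""
    # find_spec returns None when the module cannot be located at all.
    spec = importlib.util.find_spec("_posixsubprocess")
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        "stdlib": sysconfig.get_path("stdlib"),
        "_posixsubprocess": spec.origin if spec else None,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the executable lives under one prefix while stdlib points at /usr/lib/python3.13, that would be consistent with a source build picking up the Debian modules.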
Updated by Tom Clegg 2 months ago
Brett Smith wrote in #note-2:
Did you build Python 3.13 from source? (i.e., did you not set arvados_dev_from_pkgs: true?) The backtrace makes it look like it's picking up modules from the Debian install. I wonder if things are getting confused because a Python built from source is picking up modules built for Debian.
I did not set arvados_dev_from_pkgs: true, and I remember the Ansible logs indicating that Python 3.13 was being built from source.