Feature #23006
closedPort CWL integration tests to pytest
Description
source:sdk/cwl/test_with_arvbox.sh runs a few different test suites, which you can see in the big if branch near the bottom:
- It runs the CWL conformance tests for each of CWL 1.0, 1.1, and 1.2.
- It runs its own integration tests at source:sdk/cwl/tests/arvados_tests.sh which stages some collections, then runs a mix of command line tests, Python tests, and cwltest with our own custom tests.
These are all run by the Jenkins job arvados-cwl-conformance-tests, which has one suite per branch.
For background, we also have run-tests-cwl-suite, which runs the CWL 1.2 conformance tests against tordo; and run-tests-arvados-cwl, which similarly runs arvados_tests.sh against tordo.
Convert all of these tests to pytest such that they run on the current system just like "regular" tests. It is okay for the tests to just run the cwltest or arvados-cwl-runner command and check the exit status as the current tests do, but porting them to pytest reduces the amount of custom scaffolding we have and will help us build better tests in the future.
The following pieces will be helpful:
- A fixture to get the conformance tests by spinning a temporary worktree from a named ref out of
$WORKSPACE.- See
build/run-library.sh:get_ci_scriptsfor an example in shell. Discuss specifics with Brett about how the CWL repositories are set up in Jenkins.
- See
- A session fixture that writes an "entry-point" type executable script that just runs the version of a-c-r under test with the given arguments. It yields its path, then removes itself. This script can be called directly (for the first test in
arvados_tests.sh) or given tocwltestas the runner. - A (session?) fixture that ensures each of the collections at the top of
arvados_tests.shexists: it tries to get the collection by portable data hash; if not found, it uploads the corresponding data and checks the resulting hash; then yields the collection record. - Custom marks for conformance and integration tests to make it easier to select or exclude these during test runs.
It's okay for these fixtures to just live under sdk/cwl/tests for now. Try to be mindful of the fact we might want to move them under sdk/python/tests in the future, but actually making that move can be part of a separate story when we have a better sense of what we're doing.
This story should remove test_with_arvbox.sh because all the tests can now be run normally in Jenkins with run-tests.sh.
Updated by Brett Smith 9 months ago
- Related to Idea #22939: Standardize `arvados-server boot` for running from source added
Updated by Brett Smith 9 months ago
- Has duplicate Bug #22934: run cwl conformance tests using 'arvados-server boot' instead of arvbox added
Updated by Brett Smith 9 months ago
For extra clarity: I believe the ability to run multiple containers simultaneously is not actually required. arvbox currently uses crunch-dispatch-local, just like arvados-server boot does, so there should be no issue making this migration. The integration tests run a-c-r in --local mode (not --submit) which avoids the situation of "a-c-r is the running container so the requests it creates never get dispatched."
Noting 2025-07-24: This comment was very wrong, see later discussion.
Updated by Brett Smith 8 months ago
- Target version set to Development 2025-07-23
- Assigned To set to Brett Smith
- Status changed from New to In Progress
23006-cwl-integration-pytests @ 692bac527e4c5bddd9a8aff5c93bb70aea3c2dee
developer-run-tests: #4832 - The test failure is #23060 which we're also seeing in current main.
arvados-cwl-conformance-tests: #1997
This branch is built on top of the #21389 branch. It's not a super critical dependency, I could write a version without it, but the branch does use it and this avoids a future merge conflict against 36f65eebb69130520f2468572f2246144b9723bf.
The tests were run against 4072200ace567d68a9e49b862d47331f610fe241. The history is: I wrote this branch; in doing so I realized I could do #21389 better; so I did and now #23006 is based on that. There are no CWL code differences between the two versions, just different history to get there. You can confirm this yourself by diffing them. Sorry, I know this is complicated, and I would just rerun the tests, except #23037 is making that difficult right now.
This branch arranges to run all the a-c-r conformance and integration tests under pytest. This has a few benefits.
Test dependencies become fixtures and explicit dependencies of tests. You can just ask to run a test and pytest will figure out what needs to be instantiated and do that. You don't have to run the tests through a bespoke script.
We also get all the test selection and reporting interface improvements that come with pytest. You can say things like fail on first test, run failed tests first, etc.
With this branch, you can run the tests against a real cluster, assuming you have credentials in place, by running the following:
build/run-tests.sh --temp YOUR_TMPDIRinstall tools/crunchstat-summaryinstall sdk/cwl
. YOUR_TMPDIR/VENV3DIR/bin/activatecd sdk/cwlpytest [test selection arguments]
Note that this will make a bunch of cruft in your home project. Some tests automatically use a temporary project. Using that more consistently could be a follow-up improvement. For this branch I stuck very closely to porting the existing code, partly to make sure I wasn't screwing up the tests and partly to avoid scope creep.
A future branch can make it so you can run these tests directly from run-tests.sh. That requires work to add crunch-dispatch-local to the test cluster.
- All agreed upon points are implemented / addressed. Describe changes from pre-implementation design.
- This branch got big enough that it seemed to make sense to stop and do a review here. This branch is "just" the port of the tests to pytest, and the fact that it continues to work in arvbox shows that I haven't broken anything there. A follow-up branch can add the necessary support to
run-tests.shand actually make the switch.
- This branch got big enough that it seemed to make sense to stop and do a review here. This branch is "just" the port of the tests to pytest, and the fact that it continues to work in arvbox shows that I haven't broken anything there. A follow-up branch can add the necessary support to
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- I will make follow-up stories assuming we're agreed this is a good scope.
- Code is tested and passing, both automated and manual, what manual testing was done is described.
- See above
- Tested code incorporates recent main branch changes.
- Yes
- New or changed UI/UX and has gotten feedback from stakeholders.
- No change yet. Ultimately this is heading to a place where we have fewer one-off scripts and do more through the standard interfaces of
run-tests.sh/pytest, and I think everyone's agreed that's a win.
- No change yet. Ultimately this is heading to a place where we have fewer one-off scripts and do more through the standard interfaces of
- Documentation has been updated.
- I don't think there's anything to update right now. It makes more sense to do once you can run the tests through
run-tests.sh.
- I don't think there's anything to update right now. It makes more sense to do once you can run the tests through
- Behaves appropriately at the intended scale (describe intended scale).
- I don't notice any significant difference in runtime. The tests themselves have enough overhead that swamp exactly how we invoke them.
- Considered backwards and forwards compatibility issues between client and server.
- N/A
- Follows our coding standards and GUI style guidelines.
- Yes
Updated by Brett Smith 8 months ago
- Target version changed from Development 2025-07-23 to Development 2025-08-06
Updated by Tom Clegg 8 months ago
Brett Smith wrote in #note-7:
For extra clarity: I believe the ability to run multiple containers simultaneously is not actually required. arvbox currently uses crunch-dispatch-local, just like
arvados-server bootdoes, so there should be no issue making this migration. The integration tests run a-c-r in--localmode (not--submit) which avoids the situation of "a-c-r is the running container so the requests it creates never get dispatched."
arvados-server boot doesn't use crunch-dispatch-local. It uses dispatch-cloud with the loopback driver, which doesn't run multiple containers at once.
Updated by Brett Smith 8 months ago
Tom Clegg wrote in #note-12:
arvados-server bootdoesn't use crunch-dispatch-local. It uses dispatch-cloud with the loopback driver, which doesn't run multiple containers at once.
Yeah basically everything I said was wrong. In addition to your point, the a-c-r tests do not use --local. Instead they're relying on the fact that crunch-dispatch-local does, in fact, dispatch multiple containers.
Because of this, I think the path forward from here is:
- Teach
run_test_server.pyto start c-d-l, because that's a relatively small change that will let us remove arvbox and that's a huge net win for maintenance. - Teach the loopback driver to dispatch multiple containers. That's useful for both testing and single-node installs.
- Replace
run_test_server.pywitharvados-server boot.
Updated by Brett Smith 8 months ago
- Related to Feature #23075: Run CWL integration tests from run-tests.sh added
Updated by Brett Smith 8 months ago
- Status changed from In Progress to Resolved