Feature #23091
Closed
Unify SLURM (SbatchArgumentsList) and LSF (BsubArgumentsList) configuration style
Description
Background
Currently the LSF dispatcher configuration uses a template approach:
# Template variables starting with % will be substituted as follows:
#
# %U uuid
# %C number of VCPUs
# %M memory in MB
# %T tmp in MB
# %G number of GPU devices (runtime_constraints.gpu.device_count)
[...]
BsubArgumentsList: ["-o", "/tmp/crunch-run.%%J.out", "-e", "/tmp/crunch-run.%%J.err", "-J", "%U", "-n", "%C", "-D", "%MMB", "-R", "rusage[mem=%MMB:tmp=%TMB] span[hosts=1]", "-R", "select[mem>=%MMB]", "-R", "select[tmp>=%TMB]", "-R", "select[ncpus>=%C]", "-We", "%W"]
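The substitution described above can be sketched roughly as follows. This is an illustration in Python, not the dispatcher's actual code, which this ticket does not show. The function name `expand_args`, the sample values, and the choice to reject unknown sequences are all assumptions. It also assumes that `%%` stands for a literal `%`, which would explain why the default writes `%%J` where bsub expects its own `%J` job ID placeholder.

```python
import re


def expand_args(template_args, values):
    """Expand %X sequences in each argument using the values mapping.

    Hypothetical helper: "%%" becomes a literal "%", and any other
    sequence must have an entry in values or a ValueError is raised.
    """
    def substitute(match):
        key = match.group(1)
        if key == "%":
            return "%"
        if key not in values:
            raise ValueError(f"unknown template sequence %{key}")
        return str(values[key])

    return [re.sub(r"%(.)", substitute, arg) for arg in template_args]


# Sample values are made up for illustration.
example = expand_args(
    ["-J", "%U", "-n", "%C", "-R", "rusage[mem=%MMB:tmp=%TMB]",
     "-o", "/tmp/crunch-run.%%J.out"],
    {"U": "example-uuid", "C": 2, "M": 4096, "T": 8192},
)
print(example)
```

Note that `%MMB` expands to the memory value followed by the literal text `MB`, since only the single character after `%` is treated as the variable name in this sketch.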
The SLURM dispatcher just accepts literal command line arguments, with no template capability. The arguments used to set job name, CPU count, memory, etc., are hard-coded.
SbatchArgumentsList: []
Our documentation mentions:
Note: If an argument is supplied multiple times, slurm uses the value of the last occurrence of the argument on the command line. Arguments specified through Arvados are added after the arguments listed in SbatchArguments. This means, for example, an Arvados container that specifies partitions in scheduling_parameter will override an occurrence of --partition in SbatchArguments. As a result, for container parameters that can be specified through Arvados, SbatchArguments can be used to specify defaults but not enforce specific policy.
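To make the ordering issue concrete, here is a small Python model of the "last occurrence wins" rule from the note above. It is only a sketch of the behavior the note describes, not of sbatch's real option parser. The function name `effective_options` and the sample argument values are hypothetical.

```python
def effective_options(args):
    """Return the winning value for each --name=value option.

    Models the rule quoted above: when an option appears more than
    once, the last occurrence on the command line takes effect.
    """
    winners = {}
    for arg in args:
        if arg.startswith("--") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            winners[name] = value
    return winners


# Hypothetical values: an admin default and a container's own request.
admin_args = ["--partition=policy", "--mem=1000"]
container_args = ["--partition=custom"]

# Arvados-generated arguments are appended after the configured ones,
# so the container's partition overrides the admin's.
command_line = admin_args + container_args
print(effective_options(command_line))
```

Because the configured arguments always come first, an admin setting such as `--partition=policy` acts only as a default here, which is the limitation the proposal below addresses.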
Proposal
Update the SLURM dispatcher to behave similarly to the LSF dispatcher: use template variables and make the default SbatchArgumentsList something like
["--mem=%M", "--cpus-per-task=%C", "--tmp=%T", "--gpus=%G", "--no-requeue"]
A note about nice: The dispatcher updates nice after queueing the job to maintain relative priority ordering among arvados-submitted slurm jobs. Because of this, we will not provide any template support for setting nice. The configuration will document this limitation with this explanation.
Write an upgrade note explaining that anyone who has already configured SbatchArgumentsList must add the previously-default arguments to their configuration. Unless it's utterly trivial, do not bother detecting and reporting the old format, or figuring out some fancy auto-detected migration.
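As an illustration of the manual step the upgrade note asks for, the sketch below appends the previously-default arguments to an existing custom list. It is not a migration tool shipped with Arvados. The name `migrate_sbatch_args` is hypothetical, and `PROPOSED_DEFAULTS` is copied from the "something like" list in this proposal, so the defaults in the released configuration may differ.

```python
# Copied from the proposal text above; an assumption, not the released default.
PROPOSED_DEFAULTS = [
    "--mem=%M", "--cpus-per-task=%C", "--tmp=%T", "--gpus=%G", "--no-requeue",
]


def migrate_sbatch_args(old_args):
    """Return old_args followed by any proposed default not already present.

    Appending keeps the old ordering, where Arvados-generated arguments
    came after the administrator's own arguments.
    """
    return list(old_args) + [a for a in PROPOSED_DEFAULTS if a not in old_args]


# Hypothetical pre-upgrade configuration with one custom argument.
print(migrate_sbatch_args(["--partition=compute"]))
```

An administrator who wants to enforce policy rather than set a default could instead place their own argument after the templated ones, which the old hard-coded ordering did not allow.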
Updated by Tom Clegg 7 months ago
- Related to Feature #23076: arvados-dispatch-slurm supports GPU requirements added
Updated by Brett Smith 7 months ago
- Related to Feature #23110: Add SLURM.SbatchGPUArguments configuration added
Updated by Brett Smith 7 months ago
- Target version set to Development 2025-09-03
- Assigned To set to Tom Clegg
- Description updated (diff)
Updated by Tom Clegg 7 months ago
23091-sbatch-template-args @ bb493db09c0470294316ee94cceaa011234c5d89 -- developer-run-tests: #4859
Based on unmerged branch 23110-sbatch-gpu-arguments.
- All agreed upon points are implemented / addressed. Describe changes from pre-implementation design.
- ✅ SbatchArgumentsList and SbatchGPUArgumentsList configs are templated
- ✨ Updated existing tests (args for test cases got re-ordered)
- ✨ Updated existing tests to exercise a non-zero ReserveExtraRAM config (existing tests weren't checking that it was being used at all)
- ✨ Added an error-checking test for LSF BsubArgumentsList (noticed there wasn't one)
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- n/a
- Code is tested and passing, both automated and manual, what manual testing was done is described.
- ✅ Updated existing tests
- The tested code incorporates recent main branch changes.
- ✅
- New or changed UI/UX has gotten feedback from stakeholders.
- ✅
- Documentation has been updated.
- ✅ Updated config reference.
- ✅ Added upgrade note (if SbatchArgumentsList is already configured, it must be updated during upgrade in order to retain previous behavior).
- Behaves appropriately at the intended scale (describe intended scale).
- ✅ n/a
- Considered backwards and forwards compatibility issues between client and server.
- ✅ n/a
- Follows our coding standards and GUI style guidelines.
- ✅
crunch-dispatch-slurm has a seemingly undocumented feature: if you have InstanceTypes in your Arvados config, it chooses one and runs sbatch --constraint=instancetype=X instead of sbatch --mem=X --cpus-per-task=Y --tmp=Z. This branch preserves the feature, but you have to request it explicitly with SbatchArgumentsList: ["--constraint=instancetype=%I"]. The capability is now (very slightly) documented, because the %I sequence is mentioned in the config reference. The current upgrade note doesn't mention it, though, so we're basically assuming nobody is using it. Does it deserve to be mentioned there?
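For readers unfamiliar with the old behavior, it can be modeled roughly like this in Python. This sketch is based only on the description above, not on the dispatcher source. The function name, parameter names, and sample values are hypothetical, and units are passed through untouched.

```python
def legacy_sbatch_resource_args(mem, cpus, tmp, instance_type=None):
    """Model the pre-branch choice between the two argument styles.

    With an instance type chosen, only a constraint is passed;
    otherwise explicit resource requests are passed.
    """
    if instance_type is not None:
        return [f"--constraint=instancetype={instance_type}"]
    return [f"--mem={mem}", f"--cpus-per-task={cpus}", f"--tmp={tmp}"]


# Hypothetical values for illustration.
print(legacy_sbatch_resource_args(4096, 2, 8192))
print(legacy_sbatch_resource_args(4096, 2, 8192, instance_type="example-type"))
```

With templated arguments, this implicit switch goes away: the configuration itself states which style is used.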
Updated by Brett Smith 7 months ago
Tom Clegg wrote in #note-8:
23091-sbatch-template-args @ bb493db09c0470294316ee94cceaa011234c5d89 -- developer-run-tests: #4859
The test failure definitely looks like it could be related to the branch. Can you please at least look and weigh in?
When you start a code block in the docs, please write the first line on the same physical line as the opening <pre> to avoid this blank line on rendering:

It would also be nice to highlight the formerly-default arguments with <span class="userinput"> to call attention to where they are on the line. See the existing upgrade note about RHEL repository GPG keys for an example.
Code LGTM otherwise, thanks.
Updated by Tom Clegg 7 months ago
23091-sbatch-template-args @ a62dd27026fec42caeb1c2e9fcd2e1ed889692cc -- developer-run-tests: #4860
Fixed a bug that caused lsf test suite to deadlock and time out when there was more than one test func.
Updated by Tom Clegg 7 months ago
Brett Smith wrote in #note-9:
The test failure definitely looks like it could be related to the branch. Can you please at least look and weigh in?
Indeed. See #note-10 above.
When you start a code block in the docs, please write the first line on the same physical line as the opening <pre> to avoid this blank line on rendering:
Oops, fixed.
It would also be nice to highlight the formerly-default arguments with <span class="userinput"> to call attention to where they are on the line. See the existing upgrade note about RHEL repository GPG keys for an example.
Good point, done.
23091-sbatch-template-args @ 7ccd2d1f543b57a9e45931f34d30f37d2afc962e
Updated by Brett Smith 7 months ago
Tom Clegg wrote in #note-11:
23091-sbatch-template-args @ 7ccd2d1f543b57a9e45931f34d30f37d2afc962e
Manually reviewed the new docs and LGTM, thank you.
Updated by Tom Clegg 7 months ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|c5638a23b65d100a619dca72fbf80ac067e6b659.