Feature #11146
[Crunch2] [Workbench] Show slurm queue position of containers submitted to slurm but not yet running
Status: New
Priority: Normal
Assigned To: -
Category: Crunch
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: 3.0
Description
Background
From the user's perspective, it's hard to see what (if anything) is happening between the time a container is created/queued and the time it actually starts running.
In a SLURM setup, the container typically moves quickly from Queued to Locked state when crunch-dispatch-slurm puts it in the slurm queue, and then stays there for some time waiting for SLURM resources to run it.
Proposed feature
Soon after a container is submitted to the SLURM queue, Workbench should start indicating how close the resulting SLURM job is to the front of the queue.
Implementation
When checking squeue, crunch-dispatch-slurm should note the SLURM queue position of each "Locked" container and propagate this information to the API server.
- API: Add a new serialized Hash field, dispatch_info
- crunch-dispatch-slurm: store queue position as dispatch_info["queue_position"]
- crunch-dispatch-slurm: only update containers for which this process has the lock
- crunch-dispatch-slurm: rate-limit queue position updates for any given container to at most one update per second, and skip redundant updates like "update queue position from 5 to 5"
- crunch-dispatch-slurm: ensure no races between "update queue position" and "update container state" requests
- Workbench: display the latest queue position when available
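The queue position and rate-limiting items above could be sketched roughly as follows. This is only an illustration in Python, not the crunch-dispatch-slurm code: the input format (one "<job name> <state>" pair per line, with job names equal to container UUIDs), the choice to treat output order among PENDING jobs as the queue position, and the names queue_positions, QueuePositionReporter and send are all assumptions made for the example.

```python
import time


def queue_positions(squeue_lines):
    """Map job name -> 1-based position among pending jobs.

    Assumes each line is "<job name> <state>" and that the order of the
    lines reflects queue order (both are assumptions of this sketch).
    """
    positions = {}
    for line in squeue_lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        name, state = fields
        if state == "PENDING":
            positions[name] = len(positions) + 1
    return positions


class QueuePositionReporter:
    """Forward queue positions, dropping redundant or too-frequent updates."""

    def __init__(self, send, min_interval=1.0, clock=time.monotonic):
        # send(uuid, dispatch_info) is a hypothetical callback that would
        # perform the API update for a container this process has locked.
        self._send = send
        self._min_interval = min_interval
        self._clock = clock
        self._last = {}  # uuid -> (last position sent, time it was sent)

    def report(self, uuid, position):
        """Return True if an update was sent, False if it was skipped."""
        now = self._clock()
        last = self._last.get(uuid)
        if last is not None:
            last_position, last_time = last
            if position == last_position:
                return False  # redundant, e.g. "from 5 to 5"
            if now - last_time < self._min_interval:
                return False  # at most one update per interval
        self._send(uuid, {"queue_position": position})
        self._last[uuid] = (position, now)
        return True
```

A skipped position is not lost for long in this design, because the next squeue check would call report again with the current value. The sketch does not address the lock-ownership or race-avoidance items; those would need to be handled where the state and queue position updates are actually issued.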