Feature #5717
closed
[Crunch] Use tasks_this_level to calculate a virtual max_tasks_per_node
100%
Description
Following up on #5642 and 6261cf90.
[From IRC. Edited for clarity.]
(10:44:23 AM) Me: What if we stuck to calculating $tasks_this_level before THISROUND, and then used it as a virtual max_tasks_per_node for the entire level, if one isn't explicitly set?
(10:44:53 AM) Me: The memory limits already consider max_tasks_per_node, so this change would make sure each task on the level runs with the same RAM limits.
(10:45:00 AM) tomclegg: brett: ideally the job would tell us ahead of time "I need much ram" or "I don't need many slots" and we wouldn't have to guess. (but the full solution there isn't obvious either)
(10:45:26 AM) Me: Yes, I'm trying to scope this more narrowly than an ideal solution right now. :)
(10:45:58 AM) Me: What I'm proposing would arguably punish you if you queue up a small number of tasks at a level, then many more tasks at that same level… but you shouldn't be doing that anyway.
(10:46:20 AM) tomclegg: brett: right, we could decide before THISROUND to reduce #slots for this level. that'd be safe, and would only have adverse effects for jobs that queue level-N tasks from level-N tasks.
(10:46:33 AM) tomclegg: heh. "shouldn't"
(10:47:26 AM) tomclegg: brett: (I think it's uncommon but I wouldn't go as far as "shouldn't": "do more work, it's ready to do right now" seems pretty reasonable)
(10:50:10 AM) tomclegg: brett: this actually seems pretty close to "ideal" if you think of "only queued 1 task on this level so far" as the closest we have to letting tasks specify resource constraints.
[…]
(10:57:02 AM) tomclegg: anyway, yes, I'm convinced. "virtual max_tasks_per_node for the entire level" sgtm.
(10:57:13 AM) Me: Cool, I'll work on that.
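
A minimal sketch of the idea agreed on above, for illustration only. It is written in Python rather than in the dispatcher's own language, and it is not the code that was committed. The function names, the even spread of tasks across nodes, and the even split of node RAM among tasks are all assumptions made for this example; only tasks_this_level and max_tasks_per_node come from the discussion.

```python
def virtual_max_tasks_per_node(tasks_this_level, node_count, slots_per_node,
                               explicit_max=None):
    """Pick the per-node task cap to use for an entire level.

    Hypothetical helper: computed once, before the scheduling loop for the
    level starts, and then held fixed for that level.
    """
    if explicit_max is not None:
        # An explicitly configured max_tasks_per_node always wins.
        return explicit_max
    # Assumption: tasks are spread evenly, so each node needs at most
    # ceil(tasks_this_level / node_count) slots.
    needed = -(-tasks_this_level // node_count)
    # Never exceed the slots a node really has, and never go below one.
    return max(1, min(slots_per_node, needed))


def per_task_ram_limit(node_ram_mib, max_tasks_per_node):
    """Assumption: node RAM is divided evenly among the allowed tasks."""
    return node_ram_mib // max_tasks_per_node


# One task queued at this level on 4 nodes with 8 slots each:
# the cap is 1, so that task may use a whole node's RAM.
cap = virtual_max_tasks_per_node(1, node_count=4, slots_per_node=8)
limit = per_task_ram_limit(16384, cap)
```

Because the cap is fixed for the whole level, every task on that level gets the same RAM limit. The trade-off raised in the discussion still applies: if level-N tasks queue more level-N tasks later, those extra tasks run under the smaller cap that was chosen at the start.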