Bug #13166

closed

[node manager] wishlist should consist of top priority containers

Added by Peter Amstutz almost 7 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Start date: 03/26/2018
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: -
Release relationship: Auto

Description

Node manager calculates a "wishlist": the number of nodes of each size that are desired.

The current behavior reports the entire queue, then creates nodes starting with the largest size and working its way down to the smallest. The rationale is that small jobs can run on large nodes but not the other way around.

However, #12199 changes the behavior such that containers will be scheduled on specific node sizes. If the queue is much larger than the maximum number of nodes, this could lead to a situation in which the top of the queue consists of small jobs, but only large nodes are available. In this case, it will either schedule jobs out of order (with low-priority large-node jobs jumping the queue) or deadlock.

To fix:

  • After getting the contents of squeue, sort in descending order by Slurm priority
  • Only consider the top (max_nodes - up_nodes) items in the wishlist and discard the rest
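
The two steps above could be sketched roughly as follows. This is only an illustration and not the actual node manager code: the `QueuedContainer` record, its fields, and the `build_wishlist` function are hypothetical names, and it assumes the squeue output has already been parsed into one record per queued container, with a larger priority number meaning "run sooner".

```python
from typing import List, NamedTuple


class QueuedContainer(NamedTuple):
    """Hypothetical record for one entry parsed from squeue."""
    uuid: str
    priority: int   # Slurm priority; assumed: larger value runs sooner
    node_size: str  # node size the container should be scheduled on


def build_wishlist(queue: List[QueuedContainer],
                   max_nodes: int,
                   up_nodes: int) -> List[str]:
    """Return the node sizes wanted for the highest-priority containers only."""
    # Number of additional nodes that may still be started (never negative).
    slots = max(max_nodes - up_nodes, 0)
    # Step 1: sort by priority, highest first. sorted() is stable, so
    # entries with equal priority keep their original queue order.
    ranked = sorted(queue, key=lambda c: c.priority, reverse=True)
    # Step 2: keep only the top `slots` entries and discard the rest.
    return [c.node_size for c in ranked[:slots]]


queue = [
    QueuedContainer("a", 10, "large"),
    QueuedContainer("b", 50, "small"),
    QueuedContainer("c", 40, "small"),
    QueuedContainer("d", 5, "large"),
]
wishlist = build_wishlist(queue, max_nodes=4, up_nodes=2)
print(wishlist)
```

With two free slots, only the two highest-priority containers ("b" and "c") contribute to the wishlist, so small nodes are requested even though large-node jobs are also waiting in the queue.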

Subtasks 1 (0 open, 1 closed)

Task #13273: Review 13166-nodemanager-whishlist | Closed | Peter Amstutz | 03/26/2018


Related issues 2 (0 open, 2 closed)

Related to Arvados - Story #12552: When priority is equal, the children of an earlier-submitted workflow should run first | Resolved | Tom Clegg | 11/03/2017

Related to Arvados - Bug #12199: Don't schedule jobs on nodes which are too much bigger than requested | Resolved | Tom Clegg | 01/29/2018
