Feature #14922

open

[crunch-dispatch-cloud] Run multiple containers concurrently on a single VM

Added by Tom Clegg about 5 years ago. Updated over 1 year ago.

Status: New
Priority: Normal
Assigned To: -
Category: Crunch
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: 2.0
Release:
Release relationship: Auto

Description

Run a new container on an already-occupied VM (instead of using an idle VM or creating a new one) when the following conditions apply:
  • the occupied VM is the same price as the instance type that would normally be chosen for the new container
  • the occupied VM has enough unallocated RAM, scratch, and VCPUs to accommodate the new container
  • either every container on the VM, including the new one, allocates >0 VCPUs, or the instance will still have unallocated VCPUs after adding the new container (this ensures that an N-VCPU container does not share an N-VCPU instance with anything, even 0-VCPU containers)
  • the occupied VM has IdleBehavior=run (not hold or drain)
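
The four conditions above can be sketched as a single predicate. This is only an illustration, not the dispatcher's actual code: the Resources and VM types, the canShare function, and the string-valued IdleBehavior field are hypothetical names invented here, and "unallocated space" in the third condition is read as unallocated VCPUs.

```go
package main

import "fmt"

// Resources is a hypothetical bundle of the three quantities the ticket names.
type Resources struct {
	VCPUs   int
	RAM     int64 // bytes
	Scratch int64 // bytes
}

// VM is a hypothetical view of an occupied instance.
type VM struct {
	Price        float64     // price of the VM's instance type
	Capacity     Resources   // total resources of the instance
	Containers   []Resources // allocations of containers already running
	IdleBehavior string      // "run", "hold", or "drain"
}

// allocated sums the resources already claimed by running containers.
func (vm VM) allocated() Resources {
	var sum Resources
	for _, c := range vm.Containers {
		sum.VCPUs += c.VCPUs
		sum.RAM += c.RAM
		sum.Scratch += c.Scratch
	}
	return sum
}

// canShare reports whether ctr may be added to the occupied vm.
// dedicatedPrice is the price of the instance type that would normally
// be chosen for ctr.
func canShare(vm VM, ctr Resources, dedicatedPrice float64) bool {
	// Conditions 4 and 1: VM is in "run" mode and costs the same.
	if vm.IdleBehavior != "run" || vm.Price != dedicatedPrice {
		return false
	}
	// Condition 2: enough unallocated VCPUs, RAM, and scratch.
	used := vm.allocated()
	free := Resources{
		VCPUs:   vm.Capacity.VCPUs - used.VCPUs,
		RAM:     vm.Capacity.RAM - used.RAM,
		Scratch: vm.Capacity.Scratch - used.Scratch,
	}
	if free.VCPUs < ctr.VCPUs || free.RAM < ctr.RAM || free.Scratch < ctr.Scratch {
		return false
	}
	// Condition 3: either every container (including the new one) claims
	// at least one VCPU, or some VCPUs remain unallocated afterwards.
	allPositive := ctr.VCPUs > 0
	for _, c := range vm.Containers {
		if c.VCPUs == 0 {
			allPositive = false
		}
	}
	return allPositive || free.VCPUs-ctr.VCPUs > 0
}

func main() {
	gib := int64(1) << 30
	vm := VM{
		Price:        0.10,
		Capacity:     Resources{VCPUs: 2, RAM: 8 * gib, Scratch: 100 * gib},
		Containers:   []Resources{{VCPUs: 2, RAM: gib, Scratch: gib}},
		IdleBehavior: "run",
	}
	// A 0-VCPU container must not join a fully allocated 2-VCPU instance.
	fmt.Println(canShare(vm, Resources{VCPUs: 0, RAM: gib, Scratch: gib}, 0.10))
}
```

Comparing prices with exact float equality is a simplification; a real implementation might compare instance type identities or use a tolerance.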

If multiple occupied VMs satisfy these criteria, choose the one that has the most containers already running, or the most RAM already occupied. Packing new containers onto the busiest VMs lets the less-occupied shared VMs empty out and shut down, rather than keeping many underutilized VMs alive after a busy period subsides.
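
One possible reading of this selection rule, sketched below, ranks candidates by container count and uses occupied RAM only to break ties. The ticket leaves the choice between the two metrics open, so that ordering is an assumption, as are the Candidate type and the pickBusiest function.

```go
package main

import "fmt"

// Candidate is a hypothetical summary of an occupied VM that already
// satisfies the sharing criteria.
type Candidate struct {
	ID         string
	Containers int   // number of containers already running
	RAMUsed    int64 // bytes of RAM already allocated
}

// pickBusiest returns the candidate with the most running containers,
// breaking ties by the most RAM already occupied. The second return
// value is false when there are no candidates.
func pickBusiest(cands []Candidate) (Candidate, bool) {
	if len(cands) == 0 {
		return Candidate{}, false
	}
	best := cands[0]
	for _, c := range cands[1:] {
		if c.Containers > best.Containers ||
			(c.Containers == best.Containers && c.RAMUsed > best.RAMUsed) {
			best = c
		}
	}
	return best, true
}

func main() {
	cands := []Candidate{
		{ID: "vm-a", Containers: 1, RAMUsed: 4 << 30},
		{ID: "vm-b", Containers: 3, RAMUsed: 2 << 30},
		{ID: "vm-c", Containers: 3, RAMUsed: 6 << 30},
	}
	if best, ok := pickBusiest(cands); ok {
		fmt.Println("schedule on", best.ID)
	}
}
```

Under this ordering, vm-a in the example receives no new work and can be shut down once its single container finishes.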

Typically, "same price as a dedicated node, but has spare capacity" will only happen with the cheapest instance type, but it might also apply to larger sizes if the menu has big size steps. Either way, this rule avoids the risk of wasting money by scheduling small long-running containers onto big nodes. In future, this rule may be configurable (to accommodate workloads that benefit more by sharing than they lose by underusing nodes).

Typically, the smallest node type has 1 VCPU, so this feature is useful only if container requests and containers can either
  • specify a minimum of zero CPUs, or
  • specify a fractional number of CPUs.

...so ensure at least one of those is possible.


Related issues

Related to Arvados - Feature #15370: [arvados-dispatch-cloud] loopback driver (Resolved, Tom Clegg, 05/17/2022)

#1

Updated by Tom Clegg about 5 years ago

  • Description updated

#2

Updated by Tom Morris about 5 years ago

  • Story points set to 2.0

#3

Updated by Tom Morris about 5 years ago

  • Target version changed from To Be Groomed to Arvados Future Sprints

#4

Updated by Tom Clegg almost 5 years ago

  • Related to Feature #15370: [arvados-dispatch-cloud] loopback driver added

#5

Updated by Tom Clegg over 4 years ago

  • Subject changed from [crunch-dispatch-cloud] Run multiple containers on a single VM to [crunch-dispatch-cloud] Run multiple containers concurrently on a single VM

#6

Updated by Peter Amstutz almost 3 years ago

  • Target version deleted (Arvados Future Sprints)

#7

Updated by Lucas Di Pentima over 1 year ago

  • Release set to 60