Tom Clegg, 01/27/2016 03:42 PM


Container dispatch

Summary

A dispatcher uses available compute resources to execute queued containers.

Dispatch is meant to be a small, simple component rather than a pluggable framework. For example, "slurm dispatch" can be a small standalone program rather than a plugin for a big generic dispatch program.

Pseudocode

  • Notice there is a queued container
  • Decide whether the required resources are available to run the container
  • Lock the container (this avoids races with other dispatch processes)
  • Translate the container's runtime constraints and priority to instructions for the lower-level scheduler, if any
  • Invoke the "crunch-run" executor
  • When the priority changes on a container taken by this dispatch process, update the lower-level scheduler accordingly (cancel if priority is zero)
  • If the lower-level scheduler indicates the container is finished or abandoned, but the Container record is locked by this dispatcher and has state=Running, fail the container
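
The steps above can be sketched as a single polling pass. This is only an illustration under stated assumptions: Container, FakeScheduler, Dispatcher, the locked_by field, and the "Failed" state name are all hypothetical stand-ins, not part of the Arvados API, and a real dispatcher would make API calls where this sketch mutates in-memory records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Container:
    uuid: str
    state: str = "Queued"
    priority: int = 1
    locked_by: Optional[str] = None  # hypothetical: which dispatcher holds the lock


class FakeScheduler:
    """Hypothetical stand-in for a lower-level scheduler such as slurm."""

    def __init__(self, slots):
        self.slots = slots
        self.jobs = {}  # container uuid -> priority

    def has_capacity(self, container):
        return len(self.jobs) < self.slots

    def submit(self, container):
        self.jobs[container.uuid] = container.priority

    def update_priority(self, uuid, priority):
        self.jobs[uuid] = priority

    def cancel(self, uuid):
        self.jobs.pop(uuid, None)

    def is_active(self, uuid):
        return uuid in self.jobs


class Dispatcher:
    def __init__(self, name, scheduler):
        self.name = name
        self.scheduler = scheduler
        self.failed = []  # uuids of containers this dispatcher had to fail

    def try_lock(self, c):
        # Stand-in for the Queued->Locked API call; only one dispatcher can win.
        if c.state != "Queued":
            return False
        c.state, c.locked_by = "Locked", self.name
        return True

    def fail(self, c):
        # Placeholder: the text says "fail the container" without naming the call.
        c.state = "Failed"  # hypothetical state name
        self.failed.append(c.uuid)

    def run_once(self, containers):
        for c in containers:
            if c.state == "Queued" and c.priority > 0:
                if not self.scheduler.has_capacity(c):
                    continue  # leave it queued; another dispatcher may run it
                if not self.try_lock(c):
                    continue  # another dispatcher got there first
                self.scheduler.submit(c)  # the executor is expected to set Running
            elif c.locked_by == self.name and c.state in ("Locked", "Running"):
                if self.scheduler.is_active(c.uuid):
                    if c.priority == 0:
                        self.scheduler.cancel(c.uuid)
                    else:
                        self.scheduler.update_priority(c.uuid, c.priority)
                elif c.state == "Running":
                    self.fail(c)  # job gone, but no final state was recorded
```

A real dispatcher would call run_once() from an event handler or a timer. The sketch leaves a container untouched when there is no capacity, which matches the rule that dispatch does not fail containers it cannot run.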

Examples

slurm batch mode
  • Use "sinfo" to determine whether it is possible to run the container
  • Submit a batch job to the queue: sbatch -N1 --wrap="crunch-run --job {uuid}" (sbatch needs --wrap here, or a script beginning with "#!", rather than a bare command on stdin)
  • When container priority changes, use scontrol and scancel to propagate changes to slurm
  • Use strigger to run a cleanup script when a container exits
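
As a rough sketch of the slurm steps above, the helpers below only build command lines and never run them. The function names, the vcpus and ram_bytes parameters, the MAX_PRIORITY constant, and the priority-to-nice mapping are assumptions made for illustration. The crunch-run arguments are copied from the sbatch example above, and naming each slurm job after the container UUID is likewise just one possible convention.

```python
MAX_PRIORITY = 1000  # assumption: highest container priority a dispatcher will see


def nice_for(priority):
    # Hypothetical mapping: a higher container priority gives a lower slurm
    # "nice" adjustment, never going below zero.
    return max(0, MAX_PRIORITY - priority)


def sbatch_args(container_uuid, vcpus, ram_bytes, priority=MAX_PRIORITY):
    ram_mb = -(-ram_bytes // 2**20)  # round up to whole megabytes for --mem
    return [
        "sbatch", "-N1",
        "--job-name=" + container_uuid,
        "--cpus-per-task={}".format(vcpus),
        "--mem={}".format(ram_mb),
        "--nice={}".format(nice_for(priority)),
        "--wrap=crunch-run --job " + container_uuid,
    ]


def priority_change_args(container_uuid, priority):
    # Priority zero means cancel; anything else adjusts the queued job.
    if priority == 0:
        return ["scancel", "--name=" + container_uuid]
    return ["scontrol", "update", "JobName=" + container_uuid,
            "Nice={}".format(nice_for(priority))]
```

Each returned list could be passed to subprocess.run(). Keeping the construction separate from the execution makes the translation step easy to check without a slurm cluster.
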
standalone worker
  • Inspect /proc/meminfo, /proc/cpuinfo, "docker ps", etc. to determine local capacity
  • Invoke crunch-run as a child process (or perhaps a detached daemon process)
  • Signal crunch-run to stop if container priority changes to zero
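
For the standalone worker, the local capacity check might look roughly like the following. It is a minimal sketch assuming a Linux host whose /proc/meminfo has a MemAvailable line. The function names and the need_ram_bytes and need_vcpus parameters are hypothetical, and the sketch ignores "docker ps" and any resources already promised to running containers.

```python
def mem_available_bytes(meminfo_text):
    # /proc/meminfo reports sizes in kB, e.g. "MemAvailable:    8388608 kB".
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) * 1024
    raise ValueError("MemAvailable not found")


def count_cpus(cpuinfo_text):
    # /proc/cpuinfo has one "processor : N" line per logical CPU.
    return sum(1 for line in cpuinfo_text.splitlines()
               if line.startswith("processor"))


def can_run_locally(meminfo_text, cpuinfo_text, need_ram_bytes, need_vcpus):
    return (mem_available_bytes(meminfo_text) >= need_ram_bytes
            and count_cpus(cpuinfo_text) >= need_vcpus)
```

On a real host the two text arguments would come from open("/proc/meminfo").read() and open("/proc/cpuinfo").read(). Taking the text as a parameter keeps the parsing testable on any machine.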

Arvados API support

Each dispatch process has an Arvados API token that allows it to see queued containers.
  • No two dispatch processes may run at the same time with the same token. One way to ensure this is to create a separate user record for each dispatch service.
Container APIs relevant to a dispatch program:
  • List Queued containers (the result might be only a subset of all Queued containers)
  • List containers with state=Locked or state=Running associated with current token
    • arvados.v1.containers.current (equivalent to filters=[["dispatch_auth_uuid","=",current_client_auth.uuid]])
  • Receive event when container is created or modified and state is Queued (it might become runnable)
  • Change state Queued->Locked
  • Change state Locked->Queued
  • Change state Locked->Running
  • Change state Running->Complete
  • Receive event when priority changes
  • Receive event when state changes to Complete
  • Retrieve an API token to pass into the container and its arv-mount process (via crunch-run)
    • Token is automatically created/assigned when container state changes to Locked
    • Token is automatically expired/destroyed when container state changes away from Running
    • arvados.v1.api_client_authorizations.get(uuid=container.auth_uuid)
  • Create events/logs
    • Decided to run this container
    • Decided not to run this container (e.g., no node with those resources)
    • Lock failed
    • Dispatched to crunch-run
    • Cleaned up crashed crunch-run (lower-level scheduler indicates the job finished, but crunch-run didn't leave the container in a final state)
    • Cleaned up abandoned container (container belongs to this process, but dispatch and lower-level scheduler don't know about it)
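
The state changes listed above form a small state machine, and a dispatcher could check a transition locally before asking the API server to make it. The sketch below encodes only the four transitions named in the list. ALLOWED_TRANSITIONS, check_transition, and current_containers_filter are hypothetical helper names, and the server may well permit other transitions that this page does not mention.

```python
import json

# Only the transitions named in the list above.
ALLOWED_TRANSITIONS = {
    ("Queued", "Locked"),
    ("Locked", "Queued"),
    ("Locked", "Running"),
    ("Running", "Complete"),
}


def check_transition(old_state, new_state):
    """Return new_state if the change is in the list above, else raise."""
    if (old_state, new_state) not in ALLOWED_TRANSITIONS:
        raise ValueError("unsupported transition {}->{}".format(old_state, new_state))
    return new_state


def current_containers_filter(auth_uuid):
    # JSON-encoded filter equivalent to arvados.v1.containers.current.
    return json.dumps([["dispatch_auth_uuid", "=", auth_uuid]])
```

For example, check_transition("Queued", "Running") raises ValueError, because a container has to be Locked before it can run.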

Non-responsibilities

Dispatch doesn't retry failed containers. If something needs to be reattempted, a new container will appear in the queue.

Dispatch doesn't fail a container that it can't run. It doesn't know whether other dispatchers will be able to run it.

Additional notes

(See also #6429, #6518, and #8028.)

Using websockets to listen for container events (new containers added, priority changes) will benefit from some Go SDK support.
