Project

General

Profile

Actions

Feature #20471

closed

Admin CLI for managing dispatcher / cloud VMs

Added by Tom Clegg almost 3 years ago. Updated about 2 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Crunch
Target version:
Story points:
-
Release relationship:
Auto

Description

arvados-server instance list

arvados-server instance kill -reason "optional reason" {instanceID|containerUUID}

See Admin CLI for managing dispatcher / cloud VMs


Subtasks 1 (0 open1 closed)

Task #23372: Review 20471-instances-listResolvedTom Clegg01/20/2026Actions

Related issues 1 (0 open1 closed)

Has duplicate Arvados - Feature #16843: [a-d-c] admin cliDuplicateActions
Actions #1

Updated by Tom Clegg 3 months ago

Actions #2

Updated by Brett Smith 3 months ago

  • Target version changed from Future to Development 2026-01-06
  • Assigned To set to Tom Clegg
  • Category set to Crunch
Actions #3

Updated by Brett Smith 3 months ago

  • Subtask #23372 added
Actions #4

Updated by Tom Clegg 3 months ago · Edited

In progress:

20471-instances-list @ 734922787244520846f1b9ff4f6544bb9226915f -- developer-run-tests: #4999

Basic instance list functionality (even less than described on Admin CLI for managing dispatcher / cloud VMs wiki). No tests.

tordo:~# ./arvados-server-20471 instance list -header
instance        address state   idle-behavior   config-type     provider-type   price   last-container
i-03d59cfcfacf307ff     10.253.254.184  running run     c5large c5.large        0.085000        tordo-dz642-r6fz90awybvywr6
i-0df614e93e4170ae7     10.253.254.157  booting run     t3small t3.small        0.020800
Actions #5

Updated by Tom Clegg 3 months ago

  • Status changed from New to In Progress
Actions #6

Updated by Brett Smith 3 months ago

  • Target version changed from Development 2026-01-06 to Development 2026-01-21
Actions #7

Updated by Tom Clegg 3 months ago

20471-instances-list @ c7456fd0f8252bc8ae965a567f1528de568a35b3 -- developer-run-tests: #5000

with instance kill, drain, hold, and run.

Actions #8

Updated by Tom Clegg 3 months ago

Actions #9

Updated by Tom Clegg 2 months ago

Actions #10

Updated by Tom Clegg 2 months ago

  • All agreed upon points are implemented / addressed. Describe changes from pre-implementation design.
  • Anything not implemented (discovered or discussed during work) has a follow-up story.
  • Code is tested and passing, both automated and manual, what manual testing was done is described.
    • Added tests for list/hold/drain
    • Tested list/hold/drain/run manually on tordo
  • The tested code incorporates recent main branch changes.
  • New or changed UI/UX has gotten feedback from stakeholders.
    • n/a
  • Documentation has been updated.
  • Behaves appropriately at the intended scale (describe intended scale).
    • ✅ no effect on scale
  • Considered backwards and forwards compatibility issues between client and server.
    • ✅ n/a
  • Follows our coding standards and GUI style guidelines.
Actions #11

Updated by Brett Smith 2 months ago

Tom Clegg wrote in #note-9:

20471-instances-list @ e0a69df653fceaf8b4bd7d8278a64099ac2bebb1 -- developer-run-tests: #5004

I will keep reading the branch but this test failure looks real and needs to be addressed.

Actions #12

Updated by Brett Smith 2 months ago

Implementation

See earlier note about the test failures.

As an admin, the "last-container" header makes me wonder: is this the last container that started? Or completed? It looks like this terminology is already baked into the API and I realize there's some value to staying consistent. At the very least I think the documentation should clarify this. I'm open to other ideas about how to make this clearer.

Documentation

The -reason flag should be documented.

I propose laying out the command synopses like:

# arvados-server instance drain <instance-id> [instance-id …]
I think a pre block just makes for nicer presentation. I think listing a second instance-id in brackets follows man page conventions a little closer and helps clarify the command takes one or more IDs as arguments.

Fobbing people off to the API documentation for a description of each state is reader unfriendly. IMO you should just copy and paste the descriptions. I would also be fine if you at least made an anchor for each relevant section and linked to that; or made includes describing each state and used them in both places.

It would be nice if the API documentation was updated to link to the new doc page and suggest using arvados-server.

Tests

If the new tests fail, they're also reader unfriendly: because we're doing a bunch of not-really-related tests in a large function, and a lot of them have similar assertions, the only real way you have to figure out what failed is to open the source. At the very least, I think you should add Commentfs to each assertion to explain what specifically failed. If you want to go farther like separating out test functions, I wouldn't complain.

Actions #13

Updated by Tom Clegg 2 months ago

Brett Smith wrote in #note-12:

See earlier note about the test failures.

Fixed.

As an admin, the "last-container" header makes me wonder: is this the last container that started? Or completed? It looks like this terminology is already baked into the API and I realize there's some value to staying consistent. At the very least I think the documentation should clarify this. I'm open to other ideas about how to make this clearer.

Last started. Added to management API doc page, and changed "not yet run" to "not yet started" on CLI doc page.

The -reason flag should be documented.

Added.

I propose laying out the command synopses like: [...]I think a pre block just makes for nicer presentation. I think listing a second instance-id in brackets follows man page conventions a little closer and helps clarify the command takes one or more IDs as arguments.

Updated formatting.

Fobbing people off to the API documentation for a description of each state is reader unfriendly. IMO you should just copy and paste the descriptions. I would also be fine if you at least made an anchor for each relevant section and linked to that; or made includes describing each state and used them in both places.

Copied descriptions.

It would be nice if the API documentation was updated to link to the new doc page and suggest using arvados-server.

Done.

If the new tests fail, they're also reader unfriendly: because we're doing a bunch of not-really-related tests in a large function, and a lot of them have similar assertions, the only real way you have to figure out what failed is to open the source. At the very least, I think you should add Commentfs to each assertion to explain what specifically failed. If you want to go farther like separating out test functions, I wouldn't complain.

The new tests are a sequence of operations with checks at each step. But yes, they are independent of the existing management API test, so I've moved them to a new test func.

Added some inline comments, which is the gocheck-recommended way to add static comment strings.

          c.Check(exitcode, check.Equals, 0)         // `instance drain` should succeed
          c.Check(stdout.String(), check.Equals, ``) // `instance drain` should output nothing
          c.Check(stderr.String(), check.Matches,    // `instance drain` should show feedback on stderr

20471-instances-list @ 013de101e3531a1947d12cc3f6ca69db77f4cd8f -- developer-run-tests: #5015

Actions #14

Updated by Brett Smith 2 months ago

Tom Clegg wrote in #note-13:

20471-instances-list @ 013de101e3531a1947d12cc3f6ca69db77f4cd8f -- developer-run-tests: #5015

I propose 736ef61a2afe41d76cff953df9c73bee92ad3e4d to make the CLI link in the API docs more prominent. Please take a look and merge either with or without that change. Thanks.

Actions #15

Updated by Tom Clegg 2 months ago

  • Status changed from In Progress to Resolved
Actions #16

Updated by Brett Smith about 2 months ago

  • Release set to 84
Actions

Also available in: Atom PDF