Idea #22939

open

Standardize `arvados-server boot` for running from source

Added by Peter Amstutz 10 months ago. Updated 9 months ago.

Status: New
Priority: Normal
Assigned To: -
Target version: -
Start date:
Due date:
Story points: -
Release:
Release relationship: Auto

Description

Point for discussion:

We have too many ways to install, configure, and run Arvados. I'd like to propose that the following three cases are distinct and require distinct solutions, but within each case it should be possible to standardize on a single approach.

  1. Running from source for development and testing. This means building services from source on demand, writing a self-contained configuration, and launching services fully automatically.
  2. Running from packages. This is the production install, which needs to provide maximum operational flexibility but should also lean on ops tools like systemd, ansible, etc.
  3. Running each service in a separate standalone container with container orchestration (e.g. Docker Compose or Kubernetes). We don't do this currently, but I think doing this cleanly would require treating it as a distinct configuration from the first two.

In the first case, we currently have run_test_server.py, arvbox, and arvados-server boot. I'd like to propose migrating the places where we use the other two solutions to standardize on arvados-server boot.

In the second case, we already have consensus on migrating from the salt installer to ansible for installation and configuration. This migration is already underway.

In the third case, we do not support fine-grained container deployment, but it would be good to have consensus that if we did so in the future, it would be distinct from the other two cases (possibly container images could be built using the same packages as the 2nd case, but configuration and orchestration would be handled completely differently).


Related issues 3 (1 open, 2 closed)

Related to Arvados - Idea #22580: arvbox 2.0 (New)
Related to Arvados - Bug #22934: run cwl conformance tests using 'arvados-server boot' instead of arvbox (Duplicate)
Related to Arvados - Feature #23006: Port CWL integration tests to pytest (Resolved, Brett Smith)
Actions #1

Updated by Peter Amstutz 10 months ago

  • Position changed from 154179 to 154188
Actions #2

Updated by Peter Amstutz 10 months ago

  • Description updated (diff)
  • Subject changed from Standardize arvados-boot for running from source and packaging/systemd everywhere else to Standardize arvados-boot for running from source and systemd everywhere else
Actions #3

Updated by Peter Amstutz 10 months ago

  • Subject changed from Standardize arvados-boot for running from source and systemd everywhere else to Standardize arvados-boot for running from source
Actions #4

Updated by Brett Smith 10 months ago

  • Subject changed from Standardize arvados-boot for running from source to Standardize `arvados-server boot` for running from source
Actions #5

Updated by Brett Smith 10 months ago

Actions #6

Updated by Brett Smith 10 months ago

  • Related to Bug #22934: run cwl conformance tests using 'arvados-server boot' instead of arvbox added
Actions #7

Updated by Peter Amstutz 10 months ago

  • Target version changed from Development 2025-07-09 to Future
Actions #8

Updated by Brett Smith 9 months ago

Auditing things that use run_test_server.TestCaseWithServers:

  • Most tests only start RailsAPI and/or a Keep server. One test runs keepproxy.
  • The only configuration they do is enabling/disabling Keep signing. (I wonder how much effort it would be to just make the old tests ready to handle signed locators.)

Omitting run_test_server.py itself:

$ git grep -nE '(MAIN|WS|KEEP|KEEP_PROXY|KEEP_WEB)_SERVER *=' 9b00512b586547cc36be1ade99c311cc54a4f197
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_api.py:41:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_api.py:530:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_arv_copy.py:23:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_arv_copy.py:24:    KEEP_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_arv_get.py:25:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_arv_get.py:26:    KEEP_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_arv_put.py:814:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_arv_put.py:959:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_arv_put.py:960:    KEEP_SERVER = {'blob_signing': True}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_benchmark_collections.py:16:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_collections.py:32:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_collections.py:911:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_collections.py:912:    KEEP_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_collections.py:1122:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_collections.py:1123:    KEEP_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_events.py:73:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_http_cache.py:105:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_keep_client.py:40:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_keep_client.py:41:    KEEP_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_keep_client.py:138:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_keep_client.py:139:    KEEP_SERVER = {'blob_signing': True}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_keep_client.py:188:    MAIN_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_keep_client.py:189:    KEEP_SERVER = {}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/python/tests/test_keep_client.py:190:    KEEP_PROXY_SERVER = {}
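
The summary at the top of this note can be re-derived mechanically from the listing. A minimal sketch, assuming the `commit:path:line:content` layout that `git grep -n` produces above; `tally_server_attrs` is a hypothetical helper name, not something in the Arvados tree:

```python
import re
from collections import Counter

# Matches `git grep -n <pattern> <commit>` output: commit:path:line:content.
GREP_LINE = re.compile(
    r"^[0-9a-f]+:(?P<path>[^:]+):(?P<lineno>\d+):\s*"
    r"(?P<attr>[A-Z_]+_SERVER)\s*=\s*(?P<value>.*)$"
)


def tally_server_attrs(grep_output):
    """Count assignments per attribute and collect the non-empty configurations."""
    counts = Counter()
    configured = []
    for line in grep_output.splitlines():
        match = GREP_LINE.match(line)
        if match is None:
            continue  # Skip the command line itself and anything unexpected.
        counts[match["attr"]] += 1
        value = match["value"].strip()
        if value != "{}":
            # Anything other than an empty dict is an explicit service setting.
            configured.append((match["path"], int(match["lineno"]), value))
    return counts, configured
```

Fed the listing above, this counts 15 MAIN_SERVER, 8 KEEP_SERVER, and 1 KEEP_PROXY_SERVER assignments, and the only non-empty values are the two `{'blob_signing': True}` entries in test_arv_put.py and test_keep_client.py.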
Actions #9

Updated by Brett Smith 9 months ago

One thing that would make it easier to standardize on arvados-server boot is if it had a nicer interface. Looking at source:lib/boot/example.sh (omitting unrelated lines):

coproc boot (arvados-server boot -type test -config doc/examples/config/zzzzz.yml -own-temporary-database -timeout 20m)
read controllerURL <&"${boot[0]}" 
# Copy coproc's stdout to stderr, to ensure `arvados-server boot`
# doesn't get blocked trying to write stdout.
exec 7<&"${boot[0]}"; coproc consume_stdout (cat <&7 >&2)

No offense, but this is a bit of a howler. Read one line from stdout, then redirect the rest to stderr? Sure, you can write this code in any language, but who wants to? It might be nice if arvados-server boot grew an option to write information about the cluster to a given path (and/or fd?) in a standard format (I would take either environment variables or JSON, and I could maybe be talked into YAML).
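
For comparison, here is roughly what the same dance looks like from Python: read the first stdout line (the controller URL, per the `read controllerURL` line in the shell snippet), then keep copying the rest to stderr so the child never blocks on a full pipe. This is a sketch of today's interface only, and `start_and_read_first_line` is a hypothetical helper name.

```python
import subprocess
import sys
import threading


def start_and_read_first_line(cmd):
    """Start cmd and return (process, first line of its stdout).

    The rest of the child's stdout is copied to stderr in the background,
    mirroring the `cat <&7 >&2` coproc in the shell example.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    first_line = proc.stdout.readline().rstrip("\n")

    def drain():
        # Keep reading so the child is never blocked writing to stdout.
        for line in proc.stdout:
            sys.stderr.write(line)

    threading.Thread(target=drain, daemon=True).start()
    return proc, first_line


# Intended use, with the command line taken from lib/boot/example.sh:
#   proc, controller_url = start_and_read_first_line([
#       "arvados-server", "boot", "-type", "test",
#       "-config", "doc/examples/config/zzzzz.yml",
#       "-own-temporary-database", "-timeout", "20m",
#   ])
```

Every caller needs its own variant of this reader-plus-drainer pair, which is the cost an option to write cluster information to a path or fd would remove.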

Actions #10

Updated by Brett Smith 9 months ago

Outside of run-tests.sh, there is only one place we call run_test_server.py as a script:

$ git grep -nF run_test_server.py 9b00512b586547cc36be1ade99c311cc54a4f197
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/go/arvadostest/run_servers.go:78:  cmdArgs := []string{"run_test_server.py", "start_keep", "--num-keep-servers", strconv.Itoa(numKeepServers)}
9b00512b586547cc36be1ade99c311cc54a4f197:sdk/go/arvadostest/run_servers.go:91:  cmd := exec.Command("python", "run_test_server.py", "stop_keep", "--num-keep-servers", strconv.Itoa(numKeepServers))
Actions #11

Updated by Brett Smith 9 months ago

  • Target version deleted (Future)
  • Category deleted (Deployment)
  • Project changed from Arvados to Arvados Epics
Actions #12

Updated by Brett Smith 9 months ago

  • Related to Feature #23006: Port CWL integration tests to pytest added
Actions #13

Updated by Brett Smith 9 months ago

Brett Smith wrote in #note-8:

> Auditing things that use run_test_server.TestCaseWithServers:

Tom suspects these are no-ops because the code sees the cluster started by run-tests.sh and declines to do anything. This still needs to be confirmed.

> Outside of run-tests.sh, there is only one place we call run_test_server.py as a script:

These calls are for the keep-balance tests, which understandably need different numbers of keepstores to test against.

Actions #14

Updated by Brett Smith 9 months ago

  • Release set to 28