Tom Clegg, 01/23/2019 08:38 PM


Cluster configuration

As of 2019, we are consolidating configuration from per-microservice YAML/JSON/INI files into a single cluster configuration document that is used by all components.
  • Long term: system nodes automatically keep their configs synchronized (using something like consul).
  • Short term: sysadmin uses tools like puppet and terraform to ensure /etc/arvados/config.yml is identical on all system nodes.
  • Hosts without config files (e.g., hosts outside the cluster) can retrieve the config document from the API server.
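
Until automatic synchronization exists, one cheap way to confirm that the file really is identical everywhere is to compare content digests collected from each node. The sketch below is only an illustration: the function names are hypothetical, and how the digests get collected (puppet report, ssh loop, etc.) is left open.

```python
import hashlib


def config_digest(path="/etc/arvados/config.yml"):
    """Return the SHA-256 hex digest of the config file's raw bytes."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def nodes_in_sync(digests):
    """digests maps hostname -> digest reported by that node (hypothetical input).

    All nodes are in sync when at most one distinct digest was reported.
    """
    return len(set(digests.values())) <= 1
```

Comparing raw bytes means that a reordered or re-commented file counts as different, which matches the "identical on all system nodes" requirement.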

Discovery document

Previously, we copied selected config values from the API server config into the API discovery document so clients could see them. When clients can get the configuration document itself, this won't be needed. The discovery document should advertise APIs provided by the server, not cluster configuration.

Secrets

Secrets like BlobSigningKey can be given literally in the config file (convenient for dev/test, consul-template, etc.) or indirectly using a secret backend. Anticipated forms:
  • BlobSigningKey: foobar ⇒ the secret is literally foobar
  • BlobSigningKey: "vault:foobar" ⇒ the secret can be obtained from vault using the vault key "foobar"
  • BlobSigningKey: "file:/foobar" ⇒ the secret can be read from the local file /foobar
  • BlobSigningKey: "env:FOOBAR" ⇒ the secret can be read from the environment variable FOOBAR
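
A minimal sketch of how a component might resolve these forms. The prefixes come from the list above; everything else (the function name, the injectable `env` and `vault_lookup` parameters, stripping whitespace from file contents) is an assumption made for illustration, not a decided design.

```python
import os
from pathlib import Path


def resolve_secret(value, env=None, vault_lookup=None):
    """Resolve a config value that is either a literal secret or a backend reference."""
    env = os.environ if env is None else env
    if value.startswith("vault:"):
        if vault_lookup is None:
            raise RuntimeError("no vault backend configured")
        # vault_lookup is a hypothetical callable: vault key -> secret string
        return vault_lookup(value[len("vault:"):])
    if value.startswith("file:"):
        # Assumption: surrounding whitespace (e.g. a trailing newline) is not part of the secret
        return Path(value[len("file:"):]).read_text().strip()
    if value.startswith("env:"):
        name = value[len("env:"):]
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]
    # No recognized prefix: the value is the secret itself
    return value
```

One open question this makes visible: a literal secret that happens to start with "vault:", "file:" or "env:" cannot be expressed without some escaping rule, which is not specified here.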

Instructions for ops

Tentative instructions for switching config file format/location in an operator-friendly way:
  1. Upgrade Arvados to a version that supports loading the new config file (maybe 1.4). Services will restart with your old configuration, but they will log some deprecation warnings at startup.
  2. Migrate your configuration to the new config file, one component at a time. For each component:
    1. Restart the component.
    2. Inspect the deprecation warning that is logged at startup. It will tell you either "old config file is superfluous" or "new config file is incomplete".
    3. If your old config file is superfluous, delete it. You're done.
    4. Otherwise, run the component with the "--config-diff" flag. This suggests changes to your new config file which will make your old config file obsolete. (Alternatively, run the component with the "--config-dump" flag. This outputs a new config file that would make your old config file obsolete. Saving this might be easier than applying a diff, but it will reorder keys and lose comments.)
    5. Make the suggested changes.
    6. Repeat until finished.
  3. Upgrade to a version that doesn't support old config files at all (maybe 1.5).

Implementation

Development strategy for facilitating the above ops instructions:
  1. Read the new config file into an internal struct, if the new config file exists.
  2. Copy old config file values into the new config struct.
  3. Use the new config struct internally (the old config is no longer referenced except in the load-and-copy-to-new-struct step).
  4. Add a mechanism for showing the effect of the old config file on the resulting config struct (see "--config-diff" above).
  5. At startup, if the old config has any effect (i.e., some parts haven't been migrated to the new config file by the operator), log a deprecation warning recommending "--config-diff" and RTFM.
  6. Wait one minor version release cycle.
  7. Error out if the new config file does not exist.
  8. Error out if the old config file exists (...and some parts of the old config are not redundant [optional?]).
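
Steps 1 to 5 can be sketched roughly as follows. This is an illustration under assumptions, not the actual loader: both files are assumed to be already parsed into nested dicts, the old config is assumed to be already translated to the new key names, and `flatten` / `merge_legacy` are hypothetical helper names.

```python
import logging

logger = logging.getLogger("config")


def flatten(tree, prefix=""):
    """Flatten nested dicts into {"A.B.C": value} so configs can be compared key by key."""
    flat = {}
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            flat[path] = value  # leaf value (an empty map is treated as a leaf)
    return flat


def merge_legacy(new_cfg, legacy_cfg):
    """Copy legacy values over the new config; return (merged, diff), both flattened.

    diff maps each dotted key to (value in new file or None, value from legacy file).
    An empty diff means the legacy file is superfluous.
    """
    new_flat, legacy_flat = flatten(new_cfg), flatten(legacy_cfg)
    diff = {
        key: (new_flat.get(key), value)
        for key, value in legacy_flat.items()
        if key not in new_flat or new_flat[key] != value
    }
    merged = dict(new_flat)
    merged.update(legacy_flat)  # step 2: old values win until the operator migrates them
    if diff:
        # step 5: the old config still has an effect, so warn at startup
        logger.warning("old config file still affects %d setting(s); "
                       "run with --config-diff to see them", len(diff))
    return merged, diff
```

The same `diff` value could back both the startup warning and the "--config-diff" output, so the two never disagree about whether the old file is superfluous.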

Example config file

(Format not yet frozen!)

Notes:
  • Keys are CamelCase — except in special cases like PostgreSQL connection settings, which are passed through to another system without being interpreted by Arvados.
  • Arrays and lists are not permitted. These cannot be expressed natively in consul, and tend to be troublesome anyway: "what changed?" is harder to answer usefully, significance of duplicate elements is unclear, etc.
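
The example config expresses set-like values as maps with true values (e.g., DockerImageFormats: {"v2": true}) instead of lists. A loader could enforce the no-lists rule with a check like the one below; the helper names are hypothetical and the config is assumed to be already parsed into nested dicts.

```python
def find_lists(tree, path=""):
    """Return the dotted paths of every list found in a nested config tree."""
    found = []
    if isinstance(tree, list):
        found.append(path or "(root)")
    elif isinstance(tree, dict):
        for key, value in tree.items():
            child = f"{path}.{key}" if path else str(key)
            found.extend(find_lists(value, child))
    return found


def as_set_map(items):
    """Express a list of names as a map, e.g. ["v2"] becomes {"v2": True}."""
    return {item: True for item in items}
```

With maps, "what changed?" reduces to a per-key comparison, and duplicate elements cannot occur at all.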
Clusters:
  xyzzy:
    ManagementToken: eec1999ccb6d75840a2c09bc70b6d3cbc990744e
    BlobSigningKey: ungu355able
    BlobSignatureTTL: 172800
    SessionKey: 186005aa54cab1ca95a3738e6e954e0a35a96d3d13a8ea541f4156e8d067b4f3
    PostgreSQL:
      ConnectionPool: 32 # max concurrent connections per arvados server daemon
      Connection:
        # All parameters here are passed to the PG client library in a connection string;
        # see https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS
        Host: localhost
        Port: 5432
        User: arvados
        Password: s3cr3t
        DBName: arvados_production
        client_encoding: utf8
        fallback_application_name: arvados
    HTTPRequestTimeout: 5m
    Defaults:
      CollectionReplication: 2
      TrashLifetime: 2w
    UserActivation:
      ActivateNewUsers: true
      AutoAdminUser: root@example.com
      UserProfileNotificationAddress: notify@example.com
      NewUserNotificationRecipients: {}
      NewInactiveUserNotificationRecipients: {}
    RequestLimits:
      MaxRequestLogParamsSize: 2KB
      MaxRequestSize: 128MiB
      MaxIndexDatabaseRead: 128MiB
      MaxItemsPerResponse: 1000
      MultiClusterRequestConcurrency: 4
    LogLevel: info
    CloudVMs:
      BootProbeCommand: "docker ps -q" 
      SSHPort: 22
      SyncInterval: 1m    # how often to get list of active instances from cloud provider
      TimeoutIdle: 1m     # shutdown if idle longer than this
      TimeoutBooting: 10m # shutdown if exists longer than this without running BootProbeCommand successfully
      TimeoutProbe: 2m    # shutdown if (after booting) communication fails longer than this, even if ctrs are running
      TimeoutShutdown: 1m # shutdown again if node still exists this long after shutdown
      Driver: Amazon
      DriverParameters:
        Region: us-east-1
        APITimeout: 20s
        AWSAccessKeyID: abcdef
        AWSSecretAccessKey: abcdefghijklmnopqrstuvwxyz
        ImageID: ami-0a01b48b88d14541e
        SubnetID: subnet-24f5ae62
        SecurityGroups: sg-3ec53e2a
    AuditLogs:
      MaxAge: 2w
      DeleteBatchSize: 100000
      UnloggedAttributes: {} # example: {"manifest_text": true}
    ContainerLogStream:
      BatchSize: 4KiB
      BatchTime: 1s
      ThrottlePeriod: 1m
      ThrottleThresholdSize: 64KiB
      ThrottleThresholdLines: 1024
      TruncateSize: 64MiB
      PartialLineThrottlePeriod: 5s
    Timers:
      TrashSweepInterval: 60s
      ContainerDispatchPollInterval: 10s
      APIRequestTimeout: 20s
    Scaling:
      MaxComputeNodes: 64
      EnablePreemptibleInstances: false
    DisableAPIMethods: {} # example: {"jobs.create": true}
    DockerImageFormats: {"v2": true}
    Crunch1:
      Enable: true
      CrunchJobWrapper: none
      CrunchJobUser: crunch
      CrunchRefreshTrigger: /tmp/crunch_refresh_trigger
      DefaultDockerImage: false
    NodeProfiles:
      # Key is a profile name; can be specified on service prog command line, defaults to $(hostname)
      keep:
        # Don’t run other services automatically -- only specified ones
        Default: {Disable: true}
        Keepstore: {Listen: ":25107"}
      apiserver:
        Default: {Disable: true}
        RailsAPI: {Listen: ":9000", TLS: true}
        Controller: {Listen: ":9100"}
        Websocket: {Listen: ":9101"}
        Health: {Listen: ":9199"}
      keepproxy:
        Default: {Disable: true}
        KeepProxy: {Listen: ":9102"}
        KeepWeb: {Listen: ":9103"}
      "*":
        # This section used for a node whose profile name is not listed above
        Default: {Disable: false} # (this is the default behavior)
    Volumes:
      xyzzy-keep-0:
        Type: s3
        Region: us-east
        Bucket: xyzzy-keep-0
        # [rest of keepstore volume config goes here]
    WebRoutes:
      # “default” means route according to method/host/path (e.g., if host is a login shell, route there)
      xyzzy.arvadosapi.com: default
      # “collections” means always route to keep-web
      collections.xyzzy.arvadosapi.com: collections
      # leading * is a wildcard (longest match wins)
      "*--collections.xyzzy.arvadosapi.com": collections
      cloud.curoverse.com: workbench
      workbench.xyzzy.arvadosapi.com: workbench
      "*.xyzzy.arvadosapi.com": default
    InstanceTypes:
      m4.large:
        VCPUs: 2
        RAM: 8000000000
        Scratch: 31000000000
        Price: 0.1
      m4.large-1t:
        # same instance type as m4.large but our scripts attach more scratch
        ProviderType: m4.large
        VCPUs: 2
        RAM: 8000000000
        Scratch: 999000000000
        Price: 0.12
      m4.xlarge:
        VCPUs: 4
        RAM: 16000000000
        Scratch: 78000000000
        Price: 0.2
      m4.10xlarge:
        VCPUs: 40
        RAM: 160000000000
        Scratch: 156000000000
        Price: 2
      m4.16xlarge:
        VCPUs: 64
        RAM: 256000000000
        Scratch: 310000000000
        Price: 3.2
      c4.large:
        VCPUs: 2
        RAM: 3750000000
        Price: 0.1
      c4.8xlarge:
        VCPUs: 36
        RAM: 60000000000
        Price: 1.591
    RemoteClusters:
      xrrrr:
        Host: xrrrr.arvadosapi.com
        Proxy: true        # proxy requests to xrrrr on behalf of our clients
        AuthProvider: true # users authenticated by xrrrr can use our cluster
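
The WebRoutes comments above say that a leading * is a wildcard and the longest match wins. A rough sketch of that lookup follows, using the routes from the example. Two details are assumptions of the sketch rather than anything decided here: an exact entry beats any wildcard, and "*" matches any leading part of the host name.

```python
ROUTES = {
    "xyzzy.arvadosapi.com": "default",
    "collections.xyzzy.arvadosapi.com": "collections",
    "*--collections.xyzzy.arvadosapi.com": "collections",
    "cloud.curoverse.com": "workbench",
    "workbench.xyzzy.arvadosapi.com": "workbench",
    "*.xyzzy.arvadosapi.com": "default",
}


def match_route(host, routes):
    """Return the route target for host, or None if no entry matches."""
    if host in routes:
        return routes[host]  # assumption: exact match takes priority
    best = None
    for pattern, target in routes.items():
        # A leading "*" matches any host that ends with the rest of the pattern
        if pattern.startswith("*") and host.endswith(pattern[1:]):
            if best is None or len(pattern) > len(best[0]):
                best = (pattern, target)  # longest matching pattern wins
    return best[1] if best else None
```

For example, a host like abc--collections.xyzzy.arvadosapi.com matches both wildcard entries, and the longer "*--collections..." pattern decides the result.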
