Revision 26 (Peter Amstutz, 02/06/2019 02:37 PM) → Revision 27/33 (Tom Clegg, 04/24/2019 01:00 PM)
h1. Cluster configuration

We are (2019) consolidating configuration from per-microservice yaml/json/ini files into a single cluster configuration document that is used by all components.

* Long term: system nodes automatically keep their configs synchronized (using something like consul).
* Short term: sysadmin uses tools like puppet and terraform to ensure /etc/arvados/config.yml is identical on all system nodes.
* Hosts without config files (e.g., hosts outside the cluster) can retrieve the config document from the API server.

h2. Discovery document

Previously, we copied selected config values from the API server config into the API discovery document so clients could see them. When clients can get the configuration document itself, this won't be needed. The discovery document should advertise APIs provided by the server, not cluster configuration.

h2. Secrets

Secrets like BlobSigningKey can be given literally in the config file (convenient for dev/test, consul-template, etc.) or indirectly using a secret backend.

Anticipated backends:

* <code class="yaml">BlobSigningKey: foobar</code> ⇒ the secret is literally <code>foobar</code>
* <code class="yaml">BlobSigningKey: "vault:foobar"</code> ⇒ the secret can be obtained from vault using the vault key "foobar"
* <code class="yaml">BlobSigningKey: "file:/foobar"</code> ⇒ the secret can be read from the local file @/foobar@
* <code class="yaml">BlobSigningKey: "env:FOOBAR"</code> ⇒ the secret can be read from the environment variable @FOOBAR@

h2. Instructions for ops

Tentative instructions for switching config file format/location:

# Upgrade Arvados to a version that supports loading all configs from the new cluster-wide config file (maybe 1.4). When services come back up, they will still use your old configuration files, but they will log some deprecation warnings.
# Migrate your configuration to the new config file, one component at a time. For each component:
## Restart the component.
## Inspect the deprecation warning that is logged at startup. It will tell you either "old config file is superfluous" or "new config file is incomplete".
## If your old config file is superfluous, delete it. You're done.
## Run the component with the @--config-diff@ flag. This suggests changes to your new config file which will make your old config file obsolete. (Alternatively, run the component with the @--config-dump@ flag. This outputs a new config file that would make your old config file obsolete. Saving this might be easier than applying a diff, but it will reorder keys and lose comments.)
## Make the suggested changes.
## Repeat until finished.
# Upgrade to a version that doesn't support old config files at all (maybe 1.5).

h2. Implementation

Development strategy for facilitating the above ops instructions:

# Read the new config file into an internal struct, if the new config file exists.
# Copy old config file values into the new config struct.
# Use the new config struct internally (the old config is no longer referenced except in the load-and-copy-to-new-struct step).
# Add a mechanism for showing the effect of the old config file on the resulting config struct (see @--config-diff@ above).
# At startup, if the old config has any effect (i.e., some parts haven't been migrated to the new config file by the operator), log a deprecation warning recommending @--config-diff@ and RTFM.
# Wait one minor version release cycle.
# Error out if the new config file does not exist.
# Error out if the old config file exists (...and some parts of the old config are not redundant [optional?]).

h2. Example/template

Example config file

See also [[Config migration key mapping]]

(Format not yet frozen!)

Notes:

* Keys are CamelCase, except in special cases like PostgreSQL connection settings, which are passed through to another system without being interpreted by Arvados.
* Arrays and lists are not permitted.
These cannot be expressed natively in consul, and tend to be troublesome anyway: "what changed?" is harder to answer usefully, significance of duplicate elements is unclear, etc. <pre><code class="yaml"> Clusters: xyzzy: # api-server/uuid_prefix, sso/uuid_prefix SystemRootToken: # arvados-git-sync.rb/arvados_api_token, keepstore/SystemAuthTokenFile, c-d-s/AuthToken ManagementToken: # {arvados-ws,keepstore,keepproxy,keep-balance}/ManagementToken (& others) eec1999ccb6d75840a2c09bc70b6d3cbc990744e Services: BlobSigningKey: ungu355able BlobSignatureTTL: 172800 SessionKey: 186005aa54cab1ca95a3738e6e954e0a35a96d3d13a8ea541f4156e8d067b4f3 PostgreSQL: RailsAPI: InternalURLs: "http://zzzzz:8000/": {} ConnectionPool: 32 # api-server/(protocol,host,port) ExternalURL: “https://zzzzz.arvadosapi.com/" Insecure: false max concurrent connections per arvados server daemon GitHTTP: InternalURLs: "http://git:9001/": {} ExternalURL: "https://git.zzzzz.arvadosapi.com/" # api-server/git_repo_https_base Keepstore: InternalURLs: "http://keep0:25107/": {Unlisted: true} "http://keep1:25107/": {Debug: true} Controller: InternalURLs: "http://zzzzz:9004/": {} # controller/NodeProfiles.$cluster.Controller.Listen ExternalURL: "https://zzzzz.arvadosapi.com/" # composer/apiEndPoint, workbench2/API_HOST, workbench/arvados_{login,v1}_base, arvados-ws/Client, keepproxy/Client Websocket: InternalURLs: "http://ws:9003/": {} # arvados-ws/Listen ExternalURL: "https://ws.zzzzz.arvadosapi.com/" # api-server/websocket_address Keepbalance: InternalURLs: "http://zzzzz:9005": {} # keepbalance/Listen GitHTTP: InternalURLs: "http://zzzzz:9001": {} # arvados-git-httpd/Listen ExternalURL: "https://git.zzzzz.arvadosapi.com/" # api-server/git_repo_https_base GitSSH: ExternalURL: "git@git.zzzzz.arvadosapi.com" # api-server/git_repo_ssh_base DispatchCloud: InternalURLs: "http://zzzzz:9006": {} # a-d-c/NodeProfiles SSO: ExternalURL: "https://auth.zzzzz.arvadosapi.com/" # api-server/sso_provider_url Keepproxy: 
InternalURLs: "http://keep:25107/": {} # keepproxy/Listen ExternalURL: "https://keep.zzzzz.arvadosapi.com/" WebDAV: InternalURLs: "http://keep:9002/": {} # keep-web/Listen ExternalURL: "https://*.collections.zzzzz.arvadosapi.com/" # api-server/keep_web_service_url, workbench/keep_web_url WebDAVDownload: InternalURLs: "http://keep:9002/": {} # keep-web/Listen ExternalURL: "https://download.zzzzz.arvadosapi.com/" # keep-web/AttachmentOnlyHost, workbench/keep_web_download_url Keepstore: InternalURLs: "https://keep0:25107/": {} # keepstore/Listen "https://keep1:25107/": {} # keepstore/Listen Composer: ExternalURL: "http://composer.zzzzz.arvadosapi.com/" # workbench/composer_url WebShell: ExternalURL: "http://webshell.zzzzz.arvadosapi.com/" # workbench/shell_in_a_box_url Workbench1: InternalURLs: "http://workbench:9000": {} # workbench/Nginx.server.listen ExternalURL: "http://workbench.zzzzz.arvadosapi.com/" # workbench/Nginx.server.listen, api-server/workbench_address Workbench2: ExternalURL: "http://workbench2.zzzzz.arvadosapi.com/" # workbench/workbench2_url PostgreSQL: Connection: # arvados-ws/Postgres, controller/PostgreSQL.Connection # All parameters here are passed to the PG client library in a connection string; # see https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS Host: localhost Port: 5432 User: arvados Password: s3cr3t DBName: arvados_production client_encoding: utf8 fallback_application_name: arvados ConnectionPool: # arvados-ws/PostgresPool TLS: HTTPRequestTimeout: 5m Defaults: Certificate: # (literal, file, or acme dir) keepstore/TLSCertificateFile CollectionReplication: 2 Key: # (literal, file, or acme dir) keepstore/TLSKeyFile TrashLifetime: 2w UserActivation: Insecure: ActivateNewUsers: true # workbench/arvados_insecure_https, api-server/sso_insecure Git: GitoliteAdminRepo: # arvados-git-sync.rb/gitolite_url AutoAdminUser: root@example.com GitoliteAdminPublicKey: # arvados-git-sync.rb/gitolite_arvados_git_user_key 
UserProfileNotificationAddress: notify@example.com GitoliteSyncWorkDir: # arvados-git-sync.rb/gitolite_tmp NewUserNotificationRecipients: {} GitCommand: # arv-git-httpd/GitCommand GitoliteHome: # arv-git-httpd/GitoliteHome Repositories: # api-server/git_repositories_dir (crunch1 only; just assume {GitoliteHome}/repositories?) NewInactiveUserNotificationRecipients: {} API: RequestLimits: DisabledAPIs: # api-server/disable_api_methods MaxRequestLogParamsSize: 2KB WebsocketKeepaliveTimeout: # arvados-ws/PingTimeout WebsocketClientEventQueue: # arvados-ws/ClientEventQueue WebsocketServerEventQueue: # arvados-ws/ServerEventQueue KeepServiceRequestTimeout: # keepproxy/Timeout MaxMemoryBuffers: # keepstore/MaxBuffers MaxConcurrentRequests: # keepstore/MaxRequests MaxRequestSize: # api-server/max_request_size 128MiB MaxIndexDatabaseRead: # api-server/max_index_database_read 128MiB MaxItemsPerResponse: # api-server/max_items_per_response, keep-balance/CollectionBatchSize, keep-balance/CollectionBuffers 1000 MaxRequestAmplification: # controller/RequestLimits.MultiClusterRequestConcurrency MultiClusterRequestConcurrency: 4 LogLevel: info CloudVMs: AsyncPermissionsUpdateInterval: # api-server/async_permissions_update_interval Users: BootProbeCommand: "docker ps -q" AutoSetupNewUsers: # api-server/auto_setup_new_users SSHPort: 22 AutoSetupNewUsersWithVmUUID: SyncInterval: 1m # api-server/auto_setup_new_users_with_vm_uuid how often to get list of active instances from cloud provider AutoSetupNewUsersWithRepository: TimeoutIdle: 1m # api-server/auto_setup_new_users_with_repository shutdown if idle longer than this AutoSetupUsernameBlacklist: TimeoutBooting: 10m # api-server/auto_setup_name_blacklist shutdown if exists longer than this without running BootProbeCommand successfully NewUsersAreActive: # api-server/new_users_are_active AutoAdminUserWithEmail: # api-server/auto_admin_user AutoAdminFirstUser: # api-server/auto_admin_first_user UserProfileNotificationAddress: # 
api-server/user_profile_notification_address AdminNotifierEmailFrom: # api-server/admin_notifier_email_from EmailSubjectPrefix: # api-server/email_subject_prefix UserNotifierEmailFrom: # api-server/user_notifier_email_from NewUserNotificationRecipients: TimeoutProbe: 2m # api-server/new_user_notification_recipients shutdown if (after booting) communication fails longer than this, even if ctrs are running NewInactiveUserNotificationRecipients: TimeoutShutdown: 1m # api-server/new_inactive_user_notification_recipients shutdown again if node still exists this long after shutdown AnonymousUserToken: # workbench/anonymous_user_token, keep-web/AnonymousTokens Login: Driver: Amazon SiteTitle: # sso/site_title DefaultLinkTitle: # sso/default_link_title DefaultLinkURL: # sso/default_link_url AllowAccountRegistration: # sso/allow_account_registration RequireEmailConfirmation: # sso/require_email_confirmation Google: DriverParameters: ClientID: # sso/google_oauth2_client_id Region: us-east-1 ClientSecret: # sso/google_oauth2_client_secret LDAP: # sso/use_ldap APITimeout: 20s Title: # sso/use_ldap.title AWSAccessKeyID: abcdef Host: # sso/use_ldap.host AWSSecretAccessKey: abcdefghijklmnopqrstuvwxyz Port: # sso/use_ldap.port ImageID: ami-0a01b48b88d14541e Method: # sso/use_ldap.method SubnetID: subnet-24f5ae62 Base: # sso/use_ldap.base Uid: # sso/use_ldap.uid EmailDomain: # sso/use_ldap.email_domain BindDN: # sso/use_ldap.BindDN Password: # sso/user_ldap.password SecretToken: # sso/secret_token ProviderAppSecret: # api-server/sso_app_secret ProviderAppID: # api-server/sso_app_id SecurityGroups: sg-3ec53e2a AuditLogs: Enable: MaxAge: # api-server/max_audit_log_age 2w MaxDeleteBatch: # api-server/max_audit_log_delete_batch DeleteBatchSize: 100000 UnloggedAttributes: {} # api-server/unlogged_attributes (applies to logs table) example: {"manifest_text": true} SystemLogs: ContainerLogStream: LogLevel: # keepstore/Debug, keepproxy/Debug, arvados-ws/LogLevel BatchSize: 4KiB Format: # 
keepstore/LogFormat, arvados-ws/LogFormat BatchTime: 1s MaxRequestLogParamsSize: # api-server/max_request_log_params_size Collections: ThrottlePeriod: 1m DefaultReplication: # api-server/default_collection_replication, keepproxy/DefaultReplicas ThrottleThresholdSize: 64KiB DefaultTrashLifetime: # api-server/default_trash_lifetime ThrottleThresholdLines: 1024 CollectionVersioning: # api-server/collection_versioning TruncateSize: 64MiB PreserveVersionIfIdle: # api-server/preserve_version_if_idle PartialLineThrottlePeriod: 5s Timers: TrustAllContent: # keep-web/TrustAllContent, workbench/trust_all_content TrashSweepInterval: # api-server/trash_sweep_interval 60s BlobSigningKey: # api-server/blob_signing_key, keepstore/BlobSigningKeyFile ContainerDispatchPollInterval: 10s BlobSigningTTL: # api-server/blob_signature_ttl, keepstore/BlobSignatureTTL APIRequestTimeout: 20s Scaling: BlobSigning: # keepstore/RequireSignatures, api-server/permit_create_collection_with_unsigned_manifest MaxComputeNodes: 64 BlobTrash: EnablePreemptibleInstances: false DisableAPIMethods: {} # keepstore/EnableDelete example: {"jobs.create": true} DockerImageFormats: {"v2": true} Crunch1: BlobTrashLifetime: # keepstore/TrashLifetime Enable: true BlobTrashCheckInterval: # keepstore/TrashCheckInterval CrunchJobWrapper: none BlobTrashConcurrency: # keepstore/TrashWorkers, keep-balance/-commit-trash CrunchJobUser: crunch BlobDeleteConcurrency: # keepstore/EmptyTrashWorkers CrunchRefreshTrigger: /tmp/crunch_refresh_trigger BlobReplicateConcurrency: # keepstore/PullWorkers, keep-balance/-commit-pulls KeepBalanceRunPeriod: 10m # keepbalance/RunPeriod WebDAVCache: TTL: # keep-web/Cache.TTL UUIDTTL: # keep-web/Cache.UUIDTTL MaxCollectionEntries: # keep-web/Cache.MaxCollectionEntries MaxCollectionBytes: # keep-web/Cache.MaxCollectionBytes MaxPermissionEntries: # keep-web/Cache.MaxPermissionEntries MaxUUIDEntries: # keep-web/Cache.MaxUUIDEntries DefaultDockerImage: false Containers: # control how Arvados 
runs user containers NodeProfiles: SupportedDockerImageFormats: # api-server/docker_image_formats LogReuseDecisions: # api-server/log_reuse_decisions DefaultKeepCacheRAM: # api-server/container_default_keep_cache_ram MaxDispatchAttempts: # api-server/max_container_dispatch_attempts MaxRetryAttempts: # api-server/container_count_max PollInterval: 10s # c-d-s/PollPeriod, a-d-c/Dispatch/PollInterval MinRetryPeriod: 30s # c-d-s/MinRetryPeriod (optional? in case ContainerDispatchPollInterval Key is too short) CrunchRunCommand: "crunch-run" # c-d-s/CrunchRunCommand CrunchRunArguments: ‘[“-cgroup-parent-subsystem=memory”, “-foo=bar”]’ # c-d-s/CrunchRunCommand (should this a profile name; can be named CrunchRunArgumentsJSON?) specified on service prog command line, defaults to $(hostname) ReserveExtraRAM: 256MiB # c-d-s/ReserveExtraRAM UsePreemptibleInstances: # api-server/preemptible_instances MaxComputeVMs: # api-server/max_compute_nodes DispatchPrivateKey: # a-d-c/Dispatch/PrivateKey StaleLockTimeout: # a-d-c/Dispatch/StaleLockTimeout Logging: keep: LogBytesPerEvent: # api-server/crunch_log_bytes_per_event Don’t run other services automatically -- only specified ones LogSecondsBetweenEvents: # api-server/crunch_log_seconds_between_events Default: {Disable: true} LogThrottlePeriod: # api-server/crunch_log_throttle_period Keepstore: {Listen: ":25107"} apiserver: LogThrottleBytes: # api-server/crunch_log_throttle_bytes Default: {Disable: true} LogThrottleLines: # api-server/crunch_log_throttle_lines RailsAPI: {Listen: ":9000", TLS: true} LimitLogBytesPerJob: # api-server/crunch_limit_log_bytes_per_job Controller: {Listen: ":9100"} LogPartialLineThrottlePeriod: # api-server/crunch_log_partial_line_throttle_period Websocket: {Listen: ":9101"} LogUpdatePeriod: # api-server/crunch_log_update_period LogUpdateSize: # api-server/crunch_log_update_size MaxAge: # api-server/clean_container_log_rows_after, api-server/clean_job_log_rows_after Health: {Listen: ":9199"} CloudVMs: keep: 
Enable: # arvados-dispatch-cloud is in use Default: {Disable: true} BootProbeCommand: # a-d-c/CloudVMs/BootProbeCommand KeepProxy: {Listen: ":9102"} ProbeInterval: # a-d-c/Dispatch/ProbeInterval MaxProbesPerSecond: # a-d-c/Dispatch/MaxProbesPerSecond TimeoutSignal: # a-d-c/Dispatch/TimeoutSignal TimeoutTERM: # a-d-c/Dispatch/TimeoutTERM MaxCloudOpsPerSecond: # a-d-c/CloudVMs/MaxCloudOpsPerSecond SSHPort: # a-d-c/CloudVMs/SSHPort SyncInterval: # a-d-c/CloudVMs/SyncInterval TimeoutIdle: # a-d-c/CloudVMs/TimeoutIdle TimeoutBooting: # a-d-c/CloudVMs/TimeoutBooting TimeoutProbe: # a-d-c/CloudVMs/TimeoutProbe TimeoutShutdown: # a-d-c/CloudVMs/TimeoutShutdown ImageID: # a-d-c/CloudVMs/ImageID Driver: Amazon # a-d-c/CloudVMs/Driver DriverParameters: # a-d-c/CloudVMs/DriverParameters Region: us-east-1 APITimeout: 20s AWSAccessKeyID: abcdef AWSSecretAccessKey: abcdefghijklmnopqrstuvwxyz ImageID: ami-0a01b48b88d14541e SubnetID: subnet-24f5ae62 SecurityGroups: sg-3ec53e2a KeepWeb: {Listen: ":9103"} SLURM: *: Enable: # crunch-dispatch-slurm This section used for a node whose profile name is in use not listed above PrioritySpread: 1000 Default: {Disable: false} # c-d-s/PrioritySpread SbatchArguments: ‘[“-partition=PartitionName”]’ # c-d-s/SbatchArguments KeepServices: 00000-bi6l4-000000000000000: “http://127.0.0.1:25107” # c-d-s/KeepServiceURIs Managed: Enable: # arvados-node-manager (this is in use DNSServerConfDir: # api-server/dns_server_conf_dir DNSServerConfTemplate: # api-server/dns_server_conf_template DNSServerReloadCommand: # api-server/dns_server_reload_command DNSServerUpdateCommand: # api-server/dns_server_update_command ComputeNodeDomain: # api-server/compute_node_domain ComputeNodeNameservers: # api-server/compute_node_nameservers AssignNodeHostname: # api-server/assign_node_hostname the default behavior) Volumes: JobsAPI: xyzzy-keep-0: Enable: # api-server/enable_legacy_jobs_api (crunch1) Type: s3 CrunchJobWrapper: # api-server/crunch_job_wrapper (crunch1) Region: 
us-east CrunchJobUser: # api-server/crunch_job_user (crunch1) Bucket: xyzzy-keep-0 CrunchRefreshTrigger: # api-server/crunch_refresh_trigger (crunch1) GitInternalDir: # api-server/git_internal_dir (crunch1) ReuseJobIfOutputsDiffer: # api-server/reuse_job_if_outputs_differ DefaultDockerImage: # api-server/default_docker_image_for_jobs [rest of keepstore volume config goes here] Volumes: WebRoutes: # keepstore/Volumes, keep-balance/KeepServiceTypes “default” means route according to method/host/path (e.g., if host is a login shell, route there) xyzzy.arvadosapi.com: default # TODO: some keepstores are closer “collections” means always route to specific volumes keep-web zzzzz-ivpuk-voihjznerfweefq: AccessViaHosts: collections.xyzzy.arvadosapi.com: collections # replaces differing configs on keepstore hosts “http://keep0:25107”: {ReadOnly: true} “http://keep1:25107”: {} “http://keep2:25107”: {ReadOnly: true} “http://keep3:25107”: {ReadOnly: true} leading * is a wildcard (longest match wins) "*--collections.xyzzy.arvadosapi.com": collections cloud.curoverse.com: workbench workbench.xyzzy.arvadosapi.com: workbench "*.xyzzy.arvadosapi.com": default InstanceTypes: m4.large: StorageClasses: # keepstore/S3Volume.StorageClasses, keepstore/AzureBlobVolume.StorageClasses, keepstore/UnixVolume.StorageClasses default: true cold: true Replication: VCPUs: 2 # keepstore/S3Volume.S3Replication, keepstore/AzureBlobVolume.AzureReplication, keepstore/UnixVolume.DirectoryReplication ReadOnly: false # keepstore/S3Volume.ReadOnly, keepstore/AzureBlobVolume.ReadOnly, keepstore/UnixVolume.ReadOnly RAM: 8000000000 Driver: S3 # keepstore/Volumes[].Type Scratch: 31000000000 DriverParameters: AccessKey: # keepstore/S3Volume.AccessKey SecretKey: # keepstore/S3Volume.SecretKey Endpoint: # keepstore/S3Volume.Endpoint Region: # keepstore/S3Volume.Region Bucket: # keepstore/S3Volume.Bucket LocationConstraint: # keepstore/S3Volume.LocationConstraint IndexPageSize: # keepstore/S3Volume.IndexPageSize 
S3Replication: ConnectTimeout: # keepstore/S3Volume.ConnectTimeout ReadTimeout: # keepstore/S3Volume.ReadTimeout RaceWindow: # keepstore/S3Volume.RaceWindow ReadOnly: # UnsafeDelete: # keepstore/S3Volume.UnsafeDelete Price: 0.1 zzzzz-ivpuk-adbtuyuiivjhbnmb: m4.large-1t: AccessViaHosts: # replaces differing configs on keepstore hosts (TBD: do we need “readonly from these hosts”?) “http://keep1:25107”: {ReadOnly: false} same instance type as m4.large but our scripts attach more scratch StorageClasses: # keepstore/S3Volume.StorageClasses, keepstore/AzureBlobVolume.StorageClasses, keepstore/UnixVolume.StorageClasses default: true cold: false ProviderType: m4.large Replication: VCPUs: 2 # keepstore/S3Volume.S3Replication, keepstore/AzureBlobVolume.AzureReplication, keepstore/UnixVolume.DirectoryReplication ReadOnly: false # keepstore/S3Volume.ReadOnly, keepstore/AzureBlobVolume.ReadOnly, keepstore/UnixVolume.ReadOnly RAM: 8000000000 Driver: Azure # keepstore/Volumes[].Type Scratch: 999000000000 DriverParameters: StorageAccountName: # keepstore/AzureBlobVolume.StorageAccountName StorageAccountKey: # keepstore/AzureBlobVolume.StorageAccountKeyFile StorageBaseURL: # keepstore/AzureBlobVolume.StorageBaseURL ContainerName: # keepstore/AzureBlobVolume.ContainerName RequestTimeout: # keepstore/AzureBlobVolume.RequestTimeout Price: 0.12 zzzzz-ivpuk-2344guvaiubbae4wa: m4.xlarge: Driver: Filesystem # keepstore/Volumes[].Type VCPUs: 4 DriverParameters: Root: # keepstore/UnixVolume.Root Serialize: # keepstore/UnixVolume.Serialize BlockDeviceUUID: # (disable if this is non-empty and does not match the local filesystem device) Mail: MailchimpAPIKey: # api-server/mailchimp_api_key MailchimpListID: # api-server/mailchimp_list_id SendUserSetupNotificationEmail: # workbench/send_user_setup_notification_email IssueReporterEmailFrom: # workbench/issue_reporter_email_from IssueReporterEmailTo: # workbench/issue_reporter_email_to SupportEmailAddress: # workbench/support_email_address 
EmailFrom: # workbench/email_from RemoteClusters: # api-server/remote_hosts xyzzx: RAM: 16000000000 Host: Scratch: 78000000000 Proxy: false Price: 0.2 m4.8xlarge: Scheme: https VCPUs: 40 Insecure: false RAM: 160000000000 ActivateUsers: false “*”: # api-server/remote_hosts_via_dns Scratch: 156000000000 ActivateUsers: false Workbench: Price: 2 Theme: default # workbench/arvados_theme ActivationContactLink: # workbench/activation_contact_link ArvadosDocsite: # workbench/arvados_docsite ArvadosPublicDataDocURL: # workbench/arvados_public_data_doc_url ShowUserAgreementInline: # workbench/show_user_agreement_inline SecretToken: # workbench/secret_token SecretKeyBase: # workbench/secret_key_base RepositoryCache: # workbench/repository_cache UserProfileFormFields: # workbench/user_profile_form_fields UserProfileFormMessage # workbench/user_profile_form_message ApplicationMimetypesWithViewIcon: # workbench/application_mimetypes_with_view_icon LogViewerMaxBytes: # workbench/log_viewer_max_bytes EnablePublicProjectsPage: # workbench/enable_public_projects_page EnableGettingStartedPopup: # workbench/enable_getting_started_popup ApiResponseCompression: # workbench/api_response_compression APIClientConnectTimeout: # workbench/api_client_connect_timeout APIClientReceiveTimeout: # workbench/api_client_receive_timeout RunningJobLogRecordsToFetch: # workbench/running_job_log_records_to_fetch ShowRecentCollectionsOnDashboard: # workbench/show_recent_collections_on_dashboard ShowUserNotifications: # workbench/show_user_notifications MultiSiteSearch: # workbench/multi_site_search Repositories: # workbench/repositories SiteName: # workbench/site_name VocabularyURL: # workbench2/VOCABULARY_URL FileViewersConfigURL: # workbench2/FILE_VIEWERS_CONFIG_URL InstanceTypes: x1l: m4.16xlarge: ProviderType: x1.large VCPUs: 16 64 RAM: 128GiB 256000000000 Scratch: 128GB 310000000000 IncludedScratch: 128GB Price: 3.2 c4.large: AddedScratch: 0 VCPUs: 2 RAM: 3750000000 Price: 1.23 0.1 c4.8xlarge: 
Preemptible: false VCPUs: 36 RAM: 60000000000 Price: 1.591 TODO: RemoteClusters: KeepproxyDisableGet xrrrr: Host: xrrrr.arvadosapi.com Proxy: true # keepproxy/DisableGet (retire this feature / use Nginx instead / use a per-token permission instead) KeepproxyDisablePut proxy requests to xrrrr on behalf of our clients AuthProvider: true # keepproxy/DisablePut (retire this feature / users authenticated by xrrrr can use Nginx instead / use a per-token permission instead) RailsSessionSecretToken: # api-server/secret_token (should this be generated at runtime from superusertoken?) InternalIPNetworks: # Nginx $external_client our cluster
</code></pre>

h2. Go Configuration Framework Options

Given some of our long-term goals (config synchronization), viper and go-config seem to be the leading Go config framework contenders; viper seems to be the more widely adopted of the two.

*spf13/viper:* https://github.com/spf13/viper

*micro/go-config:* https://github.com/micro/go-config (more useful: https://micro.mu/docs/go-config.html)

Both solutions are very similar in terms of reported functionality. Both have watch support, and would allow for merging flags, environment variables, remote key stores (Consul), and our master YAML config. Viper also supports encrypted remote key/value access.