Version 1 - History - Multi-cluster user database - Arvados

Tom Clegg

h1. Multi-cluster user database

It is sometimes desirable to share a single user database across multiple Arvados clusters. For example:

* Clusters aaaaa, bbbbb, ccccc are on different continents.
* A down/unreachable cluster should not prevent a user from accessing _other_ clusters -- even if the down/unreachable cluster is normally the best/default one from that user's perspective.

This requires some changes to authentication (obtaining and validating API tokens).

h2. Obtaining tokens

Each user must be able to log in to their account using any cluster, regardless of where/whether they have logged in previously. This contrasts with the current setup, where each user account has a "home cluster" which must be used to log in.

To achieve this (without depending on real-time communication between clusters) we need all of the participating clusters to agree on a mapping of upstream authentication results to Arvados user UUIDs. For example, if the upstream authentication result is <code>"ldap://ldap.example foo@bar.example"</code> ("ldap://ldap.example assures us this user is foo@bar.example"):

# If a row already exists in the users table with <code>upstream == "ldap://ldap.example foo@bar.example"</code>, use that row.
# Otherwise, create a new row with user UUID <code>"fffff-tpzed-${sha1part(upstream)}"</code>, where fffff is a common prefix used by all participating clusters and sha1part() is the first 15 characters of the base-36-encoded sha1().
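
The two steps above can be sketched as follows. This is only an illustration under stated assumptions: the users table is modelled as a plain dict keyed by the upstream string, the helper names @base36@, @user_uuid@ and @find_or_create_user@ are hypothetical, and left-padding the base-36 string with zeros to a fixed width before truncating is an assumption (the text above does not say how short encodings are handled).

```python
import hashlib

BASE36 = "0123456789abcdefghijklmnopqrstuvwxyz"


def base36(n):
    """Encode a non-negative integer in base 36 (hypothetical helper)."""
    if n == 0:
        return "0"
    digits = []
    while n:
        n, r = divmod(n, 36)
        digits.append(BASE36[r])
    return "".join(reversed(digits))


def sha1part(upstream):
    """First 15 chars of the base-36-encoded SHA-1 of the upstream string."""
    digest = hashlib.sha1(upstream.encode("utf-8")).digest()
    # A 160-bit value needs at most 31 base-36 digits. Zero-padding to 31
    # (an assumption) keeps the prefix stable when the digest is small.
    return base36(int.from_bytes(digest, "big")).zfill(31)[:15]


def user_uuid(prefix, upstream):
    """Derive the user UUID every participating cluster would agree on."""
    return f"{prefix}-tpzed-{sha1part(upstream)}"


def find_or_create_user(users, prefix, upstream):
    """users: dict mapping upstream string -> row (stand-in for the users table)."""
    row = users.get(upstream)
    if row is None:
        # Step 2: no existing row, so derive the UUID deterministically.
        row = {"uuid": user_uuid(prefix, upstream), "upstream": upstream}
        users[upstream] = row
    # Step 1: an existing row always wins, so pre-existing UUIDs are kept.
    return row
```

Because the UUID depends only on the shared prefix and the upstream string, two clusters that have never communicated would still create the same row for the same user.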

To avoid changing existing user accounts' UUIDs after this change, we would do a one-time synchronization across all participating clusters. For example, if "aaaaa-tpzed-012340123401234" exists on cluster aaaaa, we would add that row to bbbbb and ccccc as well. Next time that user logs in to bbbbb, bbbbb would issue a token itself, rather than deferring to aaaaa.
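
A rough sketch of that one-time synchronization, assuming each cluster's users table can be modelled as a dict of UUID to row. The function name @synchronize_users@ is hypothetical, and the rule that an existing row is never overwritten is an assumption; conflict handling is not specified above.

```python
def synchronize_users(clusters):
    """Copy every user row to every cluster that lacks it.

    clusters: dict mapping cluster ID -> dict mapping user UUID -> row.
    Mutates the per-cluster dicts in place and returns the number of rows added.
    """
    # Collect one copy of each known row; the first cluster (in sorted
    # order) that has a given UUID supplies the row.
    merged = {}
    for cluster_id in sorted(clusters):
        for uuid, row in clusters[cluster_id].items():
            merged.setdefault(uuid, row)

    added = 0
    for users in clusters.values():
        for uuid, row in merged.items():
            if uuid not in users:
                users[uuid] = dict(row)  # copy so clusters do not share objects
                added += 1
    return added
```

With the example above, a row for "aaaaa-tpzed-012340123401234" that exists only on aaaaa would be added to bbbbb and ccccc, keeping its original UUID.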

h2. Validating tokens

Each cluster must be able to validate a token that was issued by a different, currently unreachable, cluster. This contrasts with the current setup, where aaaaa validates tokens issued by bbbbb by doing a callback to bbbbb.

This seems easy enough: instead of random strings, tokens can be JWT (or JWT-like), signed with a private key whose public part is known to all clusters. (This would be more efficient than callbacks even for mutually untrusted clusters.)
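
The shape of offline validation can be sketched as below. Note the substitution: to stay within the Python standard library, this sketch signs with HMAC-SHA256 and a shared secret, which is *not* the asymmetric scheme proposed above. A real implementation would sign with a private key and verify with the public key. The function names, the claim names (@sub@, @iss@, @exp@) and the two-part token layout are all assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(key, payload):
    # Stand-in for an asymmetric signature (assumption: shared secret).
    return _b64(hmac.new(key, payload.encode("utf-8"), hashlib.sha256).digest())


def issue_token(key, user_uuid, issuer, ttl, now=None):
    """Return 'payload.signature' carrying the user UUID, issuer and expiry."""
    now = int(time.time()) if now is None else now
    claims = {"sub": user_uuid, "iss": issuer, "exp": now + ttl}
    payload = _b64(json.dumps(claims, sort_keys=True).encode("utf-8"))
    return payload + "." + _sign(key, payload)


def validate_token(key, token, now=None):
    """Return the claims if the token is authentic and unexpired, else None.

    No callback to the issuing cluster is made.
    """
    now = int(time.time()) if now is None else now
    try:
        payload, sig = token.split(".")
        expected = _sign(key, payload)
        if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
            return None
        claims = json.loads(_unb64(payload))
    except ValueError:
        return None
    if claims["exp"] <= now:
        return None
    return claims
```

In this sketch, a token issued by bbbbb can be checked on aaaaa while bbbbb is unreachable, because validation needs only the key material and the token itself.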
