Project

General

Profile

Actions

Bug #16683

closed

Trouble sharing with federated users

Added by Peter Amstutz over 4 years ago. Updated over 4 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
API
Target version:
Start date:
08/13/2020
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
-
Release relationship:
Auto

Description

Workbench 1

Share with group

Steps to reproduce:

  1. Create a LoginCluster federation with 2 instances
  2. Go to the 2nd instance and log in with a non-admin federated user
  3. Create a project
  4. Go to the sharing tab
  5. Share with "all users"

On reloading the sharing tab, it'll get fiddlesticks:

request failed: x1ly6-j7d0g-fffffffffffffff%22%5D%5D%5D&limit=9223372036854775807&offset=0">https://172.17.0.3:8000/arvados/v1/users?cluster_id=&count=&filters=%5B%5B%22uuid%22%2C%22in%22%2C%5B%22x1ly6-j7d0g-fffffffffffffff%22%5D%5D%5D&limit=9223372036854775807&offset=0: 400 Bad Request: cannot execute federated list query unless count=="none" [API: 400]

Workbench 2

Sharing dialog

Fails to open with the similar error:

request failed: x1ly6-j7d0g-fffffffffffffff%22%5D%5D%5D&offset=0">https://172.17.0.3:8000/arvados/v1/users?cluster_id=&count=&filters=%5B%5B%22uuid%22%2C%22in%22%2C%5B%22x1ly6-j7d0g-fffffffffffffff%22%5D%5D%5D&offset=0: 400 Bad Request: cannot execute federated list query unless count=="none"

Sharing with user

Also observed: attempting to share with a user gets a "unknown uuid" error. This is probably due to the fact that although the user can see other users from the login cluster, the permissions are checked on the federated cluster, which doesn't know that. The validity check on the link tail for permission links should probably just suppress raising an error when the cluster id isn't the local one.

(alternately, we could skip the check entirely, this would partially address the previously reported problem of curators being unable to share with groups that they don't have access to, although they would need to add the permission link by uuid, possibly via the command line, which might not be acceptable.)


Subtasks 2 (0 open2 closed)

Task #16691: Review 16683-fed-sharing (arvados)ResolvedPeter Amstutz08/13/2020

Actions
Task #16700: Review 16683-fed-sharing (workbench2)ResolvedPeter Amstutz08/13/2020

Actions

Related issues 3 (0 open3 closed)

Related to Arvados - Bug #16726: other cluster's special users (root and anonymous) can appear in user listResolvedPeter Amstutz08/31/2020

Actions
Is duplicate of Arvados - Bug #16681: Sharing dialog fails with "count must be none" when using LoginCluster featureDuplicate

Actions
Blocks Arvados - Story #16688: Release Arvados 2.0.4ResolvedPeter Amstutz08/14/202008/17/2020

Actions
Actions #1

Updated by Peter Amstutz over 4 years ago

  • Description updated (diff)
Actions #2

Updated by Peter Amstutz over 4 years ago

  • Is duplicate of Bug #16681: Sharing dialog fails with "count must be none" when using LoginCluster feature added
Actions #3

Updated by Peter Amstutz over 4 years ago

  • Status changed from New to Duplicate
Actions #4

Updated by Peter Amstutz over 4 years ago

  • Status changed from Duplicate to New
  • Description updated (diff)
Actions #5

Updated by Peter Amstutz over 4 years ago

Adding

.with_count("none")
.fetch_multiple_pages(false)

seems like it should produce a working query but now it fails this way:

2020-08-11_19:41:34.01115 {"PID":1419,"RequestID":"req-29gqcx0h0hqobxo9efl6","level":"info","msg":"response","remoteAddr":"127.
0.0.1:34258","reqBytes":0,"reqForwardedFor":"172.17.0.3","reqHost":"172.17.0.4:8000","reqMethod":"GET","reqPath":"arvados/v1/us
ers","reqQuery":"cluster_id=\u0026count=none\u0026filters=%5B%5B%22uuid%22%2C%22in%22%2C%5B%22x1ly6-j7d0g-fffffffffffffff%22%5D
%5D%5D\u0026offset=0","respBody":"{\"errors\":[\"Get https://172.17.0.3:8000/arvados/v1/users?cluster_id=\\u0026count=none\\u00
26filters=%5B%5B%22uuid%22%2C%22in%22%2C%5B%22x1ly6-j7d0g-fffffffffffffff%22%5D%5D%5D\\u0026offset=0: EOF\"]}\n","respBytes":196,"respStatus":"Internal Server Error","respStatusCode":500,"time":"2020-08-11T19:41:34.011033813Z","timeToStatus":0.003099,"timeTotal":0.003152,"timeWriteBody":0.000053}

Retries also seem to be doing something weird, is it interpreting the '%' as a format character?

https://172.17.0.3:8000/arvados/v1/users?cluster_id=\\u0026cou
nt=none\\u0026filters=%5B%5B%22uuid%22%2C%22in%22%2C%5B%22x1ly6-j7d0g-fffffffffffffff%22%5D%5D%5D\\u0026offset=0: 502 Bad Gatew
ay: request failed: https://172.17.0.4:8000/arvados/v1/users?cluster_id=\\u0026count=none\\u0026filters=%!B(MISSING)%!B(MISSING
)%!u(MISSING)uid%2C%!i(MISSING)n%2C%!B(MISSING)%!x(MISSING)1ly6-j7d0g-fffffffffffffff%5D%!D(MISSING)%!D(MISSING)\\u0026offset=0
: 500 Internal Server Error: Get https://172.17.0.3:8000/arvados/v1/users?cluster_id=\\u0026count=none\\u0026filters=%!B(MISSIN
G)%!B(MISSING)%!u(MISSING)uid%2C%!i(MISSING)n%2C%!B(MISSING)%!x(MISSING)1ly6-j7d0g-fffffffffffffff%5D%!D(MISSING)%!D(MISSING)\\
u0026offset=0: EOF\

Actions #6

Updated by Peter Amstutz over 4 years ago

Response on LoginCluster

2020-08-11_20:13:41.04890 {"PID":1419,"RequestID":"req-79u0kwp3hsvwgb19cdci","level":"info","msg":"response","remoteAddr":"127.0.0.1:59
428","reqBytes":0,"reqForwardedFor":"172.17.0.3","reqHost":"172.17.0.4:8000","reqMethod":"GET","reqPath":"arvados/v1/users","reqQuery":
"cluster_id=\u0026count=none\u0026filters=%5B%5B%22uuid%22%2C%22in%22%2C%5B%22x1ly6-j7d0g-fffffffffffffff%22%5D%5D%5D\u0026offset=0","r
espBody":"{\"errors\":[\"Get https://172.17.0.3:8000/arvados/v1/users?cluster_id=\\u0026count=none\\u0026filters=%5B%5B%22uuid%22%2C%22
in%22%2C%5B%22x1ly6-j7d0g-fffffffffffffff%22%5D%5D%5D\\u0026offset=0: EOF\"]}\n","respBytes":196,"respStatus":"Internal Server Error","
respStatusCode":500,"time":"2020-08-11T20:13:41.048774508Z","timeToStatus":0.005508,"timeTotal":0.005535,"timeWriteBody":0.000027}

Response on federate:

2020-08-11_20:13:41.04952 {"PID":14215,"RequestID":"req-79u0kwp3hsvwgb19cdci","level":"info","msg":"response","remoteAddr":"127.0.0.1:5
9424","reqBytes":0,"reqForwardedFor":"172.17.0.4","reqHost":"172.17.0.3:8000","reqMethod":"GET","reqPath":"arvados/v1/users","reqQuery"
:"cluster_id=\u0026count=none\u0026filters=%5B%5B%22uuid%22%2C%22in%22%2C%5B%22x1ly6-j7d0g-fffffffffffffff%22%5D%5D%5D\u0026offset=0","
respBody":"{\"errors\":[\"request failed: https://172.17.0.4:8000/arvados/v1/users?cluster_id=\\u0026count=none\\u0026filters=%!B(MISSI
NG)%!B(MISSING)%!u(MISSING)uid%2C%!i(MISSING)n%2C%!B(MISSING)%!x(MISSING)1ly6-j7d0g-fffffffffffffff%5D%!D(MISSING)%!D(MISSING)\\u0026of
fset=0: 500 Internal Server Error: Get https://172.17.0.3:8000/arvados/v1/users?cluster_id=\\u0026count=none\\u0026filters=%!B(MISSING)
!B(MISSING)!u(MISSING)uid%2C%!i(MISSING)n%2C%!B(MISSING)%!x(MISSING)1ly6-j7d0g-fffffffffffffff%5D%!D(MISSING)%!D(MISSING)\\u0026offse
t=0: EOF\"]}\n","respBytes":532,"respStatus":"Bad Gateway","respStatusCode":502,"time":"2020-08-11T20:13:41.049414288Z","timeToStatus":0.015050,"timeTotal":0.015071,"timeWriteBody":0.000021}

Actions #7

Updated by Peter Amstutz over 4 years ago

The %(MISSING) part is just a silly logging bug. Fix pending.

What's really going on:

It's getting stuck in a request loop. The federate ("x1ly6") forwards the user list request to the login cluster ("xrmjz"). The login cluster sees this filter:

[["uuid", "in", "x1ly6-j7d0g-fffffffffffffff"]]

It recognizes it as a uuid and so it forwards the request to the federate. Which forwards the request back to login cluster, which forwards it back to the federate, and so forth until it gives up. (I don't know how it decides to break out of the loop, but at least it doesn't get stuck infinitely).

Issues:

"x1ly6-j7d0g-fffffffffffffff" isn't a user. I think Workbench just simplistically sends the same "uuid in" list for both users and groups (should be harmless, right?). The intended behavior is to share with the "all users" group on the x1ly6 cluster.

Possible fixes:

  • Check the type portion of the uuid
  • Filter "uuid in" before sending the list request to LoginCluster
  • The call chain sets "bypass federation" flag to suppress request propagation.
Actions #8

Updated by Peter Amstutz over 4 years ago

  • Release set to 34
Actions #9

Updated by Peter Amstutz over 4 years ago

Actions #10

Updated by Peter Amstutz over 4 years ago

  • Description updated (diff)
Actions #11

Updated by Peter Amstutz over 4 years ago

  • Description updated (diff)
Actions #12

Updated by Peter Amstutz over 4 years ago

  • Assigned To set to Peter Amstutz
Actions #13

Updated by Peter Amstutz over 4 years ago

  • Status changed from New to In Progress
Actions #16

Updated by Lucas Di Pentima over 4 years ago

Reviewing 877689f

  • File services/api/app/models/link.rb
    • Updates don’t check for cluster federated configuration, do you think it’s convenient to add it to at least avoid some of the copy&paste errors?
  • File services/api/app/models/arvados_model.rb
    • Line 761 & 762: shouldn’t be added only to the Link model?
    • Are we already using links to/from PDHs? I haven’t found documentation about it, maybe it’ll need an update? For example, https://doc.arvados.org/v2.0/api/permission-model.html#links talks about tail_uuid being only from a User or Group.
  • I think having tests for the above changes would be convenient.
  • Ran the fed migration tests on Jenkins and had a failure: arv-federation-migrate-test: #104
Actions #17

Updated by Lucas Di Pentima over 4 years ago

On a second thought: Maybe the fed-migrate tests on Jenkins don't start a federation from the provided commit hash. I was looking to avoid running them locally that usually is slower.

Actions #18

Updated by Peter Amstutz over 4 years ago

Lucas Di Pentima wrote:

Reviewing 877689f

  • File services/api/app/models/link.rb
    • Updates don’t check for cluster federated configuration, do you think it’s convenient to add it to at least avoid some of the copy&paste errors?

Added a call to ApiClientAuthorization.remote_host to check for a valid cluster id.

  • File services/api/app/models/arvados_model.rb
    • Line 761 & 762: shouldn’t be added only to the Link model?
    • Are we already using links to/from PDHs? I haven’t found documentation about it, maybe it’ll need an update? For example, https://doc.arvados.org/v2.0/api/permission-model.html#links talks about tail_uuid being only from a User or Group.

I removed the check since it was functionally useless and didn't exist in the old code.

As it stands, if something doesn't fit the uuid format, it just won't be checked, so it doesn't matter if it is a PDH. That said there are some legacy links classes that use PDH, like 'docker_image_migration'.

  • I think having tests for the above changes would be convenient.

Added a test.

We discussed this at standup, the jenkins test could (and probably should) do this, but it requires some fiddling, and may come at the expense of the job taking longer. I don't know if we want to block on it.

16683-fed-sharing @ 78444f2fb480801787e486d4b65198d72ab4fe15

Actions #19

Updated by Lucas Di Pentima over 4 years ago

This (78444f2fb480801787e486d4b65198d72ab4fe15) LGTM, thanks!!
Will take a look at wb2.

Actions #20

Updated by Lucas Di Pentima over 4 years ago

Reviewing arvados-workbench2|63e2f0e

WB2 still fails when re-opening the share dialog after being shared with all. The following fixes the problem:

diff --git a/src/views-components/sharing-dialog/participant-select.tsx b/src/views-components/sharing-dialog/participant-select.tsx
index 5d062da0..5260743f 100644
--- a/src/views-components/sharing-dialog/participant-select.tsx
+++ b/src/views-components/sharing-dialog/participant-select.tsx
@@ -131,14 +131,14 @@ export const ParticipantSelect = connect()(
             const filterUsers = new FilterBuilder()
                 .addILike('any', value)
                 .getFilters();
-            const userItems: ListResults<any> = await userService.list({ filters: filterUsers, limit });
+            const userItems: ListResults<any> = await userService.list({ filters: filterUsers, limit, count: "none" });

             const filterGroups = new FilterBuilder()
                 .addNotIn('group_class', [GroupClass.PROJECT])
                 .addILike('name', value)
                 .getFilters();

-            const groupItems: ListResults<any> = await groupsService.list({ filters: filterGroups, limit });
+            const groupItems: ListResults<any> = await groupsService.list({ filters: filterGroups, limit, count: "none" });
             this.setState({
                 suggestions: this.props.onlyPeople
                     ? userItems.items

With that, it LGTM.

Actions #21

Updated by Peter Amstutz over 4 years ago

Noticed during testing that "share public" doesn't work -- will fix

Actions #22

Updated by Peter Amstutz over 4 years ago

I incorporated your fix and did a little more manual testing. I noticed the "share public" option failed, so I fixed that. I added {count: "none"} in a few more places.

16683-fed-sharing arvados-workbench2|b1250d13a43d4351c6cbca7990c996b3219693d0

developer-tests-workbench2: #73

Actions #23

Updated by Lucas Di Pentima over 4 years ago

Did some manual testing against ce8i5 and all seems to work correctly, looks good to merge, thanks!

Actions #24

Updated by Peter Amstutz over 4 years ago

  • Status changed from In Progress to Resolved
Actions #25

Updated by Peter Amstutz over 4 years ago

We set up a configuration with jutro (LoginCluster) and pirca (federate).

On pirca, the sharing dialog works. This is the case we tested for.

On jutro, the sharing dialog fails:

request failed: pirca-tpzed-anonymouspublic%22%5D%5D%5D&forwarded_for=jutro-&offset=0&reader_tokens=%5B%22v2%2Fjutro-gj3su-e2o9x84aeg7q005%2Fc23929fe5eb184a81e0afda2cff8aed9d10e9ed2%22%5D">https://pirca.arvadosapi.com/arvados/v1/users?cluster_id=&count=none&filters=%5B%5B%22uuid%22%2C%22in%22%2C%5B%22pirca-tpzed-anonymouspublic%22%5D%5D%5D&forwarded_for=jutro-&offset=0&reader_tokens=%5B%22v2%2Fjutro-gj3su-e2o9x84aeg7q005%2Fc23929fe5eb184a81e0afda2cff8aed9d10e9ed2%22%5D: 401 Unauthorized: request failed: pirca-tpzed-anonymouspublic%22%5D%5D%5D&forwarded_for=jutro-&offset=0&reader_tokens=%5B%22v2%2Fjutro-gj3su-e2o9x84aeg7q005%2Fc23929fe5eb184a81e0afda2cff8aed9d10e9ed2%22%5D">https://jutro.arvadosapi.com/arvados/v1/users?cluster_id=&count=none&filters=%5B%5B%22uuid%22%2C%22in%22%2C%5B%22pirca-tpzed-anonymouspublic%22%5D%5D%5D&forwarded_for=jutro-&offset=0&reader_tokens=%5B%22v2%2Fjutro-gj3su-e2o9x84aeg7q005%2Fc23929fe5eb184a81e0afda2cff8aed9d10e9ed2%22%5D: 401 Unauthorized: request failed: pirca-tpzed-anonymouspublic%22%5D%5D%5D&forwarded_for=jutro-jutro-&offset=0&reader_tokens=%5B%22v2%2Fjutro-gj3su-e2o9x84aeg7q005%2Fc23929fe5eb184a81e0afda2cff8aed9d10e9ed2%22%5D">http://localhost:8004/arvados/v1/users?cluster_id=&count=none&filters=%5B%5B%22uuid%22%2C%22in%22%2C%5B%22pirca-tpzed-anonymouspublic%22%5D%5D%5D&forwarded_for=jutro-jutro-&offset=0&reader_tokens=%5B%22v2%2Fjutro-gj3su-e2o9x84aeg7q005%2Fc23929fe5eb184a81e0afda2cff8aed9d10e9ed2%22%5D: 401 Unauthorized: Not logged in (req-datocp8oyu00b24wspew) [API: 502]

On further research:

It is failing because it was mistakenly shared with the pirca anonymous user (pirca-tpzed-anonymouspublic). This causes some weird federation queries that don't work.

This happened because on jutro, there are two anonymous users to choose from, the jutro one and the pirca one. That shouldn't happen, and it is very confusing.

a) I'm not sure how the pirca one ended up on jutro. Maybe accessing a federated collection as the anonymous user, starting from pirca it used pirca's anonymous user credentials to access the collection on jutro?

b) I can't delete the user from jutro using the API, because controller redirects "user delete" API calls to the cluster that owns the user (pirca, of course). I tried to delete the user on jutro and I'm pretty sure I deleted pirca's anonymous user instead. So now the anonymous user on pirca is gone.

c) When LoginCluster is set, every cluster should use the LoginCluster's anonymous user and token.

Actions #26

Updated by Peter Amstutz over 4 years ago

  • Status changed from Resolved to Feedback
Actions #27

Updated by Javier Bértoli over 4 years ago

Applied the changes for the anonymous user in {pirca,jutro} through salt (commit b7de27c@saltstack). We have the automated runs disabled atm, as we are doing some manual changes in the hosts, so deployed with these steps in the salt master:

cd /data/saltstack/pillarstack && gut pull
salt '*jutro*' saltutil.refresh_pillar
salt '*jutro*' state.apply custom_files,packages,services,exim 2>&1 |tee /tmp/salida_jutro
salt '*pirca*' saltutil.refresh_pillar
salt '*pirca*' state.apply custom_files,packages,services,exim 2>&1 |tee /tmp/salida_pirca
Actions #28

Updated by Peter Amstutz over 4 years ago

  • Related to Bug #16726: other cluster's special users (root and anonymous) can appear in user list added
Actions #29

Updated by Peter Amstutz over 4 years ago

  • Status changed from Feedback to Resolved
Actions

Also available in: Atom PDF