Feature #22819 (closed)
arvados-cwl-runner can use new credential API to get AWS credentials to use to download from S3
Description
The minimal version of this is some version of either/both:
- User has to provide the name or UUID of the credential to use on the command line
- If the user doesn't have any (local) credentials available and/or provides a different command line switch, check Arvados for an AWS credential, and if so, try to use that.
Updated by Peter Amstutz 11 months ago
- Blocked by Feature #22680: store persistent credentials to external resources and provide them to a container added
Updated by Peter Amstutz 10 months ago
22819-cwl-use-s3-creds @ 4bbf9377d656586eeceac886a127a39aab9959a3
- All agreed upon points are implemented / addressed. Describe changes from pre-implementation design.
- Now uses the credentials API to try to find AWS credentials to use when there are `s3://` URLs present in the input
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- The main outstanding to-do is UI support for managing external credentials: #22820
- Code is tested and passing, both automated and manual, what manual testing was done is described.
  - Updated `test_submit_defer_s3_download_no_credential_capture` to include calling the credential API
  - Manual testing on tordo: added credentials via the `arv` command line, then submitted a workflow. See https://workbench.tordo.arvadosapi.com/processes/tordo-xvhdp-8nk7e6f4955rxpx
- New or changed UI/UX has gotten feedback from stakeholders.
  - There is a new `--use-credential` command line option, so by our process that might merit a separate UX discussion.
- Documentation has been updated.
- Updated "arvados-cwl-runner options" and "Guidelines for Writing High-Performance Portable Workflows" pages
- Behaves appropriately at the intended scale (describe intended scale).
- Checking and fetching credentials is a minimal operation. Downloading data is not really any different from existing HTTP transfer feature.
- Considered backwards and forwards compatibility issues between client and server.
  - Checks for the presence of the credential API and raises an error if it isn't available; the error message offers `--enable-aws-credential-capture` to use newer a-c-r with older API servers.
- Follows our coding standards and GUI style guidelines.
- yes
The default behavior is still to try to download from S3 before submitting the workflow. However, if S3 credentials are not available locally, it will recommend using --defer-downloads.
The original (unreleased) S3 download branch made --enable-aws-credential-capture the default option. That is no longer the case: the behavior now preferentially tries to use the credentials API, and only recommends --enable-aws-credential-capture if no AWS credentials can be found.
The new flow is like this:
- When we see an s3:// URL during initial workflow submission processing, look for a credential that matches the S3 bucket in `scopes`, or has a blank `scopes`
  - If we find exactly one credential, it is passed along using --use-credential when creating the workflow submission
  - If we find multiple, we kick it back to the user and have them tell us which one to use
  - If we find none, tell them to use --enable-aws-credential-capture
- When we're running inside the container and ready to download, make the API call to get the secret key associated with the credential that was selected earlier
- Now we can download each S3 object to Keep as normal
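The bucket-matching step above can be sketched as a small selection function. This is an illustrative sketch, not the actual arvados-cwl-runner code: the credential record shape (a dict with `uuid` and `scopes` fields) is assumed from the standard Arvados record layout, and the error messages just mirror the one/many/none cases described in the flow.

```python
def select_credential(credentials, bucket):
    """Pick the credential to use for s3:// URLs in the given bucket.

    Returns the single matching credential; raises RuntimeError for the
    multiple-match and no-match cases, mirroring the flow above.
    """
    scope = "s3://" + bucket
    # A credential matches if its scopes include the bucket, or if its
    # scopes list is blank (meaning it is not restricted to any bucket).
    matches = [c for c in credentials
               if not c.get("scopes") or scope in c["scopes"]]
    if len(matches) == 1:
        return matches[0]
    if len(matches) > 1:
        raise RuntimeError(
            "Multiple credentials match bucket %r; use --use-credential "
            "to select one of: %s"
            % (bucket, ", ".join(c["uuid"] for c in matches)))
    raise RuntimeError(
        "No credentials found for bucket %r; consider "
        "--enable-aws-credential-capture" % bucket)
```

In the real flow, the candidate list would come from a credentials API `list` call before submission; only the selected credential's UUID travels with the container request, and the secret is fetched inside the container.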
A couple of notes:
- Selects a single credential and uses it everywhere. While this basic design could certainly support different credentials for different resources, it would make the implementation more complicated and it isn't clear that it is a requirement.
- The test coverage isn't great. A full integration test is hard because downloading from a live s3 bucket that requires credentials is impractical for many reasons and even a stub S3 service isn't ideal. I might see about doing some targeted testing of PathMapper with the boto3 stuff mocked out just so we can catch routine errors (like misspelled variable names).
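The kind of targeted test mentioned above might look like the following. This is only a sketch of the mocking approach: `download_s3_object` is a hypothetical stand-in for the real PathMapper download logic, and the point is that a `Mock` in place of the boto3 client lets the code path run without a live bucket, catching routine errors like misspelled names.

```python
import io
from unittest import mock

def download_s3_object(s3_client, url):
    # Parse "s3://bucket/key" and fetch the object body via the
    # (mocked) boto3-style client.
    bucket, _, key = url[len("s3://"):].partition("/")
    resp = s3_client.get_object(Bucket=bucket, Key=key)
    return resp["Body"].read()

def test_download_s3_object():
    client = mock.Mock()
    client.get_object.return_value = {"Body": io.BytesIO(b"hello")}
    assert download_s3_object(client, "s3://mybucket/path/file.txt") == b"hello"
    # Verify the URL was split into bucket and key correctly.
    client.get_object.assert_called_once_with(Bucket="mybucket", Key="path/file.txt")
```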
Updated by Lucas Di Pentima 10 months ago
Some comments & questions:
- Log message from the example container: "S3 downloads will use access key id XXXX in region None" -- this might be a sign of missing data?
- It might also benefit from logging the credential's UUID for potential debugging needs?
- File `sdk/cwl/arvados_cwl/arvcontainer.py`
  - Lines 52 & 60: Not sure if a list call includes the expired credentials. If that's the case, I think it would be useful to filter out expired credentials on searches by bucket.
- Is there a reason to have `resolve_aws_key()` code duplicated between `arvcontainer.py` and `pathmapper.py`? The latter does an extra check but I think we could still DRY to keep code slimmer.
  - Related to this, the `arvcontainer.py` version of `resolve_aws_key()` does not look for blank scopes like it does in `pathmapper.py`; not sure if this is on purpose.
- With this change, are we still supporting S3 downloads with no credentials?
Updated by Peter Amstutz 10 months ago
Lucas Di Pentima wrote in #note-11:
Some comments & questions:
- Log message from the example container: "S3 downloads will use access key id XXXX in region None" -- this might be a sign of missing data?
I took the region out of the log statement. From a few minutes of googling I wasn't able to get a satisfactory answer about whether AWS credentials are tied to a specific region, but I suspect that if they were, it would be much more clearly documented.
- It may also benefit of logging the credentials' UUID for potential debugging needs?
Done.
- File `sdk/cwl/arvados_cwl/arvcontainer.py`
  - Lines 52 & 60: Not sure if a list call includes the expired credentials. If that's the case, I think it would be useful to filter out expired credentials on searches by bucket.
Added a filter to avoid credentials that have expired or will expire in the next 5 minutes.
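A filter like the one described could be built as below. This is a sketch, assuming the new credentials resource accepts the standard Arvados list-API filter syntax and an `expires_at` timestamp field; the helper name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def expiry_filter(margin=timedelta(minutes=5)):
    """Build an Arvados list-API filter excluding credentials that have
    expired or will expire within the given margin."""
    cutoff = (datetime.now(timezone.utc) + margin).strftime("%Y-%m-%dT%H:%M:%SZ")
    return [["expires_at", ">", cutoff]]

# Hypothetical usage against the credentials API:
#   api.credentials().list(filters=expiry_filter()).execute()
```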
- Is there a reason to have `resolve_aws_key()` code duplicated between `arvcontainer.py` and `pathmapper.py`? The latter does an extra check but I think we could still DRY to keep code slimmer.
No, that was a mistake, it wasn't even being used there, thank you for catching it.
- With this change, are we still supporting S3 downloads with no credentials?
It actually requires a little extra work, because you have to specifically make "unsigned" requests. I added a new flag --s3-public-bucket.
22819-cwl-use-s3-creds @ 40a315bdb7460df457a66fe9e0a9c690290dede8
Updated by Lucas Di Pentima 10 months ago
The changes LGTM, but there's a sdk/cwl test failing that might be related to changes made in commit 8fc423db1f (util.py file?)
Updated by Peter Amstutz 10 months ago
Lucas Di Pentima wrote in #note-14:
- The changes LGTM, but there's a `sdk/cwl` test failing that might be related to changes made in commit 8fc423db1f (util.py file?)
Ah, yeah, it was yelling at me about utcnow() being deprecated, so I replaced it with now(timezone.utc), but I didn't update one of the mocks.
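For reference, the deprecation fix in question is the standard one: `datetime.utcnow()` (deprecated since Python 3.12) returns a naive datetime, while the replacement returns a timezone-aware one, so any mocks or comparisons built around naive datetimes need updating too.

```python
from datetime import datetime, timezone

def utc_now():
    # Replacement for the deprecated datetime.utcnow(): returns an
    # aware datetime whose tzinfo is UTC, rather than a naive one.
    return datetime.now(timezone.utc)
```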