Project

General

Profile

Actions

Feature #22819

closed

arvados-cwl-runner can use new credential API to get AWS credentials to use to download from S3

Added by Peter Amstutz 11 months ago. Updated 10 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Peter Amstutz
Category:
CWL
Target version:
Story points:
-
Release relationship:
Auto

Description

The minimal version of this is some version of either/both:

User has to provide the name or UUID of the credential to use on the command line

If the user doesn't have any (local) credentials available and/or provides a different command line switch, check Arvados for an AWS credential, and if so, try to use that.


Subtasks 1 (0 open1 closed)

Task #22874: Review 22819-cwl-use-s3-credsResolvedLucas Di Pentima06/02/2025Actions

Related issues 1 (0 open1 closed)

Blocked by Arvados - Feature #22680: store persistent credentials to external resources and provide them to a containerResolvedPeter AmstutzActions
Actions #1

Updated by Peter Amstutz 11 months ago

  • Position changed from -950302 to -950295
Actions #2

Updated by Peter Amstutz 11 months ago

  • Blocked by Feature #22680: store persistent credentials to external resources and provide them to a container added
Actions #3

Updated by Peter Amstutz 11 months ago

  • Description updated (diff)
Actions #4

Updated by Peter Amstutz 11 months ago

  • Description updated (diff)
Actions #5

Updated by Peter Amstutz 11 months ago

  • Target version changed from Development 2025-05-14 to Development 2025-05-28
Actions #6

Updated by Peter Amstutz 11 months ago

  • Release set to 79
Actions #7

Updated by Peter Amstutz 10 months ago

  • Assigned To set to Peter Amstutz
Actions #8

Updated by Peter Amstutz 10 months ago

  • Subtask #22874 added
Actions #9

Updated by Peter Amstutz 10 months ago

  • Status changed from New to In Progress
Actions #10

Updated by Peter Amstutz 10 months ago

22819-cwl-use-s3-creds @ 4bbf9377d656586eeceac886a127a39aab9959a3

developer-run-tests: #4789

  • All agreed upon points are implemented / addressed. Describe changes from pre-implementation design.
    • Now uses the credentials API to try and find AWS credentials to use when there are s3:// URLs present in the input
  • Anything not implemented (discovered or discussed during work) has a follow-up story.
    • The main outstanding to-do is UI support for managing external credentials: #22820
  • Code is tested and passing, both automated and manual, what manual testing was done is described.
  • New or changed UX/UX and has gotten feedback from stakeholders.
    • There is a new --use-credential command line option, so by our process that might merit a separate UX discusion.
  • Documentation has been updated.
    • Updated "arvados-cwl-runner options" and "Guidelines for Writing High-Performance Portable Workflows" pages
  • Behaves appropriately at the intended scale (describe intended scale).
    • Checking and fetching credentials is a minimal operation. Downloading data is not really any different from existing HTTP transfer feature.
  • Considered backwards and forwards compatibility issues between client and server.
    • Checks for the presence of the credential API and raises an error if it isn't available; error message offers --enable-aws-credential-capture to use newer a-c-r with older API servers.
  • Follows our coding standards and GUI style guidelines.
    • yes

The default behavior is still to try to download from S3 before submitting the workflow. However, if S3 credentials are not available locally, it will recommend using --defer-downloads.

The original (unreleased) S3 download branch made --enable-aws-credential-capture the default option. That is no longer the case (i.e. credential capture is no longer the default behavior). The behavior now preferentially tries to use the credentials API, and only recommends --enable-aws-credential-capture if no AWS credentials can be found.

The new flow is like this:

  1. When we see a s3 URL during initial workflow submission processing, look for a credential that matches the S3 bucket in scopes, or has a blank scopes
    1. If we find exactly one credential, that is passed along using --use-credential when creating the workflow submission
    2. If we find multiple, we kick it back to the user and have them tell us which one to use
    3. If we find none, tell them to use --enable-aws-credential-capture
  2. When we're running inside the container and ready to download stuff, use make the API call to get the secret key associated with the credential that was selected earlier
  3. Now we can download each S3 object to Keep as normal

Couple notes:

  • Selects a single credential and uses it everywhere. While this basic design could certainly support different credentials for different resources, it would make the implementation more complicated and it isn't clear that it is a requirement.
  • The test coverage isn't great. A full integration test is hard because downloading from a live s3 bucket that requires credentials is impractical for many reasons and even a stub S3 service isn't ideal. I might see about doing some targeted testing of PathMapper with the boto3 stuff mocked out just so we can catch routine errors (like misspelled variable names).
Actions #11

Updated by Lucas Di Pentima 10 months ago

Some comments & questions:

  • Log message from the example container: "S3 downloads will use access key id XXXX in region None" -- this might be a sign of missing data?
  • It may also benefit of logging the credentials' UUID for potential debugging needs?
  • File sdk/cwl/arvados_cwl/arvcontainer.py
    • Lines 52 & 60: Not sure if a list call includes the expired credentials. If that's the case, I think it would be useful to filter out expired credentials on searches by bucket.
  • Is there a reason to have resolve_aws_key() code duplicated between arvcontainer.py and pathmapper.py? The latter does an extra check but I think we could still DRY to keep code slimmer.
    • Related to this, the arvcontainer.py version of resolve_aws_key() does not look for blank scopes like it does in pathmapper.py, not sure if this is on purpose.
  • With this change, are we still supporting S3 downloads with no credentials?
Actions #12

Updated by Peter Amstutz 10 months ago

  • Target version changed from Development 2025-05-28 to Development 2025-06-25
Actions #13

Updated by Peter Amstutz 10 months ago

Lucas Di Pentima wrote in #note-11:

Some comments & questions:

  • Log message from the example container: "S3 downloads will use access key id XXXX in region None" -- this might be a sign of missing data?

I took the region out of the log statement. From a few minutes of googling I wasn't able to get a satisfactory answer about whether AWS credentials are tied to a specific region, but I suspect that if they were, it would be much more clearly documented.

  • It may also benefit of logging the credentials' UUID for potential debugging needs?

Done.

  • File sdk/cwl/arvados_cwl/arvcontainer.py
    • Lines 52 & 60: Not sure if a list call includes the expired credentials. If that's the case, I think it would be useful to filter out expired credentials on searches by bucket.

Added a filter to avoid credentials that have expired or will expire in the next 5 minutes.

  • Is there a reason to have resolve_aws_key() code duplicated between arvcontainer.py and pathmapper.py? The latter does an extra check but I think we could still DRY to keep code slimmer.

No, that was a mistake, it wasn't even being used there, thank you for catching it.

  • With this change, are we still supporting S3 downloads with no credentials?

It actually requires a little extra work, because you have to specifically make "unsigned" requests. I added a new flag --s3-public-bucket.

22819-cwl-use-s3-creds @ 40a315bdb7460df457a66fe9e0a9c690290dede8

developer-run-tests: #4799

Actions #14

Updated by Lucas Di Pentima 10 months ago

The changes LGTM, but there's a sdk/cwl test failing that might to be related changes made in commit 8fc423db1f (util.py file?)

Actions #15

Updated by Peter Amstutz 10 months ago

Lucas Di Pentima wrote in #note-14:

The changes LGTM, but there's a sdk/cwl test failing that might to be related changes made in commit 8fc423db1f (util.py file?)

Ah, yea, it was yelling at me about utcnow() being deprecated, so I replaced it with now(timezone.UTC) but I didn't update one of the mocks.

22f74019009cac291977bf3b8ed7dd8dee6a040a

developer-run-tests: #4800

Actions #16

Updated by Lucas Di Pentima 10 months ago

LGTM, thanks!

Actions #17

Updated by Peter Amstutz 10 months ago

  • Status changed from In Progress to Resolved
Actions

Also available in: Atom PDF