Project

General

Profile

Actions

Bug #23227

closed

In a CWL tool description, when enableReuse is an expression, it is not evaluated

Added by Zoë Ma 5 months ago. Updated 5 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
CWL
Target version:
Story points:
-

Description

Background: https://www.commonwl.org/v1.2/CommandLineTool.html#WorkReuse
According to a straight reading of this specification, the enableReuse field can take an expression that determines whether the CWL tool should be allowed to reuse previous runs.

An experiment shows that Arvados doesn't seem to evaluate the expression if given as the value of enableReuse.

The files and test runs are in the project here: https://workbench.tordo.arvadosapi.com/projects/tordo-j7d0g-13o964wdf1tyq8l

Explanation: I made a simple CLI script that does this: if given a CLI argument, echo it back; if no argument is given, output a random integer to the stdout.

The intention is that when the script is called without arguments, the behavior is non-deterministic and depends on external state, so it shouldn't be reused. To express this, I encoded the idea in the CWL file:

requirements:                                                                   
  WorkReuse:                                                                    
    enableReuse: $(inputs.message !== null)                

(Here message is the CWL input name for the CLI argument to the script.)

Expected behavior: when running with Arvados, as long as the input message is null, the processes should not be reused, despite having the same input and the same CWL content and same Docker image, because enableReuse evaluates to false.

Actual behavior: the second time the process is run, with the input message null, the subprocess from a previous run with the same null input is reused; it is as if enableReuse were true or ignored. See https://workbench.tordo.arvadosapi.com/processes/tordo-xvhdp-r9rhqzdece93gng

Further explanation: To run the CWL tool locally, download the collection https://workbench.tordo.arvadosapi.com/collections/tordo-4zz18-erdwgjd87ngbexk and build the script's Docker image with the hard-coded tag 'test/simpletool':

$ docker build -t test/simpletool .

and then run/register the CWL tool with arvados-cwl-runner tool-docker.cwl


Subtasks 1 (0 open1 closed)

Task #23252: Review 23227-enablereuse-expressionResolvedBrett Smith10/29/2025Actions
Actions #1

Updated by Zoë Ma 5 months ago

This issue was discovered when we worked on #23214#note-10

Actions #2

Updated by Brett Smith 5 months ago

  • Target version set to Development 2025-11-12
  • Assigned To set to Zoë Ma
  • Status changed from New to In Progress
Actions #3

Updated by Zoë Ma 5 months ago

23227-enablereuse-expression @ 056008987c8e9fd51c93978d0efd11bcde76d262

  • All agreed upon points are implemented / addressed. Describe changes from pre-implementation design.
    • The implementation now evals the expression value of the enableReuse field of the WorkReuse requirement (CWL v1.1 standardized). For legacy arv:ReuseRequirement, the enableReuse field is not evaluated, as specified in the extension schema file. If both requirements are present, the standardized WorkReuse one takes precedence. A parameterized test is added for a simple CWL tool description that has enableReuse set based on input.
  • Anything not implemented (discovered or discussed during work) has a follow-up story.
    • N/A
  • Code is tested and passing, both automated and manual, what manual testing was done is described.
    • All tests passing, including the newly added one.
  • The tested code incorporates recent main branch changes.
    • Yes
  • New or changed UI/UX has gotten feedback from stakeholders.
    • Pending - we may go back to #23214 which is the original use case that prompted this bug.
  • Documentation has been updated.
    • Yes (in Arvados CWL Extensions, mention the standardized WorkReuse can take an expression as its enableReuse field value)
  • Behaves appropriately at the intended scale (describe intended scale).
    • In theory this may add an eval overhead to any the container request, but IRL I'd expect most of expressions that goes into enableReuse to be simple. The vast majority of cases already give it a bool constant value for which eval is not necessary (and not performed).
  • Considered backwards and forwards compatibility issues between client and server.
    • N/A (please comment?)
  • Follows our coding standards and GUI style guidelines.
    • New code follows PEP-8
Actions #4

Updated by Brett Smith 5 months ago

  • Subtask #23252 added
Actions #5

Updated by Brett Smith 5 months ago

Zoë Ma wrote in #note-3:

23227-enablereuse-expression @ 056008987c8e9fd51c93978d0efd11bcde76d262

My only comment is that the tests seem like they could be simplified a bit to use a single variable, the should_reuse boolean. From there, the input shouldEnableReuse can be str(should_reuse), and the reuse expression can check $(inputs.shouldEnableReuse === "True"). This avoids the unused variable and reduces the amount of stuff that needs to kept in sync between the test cases and the CWL tool definition.

If that change sounds good to you too, please make it and I'll merge it. Thanks.

Actions #6

Updated by Zoë Ma 5 months ago

Thank you; this is done in f801f6ca02e2f55f3be41ed379b0a41f37a011bc

Actions #7

Updated by Zoë Ma 5 months ago

(sorry last minute code formatting fix in 3120c998ca6150c5058061e526d8b17a299c8090)

Actions #8

Updated by Brett Smith 5 months ago

  • Status changed from In Progress to Resolved
Actions

Also available in: Atom PDF