Bug #19571
Closed: arvados-cwl-runner scattering bug
Description
Hi all,
The attached simple CWL workflow and inputs file produce the following error when run with arvados-cwl-runner:
$ arvados-cwl-runner workflow-fixed.cwl cwl.inputs.json
INFO /usr/bin/arvados-cwl-runner 2.4.2, arvados-python-client 2.4.2, cwltool 3.1.20220623174452
INFO Resolved 'workflow-fixed.cwl' to 'file:///home/tom/temp/crunch-failure/workflow.cwl'
INFO Using cluster xxxxx (https://xxxxx.yyyyy.com/)
ERROR Input object failed validation:
identifier field '['string one', 'second string', 'three three three']' must be a string
This was surprising given that this workflow works just fine with cwltool (3.1.20220802125926).
We suspect that this may have to do with us calling the same scattered step twice, but we are unsure exactly why.
Updated by Peter Amstutz about 2 years ago
- Target version set to 2022-10-12 sprint
Updated by Peter Amstutz about 2 years ago
- Target version changed from 2022-10-12 sprint to 2022-10-26 sprint
Updated by Peter Amstutz about 2 years ago
- Target version changed from 2022-10-26 sprint to 2022-11-09 sprint
Updated by Peter Amstutz about 2 years ago
- Related to Bug #19678: arvados-cwl-runner: id name must be a string added
Updated by Peter Amstutz about 2 years ago
- Related to deleted (Bug #19678: arvados-cwl-runner: id name must be a string)
Updated by Peter Amstutz about 2 years ago
- Has duplicate Bug #19678: arvados-cwl-runner: id name must be a string added
Updated by Peter Amstutz about 2 years ago
I think this is actually the same bug that was reported again in #19678: having an input parameter named 'name' runs into trouble.
Updated by Joshua Randall about 2 years ago
Having an input parameter with an id of 'name' does work as long as the type is "string" - the problem occurs when the type is anything other than "string".
Note also that this does not only apply to input parameters but to field names as well.
i.e. this is valid:
'''
type:
  type: record
  fields:
    - name: name
      type: string
'''
but this is rejected:
'''
type:
  type: record
  fields:
    - name: name
      type: boolean
'''
We have confirmed that simply removing the two places in schema_salad where it explicitly throws a ValidationException when it finds a value for a name field that is not of type str solves this problem, and arvados-cwl-runner still works for all of our test cases. We have not yet investigated why this problem appears to be specific to arvados-cwl-runner and is not an issue for cwltool. That seems a bit surprising since they both use schema_salad, but perhaps they are invoking it in a different way?
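To illustrate the kind of check described above, here is a minimal sketch in Python. It is not schema_salad's actual code: the function `resolve_identifiers`, its `identifier_keys` parameter, and the local `ValidationException` class are all hypothetical stand-ins. It assumes a loader that treats any key called 'name' as an identifier and insists on a string value, which would explain why a 'name' input of type string is accepted while a list or boolean value is rejected.

```python
# Hypothetical, simplified stand-in for an identifier check.
# This is NOT the schema_salad implementation; all names here are assumptions.


class ValidationException(Exception):
    """Local stand-in for the exception type mentioned in the comment above."""


def resolve_identifiers(doc, identifier_keys=("name",)):
    """Walk a nested dict/list and reject non-string values under identifier keys."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            if key in identifier_keys and not isinstance(value, str):
                # Mirrors the wording of the error reported in this ticket.
                raise ValidationException(
                    "identifier field '%s' must be a string" % (value,)
                )
            resolve_identifiers(value, identifier_keys)
    elif isinstance(doc, list):
        for item in doc:
            resolve_identifiers(item, identifier_keys)


# An input called 'name' with a string value passes the check.
resolve_identifiers({"name": "a string"})

# The same key with a list value (as in a scattered input) is rejected.
try:
    resolve_identifiers(
        {"name": ["string one", "second string", "three three three"]}
    )
except ValidationException as err:
    print("ERROR", err)
```

Under this assumption, removing the raise would let such values through, which is consistent with the observation above, although it says nothing about why cwltool does not hit the same path.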
Updated by Joshua Randall about 2 years ago
- File 19571-ref_resolver.patch added
For our use-cases the attached patch (simply removing the two raise statements) completely solves this issue.
Updated by Peter Amstutz about 2 years ago
Hey Josh! Good to hear from you.
The way that arvados-cwl-runner reads the workflow, packs it, and re-reads the packed version has occasionally turned up bugs on the second trip through the sausage machine (parsing and loading). I'll give this a look.
Updated by Peter Amstutz about 2 years ago
19678-job-loader @ e2267bd99209651c61425f335230e515421b2ef4
- Fix for parameters called 'name'
- Also fix regression involving default file references appearing in nested processes (inline declaration of a tool within a workflow).
- Also fixed some dependency issues preventing arvados/jobs developer image from working.
Updated by Peter Amstutz about 2 years ago
- Status changed from New to In Progress
Updated by Peter Amstutz about 2 years ago
- Status changed from In Progress to Resolved