Bug #19571

closed

arvados-cwl-runner scattering bug

Added by Tom Schoonjans about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: CWL
Target version:
Start date: 10/28/2022
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: -

Description

Hi all,

The attached simple CWL workflow and inputs file return the following error when run with arvados-cwl-runner:

$ arvados-cwl-runner workflow-fixed.cwl cwl.inputs.json
INFO /usr/bin/arvados-cwl-runner 2.4.2, arvados-python-client 2.4.2, cwltool 3.1.20220623174452
INFO Resolved 'workflow-fixed.cwl' to 'file:///home/tom/temp/crunch-failure/workflow.cwl'
INFO Using cluster xxxxx (https://xxxxx.yyyyy.com/)
ERROR Input object failed validation:
identifier field '['string one', 'second string', 'three three three']' must be a string

This was surprising given that this workflow works just fine with cwltool (3.1.20220802125926).

We suspect that this may have to do with us calling the same scattered step twice, but we are unsure exactly why.


Files

cwl.inputs.json (60 Bytes), Tom Schoonjans, 09/23/2022 12:34 PM
workflow-simplified.cwl (1.44 KB), Tom Schoonjans, 09/23/2022 12:34 PM
19571-ref_resolver.patch (1.1 KB), Joshua Randall, 10/28/2022 04:08 PM

Subtasks 1 (0 open, 1 closed)

Task #19611: Review 19678-job-loader, Resolved, Peter Amstutz, 10/28/2022


Related issues 1 (0 open, 1 closed)

Has duplicate Arvados - Bug #19678: arvados-cwl-runner: id name must be a string, Resolved, Peter Amstutz

Actions #1

Updated by Peter Amstutz about 2 years ago

  • Target version set to 2022-10-12 sprint
Actions #2

Updated by Peter Amstutz about 2 years ago

  • Target version changed from 2022-10-12 sprint to 2022-10-26 sprint
Actions #3

Updated by Peter Amstutz about 2 years ago

  • Assigned To set to Peter Amstutz
Actions #4

Updated by Peter Amstutz about 2 years ago

  • Category set to CWL
Actions #5

Updated by Peter Amstutz about 2 years ago

  • Target version changed from 2022-10-26 sprint to 2022-11-09 sprint
Actions #6

Updated by Peter Amstutz about 2 years ago

  • Related to Bug #19678: arvados-cwl-runner: id name must be a string added
Actions #7

Updated by Peter Amstutz about 2 years ago

  • Related to deleted (Bug #19678: arvados-cwl-runner: id name must be a string)
Actions #8

Updated by Peter Amstutz about 2 years ago

  • Has duplicate Bug #19678: arvados-cwl-runner: id name must be a string added
Actions #9

Updated by Peter Amstutz about 2 years ago

I think this is actually the same bug that was reported again in #19678: having an input parameter named 'name' runs into trouble.

Actions #10

Updated by Joshua Randall about 2 years ago

Having an input parameter with an id of 'name' does work as long as the type is "string" - the problem occurs when the type is anything other than "string".

Note also that this does not only apply to input parameters but to field names as well.

i.e. this is valid:

'''
type:
  type: record
  fields:
    - name: name
      type: string
'''

but this is rejected:
'''
type:
  type: record
  fields:
    - name: name
      type: boolean
'''

We have confirmed that simply removing the two places in schema_salad where it explicitly throws a ValidationException when it finds a value for a name field that is not of type str solves this problem, and arvados-cwl-runner still works for all of our test cases. We have not yet investigated why this problem appears to be specific to arvados-cwl-runner and is not an issue for cwltool. That seems a bit surprising since they both use schema_salad, but perhaps they are invoking it in a different way?
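To make the failure mode concrete, here is a minimal Python sketch of the kind of check described above. It is not the schema_salad source: `check_identifier`, its `strict` flag, and the local `ValidationException` class are hypothetical stand-ins, and the input object is an assumption inferred from the error message in the description. With `strict=True` a non-string value under 'name' is rejected; `strict=False` models the behaviour after the two raise statements are removed.

```python
class ValidationException(Exception):
    """Local stand-in so the sketch is self-contained; not imported from schema_salad."""


def check_identifier(document, identifier_key="name", strict=True):
    """Hypothetical model of a loader that treats `identifier_key` as an identifier field.

    strict=True rejects any non-string value stored under the identifier key.
    strict=False models the patched behaviour, where the value is passed through.
    """
    value = document.get(identifier_key)
    if strict and value is not None and not isinstance(value, str):
        raise ValidationException(f"identifier field '{value}' must be a string")
    return document


# Assumed input object: a parameter called 'name' that holds an array of strings.
job_order = {"name": ["string one", "second string", "three three three"]}

try:
    check_identifier(job_order)
    strict_outcome = "accepted"
except ValidationException as exc:
    strict_outcome = str(exc)

# With the check relaxed, the same input object is returned unchanged.
lenient_outcome = check_identifier(job_order, strict=False)

print(strict_outcome)
print(lenient_outcome == job_order)
```

In this model a string value under 'name' passes either way, which matches the observation that only non-string types trigger the error.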

Actions #11

Updated by Joshua Randall about 2 years ago

For our use-cases the attached patch (simply removing the two raise statements) completely solves this issue.

Actions #12

Updated by Peter Amstutz about 2 years ago

Hey Josh! Good to hear from you.

The way that arvados-cwl-runner reads the workflow, packs it, and re-reads the packed version has occasionally turned up bugs on the second trip through the sausage machine (parsing and loading). I'll give this a look.

Actions #13

Updated by Peter Amstutz about 2 years ago

19678-job-loader @ e2267bd99209651c61425f335230e515421b2ef4

  • Fix for parameters called 'name'
  • Also fix regression involving default file references appearing in
    nested processes (inline declaration of a tool within a workflow).
  • Also fixed some dependency issues preventing arvados/jobs developer
    image from working.

developer-run-tests: #3347

Actions #14

Updated by Peter Amstutz about 2 years ago

  • Status changed from New to In Progress
Actions #15

Updated by Lucas Di Pentima about 2 years ago

This LGTM, thanks.

Actions #16

Updated by Peter Amstutz about 2 years ago

  • Status changed from In Progress to Resolved