Bug #14726
Updated by Peter Amstutz almost 6 years ago
bcbio prep_samples_to_rec takes GRCh37.fa with secondaryFiles and then returns that same file in cwl.output.json. However, the returned file lists several additional secondary files. From the perspective of arvados-cwl-runner, these files have appeared out of nowhere: they were not declared in the inputs and are not found in the output directory. This isn't detected as a user error. Instead, it results in an unhandled KeyError in the path mapper (see the traceback below), so the message is extremely confusing and does not tell the user how to fix the problem.
<pre>
2019-01-11 14:54:53 cwltool DEBUG: [job prep_samples_to_rec] initializing from file:///home/peter/work/tmp/kfang/workflow.json#prep_samples_to_rec.cwl as part of step prep_samples_to_rec
2019-01-11 14:54:53 cwltool DEBUG: [job prep_samples_to_rec] {
"rgnames__sample": [
"RMNISTHS_30xdownsample"
],
"reference__fasta__base": [
{
"basename": "GRCh37.fa",
"nameroot": "GRCh37",
"nameext": ".fa",
"location": "keep:b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa",
"secondaryFiles": [
{
"basename": "GRCh37.fa.fai",
"nameroot": "GRCh37.fa",
"nameext": ".fai",
"location": "keep:b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa.fai",
"class": "File",
"size": 2746
},
{
"basename": "GRCh37.dict",
"nameroot": "GRCh37",
"nameext": ".dict",
"location": "keep:b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.dict",
"class": "File",
"size": 10257
}
],
"class": "File",
"size": 3153506519
}
],
"config__algorithm__variant_regions": [
null
],
"description": [
"RMNISTHS_30xdownsample"
],
"resources": [
"{}"
]
}
2019-01-11 14:54:53 arvados.arv-run INFO: Using empty collection d41d8cd98f00b204e9800998ecf8427e+0
2019-01-11 14:54:53 cwltool DEBUG: [job prep_samples_to_rec] path mappings is {
"keep:b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa": [
"keep:b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa",
"/keep/b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa",
"File",
true
],
"keep:b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.dict": [
"keep:b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.dict",
"/keep/b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.dict",
"File",
true
],
"keep:b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa.fai": [
"keep:b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa.fai",
"/keep/b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa.fai",
"File",
true
]
}
2019-01-11 14:55:05 cwltool DEBUG: Raw output from keep:dc8d284de7b3e4743524020de33c2799+290/cwl.output.json: {
"prep_samples_rec": [
{
"rgnames__sample": "RMNISTHS_30xdownsample",
"reference__fasta__base": {
"path": "/keep/b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa",
"class": "File",
"secondaryFiles": [
{
"path": "/keep/b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa.fai",
"class": "File"
},
{
"path": "/keep/b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.dict",
"class": "File"
},
{
"path": "/keep/b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa.gz",
"class": "File"
},
{
"path": "/keep/b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa.gz.gzi",
"class": "File"
},
{
"path": "/keep/b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37-resources.yaml",
"class": "File"
},
{
"path": "/keep/b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa.gz.fai",
"class": "File"
}
]
},
"config__algorithm__variant_regions": null,
"description": "RMNISTHS_30xdownsample",
"resources": "{\"default\":{\"cores\":1,\"jvm_opts\":[\"-Xms1000m\",\"-Xmx16384m\"],\"memory\":\"16384M\"}}"
}
]
}
2019-01-11 14:55:05 arvados.cwl-runner ERROR: [container prep_samples_to_rec] while getting output object: u'keep:b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa.gz'
Traceback (most recent call last):
File "/home/peter/.arvbox/arvbox/arvados/sdk/cwl/arvados_cwl/arvcontainer.py", line 350, in done
outputs = done.done_outputs(self, container, "/tmp", self.outdir, "/keep")
File "/home/peter/.arvbox/arvbox/arvados/sdk/cwl/arvados_cwl/done.py", line 53, in done_outputs
return self.collect_outputs("keep:" + record["output"])
File "/home/peter/work/scripts/venv/local/lib/python2.7/site-packages/cwltool/command_line_tool.py", line 616, in collect_output_ports
visit_class(ret, ("File", "Directory"), cast(Callable[[Any], Any], revmap))
File "/home/peter/work/scripts/venv/local/lib/python2.7/site-packages/cwltool/utils.py", line 214, in visit_class
visit_class(rec[d], cls, op)
File "/home/peter/work/scripts/venv/local/lib/python2.7/site-packages/cwltool/utils.py", line 217, in visit_class
visit_class(d, cls, op)
File "/home/peter/work/scripts/venv/local/lib/python2.7/site-packages/cwltool/utils.py", line 214, in visit_class
visit_class(rec[d], cls, op)
File "/home/peter/work/scripts/venv/local/lib/python2.7/site-packages/cwltool/utils.py", line 214, in visit_class
visit_class(rec[d], cls, op)
File "/home/peter/work/scripts/venv/local/lib/python2.7/site-packages/cwltool/utils.py", line 217, in visit_class
visit_class(d, cls, op)
File "/home/peter/work/scripts/venv/local/lib/python2.7/site-packages/cwltool/utils.py", line 212, in visit_class
op(rec)
File "/home/peter/work/scripts/venv/local/lib/python2.7/site-packages/cwltool/command_line_tool.py", line 159, in revmap_file
if revmap_f and not builder.pathmapper.mapper(revmap_f[0]).type.startswith("Writable"):
File "/home/peter/work/scripts/venv/local/lib/python2.7/site-packages/cwltool/pathmapper.py", line 318, in mapper
return self._pathmap[src]
KeyError: u'keep:b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/GRCh37.fa.gz'
2019-01-11 14:55:05 cwltool ERROR: [step prep_samples_to_rec] Output is missing expected field file:///home/peter/work/tmp/kfang/workflow.json#main/prep_samples_to_rec/prep_samples_rec
</pre>
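One possible way to turn this into a readable user error is to check the File and Directory entries in cwl.output.json against the paths in the path mapping before reverse-mapping them. The following is only a sketch of that idea, not the actual cwltool or arvados-cwl-runner code: the function name @find_undeclared_files@, the @known_paths@ set, and the wording of the message are all hypothetical, and the sample data is a trimmed-down copy of the log above.

```python
def find_undeclared_files(output_obj, known_paths):
    """Return paths of File/Directory entries that are not in known_paths.

    Hypothetical helper: output_obj is the parsed cwl.output.json and
    known_paths is the set of container-side paths from the path mapping.
    """
    undeclared = []

    def visit(node):
        if isinstance(node, dict):
            if node.get("class") in ("File", "Directory"):
                path = node.get("path") or node.get("location")
                if path is not None and path not in known_paths:
                    undeclared.append(path)
            # Recurse into values so nested secondaryFiles are checked too.
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(output_obj)
    return undeclared


SEQ = "/keep/b334527110a98f97af35dfd3912fc989+40015/GRCh37/seq/"

# Container-side paths taken from the "path mappings" entry in the log.
known_paths = {
    SEQ + "GRCh37.fa",
    SEQ + "GRCh37.dict",
    SEQ + "GRCh37.fa.fai",
}

# Trimmed-down version of the raw cwl.output.json from the log.
raw_output = {
    "prep_samples_rec": [
        {
            "rgnames__sample": "RMNISTHS_30xdownsample",
            "reference__fasta__base": {
                "path": SEQ + "GRCh37.fa",
                "class": "File",
                "secondaryFiles": [
                    {"path": SEQ + "GRCh37.fa.fai", "class": "File"},
                    {"path": SEQ + "GRCh37.dict", "class": "File"},
                    {"path": SEQ + "GRCh37.fa.gz", "class": "File"},
                    {"path": SEQ + "GRCh37.fa.gz.gzi", "class": "File"},
                    {"path": SEQ + "GRCh37-resources.yaml", "class": "File"},
                    {"path": SEQ + "GRCh37.fa.gz.fai", "class": "File"},
                ],
            },
        }
    ]
}

undeclared = find_undeclared_files(raw_output, known_paths)
if undeclared:
    # Hypothetical message text; the point is to name the offending files.
    print("Output refers to files that were not declared as inputs "
          "and are not in the output directory:")
    for path in undeclared:
        print("  " + path)
```

With the data from this ticket, the check would flag the four extra secondary files (GRCh37.fa.gz, GRCh37.fa.gz.gzi, GRCh37-resources.yaml, GRCh37.fa.gz.fai) instead of failing on the first one with a bare KeyError.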