Bug #13766
open [API] bug in libyaml / Psych used by API server to parse YAML
Description
Since 0.13.11 ruamel.yaml produces files like this:
- id: #main/x
  type: File
  default: {class: File, location: keep:169f39d466a5438ac4a90e779bf750c7+53/blorp.txt, size: 16, basename: blorp.txt, nameroot: blorp, nameext: .txt}
The issue arises because the value keep:169f39d466a5438ac4a90e779bf750c7+53/blorp.txt, emitted unquoted in "flow style", contains an embedded ':' character.
Relevant changelog:
(0.13.11) 2017-01-23:
- allow ':' in flow style scalars if not followed by space. Also don't quote such scalar as this is no longer necessary.
- add python 3.6 manylinux wheel to PyPI
This is explicitly allowed by the YAML spec; however, the API server rejects it:
<HttpError 422 when requesting https://172.17.0.2:8000/arvados/v1/workflows?alt=json returned "Definition is not valid yaml abc: (<unknown>): found unexpected ':' while scanning a plain scalar at line 22 column 38">
This seems to be a bug in libyaml, which is used by Psych (the Ruby YAML engine). It was recently fixed:
https://github.com/yaml/libyaml/pull/104
However, at the time of this writing there is no stable release of libyaml with this bugfix, only a prerelease, 0.2.2-pre1.
Psych is part of the Ruby standard library. I don't know whether it is tied to a particular libyaml version, or how to change which libyaml gets used.
Options:
- Wait for stable releases of libyaml / Psych (unknown amount of time)
- Downgrade ruamel.yaml (messy)
- Some kind of regex output fixup in a-c-r to ensure these values are quoted
- Some kind of regex input fixup on api server to ensure these values are quoted
- Emit plain block style yaml, not "round trip" (breaks tests, tedious but fixable)
- Emit plain json (breaks tests, tedious but fixable)
Updated by Peter Amstutz over 6 years ago
- Status changed from New to In Progress
Updated by Peter Amstutz over 6 years ago
- Related to Bug #13681: [CWL] Cannot set submit-runner-ram when running from Workbench added
Updated by Peter Amstutz over 6 years ago
- Status changed from In Progress to New
- Assigned To set to Peter Amstutz
Updated by Peter Amstutz over 6 years ago
Wait for stable releases of libyaml / Psych (unknown amount of time)
We'll keep an eye on it, but I think this would be hairy for ops, because it requires compiling and packaging a C library for all our supported platforms.
Downgrade ruamel.yaml (messy)
Would revert this change, but may revert other bugfixes we want as well. (A downgrade would also probably inflict a lot of pain on ops).
Some kind of regex output fixup in a-c-r to ensure these values are quoted
Some kind of regex input fixup on api server to ensure these values are quoted
Hard to write a regex that works in every situation, since it isn't actually parsing YAML.
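To illustrate why a regex fixup is fragile, here is a minimal sketch. The pattern KEEP_REF and the helper naive_quote are hypothetical names made up for this example, not code from a-c-r or the API server, and the pattern assumes a Keep reference looks like the one in the description (32 hex digits, '+', a size, then an optional path). It quotes the flow-style value correctly, but corrupts a string that was already quoted:

```python
import re

# Hypothetical pattern (an assumption for illustration): "keep:", 32 hex
# digits, "+<size>", then any path characters up to a delimiter or quote.
KEEP_REF = re.compile(r'keep:[0-9a-f]{32}\+\d+[^\s,}\]"\']*')

def naive_quote(text):
    """Wrap every Keep reference in double quotes, ignoring YAML context."""
    return KEEP_REF.sub(lambda m: '"' + m.group(0) + '"', text)

# Case 1: unquoted plain scalar in flow style. The fixup does what we want.
flow = ("default: {class: File, "
        "location: keep:169f39d466a5438ac4a90e779bf750c7+53/blorp.txt, size: 16}")
fixed = naive_quote(flow)

# Case 2: the reference sits inside an already-quoted string. The regex
# cannot see the surrounding quotes, so it nests a second pair inside them.
already = 'doc: "copied from keep:169f39d466a5438ac4a90e779bf750c7+53/blorp.txt"'
broken = naive_quote(already)

print(fixed)
print(broken)
```

The second result contains nested double quotes and is no longer a valid scalar. Comments and block scalars would need similar special-casing, which is why a fixup that does not actually parse YAML is hard to get right.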
Emit plain block style yaml, not "round trip" (breaks tests, tedious but fixable)
Turns out yaml.safe_dump doesn't work because round_trip_load introduces object types that are not plain dicts, and yaml.dump() has the same problem as yaml.round_trip_dump().
cannot represent an object: ordereddict([('class', 'CommandLineTool'), ('requirements', [ordereddict([('class', 'DockerRequirement'), ('dockerPull', 'debian:8')])]), ('inputs', [ordereddict([('id', u'#submit_tool.cwl/x'), ('type', 'File'), ('default', ordereddict([('class', 'File'), ('location', u'keep:5d373e7629203ce39e7c22af98a0f881+52/blub.txt')])), ('inputBinding', ordereddict([('position', 1)]))])]), ('outputs', []), ('baseCommand', 'cat'), (u'id', u'#submit_tool.cwl')])
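One conceivable way around the "cannot represent an object" error would be to copy the loaded document into plain dicts and lists before dumping. The sketch below is an assumption, not what was done on this ticket: to_plain is a hypothetical helper, and collections.OrderedDict stands in for the ordered mapping type that round_trip_load returns so the example runs with the standard library alone.

```python
from collections import OrderedDict
from collections.abc import Mapping

def to_plain(node):
    """Recursively copy mappings into dict and sequences into list.

    Scalars are returned unchanged, so any special scalar wrapper types
    would still need separate handling.
    """
    if isinstance(node, Mapping):
        return {key: to_plain(value) for key, value in node.items()}
    if isinstance(node, (list, tuple)):
        return [to_plain(item) for item in node]
    return node

# Stand-in for a round-trip-loaded tool, shaped like the error message above.
tool = OrderedDict([
    ("class", "CommandLineTool"),
    ("inputs", [OrderedDict([("id", "#submit_tool.cwl/x"), ("type", "File")])]),
    ("baseCommand", "cat"),
])

plain = to_plain(tool)
print(type(plain).__name__, type(plain["inputs"][0]).__name__)
```

The conversion discards comments and any other round-trip information, so it only makes sense if the output no longer needs to preserve the original formatting.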
Emit plain json (breaks tests, tedious but fixable)
This is the fallback solution.
Updated by Peter Amstutz over 6 years ago
13766-libyaml-workaround @ 335ee76030e85fa6ac4da79b598cf4c4a212443d
- Use json.dumps() instead of yaml.round_trip_dump()
- Fix tests.
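A minimal sketch of why emitting JSON sidesteps the problem: json.dumps() always double-quotes strings, so the embedded ':' never appears in a plain scalar, and a YAML parser should generally accept JSON output as flow-style YAML. The record below is made-up sample data based on the description, and the indent argument is an assumption rather than what the branch actually passes.

```python
import json

# Sample data modeled on the snippet in the description (illustrative only).
record = {
    "id": "#main/x",
    "type": "File",
    "default": {
        "class": "File",
        "location": "keep:169f39d466a5438ac4a90e779bf750c7+53/blorp.txt",
    },
}

# Every string, including the Keep reference, is emitted in double quotes.
definition = json.dumps(record, indent=2)
print(definition)
```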
Another way to avoid this problem would be to make the primary workflow definition reference a collection or git commit, and have the record only store metadata / input parameter definition.
Updated by Peter Amstutz over 6 years ago
Updated by Lucas Di Pentima over 6 years ago
335ee76030e85fa6ac4da79b598cf4c4a212443d LGTM. Thanks.
Updated by Peter Amstutz over 6 years ago
- Status changed from New to In Progress
Updated by Tom Morris over 6 years ago
- Target version changed from 2018-07-18 Sprint to 2018-08-01 Sprint
Updated by Peter Amstutz over 6 years ago
- Target version changed from 2018-08-01 Sprint to 2018-08-15 Sprint
Updated by Peter Amstutz over 6 years ago
- Target version changed from 2018-08-15 Sprint to Arvados Future Sprints
Updated by Peter Amstutz over 3 years ago
- Target version deleted (Arvados Future Sprints)