Bug #16482
Closed
[crunch] bump a-c-r's cwltool dependency to pass CWL v1.2.0-dev3 tests
Added by Ward Vandewege over 4 years ago. Updated over 4 years ago.
100%
Updated by Ward Vandewege over 4 years ago
- Status changed from New to In Progress
Updated by Ward Vandewege over 4 years ago
- Subject changed from [crunch] bump cwltool dependency version on a-c-r to pass CWL v1.2.0-dev3 tests to [crunch] bump a-c-r's cwltool dependency to pass CWL v1.2.0-dev3 tests
Updated by Ward Vandewege over 4 years ago
Running developer tests at developer-run-tests: #1887 , and they passed.
Ready for review 0f97ce28deb04faf2d6b19c7312ef233f28665ad on branch 16482-bump-cwltool-version
Updated by Peter Amstutz over 4 years ago
Ward Vandewege wrote:
Running developer tests at developer-run-tests: #1887 , and they passed.
Ready for review 0f97ce28deb04faf2d6b19c7312ef233f28665ad on branch 16482-bump-cwltool-version
LGTM.
Updated by Anonymous over 4 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|0aec9ab099a57996f52f3c5d120ab0bafde6b2ab.
Updated by Ward Vandewege over 4 years ago
- Related to Bug #16382: arvados-cwl-conformance-tests failing in jenkins added
Updated by Ward Vandewege over 4 years ago
- Status changed from Resolved to In Progress
- Target version changed from 2020-06-03 Sprint to 2020-06-17 Sprint
It turns out we needed an even newer version of cwltool, which was released today. This also required bumping the version of schema-salad. I pushed an updated commit, 8e9f21692e6a815b4aac226f8fb87ec3d716f781, to the 16482-bump-cwltool-version branch and ran the developer tests. The sdk/cwl tests (see developer-run-tests-remainder: #1964 /consoleFull) are now failing with:
======================================================================
16:12:54 ERROR: test_submit (unittest.loader._FailedTest)
16:12:54 ----------------------------------------------------------------------
16:12:54 ImportError: Failed to import test module: test_submit
16:12:54 Traceback (most recent call last):
16:12:54 File "/usr/lib/python3.7/unittest/loader.py", line 154, in loadTestsFromName
16:12:54 module = __import__(module_name)
16:12:54 File "/tmp/workspace/developer-run-tests-remainder/sdk/cwl/tests/test_submit.py", line 36, in <module>
16:12:54 import arvados_cwl
16:12:54 File "/tmp/workspace/developer-run-tests-remainder/sdk/cwl/arvados_cwl/__init__.py", line 26, in <module>
16:12:54 from cwltool.pathmapper import adjustFileObjs, adjustDirObjs, get_listing
16:12:54 ImportError: cannot import name 'get_listing' from 'cwltool.pathmapper' (/home/jenkins/tmp/VENV3DIR/lib/python3.7/site-packages/cwltool/pathmapper.py)
16:12:54
Updated by Ward Vandewege over 4 years ago
I tracked the above down to a change in cwltool, and made the corresponding change in a-c-r at fbc4a41fab79220108602f1cadd30f34cdbcea11 on branch 16482-bump-cwltool-version.
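When a helper moves between modules from one release of a dependency to the next, one defensive option is to try several candidate modules in order. The sketch below is generic and is not the change made in fbc4a41f: the ticket does not say where `get_listing` moved, so `import_first` and its candidate lists are hypothetical, and the demonstration uses standard-library modules instead of cwltool.

```python
import importlib


def import_first(name, candidates):
    """Return attribute `name` from the first module in `candidates` that has it.

    Hypothetical helper for illustration; not part of cwltool or a-c-r.
    """
    errors = []
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, name)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed, then try the next one.
            errors.append(f"{module_name}: {exc}")
    raise ImportError(f"cannot import {name!r} from any of {candidates}: {errors}")


# Stand-in demonstration: `math` has no OrderedDict, so the lookup
# falls through to `collections`.
OrderedDict = import_first("OrderedDict", ["math", "collections"])

# A module that does not exist at all is skipped in the same way.
sqrt = import_first("sqrt", ["no_such_module_for_demo", "math"])
```

Pinning the dependency and importing from a single known location is simpler when only one version has to be supported; the fallback is mainly useful while a range of versions is still allowed.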
The tests now fail like this:
test_tq_error (tests.test_tq.TestTaskQueue) ... 2020-06-04 00:27:52 arvados.cwl-runner ERROR: Unhandled exception running task
Traceback (most recent call last):
  File "/root/arvados/sdk/cwl/arvados_cwl/task_queue.py", line 36, in task_queue_func
    task()
  File "/root/arvados/sdk/cwl/tests/test_tq.py", line 20, in fail_task
    raise Exception("Testing error handling")
Exception: Testing error handling
ok
test_create (tests.test_submit.TestCreateWorkflow) ... INFO setup.py 2.1.0.dev20200603211541, arvados-python-client 2.1.0.dev20200521142235, cwltool 3.0.20200530110633
INFO Resolved 'tests/wf/submit_wf.cwl' to 'file:///root/arvados/sdk/cwl/tests/wf/submit_wf.cwl'
ERROR I'm sorry, I couldn't load this CWL file. The error was:
Traceback (most recent call last):
  File "/root/arvados/sdk/cwl/.eggs/cwltool-3.0.20200530110633-py3.7.egg/cwltool/main.py", line 940, in main
    skip_schemas=args.skip_schemas,
  File "/root/arvados/sdk/cwl/.eggs/cwltool-3.0.20200530110633-py3.7.egg/cwltool/load_tool.py", line 360, in resolve_and_validate_document
    (sch_document_loader, avsc_names) = process.get_schema(cwlVersion)[:2]
  File "/root/arvados/sdk/cwl/.eggs/cwltool-3.0.20200530110633-py3.7.egg/cwltool/process.py", line 221, in get_schema
    SCHEMA_CACHE[version] = load_schema(custom_schemas[version][0], cache=cache)
  File "/root/arvados/sdk/cwl/.eggs/schema_salad-6.0.20200601095207-py3.7.egg/schema_salad/schema.py", line 242, in load_schema
    schema_doc, schema_metadata = metaschema_loader.resolve_ref(schema_ref, "")
  File "/root/arvados/sdk/cwl/.eggs/schema_salad-6.0.20200601095207-py3.7.egg/schema_salad/ref_resolver.py", line 718, in resolve_ref
    doc = self.fetch(doc_url, inject_ids=(not mixin))
  File "/root/arvados/sdk/cwl/.eggs/schema_salad-6.0.20200601095207-py3.7.egg/schema_salad/ref_resolver.py", line 1155, in fetch
    text = self.fetch_text(url)
  File "/root/arvados/sdk/cwl/.eggs/schema_salad-6.0.20200601095207-py3.7.egg/schema_salad/ref_resolver.py", line 178, in fetch_text
    assert isinstance(result, str)
AssertionError
FAIL
...
The assertion fails because 'result' is not of type `str`, but of type `bytes`. The latter is for binary data, so this is a bit mysterious. Why does it think that data is binary? Here's what it looks like when printed out:
test_create (tests.test_submit.TestCreateWorkflow) ... INFO setup.py 2.1.0.dev20200603211541, arvados-python-client 2.1.0.dev20200521142235, cwltool 3.0.20200530110633 INFO Resolved 'tests/wf/submit_wf.cwl' to 'file:///root/arvados/sdk/cwl/tests/wf/submit_wf.cwl' b'# Copyright (C) The Arvados Authors. All rights reserved.\n#\n# SPDX-License-Identifier: Apache-2.0\n\n$base: "http://arvados.org/cwl#"\n$namespaces:\n cwl: "https://w3id.org/cwl/cwl#"\n cwltool: "http://commonwl.org/cwltool#"\n$graph:\n- $import: https://w3id.org/cwl/CommonWorkflowLanguage.yml\n\n- name: cwltool:LoadListingRequirement\n type: record\n extends: cwl:ProcessRequirement\n inVocab: false\n fields:\n class:\n type: string\n doc: "Always \'LoadListingRequirement\'"\n jsonldPredicate:\n "_id": "@type"\n "_type": "@vocab"\n loadListing:\n type:\n - "null"\n - type: enum\n name: LoadListingEnum\n symbols: [no_listing, shallow_listing, deep_listing]\n\n- name: cwltool:Secrets\n type: record\n inVocab: false\n extends: cwl:ProcessRequirement\n fields:\n class:\n type: string\n doc: "Always \'Secrets\'"\n jsonldPredicate:\n "_id": "@type"\n "_type": "@vocab"\n secrets:\n type: string[]\n doc: |\n List one or more input parameters that are sensitive (such as passwords)\n which will be deliberately obscured from logging.\n jsonldPredicate:\n "_type": "@id"\n refScope: 0\n\n- name: cwltool:TimeLimit\n type: record\n inVocab: false\n extends: cwl:ProcessRequirement\n doc: |\n Set an upper limit on the execution time of a CommandLineTool or\n ExpressionTool. A tool execution which exceeds the time limit may\n be preemptively terminated and considered failed. May also be\n used by batch systems to make scheduling decisions.\n fields:\n - name: class\n type: string\n doc: "Always \'TimeLimit\'"\n jsonldPredicate:\n "_id": "@type"\n "_type": "@vocab"\n - name: timelimit\n type: [long, string]\n doc: |\n The time limit, in seconds. A time limit of zero means no\n time limit. 
Negative time limits are an error.\n\n- name: RunInSingleContainer\n type: record\n extends: cwl:ProcessRequirement\n inVocab: false\n doc: |\n Indicates that a subworkflow should run in a single container\n and not be scheduled as separate steps.\n fields:\n - name: class\n type: string\n doc: "Always \'arv:RunInSingleContainer\'"\n jsonldPredicate:\n _id: "@type"\n _type: "@vocab"\n\n- name: OutputDirType\n type: enum\n symbols:\n - local_output_dir\n - keep_output_dir\n doc:\n - |\n local_output_dir: Use regular file system local to the compute node.\n There must be sufficient local scratch space to store entire output;\n specify this with `outdirMin` of `ResourceRequirement`. Files are\n batch uploaded to Keep when the process completes. Most compatible, but\n upload step can be time consuming for very large files.\n - |\n keep_output_dir: Use writable Keep mount. Files are streamed to Keep as\n they are written. Does not consume local scratch space, but does consume\n RAM for output buffers (up to 192 MiB per file simultaneously open for\n writing.) Best suited to processes which produce sequential output of\n large files (non-sequential writes may produced fragmented file\n manifests). Supports regular files and directories, does not support\n special files such as symlinks, hard links, named pipes, named sockets,\n or device nodes.\n\n\n- name: RuntimeConstraints\n type: record\n extends: cwl:ProcessRequirement\n inVocab: false\n doc: |\n Set Arvados-specific runtime hints.\n fields:\n - name: class\n type: string\n doc: "Always \'arv:RuntimeConstraints\'"\n jsonldPredicate:\n _id: "@type"\n _type: "@vocab"\n - name: keep_cache\n type: int?\n doc: |\n Size of file data buffer for Keep mount in MiB. Default is 256\n MiB. 
Increase this to reduce cache thrashing in situations such as\n accessing multiple large (64+ MiB) files at the same time, or\n performing random access on a large file.\n - name: outputDirType\n type: OutputDirType?\n doc: |\n Preferred backing store for output staging. If not specified, the\n system may choose which one to use.\n\n- name: PartitionRequirement\n type: record\n extends: cwl:ProcessRequirement\n inVocab: false\n doc: |\n Select preferred compute partitions on which to run jobs.\n fields:\n - name: class\n type: string\n doc: "Always \'arv:PartitionRequirement\'"\n jsonldPredicate:\n _id: "@type"\n _type: "@vocab"\n - name: partition\n type:\n - string\n - string[]\n\n- name: APIRequirement\n type: record\n extends: cwl:ProcessRequirement\n inVocab: false\n doc: |\n Indicates that process wants to access to the Arvados API. Will be granted\n limited network access and have ARVADOS_API_HOST and ARVADOS_API_TOKEN set\n in the environment.\n fields:\n - name: class\n type: string\n doc: "Always \'arv:APIRequirement\'"\n jsonldPredicate:\n _id: "@type"\n _type: "@vocab"\n\n- name: IntermediateOutput\n type: record\n extends: cwl:ProcessRequirement\n inVocab: false\n doc: |\n Specify desired handling of intermediate output collections.\n fields:\n class:\n type: string\n doc: "Always \'arv:IntermediateOutput\'"\n jsonldPredicate:\n _id: "@type"\n _type: "@vocab"\n outputTTL:\n type: int\n doc: |\n If the value is greater than zero, consider intermediate output\n collections to be temporary and should be automatically\n trashed. Temporary collections will be trashed `outputTTL` seconds\n after creation. A value of zero means intermediate output should be\n retained indefinitely (this is the default behavior).\n\n Note: arvados-cwl-runner currently does not take workflow dependencies\n into account when setting the TTL on an intermediate output\n collection. 
If the TTL is too short, it is possible for a collection to\n be trashed before downstream steps that consume it are started. The\n recommended minimum value for TTL is the expected duration of the\n entire the workflow.\n\n- name: ReuseRequirement\n type: record\n extends: cwl:ProcessRequirement\n inVocab: false\n doc: |\n Enable/disable work reuse for current process. Default true (work reuse enabled).\n fields:\n - name: class\n type: string\n doc: "Always \'arv:ReuseRequirement\'"\n jsonldPredicate:\n _id: "@type"\n _type: "@vocab"\n - name: enableReuse\n type: boolean\n\n- name: WorkflowRunnerResources\n type: record\n extends: cwl:ProcessRequirement\n inVocab: false\n doc: |\n Specify memory or cores resource request for the CWL runner process itself.\n fields:\n class:\n type: string\n doc: "Always \'arv:WorkflowRunnerResources\'"\n jsonldPredicate:\n _id: "@type"\n _type: "@vocab"\n ramMin:\n type: int?\n doc: Minimum RAM, in mebibytes (2**20)\n jsonldPredicate: "https://w3id.org/cwl/cwl#ResourceRequirement/ramMin"\n coresMin:\n type: int?\n doc: Minimum cores allocated to cwl-runner\n jsonldPredicate: "https://w3id.org/cwl/cwl#ResourceRequirement/coresMin"\n keep_cache:\n type: int?\n doc: |\n Size of collection metadata cache for the workflow runner, in\n MiB. Default 256 MiB. Will be added on to the RAM request\n when determining node size to request.\n jsonldPredicate: "http://arvados.org/cwl#RuntimeConstraints/keep_cache"\n\n- name: ClusterTarget\n type: record\n extends: cwl:ProcessRequirement\n inVocab: false\n doc: |\n Specify where a workflow step should run\n fields:\n class:\n type: string\n doc: "Always \'arv:ClusterTarget\'"\n jsonldPredicate:\n _id: "@type"\n _type: "@vocab"\n cluster_id:\n type: string?\n doc: The cluster to run the container\n project_uuid:\n type: string?\n doc: The project that will own the container requests and intermediate collections\n' ERROR I'm sorry, I couldn't load this CWL file. 
The error was:
Traceback (most recent call last):
  File "/root/arvados/sdk/cwl/.eggs/cwltool-3.0.20200530110633-py3.7.egg/cwltool/main.py", line 940, in main
    skip_schemas=args.skip_schemas,
  File "/root/arvados/sdk/cwl/.eggs/cwltool-3.0.20200530110633-py3.7.egg/cwltool/load_tool.py", line 360, in resolve_and_validate_document
    (sch_document_loader, avsc_names) = process.get_schema(cwlVersion)[:2]
  File "/root/arvados/sdk/cwl/.eggs/cwltool-3.0.20200530110633-py3.7.egg/cwltool/process.py", line 221, in get_schema
    SCHEMA_CACHE[version] = load_schema(custom_schemas[version][0], cache=cache)
  File "/root/arvados/sdk/cwl/.eggs/schema_salad-6.0.20200601095207-py3.7.egg/schema_salad/schema.py", line 242, in load_schema
    schema_doc, schema_metadata = metaschema_loader.resolve_ref(schema_ref, "")
  File "/root/arvados/sdk/cwl/.eggs/schema_salad-6.0.20200601095207-py3.7.egg/schema_salad/ref_resolver.py", line 723, in resolve_ref
    doc = self.fetch(doc_url, inject_ids=(not mixin))
  File "/root/arvados/sdk/cwl/.eggs/schema_salad-6.0.20200601095207-py3.7.egg/schema_salad/ref_resolver.py", line 1160, in fetch
    text = self.fetch_text(url)
  File "/root/arvados/sdk/cwl/.eggs/schema_salad-6.0.20200601095207-py3.7.egg/schema_salad/ref_resolver.py", line 183, in fetch_text
    assert isinstance(result, str)
AssertionError
FAIL
Updated by Michael Crusoe over 4 years ago
Ward Vandewege wrote:
The assertion fails because 'result' is not of type `str`, but of type `bytes`. The latter is for binary data, so this is a bit mysterious. Why does it think that data is binary? Here's what it looks like when printed out:
Note the `b` in `b'# Copyright (C)...'`, you are passing in binary.
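Readable content does not make a value text: in Python 3, anything read from a stream opened in binary mode is `bytes`, whatever it contains. A minimal sketch with a throwaway temporary file (the file contents are invented for illustration and are not taken from the Arvados schema):

```python
import os
import tempfile

# Write a short, perfectly readable line to a temporary file.
# newline="" disables newline translation so both reads see the same data.
with tempfile.NamedTemporaryFile(
    "w", suffix=".yml", delete=False, encoding="utf-8", newline=""
) as handle:
    handle.write("# Copyright (C) Example\n")
    path = handle.name

try:
    with open(path, "rb") as binary_stream:  # binary mode yields bytes
        as_bytes = binary_stream.read()
    with open(path, "r", encoding="utf-8", newline="") as text_stream:  # text mode yields str
        as_text = text_stream.read()
finally:
    os.remove(path)

# The repr of the first value carries the b'' prefix; the second does not.
print(repr(as_bytes))
print(repr(as_text))
print(isinstance(as_bytes, str), isinstance(as_text, str))
```

The same characters come back in both cases; only the type differs, and that type is all an `isinstance(result, str)` assertion looks at.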
Updated by Ward Vandewege over 4 years ago
Michael Crusoe wrote:
Ward Vandewege wrote:
The assertion fails because 'result' is not of type `str`, but of type `bytes`. The latter is for binary data, so this is a bit mysterious. Why does it think that data is binary? Here's what it looks like when printed out:
Note the `b` in `b'# Copyright (C)...'`, you are passing in binary.
Indeed, and thanks for your help identifying where this happens (the resource_stream calls). It's fixed in f423aff73c1927a74e39c738e08bd6f1100a94c5 on branch 16482-bump-cwltool-version; tests ran at developer-run-tests-remainder: #1966 /console, and they passed.
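The ticket does not show the diff in f423aff7, so what follows is only a sketch of the general pattern: when a caller requires `str` but the underlying stream is binary (as streams returned by `pkg_resources.resource_stream` are), decode before handing the data on. `read_text_from_binary_stream` is a hypothetical helper, UTF-8 is an assumed encoding, and `io.BytesIO` stands in for the packaged resource.

```python
import io


def read_text_from_binary_stream(stream, encoding="utf-8"):
    """Read a whole stream and return its content as str.

    Hypothetical helper for illustration; the encoding is an assumption.
    """
    data = stream.read()
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data  # the stream was already in text mode


# io.BytesIO plays the role of a binary resource stream here.
fake_resource = io.BytesIO(b'$base: "http://arvados.org/cwl#"\n')
result = read_text_from_binary_stream(fake_resource)

# The kind of check that failed in fetch_text now holds.
assert isinstance(result, str)
```

Decoding at the point where the resource is read keeps the rest of the code working with `str` only, which is what the assertion in `fetch_text` expects.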
Updated by Ward Vandewege over 4 years ago
f423aff73c1927a74e39c738e08bd6f1100a94c5 on branch 16482-bump-cwltool-version is ready for review
tests passed at developer-run-tests-remainder: #1966 /console
Updated by Ward Vandewege over 4 years ago
- Target version changed from 2020-06-17 Sprint to 2020-07-01 Sprint
Updated by Peter Amstutz over 4 years ago
Ward Vandewege wrote:
f423aff73c1927a74e39c738e08bd6f1100a94c5 on branch 16482-bump-cwltool-version is ready for review
tests passed at developer-run-tests-remainder: #1966 /console
LGTM.
Updated by Anonymous over 4 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|a5a6111e355f743c1f6882316959f6ecae4af00a.