Story #15535
open
[CWL] Run from original CWL, not packed
0%
Description
arvados-cwl-runner, on submitting a workflow, uses "pack" to create a single-stream document.
This is because the "workflow" record used to display workflows on workbench stores only a single raw text field, into which the multi-document CWL file has to be stuffed. The rationale for making the workflow record a text field rather than a PDH or git hash is to avoid requiring workbench to fetch a collection or git repo in order to display a workflow. This limitation does not apply when submitting from the command line, but that path also uses the "pack" function to avoid having multiple code paths.
Unfortunately the packed version often bears little resemblance to the user's original document. It would be better to execute the original document.
Proposal:
At CLI: Upload original workflow files & dependencies to a collection, preserving original filesystem structure. Submit a container request that mounts the collection and runs the workflow.
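A minimal sketch of the CLI-side request, assuming the workflow files have already been uploaded to a collection and its portable data hash is known. The helper name build_container_request, the mount point /var/lib/cwl/workflow, the input file path, and the exact command line are illustrative assumptions, not the behaviour of the existing arvados-cwl-runner; only the general shape of the mounts ("kind" plus "portable_data_hash" or "content") follows the Arvados container request schema.

```python
import posixpath

# Hypothetical mount point for the uploaded workflow collection.
WORKFLOW_MOUNT = "/var/lib/cwl/workflow"
# Hypothetical location of the input object inside the container.
INPUT_PATH = "/var/lib/cwl/cwl.input.json"


def build_container_request(workflow_pdh, entry_point, job_order):
    """Return a container request body that runs the original (unpacked) workflow.

    workflow_pdh -- portable data hash of the collection holding the workflow files
    entry_point  -- path of the top-level CWL file, relative to the collection root
    job_order    -- dict of workflow input values
    """
    if posixpath.isabs(entry_point) or ".." in entry_point.split("/"):
        raise ValueError("entry point must be a relative path inside the collection")
    return {
        "name": posixpath.basename(entry_point),
        # Assumed command line: runner, workflow file, input object.
        "command": [
            "arvados-cwl-runner",
            posixpath.join(WORKFLOW_MOUNT, entry_point),
            INPUT_PATH,
        ],
        "mounts": {
            # The collection keeps the original directory layout, so relative
            # references between CWL files resolve as they do on the user's disk.
            WORKFLOW_MOUNT: {
                "kind": "collection",
                "portable_data_hash": workflow_pdh,
            },
            INPUT_PATH: {"kind": "json", "content": job_order},
        },
    }
```

Because the collection is mounted as a directory tree, the runner sees the same files the user wrote, and no packing step is needed.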
At workbench: to register a workflow record, create a wrapper workflow that has the same input/output interface as the original workflow, with a single step whose run line looks like:
run: keep:pdh/workflow
To submit the workflow, workbench introspects the step and sets up the correct collection mount.
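The wrapper and the introspection step could look roughly like the sketch below. It is an illustration under stated assumptions: the function names make_wrapper and find_workflow_mount, the step id "main", and the choice of cwlVersion v1.0 are hypothetical, and the inputs and outputs are taken as simple id-to-type maps rather than full CWL parameter records.

```python
import re

# A keep reference: "keep:", a PDH (32 hex digits, "+", size), "/", a path.
KEEP_REF = re.compile(r"^keep:([0-9a-f]{32}\+\d+)/(.+)$")


def make_wrapper(inputs, outputs, pdh, path):
    """Build a wrapper workflow exposing the same interface as the wrapped one.

    inputs  -- dict mapping input id -> CWL type of the original workflow
    outputs -- dict mapping output id -> CWL type of the original workflow
    pdh     -- portable data hash of the collection holding the workflow
    path    -- path of the top-level CWL file inside that collection
    """
    return {
        "cwlVersion": "v1.0",
        "class": "Workflow",
        # Needed because the single step runs another Workflow.
        "requirements": [{"class": "SubworkflowFeatureRequirement"}],
        "inputs": dict(inputs),
        "outputs": {
            name: {"type": typ, "outputSource": "main/" + name}
            for name, typ in outputs.items()
        },
        "steps": {
            "main": {
                "run": "keep:%s/%s" % (pdh, path),
                # Pass every wrapper input straight through to the step.
                "in": {name: name for name in inputs},
                "out": list(outputs),
            }
        },
    }


def find_workflow_mount(wrapper):
    """Inspect the single step and return (pdh, path) for the collection mount."""
    steps = wrapper["steps"]
    if len(steps) != 1:
        raise ValueError("wrapper must have exactly one step")
    (step,) = steps.values()
    match = KEEP_REF.match(step["run"])
    if match is None:
        raise ValueError("step does not run a keep: reference")
    return match.group(1), match.group(2)
```

The wrapper stays small enough to store in the workflow record's text field, while the PDH recovered by find_workflow_mount tells the submitter which collection to mount.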
To display cwl-svg of a workflow, workbench2 needs to be able to fetch the files from keep-web.
To maximize reuse, dependencies of each CommandLineTool are still copied to separate collections.
Updated by Peter Amstutz over 5 years ago
- Status changed from New to In Progress
Updated by Peter Amstutz over 5 years ago
- Description updated (diff)
- Status changed from In Progress to New
Updated by Peter Amstutz over 5 years ago
- Related to Story #15580: [CWL] Register workflow and run from git repo added
Updated by Tom Morris over 5 years ago
- Target version changed from To Be Groomed to Arvados Future Sprints
- Story points set to 3.0
Updated by Peter Amstutz over 3 years ago
- Target version deleted (Arvados Future Sprints)