Story #13429
closed
[API] [arvados-cwl-runner] Save workflow outputs to desired storage classes
Added by Tom Morris over 6 years ago.
Updated over 6 years ago.
Estimated time:
(Total: 0.00 h)
Release relationship:
Auto
Description
arvados-cwl-runner should accept a command line option that determines the storage classes used to save the workflow output.
The minimal implementation is to update the storage_classes_desired field of the output collection. It's OK that this wastes storage resources by first writing the output data to the default class, then copying it to the desired class, and then (after blobSignatureTTL) deleting it from default.
- Related to Feature #11184: [Keep] Support multiple storage classes added
- Target version set to To Be Groomed
- Description updated (diff)
- Subject changed from arvados-cwl-runner support storage tiers to [API] [arvados-cwl-runner] Save workflow outputs to desired storage classes
- Related to deleted (Feature #11184: [Keep] Support multiple storage classes)
- Description updated (diff)
- Target version changed from To Be Groomed to Arvados Future Sprints
- Story points set to 1.0
- Target version changed from Arvados Future Sprints to 2018-06-06 Sprint
- Assigned To set to Fuad Muhic
Reviewing 95c01d22571043d7b6405f928538204cf930d453
- File
sdk/cwl/arvados_cwl/__init__.py
- Line 604: I think the default storage classes value could be set on the
parser.add_argument()
call
- Line 606: On 13430 one of the requirements was to accept just one storage class for the moment, it that also desirable on this story?
- Line 727: Typo on help message: 'wortkflow'
- I think some tests are needed for cases when the required storage classes is not the default value, and to check that it errors out when passed multiple storage classes.
- Status changed from New to In Progress
I added check and tests for multiple storage classes. I think it would be nice to have a test that checks if output collection
contains storage_classes_desired field set to proper value defined by --storage-classes argument. However I'm really not sure how to write that test, I tried but I guess I'm not experienced enough with cwl-runner.
- Target version changed from 2018-06-06 Sprint to 2018-06-20 Sprint
Reviewing b01b480adb45c3bbfcfab13e343e08c16854dedc
- File
sdk/cwl/tests/test_submit.py
* Line 338: capture_stdout
is not being used after the main()
call so I suppose it can avoided or read from it to assure that the correct error message is being printed?
- File
sdk/cwl/arvados_cwl/__init__.py
* Line 789: On error I think the way to go is to log it and return 1 (as in lines 804 & 824)
- I think a test that assures an output collection has correct correct
storage_classes_desired
field is not necessary as the Python SDK Collection’s test already proves that. OTOH I think having a test that confirms that default & non-default —storage-classes
values are passed correctly to make_output_collection()
will be very useful. I think this can be done using a similar approach as your test_error_when_multiple_storage_classes_specified
test and mocking make_output_collection
to check how it was called.
- I realized that the storage classes argument is being used only when
—submit
is not added. When it is used, a-c-r submits a container that runs another a-c-r with all necessary params (see test at test_submit.py
line 591, for example), so we’ll need a similar test that assures that the correct@ —storage-classes@ param is being passed along. I’m not sure where this needs to be added, please talk to Peter if you don’t find it.
- Also, remember to merge master into this branch so Jenkins test don't fail because of
ciso8601
tests: https://ci.curoverse.com/job/developer-run-tests/759/ (with merged master)
- Added test which check that storage_classes parameters is propagated correctly to make_output_collection
- In --submit mode, storage_classes are passes to submitted container. (I opted to pass --storage-classes parameter only if user specified if in the first place. If user don't specify --storage-classes, submitted container will not have --storage-classes=default. Everything still works correctly and we don't have to add 100+ lines to test_submit.py). I added tests for this.
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
- Status changed from Resolved to Closed
Also available in: Atom
PDF