Pipeline template development » History » Version 2
Bryan Cosca, 04/19/2016 07:55 PM
| 1 | 1 | Bryan Cosca | h1. Pipeline template development |
|---|---|---|---|
| 2 | |||
| 3 | 2 | Bryan Cosca | This wiki page describes how to write a pipeline template. Documentation for writing a pipeline template with run-command is available on "doc.arvados.org":http://doc.arvados.org/user/tutorials/running-external-program.html |
| 4 | |||
<pre>
"components": {
  "JobName": {
    "script": "JobScript",
    "script_version": "master",
    "repository": "yourname/yourname",
    "script_parameters": {
      "CollectionOne": {
        "required": true,
        "dataclass": "Collection"
      },
      "ParameterOne": {
        "required": true,
        "dataclass": "text",
        "default": "ParameterOneString"
      }
    },
    "runtime_constraints": {
      "docker_image": "bcosc/arv-base-java",
      "arvados_sdk_version": "master"
    }
  }
}
</pre>
| 29 | |||
| 30 | 1 | Bryan Cosca | How to wrap a git repository containing a crunch script and a docker image into a component |
| 31 | Link to "Git Strategy for Pipeline Development" wiki page |
| 32 | |||
| 33 | 2 | Bryan Cosca | h3. Writing script_parameters |
| 34 | 1 | Bryan Cosca | |
| 35 | 2 | Bryan Cosca | "Script_parameters":http://doc.arvados.org/api/schema/PipelineTemplate.html are inputs that your crunch script can read. Each script parameter has one of four dataclasses: Collection, File, number, or text. Collection passes in the portable data hash (pdh) string (e.g. 39c6f22d40001074f4200a72559ae7eb+5745), File passes in the path of a file within a collection (e.g. 39c6f22d40001074f4200a72559ae7eb+5745/foo.txt), number passes in any integer, and text passes in any string. |
| 36 | 1 | Bryan Cosca | |
| 37 | 2 | Bryan Cosca | The default parameter pre-fills a value that will most likely be used, so the user does not have to enter it manually. A typical example is a reference genome collection that is used throughout the entire pipeline. |
| 38 | |||
| 39 | The title and description parameters are optional, but they help show what the script parameter does. |
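
As a rough illustration of the fields described in this section, a single script parameter that combines dataclass, default, title, and description might look like the fragment below. This is a minimal sketch, not taken from a real pipeline: the parameter name "ReferenceGenome" and the title and description strings are hypothetical, and the default simply reuses the example pdh shown earlier on this page.

<pre>
"script_parameters": {
  "ReferenceGenome": {
    "required": true,
    "dataclass": "Collection",
    "default": "39c6f22d40001074f4200a72559ae7eb+5745",
    "title": "Reference genome",
    "description": "Collection holding the reference genome used throughout the pipeline"
  }
}
</pre>

With a default in place, the user only needs to change the value when a different collection should be used.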
| 40 | |||
| 41 | h3. Writing runtime_constraints |
| 42 | |||
| 43 | "Runtime_constraints":http://doc.arvados.org/api/schema/Job.html are job inputs that determine the kind of nodes your pipeline runs on. Guidance on optimizing these parameters is in the "Pipeline_Optimization wiki":https://dev.arvados.org/projects/arvados/wiki/Pipeline_Optimization. |
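
For orientation, a runtime_constraints block that also sets node-related keys could be sketched as follows. The docker_image and arvados_sdk_version values are copied from the template above; the numeric values for min_nodes, max_tasks_per_node, and min_ram_mb_per_node are placeholders chosen for illustration only, not recommendations, and the Job schema linked above should be checked for the keys your installation supports.

<pre>
"runtime_constraints": {
  "docker_image": "bcosc/arv-base-java",
  "arvados_sdk_version": "master",
  "min_nodes": 1,
  "max_tasks_per_node": 1,
  "min_ram_mb_per_node": 4096
}
</pre>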
| 44 | |||
| 45 | 1 | Bryan Cosca | The actual meaning of min_nodes |
| 46 | Setting max_tasks_per_node != 1 |