Tom Clegg, 10/08/2014 03:28 PM
Reusable tasks¶
Tom Clegg
Last Updated: October 6, 2014
Overview¶
Objective¶
Say jobs A and B, although not identical, have some tasks in common. Job A is complete. Job B is starting now. They use the same script, version, docker image, etc. The only difference between A and B is that B's input collection has one more file; the rest of the files are identical. The script processes each input file independently, and it is a pure function (re-computing the same files will produce the same result). This means most of Job B's work has already been done. Task re-use will allow Arvados to recognize this condition and re-use the outputs of Job A's tasks instead of recomputing them.
Task re-use will not attempt to detect equivalence conditions like differently-encoded collection manifests with identical data, differing git commits with identical trees, and differing docker images with functionally equivalent content.
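As a rough illustration of the scenario above, the following sketch models a job as a list of per-file tasks and reuses outputs by exact match on the recorded attributes. The names TaskKey, run_job, and process_file are hypothetical and not part of Arvados. The point is only that identifiers are compared literally, so a different commit hash with an identical tree would not match.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple


class TaskKey(NamedTuple):
    """Attributes compared literally when deciding whether a task can be reused.

    Hypothetical structure for illustration; not an Arvados data type.
    """
    script: str
    script_version: str  # e.g. a git commit hash, compared as a plain string
    docker_image: str
    input_file: str      # identifier of one input file


def run_job(
    script: str,
    script_version: str,
    docker_image: str,
    input_files: List[str],
    process_file: Callable[[str], str],
    completed: Dict[TaskKey, str],
) -> Tuple[Dict[str, str], List[str], List[str]]:
    """Run one task per input file, reusing outputs recorded in `completed`.

    Returns (outputs by file, files whose output was reused, files computed).
    """
    outputs: Dict[str, str] = {}
    reused: List[str] = []
    computed: List[str] = []
    for f in input_files:
        key = TaskKey(script, script_version, docker_image, f)
        if key in completed:
            reused.append(f)                   # work already done by an earlier job
        else:
            completed[key] = process_file(f)   # pure function of the input file
            computed.append(f)
        outputs[f] = completed[key]
    return outputs, reused, computed
```

With job A run on files f1 and f2, a later job B on f1, f2, and f3 with the same script, version, and image would compute only f3. A job that differs only in script_version would recompute everything, even if the two commits had identical trees.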
The intended audience for this document is software engineers.
Background¶
The arvados.v1.jobs.create API offers a find_or_create feature that searches for an existing job that meets criteria specified by the client (e.g., same script, compatible script_version) and additional criteria (e.g., did not fail, is not marked impure/nondeterministic, does not disagree with other jobs passing the same criteria about what the correct output is).
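The selection criteria described above can be sketched as a filter over candidate job records. This is an illustration only, not the API server's implementation. The function name find_reusable_job is hypothetical, the dictionary keys are assumed for this sketch, and "compatible script_version" is simplified to membership in a set of acceptable versions.

```python
from typing import Dict, Iterable, Optional, Set

Job = Dict[str, object]


def find_reusable_job(
    candidates: Iterable[Job],
    script: str,
    acceptable_versions: Set[str],
) -> Optional[Job]:
    """Return an existing job that satisfies all criteria, or None.

    None means the caller would go on to create a new job.
    """
    matching = [
        j for j in candidates
        if j.get("script") == script
        and j.get("script_version") in acceptable_versions  # client criteria
        and j.get("success") is True                        # did not fail
        and not j.get("nondeterministic")                   # not marked impure
    ]
    # If the qualifying jobs disagree about the correct output, reuse none of them.
    distinct_outputs = {j.get("output") for j in matching}
    if len(distinct_outputs) != 1:
        return None
    return matching[0]
```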
Alternatives¶
Always recompute each task (i.e., leave existing behavior).
This makes desirable use cases prohibitively expensive.
Use smaller jobs, and more jobs per pipeline.
We could make the dynamic-structure capabilities of crunch jobs available at the pipeline level, and de-emphasize or stop using the features that encourage long-running jobs. Disadvantages include:
- The process of running a pipeline is not done in a controlled environment. This effectively reduces the utility of reproducibility and provenance features.
- Pipelines are currently encoded as JSON, which is awkward to use as a DSL.
Tradeoffs¶
TODO
High Level Design¶
Before executing a job_task that qualifies for re-use, crunch-job uses the API to discover existing job_tasks that are functionally identical, are marked as "pure", and have already finished.
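The control flow described above might look roughly like the following sketch. It is written in Python for brevity and does not reflect how crunch-job is actually structured. The callables find_completed_pure_tasks (standing in for the API query) and run_task (standing in for normal execution) are hypothetical placeholders supplied by the caller.

```python
from typing import Callable, Dict, List, Tuple

Task = Dict[str, object]


def execute_or_reuse(
    task: Task,
    find_completed_pure_tasks: Callable[[Task], List[Task]],
    run_task: Callable[[Task], object],
) -> Tuple[Task, bool]:
    """Reuse the output of an equivalent finished task if one exists.

    Returns the updated task and a flag saying whether its output was reused.
    """
    if task.get("is_pure"):
        # Only tasks marked pure qualify for reuse at all.
        matches = find_completed_pure_tasks(task)
        if matches:
            task["output"] = matches[0]["output"]
            return task, True
    # No qualifying match: fall back to executing the task normally.
    task["output"] = run_task(task)
    return task, False
```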
Specifics¶
Detailed Design¶
- The JobTask schema has a new boolean flag is_pure (not null, default false).
- When a task has is_pure==true, crunch-job does an API query to look up other tasks with is_pure==true and identical inputs, parameters, script_version, etc.
- Some attributes like script and script_version are currently stored in the job record, not the job_task record. This complicates the lookup, since there is no generic "join" API.
- Tasks with is_pure==true cannot queue additional tasks, and is_pure cannot change from false to true.
- Tasks do not qualify for reuse until they have completed.[1] When reusing a task, copy (and reset to "todo" state) each task whose created_by_job_task_uuid attribute references the task being reused.
[1] At least in the short term, this constraint is a good way to limit the complexity of implementation without sacrificing too much of the user benefit.
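One way to work around the missing "join" is a two-step lookup: first find jobs with matching job-level attributes, then find tasks belonging to those jobs. The sketch below works on in-memory lists rather than real API calls. The function names and the "state" key (with values "todo" and "done") are assumptions made for illustration, and "identical inputs" is simplified to equality of the parameters attribute.

```python
import copy
from typing import Dict, List

Record = Dict[str, object]


def find_reusable_tasks(task: Record, job: Record,
                        jobs: List[Record], tasks: List[Record]) -> List[Record]:
    """Two-step lookup standing in for a join between jobs and job_tasks."""
    # Step 1: jobs sharing the job-level attributes (script, script_version).
    job_uuids = {
        j["uuid"] for j in jobs
        if j.get("script") == job.get("script")
        and j.get("script_version") == job.get("script_version")
    }
    # Step 2: completed pure tasks of those jobs with identical parameters.
    return [
        t for t in tasks
        if t.get("job_uuid") in job_uuids
        and t.get("uuid") != task.get("uuid")
        and t.get("is_pure") is True
        and t.get("state") == "done"   # hypothetical field: only finished tasks qualify
        and t.get("parameters") == task.get("parameters")
    ]


def copy_child_tasks(reused: Record, new_task: Record,
                     tasks: List[Record]) -> List[Record]:
    """Copy each task created by the reused task, reset to the "todo" state."""
    copies = []
    for t in tasks:
        if t.get("created_by_job_task_uuid") == reused["uuid"]:
            c = copy.deepcopy(t)
            c.pop("uuid", None)                 # a new uuid would be assigned on create
            c["job_uuid"] = new_task["job_uuid"]
            c["created_by_job_task_uuid"] = new_task["uuid"]
            c["state"] = "todo"                 # hypothetical field, as above
            c["output"] = None
            copies.append(c)
    return copies
```

With a real API, each step would be a separate list query, so the cost of the lookup grows with the number of matching jobs.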
Code Location¶
- sdk/cli/bin/crunch-job will have new task reuse logic.
- services/api/db/migrate will have a new migration, which will be reflected in services/api/db/structure.sql.
- services/api/app/models/job_task.rb will add :is_pure to the API response and prohibit is_pure from changing from false to true.
- doc/api/schema/JobTask.html.textile.liquid will document the :is_pure flag.
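The rule that is_pure may not change from false to true is small enough to state as a language-neutral sketch. The real check would live in the Rails model named above. The function and exception names here are hypothetical and written in Python only for illustration.

```python
class IsPureTransitionError(ValueError):
    """Raised when an update tries to turn is_pure from false to true."""


def validate_is_pure_change(old_value: bool, new_value: bool) -> bool:
    """Return new_value if the transition is allowed, otherwise raise.

    Allowed: false -> false, true -> true, true -> false.
    Prohibited: false -> true.
    """
    if not old_value and new_value:
        raise IsPureTransitionError("is_pure cannot change from false to true")
    return new_value
```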
Testing Plan¶
TODO
Logging¶
crunch-job will log the fact that it has copied its output attribute (and, if applicable, queued additional tasks) from an existing completed task.
Debugging¶
TODO
Caveats¶
To be determined.
Security Concerns¶
TODO
Open Questions and Risks¶
TODO
Work Estimates¶
TODO
Future Work¶
TODO
Revision History¶
Date | Revisions Made | Author | Reviewed By |
---|---|---|---|
October 6, 2014 | Initial Draft | Tom Clegg | ---- |