Idea #10111: [Workbench][Crunch2] Provenance graph for Container Request - Arvados

#1

Updated by Tom Morris over 9 years ago

Subject changed from Provenance graph for Container Request to [Crunch2] Provenance graph for Container Request

#2

Updated by Tom Morris over 9 years ago

Story points set to 2.0

Extract the dependency graph from the Container Request and pass it to the existing code that knows how to format it with GraphViz, reusing the rest of the existing infrastructure.

#3

Updated by Tom Morris over 9 years ago

Target version set to 2016-11-23 sprint

#4

Updated by Tom Clegg over 9 years ago

Assigned To set to Tom Clegg

#5

Updated by Tom Clegg over 9 years ago

Assigned To deleted (~~Tom Clegg~~)

#6

Updated by Tom Clegg over 9 years ago

Target version changed from 2016-11-23 sprint to Arvados Future Sprints

#7

Updated by Tom Morris over 9 years ago

Description updated (diff)

#8

Updated by Tom Morris about 9 years ago

Target version changed from Arvados Future Sprints to 2017-04-12 sprint

#9

Updated by Tom Morris almost 9 years ago

Tracker changed from Bug to Idea
Subject changed from [Crunch2] Provenance graph for Container Request to [Workbench][Crunch2] Provenance graph for Container Request
Assigned To set to Lucas Di Pentima

#10

Updated by Lucas Di Pentima almost 9 years ago

Status changed from New to In Progress

#11

Updated by Lucas Di Pentima almost 9 years ago

Target version changed from 2017-04-12 sprint to 2017-04-26 sprint

#12

Updated by Lucas Di Pentima almost 9 years ago

Updates at 93c92875aaebe5b06f8dbfe2822b59a772895c08 (branch 10111-cr-provenance-graph)

Radhika: While I start working on the tests, I would like to check with you whether this is the correct approach, and whether any elements are missing from the graph.

#13

Updated by Lucas Di Pentima almost 9 years ago

Merged master && added some tests: 4ccbea9ef
Test run: https://ci.curoverse.com/job/developer-run-tests/247/

#14

Updated by Radhika Chippada almost 9 years ago

For a CR, use the inputs from mounts plus the output uuid and the log uuid as the nodes.

Then get all child CRs for this CR and repeat the above.

#15

Updated by Lucas Di Pentima almost 9 years ago

Updates at 30146198f
Test run: https://ci.curoverse.com/job/developer-run-tests/251/

Removed container_image and requesting_container from the graph.
Added child CRs with their own mounts/output/log.

#16

Updated by Radhika Chippada almost 9 years ago

The graph is looking pretty good. A few observations:

As we discussed, I think we should only use input collections (plus output and log uuids) for the CR in “find_collections cr[:mounts]” and exclude any other collections, if any. I think you probably need to look for collection uuids / pdhs in the segment returned by application_helper.get_cwl_inputs?

extra white space at line ends in _show_provenance.html.erb

Thanks.

#17

Updated by Lucas Di Pentima almost 9 years ago

Updates: 260e85a9d

> As we discussed, I think we should only use input collections (plus output and log uuids) for the CR in “find_collections cr[:mounts]” and exclude any other collections, if any. I think you probably need to look for collection uuids / pdhs in the segment returned by application_helper.get_cwl_inputs?

Ah yes, I thought that all mounts that specified a UUID/PDH were implicitly an input.

I have changed the code so it searches only for collections inside the /var/lib/cwl/cwl.input.json mount, which, as I understand from reading the get_cwl_inputs() helper, is the object describing the input mounts.
The result is that if a CR step is created and has mounts that aren’t from a CWL definition (example: a bwa command execution that uses FUSE), those mounts on the child CR won’t be included in the graph (e.g. 9tee4’s /container_requests/9tee4-xvhdp-29wnyz1npk9bycs#Provenance). Is that OK, or should I search for “any collection” inside mounts when the command does not use arvados-cwl-runner?

> extra white space at line ends in _show_provenance.html.erb

Oops! Done.

Another question: currently I’m showing the Provenance tab on those CRs with state != Uncommitted. Should I change that to only CRs in Final state?

#18

Updated by Radhika Chippada almost 9 years ago

> I have changed the code so it searches only for collections inside the /var/lib/cwl/cwl.input.json mount, which, as I understand from reading the get_cwl_inputs() helper, is the object describing the input mounts. The result is that if a CR step is created and has mounts that aren’t from a CWL definition (example: a bwa command execution that uses FUSE), those mounts on the child CR won’t be included in the graph (e.g. 9tee4’s /container_requests/9tee4-xvhdp-29wnyz1npk9bycs#Provenance). Is that OK, or should I search for “any collection” inside mounts when the command does not use arvados-cwl-runner?

Yes, this seems problematic. I think we should instead check for the /keep/<pdh> format in mounts to get the input collections. Please confirm with Peter. Thanks.

> Another question: currently I’m showing the Provenance tab on those CRs with state != Uncommitted. Should I change that to only CRs in Final state?

Comparing with pipeline_instances and jobs, this seems correct (to show the graph for Queued etc.)

#19

Updated by Lucas Di Pentima almost 9 years ago

Update at: 88c241d7c
Test run will be on: https://ci.curoverse.com/job/developer-run-tests/257/

Search for all PDHs in "mounts" in cases where cwl.input.json is not included. As discussed with Radhika & Bryan, outputs aren't listed using PDHs, just paths, so there's no possibility of including an output as an input.
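
For readers following the thread, the fallback described above can be sketched as follows. This is a minimal Python illustration, not the Workbench code (which is Ruby): the function name find_input_pdhs and the regex-over-serialized-JSON approach are assumptions made for the example. The only things taken from the thread are the /var/lib/cwl/cwl.input.json mount path and the idea of searching all mounts when that mount is absent.

```python
import json
import re

# A portable data hash looks like 32 hex digits, "+", then a size.
PDH_RE = re.compile(r"\b[0-9a-f]{32}\+\d+")

# Mount path named in this thread as holding the CWL input object.
CWL_INPUT_MOUNT = "/var/lib/cwl/cwl.input.json"


def find_input_pdhs(mounts):
    """Return sorted, de-duplicated PDHs that look like inputs of a CR.

    Hypothetical helper: if the CWL input mount is present, only that
    mount is searched; otherwise every mount is searched.
    """
    scope = mounts.get(CWL_INPUT_MOUNT, mounts)
    # Serializing to JSON lets one regex scan keys and values alike,
    # whether the PDH appears as "keep:<pdh>/..." or as a bare field.
    return sorted(set(PDH_RE.findall(json.dumps(scope))))
```

Scanning the serialized text is deliberately loose: it picks up any PDH-shaped string in scope, which matches the "search for all PDHs" wording but may over-match compared with a field-by-field walk.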

#20

Updated by Lucas Di Pentima almost 9 years ago

Update at: edfc619e6

As requested at the sprint review meeting, changed the graph edge labels from "cr" to "child" and from "mounts" to "input".

#21

Updated by Radhika Chippada almost 9 years ago

LGTM @ edfc619

#22

Updated by Lucas Di Pentima almost 9 years ago

Status changed from In Progress to Resolved
% Done changed from 0 to 100

Applied in changeset arvados|commit:b075d1be1377760f5d8497a29f63c8e416cd5378.

#23

Updated by Peter Amstutz almost 9 years ago

Status changed from Resolved to Feedback
Target version changed from 2017-04-26 sprint to 2017-05-10 sprint

Additional comments:

Needs to use PDH so that inputs and output match up. For example, in this graph the output of "rev" is supposed to be an input of "sort": https://workbench.9tee4.arvadosapi.com/container_requests/9tee4-xvhdp-w382mn52hn18oad#Provenance

Label collection inputs by name. If the collection shows up under multiple different names, prefer the name of the collection in the current project. Otherwise pick any name and render it something like "HWI-ST1027_129_D0THKACXX for CWL tutorial + 4 more"

Don't render "log" outputs. They are just clutter.

I'm not sure if it's a good idea to render "child" links. If you have 300 child containers it is just a lot of lines providing very little information. Consider using a graphviz "subgraph" or "cluster".

To determine the inputs of a container request, recursively search "mounts" for JSON fields that look like "portable_data_hash": "abc+123" and "location": "keep:abc+123"

#24

Updated by Peter Amstutz almost 9 years ago

In the interests of time, let's limit it to these changes:

Needs to use PDH so that inputs and output match up. For example, in this graph the output of "rev" is supposed to be an input of "sort": https://workbench.9tee4.arvadosapi.com/container_requests/9tee4-xvhdp-w382mn52hn18oad#Provenance (this is the most important change, because the current behavior is effectively a regression from the equivalent functionality for jobs)

Don't render "log" outputs. They are just clutter.

To determine the inputs of a container request, recursively search "mounts" for JSON fields that look like "portable_data_hash": "abc+123" and "location": "keep:abc+123" (this should ensure that nothing is missed)
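
The recursive search requested above could look roughly like this. It is a Python sketch under stated assumptions, not the implementation that was merged: the name collect_pdhs is hypothetical, and the patterns assume a PDH is 32 hex digits, "+", and a size, optionally followed by a path after "keep:<pdh>".

```python
import re

# Whole-string match for a portable data hash.
PDH_RE = re.compile(r"^[0-9a-f]{32}\+\d+$")
# "keep:<pdh>" optionally followed by "/path/inside/collection".
KEEP_LOCATION_RE = re.compile(r"^keep:([0-9a-f]{32}\+\d+)(?:/.*)?$")


def collect_pdhs(node, found=None):
    """Walk nested dicts/lists and gather PDHs from the two field shapes."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(value, str):
                if key == "portable_data_hash" and PDH_RE.match(value):
                    found.add(value)
                elif key == "location":
                    match = KEEP_LOCATION_RE.match(value)
                    if match:
                        found.add(match.group(1))
            else:
                # Recurse into nested objects and arrays; scalars are ignored.
                collect_pdhs(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_pdhs(item, found)
    return found
```

Because only the two named fields are inspected, a PDH-shaped string under any other key is ignored, which keeps the result limited to what the comment asks for.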

#25

Updated by Lucas Di Pentima almost 9 years ago

Fixes at a4a8d41f6 - branch 10111-cr-prov-regression-fixes
Test run: https://ci.curoverse.com/job/developer-run-tests/273/

#26

Updated by Lucas Di Pentima almost 9 years ago

Branch 10111-collection-labels - commit 39755f764
Test run: https://ci.curoverse.com/job/developer-run-tests/274/

Added better collection labelling on CR provenance graph.

#27

Updated by Lucas Di Pentima almost 9 years ago

An integration test was failing; updated fix at e01823785
New test run: https://ci.curoverse.com/job/developer-run-tests/275/

#28

Updated by Peter Amstutz almost 9 years ago

Additional comment: where we have a container request with an explicit output_uuid, make sure to use the label corresponding to the name of the collection in output_uuid, before falling back on the logic outlined in note-24

#29

Updated by Lucas Di Pentima almost 9 years ago

Updates at 4259263d2
Test run: https://ci.curoverse.com/job/developer-run-tests/276/

Addressed issue about naming output collections after the cr's output_uuid collection reference.

#30

Updated by Peter Amstutz almost 9 years ago

How hard would it be to fix the hyperlinks so that when you have a specific UUID associated with a collection, clicking on it takes you directly to it and not to the "this PDH has multiple collections" page?

#31

Updated by Peter Amstutz almost 9 years ago

Another note. For labeling, if there are multiple collections but they have the same name, you don't need the "+N more"

It's making a separate API call for every collection. That adds a lot of latency. It should find all the PDHs in the graph, make a batch request for them all, and then filter on the workbench side.
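
The batching idea above can be sketched like this: gather every PDH first, make one lookup, then group the results locally. This is a Python illustration only. The fetch_batch callable is hypothetical and stands in for whatever single API request returns collection records for a list of PDHs (for instance, one list request filtered on portable_data_hash); the record fields used are assumptions as well.

```python
from collections import defaultdict


def collections_by_pdh(pdhs, fetch_batch):
    """Look up all PDHs in one call and group the records by PDH.

    fetch_batch(list_of_pdhs) is an assumed callable that returns an
    iterable of dicts, each with a "portable_data_hash" key.
    """
    wanted = sorted(set(pdhs))
    if not wanted:
        return {}  # Nothing to ask for, so skip the round trip.
    wanted_set = set(wanted)
    grouped = defaultdict(list)
    for record in fetch_batch(wanted):  # the only remote call
        pdh = record["portable_data_hash"]
        if pdh in wanted_set:  # client-side filtering
            grouped[pdh].append(record)
    return dict(grouped)
```

With this shape the number of remote calls stays at one regardless of how many collections appear in the graph; paging of very large result sets is left out of the sketch.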

#32

Updated by Lucas Di Pentima almost 9 years ago

Updates at b29ca38e4
Test run: https://ci.curoverse.com/job/developer-run-tests/278/

Refactored the graph creation code for CR so that it minimizes the number of API calls when looking for information about outputs, inputs and children.
For input collections, when there is more than one with the same name, don't add the "+N more" to the name label.
For output collections, added an option to the describe_node() helper function so that the graph node is referenced by PDH, but link URLs are rendered by UUID so they take the user to the specific collection page when clicking on it.
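
The labelling rules discussed in this thread (prefer the name in the current project, otherwise show one name plus "+ N more", and drop the suffix when all names are identical) can be sketched as below. This is a Python illustration, not the Workbench helper: collection_label is a hypothetical name, the record fields "name" and "owner_uuid" are assumptions, and counting N as the number of other distinct names is one possible reading of the request.

```python
def collection_label(records, current_project_uuid=None):
    """Pick a display label for a PDH shared by one or more collections."""
    if not records:
        return None
    # Rule 1: a collection owned by the current project wins outright.
    if current_project_uuid is not None:
        for record in records:
            if record.get("owner_uuid") == current_project_uuid:
                return record["name"]
    # Rule 2: otherwise take the first name and count the other distinct
    # names; identical names collapse, so no "+ N more" suffix is added.
    distinct = list(dict.fromkeys(record["name"] for record in records))
    extra = len(distinct) - 1
    if extra == 0:
        return distinct[0]
    return "%s + %d more" % (distinct[0], extra)
```

The output_uuid preference mentioned earlier in the thread would sit in front of these rules and is not shown here.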

#33

Updated by Lucas Di Pentima almost 9 years ago

Target version changed from 2017-05-10 sprint to 2017-05-24 sprint

#34

Updated by Peter Amstutz almost 9 years ago

Ok, for large workflows, it still takes forever to load, but it seems that "dot" is the bottleneck now. We need to rethink representation, but not for this story (I'm putting that on a new ticket, #11680).

On the implementation:

The intended way to call GenerateGraph() was with pdata containing all the nodes that will be used in the graph. In order to have better separation of concerns, would it make sense to move the new code in generate_provenance_edges() that does the batch queries to container_requests_controller#generate_provenance?

#35

Updated by Lucas Di Pentima almost 9 years ago

Updates at 795bf007c
Test run: https://ci.curoverse.com/job/developer-run-tests/284/

Moved the code related to API requests to the CR controller.

#36

Updated by Peter Amstutz almost 9 years ago

Thanks. This is a much better separation of concerns.

I'm unhappy with how it behaves with large graphs, but instead of continuing to go back and forth I think we should merge 795bf007cbe24775bd348fb40fc5c28d93c8f23d and schedule a grooming session to figure out how rendering can be improved.

LGTM.

#37

Updated by Lucas Di Pentima almost 9 years ago

Status changed from Feedback to Resolved
% Done changed from 33 to 100

Applied in changeset arvados|commit:f5fbc48810d1397df9e6244c16cf07c05162d36a.

Task #11628: Review 10111-collection-labels	Resolved	Peter Amstutz	05/05/2017
Task #11381: Review 10111-cr-provenance-graph	Resolved	Radhika Chippada	04/25/2017
Task #11619: Review 10111-cr-prov-regression-fixes	Resolved		05/04/2017
