Project

General

Profile

Actions

Story #10111

closed

[Workbench][Crunch2] Provenance graph for Container Request

Added by Tom Morris over 8 years ago. Updated over 7 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
04/25/2017
Due date:
% Done:

67%

Estimated time:
(Total: 0.00 h)
Story points:
2.0
Release:
Release relationship:
Auto

Description

Extract dependency graph from Container Request and pass to existing code which knows how to use GraphViz to format and reuse the rest of the existing infrastructure.


Subtasks 3 (0 open3 closed)

Task #11628: Rwview 10111-collection-labelsResolvedPeter Amstutz05/05/2017

Actions
Task #11381: Review 10111-cr-provenance-graphResolvedRadhika Chippada04/25/2017

Actions
Task #11619: Review 10111-cr-prov-regression-fixesResolved05/04/2017

Actions
Actions #1

Updated by Tom Morris over 8 years ago

  • Subject changed from Provenance graph for Container Request to [Crunch2] Provenance graph for Container Request
Actions #2

Updated by Tom Morris about 8 years ago

  • Story points set to 2.0

Extract dependency graph from Container Request and pass to existing code which knows how to use GraphViz to format and reuse the rest of the existing infrastructure.

Actions #3

Updated by Tom Morris about 8 years ago

  • Target version set to 2016-11-23 sprint
Actions #4

Updated by Tom Clegg about 8 years ago

  • Assigned To set to Tom Clegg
Actions #5

Updated by Tom Clegg about 8 years ago

  • Assigned To deleted (Tom Clegg)
Actions #6

Updated by Tom Clegg about 8 years ago

  • Target version changed from 2016-11-23 sprint to Arvados Future Sprints
Actions #7

Updated by Tom Morris about 8 years ago

  • Description updated (diff)
Actions #8

Updated by Tom Morris almost 8 years ago

  • Target version changed from Arvados Future Sprints to 2017-04-12 sprint
Actions #9

Updated by Tom Morris over 7 years ago

  • Tracker changed from Bug to Story
  • Subject changed from [Crunch2] Provenance graph for Container Request to [Workbench][Crunch2] Provenance graph for Container Request
  • Assigned To set to Lucas Di Pentima
Actions #10

Updated by Lucas Di Pentima over 7 years ago

  • Status changed from New to In Progress
Actions #11

Updated by Lucas Di Pentima over 7 years ago

  • Target version changed from 2017-04-12 sprint to 2017-04-26 sprint
Actions #12

Updated by Lucas Di Pentima over 7 years ago

Updates at 93c92875aaebe5b06f8dbfe2822b59a772895c08 (branch 10111-cr-provenance-graph)

Radhika: While I start working on the tests, I would like to check with you if this is the correct approach, and if there are missing elements on the graph to be included.

Actions #13

Updated by Lucas Di Pentima over 7 years ago

Merged master && added some tests: 4ccbea9ef
Test run: https://ci.curoverse.com/job/developer-run-tests/247/

Actions #14

Updated by Radhika Chippada over 7 years ago

For a CR -> Use Inputs from Mounts + output uuid + log uuid as the nodes

And then need to get all child CRs for this CR and repeat the above.

Actions #15

Updated by Lucas Di Pentima over 7 years ago

Updates at 30146198f
Test run: https://ci.curoverse.com/job/developer-run-tests/251/

  • Removed container_image and requesting_container from the graph.
  • Added child CRs with their own mounts/output/log.
Actions #16

Updated by Radhika Chippada over 7 years ago

The graph is looking pretty good. A few observations:

  • As we discussed, I think we should only use input collections (plus output and log uuids) for the CR in “find_collections cr[:mounts]” and exclude any other collections, if any. I think you probably need to look for collection uuids / pdhs in the segment returned by application_helper.get_cwl_inputs?
  • extra white space at line ends in _show_provenance.html.erb

Thanks.

Actions #17

Updated by Lucas Di Pentima over 7 years ago

Updates: 260e85a9d

  • As we discussed, I think we should only use input collections (plus output and log uuids) for the CR in “find_collections cr[:mounts]” and exclude any other collections, if any. I think you probably need to look for collection uuids / pdhs in the segment returned by application_helper.get_cwl_inputs?

Ah yes, I thought that all mounts that specified a UUID/PDH were implicitly an input.

I have changed the code so it searches only for collections inside the /var/lib/cwl/cwl.input.json mount, that as I understand by reading the get_cwl_inputs() helper, it's the object describing the input mounts.
The result is that if a CR step is created and have mounts that aren’t from a CWL definition (example: bwa command execution that uses FUSE), those mounts on the child CR won’t be included in the graph (ie: 9tee4’s /container_requests/9tee4-xvhdp-29wnyz1npk9bycs#Provenance), is that ok or should I search for “any collection” inside mounts when not using arvados-cwl-runner on the command?

  • extra white space at line ends in _show_provenance.html.erb

Oops! done.

Another question: Currently I’m showing the Provenance tab on those CRs with state != Uncommitted, should I change that to only CR in Final state?

Actions #18

Updated by Radhika Chippada over 7 years ago

I have changed the code so it searches only for collections inside the /var/lib/cwl/cwl.input.json mount, that as I understand by reading the get_cwl_inputs() helper, it's the object describing the input mounts. The result is that if a CR step is created and have mounts that aren’t from a CWL definition (example: bwa command execution that uses FUSE), those mounts on the child CR won’t be included in the graph (ie: 9tee4’s /container_requests/9tee4-xvhdp-29wnyz1npk9bycs#Provenance), is that ok or should I search for “any collection” inside mounts when not using arvados-cwl-runner on the command?

Yes, this seems problematic. I think we should check for /keep/<pdh> format instead in mounts to get the input collections. Please confirm with Peter. Thanks.

Another question: Currently I’m showing the Provenance tab on those CRs with state != Uncommitted, should I change that to only CR in Final state?

Comparing with pipeline_instances and jobs, this seems correct (to show graph for Queued etc)

Actions #19

Updated by Lucas Di Pentima over 7 years ago

Update at: 88c241d7c
Test run will be on: https://ci.curoverse.com/job/developer-run-tests/257/

Search for all PDHs on "mounts" on cases when cwl.input.json is not included. As talked with Radhika & Bryan, outputs aren't listed using PDHs, just paths. So there's no possibility of including an output as an input.

Actions #20

Updated by Lucas Di Pentima over 7 years ago

Update at: edfc619e6

As requested on the sprint review meeting, changed the graph edges from "cr" to "child" and "mounts" to "input".

Actions #21

Updated by Radhika Chippada over 7 years ago

LGTM @ edfc619

Actions #22

Updated by Lucas Di Pentima over 7 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Applied in changeset arvados|commit:b075d1be1377760f5d8497a29f63c8e416cd5378.

Actions #23

Updated by Peter Amstutz over 7 years ago

  • Status changed from Resolved to Feedback
  • Target version changed from 2017-04-26 sprint to 2017-05-10 sprint

Additional comments:

  • Label collection inputs by name. If the collection shows up under multiple different names, prefer the name of the collection in the current project. Otherwise pick any name and render it something like "HWI-ST1027_129_D0THKACXX for CWL tutorial + 4 more"
  • Don't render "log" outputs. They are just clutter.
  • I'm not sure if its a good idea to render "child" links. If you have 300 child containers it is just a lot of lines providing very little information. Consider using a graphviz "subgraph" or "cluster".
  • To determine the inputs of a container request, recursively search "mounts" for JSON fields that look like "portable_data_hash": "abc+123" and "location": "keep:abc+123"
Actions #24

Updated by Peter Amstutz over 7 years ago

In the interests of time, let's limit it to these changes:

  • Don't render "log" outputs. They are just clutter.
  • To determine the inputs of a container request, recursively search "mounts" for JSON fields that look like "portable_data_hash": "abc+123" and "location": "keep:abc+123" (this should ensure that nothing is missed)
Actions #25

Updated by Lucas Di Pentima over 7 years ago

Fixes at a4a8d41f6 - branch 10111-cr-prov-regression-fixes
Test run: https://ci.curoverse.com/job/developer-run-tests/273/

Actions #26

Updated by Lucas Di Pentima over 7 years ago

Branch 10111-collection-labels - commit 39755f764
Test run: https://ci.curoverse.com/job/developer-run-tests/274/

Added better collection labelling on CR provenance graph.

Actions #27

Updated by Lucas Di Pentima over 7 years ago

An integration test was failing, updated fix at e01823785
New test run: https://ci.curoverse.com/job/developer-run-tests/275/

Actions #28

Updated by Peter Amstutz over 7 years ago

Additional comment: where we have a container request with an explicit output_uuid, make sure to use the label corresponding to the name of the collection in output_uuid, before falling back on the logic outlined in note-24

Actions #29

Updated by Lucas Di Pentima over 7 years ago

Updates at 4259263d2
Test run: https://ci.curoverse.com/job/developer-run-tests/276/

Addressed issue about naming output collections after the cr's output_uuid collection reference.

Actions #30

Updated by Peter Amstutz over 7 years ago

How hard would it be to fix the hyperlinks so that when you have a specific UUID associated with a collection, clicking on it takes you directly to it and not to the "this PDH has multiple collections" page?

Actions #31

Updated by Peter Amstutz over 7 years ago

Another note. For labeling, if there are multiple collections but they have the same name, you don't need the "+N more"

It's making a separate API call for every collection. That adds a lot of latency. It should find all the PDHs in the graph, make a batch request for them all, and then filter on the workbench side.

Actions #32

Updated by Lucas Di Pentima over 7 years ago

Updates at b29ca38e4
Test run: https://ci.curoverse.com/job/developer-run-tests/278/

  • Refactored the graph creation code for CR so that it minimizes the amount of API calls when looking for information about outputs, inputs and childs.
  • For input collections, when there are more than one with the same name, don't add the "+N more" to the name label.
  • For output collections, added an option on describe_node() helper function so that the graph node is referenced by PDH, but link urls are rendered by UUID so they take the user to the specific collection page when clicking on it.
Actions #33

Updated by Lucas Di Pentima over 7 years ago

  • Target version changed from 2017-05-10 sprint to 2017-05-24 sprint
Actions #34

Updated by Peter Amstutz over 7 years ago

Ok, for large workflows, it still takes forever to load, but it seems that "dot" is the bottleneck now. We need to rethink representation, but not for this story (I'm putting that on a new ticket, #11680).

On the implementation:

The intended way to call GenerateGraph() was with pdata to contain all the nodes that will be used in the graph. In order to have better separation of concerns, would it make sense for the new code in generate_provenance_edges() that does the batch queries to move to container_requests_controller#generate_provenance ?

Actions #35

Updated by Lucas Di Pentima over 7 years ago

Updates at 795bf007c
Test run: https://ci.curoverse.com/job/developer-run-tests/284/

Moved the code related to API requests to the CR controller.

Actions #36

Updated by Peter Amstutz over 7 years ago

Thanks. This is a much better separation of concerns.

I'm unhappy with how it behaves with large graphs, but instead of continuing to go around back and forth I think we should merge 795bf007cbe24775bd348fb40fc5c28d93c8f23d and schedule a grooming session to figure out how rendering can be improved.

LGTM.

Actions #37

Updated by Lucas Di Pentima over 7 years ago

  • Status changed from Feedback to Resolved
  • % Done changed from 33 to 100

Applied in changeset arvados|commit:f5fbc48810d1397df9e6244c16cf07c05162d36a.

Actions

Also available in: Atom PDF