Bug #7649

closed

Compute nodes should go down when not doing compute work with GATK Queue

Added by Bryan Cosca over 10 years ago. Updated over 10 years ago.

Status: Closed
Priority: Normal
Assigned To: -
Category: -
Target version: -
Story points: -

Description

Parallelizing GATK IndelRealigner with GATK Queue uses one node to spin out multiple child jobs. Once a child job finishes, there is no trigger to shut down the compute node it was running on, which wastes compute resources. In the attached screenshots, only 4 jobs are currently running (2 parent and 2 child), yet 24 nodes are busy. This is because IndelRealigner is parallelized out to 10 smaller sub-samples.





Files

rcjobs0.png (22.7 KB), Bryan Cosca, 10/26/2015 02:39 PM
rcjobs2.png (97.4 KB), Bryan Cosca, 10/26/2015 02:39 PM
rcjobs1.png (86.9 KB), Bryan Cosca, 10/26/2015 02:39 PM
rcjobs3.png (21.7 KB), Bryan Cosca, 10/26/2015 02:39 PM
#1

Updated by Brett Smith over 10 years ago

  • Status changed from New to Closed

This is a known limitation of Crunch v1. The Queue jobs have to stay running to monitor all the child jobs they created and report that progress back to Arvados appropriately. The fix will be Crunch v2, which is tracked in many tickets (search "Crunch2").
