Feature #4042
closed[Crunch] run-command script supports running MxN tasks (e.g., M directories x N chromosomes)
100%
Description
- 6 directories, each with 8 pairs of fastq files → 48 tasks, each operating on one pair
- 3 dirs with 1 pair of fastq files each + 3 dirs with 4 pairs of fastq files each → 15 tasks, each operating on one pair
- 6 dirs, each with 8 bam files → 6 tasks, each operating on one dir
- 6 dirs, each with 1 bam file, plus a static list of chromosomes → 6×23 tasks, each operating on one {bam file, chromosome}
- 6 dirs, each with 23 bam files → 6×23 tasks, each operating on one file
- 6 dirs, each with 23 bam files → 6 tasks, each operating on (all of the bam files in) one dir
The output should preserve the input directory structure: in each of the above examples, the output collection should have 6 directories with the same names as the 6 input directories.
Updated by Tom Clegg over 10 years ago
- Description updated (diff)
- Category set to Crunch
Updated by Tom Clegg over 10 years ago
- Subject changed from [Crunch] run-command script supports queueing MxN tasks (e.g., M directories x N chromosomes) to [Crunch] run-command script supports running MxN tasks (e.g., M directories x N chromosomes)
- Description updated (diff)
Updated by Peter Amstutz over 10 years ago
- Target version changed from Arvados Future Sprints to 2014-10-08 sprint
Updated by Peter Amstutz over 10 years ago
- Status changed from New to In Progress
Updated by Brett Smith over 10 years ago
General¶
In both error messages and documentation, "accessable" is misspelled; it should be "accessible."
Code¶
- Python dictionaries can use tuples as keys. Given this, I think
add_to_group
would be better off usingmatch.groups()
directly as the key. That way, there's no risk that different group matches will look identical because of conflicts with the join string, even as cute as^_^
is. - There are a couple of places in
expand_item
that sayp = pattern.match(i)
. This masks the originalp
argument that was passed in. I don't think that causes any behavior bugs in the current code (we never try to read the originalp
after), but changing it now might help prevent heartache later. - This doesn't need to change for the branch, but in case it's helpful for the future, you might find
os.path.isdir()
andos.path.isfile()
easier to use than mucking around with stat.
Documentation¶
- The two run-command include examples are identical.
_run_command_foreach_example.liquid
doesn't useforeach
anywhere. - Please run a spell checker on the run-command reference. Some typos I found: "evaluted", "substition", "commands’s", "concatinated."
- The description of the first example says
run “echo” with the first argument “hello”
, but the code echoes"hello world"
. - The description for
$(task.outdir)
seems to be cut off. - The second hyphen appears to be a typo in "The directory containing the source code for the run-command- script "
- I think "Note that parameter expansion is not performed lists produced this way." should say "performed on lists."
- "list context, as described below" - it's described above.
- "list items functions" should say "list item" (singular) for consistency with the rest of the documentation.
- Similarly, the index documentation should use monospace fonts for literal inputs and outputs, for consistency with the other function documentation. Plus Textile turns
--
into an em dash otherwise. - "write it output files" should say "its".
Updated by Peter Amstutz over 10 years ago
Brett Smith wrote:
General¶
In both error messages and documentation, "accessable" is misspelled; it should be "accessible."
Fixed.
Code¶
- Python dictionaries can use tuples as keys. Given this, I think
add_to_group
would be better off usingmatch.groups()
directly as the key. That way, there's no risk that different group matches will look identical because of conflicts with the join string, even as cute as^_^
is.
I tried that with a list and it said "lists are unhashable"; I didn't realize it worked for tuples. Fixed.
- There are a couple of places in
expand_item
that sayp = pattern.match(i)
. This masks the originalp
argument that was passed in. I don't think that causes any behavior bugs in the current code (we never try to read the originalp
after), but changing it now might help prevent heartache later.
Fixed.
- This doesn't need to change for the branch, but in case it's helpful for the future, you might find
os.path.isdir()
andos.path.isfile()
easier to use than mucking around with stat.
Good to know.
Documentation¶
- The two run-command include examples are identical.
_run_command_foreach_example.liquid
doesn't useforeach
anywhere.- Please run a spell checker on the run-command reference. Some typos I found: "evaluted", "substition", "commands’s", "concatinated."
- The description of the first example says
run “echo” with the first argument “hello”
, but the code echoes"hello world"
.- The description for
$(task.outdir)
seems to be cut off.- The second hyphen appears to be a typo in "The directory containing the source code for the run-command- script "
- I think "Note that parameter expansion is not performed lists produced this way." should say "performed on lists."
- "list context, as described below" - it's described above.
- "list items functions" should say "list item" (singular) for consistency with the rest of the documentation.
- Similarly, the index documentation should use monospace fonts for literal inputs and outputs, for consistency with the other function documentation. Plus Textile turns
--
into an em dash otherwise.- "write it output files" should say "its".
All fixed.
Updated by Brett Smith over 10 years ago
Peter Amstutz wrote:
- Python dictionaries can use tuples as keys. Given this, I think
add_to_group
would be better off usingmatch.groups()
directly as the key. That way, there's no risk that different group matches will look identical because of conflicts with the join string, even as cute as^_^
is.I tried that with a list and it said "lists are unhashable"; I didn't realize it worked for tuples. Fixed.
The rule is "keys have to be immutable," since you'd get bizarre results if you put a key in a dictionary and then mutated it. Fortunately, mutability is the differentiator between lists and tuples.
Reviewing bdd309b. Just a few small typos, go ahead and merge once these are fixed.
- The preface for the foreach example says "over multiple sample;" it should say "samples" plural.
- The foreach example is missing a comma after the
sample
parameter hash. - Small capitalization fixes: I think case-sensitive names like "run-command" and "script_parameters" should always be written with their correct case (i.e., lowercase), even at the beginning of sentences, to avoid bad input. Please capitalize the proper name "Unix."
Thanks.
Updated by Anonymous over 10 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:dd59a50f9f3933c359930806516b43899a8b4957.