Feature #8381

closed

bcbio support for variant calling in CWL

Added by Brad Chapman almost 9 years ago. Updated almost 9 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Crunch
Target version:
Start date: 02/04/2016
Due date:
% Done: 0%
Estimated time:
Story points: 2.0

Description

Add support for variant calling with bcbio to CWL generation. We currently support parallel alignment; adding variant calling would enable GATK best-practices pipelines and VarDict/somatic integration, in coordination with current work from Tom and Sally.

Requirements:

- Batching of samples to allow pooled or tumor/normal calling. This grouping needs to be represented in CWL.
- Parallel runs of batches across genomic regions with subsequent merging.
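
As a rough illustration of the batching requirement, here is a minimal Python sketch that groups samples sharing a batch name so they can be called together. The `group_by_batch` helper and the `name`, `batch`, and `phenotype` keys are hypothetical, chosen for illustration; this is not bcbio's actual data model or its CWL representation.

```python
from collections import defaultdict


def group_by_batch(samples):
    """Group sample dicts by their 'batch' key (hypothetical schema).

    Samples without a batch form a single-sample group keyed by name.
    """
    batches = defaultdict(list)
    for sample in samples:
        key = sample.get("batch") or sample["name"]
        batches[key].append(sample)
    return dict(batches)


# Toy input: one tumor/normal pair plus one unbatched sample.
samples = [
    {"name": "T1", "batch": "pair1", "phenotype": "tumor"},
    {"name": "N1", "batch": "pair1", "phenotype": "normal"},
    {"name": "S3"},
]
batches = group_by_batch(samples)
```

Each resulting group would then be handed to the caller as one unit, which is the structure the CWL side has to be able to express.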

Actions #1

Updated by Brad Chapman almost 9 years ago

  • Status changed from New to In Progress

Progress so far:

- bcbio batches samples together into groups (tumor/normal or family calling) using CWL records.
- Submitted a PR to cwltool that enables grouping, after discussion with Peter: https://github.com/common-workflow-language/cwltool/pull/40
- Batches can be split by genomic region to run in parallel.
- Variant calling runs in parallel on the defined regions.
- Variant calls are merged back into a single VCF.
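
The split, parallel call, and merge steps above follow a scatter/gather pattern, which can be sketched in plain Python as below. Everything here is an assumption for illustration: `split_regions`, `call_region`, and `merge_calls` are hypothetical names, the caller is a placeholder that returns a made-up record, and the real work happens in CWL steps rather than a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor


def split_regions(contig_lengths, chunk_size):
    """Cut each contig into half-open (contig, start, end) chunks."""
    regions = []
    for contig, length in contig_lengths.items():
        for start in range(0, length, chunk_size):
            regions.append((contig, start, min(start + chunk_size, length)))
    return regions


def call_region(region):
    """Placeholder caller: returns one toy (contig, pos, ref, alt) record."""
    contig, start, _end = region
    return [(contig, start + 1, "A", "G")]


def merge_calls(per_region_calls, contig_order):
    """Flatten per-region results and sort by contig order, then position."""
    rank = {contig: i for i, contig in enumerate(contig_order)}
    merged = [rec for calls in per_region_calls for rec in calls]
    return sorted(merged, key=lambda rec: (rank[rec[0]], rec[1]))


contigs = {"chr1": 250, "chr2": 100}  # toy contig lengths
regions = split_regions(contigs, 100)
with ThreadPoolExecutor(max_workers=4) as pool:
    per_region = list(pool.map(call_region, regions))
merged = merge_calls(per_region, list(contigs))
```

Sorting on the merge step keeps the combined records in contig and position order regardless of which region finished first.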

Remaining to-do steps, which could become a new story:

- Post-calling filtering of VCFs
- Additional VCF annotations for effects (snpEff)

Actions #2

Updated by Brad Chapman almost 9 years ago

  • Status changed from In Progress to Resolved

Finalized post-call filtering of VCFs. I am punting on snpEff annotations for now, since linking in all of the associated snpEff data files is a bit more complex.

This puts us in a position to have a more complete demonstration CWL with variant calling for #8176. I will put together documentation and a new example file as part of that story.

It also lets us move on to validation (#8382), which we could schedule for the next sprint.
