Feature #8381
closed
bcbio support for variant calling in CWL
0%
Description
Add support for bcbio variant calling to CWL generation. We currently support parallel alignment; adding variant calling would enable GATK best practices pipelines and VarDict/somatic integration, in coordination with current work from Tom and Sally.
Requirements:
- Batching of samples to allow pooled or tumor/normal calling. This grouping needs to be represented in CWL.
- Parallel runs of batches across genomic regions with subsequent merging.
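As a rough illustration of the batching requirement, here is a minimal Python sketch that groups sample records by a batch name so that tumor/normal pairs or pooled samples end up together. The record layout (`description`, `batch`, `phenotype` keys) and the `group_by_batch` helper are assumptions made for this example, not the actual bcbio or CWL representation.

```python
from collections import defaultdict


def group_by_batch(samples):
    """Group sample records by batch name.

    Samples without a batch form their own single-sample group, keyed by
    their description. The dictionary keys used here are hypothetical.
    """
    batches = defaultdict(list)
    for sample in samples:
        name = sample.get("batch") or sample["description"]
        batches[name].append(sample)
    return dict(batches)


# Illustrative input: one tumor/normal pair plus one unbatched sample.
samples = [
    {"description": "T1", "batch": "pair1", "phenotype": "tumor"},
    {"description": "N1", "batch": "pair1", "phenotype": "normal"},
    {"description": "S3"},
]
grouped = group_by_batch(samples)
```

Each value in `grouped` is the list of records that would be called together, which is the unit that then needs a CWL representation.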
Updated by Brad Chapman almost 9 years ago
- Status changed from New to In Progress
Progress so far:
- bcbio batches samples together into groups (tumor/normal or family calling) using CWL records.
- Submitted PR to cwltool that enables grouping after discussion with Peter: https://github.com/common-workflow-language/cwltool/pull/40
- We can split batches based on genomic regions to run in parallel.
- Variant calling runs in parallel on defined regions.
- Merge variant calls back into a single VCF.
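The split, parallel call, and merge steps listed above can be sketched in plain Python. This is only an illustration of the scatter/gather pattern under stated assumptions: `split_regions`, `call_region`, and `call_and_merge` are hypothetical names, the "caller" is a placeholder that filters precomputed positions, and fixed-size windows stand in for whatever region definition bcbio actually uses.

```python
from concurrent.futures import ThreadPoolExecutor


def split_regions(contig_lengths, chunk_size):
    """Yield (contig, start, end) half-open windows covering each contig."""
    for contig, length in contig_lengths.items():
        for start in range(0, length, chunk_size):
            yield contig, start, min(start + chunk_size, length)


def call_region(region, positions):
    """Placeholder caller: keep candidate positions that fall in the region."""
    contig, start, end = region
    return [(c, pos) for c, pos in positions if c == contig and start <= pos < end]


def call_and_merge(contig_lengths, positions, chunk_size=1000, workers=4):
    """Run the placeholder caller per region in parallel, then merge results."""
    regions = list(split_regions(contig_lengths, chunk_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_region = list(pool.map(lambda r: call_region(r, positions), regions))
    # Merge in reference order: contig order first, then position.
    order = {contig: i for i, contig in enumerate(contig_lengths)}
    merged = [call for calls in per_region for call in calls]
    return sorted(merged, key=lambda call: (order[call[0]], call[1]))


# Illustrative input: two contigs and three unsorted candidate positions.
contig_lengths = {"chr1": 2500, "chr2": 1200}
positions = [("chr2", 1100), ("chr1", 2400), ("chr1", 10)]
merged = call_and_merge(contig_lengths, positions)
```

Sorting by contig order and then position mirrors the need for the merged output to be a single coordinate-sorted file; a real merge would also carry VCF headers and records rather than bare positions.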
To do steps, which could be a new story:
- Post-calling filtering of VCFs
- Additional VCF annotations for effects (snpEff)
Updated by Brad Chapman almost 9 years ago
- Status changed from In Progress to Resolved
Finalized post-call filtering of VCFs. I am punting on snpEff annotations for now, since linking in all of the associated snpEff data files is a bit more complex.
This puts us in place to have a more complete demonstration CWL with variant calling for #8176. I will put together documentation and a new example file as part of that story.
This allows us to move on to validation (#8382) which we could schedule for the next sprint.