CNV Examples
These examples show how to use DRAGEN CNV to process already mapped and aligned BAM files using the two modes of normalization supported by the DRAGEN CNV pipeline: Self Normalization and Panel of Normals.
Self Normalization requires that the DRAGEN hash table be generated with the enable-cnv=true option. If you run CNV frequently, always generate a CNV-compatible hash table.
The enable-map-align option is set to true by default in the configuration file. Set it to false if you do not need to map and align the input BAM.
Set the --intermediate-results-dir option to a local directory (for example, /staging/intermediate or a directory on a local SSD). Otherwise, the CNV step can take a long time to process.
Running with Self Normalization
Self Normalization is the preferred method for single-sample WGS processing. The BAM goes through the entire CNV pipeline with a single command line.
dragen -f \
-r /staging/human/reference/hg19/hg19.fa.k_21.f_16.m_149 \
-b /staging/examples/SRA056922_30x_e10_50M.bam \
--intermediate-results-dir /staging/intermediate \
--output-directory /staging/examples/ \
--output-file-prefix dragen_cnv1 \
--enable-map-align false \
--enable-cnv true \
--cnv-enable-self-normalization true
Running with a Panel of Normals
The Panel of Normals approach requires pregenerating the target.counts file for each sample to be used, and then executing one final command to perform the normalization and copy number variant calling.
The following example command calculates target counts from BAM input. It extracts the signals, including read counts, from the alignments in the BAM file and generates a *.target.counts file to be used in the normalization step. Calculate target counts for each input BAM file, including the case sample under analysis and the normal samples.
dragen -f \
-r /staging/human/reference/hg19/hg19.fa.k_21.f_16.m_149 \
-b /staging/examples/SRA056922_30x_e10_50M.bam \
--intermediate-results-dir /staging/intermediate \
--output-directory /staging/examples/ \
--output-file-prefix dragen_cnv1 \
--enable-map-align false \
--enable-cnv true
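Because target counts are needed for every input BAM, the command above can be wrapped in a small loop. The following is a minimal sketch, not part of the DRAGEN documentation. The function names, the DRAGEN variable used for dry runs, and the choice to derive each output prefix from the BAM file name are assumptions made for illustration.

```shell
# Sketch only: run the target-counts step once per BAM (case and normals).
# Paths default to the ones in the example above; override via the environment.
REF=${REF:-/staging/human/reference/hg19/hg19.fa.k_21.f_16.m_149}
OUT_DIR=${OUT_DIR:-/staging/examples/}
TMP_DIR=${TMP_DIR:-/staging/intermediate}

# Run the target-counts command for one BAM.
# Assumption: the output prefix is the BAM base name, so samples do not
# overwrite each other's *.target.counts files.
target_counts() {
  bam=$1
  prefix=$(basename "$bam" .bam)
  # DRAGEN is a hypothetical override; DRAGEN="echo dragen" prints the command
  # instead of running it.
  ${DRAGEN:-dragen} -f \
    -r "$REF" \
    -b "$bam" \
    --intermediate-results-dir "$TMP_DIR" \
    --output-directory "$OUT_DIR" \
    --output-file-prefix "$prefix" \
    --enable-map-align false \
    --enable-cnv true
}

# Process every BAM passed as an argument, stopping at the first failure.
target_counts_all() {
  for bam_path in "$@"; do
    target_counts "$bam_path" || return 1
  done
}
```

Setting DRAGEN="echo dragen" before calling target_counts_all with a list of BAM paths prints one command per sample, which allows the commands to be reviewed before anything is run.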
The following command performs the normalization and generates the CNV calls. List the normal samples in a text file (normal.txt in this example) that provides the paths to the *.target.counts files of the normal samples. Specify the case sample *.target.counts file with the --cnv-input option. In this example, GC bias correction of the input is disabled.
dragen -f \
-r /staging/human/reference/hg19/hg19.fa.k_21.f_16.m_149 \
--intermediate-results-dir /staging/intermediate \
--output-directory /staging/examples/ \
--output-file-prefix dragen_cnv2 \
--enable-cnv true \
--cnv-input /staging/examples/dragen_cnv1.target.counts \
--cnv-normals-list normal.txt \
--cnv-enable-gcbias-correction false
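The normals list can be assembled from a directory of previously generated target counts files. The following is a minimal sketch under stated assumptions: it assumes the list holds one path per line, the function name make_normals_list is hypothetical, and skipping the case sample file is a cautious choice made here rather than a documented requirement.

```shell
# Sketch only: collect *.target.counts paths into a normals list file.
# Assumption: the list file holds one path per line.
# Usage: make_normals_list NORMALS_DIR LIST_FILE [CASE_FILE_NAME]
make_normals_list() {
  normals_dir=$1
  list_file=$2
  case_name=${3:-}
  # Sort the paths so the panel order is reproducible between runs.
  find "$normals_dir" -type f -name '*.target.counts' | sort |
    while IFS= read -r counts_path; do
      # Leave out the case sample (matched by file name) if one was given.
      if [ -n "$case_name" ] && [ "$(basename "$counts_path")" = "$case_name" ]; then
        continue
      fi
      printf '%s\n' "$counts_path"
    done > "$list_file"
}
```

For the example above, make_normals_list /staging/examples normal.txt dragen_cnv1.target.counts would write every target counts file in /staging/examples except the case sample to normal.txt.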
FASTQ Processing
This example runs the DRAGEN CNV caller in Self Normalization mode directly from FASTQ samples. It first maps and aligns the FASTQ files and then continues directly to CNV calling. This step can be combined with variant calling.
dragen -f \
-r /staging/human/reference/hg19/hg19.fa.k_21.f_16.m_149 \
-1 /staging/examples/reads/SRA056922_30x_shuffle16k_e10_50M_1.fastq.gz \
-2 /staging/examples/reads/SRA056922_30x_shuffle16k_e10_50M_2.fastq.gz \
--RGID Illumina_ID \
--RGSM SRA056922_30x_shuffle16k \
--intermediate-results-dir /staging/intermediate \
--output-directory /staging/examples/ \
--output-file-prefix dragen_cnv \
--enable-map-align true \
--enable-cnv true \
--cnv-enable-self-normalization true
Running De Novo CNV Calling
De novo calling requires previously generated normalized signal files (*.tn.tsv) from the single-sample analysis of each family member. If a pedigree file is supplied, a de novo state and a de novo quality score are annotated on the proband sample’s records.
dragen -f \
-r /staging/human/reference/hg19/hg19.fa.k_21.f_16.m_149 \
--cnv-input father.tn.tsv \
--cnv-input mother.tn.tsv \
--cnv-input child.tn.tsv \
--intermediate-results-dir /staging/intermediate \
--output-directory /staging/examples/ \
--output-file-prefix trio_cnv \
--pedigree-file trio.ped \
--enable-cnv true
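The pedigree file passed with --pedigree-file describes the trio. The following is a minimal sketch that writes such a file in the common six-column, tab-separated PED layout (family ID, individual ID, father ID, mother ID, sex, phenotype). The function name, the family and sample IDs, and the phenotype values are assumptions for illustration; the individual IDs are assumed to need to match the sample names in the *.tn.tsv inputs.

```shell
# Sketch only: write a trio pedigree in the six-column PED layout.
# Columns: family ID, individual ID, father ID, mother ID,
#          sex (1 = male, 2 = female), phenotype (1 = unaffected, 2 = affected).
# Usage: write_trio_ped PED_FILE FAMILY FATHER MOTHER CHILD CHILD_SEX
write_trio_ped() {
  ped_file=$1
  family=$2
  father=$3
  mother=$4
  child=$5
  child_sex=$6
  {
    # Parents are founders, so their own parent IDs are 0 (unknown).
    printf '%s\t%s\t0\t0\t1\t1\n' "$family" "$father"
    printf '%s\t%s\t0\t0\t2\t1\n' "$family" "$mother"
    # The child row points at both parents; phenotype 2 is an assumption.
    printf '%s\t%s\t%s\t%s\t%s\t2\n' "$family" "$child" "$father" "$mother" "$child_sex"
  } > "$ped_file"
}
```

For example, write_trio_ped trio.ped FAM1 father mother child 2 writes three rows describing a female child and her two parents.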
Running Somatic CNV Calling
Somatic CNV calling requires a tumor sample and a matched normal sample. The matched normal sample must first go through the germline small variant caller to produce a *.hard-filtered.vcf.gz file. If the sample sex is known, it is recommended that you specify it.
dragen -f \
-r /staging/human/reference/hg19/hg19.fa.k_21.f_16.m_149 \
--tumor-bam-input tumor.bam \
--intermediate-results-dir /staging/intermediate \
--output-directory /staging/examples/ \
--output-file-prefix somatic_cnv \
--enable-map-align false \
--enable-cnv true \
--cnv-normal-b-allele-vcf normal.hard-filtered.vcf.gz \
--sample-sex female
