DRAGEN Methylation Calling

Different methylation protocols require the generation of two or four alignments per input read, followed by an analysis to choose the best alignment and determine which cytosines are methylated. DRAGEN can automate this process and generate a single output BAM file with Bismark-compatible tags (XR, XG, and XM) that can be used for methylation calling and other downstream workflows.
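
As an illustration of how the XM tag can feed downstream methylation calling, the following minimal Python sketch tallies the calls in a single XM string. It assumes the encoding documented by Bismark (z/Z for CpG, x/X for CHG, h/H for CHH, u/U for unknown context, uppercase for methylated, lowercase for unmethylated, and a period for positions that are not cytosine calls). The function name and the example tag value are hypothetical and are not DRAGEN output.

```python
from collections import Counter

# Assumed Bismark XM encoding: letter selects the context,
# uppercase = methylated, lowercase = unmethylated.
XM_CONTEXTS = {"z": "CpG", "x": "CHG", "h": "CHH", "u": "unknown"}


def summarize_xm(xm: str) -> dict:
    """Count methylated and unmethylated cytosine calls per context."""
    counts = Counter()
    for ch in xm:
        context = XM_CONTEXTS.get(ch.lower())
        if context is None:
            # '.' (or any other character) is not a cytosine call.
            continue
        state = "methylated" if ch.isupper() else "unmethylated"
        counts[(context, state)] += 1
    return dict(counts)


# Hypothetical XM value for a 14-base read.
example_xm = "..Z..x.h...H.z"
summary = summarize_xm(example_xm)
for (context, state), n in sorted(summary.items()):
    print(f"{context}\t{state}\t{n}")
```

In practice the XM string would be read from each BAM record with a BAM library rather than typed in by hand.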

When the --methylation-protocol option is set to a valid value other than none, DRAGEN automatically produces the required set of alignment runs, each with the appropriate base conversions on the reads, base conversions on the reference, and constraints on whether reads must be forward-aligned or reverse complement (RC) aligned with the reference.
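
The base conversions mentioned above can be pictured with a short Python sketch. This is not DRAGEN code. It only shows what a full C->T or G->A conversion does to a sequence, and that a G->A conversion is equivalent to a C->T conversion applied to the opposite strand. All function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def convert_c_to_t(seq: str) -> str:
    """Replace every C with T (full in-silico C->T conversion)."""
    return seq.upper().replace("C", "T")


def convert_g_to_a(seq: str) -> str:
    """Replace every G with A (full in-silico G->A conversion)."""
    return seq.upper().replace("G", "A")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


read = "ACGTCG"  # hypothetical read
print(convert_c_to_t(read))
print(convert_g_to_a(read))

# G->A on one strand equals C->T on the opposite strand, read back
# in the original orientation.
same = reverse_complement(convert_c_to_t(reverse_complement(read))) == convert_g_to_a(read)
print(same)
```

Because both the read and the reference are converted before alignment, a cytosine that bisulfite treatment turned into a thymine no longer counts as a mismatch, whatever its methylation state.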

DRAGEN automatically configures the following options for these alignment runs:

--preserve-map-align-order true \
--generate-md-tags true \
--Aligner.global 1 \
--Aligner.no-unpaired 1 \
--Aligner.aln-min-score 0 \
--Aligner.min-score-coeff -0.2 \
--Aligner.match-score 0 \
--Aligner.mismatch-pen 4 \
--Aligner.gap-open-pen 6 \
--Aligner.gap-ext-pen 1 \
--Aligner.supp-aligns 0 \
--Aligner.sec-aligns 0
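
To give a feel for what these scoring values imply, the following Python sketch works through the arithmetic under one explicit assumption: that the minimum acceptable alignment score is linear in read length, computed as aln-min-score + min-score-coeff x read length (similar to the L,0,-0.2 score function that Bismark passes to Bowtie 2). This section does not state the formula, so treat the numbers as an illustration only. Gaps are left out because the gap-cost convention is not given here.

```python
import math
from fractions import Fraction

# Values taken from the automatically configured options above.
ALN_MIN_SCORE = 0
MIN_SCORE_COEFF = Fraction("-0.2")  # exact fraction avoids float rounding
MATCH_SCORE = 0
MISMATCH_PEN = 4


def min_alignment_score(read_length: int) -> Fraction:
    """ASSUMPTION: threshold = aln-min-score + min-score-coeff * read length."""
    return ALN_MIN_SCORE + MIN_SCORE_COEFF * read_length


def max_mismatches(read_length: int) -> int:
    """Largest mismatch count a gapless end-to-end alignment can carry.

    With match-score 0, a gapless alignment scores -MISMATCH_PEN per mismatch.
    """
    return math.floor(-min_alignment_score(read_length) / MISMATCH_PEN)


for length in (100, 150):
    print(length, float(min_alignment_score(length)), max_mismatches(length))
```

Under this assumption, a 100-base read could carry at most five mismatches in a gapless alignment, and a 150-base read at most seven.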

Because DRAGEN generates global alignments (end-to-end in the reads), it is recommended that you trim adapter sequences and any other artifacts introduced by library prep before running the alignment.

When --enable-methylation-calling is set to true, DRAGEN analyzes the multiple alignments to produce a single methylation-tagged BAM file. When --enable-methylation-calling is set to false, DRAGEN outputs a separate BAM file per alignment run.

The following table describes these alignment runs:

Protocol                                    BAM   Reference   Read 1   Read 2   Orientation Constraint
directional                                 1     C->T        C->T     G->A     Forward-only
                                            2     G->A        C->T     G->A     RC-only
nondirectional, or directional-complement   1     C->T        C->T     G->A     Forward-only
                                            2     G->A        C->T     G->A     RC-only
                                            3     C->T        G->A     C->T     RC-only
                                            4     G->A        G->A     C->T     Forward-only
PBAT                                        3     C->T        G->A     C->T     RC-only
                                            4     G->A        G->A     C->T     Forward-only

In directional protocols, the library is prepared so that only the bisulfite Watson (BSW) and bisulfite Crick (BSC) strands are sequenced. Alignment runs are therefore performed with the two combinations of base conversions and orientation constraints best suited to these strands (directional runs 1 and 2 in the table above).

With nondirectional protocols, reads from each of the four strands are equally likely, so alignment runs must be performed with two more combinations of base conversions and orientation constraints (nondirectional runs 3 and 4 above).

In PBAT protocols, the library is prepared so that only the BSWR and BSCR strands (the reverse complements of the BSW and BSC strands) are sequenced. Only two alignment runs are performed, with the combinations of base conversions and orientation constraints best suited to these strands (runs 3 and 4).

The directional-complement protocol can also be used for PBAT or similar libraries where mainly the BSWR and BSCR strands are sequenced. With this protocol, all four alignment runs are performed, but relatively few good alignments are expected from the runs for the BSW and BSC strands, so DRAGEN automatically switches to a faster analysis mode for those runs.
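
The relationship between protocols and alignment runs described above can be captured as a small lookup structure. The Python sketch below only restates the table. The class and function names are hypothetical, and the protocol keys are spelled as they appear in the table, which may differ from the exact values accepted by --methylation-protocol.

```python
from typing import List, NamedTuple


class AlignmentRun(NamedTuple):
    bam: int           # BAM / run number from the table
    reference: str     # base conversion applied to the reference
    read1: str         # base conversion applied to read 1
    read2: str         # base conversion applied to read 2
    orientation: str   # orientation constraint


# The four distinct runs listed in the table.
RUNS = {
    1: AlignmentRun(1, "C->T", "C->T", "G->A", "Forward-only"),
    2: AlignmentRun(2, "G->A", "C->T", "G->A", "RC-only"),
    3: AlignmentRun(3, "C->T", "G->A", "C->T", "RC-only"),
    4: AlignmentRun(4, "G->A", "G->A", "C->T", "Forward-only"),
}

# Which runs each protocol performs (keys as written in the table).
PROTOCOL_RUNS = {
    "directional": (1, 2),
    "nondirectional": (1, 2, 3, 4),
    "directional-complement": (1, 2, 3, 4),
    "PBAT": (3, 4),
}


def runs_for(protocol: str) -> List[AlignmentRun]:
    """Return the alignment runs performed for a protocol."""
    return [RUNS[i] for i in PROTOCOL_RUNS[protocol]]


for run in runs_for("PBAT"):
    print(run)
```

Note that runs 3 and 4 swap the read 1 and read 2 conversions and flip the orientation constraints relative to runs 1 and 2, which matches their role of covering the BSWR and BSCR strands.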

For any protocol, you must use a reference hash table that was produced with --ht-methylated true. For more information, see Pipeline Specific Hash Tables.

The following is an example DRAGEN command line for the directional protocol:

dragen --enable-methylation-calling true \
--methylation-protocol directional \
--ref-dir /staging/ref/mm10/methylation --RGID RG1 --RGCN CN1 \
--RGLB LIB1 --RGPL illumina --RGPU 1 --RGSM Samp1 \
--intermediate-results-dir /staging/tmp \
-1 /staging/reads/samp1_1.fastq.gz \
-2 /staging/reads/samp1_2.fastq.gz \
--output-directory /staging/outdir \
--output-file-prefix samp1_directional_prot
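
When the same command has to be run for several samples or protocols, it can be convenient to assemble it programmatically. The following Python sketch builds the argument list for the example above, using only options that appear in that example. The function name and its parameters are hypothetical, the paths are the placeholder paths from the example, and the exact protocol spellings accepted by --methylation-protocol should be checked against your DRAGEN version. The sketch prints the command and does not execute it.

```python
import shlex
from typing import List


def build_methylation_command(
    protocol: str,
    ref_dir: str,
    fastq1: str,
    fastq2: str,
    output_dir: str,
    prefix: str,
    sample: str,
    tmp_dir: str = "/staging/tmp",
) -> List[str]:
    """Assemble a DRAGEN methylation-calling command as an argument list."""
    return [
        "dragen",
        "--enable-methylation-calling", "true",
        "--methylation-protocol", protocol,
        "--ref-dir", ref_dir,
        # Read-group values copied from the example command.
        "--RGID", "RG1", "--RGCN", "CN1",
        "--RGLB", "LIB1", "--RGPL", "illumina",
        "--RGPU", "1", "--RGSM", sample,
        "--intermediate-results-dir", tmp_dir,
        "-1", fastq1,
        "-2", fastq2,
        "--output-directory", output_dir,
        "--output-file-prefix", prefix,
    ]


cmd = build_methylation_command(
    protocol="directional",
    ref_dir="/staging/ref/mm10/methylation",
    fastq1="/staging/reads/samp1_1.fastq.gz",
    fastq2="/staging/reads/samp1_2.fastq.gz",
    output_dir="/staging/outdir",
    prefix="samp1_directional_prot",
    sample="Samp1",
)
print(shlex.join(cmd))  # requires Python 3.8 or later
```

Keeping the command as a list rather than a single string means it can be passed directly to subprocess.run without shell quoting issues.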