DRAGEN Graph Mapper

To improve variant calling accuracy in segmental duplications and other regions that are difficult to map with Illumina reads, you can use the graph mapper in DRAGEN. The graph-based method applies alt-aware mapping to population haplotypes that are stitched into the reference with known alignments, which establishes alternate graph paths that reads can seed-map and align to. The graph mapper reduces mapping ambiguity because reads that contain population variants are attracted to the specific regions where those variants are observed. The graph mapper is supported only for the hg38 reference.

DRAGEN augments the FASTA reference with around 900,000 short alternate contigs derived from population haplotypes of phased variants, which turns the FASTA reference into a graph reference. The DRAGEN mapper’s alt-aware capabilities then project reads that match the population haplotypes onto the corresponding primary assembly alignments by using a precise lift-over alignment.
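The relationship between an alternate contig and its lift-over alignment can be illustrated with a toy sketch. It is not DRAGEN's actual construction, which is not described here. The function name build_alt_contig, the flank parameter, and the example sequence are all hypothetical; the sketch only shows how a haplotype of nearby phased variants yields a short contig plus a CIGAR string that places it on the primary assembly.

```python
def build_alt_contig(reference, variants, flank=3):
    """Toy illustration (hypothetical helper, not a DRAGEN API).

    reference: primary assembly sequence as a string.
    variants:  phased, non-overlapping (pos, ref, alt) tuples with 1-based
               VCF-style positions; indels share their first base(s).
    Returns (contig_sequence, start_position, cigar), where the CIGAR
    aligns the contig to the reference beginning at start_position.
    """
    variants = sorted(variants)
    first_pos = variants[0][0]
    last_pos, last_ref, _ = variants[-1]
    start = max(1, first_pos - flank)
    end = min(len(reference), last_pos + len(last_ref) - 1 + flank)

    seq, ops = [], []  # ops holds [length, operation] pairs

    def add(length, op):
        # Append a CIGAR operation, merging it with the previous one if equal.
        if length <= 0:
            return
        if ops and ops[-1][1] == op:
            ops[-1][0] += length
        else:
            ops.append([length, op])

    cursor = start  # next 1-based reference position not yet consumed
    for pos, ref, alt in variants:
        if pos < cursor:
            raise ValueError("variants overlap")
        if reference[pos - 1:pos - 1 + len(ref)] != ref:
            raise ValueError(f"REF mismatch at position {pos}")
        shared = min(len(ref), len(alt))
        if len(ref) != len(alt) and ref[:shared] != alt[:shared]:
            raise ValueError("indel must share its leading bases")
        # Unchanged reference bases before the variant.
        seq.append(reference[cursor - 1:pos - 1])
        add(pos - cursor, "M")
        # The variant itself: aligned bases, then inserted or deleted bases.
        seq.append(alt)
        add(shared, "M")
        add(len(alt) - shared, "I")
        add(len(ref) - shared, "D")
        cursor = pos + len(ref)

    # Unchanged reference bases after the last variant.
    seq.append(reference[cursor - 1:end])
    add(end - cursor + 1, "M")
    cigar = "".join(f"{n}{op}" for n, op in ops)
    return "".join(seq), start, cigar


# One SNP and one 2-base insertion phased on the same haplotype.
contig, start, cigar = build_alt_contig(
    "ACGTACGTACGT", [(5, "A", "G"), (8, "T", "TCC")]
)
print(contig, start, cigar)  # CGTGCGTCCACG 2 7M2I3M
```

A read that carries both variants matches the contig exactly, and the CIGAR carries that match back to primary assembly coordinates.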

When given a set of population variants (VCF) or haplotypes, the FASTA modifications fall into the following two types:

Alternate contigs represent population haplotypes. Alt-contigs can have a single variant or a combination of nearby phased variants.
Ambiguity codes (IUPAC codes) represent SNPs. To improve alignment, the reference FASTA is edited with isolated population SNPs.
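The IUPAC encoding can be sketched as follows. This is a minimal illustration, assuming bi-allelic SNPs with at most one SNP per position. The function name encode_snps and the example sequence are hypothetical and are not part of DRAGEN. The two-base ambiguity codes are the standard IUPAC nucleotide codes.

```python
# Standard IUPAC ambiguity codes for pairs of bases.
IUPAC = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def encode_snps(reference, snps):
    """Replace each isolated SNP position with its IUPAC ambiguity code.

    snps: iterable of (pos, ref_base, alt_base) with 1-based positions,
          as in a VCF file. Hypothetical helper for illustration only.
    """
    seq = list(reference)
    for pos, ref_base, alt_base in snps:
        idx = pos - 1
        if seq[idx].upper() != ref_base.upper():
            raise ValueError(f"REF mismatch at position {pos}")
        seq[idx] = IUPAC[frozenset((ref_base.upper(), alt_base.upper()))]
    return "".join(seq)


# C/T at position 2 becomes Y, and A/G at position 5 becomes R.
print(encode_snps("ACGTACGT", [(2, "C", "T"), (5, "A", "G")]))  # AYGTRCGT
```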

The DRAGEN graph mapper requires hash tables that are built with a FASTA containing the population alternate contigs, the corresponding lift-over SAM file, and an unphased SNP VCF file. The following example DRAGEN command line builds a graph-based hash table:

dragen --build-hash-table true \
--ht-build-rna-hashtable true --enable-cnv true \
--ht-reference hg38.fa \
--output-directory /tmp/ --ht-num-threads 40 \
--ht-alt-liftover /opt/edico/liftover/bwa-kit_hs38DH_liftover.sam \
--ht-pop-alt-contigs /opt/edico/liftover/pop_altContig.fa.gz \
--ht-pop-alt-liftover /opt/edico/liftover/pop_liftover.sam.gz \
--ht-pop-snps /opt/edico/liftover/pop_snps.vcf.gz