Gene Fusion Detection
The DRAGEN Gene Fusion module uses the DRAGEN RNA spliced aligner for detection of gene fusion events. It performs a split-read analysis on the supplementary (chimeric) alignments to detect potential breakpoints. The putative fusion events then go through various filtering stages to mitigate potential false positives. In addition to the final results, all potential candidates (unfiltered) are output, which can be used to maximize sensitivity.

The following is an example command line for running an end-to-end RNA-Seq experiment.
/opt/edico/bin/dragen \
-r <HASHTABLE> \
-1 <FASTQ1> \
-2 <FASTQ2> \
-a <GTF_FILE> \
--output-dir <OUT_DIRECTORY> \
--output-file-prefix <PREFIX> \
--RGID <READ_GROUP_ID> \
--RGSM <SAMPLE_NAME> \
--enable-rna true \
--enable-rna-gene-fusion true
At the end of a run, a summary of detected gene fusion events is output, which is similar to the following example.
==================================================================
Loading gene annotations file
==================================================================
Input annotations file: ref_annot.gtf
Number of genes: 27459
Number of transcripts: 196520
Number of exons: 1196293
==================================================================
Launching DRAGEN Gene Fusion Detection
==================================================================
annotation-file: ref_annot.gtf
rna-gf-blast-pairs: blast_pairs.outfmt6
rna-gf-exon-snap: 50
rna-gf-min-anchor: 25
rna-gf-min-neighbor-dist: 15
rna-gf-max-partners: 3
rna-gf-min-score-ratio: 0.15
rna-gf-min-support: 2
rna-gf-min-support-be: 10
rna-gf-restrict-genes true
==================================================================
Completed DRAGEN Gene Fusion Detection
==================================================================
Chimeric alignments: 107923
Total fusion candidates: 38 (2116 before filters)
Time loading annotations: 00:00:08.543
Time running gene fusion: 00:00:18.470
Total runtime: 00:00:27.760
***********************************************************
DRAGEN finished normally
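
When many samples are processed, the candidate counts in this summary can be collected programmatically. The following is a minimal Python sketch, assuming the summary has been captured as text in the format shown above. The function name `parse_fusion_summary` and the dictionary keys are hypothetical and are not part of DRAGEN.

```python
import re

# Matches a line such as "Total fusion candidates: 38 (2116 before filters)".
_CANDIDATES = re.compile(
    r"Total fusion candidates:\s*(\d+)\s*\((\d+) before filters\)"
)
# Matches a line such as "Chimeric alignments: 107923".
_CHIMERIC = re.compile(r"Chimeric alignments:\s*(\d+)")


def parse_fusion_summary(log_text):
    """Return counts from a gene fusion summary; missing fields are None."""
    candidates = _CANDIDATES.search(log_text)
    chimeric = _CHIMERIC.search(log_text)
    return {
        "chimeric_alignments": int(chimeric.group(1)) if chimeric else None,
        "candidates_after_filters": int(candidates.group(1)) if candidates else None,
        "candidates_before_filters": int(candidates.group(2)) if candidates else None,
    }
```

A large gap between the two candidate counts, as in the example summary, indicates how many candidates the filtering stages removed.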

The DRAGEN Gene Fusion module can be run as a standalone utility, taking the *.Chimeric.out.junction file as input and the gene annotations file as a GTF/GFF file. Running the Gene Fusion module standalone is useful for trying out various configuration options at the gene fusion detection stage, without having to map and align the RNA-Seq data multiple times.
To execute the DRAGEN Gene Fusion module as a standalone utility, use the --rna-gf-input-file option to specify the already generated *.Chimeric.out.junction file.
The following is an example command line for running the gene fusion module as a standalone utility.
/opt/edico/bin/dragen \
-a <GTF_FILE> \
--rna-gf-input-file <INPUT_CHIMERIC> \
--output-dir <OUT_DIRECTORY> \
--output-file-prefix <PREFIX> \
--enable-rna true \
--enable-rna-gene-fusion true
Standalone mode does not produce identical results to running from reads.
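
Because standalone mode is intended for trying out configuration options, it can be convenient to generate one command line per setting. The following is a minimal Python sketch that only assembles the commands; it uses the options shown in the standalone example above plus `--rna-gf-restrict-genes`. The function name `standalone_fusion_command` and all file and directory names are placeholders chosen for illustration.

```python
import shlex

DRAGEN = "/opt/edico/bin/dragen"  # path used in the example command lines


def standalone_fusion_command(gtf, chimeric, out_dir, prefix, extra_options=None):
    """Build the argument list for a standalone gene fusion run."""
    args = [
        DRAGEN,
        "-a", gtf,
        "--rna-gf-input-file", chimeric,
        "--output-dir", out_dir,
        "--output-file-prefix", prefix,
        "--enable-rna", "true",
        "--enable-rna-gene-fusion", "true",
    ]
    # Append any additional options, for example a fusion caller setting.
    for option, value in (extra_options or {}).items():
        args += [option, str(value)]
    return args


# Print one command per value of --rna-gf-restrict-genes (placeholder file names).
for restrict in ("true", "false"):
    cmd = standalone_fusion_command(
        "ref_annot.gtf",
        "sample.Chimeric.out.junction",
        "out_restrict_" + restrict,
        "sample",
        {"--rna-gf-restrict-genes": restrict},
    )
    print(shlex.join(cmd))
```

Writing each run to its own output directory keeps the result files from different settings separate for comparison.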

The <outputPrefix>fusion_candidates.features.csv file lists the detected gene fusion events. The output CSV file includes the following columns. Any additional columns describe additional features of the fusion candidates.
• #FusionGene: Parent gene names (in 5' to 3' order of transcript) participating in the fusion. If a fusion breakend overlaps multiple genes, all are listed.
• Score: Fusion call confidence score based on the number of supporting split reads and read pairs, as well as other fusion features. The score ranges from 0 (low confidence) to 1 (high confidence).
• LeftBreakpoint: Gene 1 breakpoint formatted as <Chromosome>:<Position>:<Strand>.
• RightBreakpoint: Gene 2 breakpoint formatted as <Chromosome>:<Position>:<Strand>.
• Filter: Semicolon-separated list of filters. Each filter is either a Confidence filter or an Information only filter. The Filter value is PASS if none of the Confidence filters are triggered. Otherwise, the value is FAIL.
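
The breakpoint columns can be split into their three components for downstream use. The following is a minimal Python sketch, assuming the <Chromosome>:<Position>:<Strand> layout described above. The `Breakpoint` type and the `parse_breakpoint` function are hypothetical helpers, not part of DRAGEN.

```python
from typing import NamedTuple


class Breakpoint(NamedTuple):
    chromosome: str
    position: int
    strand: str


def parse_breakpoint(field):
    """Split '<Chromosome>:<Position>:<Strand>' into its three parts."""
    # Split from the right so that a contig name containing ':' stays intact.
    chromosome, position, strand = field.rsplit(":", 2)
    return Breakpoint(chromosome, int(position), strand)
```

Splitting from the right is a defensive choice: some reference contig names contain colons, and only the last two fields are guaranteed to be the position and strand.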
The following are the available filters.
Filter | Type | Description
---|---|---
DOUBLE_BROKEN_EXON | Confidence | Both breakpoints are more than 50 bp from annotated exon boundaries, and the number of supporting reads does not satisfy the higher threshold requirement (≥ 10 supporting reads).
LOW_MAPQ | Confidence | All fusion supporting read alignments at either of the breakpoints have MAPQ < 20.
LOW_UNIQUE_ALIGNMENTS | Confidence | All fusion supporting read alignments map to a unique genomic interval at either of the breakpoints.
MIN_SCORE | Confidence | The fusion candidate has a low probabilistic score, as determined by the features of the candidate.
MIN_SUPPORT | Confidence | The fusion candidate has < 2 fusion supporting read pairs.
READ_THROUGH | Confidence | The breakpoints are cis neighbors (< 200,000 bp) on the reference genome.
ANCHOR_SUPPORT | Information only | Read alignments of fusion supporting reads are too short (anchor length threshold of 12 bp) at either of the two breakpoints.
HOMOLOGOUS | Information only | The candidate is likely a false candidate generated because the two genes involved have high gene homology.
LOW_ALT_TO_REF | Information only | The number of fusion supporting reads is < 1% of the number of reads supporting the reference transcript at either of the two breakpoints.
LOW_GENE_COVERAGE | Information only | Either of the two breakpoints has fewer than 125 bp with nonzero read coverage.
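
The distinction between Confidence and Information only filters can be applied when post-processing the candidates file. The following is a minimal Python sketch. It assumes that the Filter column holds semicolon-separated tokens that may include PASS or FAIL alongside filter names, and that the file delimiter is a comma; check both against an actual output file and pass a different `delimiter` if needed. The function names and the sample rows are invented for illustration.

```python
import csv
import io

# Filter names and types taken from the table above.
CONFIDENCE_FILTERS = {
    "DOUBLE_BROKEN_EXON", "LOW_MAPQ", "LOW_UNIQUE_ALIGNMENTS",
    "MIN_SCORE", "MIN_SUPPORT", "READ_THROUGH",
}
INFORMATION_ONLY_FILTERS = {
    "ANCHOR_SUPPORT", "HOMOLOGOUS", "LOW_ALT_TO_REF", "LOW_GENE_COVERAGE",
}


def split_filters(filter_field):
    """Return (confidence, information_only) filter names in a Filter value."""
    tokens = {t.strip() for t in filter_field.split(";") if t.strip()}
    return (
        sorted(tokens & CONFIDENCE_FILTERS),
        sorted(tokens & INFORMATION_ONLY_FILTERS),
    )


def passing_rows(handle, delimiter=","):
    """Yield rows whose Filter value triggers no Confidence filter."""
    for row in csv.DictReader(handle, delimiter=delimiter):
        tokens = {t.strip() for t in row["Filter"].split(";")}
        if "FAIL" in tokens or tokens & CONFIDENCE_FILTERS:
            continue
        yield row


# Invented rows for illustration only; not real DRAGEN output.
example = io.StringIO(
    "#FusionGene,Score,LeftBreakpoint,RightBreakpoint,Filter\n"
    "GENE_A--GENE_B,0.95,chr1:1000:+,chr2:2000:-,PASS\n"
    "GENE_C--GENE_D,0.10,chr3:3000:+,chr3:3500:+,FAIL;MIN_SCORE;HOMOLOGOUS\n"
)
kept = [row["#FusionGene"] for row in passing_rows(example)]
```

Information only filters are deliberately ignored when deciding whether a row passes, which mirrors the rule that only Confidence filters change the Filter value from PASS to FAIL.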

The following options can be used to configure the fusion caller:
• --rna-gf-blast-pairs
A file listing gene pairs that have a high level of similarity. This list of gene pairs is used as a homology filter to reduce false positives. One method to generate this file is to follow the instructions described on the Fusion Filter Wiki. Use the ref_annot.cdsplus.fa.allvsall.outfmt6.genesym.gz file produced by CTAT. For human genome runs, a default file is included and used automatically if no other file is specified.
• --rna-gf-enriched-genes
For RNA enrichment assays, a list of targeted genes specified as one gene name per line. Only fusion calls involving at least one gene on the list are reported.
• --rna-gf-restrict-genes
When the gene annotations file (GTF/GFF) is parsed for use in the DRAGEN Gene Fusion module, this option restricts the entries of interest to only protein-coding regions. Restricting the GTF to only the protein-coding and lincRNA genes reduces false positive rates in currently studied fusion events. The default value is true.
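
The reporting rule for --rna-gf-enriched-genes described above (one gene name per line, and a call is kept if at least one participating gene is on the list) can be sketched as follows. This is an illustration of the rule only, not DRAGEN's implementation; the function names are hypothetical, and exact, case-sensitive name matching is an assumption.

```python
def load_gene_list(lines):
    """Read one gene name per line, skipping blank lines."""
    return {line.strip() for line in lines if line.strip()}


def is_reported(fusion_genes, enriched_genes):
    """True if at least one gene participating in the fusion is on the list.

    Assumes exact, case-sensitive matching of gene names.
    """
    return any(gene in enriched_genes for gene in fusion_genes)


# Example: a two-gene enrichment list and two candidate fusions.
enriched = load_gene_list(["ALK\n", "\n", "ROS1\n"])
kept = is_reported(["EML4", "ALK"], enriched)      # ALK is on the list
dropped = is_reported(["BCR", "ABL1"], enriched)   # neither gene is on the list
```

When preparing the list file, the gene names should match the names used in the gene annotations file, since those are the names that appear in the #FusionGene column.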