Set Analysis Parameters

1. Open DRAGEN Germline from BaseSpace™ Sequence Hub as follows.
  1. Select the Apps tab, and then select DRAGEN Germline.
  2. From the Version drop-down list, select 3.5.7.
  3. Select Launch Application.
2. To override the default analysis name, enter a preferred analysis name in the Analysis Name field.

The default is the app name with the date and time the session started.

3. From the Save Results To field, select Select Project, and then select the project in which to store the app results.
4. Specify a sample by selecting the option that matches the input file type. Multiple samples of the same type can be selected in a single row.
5. [Optional] From the Sample Sex drop-down list, select the sex of the sample.
6. [Optional] Select Add a New Row to add more samples to the run.
7. Set the analysis pipeline configuration.
Map/Align–Samples are mapped and aligned to the reference genome and position-sorted.
Map/Align + Small Variant Caller–In addition to the Map/Align processes, variant calling is performed.
Small Variant Caller–Only variant calling is performed. This configuration accepts BAM or CRAM input files.
8. From the Reference drop-down list, select a reference genome.
9. If you selected Custom from the Reference drop-down list, select the custom DRAGEN reference file.

The custom reference file must be generated by the DRAGEN Reference Builder app.

10. From the Map/Align Output drop-down list, specify whether to output a BAM, CRAM, or no alignment file at all.
11. From the Small Variant Caller Output drop-down list, specify the gVCF output type:
VCF and GVCF–Variants are recorded individually and nonvariants are grouped into blocks.
VCF and GVCF with BP_RESOLUTION–Variants and nonvariants are recorded individually. This option increases run time and creates large gVCF files, for example, 2 hours and a 20 GB gVCF for a 30x sample.
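
The difference between the two gVCF output types can be illustrated with a minimal sketch. It is a conceptual illustration only, not DRAGEN's implementation: the `calls` data and the `to_records` function are hypothetical, positions are assumed to be sorted and contiguous, and nonvariants are grouped by adjacency alone, ignoring any other criteria the caller may apply.

```python
from itertools import groupby

# Hypothetical per-position calls: (position, is_variant).
calls = [(100, False), (101, False), (102, True),
         (103, False), (104, False), (105, False)]

def to_records(calls, bp_resolution=False):
    """Return (start, end, kind) records for sorted, contiguous positions.

    Variants are always emitted individually. Nonvariant positions are
    emitted one per position when bp_resolution is True; otherwise each
    run of consecutive nonvariant positions is merged into one block.
    """
    records = []
    for is_variant, run in groupby(calls, key=lambda c: c[1]):
        positions = [pos for pos, _ in run]
        if is_variant or bp_resolution:
            kind = "variant" if is_variant else "ref"
            records.extend((p, p, kind) for p in positions)
        else:
            records.append((positions[0], positions[-1], "ref_block"))
    return records

print(len(to_records(calls)))                      # blocked output
print(len(to_records(calls, bp_resolution=True)))  # per-position output
```

With blocking, the six positions collapse into three records; with per-position output, every position gets its own record, which is why file size grows with the BP_RESOLUTION option.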
12. [Optional] Select a target BED file to restrict processing of the small variant caller and target BED-related coverage and callability metrics to regions specified in this file.

The contig names must match those of the chosen reference. If a mismatch is detected, analysis will abort.
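
Because a contig name mismatch aborts the analysis, it can be worth checking a BED file before upload. The following is a minimal sketch, assuming you already have the list of contig names for the chosen reference (for example, from its sequence index). The function names `bed_contigs` and `unknown_contigs` are hypothetical and are not part of the app.

```python
def bed_contigs(bed_lines):
    """Collect contig names (first column) from BED lines.

    Blank lines and comment, track, or browser header lines are skipped.
    """
    names = set()
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names

def unknown_contigs(bed_lines, reference_contigs):
    """Return BED contig names that are absent from the reference, sorted."""
    return sorted(bed_contigs(bed_lines) - set(reference_contigs))

# Hypothetical example: the second region uses "2" instead of "chr2".
reference = ["chr1", "chr2"]
bed = ["chr1\t100\t200", "2\t50\t80"]
print(unknown_contigs(bed, reference))
```

An empty result means that every contig named in the BED file exists in the reference list you supplied.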

13. Define regions over which to produce coverage metrics as follows.
  1. Select the BED file(s) that contain the regions for which you want to produce coverage metrics.
  2. Enter a MAPQ filter value. Any read with a MAPQ value less than this threshold will be filtered out. Default value is 1.
  3. Enter a BQ filter value. Any base call with a quality score less than this threshold will be filtered out. Default value is 0.
  4. Enter a filename tag. The tag must contain letters, numbers, and underscores only. The tags "wgs" and "target_bed" are reserved and cannot be used.
  5. Set the Full-Res option. This option is disabled by default. Enabling it generates large files.
  6. [Optional] Select Add a New Row to add up to 5 BED files to produce coverage metrics.
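
The filename tag rule and the filter thresholds described above can be sketched as follows. This is an illustration under stated assumptions, not the app's own validation: the names `is_valid_tag` and `passes_filter` are hypothetical, and the reserved-tag comparison is assumed to be case-sensitive.

```python
import re

# Reserved tags named in the step above (case-sensitive match is an assumption).
RESERVED_TAGS = {"wgs", "target_bed"}
TAG_PATTERN = re.compile(r"[A-Za-z0-9_]+")

def is_valid_tag(tag):
    """True if the tag has only letters, numbers, and underscores and is not reserved."""
    return TAG_PATTERN.fullmatch(tag) is not None and tag not in RESERVED_TAGS

def passes_filter(value, threshold):
    """A MAPQ or BQ value below the threshold is filtered out; others are kept."""
    return value >= threshold

print(is_valid_tag("exome_v2"))   # allowed characters, not reserved
print(is_valid_tag("wgs"))        # reserved tag
print(passes_filter(0, 1))        # MAPQ 0 against the default MAPQ threshold of 1
print(passes_filter(0, 0))        # BQ 0 against the default BQ threshold of 0
```

With the default thresholds, only reads with a MAPQ of 0 are removed, and no base calls are removed on quality.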
14. Specify additional run settings by expanding the function headings and selecting the appropriate checkboxes.

Heading

Description

CNV

Enables germline CNV calling. If enabled, set the following options:

Reference Calls–Includes copy neutral (REF) calls in the output CNV VCF.
Segmentation Algorithm
Circular Binary Segmentation–Iteratively identifies change points in a genomic sequence using a nonparametric hypothesis testing approach. Recommended for whole exome processing.
Shifting Level Model–Models genomic data as the sum of two independent stochastic processes and segments using a subclass of Hidden Markov Model. Recommended for whole genome processing.

SV (Manta)

Enables SV (Manta) analysis. If enabled, set the following options:

Depth Filters–For high-coverage input, use this option to turn off depth filters.
SV BED File–[Optional] Select the BED file that is passed into the SV analysis.

Expansion Hunter

Enables calling of repeat-expansion variants.

UMI Settings

Enables UMI-based read processing when the run is configured for the Map/Align pipeline configuration. This setting is disabled by default and is provided for experimental purposes only. If enabled, set the following options:

UMI Min Supporting Reads–Sets the minimum number of supporting reads required for a family. Default value is 2.
UMI Correction Scheme
Lookup–A correction lookup table is used to correct observed UMI sequences.
Random–UMI sequences are corrected based on all UMI sequences and corresponding read counts at each fragment mapping location.
UMI Error Correction Table–Select the lookup table for UMI correction. If a lookup table is not specified, the Illumina non-random UMIs from the TruSight Oncology 170 RUO, TruSight Oncology 500 RUO, and IDT for Illumina - UMI DNA Index Anchors kits are used.
UMI Duplex Collapsing–Enables UMI duplex collapsing. Disabled by default.
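
The Lookup correction scheme and the minimum supporting reads setting can be pictured with a small conceptual sketch. It does not reproduce DRAGEN's UMI processing: the function names, the example sequences, and the choice to keep UMIs that are missing from the table unchanged are all assumptions, and the reads are assumed to share a single fragment mapping location.

```python
from collections import Counter

def correct_umis(observed, lookup):
    """Map each observed UMI through a correction lookup table.

    UMIs missing from the table are kept unchanged (an assumption).
    """
    return [lookup.get(umi, umi) for umi in observed]

def supported_families(umis, min_supporting_reads=2):
    """Keep UMI families with at least min_supporting_reads reads."""
    counts = Counter(umis)
    return {umi: n for umi, n in counts.items() if n >= min_supporting_reads}

# Hypothetical reads at one mapping location; "ACGA" is a miscalled "ACGT".
observed = ["ACGT", "ACGA", "ACGT", "TTTT"]
lookup = {"ACGA": "ACGT"}

corrected = correct_umis(observed, lookup)
print(supported_families(corrected))
```

After correction, the three ACGT reads form one family that meets the default minimum of 2, while the single TTTT read does not.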

Advanced Settings

Duplicate Marking–Enabled by default.
BQD–Enables base quality drop-off detection. Enabled by default.
Optional Metrics–Enables the collection of GC-related metrics. Disabled by default.
Nirvana Annotation–Outputs a Nirvana-generated JSON file with annotations for variants in all output DRAGEN VCFs. Disabled by default.
MD5 Checksums–Calculates MD5 checksums for all output files. Disabled by default.
dbSNP VCF–Select the variant annotation database *.vcf or *.vcf.gz file.
ForceGT VCF–Select the *.vcf or *.vcf.gz file containing the list of variants to force genotype.
Concordance VCF–Select the concordance *.vcf or *.vcf.gz file.

Automation Settings

Enables automation settings. Specify a sample by selecting the option that matches the input file type and the sex.

15. Select Launch Application to start the analysis.

When the analysis is complete, the status of the app session is automatically updated and you receive a confirmation email.