Set Analysis Parameters

1. Open DRAGEN Enrichment from BaseSpace Sequence Hub as follows.
  1. Select the Apps tab, and then select DRAGEN Enrichment.
  2. From the Version drop-down list, select 3.4.5.
  3. Select Launch Application.
2. To override the default analysis name, enter a preferred analysis name in the Analysis Name field.

The default is the app name with the date and time the session started.

3. From the Save Results To field, select Select Project, and then select a project to store app results in.
4. Specify a sample by selecting the option that matches the input file type. Multiple samples can be selected in a single row.
5. [Optional] From the Sample Sex drop-down list, select the sex of the sample.
6. For Small Variant Caller, select from the following options:
Germline—Used for most nontumor sample types. Outputs both VCF and gVCF files. Selected by default.
Somatic—Used for tumor samples. Outputs only VCF files.
7. For Annotation, select from the following options.
RefSeq—Variants are annotated using RefSeq transcripts.
Ensembl—Variants are annotated using Ensembl transcripts.
None—Variants are not annotated. Selected by default.

Variant annotation is supported for human genomes only, and not supported for custom genomes.

8. From the Reference drop-down list, select a reference genome.
9. If you selected Custom from the Reference drop-down list, select the custom DRAGEN and FASTA reference files.
Custom DRAGEN Reference File–The custom reference file must have been generated by the DRAGEN Reference Builder app.
Custom FASTA Reference File–When using a custom reference, a custom FASTA file must be specified. When using a built-in reference (eg, hg19, hg38, GRCh37, hs37d5), no FASTA file needs to be specified because the correct FASTA is used automatically.
10. From the Targeted Regions drop-down list, select the targeted regions (BED) for your enrichment panel.
11. If you selected Custom BED from Targeted Regions, select the target BED file by selecting Select Dataset File(s) from the Custom Target BED File field.

The contig names in the BED file must match those of the chosen reference. If a mismatch is detected, analysis is aborted. This file is applicable only to the variant calling stage.
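Because a contig mismatch aborts the analysis, it can be worth checking a custom BED file before uploading it. The following is a minimal sketch of such a check, not part of the app: it assumes you have a FASTA index (.fai) for the chosen reference, whose first tab-separated column lists the contig names. The function names are hypothetical.

```python
def fasta_index_contigs(fai_lines):
    """Return contig names from .fai lines (first tab-separated column)."""
    return {line.split("\t")[0] for line in fai_lines if line.strip()}


def bed_contigs(bed_lines):
    """Return contig names used in BED lines, skipping blank and header lines."""
    names = set()
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def unknown_contigs(bed_lines, fai_lines):
    """Return the BED contig names that are absent from the reference, sorted."""
    return sorted(bed_contigs(bed_lines) - fasta_index_contigs(fai_lines))


if __name__ == "__main__":
    # Illustrative data only: the second BED row uses "2" instead of "chr2".
    fai = ["chr1\t248956422\t112\t70\t71", "chr2\t242193529\t252513167\t70\t71"]
    bed = ["chr1\t100\t200", "2\t300\t400"]
    print(unknown_contigs(bed, fai))
```

An empty result means every contig in the BED file also appears in the reference index. A typical mismatch is a BED file that uses names such as 1 and 2 against a reference that uses chr1 and chr2.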

12. For Base Padding (for Enrichment Metrics), enter an integer value 0–10,000 to define the base padding to add to each targeted region. The default value is 150.
13. Specify additional run settings by expanding the function headings and selecting the appropriate checkboxes.

Heading

Setting Description

Picard HsMetrics

Enables Picard HsMetrics generation. When enabled, select the probe BED type. If Use Custom Probe BED is chosen, select the custom probe BED file to use for the analysis.

CNV

Enables CNV analysis. If enabled, configure the following settings.

Segmentation Algorithm
Circular Binary Segmentation–Iteratively identifies change points in a genomic sequence using a nonparametric hypothesis testing approach.
Shifting Levels Model–Models genomic data as the sum of two independent stochastic processes and segments using a subclass of Hidden Markov Model.
CNV caller quality filter threshold–Specifies the QUAL value at which variants are filtered in the CNV VCF. The range of this setting is 0–50.0. The default value is 50.0.
CNV Baseline Files–Select the CNV baseline files. Up to 10 files can be selected and they all must be of the same type (eg, either all *.target.counts or all *.target.counts.gc-corrected). CNV baselines can be created using the DRAGEN CNV Baseline Builder app.
GC Bias Correction–Enabled by default.
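The CNV Baseline Files constraints (at most 10 files, all of one type) can be checked before launching. The following is a minimal sketch under the assumption that the file type can be told from the file name suffix given above. The function and constant names are hypothetical and are not part of the app.

```python
# Suffixes taken from the setting description; the longer one is listed first.
BASELINE_SUFFIXES = (".target.counts.gc-corrected", ".target.counts")
MAX_BASELINES = 10


def baseline_type(filename):
    """Return the matching baseline suffix, or None if the name is unrecognized."""
    for suffix in BASELINE_SUFFIXES:
        if filename.endswith(suffix):
            return suffix
    return None


def check_baselines(filenames):
    """Return a list of problems; an empty list means the selection is consistent."""
    problems = []
    if len(filenames) > MAX_BASELINES:
        problems.append(
            f"{len(filenames)} files selected; at most {MAX_BASELINES} are allowed"
        )
    types = {name: baseline_type(name) for name in filenames}
    unrecognized = sorted(name for name, kind in types.items() if kind is None)
    if unrecognized:
        problems.append("unrecognized file type: " + ", ".join(unrecognized))
    kinds = {kind for kind in types.values() if kind is not None}
    if len(kinds) > 1:
        problems.append("mixed baseline types; select files of one type only")
    return problems
```

For example, passing one *.target.counts file together with one *.target.counts.gc-corrected file reports the mixed-type problem, while a list of files that all share one suffix returns an empty list.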

SV (Manta)

Enables SV (Manta) analysis.

Advanced Settings

Pipeline Configuration
Map/Align + Variant Caller–Samples are mapped, aligned to the reference genome, and position-sorted. Variant calling is also performed.
Variant Caller Only–Only variant calling is performed. This configuration only accepts BAM input files.
Variant Caller Output
GVCF–Variants are recorded individually and nonvariants are grouped into blocks.
GVCF with BP_RESOLUTION–Variants and nonvariants are recorded individually. This option is typically used for debugging. It increases run time and produces large gVCF files.

When Small Variant Caller is set to Somatic, the output format is VCF.

Somatic variant caller threshold (percentage)–Enter a value 0–30 to specify the threshold percentage above which somatic variants are called. The default is 1.
Decreasing this value increases variant caller sensitivity but raises the risk of false positives.
Somatic variant filter threshold (percentage)–Enter a value 0–30 to specify the threshold percentage above which somatic variants are filtered. The default is 5.
Base Padding (for Variant Calling)–Enter a nonnegative integer to define the base padding to add to each target BED region. Used to pad targeted regions for variant calling but does not affect most enrichment metrics.
Duplicate Marking–Enabled by default.
ForceGT VCF–Select a *.vcf or *.vcf.gz file of small variants to force genotype.
QC Coverage Metrics–Select the BED file that contains the region(s) over which to generate metrics. Reads with a MAPQ value less than 1 are filtered out.
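To illustrate what the Base Padding settings mean for a target region, the following minimal sketch extends a BED interval by a fixed number of bases on each side. It is an illustration only, not the app's implementation: it assumes standard BED coordinates (0-based start, exclusive end), and the function name and the optional contig_length parameter are hypothetical. Whether the app merges padded regions that overlap is not stated here, so the sketch does not merge them.

```python
def pad_region(start, end, padding, contig_length=None):
    """Extend a BED interval by `padding` bases on both sides.

    The start is clamped at 0. The end is clamped at contig_length when
    a contig length is supplied.
    """
    if padding < 0:
        raise ValueError("padding must be a nonnegative integer")
    padded_start = max(0, start - padding)
    padded_end = end + padding
    if contig_length is not None:
        padded_end = min(padded_end, contig_length)
    return padded_start, padded_end


if __name__ == "__main__":
    # A 200-base target padded with the default enrichment-metrics value of 150.
    print(pad_region(1000, 1200, 150))
```

With a padding of 150, the region 1000 to 1200 becomes 850 to 1350, so a 200-base target covers 500 bases after padding.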

Automation Settings

Enables automation settings. Specify a sample by selecting the option that matches the input file type and the sex. Use these settings only when launching the app from a biosample workflow or the BaseSpace CLI.

14. Select Launch Application to start the analysis.

When the analysis is complete, the status of the app session is automatically updated and you receive a confirmation email.