Set Analysis Parameters

1. Open DNA + RNA Amplicon from BaseSpace™ Sequence Hub as follows.

Select the Apps tab, and then select DNA + RNA Amplicon.
From the Version drop-down list, select 1.0.0.
Select Launch Application.

2. To override the default analysis name, enter a preferred analysis name in the Analysis Name field.

The default is the app name with the date and time the session started.

3. From the Save Results To field, select Select Project, and then select the project in which to store the app results.

4. From the Biosamples to Analyze field, select the biosamples you want to analyze as follows.

Select Select Biosample(s).
Drag and drop the DNA and RNA biosamples you want to analyze into the appropriate column of the Create Pairs window. DNA and RNA biosamples listed on the same row are assumed to be taken from the same biological material and their output files are grouped together in the aggregation step.
Select Confirm.

The labels of this field and button in the software interface might be different from the labels in this guide, depending on the BaseSpace™ Sequence Hub mode. In New mode, all data that is associated with the selected biosample is used in the analysis.

5. From the Targeted Amplicons drop-down list, select the amplicon panel that was used to generate the libraries.

6. If you selected Custom Panels from Targeted Amplicons, select the manifest file(s) for your custom panel(s) by selecting Select Dataset File(s) under the appropriate field(s). A DNA manifest is required if any DNA biosamples were selected, and an RNA manifest is required if any RNA biosamples were selected. If the manifest file you want to use is not listed, import the file as follows.

Open a new browser window and visit the BaseSpace™ Sequence Hub home page.
Select Projects from the My Data tab.
Select the project you want to upload the manifest file to.
Select File | Upload | Files | Manifest.
Follow the onscreen instructions to add the Custom Amplicon manifest file (*.txt) to the project. Make sure that the reference genome is specified in the header of the manifest file. See Reference Genomes.
Return to the DNA + RNA Amplicon app and select the newly added manifest file.
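
The manifest requirement for custom panels reduces to a simple rule, sketched below in Python. This is an illustration only: the function names and the "DNA"/"RNA" type labels are assumptions made for the example and are not part of the app.

```python
def required_manifests(biosample_types):
    """Return the set of manifest types needed for the selected biosamples.

    biosample_types: iterable of "DNA" or "RNA" labels (hypothetical encoding).
    """
    selected = {t.upper() for t in biosample_types}
    unknown = selected - {"DNA", "RNA"}
    if unknown:
        raise ValueError(f"Unexpected biosample type(s): {sorted(unknown)}")
    # A DNA manifest is needed if any DNA biosample is selected,
    # and an RNA manifest if any RNA biosample is selected.
    return selected


def missing_manifests(biosample_types, provided_manifests):
    """List manifest types that are required but have not been provided."""
    return sorted(required_manifests(biosample_types) - set(provided_manifests))


print(missing_manifests(["DNA", "RNA", "DNA"], {"DNA"}))  # prints ['RNA']
```

An empty list means every selected biosample type has a matching manifest.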

7. For Variant Annotation Source, select from the following options.

• RefSeq: Variants are annotated using RefSeq transcripts.
• Ensembl: Variants are annotated using Ensembl transcripts.
• None: Variants are not annotated.

Variant annotation is supported for human genomes only. It is not supported for custom genomes.

8. To configure the DNA settings for the analysis, expand DNA Settings.

9. For Variant Caller, select from the following options:

• Germline: Used for nontumor sample types.
• Somatic: Used for tumor samples.

10. If you selected the Somatic Variant Caller, enter a value from 0.05 to 30 to define the Somatic Variant Frequency Threshold (Percentage).

The default value is 5. The LowVariantFreq filter is applied to variants with frequencies below the specified threshold. Lower threshold values can result in more false positive variants.

11. Enter a value from 10 to 10,000 to define the Variant Caller Depth Filter level.

The default value is 10. Variants covered by a number of reads less than this value are marked as filtered. Lower filter values can result in more false positive variants passing filter.
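
To make the effect of these two thresholds concrete, the following Python sketch validates the ranges given above and labels a single variant. It is not the app's implementation. The frequency calculation (alternate reads divided by total reads), the `LowDepth` label, and the `PASS` label are assumptions for illustration; only `LowVariantFreq` is named in this guide, and the frequency threshold applies only when the Somatic variant caller is selected.

```python
def check_dna_settings(somatic_freq_threshold_pct=5.0, depth_filter=10):
    """Validate the two thresholds against the ranges given in this guide."""
    if not 0.05 <= somatic_freq_threshold_pct <= 30:
        raise ValueError("Somatic Variant Frequency Threshold must be 0.05-30 (percent)")
    if not 10 <= depth_filter <= 10_000:
        raise ValueError("Variant Caller Depth Filter must be 10-10,000")
    return somatic_freq_threshold_pct, depth_filter


def filters_for_variant(alt_reads, total_reads, freq_threshold_pct=5.0, depth_filter=10):
    """Return the filter labels a variant would receive under these settings."""
    labels = []
    if total_reads < depth_filter:
        labels.append("LowDepth")  # hypothetical label; the guide does not name it
    # Assumed definition of variant frequency, expressed as a percentage.
    freq_pct = 100.0 * alt_reads / total_reads if total_reads else 0.0
    if freq_pct < freq_threshold_pct:
        labels.append("LowVariantFreq")
    return labels or ["PASS"]


print(filters_for_variant(3, 100))   # prints ['LowVariantFreq']
print(filters_for_variant(30, 200))  # prints ['PASS']
```

Lowering either threshold lets more low-evidence variants pass, which is why both settings warn about false positives.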

12. [Optional] In the Genotypes of Interest VCF field, select Select Dataset File(s).

13. Select at least one VCF file containing variants of interest.

The input VCF file must include the eight mandatory VCF columns plus the FORMAT and SAMPLE columns. Only the CHROM, POS, REF, and ALT columns must contain data. Every other column must contain a period (.) when it has no value; the app ignores those columns.

The following example shows a genotypes of interest file (*.vcf or *.vcf.gz) with the minimum required information.

##fileformat=VCFv4.1
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	SAMPLE
chr2	177016728	.	T	C	.	.	.	.	.
chr15	38545390	.	A	C	.	.	.	.	.
chr15	38545390	.	A	G	.	.	.	.	.
chr22	30090721	.	T	.	.	.	.	.	.
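
A file in this minimal layout can be generated with a short script. The Python sketch below is one possible way to do it; the function name and the tuple input format are assumptions for illustration, and the script does not check alleles against a reference genome.

```python
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT",
               "QUAL", "FILTER", "INFO", "FORMAT", "SAMPLE"]


def genotypes_of_interest_vcf(variants):
    """Build VCF text from (chrom, pos, ref, alt) tuples.

    Columns that carry no data are filled with a period (.).
    """
    lines = ["##fileformat=VCFv4.1", "#" + "\t".join(VCF_COLUMNS)]
    for chrom, pos, ref, alt in variants:
        if int(pos) < 1:
            raise ValueError(f"POS must be a positive integer, got {pos!r}")
        row = {"CHROM": chrom, "POS": str(pos), "REF": ref, "ALT": alt}
        # Tab-separated fields, in the column order of the header line.
        lines.append("\t".join(row.get(col, ".") for col in VCF_COLUMNS))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = genotypes_of_interest_vcf([
        ("chr2", 177016728, "T", "C"),
        ("chr15", 38545390, "A", "C"),
    ])
    print(text, end="")
```

Write the returned text to a file with a .vcf extension, and then upload it to the project before selecting it in the app.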

14. Set the Sample Identification Analysis option.

When enabled, the app generates sample fingerprints based on SNPs in the Sample ID panel spike-in to detect sample swaps.

15. For Indel Realignment, select from the following options:

• On: Gemini performs indel realignment, which might improve indel detection. However, overall accuracy can vary between panels, and total analysis time can increase. This option is the default setting.
• Off: Indel realignment is not performed.

16. To configure the RNA settings for the analysis, expand RNA Settings.

17. If you are using the AmpliSeq ERCC RNA Spike-In Mix for Illumina or the AmpliSeq ERCC RNA Companion Panel for Illumina, set the ERCC Analysis option.

When enabled, ERCC targets are merged into the input manifest for analysis. The gene name and amplicon sequence are checked to make sure that each ERCC target appears only once in the merged manifest.
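
The merge-and-deduplicate behavior described above can be pictured with the following Python sketch. It is not the app's code: the dictionary layout with `gene` and `sequence` keys is a hypothetical stand-in for manifest rows, and treating a match on either the gene name or the amplicon sequence as a duplicate is an assumption.

```python
def merge_ercc_targets(manifest_targets, ercc_targets):
    """Append ERCC targets to a manifest, skipping any that are already present.

    Each target is a dict with "gene" and "sequence" keys (hypothetical layout).
    A target counts as a duplicate if its gene name or its amplicon sequence
    already appears in the merged list.
    """
    merged = list(manifest_targets)
    seen_genes = {t["gene"] for t in merged}
    seen_sequences = {t["sequence"].upper() for t in merged}
    for target in ercc_targets:
        gene, sequence = target["gene"], target["sequence"].upper()
        if gene in seen_genes or sequence in seen_sequences:
            continue  # already in the manifest; keep a single copy
        merged.append(target)
        seen_genes.add(gene)
        seen_sequences.add(sequence)
    return merged


manifest = [{"gene": "TP53", "sequence": "ACGT"},
            {"gene": "ERCC-00002", "sequence": "GGCC"}]
ercc = [{"gene": "ERCC-00002", "sequence": "GGCC"},
        {"gene": "ERCC-00003", "sequence": "TTAA"}]
print([t["gene"] for t in merge_ercc_targets(manifest, ercc)])
# prints ['TP53', 'ERCC-00002', 'ERCC-00003']
```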

18. To enable gene-based normalization, select the Enable Gene-Based Normalization checkbox and enter a list of gene names, separated by semicolons, in the Gene Name(s) field.

By default, DESeq2 uses all amplicon targets to perform library-size normalization. When this option is enabled, only the amplicons that target the provided genes are used for normalization.
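
The following Python sketch illustrates the idea behind restricting normalization to selected genes. It uses the median-of-ratios approach that DESeq2 applies by default for size factors, reimplemented here in plain Python for illustration; it does not call DESeq2 and is not how the app is implemented. The function names, the input layout, and the handling of zero counts are assumptions.

```python
import math
from statistics import median


def parse_gene_names(field_value):
    """Split the semicolon-separated Gene Name(s) field into a set of names."""
    return {name.strip() for name in field_value.split(";") if name.strip()}


def size_factors(counts, target_genes, normalization_genes=None):
    """Median-of-ratios size factors, one per sample.

    counts: dict sample -> list of read counts, one per amplicon target.
    target_genes: gene name for each amplicon target, in the same order.
    normalization_genes: if given, only targets for these genes are used.
    """
    samples = list(counts)
    used = [i for i, gene in enumerate(target_genes)
            if normalization_genes is None or gene in normalization_genes]
    # Keep only targets with nonzero counts in every sample, so the
    # geometric mean (computed in log space) is defined.
    used = [i for i in used if all(counts[s][i] > 0 for s in samples)]
    if not used:
        raise ValueError("No usable amplicon targets for normalization")
    log_geo_mean = {i: sum(math.log(counts[s][i]) for s in samples) / len(samples)
                    for i in used}
    return {s: math.exp(median(math.log(counts[s][i]) - log_geo_mean[i] for i in used))
            for s in samples}


genes = parse_gene_names("GAPDH; ACTB")
factors = size_factors({"S1": [100, 200, 50], "S2": [200, 400, 500]},
                       ["GAPDH", "ACTB", "MYC"], genes)
print(factors)
```

In this example, the MYC amplicon is excluded from the calculation, so a large expression change in that gene does not influence the size factors.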

19. To configure Pindel settings for the analysis, expand Pindel Settings.

20. To detect and report structural variants using Pindel, select the Enable analysis of SVs checkbox.

21. Enter a positive integer value to define the Variant Depth Filter level.

The default value is 20. Variants with a depth below the specified value are marked as filtered. Decreasing this value raises the risk of false positives.

22. Enter a value from 0 to 1 to define the Variant Frequency Threshold.

The default value is 0.01. This value specifies the threshold above which indel variants are called. Decreasing this value raises the risk of false positives.

23. Enter a positive integer value to define the Variant Length Filter.

The default value is 5. This value specifies the minimum indel variant length. Variants with lengths below this value are filtered.

24. To add reference calls with a 0/0 genotype to the filtered VCF file, select the Include reference calls checkbox.

25. To configure OncoCNV settings for the analysis, expand OncoCNV Settings.

26. To detect and report copy number variants using OncoCNV, select the Enable analysis of CNVs checkbox.

27. From the Baseline File (.txt) field, select Select Dataset File(s), and then select a baseline file.

Use only baseline files created with the OncoCNV Trainer app. Generic baseline files can be found in the Public Data project, OncoCNV Baselines.

28. Enter a positive integer value to define the Minimum Depth level required for a sample to be processed.

The default value is 10. Samples with a depth below the specified value are not processed and produce an empty output.

29. Select Launch Application to start the analysis.

When the analysis is complete, the status of the app session is automatically updated and you receive a confirmation email.
