Specify B-Allele Loci

Use the --cnv-normal-b-allele-vcf option to specify a matched normal SNV VCF. Ideally, this VCF file comes from processing the matched normal sample through the DRAGEN germline small variant caller with filters applied. Typically, this file name has a *.hard-filtered.vcf.gz extension. All records marked as PASS that are determined to be heterozygous in the normal sample are used to measure the b-allele counts of the tumor sample. You can also use the equivalent gVCF file (*.hard-filtered.gvcf.gz), but processing takes significantly longer because of the number of records, most of which are not heterozygous sites. Using the VCF file is recommended.
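The selection criterion above (PASS and heterozygous in the normal sample) can be sketched in Python, for example to estimate how many records in a matched normal VCF would qualify. This is an illustration only, not DRAGEN's internal logic. The function name is hypothetical, and the sketch assumes tab-separated VCF records with the normal sample in the first sample column and a diploid GT entry in the FORMAT field.

```python
def is_pass_het(record_line: str) -> bool:
    """Return True for a PASS record whose first sample genotype is heterozygous.

    Hypothetical helper: assumes one tab-separated VCF data line with the
    normal sample in column 10 and a diploid GT value such as 0/1 or 0|1.
    """
    fields = record_line.rstrip("\n").split("\t")
    if len(fields) < 10:  # need FORMAT plus at least one sample column
        return False
    if fields[6] != "PASS":  # FILTER column
        return False
    format_keys = fields[8].split(":")
    if "GT" not in format_keys:
        return False
    genotype = fields[9].split(":")[format_keys.index("GT")]
    alleles = genotype.replace("|", "/").split("/")
    # Heterozygous: two called alleles that differ from each other.
    return len(alleles) == 2 and "." not in alleles and alleles[0] != alleles[1]
```

Applying a check like this to a gVCF shows why the VCF is preferred: most gVCF records are reference blocks or homozygous sites and would be read and then discarded.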

Use the --cnv-population-b-allele-vcf option to specify a population SNP VCF. To obtain a population SNP VCF, process an appropriate catalog of population variation, such as dbSNP, the 1000 Genomes Project (1000G), or other large cohort discovery efforts. Include only high-frequency SNPs. For example, include SNPs with a population minor allele frequency ≥ 10% to limit the impact on run time and reduce artifacts. Specify the ALT allele frequency by adding AF=<alt frequency> to the INFO section of each record. Additional INFO fields might be present, but DRAGEN only parses and uses the AF field. Sites specified with --cnv-population-b-allele-vcf can be either heterozygous or homozygous in the germline genome from which the tumor genome derives.

The following is an example valid population SNP record:

chr1 51479 . T A 1000 PASS AF=0.3253
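When preparing a population SNP VCF from a larger catalog, the frequency threshold can be applied with a small filter such as the following Python sketch. The function names and the MIN_MAF constant are hypothetical. The sketch assumes tab-separated records with a single numeric AF value in the INFO column, and it interprets "minor allele frequency" as min(AF, 1 - AF), which is an assumption about how the 10% guideline is meant to be applied.

```python
from typing import Optional

MIN_MAF = 0.10  # threshold taken from the 10% example in the text


def parse_af(info: str) -> Optional[float]:
    """Return the AF value from a VCF INFO string, or None if absent or non-numeric."""
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "AF":
            try:
                return float(value)
            except ValueError:
                return None  # e.g. a comma-separated multi-allelic AF
    return None


def is_high_frequency(record_line: str, min_maf: float = MIN_MAF) -> bool:
    """Return True if the record's minor allele frequency is at least min_maf."""
    fields = record_line.rstrip("\n").split("\t")
    if len(fields) < 8:  # INFO is the eighth column
        return False
    af = parse_af(fields[7])
    if af is None:
        return False
    # AF is the ALT frequency; the minor allele may be either REF or ALT.
    return min(af, 1.0 - af) >= min_maf
```

For the example record above, AF=0.3253 gives a minor allele frequency of 0.3253, so the record would be kept.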

DRAGEN considers the following requirements when parsing records from the b-allele VCF:

Only simple SNV sites are considered.
Records must be marked PASS in the FILTER field.
If there are duplicate records (same CHROM and POS) in the VCF, then the first occurring record is used.
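The three requirements above can be sketched as a pre-check on a b-allele VCF before passing it to DRAGEN. This is a rough Python illustration, not DRAGEN's parser. The function name is hypothetical, and two points are assumptions: a "simple SNV" is taken to mean single-base REF and ALT alleles from A, C, G, T, and duplicates are resolved by position before the other checks are applied.

```python
from typing import Iterable, Iterator, List

BASES = {"A", "C", "G", "T"}


def select_b_allele_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield the fields of records that satisfy the three listed requirements.

    Hypothetical helper: expects tab-separated VCF lines; header lines
    starting with '#' and blank lines are skipped.
    """
    seen_positions = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue
        chrom, pos, ref, alt, filt = (
            fields[0], fields[1], fields[3], fields[4], fields[6]
        )
        # Duplicate rule: only the first record at a CHROM/POS is a candidate.
        # Doing this before the other checks is an assumption.
        if (chrom, pos) in seen_positions:
            continue
        seen_positions.add((chrom, pos))
        # Simple SNV (assumed meaning): one reference base, one alternate base.
        if ref.upper() not in BASES or alt.upper() not in BASES:
            continue
        # FILTER must be PASS.
        if filt != "PASS":
            continue
        yield fields
```

Running a check like this on a candidate file gives a quick count of usable sites and highlights indels, multi-allelic records, or non-PASS entries that would be ignored.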