VCF File Annotations

Heading

Description

FILTER

If all filters are passed, PASS is written in the filter column.

artifact_in_normal–TLOD of the normal read set (Normal artifact LOD) exceeds threshold.
base_quality–Site filtered because median base quality of alt reads at this locus does not meet threshold.
clustered_events–Clustered events observed in the tumor.
DRAGENHardQUAL–Applied to sites with QUAL < 10.41 (default cutoff).
fragment_length–Site filtered because absolute difference between the median fragment length of alt reads and median fragment length of ref reads at this locus exceeds threshold.
germline_risk–Evidence indicates this site is germline, not somatic.
lod_fstar–Applied to sites in the mitochondrial contig where LOD < 6.3 (default cutoff).
low_af–Allele frequency does not meet threshold.
LowDepth–Applied to sites with DP ≤ 1 (default cutoff).
mapping_quality–Site filtered because median mapping quality of alt reads at this locus does not meet threshold.
multiallelic–Site filtered because more than two alt alleles pass tumor LOD.
panel_of_normals–Seen in at least one sample in the panel of normals VCF.
PloidyConflict–Applied to sites where the genotype call from the variant caller is not consistent with chromosome ploidy (for example, chrY in female subjects).
read_position–Site filtered because median of distances between start/end of read and this locus exceeds threshold.
str_contraction–Site filtered due to suspected PCR error where the alt allele is one repeat unit less than the reference.
t_lod–Tumor does not meet likelihood threshold.
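
As a minimal sketch of how the FILTER column can be read, the Python function below returns the names of the filters a record failed. It assumes the standard VCF convention that multiple filter names are joined with semicolons and that "." means no filters were applied. The function name failed_filters is hypothetical and is not part of any tool described here.

```python
def failed_filters(filter_field):
    """Return the filter names listed in a VCF FILTER value.

    An empty list means the record passed (PASS) or that no
    filters were applied ("." in the VCF convention).
    """
    if filter_field in ("PASS", "."):
        return []
    # Several failed filters are joined with semicolons.
    return filter_field.split(";")


print(failed_filters("PASS"))
print(failed_filters("base_quality;low_af"))
```

A record can then be kept or dropped by checking whether the returned list is empty.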

INFO

Possible entries in the INFO column include:

AC–Allele count in genotypes, for each ALT allele, in the same order as listed.
AF–Allele Frequency, for each ALT allele, in the same order as listed.
AN–Total number of alleles in called genotypes.
DB–dbSNP Membership.
DP–Approximate read depth (informative and non-informative); some reads may have been filtered based on mapq, etc.
END–Stop position of the interval in gVCF banding.
FS–Phred-scaled p-value using Fisher's exact test to detect strand bias.
FractionInformativeReads–The fraction of informative reads out of the total reads.
LOD–Variant LOD score (for mitochondrial contig).
MQ–RMS Mapping Quality.
MQRankSum–Z-score from Wilcoxon rank sum test of Alt vs. Ref read mapping qualities.
NLOD–Normal LOD score.
QD–Variant Confidence/Quality by Depth.
R2_5P_bias–Score based on mate bias and distance from 5 prime end.
ReadPosRankSum–Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias.
SOR–Symmetric Odds Ratio of 2x2 contingency table to detect strand bias.
TLOD–Tumor LOD score.
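
The INFO entries above can be split into a lookup table with a few lines of Python. This is a sketch that assumes the standard VCF layout: entries separated by semicolons, key=value pairs, comma-separated values for per-ALT fields such as AC and AF, and flag entries such as DB that carry no value. The function name parse_info and the sample string are illustrative only.

```python
def parse_info(info_field):
    """Parse a VCF INFO string into a dict.

    Key=value entries map to a list of strings (one item per
    comma-separated value); flag entries map to True.
    """
    info = {}
    if info_field == ".":
        return info  # "." means the INFO column is empty
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        if sep:
            info[key] = value.split(",")
        else:
            info[key] = True  # flag such as DB
    return info


# Hypothetical INFO value, for illustration only.
record = parse_info("AC=1;AF=0.5;AN=2;DB;DP=30")
print(record["AF"], record["DB"])
```

Values are left as strings because the numeric type differs by field; callers can convert them as needed, for example float(record["AF"][0]).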

FORMAT

The format column lists fields separated by colons. For example, GT:GQ. The list of fields provided depends on the variant caller used. Available fields include:

AD–Allele Depth (counting only informative reads out of the total reads) for the ref and alt alleles in the order listed. If the GT is 0/0, the AD is the reference count. If the GT is 0/1 or 1/1, the AD is of the form X,Y, where X is the reference allele count and Y is the alternative allele count. If the GT is 1/2, the AD is of the form Y,Z, where Y and Z are the alternative allele 1 and 2 counts.
AF–Allele fractions for alt alleles in the order listed.
DP–Approximate read depth (reads with MQ=255 or with bad mates are filtered).
F1R2–Count of reads in F1R2 pair orientation supporting each allele.
F2R1–Count of reads in F2R1 pair orientation supporting each allele.
FT–Sample filter, 'PASS' indicates that all filters have passed for this sample (used in multi-sample VCF).
GP–Phred-scaled posterior probabilities for genotypes as defined in the VCF specification.
GQ–Phred-scaled Genotype Quality.
GT–Genotype.
ICNT–Counts of INDEL informative reads based on the reference confidence model.
LOD–Per-sample variant LOD score (for mitochondrial contig).
MB–Per-sample component statistics to detect mate bias.
MIN_DP–Minimum DP observed within the gVCF block.
PL–Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification.
PRI–Phred-scaled prior probabilities for genotypes.
PS–Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group.
SB–Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.
SPL–Normalized, Phred-scaled likelihoods for SNPs based on the reference confidence model.
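
Several of the fields above (GQ, GP, PL, PRI, SPL) are Phred-scaled. Under the usual Phred definition, Q = -10 * log10(p), so a value can be converted back to a probability or relative likelihood as sketched below. The function names are hypothetical, and the example numbers are made up for illustration.

```python
def phred_to_probability(q):
    """Convert a Phred-scaled value Q to p = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def pl_to_relative_likelihoods(pl_values):
    """Convert normalized PL values to likelihoods relative to the best genotype.

    PL is normalized so the most likely genotype has PL = 0,
    which maps to a relative likelihood of 1.0.
    """
    return [phred_to_probability(pl) for pl in pl_values]


# With GQ = 20, the chance that the genotype call is wrong is about 1 in 100.
print(phred_to_probability(20))
print(pl_to_relative_likelihoods([0, 30, 60]))
```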

SAMPLE

The sample column gives the values specified in the FORMAT column.
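
Because the FORMAT column names the fields and the sample column supplies their values in the same colon-separated order, the two can be paired directly. The sketch below shows one way to do this in Python; parse_sample and the example record are hypothetical, and the AD handling covers only the 0/1 case described above (X,Y with X the reference count and Y the alternative count).

```python
def parse_sample(format_field, sample_field):
    """Pair FORMAT keys with the values in a sample column."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    # zip stops at the shorter list, so omitted trailing fields are skipped.
    return dict(zip(keys, values))


# Hypothetical heterozygous call, for illustration only.
sample = parse_sample("GT:GQ:AD:DP", "0/1:48:12,9:21")
ref_count, alt_count = (int(n) for n in sample["AD"].split(","))
print(sample["GT"], ref_count, alt_count)
```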