VCF File Annotations

Heading

Description

FILTER

If a variant passes all filters, PASS is written in the FILTER column. Possible filter entries include:

LowDP—The depth of coverage at the site is below a cutoff.
LowGQ—The genotyping quality (GQ) is below a cutoff.
LowQual—The variant quality (QUAL) is below a cutoff.
LowVariantFreq—The variant frequency is less than the given threshold.
R8—For an indel, the number of adjacent repeats (1-base or 2-base) in the reference is greater than 8.
SB—The strand bias is greater than the given threshold. Used with the Somatic Variant Caller and GATK.
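
As an illustration, the FILTER column can be read with a few lines of Python. This is a minimal sketch, not part of the software: the function name failed_filters is hypothetical, and the semicolon separator and the "." placeholder (filters not applied) are assumptions taken from the general VCF specification rather than from this page.

```python
def failed_filters(filter_field):
    """Interpret the FILTER column of one VCF record.

    Returns [] when the record passed all filters, None when no
    filtering was applied ("."), otherwise the list of failed filters.
    """
    if filter_field == "PASS":
        return []
    if filter_field == ".":
        # Assumption from the VCF specification: "." means filters were not applied.
        return None
    # Assumption from the VCF specification: multiple filters are joined with ";".
    return filter_field.split(";")


# Hypothetical FILTER values, for illustration only.
print(failed_filters("PASS"))
print(failed_filters("LowDP;SB"))
```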

INFO

Possible entries in the INFO column include:

AC—Allele count in genotypes for each ALT allele, in the same order as listed.
AF—Allele frequency for each ALT allele, in the same order as listed.
AN—The total number of alleles in called genotypes.
CD—A flag indicating that the SNP occurs within the coding region of at least 1 RefGene entry.
DP—The depth (number of base calls aligned to a position and used in variant calling).
Exon—A comma-separated list of exon regions read from RefGene.
FC—Functional consequence.
GI—A comma-separated list of gene IDs read from RefGene.
QD—Variant Confidence/Quality by Depth.
TI—A comma-separated list of transcript IDs read from RefGene.
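
The INFO entries above can be split into a dictionary for easier lookup. The following is a minimal sketch under stated assumptions: the function name parse_info and the sample record are hypothetical, and the KEY=VALUE layout with semicolon separators follows the general VCF specification. Flags such as CD carry no value and are stored as True.

```python
def parse_info(info_field):
    """Split a VCF INFO column into a dictionary.

    KEY=VALUE entries become lists of strings (split on commas);
    flag entries without "=" (for example CD) are stored as True.
    """
    info = {}
    if info_field == ".":
        # "." marks an empty INFO column in the VCF specification.
        return info
    for entry in info_field.split(";"):
        if "=" in entry:
            key, value = entry.split("=", 1)
            info[key] = value.split(",")
        else:
            info[entry] = True
    return info


# Hypothetical INFO string, for illustration only.
record = parse_info("AC=1;AF=0.5;AN=2;DP=120;CD;GI=GENE1,GENE2")
print(record["DP"], record["CD"], record["GI"])
```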

FORMAT

The FORMAT column lists fields separated by colons, for example, GT:GQ. The fields provided depend on the variant caller used. Available fields include:

AD—Entry of the form X,Y, where X is the number of reference calls, and Y is the number of alternate calls.
DP—Approximate read depth; reads with MQ=255 or with bad mates are filtered.
GQ—Genotype quality.
GQX—Genotype quality. GQX is the minimum of the GQ value and the QUAL column. In general, these values are similar; taking the minimum makes GQX the more conservative measure of genotype quality.
GT—Genotype. 0 corresponds to the reference base, 1 corresponds to the first entry in the ALT column, and so on. The forward slash (/) indicates that no phasing information is available.
NL—Noise level; an estimate of base calling noise at this position.
PL—Normalized, Phred-scaled likelihoods for genotypes.
SB—Strand bias at this position. Larger negative values indicate less bias; values near 0 indicate more bias. Used with the Somatic Variant Caller and GATK.
VF—Variant frequency; the percentage of reads supporting the alternate allele.
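
Two of the fields above lend themselves to a short worked example: GQX as the minimum of GQ and QUAL, and GT as indexes into the reference and ALT alleles. This is a sketch only. The function names gqx and decode_genotype are hypothetical, and support for the pipe (|) separator used for phased genotypes and for the "." missing-allele marker is an assumption from the general VCF specification, since this page describes only the forward slash.

```python
import re


def gqx(gq, qual):
    """Return the more conservative of GQ and QUAL, as described for GQX."""
    return min(gq, qual)


def decode_genotype(gt, ref, alts):
    """Translate a GT value such as "0/1" into the alleles it refers to.

    Index 0 is the reference base; index 1 is the first ALT entry, and so on.
    A "." allele (no call) is returned as None.
    """
    alleles = [ref] + list(alts)
    # "/" means unphased; "|" (phased) is accepted as an assumption from the VCF spec.
    indexes = re.split(r"[/|]", gt)
    return [None if i == "." else alleles[int(i)] for i in indexes]


# Hypothetical values, for illustration only.
print(gqx(99, 45.5))
print(decode_genotype("0/1", "A", ["G"]))
print(decode_genotype("1/2", "A", ["G", "T"]))
```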

SAMPLE

The sample column gives the values for the fields listed in the FORMAT column, separated by colons and in the same order.
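
Pairing the FORMAT and sample columns can be sketched as follows. The function name parse_sample and the example values are hypothetical. Values are left as strings because their types depend on the field and the variant caller.

```python
def parse_sample(format_field, sample_field):
    """Pair each FORMAT field name with its value from the sample column."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    # zip stops at the shorter list, so trailing fields missing from the
    # sample column are simply absent from the result.
    return dict(zip(keys, values))


# Hypothetical FORMAT and sample strings, for illustration only.
sample = parse_sample("GT:GQ:AD:VF", "0/1:99:60,58:0.49")
print(sample["GT"], sample["GQ"], sample["AD"])
```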