QUAL, QD, and GQ Formulation

In single-sample VCF and gVCF output, QUAL follows the definition in the VCF specification (https://samtools.github.io/hts-specs/VCFv4.3.pdf).

QUAL is the Phred-scaled probability that the site has no variant and is computed as:

QUAL = -10*log10 (posterior genotype probability of a homozygous-reference genotype (GT=0/0))

That is, QUAL = GP(GT=0/0), where GP is the posterior genotype probability on the Phred scale.

QUAL = 20 means there is a 1% probability that the site has no variant, that is, a 99% probability that there is a variant at the site. The GP values in the VCF file are also Phred-scaled.
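The Phred conversion behind QUAL can be sketched in a few lines of Python. This is only an illustration of the formula above; the function names are hypothetical and do not belong to any tool's API.

```python
import math

def prob_to_phred(p: float) -> float:
    """Phred-scale a probability: -10 * log10(p)."""
    return -10 * math.log10(p)

def phred_to_prob(q: float) -> float:
    """Invert the Phred scale: 10^(-q/10)."""
    return 10 ** (-q / 10)

# QUAL is the Phred-scaled posterior probability of GT=0/0.
# Assumed example value: a 1% posterior probability of homozygous reference.
p_hom_ref = 0.01
qual = prob_to_phred(p_hom_ref)         # about 20
p_variant = 1 - phred_to_prob(qual)     # about 0.99
```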

GQ is the Phred-scaled probability that the call is incorrect.

GQ=-10*log10(p), where p is the probability that the call is incorrect.

GQ=-10*log10(sum(10^(-GP(i)/10))), where the sum is taken over every genotype i other than the called (winning) genotype.

So a GQ of 3 means there is about a 50% chance that the call is incorrect, and a GQ of 20 means there is a 1% chance that the call is incorrect.
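The GQ formula can be sketched as follows. This is a minimal illustration, not the caller's implementation: the function name is hypothetical, the GP values are made up, and the winning genotype is assumed to be the one with the smallest Phred-scaled GP (the highest posterior probability).

```python
import math

def gq_from_gp(gp_phred):
    """Compute GQ from a list of Phred-scaled genotype posteriors (GP)."""
    # Assumption: the called genotype has the lowest Phred-scaled GP.
    winner = min(range(len(gp_phred)), key=lambda i: gp_phred[i])
    # Sum the linear-scale posteriors of every genotype that did not win.
    p_incorrect = sum(
        10 ** (-gp / 10) for i, gp in enumerate(gp_phred) if i != winner
    )
    return -10 * math.log10(p_incorrect)

# Made-up GP values for genotypes 0/0, 0/1, 1/1 (posteriors of
# roughly 0.001, 0.989, and 0.01).
gp = [30.0, 0.048, 20.0]
gq = gq_from_gp(gp)   # p_incorrect is about 0.011, so GQ is a little under 20
qual = gp[0]          # QUAL = GP(GT=0/0) = 30
```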

QD is QUAL normalized by the read depth, DP: QD = QUAL/DP.
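As a small sketch of this normalization, assuming hypothetical example values and a hypothetical function name (returning None for zero depth is an illustrative choice, not documented behavior):

```python
def qd_from_qual(qual: float, dp: int):
    """Normalize QUAL by read depth DP."""
    if dp <= 0:
        # Assumption for this sketch: QD is undefined without coverage.
        return None
    return qual / dp

# Example: QUAL of 300 at a depth of 30 reads.
qd = qd_from_qual(300.0, 30)   # 10.0
```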

| Metric | Description | Formulation | Scale | Numerical example |
|--------|-------------|-------------|-------|-------------------|
| QUAL | Probability that the site has no variant | QUAL = GP(GT=0/0) | Unsigned Phred | QUAL=20: 1% chance that there is no variant at the site. QUAL=50: 1 in 1e5 chance that there is no variant at the site. |
| GQ | Probability that the call is incorrect | GQ=-10*log10(p) | Unsigned Phred | GQ=3: 50% chance that the call is incorrect. GQ=20: 1% chance that the call is incorrect. |
| QD | QUAL normalized by depth | QUAL/DP | Unsigned Phred | |