Analysis Reports

The BWA Whole Genome Sequencing app provides an overview of statistics per sample on the sample pages. A brief description of the metrics is below.

Alignment Summary

Statistic Definition

Number of reads

Total number of reads passing filter for this sample.

Coverage

Total number of aligned bases divided by the genome size.

Percent Duplicate Paired Reads

Percentage of paired reads that have duplicates.

Fragment Length Median

Median length of the sequenced fragment. The fragment length is calculated based on the locations at which a read pair aligns to the reference. The read mapping information is parsed from the BAM files.

Fragment Length Standard Deviation

Standard deviation of the sequenced fragment length.
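
As a rough illustration of the coverage and fragment length definitions above, the following Python sketch computes the three values from plain numbers. It is not the app's implementation. The function names are hypothetical, the fragment lengths are assumed to have already been extracted from the BAM files, and the population standard deviation is an assumption because the report does not state which form it uses.

```python
import statistics


def mean_coverage(aligned_bases, genome_size):
    """Coverage: total number of aligned bases divided by the genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return aligned_bases / genome_size


def fragment_length_stats(fragment_lengths):
    """Return (median, standard deviation) for a list of fragment lengths.

    Assumption: population standard deviation (statistics.pstdev).
    """
    lengths = list(fragment_lengths)
    if not lengths:
        raise ValueError("at least one fragment length is required")
    return statistics.median(lengths), statistics.pstdev(lengths)


# Example with made-up numbers: 90 Gb of aligned bases on a 3 Gb genome.
coverage = mean_coverage(90_000_000_000, 3_000_000_000)
median_length, sd_length = fragment_length_stats([300, 310, 290, 305, 295])
```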

Read Statistics

Statistic Definition

Percent Aligned

Percentage of reads passing filter that aligned.

Percent Q30

The percentage of bases with a quality score of 30 or higher.

Mismatch Rate

The average percentage of mismatches across Read 1 and Read 2 over all cycles.
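
The Percent Aligned and Percent Q30 definitions can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the quality string is assumed to use the Phred+33 encoding found in standard FASTQ files.

```python
def percent_aligned(aligned_reads, reads_passing_filter):
    """Percentage of reads passing filter that aligned."""
    if reads_passing_filter <= 0:
        raise ValueError("reads_passing_filter must be positive")
    return 100 * aligned_reads / reads_passing_filter


def percent_q30(quality_string, offset=33):
    """Percentage of bases with a quality score of 30 or higher.

    Assumption: quality_string is Phred+33 encoded, so the score of each
    base is the character code minus 33.
    """
    scores = [ord(ch) - offset for ch in quality_string]
    if not scores:
        raise ValueError("quality_string must not be empty")
    return 100 * sum(1 for q in scores if q >= 30) / len(scores)


# 'I' encodes Q40, '?' encodes Q30, and '5' encodes Q20.
example_q30 = percent_q30("II?5")
```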

Small Variants Summary

This table provides metrics about the number of SNVs, insertions, and deletions.

Statistic Definition

Total Passing

Total number of variants present in the data set that pass the variant quality filters.

Percent found in dbSNP

100*(Number of variants in dbSNP/Number of variants).

Het/Hom Ratio

Number of heterozygous variants/Number of homozygous variants.

Ts/Tv Ratio

Number of transition SNVs that pass the quality filters divided by the number of transversion SNVs that pass the quality filters. Transitions are interchanges between purines (A and G) or between pyrimidines (C and T). Transversions are interchanges between a purine and a pyrimidine base (for example, A to T).
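
The three ratios in this table can be illustrated with a short Python sketch. The function names are hypothetical, the inputs are assumed to be counts and (reference, alternate) base pairs for variants that have already passed the quality filters, and Ts/Tv is computed as a simple ratio of SNV counts.

```python
# Purine-to-purine and pyrimidine-to-pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def percent_in_dbsnp(variants_in_dbsnp, total_variants):
    """100 * (number of variants in dbSNP / number of variants)."""
    return 100 * variants_in_dbsnp / total_variants


def het_hom_ratio(het_count, hom_count):
    """Number of heterozygous variants / number of homozygous variants."""
    return het_count / hom_count


def ts_tv_ratio(snvs):
    """Transitions divided by transversions for (ref, alt) SNV pairs.

    Assumption: every pair is a valid SNV (ref != alt, both in A/C/G/T),
    so any pair that is not a transition is counted as a transversion.
    Raises ZeroDivisionError if there are no transversions.
    """
    pairs = [(ref.upper(), alt.upper()) for ref, alt in snvs]
    transitions = sum(1 for pair in pairs if pair in TRANSITIONS)
    transversions = len(pairs) - transitions
    return transitions / transversions


# Three transitions (A>G, C>T, G>A) and one transversion (A>T).
example_ratio = ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "T"), ("G", "A")])
```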

Variants by Sequence Context

Statistic Definition

Number in Genes

The number of variants that fall into a gene.

Number in Exons

The number of variants that fall into an exon.

Number in Coding Regions

The number of variants that fall into a coding region.

Number in UTR Regions

The number of variants that fall into an untranslated region (UTR).

Number in Splice Site Regions

The number of variants that fall into a splice site region.

Number in Mature microRNA

The number of variants that fall into a mature microRNA.

Variants by Consequence

Statistic Definition

Frameshifts

The number of variants that cause a frameshift.

Non-synonymous

The number of variants that cause an amino acid change in a coding region.

Synonymous

The number of variants that are within a coding region, but do not cause an amino acid change.

Stop Gained

The number of variants that cause an additional stop codon.

Stop Lost

The number of variants that cause the loss of a stop codon.

Structural Variants Summary

This table breaks the structural variant output down by the class of variant called, and reports the total number of variants in each class and their overlap with annotated genes. All counts are based on PASS filter variants.

Variant Class Definition

CNV

Number of copy number variations.

Insertions

Number of insertions.

Tandem duplications

Number of tandem duplications.

Deletions

Number of deletions.

Inversions

Number of inversions.

Coverage Histogram

The coverage histogram shows the number of reference bases plotted against the depth of coverage (read depth). It has the following features:

The dropdown menu allows you to look at the overall picture, or highlight a particular chromosome.
The Fix Y Scale checkbox allows you to keep the Y Scale the same when comparing multiple chromosomes.
The Export TSV link allows you to export the coverage data in a tab-separated TXT file.
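
To make the histogram definition concrete, the following Python sketch tallies how many reference bases fall at each depth of coverage. It is an illustration only: the function name is hypothetical, the input is assumed to be one read depth value per reference base, and it does not reproduce the column layout of the exported file.

```python
from collections import Counter


def coverage_histogram(per_base_depths):
    """Return sorted (depth, number_of_reference_bases) pairs.

    Assumption: per_base_depths holds one read depth per reference base.
    """
    counts = Counter(per_base_depths)
    return sorted(counts.items())


# Seven reference bases with made-up depths.
histogram = coverage_histogram([0, 1, 1, 2, 2, 2, 30])
for depth, base_count in histogram:
    print(f"{depth}\t{base_count}")
```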

Figure 4: BWA Whole Genome Sequencing Coverage Histogram

© 2014 Illumina, Inc. All rights reserved.

15050953 Rev. B