Glossary

Term

Definition

BAM

A standard binary (usually compressed) file format for representing aligned reads. It is the binary counterpart of the Sequence Alignment/Map (SAM) format.

BCL

Base call files produced by Illumina sequencers.

bcl2fastq

An Illumina program that converts BCL files into FASTQ files and labels the resulting reads with their UMIs.

Collapse

To derive a consensus read from all the reads in a Family.

Correction

To alter a UMI or positional grouping to make that grouping more accurate based on known valid UMIs and nearby Families.

Duplex-Family

The result of collapsing two Families that originate from the two strands of a dsDNA molecule.

Family

A group of read pairs that share the same position and UMI tags. Also referred to as a "Bag" in some metrics.

FASTQ

A standard file format for representing unaligned reads and their corresponding quality data.

Manifest

A set of genomic intervals (or regions).

Noise Allele Frequency / Error Rate

The average of all allele frequency values between 0 and 0.05, excluding loci that overlap known germline mutations from the 1000 Genomes data set.

Read Pair

A Read 1/Read 2 pair of reads from the same cluster.

ReCo

The ReadCollapser program.

Stitcher

The read stitching program.

UMI

Unique Molecular Identifier.
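
The Family, Collapse, and Noise Allele Frequency entries above can be illustrated with a minimal Python sketch. This is not how ReCo is implemented; it only mirrors the definitions. The record layout (dictionaries with "position", "umi", and "sequence" keys), the function names, the per-base majority vote, and the choice to treat the 0 and 0.05 bounds as exclusive are all assumptions made for illustration. Filtering out known germline loci is assumed to have happened before the frequencies are passed in.

```python
from collections import Counter, defaultdict


def group_into_families(read_pairs):
    """Group read pairs that share the same position and UMI (hypothetical record layout)."""
    families = defaultdict(list)
    for pair in read_pairs:
        families[(pair["position"], pair["umi"])].append(pair["sequence"])
    return dict(families)


def collapse(sequences):
    """Derive a consensus by per-base majority vote.

    Assumes all sequences have equal length; ties go to the base seen first.
    """
    consensus = []
    for column in zip(*sequences):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


def noise_allele_frequency(allele_frequencies, upper=0.05):
    """Average the allele frequencies strictly between 0 and `upper`.

    Assumes germline loci were already excluded; returns 0.0 if nothing qualifies.
    """
    in_range = [af for af in allele_frequencies if 0 < af < upper]
    return sum(in_range) / len(in_range) if in_range else 0.0


# Illustrative data only: three read pairs in one family, one in another.
read_pairs = [
    {"position": ("chr1", 1000), "umi": "ACGT", "sequence": "AACCGT"},
    {"position": ("chr1", 1000), "umi": "ACGT", "sequence": "AACCGT"},
    {"position": ("chr1", 1000), "umi": "ACGT", "sequence": "AACTGT"},
    {"position": ("chr1", 1000), "umi": "TTAG", "sequence": "AACCGT"},
]

families = group_into_families(read_pairs)
consensus = {key: collapse(seqs) for key, seqs in families.items()}
```

Here the three reads tagged ACGT form one Family and collapse to a single consensus read, while the read tagged TTAG forms its own Family even though it maps to the same position.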