Glossary
Term | Definition |
---|---|
BAM | A standard (usually compressed) file format for representing aligned reads. It is the binary version of the Sequence Alignment/Map (SAM) format. |
BCL | Base call files produced by Illumina sequencers. |
bcl2fastq | An Illumina program that translates BCL files into FASTQ files and labels the resulting reads with their UMIs. |
Collapse | To derive a consensus read from all the reads in a Family. |
Correction | To alter a UMI or positional grouping, based on known valid UMIs and nearby Families, so that the grouping is more accurate. |
Duplex-Family | The result of collapsing two Families that originate from the two strands of dsDNA. |
Family | A group of read pairs that share the same position and UMI tags. Also referred to as a "Bag" in some metrics. |
FASTQ | A standard file format for representing unaligned reads and their corresponding quality data. |
Manifest | A set of genomic intervals (or regions). |
Noise Allele Frequency / Error Rate | The average of all allele frequency values between 0 and 0.05, excluding loci that overlap known germline mutations in the 1000 Genomes data set. |
Read Pair | A Read 1/Read 2 pair of reads from the same cluster. |
ReCo | The ReadCollapser program. |
Stitcher | The read stitching program. |
UMI | Unique Molecular Identifier. |
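
The Family, Collapse, and Noise Allele Frequency entries describe small computations, so a toy sketch may help make them concrete. It is not how ReCo or any Illumina tool is implemented. The function names, the dictionary-based read representation, the per-base majority vote used for the consensus, and the choice to treat the 0 and 0.05 bounds as exclusive are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def group_into_families(read_pairs):
    """Group read pairs that share the same position and UMI (one Family each).

    Each read pair is a hypothetical dict with "position", "umi", "sequence".
    """
    families = defaultdict(list)
    for pair in read_pairs:
        families[(pair["position"], pair["umi"])].append(pair["sequence"])
    return dict(families)


def collapse(sequences):
    """Derive a consensus from equal-length sequences in one Family.

    Assumption: a simple per-base majority vote stands in for the real
    collapsing rule, which the glossary does not specify.
    """
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )


def noise_allele_frequency(allele_frequencies, known_germline_loci,
                           low=0.0, high=0.05):
    """Average the allele frequencies strictly between low and high.

    Loci listed in known_germline_loci are excluded. Returns None when no
    locus qualifies. Exclusive bounds are an assumption.
    """
    kept = [
        af for locus, af in allele_frequencies.items()
        if locus not in known_germline_loci and low < af < high
    ]
    return sum(kept) / len(kept) if kept else None


# Toy data: three read pairs share a position and UMI, one has another UMI.
read_pairs = [
    {"position": ("chr1", 100), "umi": "ACG", "sequence": "ACGTAC"},
    {"position": ("chr1", 100), "umi": "ACG", "sequence": "ACGTAC"},
    {"position": ("chr1", 100), "umi": "ACG", "sequence": "ACGAAC"},
    {"position": ("chr1", 100), "umi": "TTG", "sequence": "ACGTAC"},
]
families = group_into_families(read_pairs)
consensus = {key: collapse(seqs) for key, seqs in families.items()}

# Toy allele frequencies keyed by locus; one locus is a known germline site.
allele_frequencies = {
    ("chr1", 100): 0.01,
    ("chr1", 200): 0.03,
    ("chr1", 300): 0.50,
    ("chr1", 400): 0.02,
}
noise = noise_allele_frequency(allele_frequencies, {("chr1", 400)})
```

In this toy input the four read pairs form two Families, and the single mismatching base in the larger Family is outvoted during collapsing. The noise calculation skips the locus at 0.50 because it lies outside the range, and skips the known germline locus regardless of its value.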