Alignment

The Alignment analysis step first checks the FASTQ files associated with each sample for correct UMI information. If the FASTQ files do not contain UMI information specific to the TruSight Oncology UMI Reagents, the software does not perform the analysis.

The Alignment step uses the Burrows-Wheeler Aligner (BWA) with the Maximal Exact Matches (MEM) algorithm to align DNA reads to hg19 (Homo_sapiens_masked PAR, source UCSC, build hg19). The BWA version used in the Alignment step has a known issue that can produce slightly different results depending on the number of threads used. To keep results consistent between runs, use the same value for the BwaThreadCount parameter in the override parameter file for every run.
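
As an illustration of pinning the thread count, the following minimal Python sketch builds a standalone bwa mem command line with an explicit -t value. It does not show how the software invokes BWA internally, and it does not model the override parameter file, whose format is not described here. The helper name build_bwa_mem_command and the file names are hypothetical; the only BWA features assumed are the mem subcommand and its -t (thread count) option.

```python
from typing import List, Optional


def build_bwa_mem_command(
    reference: str,
    fastq_r1: str,
    fastq_r2: Optional[str] = None,
    threads: int = 1,
) -> List[str]:
    """Return an argument list for `bwa mem` with a fixed thread count.

    Hypothetical helper for illustration only. Passing the same `threads`
    value on every run mirrors the advice to keep BwaThreadCount constant.
    """
    if threads < 1:
        raise ValueError("threads must be a positive integer")
    cmd = ["bwa", "mem", "-t", str(threads), reference, fastq_r1]
    if fastq_r2 is not None:
        # Paired-end data: the mate file follows the first read file.
        cmd.append(fastq_r2)
    return cmd


# Example with placeholder file names (assumptions, not product paths).
command = build_bwa_mem_command(
    "hg19_masked_par.fa", "sample_R1.fastq.gz", "sample_R2.fastq.gz", threads=8
)
```

The argument list can be passed to subprocess.run to execute the alignment, provided a bwa executable and an indexed reference are available.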

The hg19 genome used here is the human reference sequence with the chromosome Y pseudoautosomal regions (PARs) masked with Ns. The PARs are present on both the X and Y chromosomes. Because the sequences are identical between the two chromosomes, reads that originate from these regions cannot be mapped uniquely. The masked coordinates are chrY:10001-2649520 and chrY:59034050-59363566. For reference, the corresponding PARs on chrX are chrX:60001-2699520 and chrX:154931044-155260560.
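
The PAR coordinates listed above can be checked programmatically. The following minimal Python sketch tests whether a chrY position falls in a masked region and, if so, computes the matching chrX position. It assumes the coordinates are 1-based and inclusive, which is not stated explicitly here. The names PAR_REGIONS, is_masked_on_chry, and chry_to_chrx are hypothetical and not part of the software.

```python
from typing import Optional

# Each entry: (chrY_start, chrY_end, chrX_start, chrX_end).
# Assumed to be 1-based, inclusive coordinates on hg19.
PAR_REGIONS = [
    (10001, 2649520, 60001, 2699520),            # first PAR
    (59034050, 59363566, 154931044, 155260560),  # second PAR
]


def is_masked_on_chry(pos: int) -> bool:
    """Return True if a chrY position lies in a masked PAR."""
    return any(y_start <= pos <= y_end for y_start, y_end, _, _ in PAR_REGIONS)


def chry_to_chrx(pos: int) -> Optional[int]:
    """Map a chrY PAR position to the corresponding chrX position.

    Returns None when the position is outside both PARs.
    """
    for y_start, y_end, x_start, _ in PAR_REGIONS:
        if y_start <= pos <= y_end:
            return x_start + (pos - y_start)
    return None


# Sanity check: each chrY region has the same length as its chrX counterpart.
par_lengths = [
    (y_end - y_start + 1, x_end - x_start + 1)
    for y_start, y_end, x_start, x_end in PAR_REGIONS
]
```

Under this assumption, both region pairs have equal lengths on chrX and chrY, which is consistent with the regions being identical between the two chromosomes.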

The inputs are FASTQ files and the outputs are BAM files and their corresponding BAM index files.

More information about the algorithm can be found on the BWA website (http://bio-bwa.sourceforge.net/).