BAM File Format
A BAM file (*.bam) is the compressed binary version of a SAM file that is used to represent aligned sequences up to 128 Mb. SAM and BAM formats are described in detail at https://samtools.github.io/hts-specs/SAMv1.pdf.
BAM files use the file naming format of SampleName_S#.bam, where # is the sample number determined by the order in which samples are listed for the run. In multi-node mode, S# is always set to S1, regardless of the sample order.
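The naming format above can be split back into its parts with a regular expression. The following is a minimal sketch, not part of the app; the function name `parse_bam_name` and the example file names are hypothetical, and it assumes the sample number is the last `_S#` group before the `.bam` extension.

```python
import re

# Pattern for the SampleName_S#.bam naming format described above.
# The greedy sample group means only the final "_S<digits>" is treated
# as the sample number (an assumption for names that contain "_S").
_BAM_NAME = re.compile(r"^(?P<sample>.+)_S(?P<number>\d+)\.bam$")


def parse_bam_name(filename):
    """Return (sample_name, sample_number), or None if the name does not match."""
    match = _BAM_NAME.match(filename)
    if match is None:
        return None
    return match.group("sample"), int(match.group("number"))


# Hypothetical file names, for illustration only.
print(parse_bam_name("Sample1_S1.bam"))
print(parse_bam_name("Sample1_S1.bam.bai"))  # index file, not a BAM name
```

Because the pattern is anchored at `.bam`, index files (*.bam.bai) are rejected rather than parsed as BAM files.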
BAM files contain a header section and an alignment section:
▶ Header: Contains information about the entire file, such as sample name, sample length, and alignment method. Alignments in the alignments section are associated with specific information in the header section.
▶ Alignments: Contains read name, read sequence, read quality, alignment information, and custom tags. The read name includes the chromosome, start coordinate, alignment quality, and the match descriptor string.
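To illustrate how the header section sits at the start of the file, the sketch below extracts the header text using only the Python standard library. It relies on the layout given in the SAM/BAM specification linked above: BGZF compression (a series of gzip members), the magic bytes `BAM\1`, then a little-endian 32-bit length followed by the header text. The function name `read_bam_header_text` is hypothetical, and the whole file is decompressed in memory, so this is suitable only for small files. A dedicated library such as samtools or pysam is the better choice in practice.

```python
import gzip
import struct


def read_bam_header_text(bam_bytes):
    """Return the plain-text header stored at the start of a BAM file."""
    # BGZF is a series of concatenated gzip members, which
    # gzip.decompress reads as a single stream.
    raw = gzip.decompress(bam_bytes)
    if raw[:4] != b"BAM\x01":
        raise ValueError("not a BAM file: bad magic bytes")
    # l_text: little-endian int32 giving the length of the header text.
    (l_text,) = struct.unpack_from("<i", raw, 4)
    text = raw[8:8 + l_text]
    # The header text may be NUL-padded.
    return text.rstrip(b"\x00").decode("utf-8")
```

Given the raw bytes of a small BAM file (for example, `open(path, "rb").read()`), the function returns the `@HD`, `@SQ`, `@RG`, and related header lines as text.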
The alignments section includes the following information for each read or read pair:
▶ RG: Read group, which indicates the number of reads for a specific sample.
▶ BC: Barcode tag, which indicates the demultiplexed sample ID associated with the read.
▶ SM: Single-end alignment quality.
▶ AS: Paired-end alignment quality.
▶ NM: Edit distance tag, which records the Levenshtein distance between the read and the reference.
▶ XN: Amplicon name tag, which records the amplicon tile ID associated with the read.
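In the text (SAM) form of an alignment record, tags such as those listed above appear after the 11 mandatory columns as tab-separated `TAG:TYPE:VALUE` fields. The sketch below collects them into a dictionary. The function name `parse_sam_tags` and the example record are hypothetical. The type codes shown for BC and XN (`Z`, a string) are an assumption, not something this help page states.

```python
def parse_sam_tags(sam_line):
    """Return the optional TAG:TYPE:VALUE fields of one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # Columns 1-11 are mandatory; optional tags start at column 12.
    for field in fields[11:]:
        tag, type_code, value = field.split(":", 2)
        if type_code == "i":      # integer
            tags[tag] = int(value)
        elif type_code == "f":    # floating point
            tags[tag] = float(value)
        else:                     # A, Z, H, B: kept as text in this sketch
            tags[tag] = value
    return tags


# Hypothetical alignment record, for illustration only.
example = "\t".join([
    "read1", "99", "chr1", "100", "60", "10M", "=", "200", "110",
    "ACGTACGTAC", "IIIIIIIIII",
    "RG:Z:Sample1", "BC:Z:Sample1", "SM:i:60", "AS:i:98", "NM:i:1",
    "XN:Z:tile_42",
])
print(parse_sam_tags(example)["NM"])
```

Running `samtools view` on a BAM file produces lines in this text form, so the same function could be applied to its output.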
BAM index files (*.bam.bai) provide an index of the corresponding BAM file.
TruSeq Amplicon v2.0 App Online Help