BAM File Format

A BAM file (*.bam) is the compressed binary version of a SAM file that is used to represent aligned sequences. SAM and BAM formats are described in detail at samtools.github.io/hts-specs/SAMv1.pdf.

BAM files are named SampleName_S#.bam, where # is the sample number determined by the order in which samples are listed for the run. In multinode mode, S# is always set to S1, regardless of the sample order.
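
As a rough illustration of this naming convention, the following Python sketch splits a BAM file name into its sample name and sample number. The function name parse_bam_name and the regular expression are assumptions made for this example and are not part of the software.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: <SampleName>_S<number>.bam. The sample number is taken
# from the last "_S<digits>" that appears directly before the extension.
_BAM_NAME = re.compile(r"^(?P<sample>.+)_S(?P<number>\d+)\.bam$")


class BamName(NamedTuple):
    sample_name: str
    sample_number: int


def parse_bam_name(file_name: str) -> Optional[BamName]:
    """Return the sample name and number, or None if the name does not match."""
    match = _BAM_NAME.match(file_name)
    if match is None:
        return None
    return BamName(match.group("sample"), int(match.group("number")))


# Hypothetical file name used only for illustration.
print(parse_bam_name("NA12878_S3.bam"))
```

Because the pattern is anchored at the end of the name, an index file such as NA12878_S3.bam.bai is not matched and the function returns None for it.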

BAM files contain a header section and an alignment section:

Header—Contains information about the entire file, such as sample name, reference sequence lengths, and alignment method. Alignments in the alignments section are associated with specific information in the header section.
Alignments—Contains read name, read sequence, read quality, alignment information, and custom tags. The alignment information for each read includes the chromosome, start coordinate, alignment quality, and match descriptor string.
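
To make the two-part layout more concrete, here is a minimal sketch that reads only the header section of a BAM file with the Python standard library. It follows the binary layout given in the SAM/BAM specification (magic bytes, header text, then a list of reference names and lengths). The function names are hypothetical, and the sketch relies on the fact that BAM compression (BGZF) is a series of gzip members that the gzip module can read.

```python
import gzip
import struct
from typing import BinaryIO, List, Tuple


def read_bam_header(stream: BinaryIO) -> Tuple[str, List[Tuple[str, int]]]:
    """Parse the header from an already-decompressed BAM byte stream.

    Returns the SAM-style header text and a list of
    (reference name, reference length) pairs.
    """
    if stream.read(4) != b"BAM\x01":
        raise ValueError("not a BAM stream (bad magic bytes)")
    (text_length,) = struct.unpack("<i", stream.read(4))
    # The header text may be NUL-padded, so trailing NUL bytes are dropped.
    text = stream.read(text_length).rstrip(b"\x00").decode("utf-8")
    (reference_count,) = struct.unpack("<i", stream.read(4))
    references = []
    for _ in range(reference_count):
        (name_length,) = struct.unpack("<i", stream.read(4))
        # Reference names are stored NUL-terminated.
        name = stream.read(name_length).rstrip(b"\x00").decode("ascii")
        (reference_length,) = struct.unpack("<i", stream.read(4))
        references.append((name, reference_length))
    return text, references


def read_bam_header_from_file(path: str) -> Tuple[str, List[Tuple[str, int]]]:
    """Open a .bam file and parse its header section."""
    with gzip.open(path, "rb") as stream:
        return read_bam_header(stream)
```

Calling read_bam_header_from_file on a BAM file returns the header text, in which lines such as @RG and @SQ carry the sample and reference information, without decoding any alignment records.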

The alignments section includes the following information for each read or read pair:

RG—Read group, which identifies the group of reads for a specific sample that the read belongs to.
BC—Barcode tag, which indicates the demultiplexed sample ID associated with the read.
SM—Single-end alignment quality.
NM—Edit distance tag, which records the Levenshtein distance between the read and the reference.
XU—Identifies the UMI BAM tag.
XB—Identifies the bead-barcode BAM tag.
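
The tags above are stored in binary form inside the BAM file, but they appear as TAG:TYPE:VALUE fields after the 11 mandatory columns when a record is printed as SAM text (for example with samtools view). The following sketch parses those fields from one text line. The function name parse_optional_tags is hypothetical, the example record is invented for illustration, and the Z (string) type used for BC and XU is an assumption.

```python
from typing import Dict, Union

TagValue = Union[int, float, str]


def parse_optional_tags(sam_line: str) -> Dict[str, TagValue]:
    """Return the optional TAG:TYPE:VALUE fields of one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    tags: Dict[str, TagValue] = {}
    for field in fields[11:]:  # the first 11 columns are mandatory
        tag, value_type, value = field.split(":", 2)
        if value_type == "i":
            tags[tag] = int(value)
        elif value_type == "f":
            tags[tag] = float(value)
        else:
            # A, Z, H, and B values are kept as text in this sketch.
            tags[tag] = value
    return tags


# Invented record: the tag values and the Z type for BC and XU are assumptions.
example = "\t".join([
    "read1", "0", "chr1", "100", "60", "10M", "*", "0", "0",
    "ACGTACGTAC", "FFFFFFFFFF",
    "RG:Z:Sample1", "BC:Z:S1", "SM:i:60", "NM:i:1", "XU:Z:ACGTAC",
])
tags = parse_optional_tags(example)
print(tags["RG"], tags["NM"])
```

Splitting each field on at most two colons keeps string values that themselves contain colons intact.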

BAM index files (*.bam.bai) provide an index of the corresponding BAM file.