BAM File Format

A BAM file (*.bam) is the compressed binary version of a SAM file that is used to represent aligned sequences up to 128 Mb. SAM and BAM formats are described in detail at samtools.github.io/hts-specs/SAMv1.pdf.

BAM files use the file naming format SampleName_S#.bam, where # is the sample number determined by the order in which samples are listed for the run. In multinode mode, S# is always S1, regardless of sample order.
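The naming convention can be checked programmatically. The following is a minimal sketch, assuming the sample name can itself contain underscores and that the sample number is a plain integer. The function name parse_bam_name and the example file names are hypothetical and are not part of the software.

```python
import re

# Pattern for the SampleName_S#.bam convention described above.
# The sample name may contain underscores, so it is matched greedily
# and the _S<number>.bam suffix is anchored at the end of the name.
BAM_NAME = re.compile(r"^(?P<sample>.+)_S(?P<number>\d+)\.bam$")


def parse_bam_name(filename):
    """Return (sample_name, sample_number), or None if the name does not match."""
    match = BAM_NAME.match(filename)
    if match is None:
        return None
    return match.group("sample"), int(match.group("number"))


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(parse_bam_name("NA12878_S3.bam"))
    print(parse_bam_name("Tumor_A_S1.bam"))
    print(parse_bam_name("notes.txt"))
```

Because the pattern is anchored at the .bam suffix, index files such as SampleName_S1.bam.bai do not match and return None.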

NOTE

Whole Genome Sequencing v5.0 is multinode only and uses the file naming format SampleName_S1.bam.

BAM files contain a header section and an alignment section:

Header: Contains information about the entire file, such as sample name, sample length, and alignment method. Alignments in the alignments section are associated with specific information in the header section.
Alignments: Contains read name, read sequence, read quality, alignment information, and custom tags. The alignment information includes the chromosome, start coordinate, alignment quality, and match descriptor string.
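The header section can be read without a dedicated BAM library. The following is a minimal sketch based on the binary layout in the SAM/BAM specification referenced above: the decompressed stream starts with the magic bytes BAM\1, followed by the header text and a list of reference sequence names and lengths. The function name read_bam_header is hypothetical. The sketch reads only the header and does not decode alignment records.

```python
import gzip
import struct


def read_bam_header(path):
    """Return (header_text, [(reference_name, reference_length), ...])."""
    # BAM files are BGZF-compressed. BGZF is a series of gzip members, so
    # the standard gzip module can decompress it as one sequential stream.
    with gzip.open(path, "rb") as handle:
        if handle.read(4) != b"BAM\x01":
            raise ValueError("not a BAM file: bad magic bytes")
        # Length of the plain-text SAM header, then the header text itself.
        (text_length,) = struct.unpack("<i", handle.read(4))
        text = handle.read(text_length).rstrip(b"\x00").decode("utf-8")
        # Number of reference sequences, then one entry per reference.
        (reference_count,) = struct.unpack("<i", handle.read(4))
        references = []
        for _ in range(reference_count):
            (name_length,) = struct.unpack("<i", handle.read(4))
            # Reference names are stored NUL-terminated.
            name = handle.read(name_length).rstrip(b"\x00").decode("ascii")
            (length,) = struct.unpack("<i", handle.read(4))
            references.append((name, length))
    return text, references
```

Calling read_bam_header on a file such as SampleName_S1.bam returns the header text, which contains lines such as @HD, @SQ, and @RG, together with the reference names and lengths that the alignment records refer to.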

The alignments section includes the following information for each read or read pair:

RG: Read group, which identifies the read group that the read belongs to. Read groups are defined in the header section.
BC: Barcode tag, which indicates the demultiplexed sample ID associated with the read.
SM: Single-end alignment quality.
AS: Paired-end alignment quality.
NM: Edit distance tag, which records the Levenshtein distance between the read and the reference.
XN: Amplicon name tag, which records the amplicon tile ID associated with the read.
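When a BAM file is converted to SAM text (for example, with samtools view), these tags appear as TAG:TYPE:VALUE fields after the 11 mandatory columns of each alignment line. The following is a minimal sketch of collecting them into a dictionary. The function name parse_tags and the example record are made up for illustration, and the string type shown for XN is an assumption.

```python
def parse_tags(sam_line):
    """Return {tag: value} for the optional fields of one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # The first 11 columns are mandatory; optional TAG:TYPE:VALUE fields follow.
    for field in fields[11:]:
        # Split at most twice so that values containing ":" stay intact.
        tag, value_type, value = field.split(":", 2)
        if value_type == "i":
            value = int(value)
        elif value_type == "f":
            value = float(value)
        tags[tag] = value
    return tags


if __name__ == "__main__":
    # Made-up record for illustration; the values are not from a real run.
    # The Z (string) type for XN is an assumption.
    line = "\t".join([
        "read1", "0", "chr1", "100", "60", "10M", "*", "0", "0",
        "ACGTACGTAC", "IIIIIIIIII",
        "RG:Z:Sample1", "BC:Z:ACGT", "SM:i:60", "NM:i:1", "XN:Z:amplicon_7",
    ])
    tags = parse_tags(line)
    print(tags["RG"], tags["NM"], tags["XN"])
```

Integer and float tags are converted to numbers. All other types, including the string type Z, are kept as text.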

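BAM index files (*.bam.bai) provide an index of the corresponding BAM file, which allows tools to retrieve the alignments for a genomic region without reading the entire file.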

For Research Use Only. Not for use in diagnostic procedures. 

©2016 Illumina, Inc. All rights reserved.