Troubleshooting demultiplexing issues using BaseSpace Sequence Hub and bcl2fastq2 v2.17+


The information in this bulletin applies to run data demultiplexed on BaseSpace Sequence Hub, and to run data demultiplexed using the bcl2fastq2 conversion software version 2.17 and later.

After demultiplexing, BaseSpace Sequence Hub and the bcl2fastq2 v2.17+ conversion software output a demultiplexing summary file called DemuxSummaryF1L#.txt. The L# stands for the lane number on the flow cell, and one summary file is output for each lane. The DemuxSummaryF1L#.txt files can be used to troubleshoot demultiplexing issues.

Where are the DemuxSummaryF1L#.txt files?

    bcl2fastq2 v2.17+
    After FASTQ file generation completes, the DemuxSummaryF1L#.txt files are located in the Stats folder, which is located in the specified output directory.
    BaseSpace Sequence Hub
    For all runs, the DemuxSummaryF1L#.txt files are located in the project after FASTQ file generation completes. To find the files in your BaseSpace Sequence Hub project, click the FASTQ Generation link in the Analyses list. On the Analysis Info page, click the Log Files link (Figure 1). The DemuxSummaryF1L#.txt files appear in the list of log files.

Figure 1: The View Files link on the Summary page in a BaseSpace Sequence Hub project.
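
For bcl2fastq2 output, the per-lane summary files can be collected with a short script. The following is a minimal Python sketch, assuming only the layout described above (a Stats folder inside the specified output directory, one DemuxSummaryF1L#.txt file per lane). The function name find_demux_summaries is hypothetical and is not part of bcl2fastq2.

```python
import re
from pathlib import Path


def find_demux_summaries(output_dir):
    """Return {lane_number: Path} for DemuxSummaryF1L#.txt files in <output_dir>/Stats.

    Hypothetical helper: assumes the bcl2fastq2 output layout described in
    this bulletin, with the lane number in place of '#'.
    """
    stats_dir = Path(output_dir) / "Stats"
    name_pattern = re.compile(r"^DemuxSummaryF1L(\d+)\.txt$")
    found = {}
    for path in stats_dir.glob("DemuxSummaryF1L*.txt"):
        match = name_pattern.match(path.name)
        if match:
            found[int(match.group(1))] = path
    # Sort by lane number so callers can iterate lanes in order.
    return dict(sorted(found.items()))
```

Calling find_demux_summaries with the path of the bcl2fastq2 output directory returns the lanes in ascending order, so each lane can be inspected in turn.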

What information is in the DemuxSummaryF1L#.txt files?

    The DemuxSummaryF1L#.txt file has two sections. The top section (Figure 2) is a tab-delimited table that summarizes the demultiplexing results on a per-tile basis. The tiles on the flow cell lane are listed down the left side of the table. The samples are listed along the top, in the order in which they were entered into the sample sheet. Sample 0 is always reserved for the undetermined reads. The table shows the percentage of reads demultiplexed to each sample, per tile. In general, for a given sample, the percentage of demultiplexed reads should be similar across all tiles, so the tile summary can be used to identify tile-specific demultiplexing issues.

Figure 2: The tile summary section shows the percentage of reads demultiplexed to each sample, per tile.
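
The per-tile comparison described above can be automated. The following is a minimal Python sketch, not an Illumina tool. It assumes that the caller passes only the text of the tile summary section, that header lines start with "#" or are otherwise non-numeric, and that each data row is a tile identifier followed by tab-separated percentages with Sample 0 (undetermined) first. The function names and the default tolerance of 2 percentage points are hypothetical choices for illustration.

```python
from statistics import median


def parse_tile_summary(section_text):
    """Parse rows of 'tile<TAB>pct0<TAB>pct1...' into {tile: [percentages]}.

    Assumption: section_text holds only the tile summary section.
    """
    table = {}
    for line in section_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and assumed '#' header lines
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        try:
            table[fields[0]] = [float(value) for value in fields[1:]]
        except ValueError:
            continue  # skip any row that is not purely numeric
    return table


def flag_outlier_tiles(table, tolerance=2.0):
    """Return (tile, sample_index, pct, median_pct) for each value that differs
    from that sample's median across tiles by more than `tolerance` points."""
    if not table:
        return []
    n_samples = min(len(row) for row in table.values())
    flagged = []
    for sample in range(n_samples):
        sample_median = median(row[sample] for row in table.values())
        for tile, row in table.items():
            if abs(row[sample] - sample_median) > tolerance:
                flagged.append((tile, sample, row[sample], sample_median))
    return flagged
```

The median is used as the reference instead of the mean so that a single bad tile does not shift the baseline it is compared against. A tile flagged for Sample 0 has an unusually high or low share of undetermined reads.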

The second section of the DemuxSummaryF1L#.txt file (Figure 3) lists the 1,000 most common undetermined index sequences and the number of times each index sequence was found, that is, the number of clusters assigned to that index sequence.


Figure 3: The most popular index sequences section of the DemuxSummaryF1L#.txt file lists the 1,000 most common index sequences associated with the undetermined reads.

When troubleshooting demultiplexing issues, this list can be used to compare the expected index sequences (those in the sample sheet) to those that were found. Common causes of poor demultiplexing that this list can reveal include:

  • Index sequences entered in the wrong orientation in the sample sheet.
  • Incorrect index sequences entered in the sample sheet (eg, Nextera vs TruSeq UD or index A001 vs index A006).
  • Sample mix-ups between lanes.
  • Poor Index Read sequencing quality.
    • Ns in the sequences represent positions where the base calling software was unable to make a base call.
    • For sequencing systems using Illumina one-channel (iSeq) or two-channel (MiniSeq, NextSeq 500/550, and NovaSeq) SBS chemistry, poly-G sequences indicate that no index sequence was read. Poly-G sequences are typical for PhiX reads, which are not indexed.
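
Several of the checks listed above can be applied to each undetermined index sequence programmatically. The following is a minimal Python sketch under stated assumptions: each observed index is handled as a single index read (for dual-index runs, each index read would be checked separately), and the expected indexes are supplied by the caller as a mapping of sample name to sequence taken from the sample sheet. The function names and category labels are hypothetical and do not come from bcl2fastq2 or BaseSpace Sequence Hub.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (N stays N)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def classify_index(observed, expected):
    """Classify one undetermined index read.

    observed: index sequence from the undetermined list.
    expected: {sample_name: index_sequence} from the sample sheet (assumed input).
    Returns (category, sample_name_or_None).
    """
    seq = observed.upper()
    if seq and set(seq) == {"G"}:
        # No index was read; typical of unindexed PhiX on one- and two-channel systems.
        return ("poly_g", None)
    if "N" in seq:
        # At least one position had no base call: poor Index Read quality.
        return ("contains_n", None)
    for name, index in expected.items():
        if seq == reverse_complement(index):
            # Index likely entered in the wrong orientation in the sample sheet.
            return ("reverse_complement", name)
    for name, index in expected.items():
        if seq == index.upper():
            # Index is in the supplied list; check the lane assignment.
            return ("exact_match", name)
    return ("unknown", None)
```

For example, with expected = {"S1": "ATTACTCG"}, the observed sequence CGAGTAAT is classified as ("reverse_complement", "S1"), which points to an orientation error in the sample sheet. Passing the indexes from all lanes as expected lets an exact match hint at a sample mix-up between lanes.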