NEW! Upgrading from bcl2fastq to BCL Convert

10/19/21


The Illumina BCL Convert software is a standalone local Linux application that converts the binary base call (BCL) files produced by Illumina sequencing systems to FASTQ files. Based on software derived from the Illumina DRAGEN™ Bio-IT platform, BCL Convert handles large data sets faster and more efficiently than the older bcl2fastq software. While both bcl2fastq and BCL Convert are currently supported, BCL Convert is planned to replace bcl2fastq in the future. The BCL Convert compatibility support page provides a broad comparison between the two programs. This bulletin provides a more detailed comparison of usage and feature changes between the latest release of bcl2fastq (v2.20) and BCL Convert.

Feature | bcl2fastq 2.20 | BCL Convert | Changes from bcl2fastq
File name <Sample_Name>_S#_L00#_<R or I>#_00#.fastq.gz
OR
<Sample_ID>_S#_L00#_R#_001.fastq.gz (if Sample_Name is not present)
<Sample_ID>_S#_L00#_<R# or I#>_001.fastq.gz The Sample ID is always used in the output file name. Sample name is ignored in BCL Convert.
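The BCL Convert naming convention in the row above can be sketched in Python. This is only an illustration of the pattern <Sample_ID>_S#_L00#_<R# or I#>_001.fastq.gz; the function name and its parameters are hypothetical and are not part of BCL Convert.

```python
def bcl_convert_fastq_name(sample_id: str, sample_number: int, lane: int, read: str) -> str:
    """Build a file name of the form <Sample_ID>_S#_L00#_<R# or I#>_001.fastq.gz."""
    if not sample_id:
        raise ValueError("Sample_ID is required; Sample_Name is not used for naming")
    # The read label is R# for genomic reads or I# for index reads.
    if len(read) < 2 or read[0] not in ("R", "I") or not read[1:].isdigit():
        raise ValueError(f"read must look like R1, R2, I1 or I2, got {read!r}")
    # The lane number is zero-padded to three digits (L001, L002, ...).
    return f"{sample_id}_S{sample_number}_L{lane:03d}_{read}_001.fastq.gz"


print(bcl_convert_fastq_name("NA12878", 1, 2, "R1"))
```

A helper like this can be useful when a downstream pipeline that previously looked for files named after Sample_Name needs to locate files named after Sample_ID instead.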
FASTQ Header @Instrument:RunID:FlowCellID:Lane:Tile:X:Y[:UMI]
ReadNum:FilterFlag:0:IndexSequence
or SampleNumber
@Instrument:RunID:FlowCellID:Lane:Tile:X:Y[:UMI]
ReadNum:N:0:IndexSequence
or SampleNumber
The filter flag is always set to “N” and the control bit is always set to “0”. Missing instrument is not supported.
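As a rough illustration of the header layout in the row above, the following Python sketch splits a header into its fields. It assumes the two halves of the header are separated by a single space, as in standard Illumina FASTQ files; the class and function names, and the example header, are made up for this example.

```python
from typing import NamedTuple, Optional


class FastqHeader(NamedTuple):
    instrument: str
    run_id: str
    flowcell_id: str
    lane: int
    tile: int
    x: int
    y: int
    umi: Optional[str]
    read_num: int
    filter_flag: str
    control_bit: str
    index_or_sample_number: str


def parse_fastq_header(line: str) -> FastqHeader:
    """Parse @Instrument:RunID:FlowCellID:Lane:Tile:X:Y[:UMI] ReadNum:N:0:IndexSequence."""
    if not line.startswith("@"):
        raise ValueError("a FASTQ header line must start with '@'")
    # Assumption: one space separates the read identifier from the read description.
    left, right = line[1:].rstrip("\n").split(" ", 1)
    fields = left.split(":")
    if len(fields) not in (7, 8):
        raise ValueError(f"expected 7 or 8 colon-separated fields, got {len(fields)}")
    umi = fields[7] if len(fields) == 8 else None  # the UMI field is optional
    read_num, filter_flag, control_bit, index = right.split(":", 3)
    return FastqHeader(
        fields[0], fields[1], fields[2],
        int(fields[3]), int(fields[4]), int(fields[5]), int(fields[6]),
        umi, int(read_num), filter_flag, control_bit, index,
    )


header = parse_fastq_header("@A00123:45:HXXXXXXXX:1:1101:1000:2000 1:N:0:ACGTACGT+TGCATGCA")
print(header.filter_flag, header.control_bit)
```

With BCL Convert output, the filter flag and control bit fields should always come back as "N" and "0", so a check on these fields can serve as a quick sanity test.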
Determine Expected Input Files Expects Config.xml in defined input folder:
<input folder>/Data/Intensities/BaseCalls
If no Config.xml, expects RunInfo.xml in defined input folder.
Expects Config.xml in defined input folder:
<input folder>/Data/Intensities/BaseCalls
If no Config.xml, expects RunInfo.xml in defined input folder.
No change.
Show command line options and help information --help or -h
--version or -v
--help or -h
--version or -V
The short version option requires an uppercase “V” on the command line.
Run Folder Command Line:
--runfolder-dir or -R
Command Line:
--bcl-input-directory
Input specified is the same (top level of run folder is the input), only the command line option has changed.
Input Folder Command Line:
--input-dir or -i
None The path to the BaseCalls folder cannot be specified directly.
Output Folder Command Line:
--output-dir or -o
(default input folder)
Command Line:
--output-directory
(required, default cannot exist)
 
Command Line:
--force, -f
(allows output to be written to existing folder)
Output directory is required. Specify a directory that does not exist yet, or use --force to write to an existing folder.
Sample Sheet Format V1 format only. V1 and V2 formats both accepted. See software guide for examples of V1 and V2 formatting changes.
Sample Sheet Path Command Line:
--sample-sheet
(default input folder, sample sheet not required)
Command Line:
--sample-sheet
(default input folder, sample sheet required)
Sample Sheet is now required.
 
By default, the software searches for SampleSheet.csv in the input run folder. The --sample-sheet option is used to specify the path to the file if it is not in the default location.
 
Note: At least one Sample_ID is required in the Data section of the Sample Sheet.
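The run folder, output folder, and sample sheet options in the rows above can be combined into one invocation. The Python sketch below only assembles an argument list from those options and does not run anything. The executable name `bcl-convert` is an assumption about how the tool is installed, and the function name and example paths are hypothetical.

```python
from typing import List, Optional


def build_bcl_convert_args(
    run_folder: str,
    output_dir: str,
    sample_sheet: Optional[str] = None,
    force: bool = False,
    executable: str = "bcl-convert",  # assumption: name of the installed executable
) -> List[str]:
    """Assemble a BCL Convert argument list from the options listed in the table."""
    args = [
        executable,
        "--bcl-input-directory", run_folder,  # top level of the run folder
        "--output-directory", output_dir,     # required
    ]
    if sample_sheet is not None:
        # Only needed when SampleSheet.csv is not in the run folder.
        args += ["--sample-sheet", sample_sheet]
    if force:
        # Allows output to be written to an existing folder.
        args.append("--force")
    return args


print(" ".join(build_bcl_convert_args("/data/run1", "/data/fastq", force=True)))
```

The resulting list could be passed to `subprocess.run(args, check=True)` on a system where the tool is installed.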
Ignore missing base call (BCL) files Command Line:
--ignore-missing-bcls (default off)
Command Line:
--strict-mode false
(default false)
Missing or corrupt BCL is ignored and the corresponding base call is replaced with an N with a quality score of 2 (#).
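The "#" in the row above follows from the standard Phred+33 encoding used in FASTQ files, where a quality score Q is written as the ASCII character with code Q + 33. A small Python check of that arithmetic (the helper names are illustrative only):

```python
PHRED_OFFSET = 33  # standard Phred+33 FASTQ encoding


def quality_char(q: int) -> str:
    """Return the FASTQ quality character for Phred score q."""
    return chr(q + PHRED_OFFSET)


def placeholder_call() -> tuple:
    """Base and quality character written for a missing or corrupt BCL."""
    return "N", quality_char(2)


print(placeholder_call())  # ('N', '#')
```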
Ignore missing or corrupt Filter files Command Line:
--ignore-missing-filter
(default off)
 
Assumes PF for all tiles with missing filter files.
Command Line:
--strict-mode false
(default false)
 
Note the behavior change: no FASTQ entries are produced for reads in tiles with missing filter files.
BCL Convert does not produce FASTQ entries for any reads where the filter file is missing.
Ignore missing or corrupt Position files Command Line:
--ignore-missing-positions
(default off)
Command Line:
--strict-mode false
(default false)
FASTQ file header will contain automatically generated unique XY positions when position files are missing.
Assume that failed reads are PF Command Line:
--with-failed-reads
(default off)
None No longer supported.
Ignore beginning or end of read Sample Sheet:
Read1StartFromCycle # (default 1)
Read2StartFromCycle # (default 1)
 
Read1EndWithCycle # (default last cycle)
Read2EndWithCycle # (default last cycle)
 
Command Line:
--use-bases-mask Y#;N#
(default all cycles used)
Sample Sheet:
OverrideCycles,Y#;N#
(default all cycles used)
OverrideCycles can only be applied to the entire analysis; there is no per-lane option for OverrideCycles. Cycles cannot be ignored from the middle of a read.
Use a subset of bases for i7/i5 indexes Sample Sheet:
Use subset of index cycles for demultiplexing by providing shortened sequence in index or index2 column within a lane.
 
Command Line:
--use-bases-mask I#N# (default use all index cycles defined in RunInfo.xml).
Sample Sheet:
Use subset of index cycles for demultiplexing by providing shortened sequence in index or index2 column and providing desired length in OverrideCycles setting
(default use all index cycles defined in RunInfo.xml)
The number of cycles defined per read in OverrideCycles must always match the number of cycles in the corresponding read of the RunInfo.xml.
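The requirement above, that each read in OverrideCycles must account for exactly the cycles listed in RunInfo.xml, can be checked before starting a conversion. The Python sketch below assumes a simplified grammar in which each read is a run of elements such as Y#, I#, U#, or N#, separated by semicolons. The function names are hypothetical, and the cycle counts are passed in as a plain list rather than parsed from RunInfo.xml.

```python
import re
from typing import List, Sequence

# One mask element: a letter (Y, I, U or N) followed by a cycle count.
_ELEMENT = re.compile(r"([YINU])(\d+)")


def cycles_per_read(override_cycles: str) -> List[int]:
    """Return the total number of cycles described for each read."""
    totals = []
    for segment in override_cycles.split(";"):
        elements = _ELEMENT.findall(segment)
        # Reject segments that contain anything other than valid elements.
        if not elements or "".join(k + n for k, n in elements) != segment:
            raise ValueError(f"cannot parse OverrideCycles segment {segment!r}")
        totals.append(sum(int(n) for _, n in elements))
    return totals


def matches_run_info(override_cycles: str, run_info_cycles: Sequence[int]) -> bool:
    """True when every read covers exactly the cycles listed for it in RunInfo.xml."""
    return cycles_per_read(override_cycles) == list(run_info_cycles)


print(cycles_per_read("U8Y143;I8;I6N2;Y151"))                      # [151, 8, 8, 151]
print(matches_run_info("Y151;I6;I8;Y151", [151, 8, 8, 151]))        # False
```

In the second call the first index read is described with 6 cycles while the run has 8, so the mask would need to be written as I6N2 to account for the 2 unused cycles.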
Wildcard Index Sequences None None Wildcard entries (N) for indexes are not supported.
Index FASTQs Sample Sheet:
CreateFastqForIndexReads
0 or 1
(default 0)
Command Line:
--create-fastq-for-index-reads
Sample Sheet:
CreateFastqForIndexReads, 0 or 1
(default 0)
Generating FASTQs for index reads is off by default; add the sample sheet setting with a value of 1 to enable it. When an index read is specified as a UMI with OverrideCycles, the UMI read will be output to a FASTQ file. This feature was introduced in BCL Convert version 3.7.5.
FASTQ Compression Specification Command Line:
--no-bgzf-compression
--fastq-compression-level
None FASTQ files are always gzipped at a compression level of “–1”. Multiple gzip compression regions are appended to the same file with large block sizes. Some tools could have trouble fully decompressing these files if they do not continue past the first gzip region.
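The decompression caveat above can be reproduced with a small Python sketch. It builds a file-like byte string from two concatenated gzip regions (called members in the gzip format), then compares a reader that continues past the first region with one that stops there. The function names and the tiny FASTQ records are made up for illustration; this does not reproduce BCL Convert's actual block sizes.

```python
import gzip
import zlib
from typing import Iterable


def make_multi_member(chunks: Iterable[bytes]) -> bytes:
    """Concatenate independently compressed gzip members into one stream."""
    return b"".join(gzip.compress(chunk) for chunk in chunks)


def read_all_members(data: bytes) -> bytes:
    """gzip.decompress continues through every concatenated member."""
    return gzip.decompress(data)


def read_first_member_only(data: bytes) -> bytes:
    """A single zlib decompressor stops at the end of the first member."""
    decompressor = zlib.decompressobj(wbits=31)  # 31 = gzip container
    # Anything after the first member is left in decompressor.unused_data.
    return decompressor.decompress(data)


first = b"@r1\nACGT\n+\nIIII\n"
second = b"@r2\nTTTT\n+\n####\n"
data = make_multi_member([first, second])

print(len(read_all_members(data)), len(read_first_member_only(data)))
```

A tool that behaves like the second reader would silently return only part of the reads, so comparing read counts before and after decompression is a reasonable check when introducing a new downstream tool.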
Barcode Mismatches Command Line:
--barcode-mismatches # or #,#
(default 1 == 1,1)
Sample Sheet:
BarcodeMismatchesIndex1,#
(default 1)
BarcodeMismatchesIndex2,#
(default 1)
 
Note: Command Line no longer supported.
Index # must be specified separately.
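To illustrate what separate per-index mismatch limits mean, the Python sketch below assigns a read to a sample only when each index is within its own mismatch limit. This is a simplified model using Hamming distance, not BCL Convert's actual demultiplexing algorithm, and all names and sequences are hypothetical.

```python
from typing import Dict, Optional, Tuple


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(
    read_i7: str,
    read_i5: str,
    samples: Dict[str, Tuple[str, str]],
    max_mismatches_index1: int = 1,  # mirrors BarcodeMismatchesIndex1
    max_mismatches_index2: int = 1,  # mirrors BarcodeMismatchesIndex2
) -> Optional[str]:
    """Return the single matching Sample_ID, or None if zero or several match."""
    hits = [
        sample_id
        for sample_id, (i7, i5) in samples.items()
        if hamming(read_i7, i7) <= max_mismatches_index1
        and hamming(read_i5, i5) <= max_mismatches_index2
    ]
    return hits[0] if len(hits) == 1 else None


samples = {"S1": ("ACGTACGT", "TTGGCCAA"), "S2": ("GGGGCCCC", "AAAATTTT")}
print(assign_sample("ACGTACGA", "TTGGCCAA", samples))  # S1 (one mismatch in index 1)
print(assign_sample("ACGTACGA", "TTGGCCAA", samples, max_mismatches_index1=0))  # None
```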
Adapter Read 1, 2 Trimming Sample Sheet:
Adapter/TrimAdapter,
A/T/C/G
AdapterRead2/TrimAdapterRead2,A/T/C/G
Sample Sheet:
AdapterRead1,A/T/C/G
AdapterRead2,A/T/C/G
AdapterBehavior,trim

(default trim)
Read 1 and Read 2 adapters must be specified separately.
Adapter Read 1, 2 Masking Sample Sheet:
MaskAdapter A/T/C/G
MaskAdapterRead2 A/T/C/G
Sample Sheet:
AdapterRead1,A/T/C/G
AdapterRead2,A/T/C/G
AdapterBehavior,mask

(default trim)
Read 1 and Read 2 adapters must be specified separately.
Adapter Stringency Command Line:
--adapter-stringency #
(default 0.9, 0.0-1.0 allowed)
Sample Sheet:
AdapterStringency,#
(default 0.9, 0.5-1.0 allowed)
 
Note: Command Line no longer supported.
The range is now 0.5-1.0 vs 0.0-1.0.
Adapter Matching Algorithm Sample Sheet in [Settings] section:
FindAdaptersWithIndels 1
(default on, 0 = Sliding Window)
None (always Sliding Window) Finding adapter with indels is no longer supported.
Trimming last bases when they match the adapter Always trims or masks the final X bases when they overlap with the adapter provided according to stringency settings. Sample Sheet:
MinimumAdapterOverlap,#
(default 1, 1-3 allowed)
 
Never trims or masks less than X bases when they overlap with the adapter provided regardless of stringency settings, where X is the MinimumAdapterOverlap provided.
Default behavior is identical.
Minimum Read Length Command Line:
--minimum-trimmed-read-length #
(default 35)
Sample Sheet:
MinimumTrimmedReadLength,#
(default 35)
 
Note: Command Line no longer supported.
Part of sample sheet.
Minimum Number of ATCG Bases per Read Command Line:
--mask-short-adapter-reads #
(default 22)
Sample Sheet:
MaskShortReads,#
(default 22)
 
Note: Command Line no longer supported.
Part of sample sheet.
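Because several options in the rows above moved from the command line to the sample sheet, it can help to generate the settings lines programmatically and check the documented ranges first. The Python sketch below uses the setting names, defaults, and allowed ranges from the table. The function name is hypothetical, and where these lines are placed in the sample sheet depends on the sample sheet version, which is not covered here.

```python
from typing import List


def bcl_convert_settings_lines(
    adapter_stringency: float = 0.9,
    minimum_adapter_overlap: int = 1,
    minimum_trimmed_read_length: int = 35,
    mask_short_reads: int = 22,
    barcode_mismatches_index1: int = 1,
    barcode_mismatches_index2: int = 1,
) -> List[str]:
    """Return 'Key,Value' lines for settings that are no longer command line options."""
    # Ranges taken from the comparison table.
    if not 0.5 <= adapter_stringency <= 1.0:
        raise ValueError("AdapterStringency must be between 0.5 and 1.0")
    if minimum_adapter_overlap not in (1, 2, 3):
        raise ValueError("MinimumAdapterOverlap must be 1, 2 or 3")
    settings = {
        "AdapterStringency": adapter_stringency,
        "MinimumAdapterOverlap": minimum_adapter_overlap,
        "MinimumTrimmedReadLength": minimum_trimmed_read_length,
        "MaskShortReads": mask_short_reads,
        "BarcodeMismatchesIndex1": barcode_mismatches_index1,
        "BarcodeMismatchesIndex2": barcode_mismatches_index2,
    }
    return [f"{key},{value}" for key, value in settings.items()]


for line in bcl_convert_settings_lines():
    print(line)
```

A bcl2fastq command that used --adapter-stringency with a value below 0.5 would fail this check, which flags the narrower range before a run is started.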
UMI Settings Sample Sheet:
TrimUMI 0 or 1 (default 0)
Read1UMIStartFromCycle # (default 1)
Read2UMIStartFromCycle # (default 1)
Read1UMILength #
Read2UMILength #
Sample Sheet:
OverrideCycles,U#
TrimUMI,0 or 1 (default 1)
UMIs can now be defined in index or genomic reads.
Default is to trim UMIs.
The TrimUMI option was introduced in BCL Convert version 3.7.5.
Use Subset of Tiles for Processing Command Line:
--tiles (provide list of tiles to include)
Command Line:
--tiles (provide list of tiles to include)
--tiles introduced in BCL Convert version 3.9
Exclude Tiles from processing Sample Sheet:
ExcludeTiles #: Provide list or range of tiles to exclude from processing (default no tiles)
 
ExcludeTilesLaneX: Provide list or range of tiles in lane X to exclude from processing (default no tiles)
Command Line:
--first-tile-only true (default false)
Or
--exclude-tiles (provide list of tiles to exclude)
For testing purposes, BCL Convert can process only the first tile with --first-tile-only. This option is not compatible with NovaSeq SP flow cells.
--exclude-tiles introduced in BCL Convert version 3.9
Logging Console output Console output
Warnings/Errors/Information log files in <output_directory>/Logs.
FastqComplete.txt into <output_directory>/Logs after all FASTQs are created.
New support for logging files. Less verbose logging.
 
New output file: FastqComplete.txt is generated in the Logs folder.
Association of Samples and output FASTQ files None Goes into fastq_list.csv in <output_directory>/Reports New report of samples and output FASTQ file association is now generated.
Combine multiple FASTQ files Command Line:
--no-lane-splitting
(default off)
Command Line:
--no-lane-splitting
(default off)
Sample sheet:
NoLaneSplitting,true or false
(default false)
Concatenation of FASTQ files separated by lane can be done by enabling this setting. FASTQs will be output with the naming convention
<Sample_ID>_S#_<R or I>#_001.fastq.gz (no L00# included).
Reports will be generated with values separated by lane. Command line option introduced in BCL Convert version 3.7.5. Sample sheet setting introduced in BCL Convert 3.8. Command line and sample sheet settings must be consistent.
Reverse Complement all reads Sample Sheet:
ReverseComplement 1
(default 0)
None Impacts Nextera Mate Pair kits, which are not supported by BCL Convert.
Sample Project Sample Sheet:
Creates directory with sample project name.
Can use multiple samples in same project.
Cannot use “all” or “default” as project name.
Command Line:
--bcl-sampleproject-subdirectories true
(default false)
 
Sample Sheet:
Sample_Project column in Data section.
By default, all FASTQ files will be placed into the same output directory regardless of Sample Sheet columns. Command Line must be set in order to generate subdirectories.
Sample Name Sample Sheet:
Used for FASTQ name.
Cannot use “all” or “undetermined” as name.
Software ignores Sample Name in V1 sample sheet, rejects Sample_Name in V2 sample sheet.  
IndexMetricsOut.bin Output Location Command Line:
--interop-dir
(default <runfolder-dir>/InterOp)
Always output to <output_directory>/Reports. User cannot configure IndexMetricsOut.bin output location.
Number of perfect barcodes, 1 mismatch barcodes Provided in DemultiplexingStats.xml and HTML report. Provided in Demultiplex_Stats.csv (<output_directory>/Reports). HTML reports with demultiplexing reports are not produced.
Unknown Barcodes Provided in AdapterTrimming.txt, DemultiplexingStats.xml, DemuxSummaryF#L#.txt, HTML report. Reported in Top_Unknown_Barcodes.csv
(top 100 per lane).
AdapterTrimming.txt is not generated.
Adapter Trimming Metrics Provided in AdapterTrimming.txt. Provided in Adapter_Metrics.csv.  
Lane-Specific Processing Define only the desired lanes in the sample sheet.
  1. Define only the desired lanes in the sample sheet.
  2. Command Line:
    --bcl-only-lane #
    (default all lanes in sample sheet).
Sample sheet change and command line option are both required.
Processing Options Command Line:
--loading-threads
--processing-threads
--writing-threads
Command Line:
--bcl-num-decompression-threads
--bcl-conversion-threads
--bcl-num-compression-threads
--bcl-num-parallel-tiles
Defaults are set dynamically.
This option was introduced in BCL Convert version 3.7.5.