How to convert a custom BED file to a manifest file for enrichment analysis

02/19/20


Available Illumina enrichment analysis workflows, including those in BaseSpace Sequence Hub, Local Run Manager, and MiSeq Reporter, can use either an Illumina fixed panel manifest or a custom manifest file to specify the targeted regions for variant calling. Manifests are provided for Illumina custom panels created through Illumina DesignStudio. Many non-Illumina vendors instead provide BED files to define the targeted regions, and these BED files must be converted into a manifest. As shown in the image below, the contents of a BED file (top) and a manifest (bottom) are similar, but the manifest file has a different format and some additional information. These examples are taken from the Nextera Rapid Capture Expanded Exome content set.

To convert a BED file into an enrichment manifest file, perform the following steps:

  1. Download the Nextera Rapid Capture Expanded Exome manifest to use as a template. Retain the [Header] section, including the path to the reference genome.
  2. For the [Regions] section, copy any target names from column 4 of the BED file to the Name column of the manifest file (column 1). See 'Important Considerations' below for more guidance.
  3. Copy the first, second, and third columns of the BED file to the second, third, and fourth columns of the manifest file (Chromosome, Start, and End, respectively).
  4. If the vendor provides upstream and downstream probe lengths, enter them in the next two columns of the manifest file (columns 5 and 6). If they are not provided, set both values to 0.
  5. Save the manifest as a tab-delimited text file (TSV), not as a CSV or Excel file.
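
The column mapping in steps 2 through 4 can be sketched in Python. This is a minimal illustration, not an Illumina tool: the function name bed_to_region_rows is hypothetical, the input is assumed to be tab-delimited, and lines beginning with "track", "browser", or "#" are assumed to be BED annotation lines that should be skipped. The sketch produces only the data rows, which are meant to be placed under the [Regions] column header kept from the template manifest.

```python
def bed_to_region_rows(bed_lines, upstream=0, downstream=0):
    """Turn tab-delimited BED lines into tab-delimited manifest region rows.

    Each returned row holds: Name, Chromosome, Start, End,
    upstream probe length, downstream probe length.
    """
    rows = []
    for line in bed_lines:
        line = line.rstrip("\r\n")
        # Assumption: blank lines and track/browser/comment lines carry no regions.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(f"BED line needs at least 3 columns: {line!r}")
        # BED columns 1-3 become manifest columns 2-4, copied unchanged.
        chrom, start, end = fields[0], fields[1], fields[2]
        # BED column 4, when present, becomes the manifest Name column.
        if len(fields) > 3 and fields[3].strip():
            name = fields[3].strip()
        else:
            # Hypothetical fallback name for BED files without a name column.
            name = f"Target.{len(rows) + 1:03d}"
        rows.append("\t".join([name, chrom, start, end,
                               str(upstream), str(downstream)]))
    return rows


if __name__ == "__main__":
    example = ["track name=example", "chr1\t100\t200\tGeneA", "chr2\t300\t400"]
    for row in bed_to_region_rows(example):
        print(row)
```

Writing the rows with "\n".join(rows) to a plain text file keeps the output tab-delimited, as step 5 requires. The probe lengths default to 0, matching the case where the vendor does not supply them.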

Important considerations

  • Target names must be unique. To make the names unique, append a numerical suffix to each one, such as Target.001, Target.002, Target.003.
  • Only letters, numbers, periods, dashes, and underscores are allowed in the target names. Spaces and special characters (including, but not limited to, : and ") are not allowed. Remove any lines containing track information enclosed in < > symbols.
  • Non-unique target names or unsupported characters cause the error message "[Probes] section is missing or invalid" when the manifest is imported into BaseSpace Sequence Hub. Although enrichment manifest files do not require a [Probes] section, this message indicates that the file format is invalid.
  • When a manifest file is imported into BaseSpace Sequence Hub, files larger than 30 MB may fail the validation step.
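
The naming rules above can be checked before import with a short Python sketch. The helper names clean_name and make_unique are hypothetical, and replacing disallowed characters with an underscore is only one possible choice; the article itself requires only that the final names be unique and limited to letters, numbers, periods, dashes, and underscores.

```python
import re
from collections import Counter


def clean_name(name):
    """Replace every character other than letters, digits, '.', '-', '_' with '_'."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


def make_unique(names):
    """Append .001, .002, ... to names that occur more than once."""
    counts = Counter(names)
    # Names that occur once are kept as-is and reserved up front.
    used = {n for n in names if counts[n] == 1}
    next_index = Counter()
    result = []
    for n in names:
        if counts[n] == 1:
            result.append(n)
            continue
        # Advance the suffix until the candidate clashes with no other name.
        while True:
            next_index[n] += 1
            candidate = f"{n}.{next_index[n]:03d}"
            if candidate not in used:
                break
        used.add(candidate)
        result.append(candidate)
    return result


if __name__ == "__main__":
    raw = ["BRCA1 exon:1", "TP53", "TP53"]
    # Clean first, because cleaning can itself create duplicate names.
    print(make_unique([clean_name(n) for n in raw]))
```

Cleaning is applied before the uniqueness pass so that two names that differ only in disallowed characters still end up distinct.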