SJ.out.tab
Along with the alignments emitted in the SAM/BAM file, an additional SJ.out.tab file summarizes the high-confidence splice junctions in tab-delimited format. The columns of this file are as follows:
1. contig name
2. first base of the splice junction (1-based)
3. last base of the splice junction (1-based)
4. strand (0: undefined, 1: +, 2: -)
5. intron motif: 0: noncanonical, 1: GT/AG, 2: CT/AC, 3: GC/AG, 4: CT/GC, 5: AT/AC, 6: GT/AT
6. 0: unannotated, 1: annotated (only if an input gene annotations file was used)
7. number of uniquely mapping reads spanning the splice junction
8. number of multimapping reads spanning the splice junction
9. maximum spliced alignment overhang
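The column layout above can be read with a few lines of Python. This is a minimal sketch, not part of DRAGEN: the names `SpliceJunction` and `parse_sj_line` are hypothetical, and it assumes that column 1 holds the contig name and that columns 2 through 9 are integers.

```python
from typing import NamedTuple

# Code tables taken from the column descriptions (columns 4 and 5).
STRANDS = {0: "undefined", 1: "+", 2: "-"}
INTRON_MOTIFS = {
    0: "noncanonical", 1: "GT/AG", 2: "CT/AC", 3: "GC/AG",
    4: "CT/GC", 5: "AT/AC", 6: "GT/AT",
}


class SpliceJunction(NamedTuple):
    """One row of SJ.out.tab (hypothetical helper type)."""
    contig: str        # column 1 (assumed to be the contig name)
    start: int         # column 2, first base of the junction, 1-based
    end: int           # column 3, last base of the junction, 1-based
    strand: int        # column 4
    motif: int         # column 5
    annotated: int     # column 6
    unique_reads: int  # column 7
    multi_reads: int   # column 8
    max_overhang: int  # column 9


def parse_sj_line(line: str) -> SpliceJunction:
    """Split one tab-delimited line into a SpliceJunction record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    return SpliceJunction(fields[0], *(int(f) for f in fields[1:9]))
```

For example, `parse_sj_line("chr1\t14830\t14969\t2\t2\t1\t5\t3\t38")` yields a record whose `strand` decodes to `-` and whose `motif` decodes to `CT/AC` through the lookup tables.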
The maximum spliced alignment overhang (column 9) field in the SJ.out.tab file is the anchoring alignment overhang. For example, if a read is spliced as ACGTACGT------------ACGT, then the overhang is 4. For each splice junction, the value reported is the maximum overhang across all reads that span the junction. The maximum overhang indicates how confidently the splice junction is supported by anchoring alignments.
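The overhang example can be worked through in code. The sketch below is illustrative only and does not reflect how DRAGEN computes the value internally: it assumes the overhang of a read is the length of the shorter aligned segment on either side of the junction (which matches the example, where the segments are 8 and 4 bases long), and the function names are hypothetical.

```python
import re


def read_overhang(gapped_read: str) -> int:
    """Overhang of one spliced read written as LEFT----RIGHT.

    Assumption: the overhang is the shorter of the two aligned
    segments flanking the splice gap.
    """
    parts = re.split(r"-+", gapped_read)
    if len(parts) != 2 or not all(parts):
        raise ValueError("expected exactly one gap with bases on both sides")
    return min(len(parts[0]), len(parts[1]))


def max_overhang(gapped_reads) -> int:
    """Maximum overhang across all reads spanning the same junction."""
    return max(read_overhang(r) for r in gapped_reads)


reads = [
    "ACGTACGT------------ACGT",  # segments of 8 and 4 bases
    "ACGTACGTAC------ACGTACG",   # segments of 10 and 7 bases
    "AC----ACGTACGT",            # segments of 2 and 8 bases
]
print(read_overhang(reads[0]))  # 4
print(max_overhang(reads))      # 7
```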
The DRAGEN host software generates two SJ.out.tab files: an unfiltered version and a filtered version. The records in the unfiltered file are a consolidation of all spliced alignment records from the output SAM/BAM. The junctions in the filtered version are much more likely to be correct because of the following filters.
A splice junction entry in the SJ.out.tab file is filtered out if any of these conditions are met:
• SJ has a noncanonical motif and is supported by < 3 unique mappings.
• SJ has a length > 50000 and is supported by < 2 unique mappings.
• SJ has a length > 100000 and is supported by < 3 unique mappings.
• SJ has a length > 200000 and is supported by < 4 unique mappings.
• SJ has a noncanonical motif and the maximum spliced alignment overhang is < 30.
• SJ has a canonical motif and the maximum spliced alignment overhang is < 12.
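The filter conditions listed above can be expressed as a single predicate. This is a sketch under stated assumptions rather than the DRAGEN implementation: the function name `is_filtered_out` is hypothetical, the junction length is assumed to be `end - start + 1` (both coordinates inclusive), and any nonzero motif code in column 5 is assumed to count as canonical.

```python
def is_filtered_out(start: int, end: int, motif: int,
                    unique_reads: int, max_overhang: int) -> bool:
    """Return True if a junction meets any of the listed filter conditions.

    Arguments correspond to columns 2, 3, 5, 7, and 9 of SJ.out.tab.
    """
    length = end - start + 1   # assumption: inclusive 1-based coordinates
    noncanonical = motif == 0  # assumption: nonzero motif codes are canonical
    return (
        (noncanonical and unique_reads < 3)
        or (length > 50000 and unique_reads < 2)
        or (length > 100000 and unique_reads < 3)
        or (length > 200000 and unique_reads < 4)
        or (noncanonical and max_overhang < 30)
        or (not noncanonical and max_overhang < 12)
    )


# A short canonical junction with good support and overhang is kept.
print(is_filtered_out(1000, 2000, motif=1, unique_reads=5, max_overhang=20))  # False
# The same junction with a noncanonical motif fails the overhang < 30 rule.
print(is_filtered_out(1000, 2000, motif=0, unique_reads=5, max_overhang=20))  # True
```

Because a junction is removed when any one condition holds, the conditions are combined with `or`. The same structure can serve as a starting point for custom thresholds applied to the unfiltered file.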
The filtered SJ.out.tab is recommended for use with any downstream analysis or post-processing tools. Alternatively, you can use the unfiltered SJ.out.tab and apply your own filters (for example, with basic awk commands).
Note that the filter does not apply to the alignments present in the BAM or SAM file.