File Format
Each entry in a FASTQ file consists of four lines:
• | Sequence identifier |
• | Sequence |
• | Quality score identifier line (consisting only of a +) |
• | Quality score |
The first line, identifying the sequence, contains the following elements.
@<instrument>:<run number>:<flowcell ID>:<lane>:<tile>:<x-pos>:<y-pos>:<UMI> <read>:<is filtered>:<control number>:<index>
Element |
Requirements |
Description |
@ |
@ |
Each sequence identifier line starts with @. |
<instrument> |
Characters allowed: a–z, A–Z, 0–9 and underscore |
Instrument ID. |
<run number> |
Numerical |
Run number on instrument. |
<flowcell ID> |
Characters allowed: a–z, A–Z, 0–9 |
Flow cell ID |
<lane> |
Numerical |
Lane number. |
<tile> |
Numerical |
Tile number. |
<x_pos> |
Numerical |
X coordinate of cluster. |
<y_pos> |
Numerical |
Y coordinate of cluster. |
<UMI> |
Restricted characters: A/T/G/C/N |
Optional, appears when UMI is specified in the sample sheet. UMI sequences for Read 1 and Read 2, separated by a plus [+]. |
<read> |
Numerical |
Read number. 1 can be single read or Read 2 of paired-end. |
<is filtered> |
Y or N |
Y if the read is filtered (did not pass), N otherwise. |
<control number> |
Numerical |
0 when none of the control bits are on, otherwise it is an even number. For systems that do not perform control specification, this number is always 0. |
<index> |
Restricted characters: A/T/G/C/N |
Index of the read. |
An example of a valid entry is as follows; note the space preceding the read number element:
@SIM:1:FCX:1:15:6329:1045:GATTACT+GTCTTAAC 1:N:0:ATCCGA
TCGCACTCAACGCCCTGCATATGACAAGACAGAATC
+
<>;##=><9=AAAAAAAAAA9#:<#<;<<<????#=