Taxonomic Databases

The 16S Metagenomics app includes three taxonomic databases for taxonomic classification.

RefSeq RDP 16S v3 May 2018 DADA2 32bp

This FASTA file is based on: https://benjjneb.github.io/dada2/training.html. Citation: Ali Alishum. (2019). DADA2 formatted 16S rRNA gene sequences for both bacteria & archaea (Version 2).

Greengenes May 2013 32bp

This database is an Illumina-curated version of the May 2013 release of the Greengenes Consortium Database (greengenes.secondgenome.com/).

The following table includes the current statistics for that database:

| Taxonomic Level | Number of Classifications |
|-----------------|---------------------------|
| Kingdoms        | 3                         |
| Phyla           | 33                        |
| Classes         | 74                        |
| Orders          | 148                       |
| Families        | 321                       |
| Genera          | 1086                      |
| Species         | 6466                      |

The Greengenes SQL database file (gg_13_5.sql.gz) was used to obtain taxonomies down to the species level. Specifically, the database started with every entry contained in the Greengenes clones, isolates, and symbionts tables. From there, the following set of filters was applied:

1. Remove all entries whose 16S sequence length was below 1250 bp.
2. Remove all entries that had more than 50 wobble bases (for example, M, R, W, S, Y, K, V, H, D, B, N).
3. Remove all entries that were only partially classified (no classification for genus or species).
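
The three filters above can be sketched in Python as follows. This is a minimal illustration, not the program used to build the database: the `Entry` record and its field names are hypothetical, and it assumes that "partially classified" means an entry is dropped when either the genus or the species is missing.

```python
from dataclasses import dataclass
from typing import Optional

# Wobble (ambiguity) codes listed in filter 2.
WOBBLE_BASES = set("MRWSYKVHDBN")

MIN_LENGTH_BP = 1250      # filter 1: minimum 16S sequence length
MAX_WOBBLE_BASES = 50     # filter 2: maximum number of wobble bases allowed


@dataclass
class Entry:
    """Hypothetical record; the field names are not the Greengenes schema."""
    sequence: str
    genus: Optional[str]
    species: Optional[str]


def count_wobble_bases(sequence: str) -> int:
    """Count the characters in the sequence that are wobble codes."""
    return sum(1 for base in sequence.upper() if base in WOBBLE_BASES)


def passes_filters(entry: Entry) -> bool:
    """Return True if the entry survives all three filters."""
    if len(entry.sequence) < MIN_LENGTH_BP:
        return False
    if count_wobble_bases(entry.sequence) > MAX_WOBBLE_BASES:
        return False
    # Assumption: an entry missing either genus or species is treated
    # as partially classified and dropped.
    if not entry.genus or not entry.species:
        return False
    return True
```

With this sketch, an entry of 1600 bp with 50 wobble bases and both genus and species assigned is kept, while the same entry with 51 wobble bases is dropped.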

The Greengenes database had a number of classifications placed in the incorrect field or entered incorrectly, such as improper genus or species names and clone or strain IDs placed in the species field. Illumina developed a program to help identify and clean up these entries.

Ambiguous epithets and classifications (sp, aff, cf, genosp, genomosp) were removed, because they are equivalent to an empty taxonomic level.
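
As a rough illustration of this step, the sketch below blanks a taxonomic field when it contains one of the ambiguous markers listed above. The function name and the token-matching rule (whitespace-separated tokens, trailing periods ignored) are assumptions made for the example; the actual cleanup program is not described here.

```python
from typing import Optional

# Ambiguous markers named in the text above.
AMBIGUOUS_EPITHETS = {"sp", "aff", "cf", "genosp", "genomosp"}


def clean_epithet(name: Optional[str]) -> Optional[str]:
    """Return None (an empty taxonomic level) if the name is ambiguous.

    Assumption: a name is ambiguous when any whitespace-separated token,
    with trailing periods stripped and case ignored, is one of the markers.
    """
    if not name:
        return None
    tokens = [token.rstrip(".").lower() for token in name.split()]
    if any(token in AMBIGUOUS_EPITHETS for token in tokens):
        return None
    return name
```

Because whole tokens are matched, a legitimate epithet that merely begins with "sp" (for example, "sphaericus") is left unchanged.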

Listeria monocytogenes (GenBank entry X56153.1), Listeria innocua (GenBank entry FJ774235.1), and PhiX (NCBI reference sequence: NC_001422) are included in the database to support internal research projects.

UNITE Fungal ITS Database v7.2

This FASTA file is based on: UNITE Community (2017): UNITE general FASTA release. Version 01.12.2017. UNITE Community. https://doi.org/10.15156/BIO/587475. It includes singletons set as RefS (in dynamic files).