The Visualization and Analysis
of Microbial Population Structures
VAMPS currently recognizes two taxonomic file formats, both compliant with the Ribosomal Database Project (RDP).
If your data were not generated by RDP, please convert your text files to one of the following formats before uploading.
1. Classifier text format: this format is produced when data are uploaded and classified on the RDP website.
After a series of optional blank lines, seven lines define information about the taxonomy calculation.
VAMPS looks for the presence of these lines but does not actively use them.
They are: Classifier, Taxonomical Hierarchy, Query File, Submit Date, Confidence threshold, Symbol, and Lineage.
The keyword Details then introduces the data lines. Each data line represents one sequence/organism.
The data lines include the sequence/organism name, followed by the taxonomy (domain, phylum, class, order, family, and genus), with each taxon followed by its RDP boot score
(an integer percentage between 0% and 100%). Fields are separated by semicolons. If you have not used RDP, simply use 100% for the boot scores.
Classifier: Naive Bayesian rRNA Classifier Version 2.0, July 2007
Taxonomical Hierarchy: Taxonomic Outline of the Bacteria and Archaea, release 7.8
Query File: testdata.fa
Submit Date: Mon Sep 19 12:01:35 EDT 2007
Confidence threshold: 80%
Symbol - after a sequence name indicates the results are obtained using reverse complement of that query sequence.
Sequence1; ; Root; 100%; Bacteria; 100%; Proteobacteria; 100%; Alphaproteobacteria; 100%; Rhizobiales; 100%; Hyphomicrobiaceae; 84%; Filomicrobium; 80%;
Sequence2; ; Root; 100%; Bacteria; 100%; Proteobacteria; 100%; Alphaproteobacteria; 100%; Rhodobacterales; 52%; Rhodobacteraceae; 52%; Pannonibacter; 40%;
Sequence3; ; Root; 100%; Bacteria; 100%; Proteobacteria; 100%; Gammaproteobacteria; 100%; Legionellales; 51%; Coxiellaceae; 51%; Rickettsiella; 51%;
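As a rough illustration of the data-line layout described above, the following Python sketch splits one Classifier-format data line into its name and taxon/score pairs, and builds a line with 100% boot scores for data that did not come from RDP. It is not VAMPS or RDP code; the function names are hypothetical, and it assumes (based on the example lines) that the second field holds the optional orientation symbol and that the lineage starts at Root.

```python
def parse_classifier_line(line):
    """Split one data line into (name, symbol, [(taxon, percent), ...])."""
    # Drop the trailing semicolon, then split on the remaining semicolons.
    fields = [f.strip() for f in line.strip().rstrip(";").split(";")]
    # Assumed layout: name; symbol; then alternating taxon; score fields.
    if len(fields) < 4 or len(fields) % 2 != 0:
        raise ValueError("expected name; symbol; then taxon/score pairs")
    name, symbol = fields[0], fields[1]
    rest = fields[2:]
    lineage = [
        (taxon, int(score.rstrip("%")))
        for taxon, score in zip(rest[0::2], rest[1::2])
    ]
    return name, symbol, lineage


def format_classifier_line(name, taxa, score=100):
    """Build a data line for non-RDP data, using one boot score throughout."""
    fields = [name, ""]  # empty orientation symbol (assumption)
    for taxon in taxa:
        fields += [taxon, f"{score}%"]
    return "; ".join(fields) + ";"


if __name__ == "__main__":
    line = format_classifier_line("Sequence1", ["Root", "Bacteria", "Proteobacteria"])
    print(line)
    print(parse_classifier_line(line))
```

The formatter covers the case where no RDP scores are available: every taxon simply receives 100%.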
2. Fasta-style format: this format is returned from local installations of RDP.
For each sequence or organism, this file contains two lines: a definition line starting with ">" that includes the sequence/organism name, and a taxonomic line that lists the taxonomy along with the RDP boot scores (as decimals between 0 and 1.0), separated by semicolons. If you have not used RDP, simply use 1.0 for the boot scores.
Root; 1.0; Bacteria; 1.0; Proteobacteria; 1.0; Alphaproteobacteria; 1.0; Rhizobiales; 1.0; Hyphomicrobiaceae; 0.84; Filomicrobium; 0.8;
Root; 1.0; Bacteria; 1.0; Proteobacteria; 1.0; Alphaproteobacteria; 1.0; Rhodobacterales; 0.52; Rhodobacteraceae; 0.52; Pannonibacter; 0.4;
Root; 1.0; Bacteria; 1.0; Proteobacteria; 1.0; Gammaproteobacteria; 1.0; Legionellales; 0.51; Coxiellaceae; 0.51; Rickettsiella; 0.51;
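The two-line record structure of the Fasta-style format can be read with a short loop. The Python sketch below is only an illustration under stated assumptions: the function name is hypothetical, the sequence name in the sample (`Sequence1`) is a placeholder, and everything after ">" on the definition line is treated as the name.

```python
def parse_fasta_style(text):
    """Return {name: [(taxon, score), ...]} from Fasta-style taxonomy text."""
    records = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Definition line: assume the rest of the line is the name.
            name = line[1:].strip()
        elif name is not None:
            # Taxonomic line: alternating taxon; score fields.
            fields = [f.strip() for f in line.rstrip(";").split(";")]
            if len(fields) % 2 != 0:
                raise ValueError(f"unpaired taxon/score field for {name}")
            records[name] = [
                (taxon, float(score))
                for taxon, score in zip(fields[0::2], fields[1::2])
            ]
            name = None  # each definition line owns exactly one taxonomic line
    return records


if __name__ == "__main__":
    sample = (
        ">Sequence1\n"  # placeholder name, not taken from the original file
        "Root; 1.0; Bacteria; 1.0; Proteobacteria; 1.0; Alphaproteobacteria; 1.0; "
        "Rhizobiales; 1.0; Hyphomicrobiaceae; 0.84; Filomicrobium; 0.8;\n"
    )
    for seq_name, lineage in parse_fasta_style(sample).items():
        print(seq_name, lineage)
```

A taxonomic line that appears without a preceding definition line is ignored here; a stricter reader might prefer to raise an error instead.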