Recent Updates

Feb 2nd, 2016
New SILVA Database taxonomy: updated to version 119
The new taxonomy is reflected in all the MBL-origin projects, while user-uploaded project taxonomy is not affected. To use the new taxonomy with your project, you must either delete the old taxonomy before re-GASTing or re-upload your project under a new name.

May 19th, 2015
Added new citations to the methods page
More information: Methods Publications

Apr 30th, 2014
Data Exports (Taxcounts and Fasta Files) Change
The increasing size of the VAMPS database has necessitated a change to the way that the majority of data is downloaded from VAMPS. Previously you were able to select the datasets and download these files 'on-the-fly', meaning that the web interface would be unresponsive while the data was pulled from the database and the files were written. Now the time-consuming work runs in the background, and when ready the files will be available for a couple of weeks here.
More information: Data Exports

Apr, 2014
Geo Cluster & Geo Distribution -- Added this for all projects with lat/long metadata
Previously you could use this only for the ICoMM datasets, but now there are other projects with latitude and longitude data, including both private and public data such as some DCO (Deep Carbon Observatory), LTR (Long Term Ecological Research) and RARE (Rare Biosphere) projects. It uses the Google Earth API to display a community dendrogram on the globe, clustered with a method you select.
Give it a try: Geo Cluster -- Geo Distribution

Mar, 2014
Export Fasta and Taxcounts: Added the ability to choose a specific taxonomy.
Previously you could only choose a specific domain (Archaea, Bacteria, Eukarya...) or 'All Domains', but now you can browse through the taxonomy for each domain to select a taxonomic lineage at any rank down to species.
Export Fasta Sequences -- Export Taxcounts

Jan, 2014
OTUs: added anchor trimming for our Bacterial v6v4 sequences.
Our Bacterial v6v4 reads are currently 480bp long. It is well known that these long 454 reads, generated through unidirectional sequencing, have lower quality at the end of the read, which can contribute to error and inflate OTU clusters. Most OTU clustering pipelines strongly recommend trimming reads for quality and uniform length, generally to shorter than 400bp. So for more meaningful clustering of reads we will trim off that low-quality end. Trimming to the v6v5 anchor generates a read <400bp in length that is overall higher quality than the original 480bp read, has a conserved end point, and still contains the highly informative v6 region.
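A minimal sketch of anchor trimming, for illustration only. It assumes the anchor is found by exact string match and that the trimmed read ends at the last base of the anchor; the function name, the anchor sequence, and the example read are placeholders, not the sequences or code used on VAMPS.

```python
from typing import Optional


def trim_to_anchor(read: str, anchor: str) -> Optional[str]:
    """Trim a read so that it ends at the last base of the anchor.

    Returns None when the anchor is not found, so the caller can decide
    whether to discard the read or keep it untrimmed.
    """
    pos = read.find(anchor)
    if pos == -1:
        return None
    # Keep everything up to and including the anchor, so that every
    # trimmed read shares the same conserved end point.
    return read[: pos + len(anchor)]


# Placeholder anchor and read (assumptions, not real VAMPS data).
ANCHOR = "GGATTAGATACCC"
read = "ACGTACGTAC" + ANCHOR + "TTTTNNNN"
trimmed = trim_to_anchor(read, ANCHOR)
print(trimmed)
```

A production trimmer would likely tolerate mismatches in the anchor; the exact match here only keeps the sketch short.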

Dec, 2013
New Portal: Microbiome of the Built Environment
These projects were funded by the Sloan Foundation and relate to studies of the microbiome of human built structures. More Information....

Dec, 2013
Publications Page
This page will list the publications that use data from the VAMPS site. If you have published data and would like to list your publication here, please send an email to with the details.

Sept, 2013
OTU Upload
You can now upload OTUs in either an OTU Table or biom file format.
See OTU uploads page

May - Aug, 2013
Improved Upload capabilities
The upload process is more robust and the storage hardware has been upgraded. In addition to uploading raw and trimmed (post-processed) sequences, you can now also upload taxonomic counts data in a matrix format (without sequences).
More Information....

June, 2012
Oligotyping
This analysis tool is new and not yet published. It is a supervised method to identify closely related but distinct bacterial taxa in high-throughput sequencing data of the 16S ribosomal RNA gene.
More Information...

Dec, 2012
QIIME - updated to version 1.7

Mar, 2012
Portals
Portals allow users access to specific groups of projects. Some portals are public, like ICoMM (International Census of Marine Microbes), and some are private now but will become public as the data are published. Currently there are seven portals on VAMPS.

Dec, 2011
QIIME - 454 Overview Tutorial
Using VAMPS data you can perform the QIIME analyses that follow the 454 QIIME tutorial.

March 18th, 2011.
Initial chimera checking and removal has been performed on all V6V4 and V3V5 sequences using the uchime option of USearch. We are currently using a combination of reference set comparison (similar to Chimera Slayer) and de novo checking (directly comparing sequences within each PCR amplification). We are also running an additional review of possible chimeras that have not been identified and removed (false negatives) and expect to refine our chimera removal process in the near future.
Information on USearch can be found here:

November 8th, 2010.
v6 clusters have been updated and are available here.

August 9th, 2010.
The VAMPS data has been updated to Silva 102. For v6v4 and v3v5 projects, chimeras will be removed as soon as possible.

April 5th, 2010.
The operational taxonomic units (OTUs) for the V6 sequencing data have been recalculated and are now available through the Clusters and Diversity page. All V6 data have been clustered together using the new SLP method, which is based on pairwise distances, a noise-reducing preclustering step, and average linkage clustering. The advantages of the new method are that it corrects for sequencing errors, minimizes the propagation of OTUs with sampling depth, and can be run across all projects. Unique OTU IDs can be used to compare between projects and datasets. Any samples sequenced since March 16 have not yet been incorporated into the new clusters. The previous multiple sequence alignment, complete linkage clusters are still available through the VAMPS archive.

October 9th, 2009.
Upload Untrimmed Data.
The ability to upload untrimmed data has been added. Uploads include raw sequence data, quality data, primers, and run keys. After uploading, the data is automatically run through our trimming process. The data can then be viewed in the Community Visualization tools.

August 23rd, 2009.
Improved Trimming and GASTing

  • The base-calling software internal to the GS-FLX was updated in late February 2008. Prior to February 2008, only the length of homopolymers dictated the quality scores. All runs from March 7, 2008 and later report an improved quality score, which we can now use to improve the quality of sequence reads. Data on the VAMPS site posted on March 7, 2008 or later no longer contain reads with an average base quality less than 30.
  • The fuzzy trimming of the distal primer infrequently left the first two or three bases of the primer at the end of the read. This would only occur in cases where the "fuzz" was just after the first few bases, and the remaining bases were recognized as valid primer. We now require the first 3 bases of the primer to be part of the "fuzzy" find for it to be valid. In many cases, moving down the list of fuzzy matches, we were still able to remove the primer at the start. OTU clustering ignores terminal gaps, so the effect on clustering should be minimal.
  • The current GAST process has a more stringent requirement for a valid prefiltering BLAST match. If the BLAST match of the read has an alignment length less than 80% of the read length, it is not considered an adequate match. In cases where no valid BLAST match now exists, the GAST process will assign "Unknown" as the taxonomy. In experiments with known sequences, we have found that this helps to remove non-V6 contamination. The reads are not deleted, since they may still be high-quality reads, but they will not contribute to the clusters and diversity, because they will be filtered out before the clusters are run.
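The two numeric rules in the list above (average base quality of at least 30, and a BLAST alignment covering at least 80% of the read) can be sketched as follows. This is an illustration under stated assumptions, not the VAMPS trimming or GAST code: the function names and the inputs (a list of per-base quality scores, an alignment length, a candidate taxonomy string) are hypothetical.

```python
from typing import Sequence


def passes_quality(base_qualities: Sequence[int], min_average: float = 30.0) -> bool:
    """Keep a read only if its average base quality is at least min_average."""
    if not base_qualities:
        return False
    return sum(base_qualities) / len(base_qualities) >= min_average


def is_adequate_blast_match(alignment_length: int, read_length: int,
                            min_fraction: float = 0.80) -> bool:
    """An alignment shorter than min_fraction of the read is not adequate."""
    if read_length <= 0:
        return False
    return alignment_length / read_length >= min_fraction


def prefilter_taxonomy(alignment_length: int, read_length: int, taxonomy: str) -> str:
    """Return the candidate taxonomy, or "Unknown" when the match is too short.

    The read itself is kept either way; only its taxonomy label changes.
    """
    if is_adequate_blast_match(alignment_length, read_length):
        return taxonomy
    return "Unknown"


# Made-up values: a 60-base read with a 40-base alignment (about 67%).
print(prefilter_taxonomy(40, 60, "Bacteria;Proteobacteria"))
```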

The previous data have been archived at:

July 30th, 2009.
Clusters and Diversity. A taxonomy file (*.tax) has been added to the OTU Cluster files. The taxonomy file shows the taxonomy corresponding to the reads in the *.list file for the 0.03, 0.06, and 0.10 widths.

April 2nd, 2009.
Export Taxcounts. The ability to export data for selected domains was added to the Export Taxcounts page. The user can now select Archaea, Bacteria, Eukarya, or Organelle sequences for download, as well as all domains together.

March 12th, 2009.
Exporting Fasta Sequences. The GAST distance was added to the definition line in the fasta files. The format of the definition line is now Sample ID | Project | Dataset | GAST Distance and Taxonomy | Count.
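A short sketch of reading the definition line described above, assuming the five fields are separated by "|" and that no field contains that character. The class name, function name, and example line are made up for illustration; the layout inside the "GAST Distance and Taxonomy" field is not specified here, so it is kept as a single string.

```python
from typing import NamedTuple


class Defline(NamedTuple):
    sample_id: str
    project: str
    dataset: str
    gast_and_taxonomy: str  # left unparsed: its inner layout is an unknown
    count: str


def parse_defline(line: str) -> Defline:
    """Split a fasta definition line into its five pipe-separated fields."""
    fields = [field.strip() for field in line.lstrip(">").split("|")]
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, found {len(fields)}: {line!r}")
    return Defline(*fields)


# Made-up definition line (assumption, not copied from a VAMPS export).
example = ">SAMPLE_001 | ICM_XYZ_Bv6 | XYZ_0001 | 0.012 Bacteria;Proteobacteria | 17"
record = parse_defline(example)
print(record.project, record.count)
```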

March 9th, 2009.
Exporting Fasta Sequences. The ability to export data for selected domains was added. The user can now select Archaea, Bacteria, Eukarya, or Organelle sequences for download. The taxonomy of each sequence is included in the definition line.

March 9th, 2009.
Exporting Fasta Sequences. The compression of fasta files has been eliminated. It is no longer necessary to unzip the downloaded files.

February 13th, 2009.
Community Visualization. A change was made to how data are normalized for custom datasets in the Community Visualization page. The custom datasets were previously normalized as one unit. Now normalization is also done within the custom dataset, normalizing the individual dataset components against each other.

February 11th, 2009.
Export Taxcounts. The ability to select datasets through saved custom datasets was added. When selecting data as normalized to maximum or by frequency, the datasets are normalized within the custom dataset by normalizing the individual dataset components against each other.

February 5th, 2009.
DOTUR Clusters. Projects are now filtered by taxonomic domain before running DOTUR clusters. Projects using Bacterial primers, for instance, are filtered to only include Bacteria before generating OTUs. Similarly, projects using Archaeal primers will only have OTUs for Archaea, and Eukaryal projects will only have Eukaryal OTUs.
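The domain filter can be pictured with a small sketch. It assumes each read carries a semicolon-separated taxonomy string whose first rank is the domain; the function name and the example reads are hypothetical and do not reflect the actual VAMPS pipeline.

```python
from typing import Dict


def filter_by_domain(read_taxonomy: Dict[str, str], domain: str) -> Dict[str, str]:
    """Keep only reads whose taxonomy string starts with the given domain."""
    return {
        read_id: taxonomy
        for read_id, taxonomy in read_taxonomy.items()
        if taxonomy.split(";")[0] == domain
    }


# Made-up reads from a project that used Bacterial primers.
reads = {
    "read_1": "Bacteria;Proteobacteria",
    "read_2": "Archaea;Euryarchaeota",
    "read_3": "Bacteria;Firmicutes",
}
bacterial_reads = filter_by_domain(reads, "Bacteria")
print(sorted(bacterial_reads))
```

Only the reads that survive this filter would then be passed on to OTU clustering.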

January 26th, 2009.
Project Naming.

  • ICoMM projects now begin with a prefix of ICM_
  • Keck projects begin with KCK_
  • LTER projects begin with LTR_
  • All other projects begin with a three-letter code designating the research facility or the researcher's initials.
  • All projects end with a suffix designating the domain and hypervariable region targeted by the primers. For instance, _Bv6 used primers for the V6 region in Bacteria, _Av6 is the V6 region in Archaea, and _Ev9 is the V9 region in Eukarya.
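The naming convention above can be read back out of a project name with a few lines of code. This sketch assumes names are underscore-separated, with the prefix first and the domain/region suffix last, and that the domain letters are B, A, and E as in the examples; the function name and the sample project names are hypothetical.

```python
from typing import NamedTuple

# Domain letters taken from the examples in the list above.
DOMAIN_CODES = {"B": "Bacteria", "A": "Archaea", "E": "Eukarya"}


class ProjectName(NamedTuple):
    prefix: str
    domain: str
    region: str


def parse_project_name(name: str) -> ProjectName:
    """Split a project name into its prefix, domain, and hypervariable region."""
    parts = name.split("_")
    if len(parts) < 2:
        raise ValueError(f"no prefix/suffix separator in {name!r}")
    prefix, suffix = parts[0], parts[-1]
    domain_code, region = suffix[:1], suffix[1:]
    if domain_code not in DOMAIN_CODES or not region:
        raise ValueError(f"unrecognized suffix {suffix!r} in {name!r}")
    return ProjectName(prefix, DOMAIN_CODES[domain_code], region)


# Made-up project names following the convention.
print(parse_project_name("ICM_ABC_Bv6"))
print(parse_project_name("LTR_XYZ_Ev9"))
```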

December 9th, 2008.
Export Taxcounts. The ability to select datasets through saved custom projects has been temporarily disabled.

December 1st, 2008.
Export Taxcounts. The ability to select datasets through saved custom projects was added.

November 5th, 2008.
Community Visualization and Export Taxcounts. V9 data were added to the database.

October 29th, 2008.
Community Visualization and Export Taxcounts. NOTICE TO USERS: The taxonomy data have been updated as of October 27th and 29th, 2008. The raw sequences have been retrimmed to improve the removal of primer sequences, and the taxonomy has been updated to SILVA 95, with taxonomic assignments contributed from Entrez Genome, Hugenholtz, RDP, SILVA, and several users.

October 10th, 2008.
Export Taxcounts. The ability to select individual datasets from different projects has been added. The user can now combine projects and datasets in order to do a customized export of data.

August 19th, 2008.
Community Visualization. The appearance of the Community Visualization entry page was updated. The Novice tool was renamed the Simple Taxonomy tool.

July 31st, 2008.
Clustering and Diversity. OTU data packages were updated. We regrettably discovered an error in the distance calculations used to generate the OTU data downloads. This was corrected and all the OTU data packages were updated.

July 28th, 2008.
Clustering and Diversity. OTU files specifying OTU sizes were added to the OTU Cluster data package.

July 28th, 2008.
Databases. Download of RefV3 database was added.

July 28th, 2008.
Software. Download of GAST software was added.

July 3rd, 2008.
Export Taxonomic Counts. Updated downloads. Left-clicking on a download link now opens a new window, so that the existing page and links are preserved.

July 2nd, 2008.
Export Taxonomic Counts. Updated processing of export data to improve download speed.

June 24th, 2008.
Export Taxonomic Counts. Updated the Normalize By Percent calculation to be written with ten decimals in order to minimize rounding errors.
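As an illustration of why the extra decimals matter, here is a sketch of a percent normalization written out with ten decimals. It assumes "percent" means each taxon's count divided by the dataset total, times 100; the function name and the counts are made up and this is not the export code itself.

```python
from typing import Dict


def normalize_by_percent(counts: Dict[str, int]) -> Dict[str, str]:
    """Express each count as a percentage of the total, with ten decimals."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot normalize a dataset with no counts")
    return {
        taxon: f"{100.0 * count / total:.10f}"
        for taxon, count in counts.items()
    }


# Made-up counts: one third and two thirds do not round cleanly.
percents = normalize_by_percent({"Archaea": 1, "Bacteria": 2})
print(percents)
```

With only two decimals these values would be written as 33.33 and 66.67, and the rounding error would grow when many small percentages are summed.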

June 12th, 2008.
Export Taxonomic Counts. Updated processing of export data to improve download speed.

June 9th, 2008.
Export Taxonomic Counts. Removed Totals column from normalized data outputs.

June 6th, 2008.
Export Taxonomic Counts. Normalization By Percent was added to the output options. Species and strain were added to the taxonomy. Compression of output files was removed.