Instructions for creating rarefaction curves from VAMPS .mtx files
We recommend using mothur to create rarefaction curves.
Download the .mtx files (3, 6 and 10% clustering) for the project you wish to analyze.
After downloading, choose which file to use and open it in Excel.
Then determine the column number of the dataset you wish to perform rarefaction on.
In this example we will use the first column of data (column 8).
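If you prefer not to open the file in Excel, the column number can also be found from the command line. The following is a minimal sketch, assuming the matrix is tab-delimited and its first line holds the dataset names; the toy file and the names DatasetA and DatasetB are made up for illustration.

```shell
# Toy stand-in for a VAMPS matrix file (assumed layout: tab-delimited,
# one header line, first dataset in column 8). Names are hypothetical.
mtx=$(mktemp)
printf 'c1\tc2\tc3\tc4\tc5\tc6\tc7\tDatasetA\tDatasetB\n' > "$mtx"
printf 'x\tx\tx\tx\tx\tx\tx\t5\t2\n' >> "$mtx"

# Put each header field on its own line and prefix it with its column number.
columns=$(head -n 1 "$mtx" | tr '\t' '\n' | awk '{ print NR ": " $0 }')
printf '%s\n' "$columns"
```

The number printed next to your dataset name is the value to give to 'cut -f' later on.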
Determine the total number of OTUs in the dataset:
Count only the OTUs with a non-zero abundance in this dataset, so you can't simply count the total number of rows in the matrix.
In a blank cell you can use the Excel COUNTIF function on the range of OTU abundances for the dataset, e.g. =COUNTIF(H4:H19753, ">0")
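The same count can probably be obtained without Excel. This sketch uses awk on a toy file and assumes a tab-delimited matrix with a single header line and the dataset in column 8; adjust the column number and file name for your own data.

```shell
# Toy stand-in for a VAMPS matrix file (assumed layout: tab-delimited,
# one header line, abundances for the chosen dataset in column 8).
mtx=$(mktemp)
printf 'c1\tc2\tc3\tc4\tc5\tc6\tc7\tDatasetA\n' > "$mtx"
printf 'x\tx\tx\tx\tx\tx\tx\t%s\n' 5 0 12 1 0 >> "$mtx"

# Count the rows after the header whose column-8 abundance is greater than 0.
otu_count=$(awk -F'\t' 'NR > 1 && $8 > 0 { n++ } END { print n + 0 }' "$mtx")
echo "$otu_count"
```

For the toy data this counts the three rows with abundances 5, 12 and 1 and skips the two zero rows.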
Create a mothur-formatted rank abundance file from the OTU matrix file that you downloaded from VAMPS:
The format is simply a list of all the OTU sizes in ranked order (largest first), preceded by the clustering label and the number of OTUs.
The following Unix command line should create an initial rabund (rank abundance) file.
NOTE: Change the column number, matrix filename and output filename in the 'cut' command below to match your chosen project.
cut -f 8 ICM_VAG_Bv6.Bacteria.slp.otus.03.mtx | tail -n +2 | sort -nr | grep -v "^0$" | perl -p -l040 -e "print \"$F\";" | sed 's/  / /g' > ICM_VAG_Bv6.03.rabund
This is a single Unix command and must be entered all on one line.
If you don't have access to a Unix command line, you may be able to create the rank abundance file from inside Excel.
If you do find a simpler method to create the rank abundance file please let us know so we can include it in these instructions.
The command line works by:
Cutting out the 8th column of the file (be sure to set your own column number and file name).
Passing along all lines starting with the second line (the first line is the dataset name).
Sorting in reverse numeric order.
Removing the lines whose abundance is 0.
Swapping from multiple lines to one line with the numbers separated by spaces.
Replacing double spaces with single spaces (the double spaces are an artifact of the previous step).
Outputting the results to a new file.
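The stages above can be tried on a small file before running them on a real matrix. The sketch below is one possible simplification, not the original method: it swaps the perl and sed stages for 'paste -s', which joins lines with single spaces directly. The toy file and its layout (tab-delimited, one header line, dataset in column 8) are assumptions.

```shell
# Toy stand-in for a VAMPS matrix file (assumed layout: tab-delimited,
# one header line, abundances for the chosen dataset in column 8).
mtx=$(mktemp)
printf 'c1\tc2\tc3\tc4\tc5\tc6\tc7\tDatasetA\n' > "$mtx"
printf 'x\tx\tx\tx\tx\tx\tx\t%s\n' 5 0 12 1 0 >> "$mtx"

# cut: keep column 8; tail: drop the header line; sort: largest first;
# grep: drop zero abundances; paste: join the lines with single spaces.
abundances=$(cut -f 8 "$mtx" | tail -n +2 | sort -nr | grep -v '^0$' | paste -s -d ' ' -)
echo "$abundances"
```

Because 'paste -s' never produces double spaces, no clean-up stage is needed in this variant.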
Insert the cluster label and number of OTUs (the non-zero count determined earlier) into the rabund file:
Open the file you just created and edit the line so it starts with the cluster label (0.03, 0.06 or 0.10),
then a space, then the total number of OTUs, then a space, then the data.
Your final file should look something like the following (only a lot longer and on one line).
0.03 2271 513 326 269 223 216 190 174 123 116 97 97 66 65 64 63 62 62 61 60 58 50 50 49 49 36 36 35 34 32 26 26 25 24 24 24 23 23 10 9 9 9 8 8 8 8 8 8 8 7 7 7 7 6 6 6 5 5 4 3 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1
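The label and count can also be added from the command line instead of by hand. This is a sketch only: it assumes the rabund file holds a single line of space-separated non-zero abundances, so the number of values on that line equals the number of OTUs. The toy values and the variable names are made up for illustration.

```shell
# Toy rabund file as the earlier command line would leave it:
# one line of ranked, non-zero abundances separated by single spaces.
rabund=$(mktemp)
printf '12 5 1\n' > "$rabund"

label="0.03"                                   # clustering label for this file
n_otus=$(wc -w < "$rabund" | tr -d ' ')        # number of values = number of OTUs
line="$label $n_otus $(cat "$rabund")"         # label, count, then the data

# Overwrite the file with the completed single-line record.
printf '%s\n' "$line" > "$rabund"
cat "$rabund"
```

It is worth checking that the count written here matches the number you obtained with COUNTIF.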
Use mothur to create the rarefaction data (replace the filename with your own):