Main Paper
- Figure 1: NMDS analysis of host feeding behavior.
- Figure 2A: Taxa abundance by host species.
- Figure 2B: Alpha diversity.
- Figure 2C: Beta diversity.
- Figure 3: DA ASV tree.
- Figure 4: DA ASV heatmap.
- Table 1: Summary of habitat preference.
In this study, we looked at intestinal microbial communities from five herbivorous fish species across two families—Sparisoma viride, Sparisoma aurofrenatum, and Scarus taeniopterus from the Labridae and Acanthurus tractus and Acanthurus coeruleus from the Acanthuridae.
In the first section we run some analyses on field-based behavioral assays of the different herbivorous reef fish species.
In this part we walk through processing raw 16S rRNA read data, including assessing read quality, filtering reads, correcting errors, and inferring amplicon sequence variants (ASVs).
Next we go through the steps of defining sample groups, creating phyloseq objects, removing unwanted samples, and removing contaminant ASVs. Various parts of this section can easily be modified to perform different analyses. For example, if you were only interested in a specific taxon or group of samples, you could change the code here to create new phyloseq objects.
Here we assess taxonomic composition, alpha diversity, and beta diversity. Phyloseq offers many options for assessing diversity, including several alpha diversity metrics, additional ordination and distance methods, and so on. You can play around with these settings to see how they affect the results.
We wanted to understand how ASVs partitioned across host species. We also wanted to assess the specificity of each ASV to determine habitat preference. To our knowledge there is no quantitative way to do this. The only attempt we are aware of was MetaMetaDB, but it is based on a 454 database and no longer seems to be in active development. So we used an approach based on the work of Sullam et al.: first identifying differentially abundant ASVs, then searching for closest database hits, and finally using phylogenetic analysis and top hit metadata (isolation source, natural host) to infer habitat preference.
This section contains information on a) other analyses & visualizations, b) tools & resources used in this workflow, c) submitting sequencing data to public archives, and d) specific R package & versions used in this workflow.
All tables and figures presented herein are named as they appeared in the original publication. We also include many additional data products that were not part of the original publication.
Throughout this workflow we are going to rely on color to help us tell a story. We will use color to delineate host fish species and to delineate microbial taxa. Microbial diversity is pretty vast and it can be difficult to display all of this diversity in a single, static figure.
Many of us perceive color and/or differences in color, well, differently. So when designing figures it is important to use a) relatively few colors and b) a palette that is friendly to a variety of people. For our figures, we generated a palette based on Bang Wong’s scheme described in this paper. Wong’s scheme uses contrasting colors that can be distinguished by a range of people. Consider that roughly 8% of people (mostly males) are color blind. So what do you think? Do you want Keanu Reeves to understand your figures or not?
Wong’s scheme is conservative: there are only 7 colors. We added black, grey, and a bluish white to give us some wiggle room (we cheated a little). Others have developed 12 and 15 color palette schemes and these are worth looking into, but be careful, because figures with too many colors can inhibit our ability to discern patterns. This conservative palette forced us to choose carefully when deciding which taxa to target or how many groups to display. To keep things simple, we created two palettes: one for microbial taxa (friend_pal) with all the colors and another for the five host fish species (samp_pal). The fish palette is just a subset of the full palette. Here is the code:
# Full palette (microbial taxa)
friend_pal <- c("#009E73", "#D55E00", "#F0E442",
                "#CC79A7", "#56B4E9", "#E69F00",
                "#0072B2", "#7F7F7F", "#B6DBFF",
                "#000000")
# Fish palette (five host species), a subset of the full palette
samp_pal <- c("#CC79A7", "#0072B2", "#009E73",
              "#56B4E9", "#E69F00")
# Preview a palette as a row of colored tiles, one tile per color
cols <- function(a) image(seq_along(a), 1, as.matrix(seq_along(a)),
                          col = a, axes = FALSE, xlab = "", ylab = "")
cols(friend_pal)
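As a quick sanity check on the statement that the fish palette is a subset of the full palette, the sketch below uses only base R. The pairing of colors with host species in `named_samp_pal` is an assumption made for illustration (the text above does not say which color goes with which species), and `is_subset`, `extra_cols`, `host_species`, and `named_samp_pal` are hypothetical names that do not appear in the original workflow.

```r
# Palettes exactly as defined in the workflow
friend_pal <- c("#009E73", "#D55E00", "#F0E442",
                "#CC79A7", "#56B4E9", "#E69F00",
                "#0072B2", "#7F7F7F", "#B6DBFF",
                "#000000")
samp_pal <- c("#CC79A7", "#0072B2", "#009E73",
              "#56B4E9", "#E69F00")

# TRUE if every fish color also appears in the full palette
is_subset <- all(samp_pal %in% friend_pal)

# Colors reserved for microbial taxa only (not used for fish)
extra_cols <- setdiff(friend_pal, samp_pal)

# ASSUMPTION: this species order is illustrative, not taken from the workflow
host_species <- c("Acanthurus coeruleus", "Acanthurus tractus",
                  "Scarus taeniopterus", "Sparisoma aurofrenatum",
                  "Sparisoma viride")

# A named vector makes color lookups by species explicit
named_samp_pal <- setNames(samp_pal, host_species)

print(is_subset)
print(extra_cols)
print(named_samp_pal)
```

A named vector like this can be handy when a plotting function accepts a named set of colors, since each species then keeps the same color regardless of the order in which groups appear in the data.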
There is a great article on Coloring for Colorblindness by David Nichols that has an interactive color picker and recommendations for accessible palettes. This is also a really cool site for looking at color combinations. Both resources are highly recommended.
Use the links below if you want to jump directly to the code used to produce the figures and tables from the original publication. You can also find the full Supplementary files for the paper here but there is no R code on this page. If you want to see the code that produced the supplemental material, the direct links are also below. There is no code for Tables S2 and S7.