R: Calculate chemical metric for taxa in each sample aggregated...

get_metric_byrank {chem16S}

R Documentation

Calculate chemical metric for taxa in each sample aggregated to a specified rank

Description

Calculates a single chemical metrics for taxa in each sample aggregated to a specified rank.

Usage

  get_metric_byrank(RDP, map, refdb = "GTDB_220", taxon_AA = NULL, groups = NULL,
    zero_AA = NULL, metric = "Zc", rank = "genus")

Arguments

`RDP`	data frame, taxonomic abundances produced by `read_RDP` or `ps_taxacounts`
`map`	data frame, taxonomic mapping produced by `map_taxa`
`refdb`	character, name of reference database (‘⁠GTDB_220⁠’ or ‘⁠RefSeq_206⁠’)
`taxon_AA`	data frame, amino acid compositions of taxa, used to bypass `refdb` specification
`groups`	list of indexing vectors, samples to be aggregated into groups
`zero_AA`	character, three-letter abbreviation(s) of amino acid(s) to assign zero counts for calculating chemical metrics
`metric`	character, chemical metric to calculate
`rank`	character, amino acid compositions of all lower-ranking taxa (to genus) are aggregated to this rank

Details

This function adds up amino acid compositions of taxa up to the specified rank and returns a data frame samples on the rows and taxa on the columns. Because amino acid composition for genera have been precomputed from species-level genomes in a reference database, chemical metrics for genera are constant. In contrast, chemical metrics for higher-level taxa is variable as they depend on the reference genomes as well as relative abundances of children taxa.

The value for rank should be one of ‘⁠rootrank⁠’, ‘⁠domain⁠’, ‘⁠phylum⁠’, ‘⁠class⁠’, ‘⁠order⁠’, ‘⁠family⁠’, or ‘⁠genus⁠’. For all ranks other than ‘⁠genus⁠’, the amino acid compositions of all lower-ranking taxa are weighted by taxonomic abundance and summed in order to calculate the chemical metric at the specified rank. If the rank is ‘⁠genus⁠’, then no aggregation is done (because it is lowest-level rank available in the classifications), and the values of the metric for all genera in each sample are returned. If the rank is ‘⁠rootrank⁠’, then the results are equivalent to community reference proteomes (i.e., get_metrics).

The RDP, map, ‘⁠refdb⁠’, and groups arguments are the same as described in get_metrics. See calc_metrics for available metrics.

Value

A data frame of numeric values with row names corresponding to samples and column names corresponding to taxa.

References

Dick JM, Shock E. 2013. A metastable equilibrium model for the relative abundances of microbial phyla in a hot spring. PLOS One 8: e72395. doi:10.1371/journal.pone.0072395

Examples

# Plot similar to Fig. 1 in Dick and Shock (2013)
# Read example dataset
RDPfile <- system.file("extdata/RDP-GTDB_220/SMS+12.tab.xz", package = "chem16S")
RDP <- read_RDP(RDPfile)
# Get mapping to reference database
map <- map_taxa(RDP)
# Calculate phylum-level Zc
phylum_Zc <- get_metric_byrank(RDP, map, rank = "phylum")
# Keep phyla present in at least two samples
n_values <- colSums(!sapply(phylum_Zc, is.na))
phylum_Zc <- phylum_Zc[n_values > 2]
# Swap first two samples to get them in the right location
# (MG-RAST accession numbers for these samples are not in spatial order)
phylum_Zc <- phylum_Zc[c(2, 1, 3, 4, 5), ]
matplot(phylum_Zc, type = "b", xlab = "Location (1-hot, 5-cool)", ylab = "Zc")
title("Phylum-level Zc at Bison Pool hot spring")

[Package chem16S version 1.1.0-5 Index]