geo16S {JMDplots}    R Documentation

Chemical links between redox conditions and community reference proteomes

Description

Plots from the paper by Dick and Tan (2023).

Usage

  geo16S1(pdf = FALSE)
  geo16S2(pdf = FALSE)
  geo16S3(pdf = FALSE)
  geo16S4(pdf = FALSE)
  geo16S5(pdf = FALSE)
  geo16S_S1(pdf = FALSE)
  geo16S_S2(pdf = FALSE)
  geo16S_S3(pdf = FALSE)
  geo16S_S4(pdf = FALSE)
  geo16S_S5(pdf = FALSE, H2O = FALSE)
  geo16S_S6(pdf = FALSE)
  getmdat_geo16S(study, metrics = NULL, dropNA = TRUE)
  getmetrics_geo16S(study, quiet = TRUE, ...)
  plotmet_geo16S(study, quiet = TRUE, ...)

Arguments

pdf

logical, make a PDF file?

H2O

logical, make plots for nH2O instead of ZC?

study

character, study name

metrics

data frame, output of get_metrics

dropNA

logical, exclude samples with NA name in metadata?

quiet

logical, change to FALSE to print details about data processing

...

additional arguments passed to read_RDP (for getmetrics_geo16S) or plot_metrics (for plotmet_geo16S)

Details

This table gives a brief description of each plotting function.

geo16S1    Distinct chemical parameters of reference proteomes for major taxonomic groups
geo16S2    Estimated community proteomes from different environments have distinct chemical signatures
geo16S3    Lower carbon oxidation state is tied to oxygen depletion in water columns
geo16S4    Common trends of carbon oxidation state of community reference proteomes for shale gas wells and hydrothermal systems
geo16S5    Comparison of protein ZC from metagenomic or metatranscriptomic data with estimates from 16S and reference sequences
geo16S_S1  RefSeq and 16S rRNA data processing outline
geo16S_S2  Scatterplots of ZC and nH2O for genera vs higher taxonomic levels
geo16S_S3  nH2O-ZC plots for major phyla and their genera
geo16S_S4  Venn diagrams for phylum and genus names in the RefSeq (NCBI), RDP, and SILVA taxonomies
geo16S_S5  Correlations between ZC estimated from metagenomes and 16S rRNA sequences
geo16S_S6  Correlation of ZC with GC content of metagenomic and 16S amplicon reads

getmdat_geo16S gets metadata for the indicated study and adds columns for plot parameters (pch, col). For some datasets, sample subsets are indicated by appending a suffix to the study name, separated by an underscore. With the default dropNA = TRUE, samples with an NA name in the metadata file are excluded; this is how outliers are removed from the analysis. If metrics is supplied, its rows are sorted into the same order as the metadata, and the function returns a list with both ‘metadata’ and the sorted ‘metrics’.
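
The filtering and sorting described above can be illustrated with a minimal base R sketch. This is not the code of getmdat_geo16S; the data frames and the column names (Run, name, Zc, nH2O) are hypothetical and chosen only for illustration.

```r
# Hypothetical metadata: one sample has NA in the name column
metadata <- data.frame(
  Run  = c("R1", "R2", "R3", "R4"),
  name = c("surface", NA, "deep", "mid"),
  stringsAsFactors = FALSE
)
# Hypothetical metrics, listed in a different order than the metadata
metrics <- data.frame(
  Run  = c("R4", "R1", "R3", "R2"),
  Zc   = c(-0.15, -0.12, -0.18, -0.10),
  nH2O = c(-0.74, -0.72, -0.76, -0.71),
  stringsAsFactors = FALSE
)

dropNA <- TRUE
# Exclude samples whose name is NA (assumed analogue of dropNA = TRUE)
if (dropNA) metadata <- metadata[!is.na(metadata$name), ]
# Reorder the rows of metrics to follow the rows of metadata
metrics <- metrics[match(metadata$Run, metrics$Run), ]
# Return both pieces together, as a list
result <- list(metadata = metadata, metrics = metrics)
```

Matching on a shared sample identifier keeps each row of metrics paired with the corresponding row of metadata, so plot parameters such as pch and col can be applied to the metrics directly.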

getmetrics_geo16S calculates chemical metrics (ZC and nH2O) for the indicated study. ... is used to supply arguments to read_RDP.
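
As background for the ZC metric, the carbon oxidation state of a molecule can be computed from its elemental formula. The sketch below uses the standard expression ZC = (Z - nH + 3nN + 2nO + 2nS) / nC; the helper name calc_Zc is hypothetical, and this is not how getmetrics_geo16S obtains its values (it works from RDP Classifier results and reference proteomes). nH2O depends on a choice of basis species and is not covered here.

```r
# Hypothetical helper: carbon oxidation state from an elemental formula
# C, H, N, O, S are element counts; Z is the net charge
calc_Zc <- function(C, H, N = 0, O = 0, S = 0, Z = 0) {
  (Z - H + 3 * N + 2 * O + 2 * S) / C
}

calc_Zc(C = 3, H = 7, N = 1, O = 2)  # alanine, C3H7NO2: 0
calc_Zc(C = 2, H = 5, N = 1, O = 2)  # glycine, C2H5NO2: 1
calc_Zc(C = 1, H = 4)                # methane, CH4: -4
```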

Files in extdata/geo16S

pipeline.R

Pipeline for sequence data processing (uses external programs fastq-dump, vsearch, seqtk, RDP Classifier).

metadata/*.csv

Sample metadata for each study.

RDP/*.csv.xz

RDP Classifier results combined into a single CSV file for each study, created with the mkRDP function in ‘pipeline.R’.

AWDM15.csv

Accession numbers and sample descriptions for the Human Microbiome Project, Soils, and Mammalian Guts datasets taken from Supplementary Material of Aßhauer et al. (2015) and used in geo16S5. GC content for metagenome and 16S amplicon reads was taken from https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR*******.

Files in extdata/geo16S/ARAST

*_AA.csv

Sum of amino acid compositions of protein fragments inferred from metagenomes, produced by runARAST.R and used in geo16S5. The chains column has the number of protein fragments.

scripts/

Directory with script files ARAST.R for the analysis pipeline, runARAST.R to run the pipeline for each dataset, and supporting Bash and Perl script files. The ARAST.R file is a modified version of the pipeline used in a previous study (see gradox).

Files in extdata/geo16S/taxonomy

process.R

Script used for extracting taxonomic names from RDP and SILVA sequence files. See comments for details about data sources.

[RDP|SILVA][phyla|genera].txt

Files with archaeal and bacterial phylum and genus names in RDP and SILVA, used in geo16S_S4.

References

Aßhauer KP, Wemheuer B, Daniel R and Meinicke P (2015) Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data. Bioinformatics 31, 2882–2884. doi:10.1093/bioinformatics/btv287

Dick JM and Tan J (2023) Chemical links between redox conditions and estimated community proteomes from 16S rRNA and reference protein sequences. Microb. Ecol. 85, 1338–1355. doi:10.1007/s00248-022-01988-9

Examples

# Make Figure 1
geo16S1()

# Get metrics of community reference proteomes and paired metadata for one study
metrics <- getmetrics_geo16S("BCA+21")
mdat <- getmdat_geo16S("BCA+21", metrics = metrics)

# Get *all* available metadata
metadata <- getmdat_geo16S("BCA+21", dropNA = FALSE)
stopifnot(nrow(metadata) > nrow(mdat$metadata))

# Make an nH2O-ZC plot with lots of messages printed
# Symbols are coded in getmdat_geo16S
# (blue: oxic, black: suboxic, red: euxinic)
plotmet_geo16S("SVH+19", quiet = FALSE)

# List datasets used in geo16S paper
mdatdir <- system.file("extdata/geo16S/metadata", package = "JMDplots")
# Strip the file extension (escaped dot, anchored at the end of the name)
gsub("\\.csv$", "", dir(mdatdir))
# Get metadata for one study
mdat <- getmdat_geo16S("BGPF13")

[Package JMDplots version 1.2.19-14 Index]