geo16S {JMDplots} | R Documentation
Plots from the paper by Dick and Tan (2023).
geo16S1(pdf = FALSE)
geo16S2(pdf = FALSE)
geo16S3(pdf = FALSE)
geo16S4(pdf = FALSE)
geo16S5(pdf = FALSE)
geo16S_S1(pdf = FALSE)
geo16S_S2(pdf = FALSE)
geo16S_S3(pdf = FALSE)
geo16S_S4(pdf = FALSE)
geo16S_S5(pdf = FALSE, H2O = FALSE)
geo16S_S6(pdf = FALSE)
getmdat_geo16S(study, metrics = NULL, dropNA = TRUE)
getmetrics_geo16S(study, quiet = TRUE, ...)
plotmet_geo16S(study, quiet = TRUE, ...)
pdf | logical, make a PDF file?
H2O | logical, make plots for nH2O instead of ZC?
study | character, study name
metrics | data frame, output of getmetrics_geo16S
dropNA | logical, exclude samples with NA name in metadata?
quiet | logical, change to FALSE to print details about data processing
... | additional arguments; for getmetrics_geo16S, these are passed to read_RDP
This table gives a brief description of each plotting function.
geo16S1 | Distinct chemical parameters of reference proteomes for major taxonomic groups |
geo16S2 | Estimated community proteomes from different environments have distinct chemical signatures |
geo16S3 | Lower carbon oxidation state is tied to oxygen depletion in water columns |
geo16S4 | Common trends of carbon oxidation state of community reference proteomes for shale gas wells and hydrothermal systems |
geo16S5 | Comparison of protein ZC from metagenomic or metatranscriptomic data with estimates from 16S and reference sequences |
geo16S_S1 | RefSeq and 16S rRNA data processing outline |
geo16S_S2 | Scatterplots of ZC and nH2O for genera vs higher taxonomic levels |
geo16S_S3 | nH2O-ZC plots for major phyla and their genera |
geo16S_S4 | Venn diagrams for phylum and genus names in the RefSeq (NCBI), RDP, and SILVA taxonomies |
geo16S_S5 | Correlations between ZC estimated from metagenomes and 16S rRNA sequences |
geo16S_S6 | Correlation of ZC with GC content of metagenomic and 16S amplicon reads |
getmdat_geo16S gets metadata for the indicated study and adds columns for plot parameters (pch, col).
For some datasets, sample subsets are indicated by appending a suffix to the study name, separated by an underscore.
With the default dropNA = TRUE, samples with an NA name in the metadata file are excluded; this is how outliers are removed from the analysis.
If metrics is supplied, its rows are sorted into the same order as the samples in the metadata file, and the function returns a list with both ‘metadata’ and the sorted ‘metrics’.
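The filtering and reordering described above can be illustrated with base R alone. This is a minimal sketch, not the code used by getmdat_geo16S: the toy data frames, the column names Run and name, and all numeric values are assumptions made for illustration.

```r
# Toy metadata; a sample with NA in the (assumed) `name` column is treated as an outlier
metadata <- data.frame(
  Run  = c("S1", "S2", "S3", "S4"),
  name = c("oxic", NA, "suboxic", "euxinic"),
  stringsAsFactors = FALSE
)
# Toy metrics for the same samples, listed in a different order (arbitrary values)
metrics <- data.frame(
  Run  = c("S4", "S1", "S3", "S2"),
  ZC   = c(-0.17, -0.14, -0.16, -0.15),
  nH2O = c(-0.74, -0.76, -0.75, -0.73),
  stringsAsFactors = FALSE
)

# Analogue of dropNA = TRUE: exclude samples whose name is NA
metadata <- metadata[!is.na(metadata$name), ]

# Put the metrics rows in the same order as the metadata rows
metrics <- metrics[match(metadata$Run, metrics$Run), ]

# Return both pieces in one list, mirroring the described return value
result <- list(metadata = metadata, metrics = metrics)
stopifnot(identical(result$metadata$Run, result$metrics$Run))
```

Using match() keeps the metadata as the reference ordering, so any sample dropped from the metadata also disappears from the metrics.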
getmetrics_geo16S calculates chemical metrics (ZC and nH2O) for the indicated study.
The ... argument is used to supply arguments to read_RDP.
Pipeline for sequence data processing (uses external programs fastq-dump, vsearch, seqtk, RDP Classifier).
Sample metadata for each study.
RDP Classifier results combined into a single CSV file for each study, created with the mkRDP function in ‘pipeline.R’.
Accession numbers and sample descriptions for the Human Microbiome Project, Soils, and Mammalian Guts datasets, taken from the Supplementary Material of Aßhauer et al. (2015) and used in geo16S5. GC content for metagenome and 16S amplicon reads was taken from https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR*******.
*_AA.csv
Sum of amino acid compositions of protein fragments inferred from metagenomes, produced by runARAST.R and used in geo16S5. The chains column has the number of protein fragments.
scripts/
Directory with script files: ARAST.R for the analysis pipeline, runARAST.R to run the pipeline for each dataset, and supporting Bash and Perl script files. The ARAST.R file is a modified version of the pipeline used in a previous study (see gradox).
process.R
Script used for extracting taxonomic names from RDP and SILVA sequence files. See comments for details about data sources.
[RDP|SILVA][phyla|genera].txt
Files with archaeal and bacterial phylum and genus names in RDP and SILVA, used in geo16S_S4.
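Because each *_AA.csv file stores a sum over protein fragments, per-fragment averages follow from dividing by the chains column. The sketch below assumes a layout with a chains column next to one count column per amino acid. Only three amino acid columns are shown, and the sample labels and numbers are arbitrary placeholders rather than values from any dataset.

```r
# Hypothetical summed amino acid composition for two samples (arbitrary numbers)
aa <- data.frame(
  protein = c("sampleA", "sampleB"),
  chains  = c(100, 250),   # number of protein fragments
  Ala     = c(900, 2000),
  Gly     = c(700, 1750),
  Leu     = c(400, 1250),
  stringsAsFactors = FALSE
)
aa_cols <- c("Ala", "Gly", "Leu")

# Total number of residues summed over all fragments in each sample
total_residues <- rowSums(aa[, aa_cols])

# Mean fragment length = total residues / number of fragments
mean_length <- total_residues / aa$chains

# Amino acid fractions per sample; these do not depend on the number of fragments
aa_fraction <- as.matrix(aa[, aa_cols]) / total_residues
```

Fractions like these are independent of how many fragments were summed, which is why a summed composition is enough for per-sample compositional metrics.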
Aßhauer KP, Wemheuer B, Daniel R and Meinicke P (2015) Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data. Bioinformatics 31, 2882–2884. doi:10.1093/bioinformatics/btv287
Dick JM and Tan J (2023) Chemical links between redox conditions and estimated community proteomes from 16S rRNA and reference protein sequences. Microb. Ecol. 85, 1338–1355. doi:10.1007/s00248-022-01988-9
# Make Figure 1
geo16S1()
# Get metrics of community reference proteomes and paired metadata for one study
metrics <- getmetrics_geo16S("BCA+21")
mdat <- getmdat_geo16S("BCA+21", metrics = metrics)
# Get *all* available metadata
metadata <- getmdat_geo16S("BCA+21", dropNA = FALSE)
stopifnot(nrow(metadata) > nrow(mdat$metadata))
# Make an nH2O-ZC plot with lots of messages printed
# Symbols are coded in getmdat_geo16S
# (blue: oxic, black: suboxic, red: euxinic)
plotmet_geo16S("SVH+19", quiet = FALSE)
# List datasets used in geo16S paper
mdatdir <- system.file("extdata/geo16S/metadata", package = "JMDplots")
# Strip only a trailing ".csv" (an unescaped "." would match any character)
sub("\\.csv$", "", dir(mdatdir))
# Get metadata for one study
mdat <- getmdat_geo16S("BGPF13")