evdevH2O {JMDplots}    R Documentation

Water activity and redox potential in evolution and development

Description

Plots from the paper by Dick (2022).

Usage

  evdevH2O1(pdf = FALSE)
  evdevH2O2(pdf = FALSE, boot.R = 99)
  evdevH2O3(pdf = FALSE, H2O = FALSE)
  evdevH2O4(pdf = FALSE)
  evdevH2O5(pdf = FALSE)
  evdevH2O6(pdf = FALSE)
  evdevH2O7(pdf = FALSE, boot.R = 99)
  evdevH2O8(pdf = FALSE, boot.R = 99)
  runMaximAct(dataset = "TPPG17", seed = 1:100, nbackground = 2000,
    res = 256, bg_organism = "Hsa", writeFiles = TRUE)
  LYSC_example()

Arguments

pdf

logical, make a PDF file?

boot.R

integer, number of bootstrap replicates

H2O

logical, make plots for nH2O instead of ZC?

dataset

character, dataset to use for constructing target proteins

seed

numeric, seeds to use for subsampling background proteins from the background proteome (human by default)

nbackground

numeric, number of background proteins to sample

res

numeric, grid resolution for logfO2 and logaH2O

writeFiles

logical, create CSV and PNG output files?

bg_organism

character, use background proteins from this organism

Details

This table gives a brief description of each plotting function.

evdevH2O1 Comparison of different sets of basis species
evdevH2O2 Protein length and chemical metrics for phylostratigraphic age groups
evdevH2O3 Evolution of protein ZC in eukaryotic lineages
evdevH2O4 Thermodynamic analysis of optimal logaH2O and logfO2 for target proteins
evdevH2O5 Optimal logaH2O and logfO2 and virtual Eh for target proteins
evdevH2O6 Chemical metrics and thermodynamic parameters for different background proteomes
evdevH2O7 Chemical and thermodynamic analysis of B. subtilis biofilm transcriptome and proteome
evdevH2O8 Organismal water content, proteomic nH2O, and optimal logaH2O for fruit fly development

runMaximAct sets up calculations with MaximAct to compute the optimal logaH2O and logfO2 that maximize the equilibrium activity of target proteins against a proteomic background. The available datasets are described below.

⁠TPPG17⁠

Phylostrata from Trigos et al., 2017; target proteins are mean amino acid compositions for each phylostratum.

⁠LMM16⁠

Gene ages from Liebeskind et al., 2016; target proteins are mean amino acid compositions for each gene age.

‘transcriptome’ or ‘proteome’

Futo et al., 2021; target proteins are mean amino acid compositions for each stage of biofilm development, weighted by gene or protein expression values.

‘fly_embryo’ or ‘fly_adult’

Differentially expressed proteins from Fabre et al., 2019; target proteins are all proteins (i.e. not mean compositions) with higher expression in embryo or adult flies.

⁠fly_development⁠

Fly developmental proteome from Casas-Vila et al., 2017; target proteins are mean amino acid compositions for each time point.

bg_organism can be one of ‘⁠Hsa⁠’ (Homo sapiens), ‘⁠Sce⁠’ (Saccharomyces cerevisiae), ‘⁠Eco⁠’ (Escherichia coli), ‘⁠Mja⁠’ (Methanocaldococcus jannaschii), ‘⁠Dme⁠’ (Drosophila melanogaster), or ‘⁠Bsu⁠’ (Bacillus subtilis). These are used to retrieve the background proteins used for the AA_background argument in MaximAct.
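
The role of seed and nbackground can be pictured with a small sketch. It assumes the usual set.seed()/sample() idiom for reproducible subsampling; this is an illustration, not necessarily the code used inside runMaximAct. The background data frame below is a made-up stand-in for a real background proteome.

```r
# Toy stand-in for a background proteome: one row per protein
# (hypothetical data; real backgrounds come from the bg_organism proteome)
background <- data.frame(
  protein = paste0("P", 1:5000),
  length = 100 + (1:5000 %% 900)
)

nbackground <- 2000  # default value of the nbackground argument
seeds <- 1:3         # short stand-in for the default seed = 1:100

# Draw one subsample of row indices per seed (assumed idiom)
subsamples <- lapply(seeds, function(seed) {
  set.seed(seed)
  sample(nrow(background), nbackground)
})

# Each subsample holds nbackground distinct rows
sapply(subsamples, length)

# Re-using a seed reproduces the same subsample
set.seed(seeds[1])
identical(sample(nrow(background), nbackground), subsamples[[1]])
```

Under this reading, each seed yields a different random background, so the spread of optimal values across seeds reflects the sensitivity of the result to the choice of background proteins.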

If writeFiles is TRUE, a PNG plot file is created and optimal logfO2 or logaH2O values for each seed and protein are saved in CSV files named by pasting together the dataset, bg_organism, and ‘⁠O2.csv⁠’ or ‘⁠H2O.csv⁠’, separated by ‘⁠_⁠’.
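
As a minimal sketch, the naming rule described above can be reproduced with paste(). This follows the description literally; the variable names are illustrative and the actual file names written by runMaximAct should be checked against its output.

```r
dataset <- "TPPG17"
bg_organism <- "Hsa"

# Join dataset, bg_organism, and the suffix with "_" as the separator
files <- paste(dataset, bg_organism, c("O2.csv", "H2O.csv"), sep = "_")
print(files)
# [1] "TPPG17_Hsa_O2.csv"  "TPPG17_Hsa_H2O.csv"
```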

LYSC_example prints the chemical formula, formation reaction, equilibrium constant, and activity product for the example protein (LYSC_CHICK) used in the text.

Files in extdata/evdevH2O/MaximAct

runall.sh

Script to run runMaximAct for different datasets and background proteomes.

[dataset]_H2O.csv

Output file with optimal values of logaH2O for each random seed.

[dataset]_O2.csv

Output file with optimal values of logfO2 for each random seed.

The output files are used in evdevH2O5 (thermodynamic analysis of phylostrata and gene ages) and evdevH2O7 (thermodynamic analysis of biofilm developmental stages). To reduce package size, the PNG files output by runMaximAct are not included here.
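
One way to locate and inspect these output files from an installed copy of the package is sketched below. It uses only base R; the column layout of the CSV files is not documented here, so the sketch just reads the first matching file and prints its structure.

```r
# Path to the MaximAct output directory in an installed JMDplots
# (system.file() returns "" if the package or directory is unavailable)
datadir <- system.file("extdata/evdevH2O/MaximAct", package = "JMDplots")

# Collect the files holding optimal logaH2O values
H2Ofiles <- character(0)
if (nzchar(datadir)) {
  H2Ofiles <- list.files(datadir, pattern = "_H2O\\.csv$", full.names = TRUE)
}

if (length(H2Ofiles) > 0) {
  # Read the first file and inspect it; no column names are assumed
  H2O <- read.csv(H2Ofiles[1])
  str(H2O)
}
```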

Files in extdata/evdevH2O/LMM16

mkaa.R

Script to sum amino acid composition of all proteins in each gene age category for each organism in the consensus tables of Liebeskind et al. (2016). This script requires files available from UniProt and Zenodo (doi:10.5281/zenodo.51708); see comments in the script for details.

reference_proteomes.csv

IDs of UniProt reference proteomes for 31 organisms in Liebeskind et al. (2016) (input file for mkaa.R).

modeAges_names.csv

Names of gene age categories for each organism (output by mkaa.R).

modeAges_aa.csv

Summed amino acid composition for proteins in each gene age category (output by mkaa.R).

Files in extdata/evdevH2O/JWN+21

mkAA.R

Script to assemble amino acid compositions from data files of James et al. (2021). This script requires files available from Figshare (doi:10.6084/m9.figshare.12037281.v1); see comments for details.

pfam_animal_nontrans_AA.csv.xz, pfam_animal_trans_AA.csv.xz, pfam_plant_nontrans_AA.csv.xz, pfam_plant_trans_AA.csv.xz

Output files from the script.

References

Casas-Vila N et al. (2017) The developmental proteome of Drosophila melanogaster. Genome Res. 27, 1273–1285. doi:10.1101/gr.213694.116

Dick JM (2022) A thermodynamic model for water activity and redox potential in evolution and development. J. Mol. Evol. 90, 182–199. doi:10.1007/s00239-022-10051-7

Fabre B, Korona D, Lees JG, Lazar I, Livneh I, Brunet M, Orengo CA, Russell S and Lilley KS (2019) Comparison of Drosophila melanogaster embryo and adult proteome by SWATH-MS reveals differential regulation of protein synthesis, degradation machinery, and metabolism modules. J. Proteome Res. 18, 2525–2534. doi:10.1021/acs.jproteome.9b00076

Futo M et al. (2021) Embryo-like features in developing Bacillus subtilis biofilms. Mol. Biol. Evol. 38, 31–47. doi:10.1093/molbev/msaa217

James JE, Willis SM, Nelson PG, Weibel C, Kosinski LJ and Masel J (2021) Universal and taxon-specific trends in protein sequences as a function of age. eLife 10, e57347. doi:10.7554/eLife.57347

Liebeskind BJ, McWhite CD and Marcotte EM (2016) Towards consensus gene ages. Genome Biol. Evol. 8, 1812–1823. doi:10.1093/gbe/evw113

Trigos AS, Pearson RB, Papenfuss AT and Goode DL (2017) Altered interactions between unicellular and multicellular genes drive hallmarks of transformation in a diverse range of solid tumors. Proc. Natl. Acad. Sci. 114, 6406–6411. doi:10.1073/pnas.1617743114

See Also

devodata for data sources for developmental proteomes. PS for functions used to retrieve phylostrata from UniProt IDs.

Examples

evdevH2O2()

# Start with a fresh plot device to run the next example
# (Something from above messes up split.screen())
dev.off()

MA <- runMaximAct("transcriptome", seed = 1:3, nbackground = 200, res = 15, writeFiles = FALSE)
# The logaH2O decreases at the end of the time course
H2O <- colMeans(MA$H2O)
stopifnot(H2O["14D"] > H2O["1M"])
stopifnot(H2O["1M"] > H2O["2M"])
# The final value of logaH2O is close to 0
stopifnot(round(H2O["2M"]) == 0)

[Package JMDplots version 1.2.19-14 Index]