MaximAct {JMDplots} | R Documentation |
Calculate the logaH2O and logfO2 that maximize the activities of given target proteins on a proteomic background.
MaximAct(AA_target, seed = 1:100, nbackground = 2000,
plot.it = TRUE, xlab = "sample", names = NULL,
O2 = c(-72, -67), H2O = c(-2, 6), pH = NULL,
AA_background = NULL)
getphyloaa(PS_source)
AA_target |
data frame, amino acid composition in the format of |
seed |
numeric, seeds to use for subsampling background proteins from human proteome |
nbackground |
numeric, number of background proteins to sample |
plot.it |
logical, make a plot? |
xlab |
character or expression, label for x axis |
names |
character, sample names |
O2 |
numeric, range of logfO2 values |
H2O |
numeric, range of logaH2O values |
pH |
numeric, range of pH values |
AA_background |
data frame, amino acid compositions of background proteins |
PS_source |
Source of phylostrata: ‘TPPG17’ or ‘LMM16’ |
MaximAct
is used to compute optimal values of logaH2O, logfO2, and optionally pH; that is, those that maximize the equilibrium activity of each of the target proteins against a background of other proteins.
The calculation is described by Dick (2022).
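The idea of picking the grid point with the highest activity can be illustrated with a toy example in base R. This is only a sketch: `toy_activity` is a made-up surface standing in for the equilibrium activity of one target protein, and none of the names below come from JMDplots.

```r
# Toy illustration only: 'toy_activity' is an invented surface,
# not a thermodynamic calculation and not the package's method
logfO2 <- seq(-72, -67, length.out = 6)
logaH2O <- seq(-2, 6, length.out = 9)
toy_activity <- function(O2, H2O) -(O2 + 70)^2 - (H2O - 1)^2

# Evaluate the surface on every grid point (rows: logfO2, columns: logaH2O)
A <- outer(logfO2, logaH2O, toy_activity)

# Find the grid cell with the highest value and report its coordinates
imax <- which(A == max(A), arr.ind = TRUE)
optimal <- c(logfO2 = logfO2[imax[1, "row"]], logaH2O = logaH2O[imax[1, "col"]])
optimal
```

In this sketch the optimum is limited to values that lie on the grid, so a finer grid gives a finer estimate at the cost of more computation.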
The amino acid compositions of the target and background proteins are given in AA_target
and AA_background
, respectively.
If AA_background
is NULL, the background corresponds to human proteins that have phylostrata assignments in both the Trigos and Liebeskind datasets (see PS
).
The entire human proteome is not used, in order to avoid including background proteins with unusual amino acid compositions.
The basis species used for the calculation are ‘QEC’ (glutamine - glutamic acid - cysteine - O2 - H2O) (see Dick et al., 2020), or ‘QEC+’ (the same basis species with the addition of H+) if pH
is not NULL.
The NULL default for pH
means that the system consists of uncharged species, and pH is therefore not defined.
The ranges in O2
, H2O
, and pH
can have an optional third value to specify the resolution; e.g. c(-70, -65, 100)
represents 100 points in the range [-70, -65] (see affinity
).
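As a rough illustration of the three-value form, the following base R sketch expands a range into grid points. The helper `expand_range` and its `default_n` fallback are hypothetical and are not part of JMDplots or CHNOSZ; the actual default resolution is whatever affinity uses.

```r
# Hypothetical helper (not a JMDplots or CHNOSZ function):
# expand c(min, max) or c(min, max, n) into equally spaced values
expand_range <- function(r, default_n = 50) {
  # 'default_n' is an arbitrary placeholder, not the real default resolution
  n <- if (length(r) >= 3) r[3] else default_n
  seq(r[1], r[2], length.out = n)
}

O2_values <- expand_range(c(-70, -65, 100))
length(O2_values)
range(O2_values)
```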
Due to memory constraints, a single calculation uses not the entire human proteome but a random subsample of nbackground
human proteins.
The subsampling is performed for each random seed in seed
.
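Seeded subsampling of this kind can be sketched in base R as follows. The `background` data frame is a hypothetical stand-in for a table of amino acid compositions, and the sampling call is an assumption about the general approach, not a copy of the package's code.

```r
# Hypothetical stand-in for a background table; a real one would also
# hold amino acid counts for each protein
background <- data.frame(protein = paste0("P", 1:5000))
nbackground <- 2000
seeds <- 1:3

# One subsample per seed; setting the seed makes each subsample reproducible
subsamples <- lapply(seeds, function(s) {
  set.seed(s)
  background[sample(nrow(background), nbackground), , drop = FALSE]
})

# Number of proteins in each subsample
sapply(subsamples, nrow)
```

Repeating the calculation over several seeds and averaging the results reduces the dependence on any single random subsample.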
The results are returned in a list with components O2
and H2O
.
By default, a pair of plots is made and updated during the calculation; at the end, the mean values of logfO2 and logaH2O are plotted as red lines.
Change plot.it
to FALSE to prevent making the plots.
The protein names (used to label the x axis of the plots and for the column names in the output) are taken from AA_target$protein
or the names
argument if it is not NULL.
getphyloaa
computes the mean amino acid composition of proteins in each phylostratum using phylostrata from Trigos et al. (2017) (‘TPPG17’) or Liebeskind et al. (2016) (‘LMM16’).
Dick JM, Yu M and Tan J (2020) Uncovering chemical signatures of salinity gradients through compositional analysis of protein sequences. Biogeosciences 17, 6145–6162. doi:10.5194/bg-17-6145-2020
Dick JM (2022) A thermodynamic model for water activity and redox potential in evolution and development. J. Mol. Evol. doi:10.1007/s00239-022-10051-7
Liebeskind BJ, McWhite CD and Marcotte EM (2016) Towards consensus gene ages. Genome Biol. Evol. 8, 1812–1823. doi:10.1093/gbe/evw113
Trigos AS, Pearson RB, Papenfuss AT and Goode DL (2017) Altered interactions between unicellular and multicellular genes drive hallmarks of transformation in a diverse range of solid tumors. Proc. Natl. Acad. Sci. 114, 6406–6411. doi:10.1073/pnas.1617743114
runMaximAct
, used to set up the calculations reported in Dick (2022).
# Run MaximAct() for mean amino acid compositions of proteins in phylostrata
AA_target <- getphyloaa("TPPG17")
MA <- MaximAct(AA_target, seed = 1:3, nbackground = 200,
  O2 = c(-72, -65, 15), H2O = c(-2, 5, 15))
# logfO2 increases and logaH2O decreases between phylostrata 1 and 16
O2 <- colMeans(MA$O2)
stopifnot(O2["16"] - O2["1"] > 0)
H2O <- colMeans(MA$H2O)
stopifnot(H2O["16"] - H2O["1"] < 0)
# The final value of logaH2O is close to 0
stopifnot(round(H2O["16"]) == 0)