metrics {canprot} | R Documentation
Calculate chemical metrics for proteins from their amino acid compositions.
Zc(AAcomp, ...)
nO2(AAcomp, basis = "QEC", ...)
nH2O(AAcomp, basis = "QEC", terminal_H2O = 0)
GRAVY(AAcomp, ...)
pI(AAcomp, terminal_H2O = 1, ...)
MW(AAcomp, terminal_H2O = 0, ...)
pMW(AAcomp, terminal_H2O = 1, ...)
V0(AAcomp, terminal_H2O = 0, ...)
pV0(AAcomp, terminal_H2O = 1, ...)
V0g(AAcomp, ...)
Density(AAcomp, ...)
S0(AAcomp, terminal_H2O = 0, ...)
pS0(AAcomp, terminal_H2O = 1, ...)
S0g(AAcomp, ...)
SV(AAcomp, ...)
Zcg(AAcomp, ...)
nH2Og(AAcomp, ...)
nO2g(AAcomp, ...)
HC(AAcomp, ...)
NC(AAcomp, ...)
OC(AAcomp, ...)
SC(AAcomp, ...)
nC(AAcomp, ...)
pnC(AAcomp, ...)
plength(AAcomp, ...)
Cost(AAcomp, ...)
RespiratoryCost(AAcomp, ...)
FermentativeCost(AAcomp, ...)
B20Cost(AAcomp, ...)
Y20Cost(AAcomp, ...)
H11Cost(AAcomp, ...)
cplab
AAcomp | data frame, amino acid compositions
... | ignored additional arguments
basis | character, set of basis species
terminal_H2O | numeric, number of pairs of terminal groups
Columns in AAcomp should be named with the three-letter abbreviations for the amino acids.
Matching of the abbreviations is case-insensitive; e.g., ‘Ala’, ‘ALA’, and ‘ala’ all refer to alanine.
Metrics are normalized per amino acid residue except for Zc, pI, Density, plength, and other functions starting with p (for protein).
The contribution of protein terminal groups (-H and -OH) to residue-normalized metrics is turned off by default.
Set terminal_H2O to 1 (or to the number of polypeptide chains, if greater than one) to include their contribution.
The metrics are described below:
Zc
Average oxidation state of carbon (ZC) (Dick, 2014). This metric is independent of the choice of basis species. Note that ZC is normalized by number of carbon atoms, not by number of residues.
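The normalization by carbon atoms can be illustrated with a minimal sketch in base R. It assumes the formula for a neutral species, ZC = (-nH + 3nN + 2nO + 2nS) / nC, as given by Dick (2014); the names residues, counts, and zc are hypothetical helpers for this illustration and are not part of canprot.

```r
# Elemental counts per amino acid residue (amino acid formula minus H2O)
residues <- rbind(
  Gly = c(C = 2, H = 3, N = 1, O = 1, S = 0),
  Ala = c(C = 3, H = 5, N = 1, O = 1, S = 0)
)

# Composition of the tripeptide Gly-Ala-Gly
counts <- c(Gly = 2, Ala = 1)

# Total elemental formula of the residues (each row scaled by its count)
formula <- colSums(residues * counts[rownames(residues)])

# Hypothetical helper: carbon oxidation state of a neutral formula
zc <- function(f) {
  unname((-f["H"] + 3 * f["N"] + 2 * f["O"] + 2 * f["S"]) / f["C"])
}

zc_residues <- zc(formula)
# Adding one terminal H2O changes H and O but not the value of Zc
zc_protein <- zc(formula + c(C = 0, H = 2, N = 0, O = 1, S = 0))
```

In this sketch the contributions of the two H atoms and one O atom of the terminal H2O cancel, so both values are the same.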
nO2
Stoichiometric oxidation state (nO2 per residue).
The available basis species are:
‘QEC’ - glutamine, glutamic acid, cysteine, H2O, O2 (Dick et al., 2020)
‘QCa’ - glutamine, cysteine, acetic acid, H2O, O2
nH2O
Stoichiometric hydration state (nH2O per residue). The basis species also affect this calculation.
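One way to picture the stoichiometric metrics is as the coefficients of O2 and H2O in a reaction that forms the protein from the basis species. The sketch below solves that mass balance for the ‘QEC’ basis in base R, using standard elemental formulas for the basis species. It is an illustration under these assumptions and not necessarily how canprot computes the values; the names basis_QEC, protein, and coeffs are hypothetical.

```r
# Elemental formulas of the 'QEC' basis species (columns) by element (rows)
basis_QEC <- cbind(
  glutamine     = c(C = 5, H = 10, N = 2, O = 3, S = 0),
  glutamic_acid = c(C = 5, H = 9,  N = 1, O = 4, S = 0),
  cysteine      = c(C = 3, H = 7,  N = 1, O = 2, S = 1),
  H2O           = c(C = 0, H = 2,  N = 0, O = 1, S = 0),
  O2            = c(C = 0, H = 0,  N = 0, O = 2, S = 0)
)

# Residues of Gly-Ala-Gly without terminal groups: C7 H11 N3 O3
protein <- c(C = 7, H = 11, N = 3, O = 3, S = 0)
n_residues <- 3

# Solve basis_QEC %*% coeffs = protein for the number of each basis species
coeffs <- setNames(as.vector(solve(basis_QEC, protein)), colnames(basis_QEC))

# Per-residue coefficients of H2O and O2 in this sketch
nH2O_sketch <- unname(coeffs["H2O"]) / n_residues
nO2_sketch  <- unname(coeffs["O2"]) / n_residues
```

Replacing a column of the matrix (for example, swapping glutamic acid for acetic acid to mimic ‘QCa’) changes the solved coefficients, which is why the choice of basis species affects both nO2 and nH2O.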
GRAVY
Grand average of hydropathy. Values of the hydropathy index for individual amino acids are from Kyte and Doolittle (1982).
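As a rough illustration of a residue-normalized metric, the following base R sketch averages the Kyte and Doolittle (1982) hydropathy index over an amino acid composition, with case-insensitive matching of column names. The function name gravy_sketch is hypothetical and the code is not taken from canprot.

```r
# Kyte and Doolittle (1982) hydropathy index by three-letter abbreviation
hydropathy <- c(
  Ala = 1.8, Arg = -4.5, Asn = -3.5, Asp = -3.5, Cys = 2.5,
  Gln = -3.5, Glu = -3.5, Gly = -0.4, His = -3.2, Ile = 4.5,
  Leu = 3.8, Lys = -3.9, Met = 1.9, Phe = 2.8, Pro = -1.6,
  Ser = -0.8, Thr = -0.7, Trp = -0.9, Tyr = -1.3, Val = 4.2
)

# Hypothetical helper: mean hydropathy per residue for each row of AAcomp
gravy_sketch <- function(AAcomp) {
  # Case-insensitive match of the abbreviations to the column names
  idx <- match(tolower(names(hydropathy)), tolower(colnames(AAcomp)))
  found <- !is.na(idx)
  counts <- as.matrix(AAcomp[, idx[found], drop = FALSE])
  # Count-weighted sum of hydropathy values divided by the number of residues
  unname(as.vector(counts %*% hydropathy[found]) / rowSums(counts))
}

# Column names in mixed case still match alanine and glycine
aa_mixed <- data.frame(ALA = 1, gly = 2)
gravy_value <- gravy_sketch(aa_mixed)
```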
pI
Isoelectric point.
The net charge for each ionizable group was pre-calculated from pH 0 to 14 at intervals of 0.01.
The isoelectric point is found as the pH where the sum of charges of all groups in the protein is closest to zero.
The pK values for the terminal groups and sidechains are taken from Bjellqvist et al. (1993) and Bjellqvist et al. (1994); note that the calculation does not implement position-specific adjustments described in the latter paper.
The number of N- and C-terminal groups is taken from terminal_H2O.
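The grid search described for pI can be sketched in base R as follows. The pK values in this block are illustrative placeholders and should not be read as the exact values used by canprot; pI_sketch, pK_pos, and pK_neg are hypothetical names introduced for this example.

```r
# ASSUMPTION: illustrative pK values, not necessarily those used by canprot
pK_pos <- c(Nterm = 7.5, Lys = 10.0, Arg = 12.0, His = 5.98)  # +1 when protonated
pK_neg <- c(Cterm = 3.55, Asp = 4.05, Glu = 4.45,
            Cys = 9.0, Tyr = 10.0)                            # -1 when deprotonated

# Hypothetical helper: n_pos and n_neg are named counts of ionizable groups
pI_sketch <- function(n_pos, n_neg, pH = seq(0, 14, by = 0.01)) {
  # Net charge of the protein at each pH value on the grid
  charge <- sapply(pH, function(x) {
    pos <- sum(n_pos / (1 + 10^(x - pK_pos[names(n_pos)])))
    neg <- sum(n_neg / (1 + 10^(pK_neg[names(n_neg)] - x)))
    pos - neg
  })
  # The pH where the net charge is closest to zero
  pH[which.min(abs(charge))]
}

# One pair of terminal groups and no ionizable sidechains
pI_plain <- pI_sketch(n_pos = c(Nterm = 1), n_neg = c(Cterm = 1))
# Adding a lysine sidechain shifts the isoelectric point upward
pI_lys <- pI_sketch(n_pos = c(Nterm = 1, Lys = 1), n_neg = c(Cterm = 1))
```

A grid with a step of 0.01 limits the resolution of the result to that step size, which matches the interval mentioned in the description above.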
MW
Molecular weight.
pMW
Molecular weight per protein.
V0
Standard molal volume. The values are derived from group contributions of amino acid sidechains and protein backbones (Dick et al., 2006).
pV0
Standard molal volume per protein.
V0g
Specific volume (reciprocal density).
Density
Density (MW / V0).
S0
Standard molal entropy. The values are derived from group contributions of amino acid sidechains and protein backbones (Dick et al., 2006).
pS0
Standard molal entropy per protein.
S0g
Specific entropy.
SV
Entropy density.
Zcg
Carbon oxidation state per gram.
nO2g
Stoichiometric oxidation state per gram.
nH2Og
Stoichiometric hydration state per gram.
HC
H/C ratio (not counting terminal -H and -OH groups).
NC
N/C ratio.
OC
O/C ratio (not counting terminal -H and -OH groups).
SC
S/C ratio.
nC
Number of carbon atoms per residue.
pnC
Number of carbon atoms per protein.
plength
Protein length (number of amino acid residues).
Cost
Metabolic cost (Akashi and Gojobori, 2002).
RespiratoryCost
Respiratory cost (Wagner, 2005).
FermentativeCost
Fermentative cost (Wagner, 2005).
B20Cost
Biosynthetic cost in bacteria (Zhang et al., 2018).
Y20Cost
Biosynthetic cost in yeast (Zhang et al., 2018).
H11Cost
Biosynthetic cost in humans (Zhang et al., 2018).
... is provided to permit get or do.call constructions with the same arguments for all metrics.
For instance, a terminal_H2O argument can be supplied to either Zc or nH2O, but it only has an effect on the latter.
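A short sketch of such a construction is given here, assuming the canprot package is installed and attached. It uses only the signatures shown in the usage listing; the variable names metric_names, shared_args, with_termini, and without_termini are hypothetical.

```r
library(canprot)

# Amino acid composition of the tripeptide Gly-Ala-Gly
aa <- data.frame(Ala = 1, Gly = 2)

# One shared argument list for every metric
shared_args <- list(AAcomp = aa, terminal_H2O = 1)

# Look up each function by name and call it with the same arguments
metric_names <- c("Zc", "nH2O")
with_termini <- sapply(metric_names, function(m) do.call(get(m), shared_args))

# The same metrics with the default treatment of terminal groups
without_termini <- sapply(metric_names, function(m) get(m)(aa))
```

Based on the description above, Zc is expected to be the same in both results, while nH2O should differ because it honors terminal_H2O.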
cplab is a list of formatted labels for each of the chemical metrics listed here.
A check in the code ensures that the names of the functions for calculating metrics and the names of the labels listed in cplab are identical.
Akashi H, Gojobori T. 2002. Metabolic efficiency and amino acid composition in the proteomes of Escherichia coli and Bacillus subtilis. Proceedings of the National Academy of Sciences 99(6): 3695–3700. doi:10.1073/pnas.062526999
Bjellqvist B, Hughes GJ, Pasquali C, Paquet N, Ravier F, Sanchez J-C, Frutiger S, Hochstrasser D. 1993. The focusing positions of polypeptides in immobilized pH gradients can be predicted from their amino acid sequences. Electrophoresis 14: 1023–1031. doi:10.1002/elps.11501401163
Bjellqvist B, Basse B, Olsen E, Celis JE. 1994. Reference points for comparisons of two-dimensional maps of proteins from different human cell types defined in a pH scale where isoelectric points correlate with polypeptide compositions. Electrophoresis 15: 529–539. doi:10.1002/elps.1150150171
Dick JM, LaRowe DE, Helgeson HC. 2006. Temperature, pressure, and electrochemical constraints on protein speciation: Group additivity calculation of the standard molal thermodynamic properties of ionized unfolded proteins. Biogeosciences 3(3): 311–336. doi:10.5194/bg-3-311-2006
Dick JM. 2014. Average oxidation state of carbon in proteins. J. R. Soc. Interface 11: 20131095. doi:10.1098/rsif.2013.1095
Dick JM, Yu M, Tan J. 2020. Uncovering chemical signatures of salinity gradients through compositional analysis of protein sequences. Biogeosciences 17: 6145–6162. doi:10.5194/bg-17-6145-2020
Kyte J, Doolittle RF. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157: 105–132. doi:10.1016/0022-2836(82)90515-0
Wagner A. 2005. Energy constraints on the evolution of gene expression. Molecular Biology and Evolution 22(6): 1365–1374. doi:10.1093/molbev/msi126
Zhang H, Wang Y, Li J, Chen H, He X, Zhang H, Liang H, Lu J. 2018. Biosynthetic energy cost for amino acids decreases in cancer evolution. Nature Communications 9(1): 4124. doi:10.1038/s41467-018-06461-1
For calculation of ZC from an elemental formula (instead of amino acid composition), see the ZC function in CHNOSZ.
calc_metrics is a wrapper to calculate one or more metrics specified in an argument.
# Amino acid composition of a tripeptide (Gly-Ala-Gly)
aa <- data.frame(Ala = 1, Gly = 2)
# Calculate Zc, nH2O, and length
Zc(aa)
nH2O(aa)
plength(aa)
# Make a plot with formatted labels
plot(Zc(aa), nH2O(aa), xlab = cplab$Zc, ylab = cplab$nH2O)