add.protein {CHNOSZ}    R Documentation

Amino Acid Compositions of Proteins


Description

Functions to get amino acid compositions and add them to the protein list (thermo$protein) for use by other functions.


Usage

  add.protein(aa)
  seq2aa(protein, sequence)
  aasum(aa, abundance = 1, average = FALSE, protein = NULL, organism = NULL)



Arguments

aa
data frame, amino acid composition in the format of thermo$protein

protein
character, name of protein; numeric, indices of proteins (row numbers of thermo$protein)

sequence
character, protein sequence

...
additional arguments passed to read.csv

abundance
numeric, abundances of proteins

average
logical, return the weighted average of amino acid counts?

organism
character, name of organism


Details

A protein in CHNOSZ is defined by its identifying information and its amino acid composition, both stored in thermo$protein. The names of proteins in CHNOSZ are distinguished from those of other chemical species by an underscore character ("_") that separates two identifiers, referred to as the protein and the organism. An example is LYSC_CHICK. The purpose of the functions described here is to identify proteins and work with their amino acid compositions. From the amino acid compositions, the thermodynamic properties of the proteins can be estimated by group additivity.
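To illustrate the naming convention, a name can be split at the underscore with base R. This is only a minimal sketch and is not how CHNOSZ itself parses names; the variable names are chosen here for illustration.

```r
# split a CHNOSZ-style protein name into its two identifiers (base R only)
name <- "LYSC_CHICK"
parts <- strsplit(name, "_", fixed = TRUE)[[1]]
protein <- parts[1]    # the identifier before the underscore
organism <- parts[2]   # the identifier after the underscore
```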

seq2aa returns a data frame of amino acid composition, in the format of thermo$protein, corresponding to the provided sequence. Here, the protein argument indicates the name of the protein with an underscore (e.g. LYSC_CHICK).
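The idea of turning a sequence into an amino acid composition can be sketched in base R as follows. This is an illustration under stated assumptions, not the implementation of seq2aa: count_aa is a hypothetical helper, it returns a named vector rather than a data frame in the format of thermo$protein, and it silently ignores characters outside the 20 standard one-letter codes.

```r
# count the 20 standard amino acids in a one-letter sequence (base R only)
# count_aa is a hypothetical helper, not a CHNOSZ function
count_aa <- function(sequence) {
  letters20 <- c("A", "C", "D", "E", "F", "G", "H", "I", "K", "L",
                 "M", "N", "P", "Q", "R", "S", "T", "V", "W", "Y")
  # split the sequence into single characters
  residues <- strsplit(toupper(sequence), "")[[1]]
  # a factor with fixed levels keeps zero counts for absent amino acids
  counts <- table(factor(residues, levels = letters20))
  # return a named integer vector
  setNames(as.integer(counts), letters20)
}

counts <- count_aa("LAAGKVEDSD")
counts[c("A", "D", "L")]
```

The counts sum to the sequence length when every character is a standard amino acid code.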

aasum returns a data frame representing the sum of the amino acid compositions in the rows of the input aa data frame. Each amino acid composition is multiplied by the corresponding abundance; that argument is recycled to match the number of rows of aa. If average is TRUE, the final sum is divided by the number of input compositions. The name used in the output is taken from the first row of aa, or from protein and organism if they are specified.
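The arithmetic described for aasum can be sketched with a toy table in base R. This is a hedged illustration, not the aasum code: the two columns Ala and Gly and their values are made up, the real thermo$protein format has more columns, and the averaging step simply follows the description above (division by the number of input compositions).

```r
# toy compositions: one row per protein, one column per amino acid (made-up values)
aa <- data.frame(Ala = c(2, 4), Gly = c(1, 3))

# recycle the abundance to the number of rows, as described for aasum
abundance <- rep(c(1, 2), length.out = nrow(aa))

# multiply each row by its abundance
weighted <- sweep(as.matrix(aa), 1, abundance, "*")

# sum over proteins
total <- colSums(weighted)

# optional averaging: divide by the number of input compositions
avg <- total / nrow(aa)
```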

Given amino acid compositions returned by the *aa functions described above, add.protein adds them to thermo$protein for use by other functions in CHNOSZ. If a protein in aa has the same name as one already in thermo$protein, its amino acid composition is replaced. The value returned by this function is the row numbers of thermo$protein that were added and/or replaced.
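The add-or-replace behavior can be sketched on a small stand-alone table. This is an assumption-laden illustration and not the add.protein code: tab stands in for thermo$protein, it has only one composition column, key is a hypothetical helper, and the counts are made up.

```r
# toy protein table standing in for thermo$protein (made-up counts)
tab <- data.frame(protein = c("LYSC", "RNAS1"), organism = c("CHICK", "BOVIN"),
                  Ala = c(1, 2), stringsAsFactors = FALSE)
# new compositions: one name already present, one new
new <- data.frame(protein = c("LYSC", "GAJU"), organism = c("CHICK", "HUMAN"),
                  Ala = c(99, 2), stringsAsFactors = FALSE)

# hypothetical helper: build the protein_organism name for each row
key <- function(df) paste(df$protein, df$organism, sep = "_")

# find which new names already exist (NA where the name is not yet present)
hit <- match(key(new), key(tab))

# replace the rows that already exist
tab[hit[!is.na(hit)], ] <- new[!is.na(hit), ]

# append the rows that are new and reset the row names
tab <- rbind(tab, new[is.na(hit), ])
rownames(tab) <- NULL

# row numbers that were added and/or replaced
rows <- match(key(new), key(tab))
```

Here rows identifies the replaced row for LYSC_CHICK and the appended row for GAJU_HUMAN, mirroring the kind of value add.protein is described as returning.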

See Also

read.fasta, uniprot.aa, more.aa for other ways of getting amino acid compositions.

pinfo for protein-level functions (length, chemical formulas, reaction coefficients of basis species).

read.expr for working with protein abundance and subcellular localization data.

protein for examples of affinity calculations and diagrams.


Examples

library(CHNOSZ)
# manually adding a new protein
# Human Gastric juice peptide 1
aa <- seq2aa("GAJU_HUMAN", "LAAGKVEDSD")
ip <- add.protein(aa)
# the chemical formula of this peptide
as.chemical.formula(protein.formula(ip)) # "C41H69N11O18"
# we can also calculate a formula without using add.protein
aa <- seq2aa("pentapeptide_test", "ANLSG")
as.chemical.formula(protein.formula(aa))

[Package CHNOSZ version 1.1.0 Index]