PS {JMDplots} | R Documentation |
Retrieves the phylostrata for protein-coding genes according to Liebeskind et al. (2016) or Trigos et al. (2017).
PS(uniprot, source = "TPPG17")
uniprot | character, UniProt accession numbers |
source | character, ‘TPPG17’ or ‘LMM16’ |
The phylostratum for each protein is found by matching the UniProt ID in one of these data files:
extdata/evdevH2O/phylostrata/TPPG17.csv.xz
This file has columns ‘GeneID’ (gene name), ‘Entrez’, ‘Entry’, and ‘Phylostrata’. Except for ‘Entry’, the values are from Dataset S1 of Trigos et al. (2017). UniProt accession numbers in ‘Entry’ were generated with the UniProt mapping tool, first from ‘Entrez’ and then from ‘GeneID’ for genes that were still unmatched. ‘Entry’ is NA for genes that remain unmatched to any protein after both mapping steps.
extdata/evdevH2O/phylostrata/LMM16.csv.xz
This file has columns ‘UniProt’, ‘modeAge’, and ‘PS’.
The data are from file main_HUMAN.csv in Gene-Ages v1.0 (https://zenodo.org/record/51708; Liebeskind et al., 2016).
The modeAges were converted to phylostrata values 1-8 (‘PS’ column) in this order: Cellular_organisms, Euk_Archaea, Euk+Bac, Eukaryota, Opisthokonta, Eumetazoa, Vertebrata, Mammalia.
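The conversion from modeAge to phylostratum described above can be sketched as a lookup of each label's position in the ordered list. This is only an illustration under stated assumptions: `modeAge_to_PS` and `age_levels` are hypothetical names that are not part of JMDplots, and the input is toy data rather than the contents of main_HUMAN.csv.

```r
# Age categories in the order listed above; the position of a label
# in this vector is its phylostratum (1 to 8)
age_levels <- c("Cellular_organisms", "Euk_Archaea", "Euk+Bac", "Eukaryota",
                "Opisthokonta", "Eumetazoa", "Vertebrata", "Mammalia")

# Hypothetical helper (not a JMDplots function): convert modeAge labels to PS values
modeAge_to_PS <- function(modeAge) {
  # match() returns the index of each label in age_levels, or NA if it is absent
  match(modeAge, age_levels)
}

# Toy input; "Unknown" is not one of the eight categories, so it maps to NA
example_PS <- modeAge_to_PS(c("Eukaryota", "Mammalia", "Cellular_organisms", "Unknown"))
example_PS
```

Because `match()` returns NA for labels outside the list, unexpected categories show up as missing values instead of being assigned a phylostratum silently.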
Liebeskind BJ, McWhite CD and Marcotte EM (2016) Towards consensus gene ages. Genome Biol. Evol. 8, 1812–1823. doi:10.1093/gbe/evw113
Trigos AS, Pearson RB, Papenfuss AT and Goode DL (2017) Altered interactions between unicellular and multicellular genes drive hallmarks of transformation in a diverse range of solid tumors. Proc. Natl. Acad. Sci. 114, 6406–6411. doi:10.1073/pnas.1617743114
These data, but not this function, are used in the evdevH2O plots.
# Get protein expression data for one dataset
pd <- pdat_colorectal("JKMF10")
# Get phylostrata
PS(pd$pcomp$uniprot)
# Compare the two sources
PSdir <- system.file("extdata/evdevH2O/phylostrata", package = "JMDplots")
TPPG17 <- read.csv(file.path(PSdir, "TPPG17.csv.xz"))
LMM16 <- read.csv(file.path(PSdir, "LMM16.csv.xz"))
IDs <- intersect(TPPG17$Entry, LMM16$UniProt)
PS_TPPG17 <- TPPG17$Phylostrata[match(IDs, TPPG17$Entry)]
PS_LMM16 <- LMM16$PS[match(IDs, LMM16$UniProt)]
plot(jitter(PS_TPPG17), jitter(PS_LMM16), pch = ".")