PS {JMDplots}    R Documentation

Retrieve phylostrata for given UniProt IDs

Description

Retrieves the phylostrata for protein-coding genes according to Liebeskind et al. (2016) or Trigos et al. (2017).

Usage

  PS(uniprot, source = "TPPG17")

Arguments

uniprot

character, UniProt accession numbers

source

character, source of the phylostrata: ‘TPPG17’ (Trigos et al., 2017; the default) or ‘LMM16’ (Liebeskind et al., 2016)

Details

The phylostratum for each protein is found by matching the UniProt ID in one of these data files:

extdata/evdevH2O/phylostrata/TPPG17.csv.xz

This file has columns ‘GeneID’ (gene name), ‘Entrez’, ‘Entry’, and ‘Phylostrata’. Except for ‘Entry’, the values are from Dataset S1 of Trigos et al. (2017). UniProt accession numbers in ‘Entry’ were generated with the UniProt mapping tool, applied first to ‘Entrez’ and then to ‘GeneID’ for the genes left unmatched. ‘Entry’ is NA for genes that remain unmatched to any protein after both mapping steps.

extdata/evdevH2O/phylostrata/LMM16.csv.xz

This file has columns ‘UniProt’, ‘modeAge’, and ‘PS’. The data are from file main_HUMAN.csv in Gene-Ages v1.0 (https://zenodo.org/record/51708; Liebeskind et al., 2016). The modeAge values were converted to phylostrata 1 to 8 (‘PS’ column) in this order: Cellular_organisms, Euk_Archaea, Euk+Bac, Eukaryota, Opisthokonta, Eumetazoa, Vertebrata, Mammalia.
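The conversion and lookup described above can be sketched in a few lines of R. This is a minimal illustration, not the package's own code: the table ‘toy’ and its IDs are hypothetical placeholders standing in for LMM16.csv.xz, and the use of match() for both steps is an assumption about how such a lookup might be written.

```r
# Order of age categories given in the text; position = phylostratum (1 to 8)
age_levels <- c("Cellular_organisms", "Euk_Archaea", "Euk+Bac", "Eukaryota",
                "Opisthokonta", "Eumetazoa", "Vertebrata", "Mammalia")

# Hypothetical toy table standing in for LMM16.csv.xz (IDs are placeholders)
toy <- data.frame(
  UniProt = c("ID_A", "ID_B", "ID_C"),
  modeAge = c("Eukaryota", "Mammalia", "Cellular_organisms"),
  stringsAsFactors = FALSE
)

# Convert each modeAge to a phylostratum number by its position in age_levels
toy$PS <- match(toy$modeAge, age_levels)

# Look up phylostrata for query IDs; IDs absent from the table give NA
query <- c("ID_B", "ID_X", "ID_A")
ps <- toy$PS[match(query, toy$UniProt)]
print(ps)
```

Because match() returns NA for anything it cannot find, an ID missing from the table yields NA in this sketch rather than an error.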

References

Liebeskind BJ, McWhite CD and Marcotte EM (2016) Towards consensus gene ages. Genome Biol. Evol. 8, 1812–1823. doi:10.1093/gbe/evw113

Trigos AS, Pearson RB, Papenfuss AT and Goode DL (2017) Altered interactions between unicellular and multicellular genes drive hallmarks of transformation in a diverse range of solid tumors. Proc. Natl. Acad. Sci. 114, 6406–6411. doi:10.1073/pnas.1617743114

See Also

These data, but not this function, are used in the evdevH2O plots.

Examples

# Get protein expression data for one dataset
pd <- pdat_colorectal("JKMF10")
# Get phylostrata
PS(pd$pcomp$uniprot)

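# Compare the two sources
# Read both phylostrata tables from the installed package
PSdir <- system.file("extdata/evdevH2O/phylostrata", package = "JMDplots")
TPPG17 <- read.csv(file.path(PSdir, "TPPG17.csv.xz"))
LMM16 <- read.csv(file.path(PSdir, "LMM16.csv.xz"))
# UniProt IDs present in both tables
IDs <- intersect(TPPG17$Entry, LMM16$UniProt)
# Phylostrata from each source for the shared IDs
PS_TPPG17 <- TPPG17$Phylostrata[match(IDs, TPPG17$Entry)]
PS_LMM16 <- LMM16$PS[match(IDs, LMM16$UniProt)]
# Jitter the integer values so overlapping points are visible
plot(jitter(PS_TPPG17), jitter(PS_LMM16), pch = ".")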

[Package JMDplots version 1.2.19-14 Index]