This file is associated with the course on Functional Analysis,
accessible at this link: https://moodle.france-bioinformatique.fr/course/view.php?id=28.
This file shows the code to perform over-representation analysis and
gene set enrichment analysis. The analyses are based on the
clusterProfiler
package:
citation("clusterProfiler")
## Please cite S. Xu (2024) for using clusterProfiler. In addition, please
## cite G. Yu (2010) when using GOSemSim, G. Yu (2015) when using DOSE and
## G. Yu (2015) when using ChIPseeker.
##
## S Xu, E Hu, Y Cai, Z Xie, X Luo, L Zhan, W Tang, Q Wang, B Liu, R
## Wang, W Xie, T Wu, L Xie, G Yu. Using clusterProfiler to characterize
## multiomics data. Nature Protocols. 2024,
## doi:10.1038/s41596-024-01020-z
##
## T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L
## Zhan, X Fu, S Liu, X Bo, and G Yu. clusterProfiler 4.0: A universal
## enrichment tool for interpreting omics data. The Innovation. 2021,
## 2(3):100141
##
## Guangchuang Yu, Li-Gen Wang, Yanyan Han and Qing-Yu He.
## clusterProfiler: an R package for comparing biological themes among
## gene clusters. OMICS: A Journal of Integrative Biology 2012,
## 16(5):284-287
##
## To see these entries in BibTeX format, use 'print(<citation>,
## bibtex=TRUE)', 'toBibtex(.)', or set
## 'options(citation.bibtex.max=999)'.
We load packages of interest:
library(clusterProfiler) # Make enrichment analysis
library(enrichplot) # Awesome graphs
library(org.At.tair.db) # A. Thaliana annotation
.libPaths()
## [1] "/shared/home/aonfroy/R/x86_64-conda-linux-gnu-library/4.4"
## [2] "/shared/ifbstor1/software/miniconda/envs/r-4.4.1/lib/R/library"
We load the data from the differential expression analysis.
deseq_genes = read.table(
file = "./tables/KOvsWT.complete.txt",
sep = "\t",
header = TRUE
)
We assess the dimensions of the data
colnames(deseq_genes)
## [1] "Id" "WT1" "WT2" "WT3"
## [5] "KO1" "KO2" "KO3" "norm.WT1"
## [9] "norm.WT2" "norm.WT3" "norm.KO1" "norm.KO2"
## [13] "norm.KO3" "baseMean" "WT" "KO"
## [17] "FoldChange" "log2FoldChange" "stat" "pvalue"
## [21] "padj" "dispGeneEst" "dispFit" "dispMAP"
## [25] "dispersion" "betaConv" "maxCooks"
nrow(deseq_genes)
## [1] 27655
head(deseq_genes$Id)
## [1] "gene:AT1G01010" "gene:AT1G01020" "gene:AT1G01030" "gene:AT1G01040"
## [5] "gene:AT1G01050" "gene:AT1G01060"
We explore the data:
deseq_genes[deseq_genes$Id == "gene:AT1G61580", ]
## Id WT1 WT2 WT3 KO1 KO2 KO3 norm.WT1 norm.WT2 norm.WT3 norm.KO1
## 5120 gene:AT1G61580 248 231 205 119 131 125 229 210 215 131
## norm.KO2 norm.KO3 baseMean WT KO FoldChange log2FoldChange stat
## 5120 131 123 173.19 218 128 0.588 -0.766 -4.48
## pvalue padj dispGeneEst dispFit dispMAP dispersion betaConv
## 5120 7.465947e-06 0.0001156724 0 0.0311 0.0149 0.0149 TRUE
## maxCooks
## 5120 0.0222
For a computer, gene:AT1G01010
is not
AT1G01010
. We fix identifiers for the computer:
head(deseq_genes$Id)
## [1] "gene:AT1G01010" "gene:AT1G01020" "gene:AT1G01030" "gene:AT1G01040"
## [5] "gene:AT1G01050" "gene:AT1G01060"
deseq_genes$Id = sub(pattern = "gene:",
replacement = "",
x = deseq_genes$Id)
head(deseq_genes$Id)
## [1] "AT1G01010" "AT1G01020" "AT1G01030" "AT1G01040" "AT1G01050" "AT1G01060"
For a human, AT1G61580
is horrible to remember. We can
add human-readable names. The latter are called “symbol”.
annotation = clusterProfiler::bitr(
geneID = deseq_genes$Id, # Our gene list
fromType = "TAIR", # We have TAIR ID
toType = c("ENTREZID", "SYMBOL"), # What we want
OrgDb = org.At.tair.db, # Our annotation
drop = FALSE
)
## 'select()' returned 1:many mapping between keys and columns
## Warning in clusterProfiler::bitr(geneID = deseq_genes$Id, fromType = "TAIR", :
## 3.42% of input gene IDs are fail to map...
head(annotation)
## TAIR ENTREZID SYMBOL
## 1 AT1G01010 839580 ANAC001
## 2 AT1G01010 839580 NAC001
## 3 AT1G01010 839580 NTL10
## 4 AT1G01020 839569 ARV1
## 5 AT1G01030 839321 NGA3
## 6 AT1G01040 839574 ASU1
dim(annotation)
## [1] 38947 3
We add the translation to the result table
deseq_genes_with_symbol = merge(
x = deseq_genes,
y = annotation,
by.x = "Id", # In deseq_genes, TAIR IDs are stored in the Id column
by.y = "TAIR" # In annotation, TAIR IDs are stored in the TAIR column
)
dim(deseq_genes)
## [1] 27655 27
dim(deseq_genes_with_symbol)
## [1] 38947 29
head(deseq_genes_with_symbol[, c("Id", "SYMBOL", "ENTREZID")])
## Id SYMBOL ENTREZID
## 1 AT1G01010 ANAC001 839580
## 2 AT1G01010 NAC001 839580
## 3 AT1G01010 NTL10 839580
## 4 AT1G01020 ARV1 839569
## 5 AT1G01030 NGA3 839321
## 6 AT1G01040 ASU1 839574
Adding symbol changes the number of dimensions of the table:
dim(deseq_genes)
## [1] 27655 27
dim(deseq_genes_with_symbol)
## [1] 38947 29
We are going to use the original table, so we clean the annotation and annotated table:
rm(annotation, deseq_genes_with_symbol)
We need to filter differentially expressed genes in order to perform ORA.
dim(deseq_genes)
## [1] 27655 27
de_genes = deseq_genes[deseq_genes[, "padj"] <= 0.001, ]
de_genes = de_genes[!is.na(de_genes[, "log2FoldChange"]), ]
dim(de_genes)
## [1] 1807 27
We perform the ORA using the gene ontology for cellular components:
ego = clusterProfiler::enrichGO(
gene = de_genes$Id, # gene list
universe = deseq_genes$Id, # all genes
OrgDb = org.At.tair.db, # annotation
keyType = "TAIR", # nature of the genes ID
ont = "CC", # Cellular Components
pvalueCutoff = 1, # significance threshold (take all)
pAdjustMethod = "BH", # p-value adjustment method
readable = TRUE # For human beings
)
# View(ego)
How it looks like ?
head(ego@result, 3)
## ID Description GeneRatio BgRatio
## GO:0055035 GO:0055035 plastid thylakoid membrane 74/1785 357/26909
## GO:0009535 GO:0009535 chloroplast thylakoid membrane 73/1785 349/26909
## GO:0019867 GO:0019867 outer membrane 85/1785 477/26909
## pvalue p.adjust qvalue
## GO:0055035 1.033262e-18 1.443340e-16 1.310881e-16
## GO:0009535 1.042123e-18 1.443340e-16 1.310881e-16
## GO:0019867 5.207312e-17 3.606064e-15 3.275125e-15
## geneID
## GO:0055035 PSAD-2/EMB2784/PnsL2/NDF1/NDF6/LHCA2*1/ATLFNR2/DRT112/COR413-TM1/AB180/AB165/AB140/PSAF/PRK/CP22/PSAH-2/PSAG/RBCS1A/cS23z/CRR23/LIL8/SPPA/NDH-O/CRR3/LHCB2/LHCB2/NA/NA/PSBW/DEG11/DRN1/ATBCA1/EMB3119/LHCB4.2/ATNCED3/ATHM4/AOC1/ATPRX/CCL/DEG13/bL12cz/ATLOX2/PTAC16/CAB4/CHL/LHCA1/CYP20-3/OEP6/ATPC1/ATPD/PSAL/ATRAB8D/PSBQ/PSAE-1/PDE334/NDH-M/CURT1D/FKBP16-2/ATPLAT1/LHCB4.1/ATFER1/BETA/NA/PHGAP2/ATHMA8/NA/ATGSL1/NDH18/PAM68L/LHCB3/PSAN/PSAB/PSAA/AthCF1beta
## GO:0009535 PSAD-2/EMB2784/PnsL2/NDF1/NDF6/LHCA2*1/ATLFNR2/COR413-TM1/AB180/AB165/AB140/PSAF/PRK/CP22/PSAH-2/PSAG/RBCS1A/cS23z/CRR23/LIL8/SPPA/NDH-O/CRR3/LHCB2/LHCB2/NA/NA/PSBW/DEG11/DRN1/ATBCA1/EMB3119/LHCB4.2/ATNCED3/ATHM4/AOC1/ATPRX/CCL/DEG13/bL12cz/ATLOX2/PTAC16/CAB4/CHL/LHCA1/CYP20-3/OEP6/ATPC1/ATPD/PSAL/ATRAB8D/PSBQ/PSAE-1/PDE334/NDH-M/CURT1D/FKBP16-2/ATPLAT1/LHCB4.1/ATFER1/BETA/NA/PHGAP2/ATHMA8/NA/ATGSL1/NDH18/PAM68L/LHCB3/PSAN/PSAB/PSAA/AthCF1beta
## GO:0019867 PSAD-2/FZL/EMB2784/PnsL2/NDF1/NDF6/LHCA2*1/ATLFNR2/AtTTM2/COR413-TM1/AB180/AB165/AB140/PSAF/PRK/CP22/NA/PSAH-2/PSAG/RBCS1A/cS23z/CRR23/LIL8/AtTTM1/SPPA/NDH-O/ARC3/CRR3/LHCB2/LHCB2/NA/NA/PSBW/DEG11/DRN1/ATBCA1/EMB3119/LHCB4.2/ATNCED3/ATHM4/AOC1/ATPRX/CCL/DEG13/bL12cz/AtOM47/ATLOX2/PTAC16/CAB4/CHL/AtBCS1/LHCA1/CYP20-3/OEP6/ATPC1/ATPD/AZI3/PSAL/ATRAB8D/PSBQ/PSAE-1/KOC1/PDE334/NDH-M/CURT1D/FKBP16-2/ATPLAT1/LHCB4.1/ATFER1/BETA/NA/PHGAP2/ATHMA8/ATKO1/NA/ATGSL1/NDH18/At-NEET/PAM68L/NA/LHCB3/PSAN/PSAB/PSAA/AthCF1beta
## Count
## GO:0055035 74
## GO:0009535 73
## GO:0019867 85
We visualize results:
graphics::barplot(ego,
showCategory = 15)
enrichplot::dotplot(ego,
showCategory = 15)
We search for enriched terms that related to roots.
grep(ego@result$Description,
pattern = "root",
value = TRUE)
## [1] "root hair"
There is only one. What about the result regarding this gene set ?
ego@result[ego@result$Description == "root hair", ]
## ID Description GeneRatio BgRatio pvalue p.adjust
## GO:0035618 GO:0035618 root hair 5/1785 23/26909 0.01573318 0.1562631
## qvalue geneID Count
## GO:0035618 0.1419224 MATE/ATCNGC6/PRX44/AtSFH1/PRX73 5
We perform the ORA using the gene ontology for biological processes:
ego = clusterProfiler::enrichGO(
gene = de_genes$Id, # gene list
universe = deseq_genes$Id, # all genes
OrgDb = org.At.tair.db, # annotation
keyType = "TAIR", # nature of the genes ID
ont = "BP", # Biological Processes
pvalueCutoff = 1, # significance threshold (take all)
pAdjustMethod = "BH", # p-value adjustment method
readable = TRUE # For human beings
)
We visualize results:
graphics::barplot(ego,
showCategory = 15)
enrichplot::dotplot(ego,
showCategory = 15)
Again, we search for enriched terms that related to roots.
root_names = grep(ego@result$Description,
pattern = "root",
value = TRUE)
root_names
## [1] "root morphogenesis"
## [2] "lateral root development"
## [3] "root epidermal cell differentiation"
## [4] "post-embryonic root development"
## [5] "root hair cell development"
## [6] "root hair cell differentiation"
## [7] "lateral root morphogenesis"
## [8] "post-embryonic root morphogenesis"
## [9] "root hair elongation"
## [10] "lateral root formation"
## [11] "root hair cell tip growth"
## [12] "root hair initiation"
## [13] "regulation of root meristem growth"
## [14] "root meristem growth"
## [15] "primary root development"
## [16] "regulation of lateral root development"
## [17] "regulation of root development"
## [18] "root cap development"
## [19] "regulation of post-embryonic root development"
## [20] "regulation of root morphogenesis"
## [21] "maintenance of root meristem identity"
There are a lot ! What about the associated results ?
ego@result[ego@result$Description %in% root_names, ]
## ID Description GeneRatio
## GO:0010015 GO:0010015 root morphogenesis 47/1383
## GO:0048527 GO:0048527 lateral root development 24/1383
## GO:0010053 GO:0010053 root epidermal cell differentiation 24/1383
## GO:0048528 GO:0048528 post-embryonic root development 25/1383
## GO:0080147 GO:0080147 root hair cell development 18/1383
## GO:0048765 GO:0048765 root hair cell differentiation 20/1383
## GO:0010102 GO:0010102 lateral root morphogenesis 14/1383
## GO:0010101 GO:0010101 post-embryonic root morphogenesis 14/1383
## GO:0048767 GO:0048767 root hair elongation 13/1383
## GO:0010311 GO:0010311 lateral root formation 9/1383
## GO:0048768 GO:0048768 root hair cell tip growth 5/1383
## GO:0048766 GO:0048766 root hair initiation 4/1383
## GO:0010082 GO:0010082 regulation of root meristem growth 4/1383
## GO:0010449 GO:0010449 root meristem growth 4/1383
## GO:0080022 GO:0080022 primary root development 3/1383
## GO:2000023 GO:2000023 regulation of lateral root development 2/1383
## GO:2000280 GO:2000280 regulation of root development 5/1383
## GO:0048829 GO:0048829 root cap development 1/1383
## GO:2000069 GO:2000069 regulation of post-embryonic root development 2/1383
## GO:2000067 GO:2000067 regulation of root morphogenesis 1/1383
## GO:0010078 GO:0010078 maintenance of root meristem identity 1/1383
## BgRatio pvalue p.adjust qvalue
## GO:0010015 347/21050 1.917062e-06 0.0001694857 0.0001400648
## GO:0048527 151/21050 4.914273e-05 0.0020336727 0.0016806485
## GO:0010053 154/21050 6.783874e-05 0.0026704699 0.0022069044
## GO:0048528 164/21050 7.002260e-05 0.0026704699 0.0022069044
## GO:0080147 107/21050 2.007462e-04 0.0061976408 0.0051217955
## GO:0048765 132/21050 3.878455e-04 0.0094264546 0.0077901211
## GO:0010102 78/21050 5.062118e-04 0.0115833177 0.0095725753
## GO:0010101 79/21050 5.786498e-04 0.0126457730 0.0104505996
## GO:0048767 77/21050 1.430280e-03 0.0229591337 0.0189736692
## GO:0010311 60/21050 1.584524e-02 0.1199182390 0.0991016918
## GO:0048768 25/21050 2.143399e-02 0.1426382809 0.1178777729
## GO:0048766 20/21050 3.859919e-02 0.1981569812 0.1637590098
## GO:0010082 44/21050 3.273507e-01 0.6163573779 0.5093642084
## GO:0010449 56/21050 5.065207e-01 0.7684733018 0.6350744050
## GO:0080022 41/21050 5.107809e-01 0.7695343076 0.6359512314
## GO:2000023 31/21050 6.133967e-01 0.8095479937 0.6690189616
## GO:2000280 82/21050 6.327005e-01 0.8196782385 0.6773907023
## GO:0048829 15/21050 6.393069e-01 0.8196782385 0.6773907023
## GO:2000069 33/21050 6.476445e-01 0.8265343358 0.6830566531
## GO:2000067 33/21050 8.940083e-01 0.9496702381 0.7848174556
## GO:0010078 42/21050 9.425670e-01 0.9682636046 0.8001832089
## geneID
## GO:0010015 ATCTL1/ZFP5/LRX1/ATEXP7/AtDTX31/ATP8/ACT8/ATROPGEF11/ABS4/AtRHD6/ATPIN3/LAX3/AHK4/AIR3/P4H5/ICK1/ARR12/ABR/AIR12/AtTCTP1/ACT2/ATPLT5/ATNEK5/ATPRP3/AtTIR1/AHDP/ATEXP17/RHS13/ATXT2/AIR1/AMT1;1/CLEL/BAM3/ABS3/AtSFH1/AtHMP42/AtPRPL1/SHBY/DGR2/IAA28/ATSOS4/ATMYA2/COBL9/CEPR1/XBAT32/DROP3/CAP1
## GO:0048527 ACH1/ATGSTU17/AtNPF6.3/AtLrgB/ATP8/BDG1/LAX3/AIR3/ICK1/ATWRKY46/AIR12/AtTCTP1/ATPLT5/ATNEK5/ABCB19/MYB77/AtTIR1/ATEXP17/AIR1/AMT1;1/CLEL/IAA28/CEPR1/XBAT32
## GO:0010053 ATCTL1/ZFP5/LRX1/ATEXP7/AtDTX31/ACT8/ATROPGEF11/AtRHD6/ATPIN3/P4H5/ABR/AtTCTP1/ACT2/ATPRP3/AHDP/RHS13/ATXT2/AtSFH1/AtPRPL1/ATSOS4/ATMYA2/COBL9/DROP3/CAP1
## GO:0048528 ACH1/ATGSTU17/AtNPF6.3/AtLrgB/ATP8/BDG1/LAX3/AIR3/MCA2/ICK1/ATWRKY46/AIR12/AtTCTP1/ATPLT5/ATNEK5/ABCB19/MYB77/AtTIR1/ATEXP17/AIR1/AMT1;1/CLEL/IAA28/CEPR1/XBAT32
## GO:0080147 ZFP5/ATEXP7/AtDTX31/ACT8/ATROPGEF11/ATPIN3/P4H5/ABR/AtTCTP1/ACT2/RHS13/ATXT2/AtSFH1/AtPRPL1/ATMYA2/COBL9/DROP3/CAP1
## GO:0048765 ZFP5/ATEXP7/AtDTX31/ACT8/ATROPGEF11/AtRHD6/ATPIN3/P4H5/ABR/AtTCTP1/ACT2/AHDP/RHS13/ATXT2/AtSFH1/AtPRPL1/ATMYA2/COBL9/DROP3/CAP1
## GO:0010102 ATP8/LAX3/AIR3/ICK1/AIR12/ATPLT5/ATNEK5/AtTIR1/ATEXP17/AIR1/AMT1;1/IAA28/CEPR1/XBAT32
## GO:0010101 ATP8/LAX3/AIR3/ICK1/AIR12/ATPLT5/ATNEK5/AtTIR1/ATEXP17/AIR1/AMT1;1/IAA28/CEPR1/XBAT32
## GO:0048767 ZFP5/ATEXP7/AtDTX31/ACT8/ATPIN3/ABR/AtTCTP1/ACT2/ATXT2/AtSFH1/AtPRPL1/ATMYA2/COBL9
## GO:0010311 LAX3/ICK1/ATPLT5/ATNEK5/AtTIR1/ATEXP17/AMT1;1/CEPR1/XBAT32
## GO:0048768 ACT8/AtTCTP1/ACT2/AtSFH1/COBL9
## GO:0048766 ZFP5/AtRHD6/ATPIN3/ABR
## GO:0010082 ATPIN3/ARR12/CLEL/SHBY
## GO:0010449 ATPIN3/ARR12/CLEL/SHBY
## GO:0080022 BRON/ARR12/ATEXP5
## GO:2000023 CLEL/CEPR1
## GO:2000280 GLP5/OPS/CLEL/ATEXO70C1/CEPR1
## GO:0048829 LAX3
## GO:2000069 CLEL/CEPR1
## GO:2000067 CLEL
## GO:0010078 BAM3
## Count
## GO:0010015 47
## GO:0048527 24
## GO:0010053 24
## GO:0048528 25
## GO:0080147 18
## GO:0048765 20
## GO:0010102 14
## GO:0010101 14
## GO:0048767 13
## GO:0010311 9
## GO:0048768 5
## GO:0048766 4
## GO:0010082 4
## GO:0010449 4
## GO:0080022 3
## GO:2000023 2
## GO:2000280 5
## GO:0048829 1
## GO:2000069 2
## GO:2000067 1
## GO:0010078 1
We visualize the results as graphs:
graphics::barplot(ego,
showCategory = root_names)
enrichplot::dotplot(ego,
showCategory = root_names)
We need to build a named vector which contains sorted numbers. So, we explore results to guess the right column to extract:
colnames(deseq_genes)
## [1] "Id" "WT1" "WT2" "WT3"
## [5] "KO1" "KO2" "KO3" "norm.WT1"
## [9] "norm.WT2" "norm.WT3" "norm.KO1" "norm.KO2"
## [13] "norm.KO3" "baseMean" "WT" "KO"
## [17] "FoldChange" "log2FoldChange" "stat" "pvalue"
## [21] "padj" "dispGeneEst" "dispFit" "dispMAP"
## [25] "dispersion" "betaConv" "maxCooks"
We choose to use the stat
column
geneList = as.numeric(de_genes$stat)
names(geneList) = de_genes$Id
geneList = sort(geneList, decreasing = TRUE)
head(geneList)
## AT2G17820 AT5G19600 AT2G25760 AT3G19670 AT3G48110 AT5G11800
## 18.377 16.078 16.002 15.616 15.249 14.443
We perform the GSEA using the gene ontology for biological processes:
gsea = clusterProfiler::gseGO(
geneList = geneList, # ranked gene list
ont = "BP", # Biological Processes
OrgDb = org.At.tair.db, # annotation
keyType = "TAIR", # nature of the genes ID
pAdjustMethod = "BH", # p-value adjustment method
pvalueCutoff = 1, # significance threshold (take all)
seed = 1 # fix randomness for permutations
)
We explore results:
gsea@result %>%
dplyr::filter(p.adjust < 0.05) %>%
dplyr::top_n(., n = 8, wt = abs(NES)) %>%
dplyr::select(Description, NES, p.adjust, setSize)
## Description
## GO:0090304 nucleic acid metabolic process
## GO:0006139 nucleobase-containing compound metabolic process
## GO:0006396 RNA processing
## GO:0016071 mRNA metabolic process
## GO:0008380 RNA splicing
## GO:0000375 RNA splicing, via transesterification reactions
## GO:0000377 RNA splicing, via transesterification reactions with bulged adenosine as nucleophile
## GO:0000398 mRNA splicing, via spliceosome
## NES p.adjust setSize
## GO:0090304 3.205898 7.612500e-09 269
## GO:0006139 3.108906 7.612500e-09 313
## GO:0006396 3.294111 1.875476e-08 49
## GO:0016071 3.329805 1.995708e-08 44
## GO:0008380 3.201659 4.633670e-07 25
## GO:0000375 3.163249 1.374899e-06 23
## GO:0000377 3.163249 1.374899e-06 23
## GO:0000398 3.163249 1.374899e-06 23
We still focus on root-related terms:
root_names = grep(gsea@result$Description,
pattern = "root",
value = TRUE)
root_names
## [1] "root epidermal cell differentiation" "root hair cell differentiation"
## [3] "root hair cell development" "root system development"
## [5] "root development" "root morphogenesis"
## [7] "root hair elongation" "lateral root development"
## [9] "post-embryonic root development" "post-embryonic root morphogenesis"
## [11] "lateral root morphogenesis"
What are the significant results associated with these terms ?
gsea@result %>%
dplyr::filter(p.adjust < 0.05) %>%
dplyr::filter(Description %in% root_names) %>%
dplyr::top_n(., n = 8, wt = abs(NES)) %>%
dplyr::select(Description, NES, p.adjust, setSize)
## Description NES p.adjust setSize
## GO:0010053 root epidermal cell differentiation -1.992990 0.02082870 24
## GO:0048765 root hair cell differentiation -1.803941 0.03351513 20
We want to visualize the GSEA curve associated with one of these terms:
gene_set_name = "root hair cell differentiation"
gene_set_id = which(gsea@result$Description == gene_set_name)
gene_set_id
## [1] 207
enrichplot::gseaplot2(
x = gsea,
geneSetID = gene_set_id,
title = gene_set_name
)
In this section, we propose other ways to visualize the results from
ORA (ego
object) or GSEA (gsea
object).
enrichplot::gseaplot2(
x = gsea,
geneSetID = c(1:3),
title = "Most enriched terms"
)
enrichplot::heatplot(
x = ego, # Our ORA
showCategory = root_names, # Gene sets of interest
foldChange = setNames(nm = de_genes$Id,
de_genes$log2FoldChange) # Our fold changes
)
ego = enrichplot::pairwise_termsim(ego)
enrichplot::upsetplot(x = ego, # Our ORA
n = 10) # Nb of terms to display
enrichplot::emapplot(ego, showCategory = root_names)
enrichplot::cnetplot(ego,
showCategory = root_names,
foldChange = setNames(nm = de_genes$Id,
de_genes$log2FoldChange))
## Warning in cnetplot.enrichResult(x, ...): Use 'color.params = list(foldChange = your_value)' instead of 'foldChange'.
## The foldChange parameter will be removed in the next version.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
To be able to re-run the analysis or to understand why outputs are different between two compilations, it is important to display the version of the packages we used:
sessionInfo()
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-conda-linux-gnu
## Running under: Ubuntu 20.04.6 LTS
##
## Matrix products: default
## BLAS/LAPACK: /shared/ifbstor1/software/miniconda/envs/r-4.4.1/lib/libopenblasp-r0.3.27.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Europe/Paris
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] org.At.tair.db_3.19.1 AnnotationDbi_1.66.0 IRanges_2.38.1
## [4] S4Vectors_0.42.1 Biobase_2.64.0 BiocGenerics_0.50.0
## [7] enrichplot_1.24.4 clusterProfiler_4.12.6
##
## loaded via a namespace (and not attached):
## [1] DBI_1.2.3 gson_0.1.0 shadowtext_0.1.3
## [4] gridExtra_2.3 httr2_1.0.1 rlang_1.1.4
## [7] magrittr_2.0.3 DOSE_3.30.1 compiler_4.4.1
## [10] RSQLite_2.3.7 png_0.1-8 vctrs_0.6.5
## [13] reshape2_1.4.4 stringr_1.5.1 pkgconfig_2.0.3
## [16] crayon_1.5.3 fastmap_1.2.0 XVector_0.44.0
## [19] labeling_0.4.3 ggraph_2.2.1 utf8_1.2.4
## [22] HDO.db_0.99.1 rmarkdown_2.28 UCSC.utils_1.0.0
## [25] purrr_1.0.2 bit_4.5.0 xfun_0.48
## [28] zlibbioc_1.50.0 cachem_1.1.0 aplot_0.2.3
## [31] GenomeInfoDb_1.40.1 jsonlite_1.8.9 blob_1.2.4
## [34] highr_0.11 BiocParallel_1.38.0 tweenr_2.0.3
## [37] parallel_4.4.1 R6_2.5.1 bslib_0.8.0
## [40] stringi_1.8.4 RColorBrewer_1.1-3 jquerylib_0.1.4
## [43] GOSemSim_2.30.2 Rcpp_1.0.13 knitr_1.48
## [46] R.utils_2.12.3 Matrix_1.7-1 splines_4.4.1
## [49] igraph_2.1.1 tidyselect_1.2.1 qvalue_2.36.0
## [52] rstudioapi_0.17.0 yaml_2.3.10 viridis_0.6.5
## [55] codetools_0.2-20 lattice_0.22-6 tibble_3.2.1
## [58] plyr_1.8.9 treeio_1.28.0 withr_3.0.1
## [61] KEGGREST_1.44.0 evaluate_1.0.1 gridGraphics_0.5-1
## [64] scatterpie_0.2.4 polyclip_1.10-7 ggupset_0.4.0
## [67] Biostrings_2.72.1 ggtree_3.12.0 pillar_1.9.0
## [70] ggfun_0.1.6 generics_0.1.3 ggplot2_3.5.1
## [73] tidytree_0.4.6 munsell_0.5.1 scales_1.3.0
## [76] glue_1.8.0 lazyeval_0.2.2 tools_4.4.1
## [79] ggnewscale_0.4.10 data.table_1.16.2 fgsea_1.30.0
## [82] fs_1.6.4 graphlayouts_1.1.1 fastmatch_1.1-4
## [85] tidygraph_1.3.1 cowplot_1.1.3 grid_4.4.1
## [88] ape_5.8 tidyr_1.3.1 colorspace_2.1-1
## [91] nlme_3.1-165 patchwork_1.3.0 GenomeInfoDbData_1.2.12
## [94] ggforce_0.4.2 cli_3.6.3 rappdirs_0.3.3
## [97] fansi_1.0.6 viridisLite_0.4.2 dplyr_1.1.4
## [100] gtable_0.3.5 R.methodsS3_1.8.2 yulab.utils_0.1.7
## [103] sass_0.4.9 digest_0.6.37 ggplotify_0.1.2
## [106] ggrepel_0.9.6 farver_2.1.2 memoise_2.0.1
## [109] htmltools_0.5.8.1 R.oo_1.26.0 lifecycle_1.0.4
## [112] httr_1.4.7 GO.db_3.19.1 bit64_4.5.2
## [115] MASS_7.3-61