The enrichment analysis is based on the subset of the network connected to a particular community, as identified by calculateCommunitiesStats. See calculateTFEnrichment and calculateGeneralEnrichment for TF-specific and general enrichment, respectively. This function requires that the GRN object contains the eGRN graph as produced by build_eGRN_graph, as well as the community information calculated by calculateCommunitiesStats. Results can subsequently be visualized with the function plotCommunitiesEnrichment.

calculateCommunitiesEnrichment(
  GRN,
  ontology = c("GO_BP", "GO_MF"),
  algorithm = "weight01",
  statistic = "fisher",
  background = "neighborhood",
  background_geneTypes = "all",
  selection = "byRank",
  communities = seq_len(10),
  pAdjustMethod = "BH",
  forceRerun = FALSE
)

Arguments

GRN

Object of class GRN

ontology

Character vector of ontologies. Default c("GO_BP", "GO_MF"). Valid values are "GO_BP", "GO_MF", "GO_CC", "KEGG", "DO", and "Reactome", referring to GO Biological Process, GO Molecular Function, GO Cellular Component, KEGG, Disease Ontology, and Reactome Pathways, respectively. GO ontologies require the topGO, "KEGG" the clusterProfiler, "DO" the DOSE, and "Reactome" the ReactomePA packages, respectively. As they are listed under Suggests, they may not yet be installed, and the function will throw an error if they are missing.
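
Because the packages above are only suggested, it can help to check for them before starting a long run. The following is a minimal sketch in base R; required_package is a hypothetical helper written for this illustration and is not part of GRaNIE, and the mapping simply restates the description above.

```r
# Hypothetical helper (not part of GRaNIE): map each ontology value to the
# package that the argument description says it requires.
required_package <- function(ontology) {
  pkgs <- c(GO_BP = "topGO", GO_MF = "topGO", GO_CC = "topGO",
            KEGG = "clusterProfiler", DO = "DOSE", Reactome = "ReactomePA")
  unknown <- setdiff(ontology, names(pkgs))
  if (length(unknown) > 0) {
    stop("Unsupported ontology: ", paste(unknown, collapse = ", "))
  }
  unique(unname(pkgs[ontology]))
}

# Check which of the needed packages are not yet installed.
needed <- required_package(c("GO_BP", "GO_MF", "KEGG"))
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]
if (length(missing_pkgs) > 0) {
  message("Install before running: ", paste(missing_pkgs, collapse = ", "))
}
```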

algorithm

Character. Default "weight01". One of: "classic", "elim", "weight", "weight01", "lea", "parentchild". Only relevant if ontology is GO related (GO_BP, GO_MF, GO_CC), ignored otherwise. Name of the algorithm that handles the GO graph structures. Valid inputs are those supported by the topGO library. For general information about the algorithms, see https://academic.oup.com/bioinformatics/article/22/13/1600/193669. weight01 is a mixture between the elim and the weight algorithms.

statistic

Character. Default "fisher". One of: "fisher", "ks", "t", "globaltest", "sum", "ks.ties". Statistical test to be used. Only relevant if ontology is GO related (GO_BP, GO_MF, GO_CC), ignored otherwise. Valid inputs are those supported by the topGO library. For the other ontologies, the test statistic is always Fisher.
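
To give an intuition for what a Fisher-based over-representation test measures, here is a small sketch using base R's fisher.test on a single term. The counts are made up for illustration, and this is not how topGO or GRaNIE compute the statistic internally (for example, it ignores the GO graph structure handled by the algorithm argument).

```r
# Toy counts (made up): a community of 40 genes, 12 of which are annotated
# with the term; the remaining 960 background genes include 88 with the term.
in_comm_with_term  <- 12
in_comm_without    <- 28
out_comm_with_term <- 88
out_comm_without   <- 872

# 2x2 contingency table; matrix() fills column by column.
tab <- matrix(c(in_comm_with_term, in_comm_without,
                out_comm_with_term, out_comm_without),
              nrow = 2,
              dimnames = list(term = c("yes", "no"),
                              community = c("in", "out")))

# One-sided test: is the term over-represented in the community?
res <- fisher.test(tab, alternative = "greater")
res$p.value
```

Note how the result depends on the background counts in the second column, which is why the choice of background matters for the analysis.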

background

Character. Default "neighborhood". One of: "all_annotated", "all_RNA", "all_RNA_filtered", "neighborhood". Set of genes to be used to construct the background for the enrichment analysis. This can either be all annotated genes in the reference genome (all_annotated), all genes from the provided RNA data (all_RNA), all genes from the provided RNA data excluding those marked as filtered after executing filterData (all_RNA_filtered), or all the genes that are within the neighborhood of any peak (before applying any filters except for the user-defined promoterRange value in addConnections_peak_gene) (neighborhood).

background_geneTypes

Character vector of gene types that should be considered for the background. Default "all". Only gene types as defined in the GRN object, slot GRN@annotation$genes$gene.type are allowed. The special keyword "all" means no filter on gene type.
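
The effect of the gene-type filter can be sketched on a toy annotation table. Everything below is hypothetical apart from the column name gene.type, which is taken from the description above: the gene IDs, the gene.ENSEMBL column name, and the filter_background helper are assumptions made for illustration only.

```r
# Toy annotation table (made-up IDs and types).
genes <- data.frame(
  gene.ENSEMBL = c("ENSG01", "ENSG02", "ENSG03", "ENSG04"),
  gene.type    = c("protein_coding", "lincRNA", "protein_coding", "pseudogene"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only background genes of the requested types.
filter_background <- function(genes, background_geneTypes = "all") {
  # The special keyword "all" disables the filter.
  if (identical(background_geneTypes, "all")) {
    return(genes$gene.ENSEMBL)
  }
  # Only gene types present in the annotation are allowed.
  unknown <- setdiff(background_geneTypes, unique(genes$gene.type))
  if (length(unknown) > 0) {
    stop("Unknown gene type(s): ", paste(unknown, collapse = ", "))
  }
  genes$gene.ENSEMBL[genes$gene.type %in% background_geneTypes]
}

filter_background(genes)                     # all four genes
filter_background(genes, "protein_coding")   # only the protein-coding genes
```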

selection

Character. Default "byRank". One of: "byRank", "byLabel". Specifies whether the communities for which enrichment is calculated are selected by their rank, where the largest community (with the most vertices) has a rank of 1, or by their label. Note that the label is independent of the rank.

communities

Numeric vector. Default seq_len(10). Depending on the value of the selection parameter, this indicates either the rank or the label of the communities to be analyzed. For example, for communities = c(1,4): if selection = "byRank", the enrichment is calculated for the first and fourth largest communities; if selection = "byLabel", it is calculated for the communities labeled "1" and "4".
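
The difference between selecting by rank and by label can be illustrated with a toy membership vector in base R. The labels and sizes below are made up, and how GRaNIE stores community labels or breaks ties between equally sized communities is not covered by this sketch.

```r
# Toy community membership (made-up labels), one entry per vertex.
membership <- c("3", "3", "3", "3", "1", "1", "4", "4", "4", "2")

# Community sizes, largest first: position = rank, name = label.
sizes <- sort(table(membership), decreasing = TRUE)
labels_by_rank <- names(sizes)

requested <- c(1, 4)

# "byRank": the 1st and 4th largest communities.
by_rank <- labels_by_rank[requested]

# "byLabel": the communities literally labeled "1" and "4".
by_label <- intersect(as.character(requested), names(sizes))

by_rank
by_label
```

In this toy example the two modes select different communities: by rank, the labels "3" (largest) and "2" (fourth largest); by label, "1" and "4".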

pAdjustMethod

Character. Default "BH". One of: "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr". Method used to adjust p-values for multiple testing. This parameter is only relevant for the following ontologies: KEGG, DO, Reactome. For the GO ontologies, this parameter is ignored and the chosen algorithm serves as the adjustment.
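
The accepted values coincide with correction methods offered by base R's p.adjust. Whether GRaNIE calls p.adjust directly is an assumption; the sketch below only shows what these corrections do to a toy vector of raw p-values.

```r
# Toy raw p-values (made up for illustration).
p_raw <- c(0.01, 0.02, 0.03, 0.04)

# Benjamini-Hochberg (the default, "BH") controls the false discovery rate.
p_bh <- p.adjust(p_raw, method = "BH")

# Bonferroni is more conservative: each p-value is multiplied by the
# number of tests (capped at 1).
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# All values listed for pAdjustMethod are known to p.adjust.
listed <- c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr")
all(listed %in% p.adjust.methods)
```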

forceRerun

TRUE or FALSE. Default FALSE. Force execution, even if the GRN object already contains the result. Overwrites the old results.

Value

An updated GRN object, with the enrichment results stored in the stats$Enrichment$byCommunity slot.

Details

All enrichment functions use the TF-gene graph as defined in the `GRN` object. See the `ontology` argument for currently supported ontologies. Also note that some parameter combinations for `algorithm` and `statistic` are incompatible; an error message is thrown in such a case.

Examples

# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
#> Downloading GRaNIE example object from https://git.embl.de/grp-zaugg/GRaNIE/-/raw/master/data/GRN.rds
#> Finished successfully. You may explore the example object. Start by typing the object name to the console to see a summaty. Happy GRaNIE'ing!
GRN = calculateCommunitiesEnrichment(GRN, ontology = c("GO_BP"), forceRerun = FALSE)
#> WARN [2023-03-06 16:39:22] Some of the requested communities (7,8,9,10) were not found. Only the following communities are available: 1,5,4,2,3,6. 
#> This warning may or may not be ignored. Carefully check its significance and whether it may affect the results.
#> 
#> Warning: Some of the requested communities (7,8,9,10) were not found. Only the following communities are available: 1,5,4,2,3,6. 
#> This warning may or may not be ignored. Carefully check its significance and whether it may affect the results.
#> INFO [2023-03-06 16:39:22] Running enrichment analysis for selected 6 communities. This may take a while...
#> INFO [2023-03-06 16:39:22]  Community 1
#> INFO [2023-03-06 16:39:23] Data already exists in object. Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-03-06 16:39:23]  Community 5
#> INFO [2023-03-06 16:39:23] Data already exists in object. Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-03-06 16:39:23]  Community 4
#> INFO [2023-03-06 16:39:23] Data already exists in object. Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-03-06 16:39:23]  Community 2
#> INFO [2023-03-06 16:39:23] Data already exists in object. Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-03-06 16:39:23]  Community 3
#> INFO [2023-03-06 16:39:23] Data already exists in object. Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-03-06 16:39:23]  Community 6
#> INFO [2023-03-06 16:39:23] Data already exists in object. Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-03-06 16:39:23]  Finished successfully. Execution time: 1.9 secs