runGRaNIE_batchMode.Rd
This function runs the GRaNIE pipeline in batch mode, processing multiple clustering resolutions and generating gene regulatory network analyses.
runGRaNIE_batchMode(
  datasetName,
  inputDir,
  outputDir,
  clusterResolutions = c(0.1, seq(0.25, 1, 0.25), seq(2, 10, 1), seq(12, 20, 2)),
  filenameSuffix = "",
  idColumn_peaks = "peakID",
  idColumn_RNA = "ENSEMBL",
  genomeAssembly = "hg38",
  TFBS_source = "custom",
  TFBS_folder = NULL,
  TFBS_JASPAR_useSpecificTaxGroup = NULL,
  nCores = 4,
  normRNA_all = c("limma_quantile"),
  normATAC_all = c("DESeq2_sizeFactors"),
  includeSexChr = FALSE,
  minCV = 0,
  minNormalizedMean_peaks = 5,
  minNormalizedMean_RNA = 1,
  minSizePeaks = 5,
  corMethod = "pearson",
  promoterRange = 250000,
  useGCCorrection = FALSE,
  TF_peak.fdr.threshold = 0.2,
  peak_gene.fdr.threshold = 0.1,
  runTFClassification = FALSE,
  runNetworkAnalyses = FALSE,
  forceRerun = TRUE
)
Character string specifying the name of the dataset.
Character string specifying the directory where the input files are located.
Character string specifying the directory where the output files will be saved.
Numeric vector specifying the clustering resolutions to consider. Default is `c(0.1, seq(0.25, 1, 0.25), seq(2,10,1), seq(12,20,2))`.
Character string specifying the suffix for the output file names. Default is `""`.
Character string specifying the column name for peak IDs. Default is `"peakID"`.
Character string specifying the column name for RNA IDs. Default is `"ENSEMBL"`.
Character string specifying the genome assembly to use. Default is `"hg38"`.
Character string specifying the source for transcription factor binding sites. Options are `"custom"`, `"JASPAR2022"`, `"JASPAR2024"`. Default is `"custom"`.
Character string specifying the folder containing custom transcription factor binding site files. Default is `NULL`.
Character string naming the taxonomic group to use from JASPAR. Default is `NULL`.
Integer value specifying the number of cores to use for parallel processing. Default is `4`.
Character vector specifying the normalization methods to apply to RNA data. Default is `c("limma_quantile")`.
Character vector specifying the normalization methods to apply to ATAC data. Default is `c("DESeq2_sizeFactors")`.
Logical value indicating whether to include sex chromosomes in the analysis. Default is `FALSE`.
Numeric value specifying the minimum coefficient of variation for filtering. Default is `0`.
Numeric value specifying the minimum normalized mean for peak filtering. Default is `5`.
Numeric value specifying the minimum normalized mean for RNA filtering. Default is `1`.
Integer value specifying the minimum size for peaks. Default is `5`.
Character string specifying the correlation method to use. Options are `"pearson"`, `"spearman"`, etc. Default is `"pearson"`.
Integer value specifying the range around the promoter (in base pairs) to consider for peak-gene connections. Default is `250000`.
Logical value indicating whether to use GC content correction. Default is `FALSE`.
Numeric value specifying the FDR threshold for transcription factor-peak connections. Default is `0.2`.
Numeric value specifying the FDR threshold for peak-gene connections. Default is `0.1`.
Logical value indicating whether to run transcription factor classification. Default is `FALSE`.
Logical value indicating whether to run network analyses. Default is `FALSE`.
Logical value indicating whether to force a rerun and regenerate the output even if the output files already exist on disk or in the object. Default is `TRUE`.
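The default `clusterResolutions` grid is easier to judge once it is expanded. The following sketch uses only base R to rebuild the vector from the usage section; the variable names (`resolutions`, `quick_resolutions`) are illustrative and not part of the function's interface.

```r
# Rebuild the default clusterResolutions vector from the usage section.
resolutions <- c(0.1, seq(0.25, 1, 0.25), seq(2, 10, 1), seq(12, 20, 2))

# Number of clustering resolutions contained in the default grid.
n_resolutions <- length(resolutions)

# A reduced grid for a quick trial run (an illustrative choice,
# not a recommendation from the package authors).
quick_resolutions <- resolutions[resolutions <= 1]

print(resolutions)
print(n_resolutions)
print(quick_resolutions)
```

Because the default grid holds 19 values, passing a shorter vector such as `quick_resolutions` is likely a sensible way to test a new dataset before committing to a full run.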
The function processes the dataset and saves the results in the specified output directory.
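The interaction between saved results and `forceRerun` can be pictured as a "reuse unless forced" check. The sketch below is a generic base R illustration under assumptions: the helper `run_or_reuse()`, the `.rds` file name, and the directory layout are hypothetical and are not taken from the package's internal code.

```r
# Hypothetical helper: run `compute()` only when the result file is missing
# or the caller forces a rerun. This is NOT the package's internal logic.
run_or_reuse <- function(resultFile, compute, forceRerun = TRUE) {
  if (!forceRerun && file.exists(resultFile)) {
    message("Reusing existing result: ", resultFile)
    return(readRDS(resultFile))
  }
  result <- compute()
  dir.create(dirname(resultFile), recursive = TRUE, showWarnings = FALSE)
  saveRDS(result, resultFile)
  result
}

# Usage with a temporary directory and a made-up file name.
outFile <- file.path(tempdir(), "granie_sketch", "resolution_0.5.rds")
unlink(outFile)  # start from a clean state

first  <- run_or_reuse(outFile, function() 1, forceRerun = TRUE)   # computes and saves
second <- run_or_reuse(outFile, function() 2, forceRerun = FALSE)  # reuses saved value
third  <- run_or_reuse(outFile, function() 2, forceRerun = TRUE)   # recomputes
```

With a default of `forceRerun = TRUE`, existing output is regenerated on every call; setting it to `FALSE` is the way to resume an interrupted batch without repeating finished work.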
if (FALSE) {
  # Example usage:
  runGRaNIE_batchMode(
    datasetName = "example_dataset",
    inputDir = "data/input/",
    outputDir = "data/output/",
    clusterResolutions = c(0.1, 0.5, 1),
    TFBS_source = "JASPAR2024",
    nCores = 8
  )
}
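The example above hard-codes `nCores = 8`, which may exceed what a given machine offers. A minimal sketch for choosing a safe value, assuming the base `parallel` package is available; the variable names and the choice to leave one core free are illustrative, not a requirement of the function.

```r
library(parallel)

# detectCores() can return NA on some platforms, so guard against that.
available <- parallel::detectCores()
requested <- 8L

# Use at most `requested` cores, leave one core free, never go below 1.
nCores <- if (is.na(available)) {
  1L
} else {
  max(1L, min(requested, available - 1L))
}

print(nCores)
```

The resulting `nCores` value can then be passed to `runGRaNIE_batchMode()` in place of the literal `8`.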