GRaNPA_main_function.Rd
Main GRaNPA function. It uses information from a gene regulatory network (GRN) to build a random forest model that predicts differential expression data.
GRaNPA_main_function(
DE_data,
GRN_matrix_filtered,
DE_pvalue_th = 0.2,
logFC_th = 0,
num_run = 5,
num_run_CR = 2,
num_run_random = 5,
cores = 10,
importance = "permutation",
ML_type = "regression",
control = "cv",
train_part = 1
)
Differential expression (DE) data. The DE matrix should contain 'ENSEMBL', 'logFC' and 'padj' columns.
A data.frame with at least two columns: TF.name for TF names and gene.ENSEMBL for gene ENSEMBL IDs. Optionally, it can have a weight column for weighting the connections.
A cutoff on the adjusted p-value for filtering the DE data. Default is 0.2.
A cutoff on the absolute log2 fold change for filtering the DE data. Default is 0.
Number of runs for the real GRN (should be > 0). Default is 5; at least 3 runs are suggested.
Number of runs for the quality control GRN (should be > 0). Default is 2; at least 2 runs are suggested.
Number of runs for the randomized GRN (should be > 0). Default is 5; at least 3 runs are suggested.
Number of cores. Default is 10.
The algorithm used to find the most important TFs. Default is "permutation"; "impurity_corrected" and "impurity" are the other options.
Either "regression" or "classification". For regression, it computes R^2 for predicting the actual values. For classification, it computes the accuracy of predicting the direction of the DE data. Default is "regression".
One of "cv" for 10-fold cross-validation, "oob" for out-of-bag, or "bt" for bootstrap. Default is "cv".
Genes can be divided into a training set and a test set. This value sets how much of the data is used for training. Default is 1.
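The input requirements and the two filtering thresholds described above can be sketched in base R as follows. This is only an illustration of the documented column names and defaults, not the package's internal code: the gene IDs, TF names and numbers are made-up placeholders, and the use of strict inequalities in the filter is an assumption.

```r
# Toy inputs: every ID, TF name and number below is a made-up placeholder.
DE_data <- data.frame(
  ENSEMBL = c("ENSG_A", "ENSG_B", "ENSG_C", "ENSG_D"),
  logFC   = c(1.8, -0.9, 0.1, -2.3),
  padj    = c(0.01, 0.15, 0.60, 0.001),
  stringsAsFactors = FALSE
)
GRN_matrix_filtered <- data.frame(
  TF.name      = c("TF1", "TF1", "TF2", "TF2"),
  gene.ENSEMBL = c("ENSG_A", "ENSG_B", "ENSG_B", "ENSG_D"),
  stringsAsFactors = FALSE
)

# Check that the documented columns are present before calling the function.
stopifnot(all(c("ENSEMBL", "logFC", "padj") %in% colnames(DE_data)))
stopifnot(all(c("TF.name", "gene.ENSEMBL") %in% colnames(GRN_matrix_filtered)))

# Documented defaults of the two thresholds.
DE_pvalue_th <- 0.2
logFC_th <- 0

# Assumption: strict '<' on padj and strict '>' on |logFC|;
# the package may use different inequalities.
keep <- DE_data$padj < DE_pvalue_th & abs(DE_data$logFC) > logFC_th
DE_filtered <- DE_data[keep, ]

# Genes that pass the filter and also appear in the GRN.
shared_genes <- intersect(DE_filtered$ENSEMBL, GRN_matrix_filtered$gene.ENSEMBL)
```

Checking the column names up front gives a clearer failure than an error raised deep inside the model fitting.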
A GRaNPA object containing:
normal_dist = distribution of R^2 for the actual network
random_dist = distribution of R^2 for the random network
CR_dist = distribution of R^2 for the QC network
nrm_imp_unscaled = unscaled importance score for each TF and each run
nrm_imp_scaled = scaled importance score for each TF and each run
normal_data = the actual data matrix to which the RF has been applied
normal_models = all the RF models for the actual network
random_models = all the RF models for the random network
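To make the two performance measures concrete, the sketch below computes an R^2 value and a direction accuracy for a handful of made-up observed and predicted log2 fold changes. It assumes the common definition R^2 = 1 - SS_res / SS_tot and treats "direction" as the sign of logFC; the package may compute either measure differently, and the numbers are placeholders rather than GRaNPA output.

```r
# Made-up observed and predicted log2 fold changes (placeholders only).
observed  <- c(1.5, -2.0, 0.5, -1.0)
predicted <- c(1.0, -1.5, 1.0, -0.5)

# Regression measure. Assumption: R^2 = 1 - SS_res / SS_tot.
ss_res <- sum((observed - predicted)^2)
ss_tot <- sum((observed - mean(observed))^2)
r_squared <- 1 - ss_res / ss_tot

# Classification measure. Assumption: direction is the sign of logFC,
# and accuracy is the fraction of genes whose sign is predicted correctly.
direction_accuracy <- mean(sign(observed) == sign(predicted))
```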
GRaNPA_main_function(DE_data, GRN_matrix_filtered)
#> INFO [2024-05-05 13:15:49] GRaNPA Main function
#> WARN [2024-05-05 13:15:49] Both Differentiall expression data and GRN should contain 'ENSEMBL ID' for the list of genes
#> INFO [2024-05-05 13:15:49] Differential expression will be filltered by 0.2 adjusted pvalue and 0 absolute logFC
#> Error in is.data.frame(x): object 'GRN_matrix_filtered' not found