foreach - How can I parallelize a double for loop in R?


I've been trying to parallelize my code because I'm using a double loop to record results. I've been trying to see how to use the snow and doParallel packages in R for this.

If you want a replicable example, use

residual_anomalies <- matrix(sample(c('anomaly','no signal'),300,replace=TRUE),nrow=100)

instead of these three lines

inputfile <- paste0("simulation_",i,"_",metrics[k],"_us.csv")
data <- residuals(inputfile)
residual_anomalies <- conceptdrift(data,length=10,threshold=.05)

in the nested loop. The whole code is below.

source("getmetrics.r")
source("slowdrift_resampling_vectorized.r")

metrics <- unique(metrics)
num_metrics <- length(metrics)

f1_scores_table_raw = data.frame(matrix(ncol=10,nrow=46))
f1_scores_table_pred = data.frame(matrix(ncol=10,nrow=46))

rownames(f1_scores_table_raw) <- metrics
colnames(f1_scores_table_raw) <- paste0("sim",1:10)

rownames(f1_scores_table_pred) <- metrics
colnames(f1_scores_table_pred) <- paste0("sim",1:10)

for(k in 1:num_metrics){
  for(i in 1:10){
    #inputfile <- paste0("simulation_",i,"_",metrics[k],"_us.csv")
    #data <- residuals(inputfile)
    #residual_anomalies <- conceptdrift(data,length=10,threshold=.05)

    # The above is how I get the data frame; I'll create one here for reproducibility.
    residual_anomalies <- as.data.frame(matrix(sample(c('anomaly','no signal'),300,replace=TRUE),nrow=100))
    names(residual_anomalies) <- c("raw_anomaly","prediction_anomaly","true_anomaly")

    # Calculate precision, recall, and F1 score

    # First for the raw data
    counts <- ifelse(rowSums(residual_anomalies[c("raw_anomaly","true_anomaly")]=='anomaly')==2,1,0)
    correct_detections <- sum(counts)

    total_predicted = sum(residual_anomalies$raw_anomaly =='anomaly')
    total_actual = sum(residual_anomalies$true_anomaly =='anomaly')

    raw_precision = correct_detections / total_predicted
    raw_recall = correct_detections / total_actual

    f1_raw = 2*raw_precision*raw_recall / (raw_precision+raw_recall)

    # Then for the prediction (dlm, esp, mlr) data
    counts <- ifelse(rowSums(residual_anomalies[c("prediction_anomaly","true_anomaly")]=='anomaly')==2,1,0)
    correct_detections <- sum(counts)

    total_predicted = sum(residual_anomalies$prediction_anomaly =='anomaly')
    total_actual = sum(residual_anomalies$true_anomaly =='anomaly')

    pred_precision = correct_detections / total_predicted
    pred_recall = correct_detections / total_actual

    f1_pred = 2*pred_precision*pred_recall / (pred_precision+pred_recall)

    f1_scores_table_raw[[k,i]] <- f1_raw
    f1_scores_table_pred[[k,i]] <- f1_pred
  }
}

Before, I was using foreach on the outer loop with %dopar%, but I kept getting an error that '%dopar%' was not found. Should I parallelize both loops or just one?

Also, I know foreach creates a list and stores it in a variable, but can I still have other variables store data inside the foreach loop? For example, I still want to record data in the f1_scores_table_raw and f1_scores_table_pred arrays.

Thanks!

foreach can handle this automatically if you use the %:% operator between the loop levels (see the "nesting" vignette):

require(foreach)
# register a parallel backend here

foreach (k = 1:num_metrics) %:% # nesting operator
  foreach (i = 1:10) %dopar% {
    # code to parallelise
  }
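To make this more concrete, here is a minimal sketch of how the nested loop from the question might look. It is only one possible arrangement, not a definitive implementation. It assumes the doParallel package is installed and uses two workers; the `f1_score` helper and the placeholder metric names are hypothetical and stand in for the question's own data. Two points it tries to illustrate: '%dopar%' is exported by foreach, so a "not found" error usually means foreach was not attached with library() or require(); and with a cluster backend, assignments made inside the loop body happen in the worker processes, so instead of writing into f1_scores_table_raw from inside the loop, each iteration returns its values and the tables are filled afterwards.

```r
library(foreach)      # provides foreach(), %:% and %dopar%
library(doParallel)   # one possible backend (assumed to be installed)

cl <- parallel::makeCluster(2)   # assumption: 2 worker processes
registerDoParallel(cl)

metrics  <- paste0("metric", 1:4)   # hypothetical placeholder names
num_sims <- 10

# Hypothetical helper: F1 score for the 'anomaly' label.
# foreach exports variables referenced in the loop body to the workers.
f1_score <- function(predicted, actual) {
  correct   <- sum(predicted == "anomaly" & actual == "anomaly")
  precision <- correct / sum(predicted == "anomaly")
  recall    <- correct / sum(actual == "anomaly")
  2 * precision * recall / (precision + recall)
}

# Each iteration returns one named vector; rbind stacks them into a matrix
# with one row per (k, i) pair.
results <- foreach(k = seq_along(metrics), .combine = rbind) %:%
  foreach(i = seq_len(num_sims), .combine = rbind) %dopar% {
    # Simulated data, as in the question's reproducible example
    residual_anomalies <- as.data.frame(
      matrix(sample(c("anomaly", "no signal"), 300, replace = TRUE), nrow = 100)
    )
    names(residual_anomalies) <- c("raw_anomaly", "prediction_anomaly", "true_anomaly")

    c(k = k, i = i,
      f1_raw  = f1_score(residual_anomalies$raw_anomaly,
                         residual_anomalies$true_anomaly),
      f1_pred = f1_score(residual_anomalies$prediction_anomaly,
                         residual_anomalies$true_anomaly))
  }

parallel::stopCluster(cl)

# Fill the two tables in the main process from the collected results.
idx <- results[, c("k", "i")]   # two-column matrix of (row, column) indices
f1_scores_table_raw <- matrix(NA_real_, nrow = length(metrics), ncol = num_sims,
                              dimnames = list(metrics, paste0("sim", seq_len(num_sims))))
f1_scores_table_pred <- f1_scores_table_raw
f1_scores_table_raw[idx]  <- results[, "f1_raw"]
f1_scores_table_pred[idx] <- results[, "f1_pred"]
```

With %:%, foreach treats the two loops as a single stream of tasks, so there is no need to decide by hand which loop to parallelize.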
