What about parallelism?
As discussed in Motivation - Multiprocessing, one of the motivations for implementing this library is to bring parallelism to MetaLearners. We’ve identified three levels at which parallelism can be applied:
- Base model level: Certain base models can use multiple threads during training. Examples include LightGBM or RandomForest from sklearn. These models can be instantiated with the n_jobs parameter to use multi-threading.
  To use parallelism at this level, you can use the nuisance_model_params, treatment_model_params, and propensity_model_params parameters when instantiating the MetaLearner.
- Cross-fitting level: As explained in the cross-fit FAQ, cross-fitting is employed for all stand-alone nuisance and treatment models, irrespective of the MetaLearner. This introduces a new possible level of parallelism, as the model associated with each fold can be trained independently of the others.
  To use parallelism at this level, you can use the n_jobs_cross_fitting parameter of the fit method of the MetaLearner.
- Stage level: Most MetaLearners entail multiple nuisance models and/or treatment models. Within an individual stage, these models are independent of each other. For example, a stage may consist of one propensity model and one outcome model for each treatment variant. This independence translates into another possibility for parallelism.
  To use parallelism at this level, you can use the n_jobs_base_learners parameter of the fit method of the MetaLearner.
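The three levels are set in two different places: the base model level through constructor arguments, the other two through arguments of fit. The following is a minimal sketch of how these settings could be bundled. It uses only the parameter names mentioned above. The helper parallelism_kwargs is hypothetical and not part of the library, and it assumes that every base model you use accepts an n_jobs argument.

```python
def parallelism_kwargs(base_model_threads=None, cross_fitting_jobs=None, base_learner_jobs=None):
    """Hypothetical helper: collect parallelism settings for the three levels.

    Returns a pair (init_kwargs, fit_kwargs). A level left as None is omitted,
    so the corresponding default applies.
    """
    init_kwargs = {}
    if base_model_threads is not None:
        # Base model level: forwarded to each base model's constructor.
        # Assumes every base model accepts an n_jobs argument.
        model_params = {"n_jobs": base_model_threads}
        init_kwargs = {
            "nuisance_model_params": dict(model_params),
            "treatment_model_params": dict(model_params),
            "propensity_model_params": dict(model_params),
        }

    fit_kwargs = {}
    if cross_fitting_jobs is not None:
        # Cross-fitting level: the models of the individual folds run in parallel.
        fit_kwargs["n_jobs_cross_fitting"] = cross_fitting_jobs
    if base_learner_jobs is not None:
        # Stage level: independent models within a stage run in parallel.
        fit_kwargs["n_jobs_base_learners"] = base_learner_jobs
    return init_kwargs, fit_kwargs


init_kwargs, fit_kwargs = parallelism_kwargs(base_model_threads=2, cross_fitting_jobs=5)

# Intended use (not executed here; learner_cls, X, y and w are placeholders):
#   learner = learner_cls(..., **init_kwargs)
#   learner.fit(X, y, w, **fit_kwargs)
```

Keeping constructor and fit settings apart mirrors where each level is configured and makes it easy to switch a single level on or off when comparing setups.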
Our experiments leveraging parallelism at various levels reveal that there is not a ‘one-size-fits-all’ setting; the optimal configuration varies significantly based on factors such as the choice of base models, the number of variants, the number of folds, or the number of observations.
We suggest assessing several configurations to determine the most effective approach for your specific use case. It is essential to remember that the most efficient configuration for one scenario may not work as effectively for another.
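One way to assess several configurations is to time a fit for every combination in a small grid. The sketch below uses only the Python standard library. The function time_configurations, the grid values, and the fit_once placeholder are illustrative assumptions, not part of the library. In practice, fit_once would build a MetaLearner with the given settings and call fit on your data.

```python
import itertools
import time


def time_configurations(fit_fn, grid):
    """Time fit_fn(**config) for every combination in grid, fastest first."""
    names = list(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        start = time.perf_counter()
        fit_fn(**config)
        results.append((time.perf_counter() - start, config))
    # Sort on the elapsed time only; the config dicts are not comparable.
    return sorted(results, key=lambda item: item[0])


# Illustrative grid: one entry per parallelism level.
grid = {
    "base_model_threads": [1, 4],
    "n_jobs_cross_fitting": [None, 5],
    "n_jobs_base_learners": [None, 2],
}


def fit_once(base_model_threads, n_jobs_cross_fitting, n_jobs_base_learners):
    # Placeholder workload. Replace with code that instantiates a MetaLearner
    # using these settings and fits it on your data.
    time.sleep(0.001)


ranking = time_configurations(fit_once, grid)
for seconds, config in ranking:
    print(f"{seconds:.4f}s  {config}")
```

Single timings can be noisy, so repeating each configuration a few times and comparing medians gives a more reliable ranking.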