I am trying to build ensembles to improve the performance of predictive models, and I am using the caretEnsemble package for this. As I understand it, caretList creates the list of models that should go into the ensemble, and then caretStack or caretEnsemble trains the ensemble, which can be used to predict the test data. I have tried different combinations of these algorithms: "rpart", "glmboost", "svmRadial", "svmLinear", "rf", "xgbTree", "xgbLinear". None of them resulted in good ensemble performance. This is an example of the code I have used:
set.seed(1234)
model_list <- caretList(X_train, y_train,
                        trControl = my_control,
                        methodList = c("lm", "glmboost", "svmLinear"),
                        tuneList = NULL,
                        continue_on_fail = FALSE,
                        preProcess = c("center", "scale"))

# Ensemble using caretStack
set.seed(123)
ensemble2 <- caretStack(model_list,
                        method = "gbm",
                        metric = "RMSE",
                        trControl = my_control)

# Ensemble using caretEnsemble
set.seed(123)
ensemble1 <- caretEnsemble(model_list,
                           metric = "RMSE",
                           trControl = my_control)
I am not satisfied with the results of the ensembles: I did not get even one ensemble model that exceeded the performance of one of the individual models.
Also, I have used model_list directly to predict the test data, and that was the only time I got good results. I know this does not make sense, since caretList simply creates a list of models and does not train an ensemble model, so how did it achieve good results?
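To make the comparison concrete, here is a minimal sketch of how the per-model and ensemble test errors could be computed side by side. It rests on several assumptions: `rmse_by_model` is a hypothetical helper, not part of caret or caretEnsemble; `X_test` and `y_test` are assumed names for the held-out data, which are not defined above; and calling `predict()` on model_list is assumed to return one column of predictions per base model, so the exact return type may differ between caretEnsemble versions.

```r
# Hypothetical helper (not part of caret or caretEnsemble): RMSE of each
# column of predictions against the observed values.
rmse_by_model <- function(preds, observed) {
  preds <- as.matrix(preds)
  apply(preds, 2, function(p) sqrt(mean((p - observed)^2)))
}

# Runs only when the objects from the question are present.
# X_test and y_test are assumed names for the held-out data.
if (exists("model_list") && exists("ensemble2") &&
    exists("X_test") && exists("y_test")) {
  library(caretEnsemble)

  # Assumed: one column per base model ("lm", "glmboost", "svmLinear").
  base_preds <- predict(model_list, newdata = X_test)
  print(rmse_by_model(base_preds, y_test))

  # The return type of predict() on a caretStack differs between package
  # versions, so it is flattened to a plain numeric vector here.
  stack_pred <- as.numeric(unlist(predict(ensemble2, newdata = X_test)))
  print(rmse_by_model(cbind(stack = stack_pred), y_test))
}
```

If the "good results" from model_list turn out to be one of these per-model columns, that would mean the good predictions came from a single base model rather than from any ensembling.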