# Multivariate Regression Ensemble Models for Errors Prediction

Written on 2023-11-29

In my last blog post, about multistep forecasting losses, I showed how to use the fantastic adam method from the smooth R package on household electricity consumption data and compared it with benchmarks.

Since I computed predictions from 10 methods/models over a long period of time, it would be nice to build some ensemble models that predict our household consumption data more precisely. For that purpose, it would be great to predict, for example, the future errors of these methods. This idea is used in some known ensemble methods that are not directly about stacking. Predicting errors can be useful for weighting predictions or for predicting the rank of the methods (i.e., picking the single best prediction). For the sake of learning something new, I will try multivariate regression models, that is, learning from multiple targets at once. At the very least, this has the benefit of simplicity: we need only one model for all base prediction models.

What I will show you in this post:

• Prediction of forecasting errors (absolute residuals) of simple exponential smoothing methods.
• Usage of multivariate multiple linear regression benchmarks.
• A check of whether the rank of forecasting methods is predicted correctly.
• Catboost MultiRMSE capability and an evaluation of whether it is more powerful than linear models.
• Computation of two types of ensemble predictions from predicted errors.

### Simple forecasting methods error prediction

First, we need to load the required packages. Be aware that catboost has to be installed with the devtools package in R.

I will use predictions and errors from the previous blog post; you can download them from my GitHub repository.

We will model absolute errors of predictions:

To be able to model all methods’ predictions at once, we need to cast predictions and errors to wide form.
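As a rough illustration of these two steps, here is a minimal base R sketch on a tiny made-up table. The column names (`time`, `method`, `real`, `pred`) and the values are hypothetical and do not come from the original data.

```r
# Hypothetical long-format table: one row per (time, method) pair.
long <- data.frame(
  time   = rep(1:3, times = 2),
  method = rep(c("A", "B"), each = 3),
  real   = rep(c(10, 12, 11), times = 2),
  pred   = c(9, 13, 11, 12, 10, 14)
)

# Absolute error of every single prediction.
long$abs_err <- abs(long$real - long$pred)

# Cast to wide form: one row per time stamp, one error column per method.
wide_err <- reshape(
  long[, c("time", "method", "abs_err")],
  idvar = "time", timevar = "method", direction = "wide"
)

print(wide_err)
```

The same cast can be applied to the `pred` column to get the predictions in wide form, so that both tables share the `time` key.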

Since we will train a regression model, we can also add other helpful features. Let’s define auxiliary daily and weekly seasonality features in the form of sine and cosine terms. You can also add other features, such as holidays and weather parameters, if that makes sense for your case.
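One possible way to build such features is sketched below. The helper name `seasonal_features` and the `obs_per_day` argument are assumptions made for illustration; set `obs_per_day` to the real sampling rate of your data.

```r
# Sine/cosine seasonal features for a regularly sampled series.
# `obs_per_day` is an assumed sampling rate, not a value from the post.
seasonal_features <- function(n, obs_per_day = 24) {
  t <- seq_len(n)
  data.frame(
    daily_sin  = sin(2 * pi * t / obs_per_day),
    daily_cos  = cos(2 * pi * t / obs_per_day),
    weekly_sin = sin(2 * pi * t / (7 * obs_per_day)),
    weekly_cos = cos(2 * pi * t / (7 * obs_per_day))
  )
}

# Two weeks of hypothetical hourly observations.
feats <- seasonal_features(n = 24 * 14)
head(feats, 3)
```

Using a sine and cosine pair per period lets a linear model represent a seasonal wave with any phase shift, which a single sine column cannot do.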

Let’s split the data into a train and a test set: the first 170 days of predictions go to the train part, and the remaining 44 days to the testing (evaluation) part.
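A day-based split like this can be sketched with a simple day index. The value of `obs_per_day` is again an assumption; only the 170 and 44 day counts come from the text.

```r
obs_per_day  <- 24    # assumed sampling rate
n_days_train <- 170
n_days_test  <- 44

# Day number of every observation, in chronological order.
day_index <- rep(seq_len(n_days_train + n_days_test), each = obs_per_day)

train_idx <- which(day_index <= n_days_train)
test_idx  <- which(day_index >  n_days_train)

c(train = length(train_idx), test = length(test_idx))
```

Splitting by whole days in time order keeps the evaluation honest, because no test observation precedes a training observation.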

Let’s see our error targets:

### Multivariate Linear Regression Model

As a benchmark against catboost, we will try a simple multivariate multiple linear regression model (MMLM) using the base lm function. You just need to use the cbind function inside the formula to incorporate multiple targets in the model.

Summaries of the model would be too large to print here, so I only list the most useful functions for analyzing the trained model yourself: summary, manova, coef, and confint.
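To make the `cbind` trick concrete, here is a minimal sketch on synthetic data. The feature names `x1`, `x2` and the targets `err_A`, `err_B` are hypothetical stand-ins for the real features and error columns.

```r
set.seed(1)
n <- 200
train <- data.frame(x1 = rnorm(n), x2 = rnorm(n))

# Two hypothetical error targets driven by the same features.
train$err_A <- abs(1 + 0.5 * train$x1 + rnorm(n, sd = 0.1))
train$err_B <- abs(2 - 0.3 * train$x2 + rnorm(n, sd = 0.1))

# cbind() on the left-hand side fits one model with several responses.
mmlm <- lm(cbind(err_A, err_B) ~ x1 + x2, data = train)

coef(mmlm)                                      # one coefficient column per target
pred <- predict(mmlm, newdata = train[1:5, ])   # matrix: one column per target
pred
```

The fitted object has class `mlm`, and `predict` returns a matrix with one column per target, which is convenient for the row-wise rank and ensemble computations later on.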

Now, let’s predict errors on our selected test set and directly compute prediction errors and Rank accuracy.

We can see that the GTMSE, TMSE, and GPL losses have the best MAEs. On the other hand, the MSEh loss has the best Rank estimation, and the simple MAE loss method has the best first-Rank estimation accuracy. In general, we miss the Rank by 2.6 on average out of 10 possible ranks.
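The rank metrics can be computed in several ways; the sketch below shows one plausible definition on a tiny made-up example. The matrices and the helper `row_ranks` are hypothetical: rows are time stamps, columns are methods, and rank 1 means the lowest error.

```r
# Hypothetical real and predicted absolute errors of three methods.
real_err <- matrix(c(1, 2, 3,
                     3, 1, 2,
                     2, 3, 1), nrow = 3, byrow = TRUE,
                   dimnames = list(NULL, c("A", "B", "C")))
pred_err <- matrix(c(1.1, 2.5, 2.0,
                     2.0, 1.5, 3.0,
                     3.0, 2.0, 1.0), nrow = 3, byrow = TRUE,
                   dimnames = list(NULL, c("A", "B", "C")))

# Rank the methods within each row (time stamp).
row_ranks <- function(m) t(apply(m, 1, rank, ties.method = "min"))

# Classical error of the error model.
mae <- mean(abs(pred_err - real_err))

# Average number of rank positions by which we miss.
mean_rank_miss <- mean(abs(row_ranks(pred_err) - row_ranks(real_err)))

# Share of rows where the predicted best method is the real best method.
first_rank_acc <- mean(apply(pred_err, 1, which.min) ==
                       apply(real_err, 1, which.min))

c(mae = mae, mean_rank_miss = mean_rank_miss, first_rank_acc = first_rank_acc)
```

Here the predicted best method is correct in every row even though some of the lower ranks are swapped, which is why the two rank metrics are worth reporting separately.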

Let’s see error predictions from four of our methods.

At first sight, it can be seen that we can nicely hit low and medium magnitudes of errors. Again, as shown in my previous post, higher values can’t be predicted.

### Shrinkage - LASSO with full interactions

As the second multivariate model benchmark, I will try the shrinkage capability of LASSO (glmnet) and add all pairwise interactions to the model (by using .^2 in the formula).

To check estimated coefficients, we can use the predict method.
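The sketch below shows what `.^2` does to the design matrix and how a multi-target LASSO could be fitted on it. The data and names are synthetic and hypothetical, and the glmnet part is only a guess at a reasonable setup (`family = "mgaussian"` with cross-validated lambda); it runs only if glmnet is installed.

```r
set.seed(1)
n <- 200
train <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# Two hypothetical error targets as a response matrix.
Y <- cbind(err_A = abs(1 + train$x1 * train$x2 + rnorm(n, sd = 0.1)),
           err_B = abs(2 + train$x3 + rnorm(n, sd = 0.1)))

# .^2 expands to all main effects plus all pairwise interactions.
X <- model.matrix(~ .^2, data = train)[, -1]   # drop the intercept column
colnames(X)

if (requireNamespace("glmnet", quietly = TRUE)) {
  # Multi-response Gaussian LASSO with cross-validated penalty strength.
  cv_fit <- glmnet::cv.glmnet(X, Y, family = "mgaussian")
  # Estimated coefficients at the selected lambda, one entry per target.
  lasso_coef <- predict(cv_fit, type = "coefficients", s = "lambda.min")
  # Predicted errors for the first five rows.
  lasso_pred <- predict(cv_fit, newx = X[1:5, ], s = "lambda.min")
}
```

With many base predictions as features, the number of interaction columns grows quadratically, which is the main reason to pair `.^2` with a shrinkage method.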

Let’s predict errors on the test set and compute prediction errors.

We can see that the same methods have the best MAEs as with MMLM. On the other hand, the GTMSE loss has the best Rank estimation, and the MSCE multistep loss method has the best first-Rank estimation accuracy. In general, we miss the Rank by 2.9 on average out of 10 possible ranks, which is 0.3 worse than MMLM. We can say that LASSO didn’t help our error modeling much.

But, let’s see LASSO error predictions from four of our methods.

### Catboost MultiRMSE

The final multivariate regression method will be Catboost gradient boosting trees with the multi-target regression loss MultiRMSE. You can check the documentation for this loss here: https://catboost.ai/en/docs/concepts/loss-functions-multiregression.

Let’s define the training method with early stopping based on the validation set and directly train on our dataset.

We can check the Feature Importance of the trained model:

STL_ETS and MAE simple loss predictions have the highest importance to the multivariate model.

Let’s predict the test set and compute prediction errors.

We can see that the MAE and MdAE metrics were improved by catboost. On the other hand, average Rank estimation and first Rank estimation are not better than with MMLM.

Let’s plot our predicted errors.

### Models analysis

To have it nicely together, let’s bind all three models’ predictions and compute prediction errors for models and methods.

As already mentioned, the best multivariate regression model based on classical prediction metrics is Catboost’s MultiRMSE. On the other hand, the best model based on correct Rank estimation is the simple multivariate linear regression.

Let’s do some basic model analysis and see a boxplot of absolute errors.

Predicted vs. real values scatter plot:

Errors vs. real values scatter plot:

And a heteroscedasticity check, i.e., absolute errors vs. real values:

From all the above graphs, we can see that low and medium magnitudes of errors can be nicely predicted, but again prediction of high values of errors is a very difficult task.

We can also use a nonparametric multiple comparisons test (the Nemenyi test) to decide which error model is best (by Absolute Errors - AE).
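The Nemenyi test is a post-hoc comparison of mean ranks that follows a Friedman test, and it needs a dedicated package. As a base R illustration of the underlying idea only, the sketch below runs the Friedman test and computes the mean ranks on synthetic absolute errors; the model names `M1` to `M3` and the numbers are made up.

```r
set.seed(1)
n <- 100

# Synthetic absolute errors of three error models on the same test points.
ae <- cbind(M1 = abs(rnorm(n, sd = 1.0)),
            M2 = abs(rnorm(n, sd = 1.1)),
            M3 = abs(rnorm(n, sd = 0.8)))

# Friedman test: rows are blocks (test points), columns are models.
ft <- friedman.test(ae)

# Mean rank of each model; lower means smaller errors on average.
mean_ranks <- colMeans(t(apply(ae, 1, rank)))

ft$p.value
sort(mean_ranks)
```

The Nemenyi procedure then declares two models different when their mean ranks differ by more than a critical distance.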

Catboost is best all the way.

We can also use the Nemenyi test to ask which of the 10 models’ errors can be predicted best.

We see that GTMSE and TMSE have the most predictable errors; for the methods with non-multistep losses (benchmarks), it is more difficult.

### Ensemble predictions

Finally, we can now construct some ensemble predictions from our multivariate error model predictions.

#### Best Rank prediction ensemble

The first method will be very simple: as the final ensemble prediction, we will use the prediction with the lowest estimated error.
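A minimal sketch of this selection rule, on two hypothetical matrices with matching layout (rows are time stamps, columns are methods):

```r
# Hypothetical base predictions and their predicted absolute errors.
preds <- matrix(c(10, 12, 11,
                  20, 19, 22), nrow = 2, byrow = TRUE,
                dimnames = list(NULL, c("A", "B", "C")))
pred_err <- matrix(c(0.5, 0.2, 0.9,
                     1.0, 1.5, 0.3), nrow = 2, byrow = TRUE,
                   dimnames = list(NULL, c("A", "B", "C")))

# Column index of the method with the lowest predicted error in each row.
best_col <- apply(pred_err, 1, which.min)

# Pick that method's prediction row by row via matrix indexing.
ensemble <- preds[cbind(seq_len(nrow(preds)), best_col)]
ensemble
```

In the first row method B has the lowest predicted error and in the second row method C does, so the ensemble takes 12 and 22.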

Let’s evaluate ensemble predictions by our error models and also compute metrics for our base predictions for comparison.

All three ensembles are better than any base prediction based on MAE. Well done!

Let’s plot a graph of the final ensemble predictions:

Catboost doesn’t go below zero as much as the other two.

#### Weighted average Ensemble

The second ensemble method will be based on weighting by the estimated errors. First, we need to normalize all weights (errors) with the Min-max method and compute each weight’s ratio to the sum.

We can use all predictions, but a better way is to remove the bad half of the predictions from the ensemble:

Now, let’s join weights and predictions and compute final ensemble predictions.
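The weighting steps above can be sketched as follows. This is one possible reading, not necessarily the exact computation behind the results: it assumes that the min-max scaled error is inverted so that the lowest error gets the highest weight, and that the worse half of the methods is dropped by setting their weights to zero. The helpers `minmax` and `error_weights` and all numbers are hypothetical.

```r
# Min-max scaling to [0, 1]; returns zeros if all values are equal.
minmax <- function(x) {
  rng <- range(x)
  if (rng[1] == rng[2]) return(rep(0, length(x)))
  (x - rng[1]) / (rng[2] - rng[1])
}

# Weights for one time stamp from the predicted errors of all methods.
error_weights <- function(err, keep = ceiling(length(err) / 2)) {
  score <- 1 - minmax(err)                              # lowest error -> 1
  score[rank(err, ties.method = "first") > keep] <- 0   # drop the worse half
  score / sum(score)                                    # ratio to the sum
}

# Hypothetical predicted errors and base predictions (rows = time stamps).
pred_err <- matrix(c(0.2, 0.6, 1.0, 0.4,
                     1.0, 0.1, 0.3, 0.9), nrow = 2, byrow = TRUE)
preds    <- matrix(c(10, 14, 20, 12,
                      8,  9, 11,  7), nrow = 2, byrow = TRUE)

W <- t(apply(pred_err, 1, error_weights))   # one weight row per time stamp
ensemble <- rowSums(W * preds)              # weighted average per time stamp
ensemble
```

Every row of `W` sums to one, so each ensemble value stays within the range of the predictions that were kept for that time stamp.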

Let’s evaluate them:

Weighting works even better than the best-rank approach!

Let’s plot a graph of ensemble predictions:

Ultimately, let’s decide which model is best for household electricity consumption prediction:

Weighted average ensemble prediction using the MMLM method is best!

### Summary

• I showed how easily multiple targets can be modeled with three methods: MMLM, LASSO, and Catboost in R
• I trained a model for predicting absolute errors of mostly simple exponential smoothing methods
• The multivariate models can nicely predict errors of low-medium magnitude
• Again, the high errors cannot be predicted with those simple models
• Catboost showed the best results in terms of the absolute error measure
• I used predicted errors to create two types of ensemble predictions of electricity consumption
• Both ensemble methods showed better results than any original prediction