Stacked Ensembles for Advanced Predictive Modeling With H2O.ai and Optuna | by Sheila Teo | Dec, 2023


And how I placed in the top 10% of one of Europe’s largest machine learning competitions with them!

Image generated by DALL·E 3, depicting a stacked landscape

We all know that ensemble models often outperform any single model at predictive modeling. You’ve probably heard all about bagging and boosting as common ensemble methods, with Random Forests and Gradient Boosting Machines as respective examples.

But what about ensembling different models together under a separate higher-level model? This is where stacked ensembles come in. This article is a step-by-step guide on how to train stacked ensembles using the popular machine learning library, H2O.

To demonstrate the power of stacked ensembles, I will walk through my full code for training a stacked ensemble of 40 Deep Neural Network, XGBoost and LightGBM models for the prediction task posed in the 2023 Cloudflight Coding Competition (AI Category), one of the largest coding competitions in Europe. With this ensemble, I placed in the top 10% of the competition leaderboard within a training time of 1 hour!

This guide will cover:

  1. What are stacked ensembles and how do they work?
  2. How to train stacked ensembles with H2O.ai
  3. Comparing the performance of a stacked ensemble versus standalone models

A stacked ensemble combines predictions from multiple models through another, higher-level model, with the aim of increasing overall predictive performance by capitalizing on the unique strengths of each constituent model. It involves two stages:

Stage 1: Multiple Base Models

First, multiple base models are independently trained on the same training dataset. These models should ideally be diverse, ranging from simple linear regressions to complex deep learning models. The key is that they should differ from each other in some way, either in terms of algorithm or hyperparameter settings.
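
To make Stage 1 concrete, here is a minimal sketch in plain Python. It is not my competition code and it deliberately avoids H2O: the toy dataset, the three hand-written model types and all names (`fit_mean`, `fit_linear`, `fit_knn`, `level_one`) are assumptions made purely for illustration. It trains three different base models on the same training data, then collects their predictions on held-out points, which is the kind of input a higher-level model would consume.

```python
import random
from statistics import mean

# Toy 1-D regression data (an assumption for illustration): y = x^2 + noise.
random.seed(0)
xs = [i / 10 for i in range(40)]
ys = [x ** 2 + random.gauss(0, 0.5) for x in xs]

# Even-indexed points train the base models; odd-indexed points are held out.
train_x, train_y = xs[0::2], ys[0::2]
hold_x, hold_y = xs[1::2], ys[1::2]


def fit_mean(x, y):
    """Simplest possible model: always predict the training mean."""
    m = mean(y)
    return lambda q: m


def fit_linear(x, y):
    """Ordinary least squares line, fitted in closed form."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return lambda q: my + slope * (q - mx)


def fit_knn(x, y, k=3):
    """k-nearest neighbours: average the targets of the k closest points."""
    points = list(zip(x, y))

    def predict(q):
        nearest = sorted(points, key=lambda p: abs(p[0] - q))[:k]
        return mean(t for _, t in nearest)

    return predict


# Stage 1: every base model is trained independently on the same training data.
base_models = {
    "mean": fit_mean(train_x, train_y),
    "linear": fit_linear(train_x, train_y),
    "knn": fit_knn(train_x, train_y),
}

# One column of held-out predictions per base model; a higher-level model
# would take rows like these as its inputs.
level_one = [[model(q) for model in base_models.values()] for q in hold_x]

for name, model in base_models.items():
    mse = mean((model(q) - t) ** 2 for q, t in zip(hold_x, hold_y))
    print(f"{name:>6} hold-out MSE: {mse:.3f}")
```

Running this prints one hold-out error per base model. Each row of `level_one` holds three different predictions for the same point, so the models disagree with each other, and that disagreement is the raw material a stacked ensemble works with. A simple hold-out split is used here only to keep the sketch short.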

The more diverse the base models are, the more powerful the eventual stacked ensemble. This is because different models are able to capture different patterns in the…


