HistGradientBoosting

HistGradientBoosting (HistGBT) is scikit-learn’s native implementation of histogram-based gradient boosting, and was one of the three base learners in the v4.3 EPF ensemble.

Key Characteristics

| Property | Detail |
| --- | --- |
| Library | scikit-learn 1.5+ |
| Class | `HistGradientBoostingRegressor` |
| Growth | Depth-wise (level-by-level) |
| Loss | quantile (q=0.55) |
| NaN handling | Native; learns optimal split direction for missing values |
| Histogram bins | Up to 255 bins per feature |

Why HistGBT in the Ensemble?

Native Missing Value Support

The most important advantage: HistGBT handles NaN values natively during tree construction. At each split, the algorithm considers sending missing values to the left child or the right child and picks whichever direction minimizes the loss.

This is critical for EPF because:

  • D+1 price features are NaN during morning (day-ahead) runs when OMIE hasn’t published yet
  • Some weather variables may be unavailable at certain hours
  • Commodity data has gaps on weekends and holidays

No imputation step is needed — the model learns the best default direction for each feature.

Scikit-learn Ecosystem

Being a scikit-learn estimator means HistGBT integrates seamlessly with:

  • TimeSeriesSplit for cross-validation
  • Joblib serialization for model persistence
  • Standard .fit() / .predict() API

Configuration

```python
from sklearn.ensemble import HistGradientBoostingRegressor

model = HistGradientBoostingRegressor(
    loss="quantile",
    quantile=0.55,
    max_iter=500,
    max_depth=8,
    learning_rate=0.05,
    min_samples_leaf=20,
    l2_regularization=0.1,
    early_stopping=True,
    validation_fraction=0.1,
    n_iter_no_change=20,
    random_state=42,
)
```

Parameter Notes

  • loss="quantile", quantile=0.55 — Targets the 55th percentile, slightly above the median, to counteract underprediction bias on right-skewed electricity prices
  • max_depth=8 — Allows moderately deep trees to capture complex feature interactions (e.g., temperature × demand × time-of-day)
  • early_stopping=True — Reserves 10% of training data as validation; stops after 20 rounds with no improvement to prevent overfitting
  • min_samples_leaf=20 — Prevents overly specific leaf nodes, providing regularization
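The effect of quantile=0.55 can be seen directly in the pinball (quantile) loss, which penalizes under-prediction more heavily than over-prediction at q > 0.5. A small worked sketch (the 10 EUR/MWh error size is illustrative):

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    # Quantile (pinball) loss: under-predictions cost q per unit,
    # over-predictions cost (1 - q) per unit.
    diff = y_true - y_pred
    return np.mean(np.maximum(q * diff, (q - 1) * diff))

y_true = np.array([50.0, 60.0, 70.0])

# At q = 0.55, missing low by 10 EUR/MWh costs more than missing high by 10.
under = pinball_loss(y_true, y_true - 10, q=0.55)  # model predicts too low -> 5.5
over = pinball_loss(y_true, y_true + 10, q=0.55)   # model predicts too high -> 4.5
```

This asymmetry is what nudges the fitted model upward, counteracting the underprediction bias on right-skewed prices.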

Performance in the Ensemble

HistGBT’s strengths and weaknesses complement the other models:

  • Tends to produce lower variance predictions due to conservative depth-wise growth
  • Handles feature interactions well through sufficient tree depth
  • May underfit slightly compared to LightGBM’s more aggressive leaf-wise growth

In the v4.3 strategic product, HistGBT achieved an MAE of 25.92 EUR/MWh with a bias of -13.71 EUR/MWh. While its individual bias was significant, it was offset by the opposing biases of the other models when averaged in the ensemble.
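The cancellation effect is simple arithmetic. Only HistGBT's -13.71 EUR/MWh bias comes from the source; the two positive biases below are hypothetical values chosen purely to illustrate how averaging offsets opposing biases:

```python
import numpy as np

# -13.71 is HistGBT's stated bias; +8.0 and +7.0 are hypothetical
# biases for the other two ensemble members, for illustration only.
biases = np.array([-13.71, 8.0, 7.0])

# A simple average of member predictions averages their biases too.
ensemble_bias = biases.mean()  # ~0.43: far smaller than any single member
```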

Hyperparameter Optimization

HistGBT parameters can be tuned with Optuna:

| Parameter | Search Range | Scale |
| --- | --- | --- |
| max_iter | 200–1500 | Linear |
| max_depth | 4–12 | Linear |
| learning_rate | 0.01–0.2 | Log |
| l2_regularization | 0.01–10.0 | Log |
| min_samples_leaf | 5–50 | Linear |
| max_bins | 128–255 | Linear |

Optimization uses 5-fold TimeSeriesSplit with quantile loss as the objective, typically running 100 trials per horizon group.