HistGradientBoosting

HistGradientBoosting (HistGBT) is scikit-learn’s native implementation of histogram-based gradient boosting, and was one of the three base learners in the v4.3 EPF ensemble.

Key Characteristics

| Property | Detail |
| --- | --- |
| Library | scikit-learn 1.5+ |
| Class | `HistGradientBoostingRegressor` |
| Growth | Depth-wise (level-by-level) |
| Loss | quantile (q=0.55) |
| NaN handling | Native; learns optimal split direction for missing values |
| Histogram bins | Up to 255 bins per feature |

Why HistGBT in the Ensemble?

Native Missing Value Support

The most important advantage: HistGBT handles NaN values natively during tree construction. At each split, the algorithm considers sending missing values to the left child or the right child and picks whichever direction minimizes the loss.

This is critical for EPF because:

  • D+1 price features are NaN during morning (day-ahead) runs when OMIE hasn’t published yet
  • Some weather variables may be unavailable at certain hours
  • Commodity data has gaps on weekends and holidays

No imputation step is needed — the model learns the best default direction for each feature.

Scikit-learn Ecosystem

Being a scikit-learn estimator means HistGBT integrates seamlessly with:

  • TimeSeriesSplit for cross-validation
  • Joblib serialization for model persistence
  • Standard .fit() / .predict() API

Configuration

```python
from sklearn.ensemble import HistGradientBoostingRegressor

model = HistGradientBoostingRegressor(
    loss="quantile",
    quantile=0.55,
    max_iter=500,
    max_depth=8,
    learning_rate=0.05,
    min_samples_leaf=20,
    l2_regularization=0.1,
    early_stopping=True,
    validation_fraction=0.1,
    n_iter_no_change=20,
    random_state=42,
)
```

Parameter Notes

  • loss="quantile", quantile=0.55 — Targets the 55th percentile, slightly above the median, to counteract underprediction bias on right-skewed electricity prices
  • max_depth=8 — Allows moderately deep trees to capture complex feature interactions (e.g., temperature × demand × time-of-day)
  • early_stopping=True — Reserves 10% of training data as validation; stops after 20 rounds with no improvement to prevent overfitting
  • min_samples_leaf=20 — Prevents overly specific leaf nodes, providing regularization
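The effect of quantile=0.55 can be seen directly in the pinball (quantile) loss, which penalizes under-prediction more heavily than over-prediction at q > 0.5. A small worked sketch (the 10 EUR/MWh error size is illustrative):

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    # Quantile (pinball) loss: under-predictions cost q per unit,
    # over-predictions cost (1 - q) per unit.
    diff = y_true - y_pred
    return np.mean(np.maximum(q * diff, (q - 1) * diff))

y_true = np.array([50.0, 60.0, 70.0])

# At q = 0.55, missing low by 10 EUR/MWh costs more than missing high by 10.
under = pinball_loss(y_true, y_true - 10, q=0.55)  # model predicts too low -> 5.5
over = pinball_loss(y_true, y_true + 10, q=0.55)   # model predicts too high -> 4.5
```

This asymmetry is what nudges the fitted model upward, counteracting the underprediction bias on right-skewed prices.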

Performance in the Ensemble

HistGBT’s strengths and weaknesses complement the other models:

  • Tends to produce lower variance predictions due to conservative depth-wise growth
  • Handles feature interactions well through sufficient tree depth
  • May underfit slightly compared to LightGBM’s more aggressive leaf-wise growth

In the v4.3 strategic product, HistGBT achieved an MAE of 25.92 EUR/MWh with a bias of -13.71 EUR/MWh. While its individual bias was significant, it was offset by the opposing biases of the other models when averaged in the ensemble.
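The cancellation effect is simple arithmetic. Only HistGBT's -13.71 EUR/MWh bias comes from the source; the two positive biases below are hypothetical values chosen purely to illustrate how averaging offsets opposing biases:

```python
import numpy as np

# -13.71 is HistGBT's stated bias; +8.0 and +7.0 are hypothetical
# biases for the other two ensemble members, for illustration only.
biases = np.array([-13.71, 8.0, 7.0])

# A simple average of member predictions averages their biases too.
ensemble_bias = biases.mean()  # ~0.43: far smaller than any single member
```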

Hyperparameter Optimization

HistGBT parameters can be tuned with Optuna:

| Parameter | Search Range | Scale |
| --- | --- | --- |
| max_iter | 200–1500 | Linear |
| max_depth | 4–12 | Linear |
| learning_rate | 0.01–0.2 | Log |
| l2_regularization | 0.01–10.0 | Log |
| min_samples_leaf | 5–50 | Linear |
| max_bins | 128–255 | Linear |

Optimization uses 5-fold TimeSeriesSplit with quantile loss as the objective, typically running 100 trials per horizon group.