LightGBM
Overview
LightGBM (Light Gradient Boosting Machine) was the second model in the v4.3 EPF ensemble. Developed by Microsoft Research, it uses a leaf-wise tree growth strategy and histogram-based binning that differs from the other ensemble members, contributing algorithmic diversity to the combined forecast.
Hyperparameters
| Parameter | Value | Purpose |
|---|---|---|
| objective | quantile | Quantile loss function |
| alpha | 0.55 | Quantile target (55th percentile) |
| n_estimators | 500 | Maximum boosting iterations |
| max_depth | 8 | Maximum tree depth |
| learning_rate | 0.05 | Shrinkage per iteration |
| min_child_samples | 20 | Minimum samples per leaf |
| reg_lambda | 0.1 | L2 regularization on leaf values |
| num_leaves | 63 | Maximum leaves per tree |
| random_state | 42 | Reproducibility |
| device | gpu / cpu | Hardware acceleration |
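The `objective=quantile` / `alpha=0.55` pair minimizes the pinball loss at the 55th percentile. A minimal pure-Python sketch of that loss, for illustration only (LightGBM computes it internally):

```python
def pinball_loss(y_true, y_pred, alpha=0.55):
    """Quantile (pinball) loss averaged over samples.

    Under-prediction is penalized by alpha, over-prediction by (1 - alpha),
    so alpha=0.55 biases forecasts slightly above the median.
    """
    total = 0.0
    for yt, yp in zip(y_true, y_pred):
        diff = yt - yp
        total += alpha * diff if diff >= 0 else (alpha - 1) * diff
    return total / len(y_true)

# Under-predicting by 1 unit costs 0.55; over-predicting by 1 costs 0.45.
```

The asymmetry is mild by design: the target sits just above the median, nudging the model against systematic under-forecasting.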
Leaf-Wise Growth
LightGBM’s key algorithmic difference is leaf-wise tree construction:
Level-wise (HistGBM, XGBoost):

```
        [root]
        /    \
     [L1]    [R1]       ← grow entire level
     /  \    /  \
  [L2][R2][L3][R3]      ← grow entire level
```

Leaf-wise (LightGBM):

```
        [root]
        /    \
     [L1]    [R1]       ← grow best leaf only
             /  \
          [L2]  [R2]    ← grow best leaf only
                   \
                 [R2b]  ← grow best leaf only
```

Leaf-wise growth always splits the leaf with the highest loss reduction, regardless of depth. This produces more accurate trees with the same number of leaves but can overfit on small datasets.
The num_leaves=63 parameter controls complexity: fewer leaves mean simpler trees and stronger regularization.
GPU Acceleration
LightGBM supports native GPU training:
```python
{"device": "gpu" if gpu_available else "cpu"}
```

GPU acceleration provides a 3–5× training speedup, which is particularly valuable during:
- Hyperparameter tuning (hundreds of trial combinations)
- Backtesting with daily retraining (months of daily model fits)
- 15-minute resolution training (4× more data than hourly)
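The device switch above can be wrapped in a small helper so the fallback is explicit. A sketch, assuming the `gpu_available` flag comes from some environment probe (the probe itself is not shown here and is not part of the original config):

```python
def lgbm_device_params(gpu_available: bool) -> dict:
    """Return LightGBM device parameters, falling back to CPU.

    In practice `gpu_available` would be set by probing the runtime
    (e.g. checking for a CUDA/OpenCL device); here it is just a flag.
    """
    if gpu_available:
        return {"device": "gpu"}
    return {"device": "cpu"}
```

Keeping the fallback in one place matters because GPU builds of LightGBM are optional: the same training code then runs unchanged on CPU-only machines.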
Ensemble Diversity Contribution
The value of LightGBM in the ensemble comes from its algorithmic differences:
| Aspect | HistGBM | LightGBM | XGBoost |
|---|---|---|---|
| Growth strategy | Level-wise | Leaf-wise | Level-wise |
| Framework | scikit-learn | Microsoft | DMLC |
| Binning | Histogram | Histogram | Histogram |
| Regularization | L2 on leaves | L2 + num_leaves | L1 + L2 |
| Missing values | Native | Native | Native |
Because LightGBM uses leaf-wise growth, it makes different splitting decisions than the level-wise models. This means its errors are partially uncorrelated — when LightGBM makes a mistake, the other models may not, and vice versa. Ensembling averages out these individual errors.
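The error-averaging effect can be illustrated numerically: averaging three models with independent zero-mean errors shrinks the error spread by roughly 1/√3. A minimal simulation with synthetic errors (not real model outputs):

```python
import random
import statistics

random.seed(42)
n = 10_000

# Three models with independent errors (real ensemble members are only
# partially uncorrelated, so the real-world reduction is smaller).
errors = [[random.gauss(0, 1) for _ in range(n)] for _ in range(3)]
ensemble_errors = [sum(e[i] for e in errors) / 3 for i in range(n)]

individual_std = statistics.pstdev(errors[0])      # ≈ 1.0
ensemble_std = statistics.pstdev(ensemble_errors)  # ≈ 1/sqrt(3) ≈ 0.58
```

In practice the members' errors are correlated, so the reduction is less than 1/√3 — which is exactly why algorithmic diversity (leaf-wise vs. level-wise growth) is valuable: it lowers that correlation.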
Feature Importance
LightGBM provides feature_importances_ (gain-based) and supports SHAP analysis for interpretable feature contributions. The most important features for LightGBM typically align with the overall feature importance ranking:
- Price lags (24h, 168h)
- Demand / residual demand
- Temporal encoding (hour, day of week)
- Commodity prices (gas, carbon)
- Weather interactions
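Extracting the ranking from a fitted model amounts to sorting gain values by magnitude. A sketch using hypothetical importance values (the feature names and numbers below are illustrative, not measured results; a real model would supply them via `feature_importances_`):

```python
# Hypothetical gain-based importances, mirroring the ranking above.
importances = {
    "price_lag_24h": 0.31,
    "price_lag_168h": 0.22,
    "residual_demand": 0.18,
    "hour_of_day": 0.12,
    "gas_price": 0.10,
    "temp_x_demand": 0.07,
}

# Sort descending by gain to get the importance ranking.
ranking = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
for name, gain in ranking:
    print(f"{name:>16}: {gain:.2f}")
```

For per-prediction attributions rather than a global ranking, SHAP's tree explainer can be applied to the fitted booster, as noted above.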
Training Process
- Receive training data (features + target prices) for a specific horizon group
- Split into 5-fold TimeSeriesSplit for cross-validation
- Train with early stopping: if validation loss doesn’t improve for 20 rounds, stop
- Record per-fold metrics (MAE, RMSE)
- Collect out-of-fold residuals for conformal calibration
- Save trained model + metadata as joblib artifact
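The fold structure in step 2 can be sketched as an expanding-window splitter — a simplified stand-in for scikit-learn's TimeSeriesSplit (ignoring gaps and uneven fold sizes), shown only to make the no-leakage property concrete:

```python
def time_series_splits(n_samples, n_splits=5):
    """Yield (train_idx, val_idx) pairs where each validation fold
    strictly follows its training window in time, so no future data
    leaks into training."""
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train = list(range(0, fold * k))           # expanding window
        val = list(range(fold * k, fold * (k + 1)))  # next block in time
        yield train, val
```

Each successive fold trains on a longer history and validates on the block that follows it; the out-of-fold predictions on those validation blocks are what feed the conformal calibration in step 5.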