LightGBM
Overview
LightGBM (Light Gradient Boosting Machine) was the second model in the v4.3 EPF ensemble. Developed by Microsoft Research, it uses a leaf-wise tree growth strategy and histogram-based binning that differs from the other ensemble members, contributing algorithmic diversity to the combined forecast.
Hyperparameters
| Parameter | Value | Purpose |
|---|---|---|
| objective | quantile | Quantile loss function |
| alpha | 0.55 | Quantile target (55th percentile) |
| n_estimators | 500 | Maximum boosting iterations |
| max_depth | 8 | Maximum tree depth |
| learning_rate | 0.05 | Shrinkage per iteration |
| min_child_samples | 20 | Minimum samples per leaf |
| reg_lambda | 0.1 | L2 regularization on leaf values |
| num_leaves | 63 | Maximum leaves per tree |
| random_state | 42 | Reproducibility |
| device | gpu / cpu | Hardware acceleration |
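The `objective=quantile` / `alpha=0.55` pair minimizes the pinball loss at the 55th percentile. A minimal pure-Python sketch of that loss, for illustration only (LightGBM computes it internally):

```python
def pinball_loss(y_true, y_pred, alpha=0.55):
    """Quantile (pinball) loss averaged over samples.

    Under-prediction is penalized by alpha, over-prediction by (1 - alpha),
    so alpha=0.55 biases forecasts slightly above the median.
    """
    total = 0.0
    for yt, yp in zip(y_true, y_pred):
        diff = yt - yp
        total += alpha * diff if diff >= 0 else (alpha - 1) * diff
    return total / len(y_true)

# Under-predicting by 1 unit costs 0.55; over-predicting by 1 costs 0.45.
```

The asymmetry is mild by design: the target sits just above the median, nudging the model against systematic under-forecasting.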
Leaf-Wise Growth
LightGBM’s key algorithmic difference is leaf-wise tree construction:
Level-wise (HistGBM, XGBoost):

```
        [root]
        /    \
     [L1]    [R1]       ← grow entire level
     /  \    /  \
  [L2][R2][L3][R3]      ← grow entire level
```

Leaf-wise (LightGBM):

```
        [root]
        /    \
     [L1]    [R1]       ← grow best leaf only
             /  \
          [L2]  [R2]    ← grow best leaf only
                   \
                 [R2b]  ← grow best leaf only
```

Leaf-wise growth always splits the leaf with the highest loss reduction, regardless of depth. This produces more accurate trees with the same number of leaves but can overfit on small datasets.
The num_leaves=63 parameter controls complexity: fewer leaves mean simpler trees and stronger regularization.
GPU Acceleration
LightGBM supports native GPU training:
```python
{"device": "gpu" if gpu_available else "cpu"}
```

GPU acceleration provides a 3–5× training speedup, which is particularly valuable during:
- Hyperparameter tuning (hundreds of trial combinations)
- Backtesting with daily retraining (months of daily model fits)
- 15-minute resolution training (4× more data than hourly)
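The device switch above can be wrapped in a small helper so the fallback is explicit. A sketch, assuming the `gpu_available` flag comes from some environment probe (the probe itself is not shown here and is not part of the original config):

```python
def lgbm_device_params(gpu_available: bool) -> dict:
    """Return LightGBM device parameters, falling back to CPU.

    In practice `gpu_available` would be set by probing the runtime
    (e.g. checking for a CUDA/OpenCL device); here it is just a flag.
    """
    if gpu_available:
        return {"device": "gpu"}
    return {"device": "cpu"}
```

Keeping the fallback in one place matters because GPU builds of LightGBM are optional: the same training code then runs unchanged on CPU-only machines.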
Ensemble Diversity Contribution
The value of LightGBM in the ensemble comes from its algorithmic differences:
| Aspect | HistGBM | LightGBM | XGBoost |
|---|---|---|---|
| Growth strategy | Level-wise | Leaf-wise | Level-wise |
| Framework | scikit-learn | Microsoft | DMLC |
| Binning | Histogram | Histogram | Histogram |
| Regularization | L2 on leaves | L2 + num_leaves | L1 + L2 |
| Missing values | Native | Native | Native |
Because LightGBM uses leaf-wise growth, it makes different splitting decisions than the level-wise models. This means its errors are partially uncorrelated — when LightGBM makes a mistake, the other models may not, and vice versa. Ensembling averages out these individual errors.
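The error-averaging effect can be illustrated numerically: averaging three models with independent zero-mean errors shrinks the error spread by roughly 1/√3. A minimal simulation with synthetic errors (not real model outputs):

```python
import random
import statistics

random.seed(42)
n = 10_000

# Three models with independent errors (real ensemble members are only
# partially uncorrelated, so the real-world reduction is smaller).
errors = [[random.gauss(0, 1) for _ in range(n)] for _ in range(3)]
ensemble_errors = [sum(e[i] for e in errors) / 3 for i in range(n)]

individual_std = statistics.pstdev(errors[0])      # ≈ 1.0
ensemble_std = statistics.pstdev(ensemble_errors)  # ≈ 1/sqrt(3) ≈ 0.58
```

In practice the members' errors are correlated, so the reduction is less than 1/√3 — which is exactly why algorithmic diversity (leaf-wise vs. level-wise growth) is valuable: it lowers that correlation.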
Feature Importance
LightGBM provides feature_importances_ (gain-based) and supports SHAP analysis for interpretable feature contributions. The most important features for LightGBM typically align with the overall feature importance ranking:
- Price lags (24h, 168h)
- Demand / residual demand
- Temporal encoding (hour, day of week)
- Commodity prices (gas, carbon)
- Weather interactions
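Extracting the ranking from a fitted model amounts to sorting gain values by magnitude. A sketch using hypothetical importance values (the feature names and numbers below are illustrative, not measured results; a real model would supply them via `feature_importances_`):

```python
# Hypothetical gain-based importances, mirroring the ranking above.
importances = {
    "price_lag_24h": 0.31,
    "price_lag_168h": 0.22,
    "residual_demand": 0.18,
    "hour_of_day": 0.12,
    "gas_price": 0.10,
    "temp_x_demand": 0.07,
}

# Sort descending by gain to get the importance ranking.
ranking = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
for name, gain in ranking:
    print(f"{name:>16}: {gain:.2f}")
```

For per-prediction attributions rather than a global ranking, SHAP's tree explainer can be applied to the fitted booster, as noted above.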
Training Process
- Receive training data (features + target prices) for a specific horizon group
- Split into 5-fold TimeSeriesSplit for cross-validation
- Train with early stopping: if validation loss doesn’t improve for 20 rounds, stop
- Record per-fold metrics (MAE, RMSE)
- Collect out-of-fold residuals for conformal calibration
- Save trained model + metadata as joblib artifact
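The fold structure in step 2 can be sketched as an expanding-window splitter — a simplified stand-in for scikit-learn's TimeSeriesSplit (ignoring gaps and uneven fold sizes), shown only to make the no-leakage property concrete:

```python
def time_series_splits(n_samples, n_splits=5):
    """Yield (train_idx, val_idx) pairs where each validation fold
    strictly follows its training window in time, so no future data
    leaks into training."""
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train = list(range(0, fold * k))           # expanding window
        val = list(range(fold * k, fold * (k + 1)))  # next block in time
        yield train, val
```

Each successive fold trains on a longer history and validates on the block that follows it; the out-of-fold predictions on those validation blocks are what feed the conformal calibration in step 5.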