
LightGBM

Overview

LightGBM (Light Gradient Boosting Machine) was the second model in the v4.3 EPF ensemble. Developed by Microsoft Research, it uses a leaf-wise tree growth strategy and histogram-based binning that differs from the other ensemble members, contributing algorithmic diversity to the combined forecast.

Hyperparameters

| Parameter | Value | Purpose |
|---|---|---|
| objective | quantile | Quantile loss function |
| alpha | 0.55 | Quantile target (55th percentile) |
| n_estimators | 500 | Maximum boosting iterations |
| max_depth | 8 | Maximum tree depth |
| learning_rate | 0.05 | Shrinkage per iteration |
| min_child_samples | 20 | Minimum samples per leaf |
| reg_lambda | 0.1 | L2 regularization on leaf values |
| num_leaves | 63 | Maximum leaves per tree |
| random_state | 42 | Reproducibility |
| device | gpu / cpu | Hardware acceleration |
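As a sketch only, the table above maps directly onto a parameter dict for LightGBM's scikit-learn wrapper (`lightgbm.LGBMRegressor(**LGBM_PARAMS)`); the surrounding pipeline code is not shown in this document, and the `pinball_loss` helper below is an illustrative reimplementation of the quantile objective, not the library's internal one:

```python
# Hyperparameters from the table above. In the scikit-learn API these
# would be passed as lightgbm.LGBMRegressor(**LGBM_PARAMS).
LGBM_PARAMS = {
    "objective": "quantile",   # pinball (quantile) loss
    "alpha": 0.55,             # target the 55th percentile
    "n_estimators": 500,       # maximum boosting iterations
    "max_depth": 8,
    "learning_rate": 0.05,
    "min_child_samples": 20,
    "reg_lambda": 0.1,         # L2 on leaf values
    "num_leaves": 63,
    "random_state": 42,
}

def pinball_loss(y_true, y_pred, alpha=0.55):
    """Quantile (pinball) loss: under-predictions are weighted by alpha,
    over-predictions by (1 - alpha). With alpha = 0.55 the fit is pulled
    slightly above the median."""
    total = 0.0
    for y, yhat in zip(y_true, y_pred):
        r = y - yhat
        total += alpha * r if r >= 0 else (alpha - 1.0) * r
    return total / len(y_true)
```

For example, missing the true price by 2 EUR/MWh from below costs 0.55 × 2 = 1.1, while missing it by 2 from above costs only 0.45 × 2 = 0.9, which is what makes the fitted curve the 55th percentile rather than the median.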

Leaf-Wise Growth

LightGBM’s key algorithmic difference is leaf-wise tree construction:

Level-wise (HistGBM, XGBoost):

               [root]
              /      \
           [L1]      [R1]        ← grow entire level
           /  \      /  \
        [L2] [R2] [L3] [R3]      ← grow entire level

Leaf-wise (LightGBM):

               [root]
              /      \
           [L1]      [R1]        ← grow best leaf only
                     /  \
                  [L2]  [R2]     ← grow best leaf only
                          \
                         [R2b]   ← grow best leaf only

Leaf-wise growth always splits the leaf with the highest loss reduction, regardless of depth. This produces more accurate trees with the same number of leaves but can overfit on small datasets.

The num_leaves=63 parameter is the primary complexity control: fewer leaves yield simpler trees and stronger regularization.
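A quick calculation shows why num_leaves, not max_depth, is the binding constraint here. LightGBM's documentation recommends keeping num_leaves below 2^max_depth, and this configuration does so with plenty of margin:

```python
import math

max_depth = 8
num_leaves = 63

# A depth-8 tree grown level-wise could hold up to 2^8 leaves.
level_wise_cap = 2 ** max_depth           # 256

# The leaf budget (63) binds long before the depth cap (256) does.
# A balanced tree with 63 leaves is only about log2(63) ~ 6 levels deep,
# so most trees never reach depth 8.
equivalent_depth = math.log2(num_leaves)  # ~ 5.98
```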

GPU Acceleration

LightGBM supports native GPU training:

{"device": "gpu" if gpu_available else "cpu"}

GPU acceleration provides 3–5× speedup for training, which is particularly valuable during:

  • Hyperparameter tuning (hundreds of trial combinations)
  • Backtesting with daily retraining (months of daily model fits)
  • 15-minute resolution training (4× more data than hourly)
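The one-line config above presupposes a `gpu_available` flag. How that flag is set is not specified in this document; one common approach, sketched below under that assumption, is to attempt a tiny GPU training run at startup and fall back to CPU on any failure:

```python
def select_device():
    """Return 'gpu' if a GPU-enabled LightGBM build is usable, else 'cpu'.

    Detection strategy (an assumption, not the pipeline's documented
    method): try a one-round training run on a tiny dataset with
    device='gpu' and treat any failure -- missing GPU build, missing
    driver, or missing lightgbm entirely -- as 'use CPU'.
    """
    try:
        import lightgbm as lgb
        lgb.train(
            {"device": "gpu", "objective": "regression", "verbosity": -1},
            lgb.Dataset([[1.0], [2.0]], label=[0.0, 1.0]),
            num_boost_round=1,
        )
        return "gpu"
    except Exception:
        return "cpu"

device_param = {"device": select_device()}
```

Probing once at startup and caching the result avoids paying the failed-GPU-initialization cost on every model fit during backtesting.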

Ensemble Diversity Contribution

The value of LightGBM in the ensemble comes from its algorithmic differences:

| Aspect | HistGBM | LightGBM | XGBoost |
|---|---|---|---|
| Growth strategy | Level-wise | Leaf-wise | Level-wise |
| Framework | scikit-learn | Microsoft | Distributed ML |
| Binning | Histogram | Histogram | Histogram |
| Regularization | L2 on leaves | L2 + num_leaves | L1 + L2 |
| Missing values | Native | Native | Native |

Because LightGBM uses leaf-wise growth, it makes different splitting decisions than the level-wise models. This means its errors are partially uncorrelated — when LightGBM makes a mistake, the other models may not, and vice versa. Ensembling averages out these individual errors.
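A toy numerical example (illustrative values, not pipeline data) makes the error-cancellation argument concrete. Here the two models' errors are perfectly anti-correlated, which is an idealized best case; in practice partial decorrelation gives a smaller but still real reduction:

```python
# Hypothetical hourly prices and two forecasts with opposing errors.
y_true  = [50.0, 60.0, 55.0, 70.0]
model_a = [52.0, 59.0, 56.0, 68.0]   # errors: +2, -1, +1, -2
model_b = [48.0, 61.0, 54.0, 72.0]   # errors: -2, +1, -1, +2

# Simple average ensemble.
ensemble = [(a + b) / 2 for a, b in zip(model_a, model_b)]

def mae(pred):
    return sum(abs(p - y) for p, y in zip(pred, y_true)) / len(y_true)

print(mae(model_a), mae(model_b), mae(ensemble))  # 1.5 1.5 0.0
```

Each individual model has an MAE of 1.5, yet the averaged forecast is exact, because every over-prediction by one model is offset by an under-prediction from the other.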

Feature Importance

LightGBM provides feature_importances_ (gain-based) and supports SHAP analysis for interpretable feature contributions. The most important features for LightGBM typically align with the overall feature importance ranking:

  1. Price lags (24h, 168h)
  2. Demand / residual demand
  3. Temporal encoding (hour, day of week)
  4. Commodity prices (gas, carbon)
  5. Weather interactions
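The ranking above can be reproduced from a fitted model's gain importances. The feature names and values below are hypothetical placeholders for illustration; with a real fitted `LGBMRegressor`, the values would come from `model.feature_importances_` (set `importance_type="gain"` on the estimator):

```python
# Hypothetical gain importances keyed by feature name (illustrative
# values only, not measured from the actual model).
importances = {
    "price_lag_24h": 410.0,
    "price_lag_168h": 350.0,
    "residual_demand": 290.0,
    "hour_sin": 120.0,
    "gas_price": 95.0,
    "temp_x_demand": 40.0,
}

# Sort feature names by descending importance.
ranking = sorted(importances, key=importances.get, reverse=True)
```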

Training Process

  1. Receive training data (features + target prices) for a specific horizon group
  2. Split into 5-fold TimeSeriesSplit for cross-validation
  3. Train with early stopping: if validation loss doesn’t improve for 20 rounds, stop
  4. Record per-fold metrics (MAE, RMSE)
  5. Collect out-of-fold residuals for conformal calibration
  6. Save trained model + metadata as joblib artifact
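The split in step 2 can be sketched without the full pipeline. The generator below mirrors the default fold layout of scikit-learn's `TimeSeriesSplit(n_splits=5)` (each fold trains on all earlier samples and validates on the next contiguous block); the early stopping in step 3 would be LightGBM's `early_stopping(stopping_rounds=20)` callback, not shown here:

```python
def time_series_folds(n_samples, n_splits=5):
    """Expanding-window folds in the style of sklearn's TimeSeriesSplit
    defaults: validation blocks of size n_samples // (n_splits + 1),
    each preceded by all earlier samples as training data. Ordering is
    preserved, so no future prices leak into the training window."""
    test_size = n_samples // (n_splits + 1)
    for i in range(n_splits):
        train_end = n_samples - (n_splits - i) * test_size
        train_idx = list(range(train_end))
        val_idx = list(range(train_end, train_end + test_size))
        yield train_idx, val_idx

# Example: 12 samples -> 5 folds with validation blocks of 2 samples each.
folds = list(time_series_folds(12))
```

The expanding window matters for electricity prices: shuffled k-fold CV would let the model validate on hours that precede its training data, inflating the per-fold MAE/RMSE in step 4 and corrupting the out-of-fold residuals used for conformal calibration in step 5.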