Hyperparameter Tuning
Overview
Model hyperparameters (learning rate, tree depth, regularization strength, etc.) significantly affect forecast accuracy. The EPF system uses Optuna, a Bayesian optimization framework, to efficiently search the hyperparameter space for each model type.
Why Optuna?
Traditional approaches like grid search or random search are inefficient:
- Grid search: Evaluates every combination → exponential in the number of parameters
- Random search: Better coverage but still wastes evaluations on poor regions
- Bayesian optimization (Optuna): Learns which regions are promising and focuses search there
Optuna uses a Tree-structured Parzen Estimator (TPE) to build a probabilistic model of the objective function, suggesting parameter combinations that are likely to improve on the best result so far.
Objective Function
The optimization objective is the mean cross-validated MAE across 5-fold TimeSeriesSplit:
```
objective = mean(MAE_fold_1, MAE_fold_2, ..., MAE_fold_5)
```

This ensures tuned parameters generalize across different time periods rather than overfitting to a single validation window.
Search Spaces
HistGradientBoosting
| Parameter | Range | Scale |
|---|---|---|
| max_iter | 200–1500 | Linear |
| max_depth | 4–12 | Linear |
| learning_rate | 0.01–0.2 | Log |
| min_samples_leaf | 5–50 | Linear |
| l2_regularization | 0.01–10.0 | Log |
| max_bins | 128–255 | Linear |
LightGBM
| Parameter | Range | Scale |
|---|---|---|
| n_estimators | 200–1500 | Linear |
| max_depth | 4–12 | Linear |
| learning_rate | 0.01–0.2 | Log |
| min_child_samples | 5–50 | Linear |
| reg_lambda | 0.01–10.0 | Log |
| reg_alpha | 0.001–1.0 | Log |
| num_leaves | 20–200 | Linear |
| subsample | 0.6–1.0 | Linear |
| colsample_bytree | 0.6–1.0 | Linear |
XGBoost
| Parameter | Range | Scale |
|---|---|---|
| n_estimators | 200–1500 | Linear |
| max_depth | 4–12 | Linear |
| learning_rate | 0.01–0.2 | Log |
| reg_lambda | 0.01–10.0 | Log |
| reg_alpha | 0.001–1.0 | Log |
| subsample | 0.6–1.0 | Linear |
| colsample_bytree | 0.6–1.0 | Linear |
Log-Scale Parameters
Parameters like learning_rate and reg_lambda are searched on a logarithmic scale because:
- The difference between 0.01 and 0.02 is more impactful than between 0.19 and 0.20
- Log scale provides uniform coverage across orders of magnitude
- Prevents the search from spending too many trials in the high end of the range
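A small stdlib-only illustration (not from the EPF code) of the coverage argument: sampling uniformly over [0.01, 10] almost never lands in the lowest decade, while log-uniform sampling gives each decade equal mass.

```python
import random

random.seed(0)
n = 100_000
linear = [random.uniform(0.01, 10.0) for _ in range(n)]        # linear scale
log_uniform = [10 ** random.uniform(-2, 1) for _ in range(n)]  # log scale, 0.01..10

# Fraction of samples landing in the lowest decade [0.01, 0.1):
frac_linear = sum(x < 0.1 for x in linear) / n      # ≈ 0.009 (almost never)
frac_log = sum(x < 0.1 for x in log_uniform) / n    # ≈ 1/3 (one decade of three)
```

With linear sampling, small learning rates like 0.01–0.05 would be explored in under 1% of trials, even though that is often where the optimum lies.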
Pruning
Optuna supports early pruning of unpromising trials. If a trial’s first two CV folds produce an MAE much worse than the current best, the remaining folds are skipped:
```
Trial 47: fold 1 MAE = 12.5, fold 2 MAE = 11.8
Current best: 3.8 MAE
→ Pruned (no chance of beating best)
```

This significantly reduces tuning time: typically 30–50% of trials are pruned.
Tuning Workflow
1. Define search space for model type
2. Create Optuna study (minimize objective)
3. For each trial (50–200 trials):
   a. Optuna suggests a parameter combination
   b. Train model with 5-fold TimeSeriesSplit
   c. Compute mean CV MAE
   d. Report result to Optuna
   e. Optuna updates its model of the objective function
4. Extract best parameters
5. Retrain final model with best parameters on full training data
6. Save model + tuned parameters as artifact

Default vs Tuned Parameters
The EPF system ships with carefully chosen default parameters that work well across typical market conditions:
```
# Defaults (good starting point)
{"max_depth": 8, "learning_rate": 0.05, "n_estimators": 500}
```

Optuna tuning typically improves MAE by 3–8% over defaults, with the largest gains coming from:
- Learning rate + iterations: Finding the optimal trade-off between slow learning (many iterations) and fast learning (fewer iterations)
- Regularization: Matching L2 strength to the noise level in the data
- Tree complexity: Adjusting depth and leaf count to the signal-to-noise ratio
When to Retune
Hyperparameters should be retuned when:
- Market structure changes significantly (new regulations, plant closures)
- The feature set is updated (new features added, old features removed)
- Model drift persists after retraining with current parameters
- Seasonal performance differences suggest one set of parameters doesn’t fit all conditions
Regular retuning (quarterly or after major system changes) keeps parameters aligned with current data characteristics.