XGBoost

Overview

XGBoost has been the sole production gradient-boosting model for all four EPF countries (ES, PT, FR, DE) since the M0.6 Phase F cutover on 2026-04-09. Each country has its own joblib artifact per horizon group (day-ahead, strategic), trained on that country’s own price history and country-specific features, with a shared base recipe derived from the v11.0 ES winner.

No ensemble. No LSTM. The pre-2026 ensemble (HistGradientBoosting + LightGBM + XGBoost, equal-weight) was retired at the Phase F cutover; the v10.x LSTM+XGBoost hybrid was retracted on the same day. A single XGBoost model matches or beats both on the same evaluation window.

Base recipe (shared across countries)

Parameter	Value	Purpose
`objective`	`reg:quantileerror`	Quantile loss function
`quantile_alpha`	`0.55`	Predict the 55th percentile: a slight upward bias relative to the median that reduces spike under-prediction
`max_depth`	12	Deep trees to capture interaction between ~90 features
`learning_rate`	0.03	Low learning rate (strong shrinkage) paired with a high n_estimators count balances bias and variance
`min_child_weight`	5	Allows smaller leaves than the legacy ensemble (which used 20)
`reg_lambda`	0.3	L2 regularization on leaf values
`tree_method`	`hist`	Histogram-based binning
Price weighting	3× above 60 EUR/MWh	Up-weights scarcity and spike samples
Sample decay	365-day halflife	Recent data weighs more than older data
Target transform	`residual_1w` (except FR)	Predict deviation from the price one week prior at the same slot
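
The recipe above can be sketched as a parameter dict plus a per-sample weight function. This is a minimal illustration, not the production code: the parameter values come from the table, but the function name `sample_weight`, the exponential form of the decay, the strict `>` comparison at 60 EUR/MWh, and the choice to multiply the two weights together are all assumptions.

```python
from datetime import date

# Base recipe values copied from the table above.
BASE_PARAMS = {
    "objective": "reg:quantileerror",
    "quantile_alpha": 0.55,
    "max_depth": 12,
    "learning_rate": 0.03,
    "min_child_weight": 5,
    "reg_lambda": 0.3,
    "tree_method": "hist",
}

PRICE_THRESHOLD = 60.0  # EUR/MWh
PRICE_WEIGHT = 3.0      # up-weight applied above the threshold
HALFLIFE_DAYS = 365.0   # sample decay half-life


def sample_weight(price, sample_day, reference_day):
    """Hypothetical weight: exponential age decay times a price up-weight.

    Assumes the two weights combine multiplicatively, which the
    recipe table does not state.
    """
    age_days = (reference_day - sample_day).days
    decay = 0.5 ** (age_days / HALFLIFE_DAYS)
    boost = PRICE_WEIGHT if price > PRICE_THRESHOLD else 1.0
    return decay * boost
```

Under these assumptions, an 80 EUR/MWh sample that is exactly one half-life old gets weight 0.5 × 3 = 1.5, so an old spike still counts for more than a fresh sample at an ordinary price.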

Per-country hyperparameters

The base recipe is shared; what varies per country is the feature set (notably cross-price gating) and the target transform:

Country	Target transform	Cross-country prices	DA MAE (145-day)	Notes
ES	`residual_1w`	Off (Z3 ablation wins)	13.99 (v12.0-abl)	ES sets the Iberian price, so neighbour prices are noise
PT	`residual_1w`	DA off, ST on	DA 21.94 / ST 24.67	The only country where cross-prices help the strategic horizon
FR	None (raw EUR)	Off	24.52 (v6.0-abl)	Nuclear baseload means stable weekly patterns — residual_1w adds noise
DE	`residual_1w`	On	27.64 (v6.0)	Only country with a measurable lift from cross-prices

Cross-price gating is controlled at training/inference time by the EPF_CROSS_PRICE_COUNTRIES env var. Production: DE only. See cross-price gating for the ablation analysis.
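
A gate of this kind might look like the sketch below. The format of the variable is not documented here, so the comma-separated list of country codes is an assumption, as is the helper name `cross_price_enabled`.

```python
import os


def cross_price_enabled(country, env=None):
    """Return True if cross-country price features should be injected.

    Assumes EPF_CROSS_PRICE_COUNTRIES holds a comma-separated list of
    country codes (for example "DE"); the real format may differ.
    """
    env = os.environ if env is None else env
    raw = env.get("EPF_CROSS_PRICE_COUNTRIES", "")
    enabled = {code.strip().upper() for code in raw.split(",") if code.strip()}
    return country.upper() in enabled
```

Passing the environment as an argument keeps the check testable without mutating `os.environ`. An unset or empty variable disables cross-prices for every country.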

Feature set (post-v6.0)

~90 tabular features per country, grouped:

Price lags / rolling stats — 24h, 48h, 168h, with rolling means and z-scores
Temporal / calendar — hour-of-day sin/cos, day-of-week, month, country-aware holiday indicators (Z1)
Demand and generation — hourly generation mix, generation-forecast targets for wind and solar (Z2)
Renewable mix ratios and interconnection flows (where available)
Weather — temperature, wind speed, ghi/dni/clear-sky, precipitation from population-weighted stations per country
Solar elevation at country-specific latitude (Z4)
Commodities — TTF gas, Brent, ETS carbon with dynamics (change, z-score, rolling)
Cross-country prices (Z3) — only injected when EPF_CROSS_PRICE_COUNTRIES includes the country
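
Three of the feature families above (price lags, rolling z-scores, cyclical hour encoding) can be illustrated in a few lines. This is a sketch only: the helper names are hypothetical, the 15-minute slot size is inferred from the `15min` token in the artifact filenames, and the production feature code may compute these differently.

```python
import math
from statistics import mean, pstdev

SLOTS_PER_HOUR = 4  # assumes 15-minute resolution


def lag_value(series, index, hours, slots_per_hour=SLOTS_PER_HOUR):
    """Price `hours` before slot `index`, or None if history is too short."""
    source = index - hours * slots_per_hour
    return series[source] if source >= 0 else None


def rolling_zscore(history, window):
    """z-score of the latest value against the trailing `window` values."""
    recent = history[-window:]
    if len(recent) < window:
        raise ValueError("not enough history for the requested window")
    spread = pstdev(recent)
    if spread == 0:
        return 0.0  # flat window: no deviation to report
    return (recent[-1] - mean(recent)) / spread


def hour_sin_cos(hour):
    """Cyclical encoding so that hour 23 and hour 0 sit next to each other."""
    angle = 2.0 * math.pi * hour / 24.0
    return math.sin(angle), math.cos(angle)
```

The sin/cos pair avoids the artificial jump between 23:00 and 00:00 that a raw hour integer would introduce.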

Training Process

Receive country + horizon-group slice of training data
Apply target transform (residual_1w for ES/PT/DE, none for FR)
Walk-forward 145-day backtest window, 2025-11-01 → 2026-03-25
Train with quantile loss objective (q=0.55), price weighting, sample decay
Record per-fold metrics and out-of-fold residuals for conformal calibration
Save model + metadata + feature list + transform as joblib artifact per country / horizon / date
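
The target-transform step can be sketched as follows. The per-country mapping comes from this page; the function names, the use of `None` for slots without a full week of history, and the 15-minute slot count are assumptions for illustration.

```python
# Per-country target transform, as listed on this page (FR uses raw EUR).
TARGET_TRANSFORM = {
    "ES": "residual_1w",
    "PT": "residual_1w",
    "FR": None,
    "DE": "residual_1w",
}

SLOTS_PER_WEEK = 7 * 24 * 4  # assumes 15-minute slots


def apply_transform(prices, country, slots_per_week=SLOTS_PER_WEEK):
    """Turn raw prices into training targets for the given country."""
    if TARGET_TRANSFORM[country] is None:
        return list(prices)
    # residual_1w: deviation from the price one week earlier at the same slot.
    return [
        None if t < slots_per_week else prices[t] - prices[t - slots_per_week]
        for t in range(len(prices))
    ]


def invert_transform(prediction, price_week_ago, country):
    """Map a model output back to a price in EUR/MWh."""
    if TARGET_TRANSFORM[country] is None:
        return prediction
    return prediction + price_week_ago
```

Because the transform is saved alongside the model, inference can apply the matching inverse without consulting country-specific logic elsewhere.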

Joblib filename convention: `direct_model_xgboost_<COUNTRY>_15min_hybrid_<dayahead|strategic>_<YYYY-MM-DD>.joblib` (ES omits the `<COUNTRY>_` segment for legacy reasons). Artifacts live in `data/models/` locally and under `gs://epf-models-epriceforecaster/<COUNTRY>/<VERSION>/` for Cloud Run.
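
The naming convention, including the ES exception, can be captured in a small helper. The function name `model_filename` and the parameter `artifact_date` are hypothetical; only the filename pattern itself comes from this page.

```python
from datetime import date

HORIZON_GROUPS = ("dayahead", "strategic")


def model_filename(country, horizon_group, artifact_date):
    """Build the joblib filename for a country / horizon group / date."""
    if horizon_group not in HORIZON_GROUPS:
        raise ValueError(f"unknown horizon group: {horizon_group!r}")
    # ES omits the country segment for legacy reasons.
    country_part = "" if country == "ES" else f"{country}_"
    return (
        f"direct_model_xgboost_{country_part}15min_hybrid_"
        f"{horizon_group}_{artifact_date.isoformat()}.joblib"
    )
```

For example, `model_filename("DE", "dayahead", date(2026, 4, 9))` yields `direct_model_xgboost_DE_15min_hybrid_dayahead_2026-04-09.joblib`.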

GPU Acceleration

XGBoost supports CUDA-based GPU training, selected through the `device` parameter (available in XGBoost 2.0 and later):

{"device": "cuda" if gpu_available else "cpu"}

Typical speedup is 3–5× for the training workload. GPU is optional — production retrains run on CPU.

Feature Importance

Post-training analyses use:

Gain — total improvement in loss from each feature’s splits (default plot)
SHAP values — game-theoretic feature attribution (most accurate, used for Price Drivers UI)
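
As an illustration of the gain view, the sketch below ranks features by their share of total gain. It assumes the raw scores come from something like XGBoost's `Booster.get_score(importance_type="total_gain")`, which returns a dict of feature name to summed gain; whether the project's analysis uses that call is an assumption, and `gain_shares` is a hypothetical helper.

```python
def gain_shares(total_gain, top_k=10):
    """Rank features by share of total gain.

    `total_gain` maps feature name -> summed loss improvement over all
    splits on that feature. Returns up to `top_k` (name, share) pairs,
    largest first.
    """
    total = sum(total_gain.values())
    if total <= 0:
        return []
    ranked = sorted(total_gain.items(), key=lambda item: item[1], reverse=True)
    return [(name, gain / total) for name, gain in ranked[:top_k]]
```

Normalizing to shares makes the ranking comparable across countries and retrains, where absolute gain values differ with the size and scale of the training data.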

Why single XGBoost instead of ensemble

The v4.3-era ensemble combined HistGBM, LightGBM, and XGBoost with equal weights. When the v8 scout configuration (single XGBoost with residual_1w, 3× price weighting (pw3), and 365-day sample decay (d365)) was evaluated on the same window, it matched or beat the full ensemble on every metric: the other two estimators added no diversity beyond what XGBoost alone captures. The Phase F cutover retired HistGBM and LightGBM from production.

See the ensemble strategy page for the historical ensemble and retracted LSTM hybrid, and the v11.0 changelog for the retraction narrative.