v5.1 — Crisis Data Weighting (Rejected)
Date: March 17, 2026 | Status: Rejected — bias flipped positive (+11.92), overcorrected
Hypothesis
The v4.3 model has a consistent ~19–20% multiplicative underestimation for prices above 40 EUR. Two contributing factors:
- Crisis price dominance: The 2022 energy crisis produced prices 3–5x higher than current levels. These extreme samples distort the loss landscape: the model learns to hedge against 300+ EUR errors at the cost of systematic underprediction in the 40–150 EUR range where prices actually live today.
- Stale training data: Samples from 2022–2023 have equal weight to recent data, despite representing a fundamentally different market regime (war-driven gas prices, emergency capacity mechanisms).
What Changed
Winsorization (Cap at 200 EUR)
Training target prices are capped at 200 EUR/MWh before model fitting. This prevents extreme crisis prices from dominating the loss gradient.
- Affects 5.5–6.9% of training samples depending on horizon group
- Applied only to the target variable — features (including price lags) retain original values
- 200 EUR threshold chosen because: current price range is 20–160 EUR, and 200 EUR captures the full non-crisis distribution with margin
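As an illustration of the target-only cap described above, here is a minimal sketch in plain Python. The function name `winsorize_target` and the sample prices are hypothetical and not taken from the project code; only the 200 EUR/MWh cap comes from this experiment.

```python
def winsorize_target(prices, cap=200.0):
    """Cap target prices at `cap` EUR/MWh.

    Returns the capped targets and the share of samples that were affected.
    Only the upper tail is capped; low prices are left unchanged.
    """
    capped = [min(p, cap) for p in prices]
    share_capped = sum(p > cap for p in prices) / len(prices)
    return capped, share_capped


# Hypothetical target prices (EUR/MWh), including two crisis-era spikes.
y_train = [35.0, 82.5, 140.0, 310.0, 455.0, 60.0]

y_capped, share = winsorize_target(y_train)
print(y_capped)          # [35.0, 82.5, 140.0, 200.0, 200.0, 60.0]
print(f"{share:.1%}")    # 33.3%
```

Because the cap is applied to the target only, any lag features would still be built from the uncapped `y_train`, so the model can still see a 455 EUR lag as an input while being trained toward a 200 EUR target.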
Time-Decay Sample Weighting
Exponential decay weighting with a 365-day halflife:
weight = exp(-ln(2) × days_ago / 365)
- Most recent data: weight = 1.0
- 1 year ago: weight = 0.5
- 2 years ago: weight = 0.25
- Oldest data (~4 years): weight ≈ 0.058
This makes the model prioritize learning from recent market conditions while still using historical data for structural patterns.
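The weighting formula can be sketched as follows. The function name `time_decay_weight` and the choice of the experiment date as the reference date are assumptions for illustration; the actual training cutoff is not stated here.

```python
import math
from datetime import date


def time_decay_weight(sample_date, reference_date, halflife_days=365.0):
    """Exponential decay weight: halves every `halflife_days` days."""
    days_ago = (reference_date - sample_date).days
    return math.exp(-math.log(2) * days_ago / halflife_days)


# Assumed reference date (the experiment date); the real cutoff may differ.
reference = date(2026, 3, 17)

w_now = time_decay_weight(reference, reference)
w_1y = time_decay_weight(date(2025, 3, 17), reference)   # 365 days ago
w_2y = time_decay_weight(date(2024, 3, 17), reference)   # 730 days ago

print(round(w_now, 3), round(w_1y, 3), round(w_2y, 3))   # 1.0 0.5 0.25
```

Weights like these would typically be passed through the `sample_weight` argument of a scikit-learn style `fit(X, y, sample_weight=...)` call, which the three ensemble libraries listed in the configuration accept in their regressor interfaces.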
Configuration
| Parameter | Value |
|---|---|
| Loss function | Quantile 0.55 |
| Ensemble | 3 models (HistGBT, LightGBM, XGBoost) |
| Features | 53–62 per horizon (feature selection applied) |
| Feature selection | Yes |
| Peak-split | No |
| Winsorize cap | 200 EUR |
| Sample weight halflife | 365 days |
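For readers unfamiliar with the quantile loss in the table, the following sketch shows the standard pinball loss at q = 0.55. It is a generic textbook definition rather than the project's implementation, and the function name `pinball_loss` is hypothetical.

```python
def pinball_loss(y_true, y_pred, q=0.55):
    """Mean pinball (quantile) loss for quantile level `q`.

    Underpredictions (actual > predicted) cost q per EUR of error,
    overpredictions cost (1 - q) per EUR of error.
    """
    total = 0.0
    for actual, pred in zip(y_true, y_pred):
        diff = actual - pred
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# The same 10 EUR miss costs more when it is an underprediction.
under = pinball_loss([100.0], [90.0])    # about 5.5
over = pinball_loss([100.0], [110.0])    # about 4.5
print(under > over)                      # True
```

With q slightly above 0.5, the loss is asymmetric in a way that nudges predictions upward, so this setting already pushes against underprediction before any reweighting is applied.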
Day-Ahead Results
152-day backtest (Oct 2025 – Mar 2026), ensemble:
| Metric | v4.3 | v5.1 | Change |
|---|---|---|---|
| MAE | 16.17 | 15.34 | -5.1% |
| MAPE | 51.3% | 45.3% | -11.7% |
| Bias | — | +11.92 | Shifted positive |
Note: v4.3 baseline is 16.17 on this 152-day window (not the 14.47 from the shorter 149-day window used in v4.3’s original evaluation).
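The metrics in the table can be computed along these lines. This is a sketch under two assumptions: bias is defined as mean(predicted - actual), which matches the sign convention used here (positive means overpredicting), and all actual prices are non-zero so that MAPE is defined. The function names and the toy data are hypothetical.

```python
def backtest_metrics(y_true, y_pred):
    """MAE, MAPE (in percent) and bias, with bias = mean(predicted - actual)."""
    n = len(y_true)
    errors = [pred - actual for actual, pred in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, y_true)) / n
    bias = sum(errors) / n
    return {"mae": mae, "mape": mape, "bias": bias}


def relative_change(old, new):
    """Relative change in percent, as used in the Change column."""
    return 100.0 * (new - old) / old


# Toy data: errors are +10, -10, +20 EUR/MWh, so the bias is positive.
metrics = backtest_metrics([50.0, 100.0, 80.0], [60.0, 90.0, 100.0])
print(metrics["bias"] > 0)                         # True

# Reproducing the Change column from the reported values.
print(round(relative_change(16.17, 15.34), 1))     # -5.1
print(round(relative_change(51.3, 45.3), 1))       # -11.7
```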
Bias Shift
The winsorization + time-decay combination flipped the bias from negative (underpredicting) to positive (+11.92 EUR/MWh, overpredicting), an overcorrection rather than a fix. The proportional underestimation pattern (Cov-e) improved but was not eliminated (0.736 vs. ~0.6 previously).
Strategic Results
Strategic model training was not completed — the day-ahead bias flip was severe enough (+11.92 EUR/MWh) to conclude the approach was not viable without further investigation. The experiment was rejected before extending to strategic horizons.
What This Tells Us
The experiment confirmed that the model’s underprediction is partly driven by stale 2022-crisis data dominating the training signal: MAE improved 5.1%. However, combining winsorization with time-decay flipped the bias from about -12 EUR/MWh to +11.92 EUR/MWh, an overcorrection of nearly the same magnitude in the opposite direction. This suggests:
- Bias corrections are sensitive to calibration — small changes in the decay rate or the price cap threshold produce large swings in bias direction, making the approach fragile to tune
- The two interventions interact — applying winsorization and time-decay simultaneously makes it impossible to isolate which factor drove the overcorrection, or how to correct it without a more systematic ablation
- Training data reweighting treats a symptom, not the cause — the structural underprediction was ultimately solved through an architectural change (LSTM embedding in v10.1) that increased the model’s predicted price range rather than rebalancing which historical periods it learns from
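A more systematic ablation of the two interventions could start from a grid like the one below, which was not run as part of this experiment. The function name `ablation_grid` and the config keys are hypothetical; `None` stands for "intervention disabled".

```python
from itertools import product


def ablation_grid(caps=(None, 200.0), halflives=(None, 365.0)):
    """Enumerate every combination of winsorize cap and decay halflife."""
    return [
        {"winsorize_cap": cap, "halflife_days": halflife}
        for cap, halflife in product(caps, halflives)
    ]


# 2 x 2 grid: baseline, decay only, winsorize only, both (the v5.1 setup).
configs = ablation_grid()
for cfg in configs:
    print(cfg)
```

Running the same backtest once per configuration would separate the effect of each intervention from their interaction. Adding more values to `caps` or `halflives` would show how sharply the bias reacts to each setting.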