v5.1 — Crisis Data Weighting (Rejected)

Date: March 17, 2026 | Status: Rejected — bias flipped positive (+11.92), overcorrected

Hypothesis

The v4.3 model consistently underestimates prices above 40 EUR/MWh by a multiplicative factor of roughly 19–20%. Two factors likely contribute:

Crisis price dominance: The 2022 energy crisis produced prices 3–5x higher than current levels. These extreme samples distort the loss landscape — the model learns to hedge against 300+ EUR errors at the cost of systematic underprediction in the 40–150 EUR range where prices actually live today.
Stale training data: Samples from 2022–2023 carry the same weight as recent data, despite representing a fundamentally different market regime (war-driven gas prices, emergency capacity mechanisms).

What Changed

Winsorization (Cap at 200 EUR)

Training target prices are capped at 200 EUR/MWh before model fitting. This prevents extreme crisis prices from dominating the loss gradient.

Affects 5.5–6.9% of training samples depending on horizon group
Applied only to the target variable — features (including price lags) retain original values
The 200 EUR threshold was chosen because current prices range from 20 to 160 EUR, so 200 EUR covers the full non-crisis distribution with margin
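The capping step can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function names and the sample prices are hypothetical, and only the target list is touched, in line with the note above that features keep their original values. The cap is one-sided, so low or negative prices pass through unchanged.

```python
def winsorize_targets(prices, cap=200.0):
    """Cap target prices at `cap` EUR/MWh; values at or below the cap are unchanged."""
    return [min(p, cap) for p in prices]


def capped_share(prices, cap=200.0):
    """Fraction of samples whose target exceeds the cap."""
    return sum(p > cap for p in prices) / len(prices)


# Illustrative prices only, not real market data
raw_targets = [35.0, 82.5, 140.0, 310.0, 655.0]
capped = winsorize_targets(raw_targets)
share = capped_share(raw_targets)
```

Running `capped_share` per horizon group on the real training targets would be one way to reproduce the 5.5–6.9% figure quoted above.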

Time-Decay Sample Weighting

Exponential decay weighting with a 365-day half-life:

weight = exp(-ln(2) × days_ago / 365)

Most recent data: weight = 1.0
1 year ago: weight = 0.5
2 years ago: weight = 0.25
Oldest data (~4 years): weight ≈ 0.058

This makes the model prioritize learning from recent market conditions while still using historical data for structural patterns.
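As a quick check of the weights listed above, the formula can be evaluated directly. This is a sketch using only the standard library; the function name and the `HALF_LIFE_DAYS` constant are illustrative, not taken from the training code.

```python
import math

HALF_LIFE_DAYS = 365  # half-life from the formula above


def time_decay_weight(days_ago, half_life=HALF_LIFE_DAYS):
    """Exponential decay: the weight halves every `half_life` days."""
    return math.exp(-math.log(2) * days_ago / half_life)


# Weights for today, 1 year, 2 years, and exactly 4 years (1,460 days) ago
weights = [time_decay_weight(d) for d in (0, 365, 730, 1460)]
```

At exactly 1,460 days the weight is 0.0625. The listed value of about 0.058 for the oldest data would correspond to roughly 1,500 days under this formula. Weights like these are typically passed through the `sample_weight` argument of the regressor's `fit` method.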

Configuration

Parameter	Value
Loss function	Quantile 0.55
Ensemble	3 models (HistGBT, LightGBM, XGBoost)
Features	53–62 per horizon (feature selection applied)
Feature selection	Yes
Peak-split	No
Winsorize cap	200 EUR
Sample weight half-life	365 days
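To illustrate what the "Quantile 0.55" entry implies, here is a sketch of the standard pinball (quantile) loss. It assumes the configured loss is the usual pinball loss at q = 0.55; the function name and example values are hypothetical. With q slightly above 0.5, an underprediction costs slightly more than an overprediction of the same size, which nudges predictions upward.

```python
def pinball_loss(y_true, y_pred, q=0.55):
    """Mean pinball loss: underprediction is weighted by q, overprediction by (1 - q)."""
    total = 0.0
    for actual, pred in zip(y_true, y_pred):
        diff = actual - pred
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Same 10 EUR/MWh miss in both directions
under = pinball_loss([100.0], [90.0])   # prediction too low
over = pinball_loss([100.0], [110.0])   # prediction too high
```

Here `under` evaluates to about 5.5 and `over` to about 4.5, so the loss itself already leans against underprediction before any reweighting is applied.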

Day-Ahead Results

152-day backtest (Oct 2025 – Mar 2026), ensemble:

Metric	v4.3	v5.1	Change
MAE (EUR/MWh)	16.17	15.34	-5.1%
MAPE	51.3%	45.3%	-11.7%
Bias (EUR/MWh)	—	+11.92	Shifted positive

Note: The v4.3 baseline MAE is 16.17 on this 152-day window (not the 14.47 from the shorter 149-day window used in v4.3’s original evaluation).
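For reference, the three metrics and the "Change" column can be computed as sketched below. The definitions are the conventional ones and are an assumption about how the backtest computes them. In particular, bias is taken as mean(predicted - actual), which matches the document's reading of a positive value as overprediction. The function names are illustrative, and MAPE as written assumes no actual price is zero.

```python
def backtest_metrics(actual, predicted):
    """Return MAE, MAPE (in percent), and signed bias for paired price series."""
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    bias = sum(errors) / n  # positive means overprediction on average
    return {"mae": mae, "mape": mape, "bias": bias}


def relative_change(old, new):
    """Percentage change from `old` to `new`."""
    return 100.0 * (new - old) / old


# Toy example: one overprediction by 10, one underprediction by 5
metrics = backtest_metrics([100.0, 50.0], [110.0, 45.0])

# Reproduce the "Change" column from the reported values
mae_change = round(relative_change(16.17, 15.34), 1)
mape_change = round(relative_change(51.3, 45.3), 1)
```

The two relative changes round to -5.1 and -11.7, matching the table.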

Bias Shift

The combination of winsorization and time-decay overcorrected the bias, flipping it from negative (underpredicting) to positive (+11.92 EUR/MWh, overpredicting). The proportional underestimation pattern (Cov-e) improved but was not eliminated (0.736 versus roughly 0.6 previously).

Strategic Results

Strategic model training was not completed — the day-ahead bias flip was severe enough (+11.92 EUR/MWh) to conclude the approach was not viable without further investigation. The experiment was rejected before extending to strategic horizons.

What This Tells Us

The experiment supports the hypothesis that the model’s underprediction is partly driven by stale 2022-crisis data diluting gradient learning: MAE improved 5.1%. However, combining winsorization with time-decay flipped the sign of the bias (from -12 EUR/MWh to +11.92 EUR/MWh). This suggests:

Bias corrections are sensitive to calibration — small changes in the decay rate or the price cap threshold produce large swings in bias direction, making the approach fragile to tune
The two interventions interact — applying winsorization and time-decay simultaneously makes it impossible to isolate which factor drove the overcorrection, or how to correct it without a more systematic ablation
Training data reweighting treats a symptom, not the cause — the structural underprediction was ultimately solved through an architectural change (LSTM embedding in v10.1) that increased the model’s predicted price range rather than rebalancing which historical periods it learns from
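The "more systematic ablation" mentioned in the second point could start from a simple 2×2 grid that toggles each intervention independently. The sketch below only enumerates the configurations; the key names are hypothetical and no training code is implied.

```python
from itertools import product

CAPS = [None, 200.0]       # None = no winsorization
HALF_LIVES = [None, 365]   # None = uniform sample weights

# One configuration per combination of the two interventions
ablation_grid = [
    {"winsorize_cap": cap, "half_life_days": half_life}
    for cap, half_life in product(CAPS, HALF_LIVES)
]
```

The cell with both values set to `None` trains without either intervention, and the cell with cap 200 and half-life 365 matches the v5.1 setup. Comparing the bias of the two single-intervention cells against these would indicate which factor contributes more to the overcorrection.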