
v7.2 — Residual Target Transform (Scout)

Date: March 21, 2026 | Status: Scout — 150-day, DA-only, single model

Why This Experiment Exists

After 79 experiments across v6.1–v7.1, all tree-based models hit a hard regression slope ceiling at 0.70-0.72 — predictions capture only 70% of actual price variation due to leaf-node averaging. The v7.0 compression-breaking campaign showed that price weighting can redistribute WHERE the model allocates capacity (toward peaks) but can’t increase total capacity. The v7.1 MLP experiment confirmed compression is model-specific (NNs CAN output 200+ EUR) but sklearn’s optimizer can’t find the right weights.

The hypothesis: Instead of changing the model, change what it predicts. If the model predicts deviation = price - baseline instead of raw EUR, the target distribution becomes roughly symmetric around zero (about -60 to +80 EUR) rather than right-skewed (0 to 200+ EUR). Tree leaf averaging on a symmetric distribution shouldn't cause systematic underprediction.

What We Tested

Two baseline options, each tested standalone and combined with the proven v7.0 config:

| # | Name | Transform | Additional Config | Purpose |
|---|------|-----------|-------------------|---------|
| 1 | res-1w | residual_1w | d12, q=0.55 | Isolate 1w transform impact |
| 2 | res-4w | residual_4w | d12, q=0.55 | Compare smoother baseline |
| 3 | res-1w-pw3-d365 | residual_1w | pw-3x@60 + d365 | Combined with proven config |
| 4 | res-4w-pw3-d365 | residual_4w | pw-3x@60 + d365 | Combined with proven config |

Baseline definitions:

  • residual_1w: deviation = price - price_at_same_hour_7_days_ago — noisy but preserves short-term dynamics
  • residual_4w: deviation = price - mean(same_weekday_price_1w, 2w, 3w, 4w_ago) — smoother, more robust to one-off anomalies
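Under these definitions, and assuming an hourly price series with no gaps (column names here are hypothetical), the two targets can be sketched in pandas as:

```python
import pandas as pd

def add_residual_targets(df: pd.DataFrame, price_col: str = "price") -> pd.DataFrame:
    """Add residual_1w and residual_4w targets to an hourly price frame."""
    out = df.copy()
    hours_per_week = 24 * 7
    # residual_1w: deviation from the price at the same hour 7 days ago
    out["baseline_1w"] = out[price_col].shift(hours_per_week)
    out["residual_1w"] = out[price_col] - out["baseline_1w"]
    # residual_4w: deviation from the mean of the same weekday-hour
    # over the previous four weeks
    out["baseline_4w"] = sum(
        out[price_col].shift(k * hours_per_week) for k in range(1, 5)
    ) / 4
    out["residual_4w"] = out[price_col] - out["baseline_4w"]
    return out
```

The first week (1w) and four weeks (4w) of rows come out as NaN and would be dropped before training.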

All experiments used XGBoost depth=12, 150-day backtest window (2025-10-01 to 2026-03-17), DA-only.
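The train/predict loop then operates entirely in residual space and adds the baseline back at prediction time. A minimal sketch on synthetic data, using sklearn's gradient boosting with a pinball loss as a stand-in for the XGBoost d12, q=0.55 config (data and features here are hypothetical):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor  # stand-in for XGBoost

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                   # hypothetical features
baseline = 60 + 10 * X[:, 0]                    # e.g. price 7 days ago
residual = 20 * X[:, 1] + rng.normal(size=500)  # deviation from baseline
price = baseline + residual

# Fit on the residual target with a quantile (pinball) loss at q=0.55,
# mirroring the d12, q=0.55 setting from the experiment table.
model = GradientBoostingRegressor(loss="quantile", alpha=0.55,
                                  max_depth=12, n_estimators=50,
                                  random_state=0)
model.fit(X, residual)

# Reconstruct EUR-space predictions by adding the baseline back.
pred_price = baseline + model.predict(X)
```

The key point is that the tree only ever averages residuals, not raw prices, so its leaf-averaging bias acts on a near-symmetric distribution.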

Results

| Experiment | MAE | Bias | Slope | Eve MAE | 80-130 MAE | 130+ MAE | Spike Rec | MaxPred |
|------------|-----|------|-------|---------|------------|----------|-----------|---------|
| v7.0 best (ref) | 12.69 | -7.0 | 0.710 | — | 19.8 | — | 76.0% | 135 |
| res-1w-pw3-d365 | 12.92 | -3.56 | 0.749 | 15.82 | 14.6 | 34.2 | 79.9% | 159 |
| res-1w | 12.95 | -3.42 | 0.736 | 16.11 | 15.0 | 34.9 | 77.9% | 160 |
| res-4w | 15.13 | -0.05 | 0.690 | 17.26 | 14.5 | 30.9 | 82.6% | 185 |
| res-4w-pw3-d365 | 15.52 | -0.96 | 0.676 | 17.93 | 15.7 | 32.8 | 79.3% | 168 |

Training residual statistics (confirming the transform works)

| Group | Baseline | Mean | Std | Min | Max |
|-------|----------|------|-----|-----|-----|
| DA1 | 1w | -0.81 | 44.20 | -385.78 | +343.60 |
| DA1 | 4w | -1.87 | 39.78 | -222.40 | +397.21 |

The residual distribution is near-zero centered and roughly symmetric — exactly what the theory predicted.

What We Learned

1. The structural thesis IS confirmed

The residual transform broke through metrics that no previous experiment could move:

| Metric | v7.0 | v7.2 (res-1w-pw3) | Change | Significance |
|--------|------|-------------------|--------|--------------|
| Slope | 0.710 | 0.749 | +5.5% | First time any experiment moved the slope ceiling |
| Bias | -7.0 | -3.56 | -51% | Underprediction halved |
| 80-130 MAE | 19.8 | 14.6 | -26% | Bottleneck range massively improved |
| MaxPred | 135 | 160 | +18.5% | Prediction ceiling raised |
| Spike Rec | 76% | 80% | +5% | Better at catching spikes |

2. Overall MAE didn’t improve — the accuracy redistribution problem

The residual transform makes the model much better at high prices but slightly worse at low prices. Since 66% of data is below 80 EUR (where raw models were already good), the overall MAE is dominated by the low-price degradation. The model is now more balanced but not more accurate on average.

This suggests that the path to sub-10 MAE requires improving both high AND low price accuracy simultaneously, not just trading between them.
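One way to make this redistribution visible is to score MAE per price bucket rather than overall. A small helper, with bucket edges chosen to match the below-80 / 80-130 / 130+ ranges used above:

```python
import numpy as np

def bucketed_mae(y_true, y_pred, edges=(0, 80, 130, np.inf)):
    """MAE per price bucket, keyed by the bucket's [lo, hi) range.
    Bucket membership is decided by the ACTUAL price, so the metric
    shows where errors live on the true price axis."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    out = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (y_true >= lo) & (y_true < hi)
        if mask.any():
            out[f"{lo}-{hi}"] = float(np.abs(y_true[mask] - y_pred[mask]).mean())
    return out
```

Tracking these three numbers separately would have flagged the trade-off immediately: the 80-130 bucket improves while the below-80 bucket degrades.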

3. 1-week baseline dramatically outperforms 4-week

| Baseline | MAE | Bias | Slope |
|----------|-----|------|-------|
| 1w | 12.95 | -3.42 | 0.736 |
| 4w | 15.13 | -0.05 | 0.690 |

The 4w baseline is too smooth — it absorbs too much variation into the baseline, leaving the residuals noisier and harder to predict. However, the 4w baseline achieves near-zero bias (-0.05) and MaxPred 185, showing that smoother baselines reduce compression more aggressively. The 4w baseline may be useful in a future hybrid approach.

4. Price weighting has modest interaction with residual transform

Adding pw-3x + d365 to the residual model improved MAE from 12.95 to 12.92 and slope from 0.736 to 0.749. The gains are smaller than when applied to raw targets (where pw-3x dropped MAE from 13.20 to 12.69). This makes sense: price weighting and residual transform both address the same problem (underprediction of high prices), so their effects partially overlap.
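For reference, pw-3x@60 is interpreted here as a 3x sample weight on hours whose actual price exceeds 60 EUR (my reading of the config name, so treat the threshold and factor as assumptions):

```python
import numpy as np

def price_weights(price, threshold=60.0, factor=3.0):
    """Sample weights for a pw-3x@60-style scheme: `factor` for hours
    priced above `threshold`, 1.0 otherwise. The resulting array is
    passed to the booster via fit(..., sample_weight=w)."""
    price = np.asarray(price, dtype=float)
    return np.where(price > threshold, factor, 1.0)
```

Because the residual transform already pulls high-price hours toward the center of the target distribution, up-weighting them has less marginal effect than it does on raw EUR targets.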

Decision

Scouted — structural thesis confirmed, but not promotion-ready. The residual transform proves that tree compression CAN be reduced by changing the prediction target. However, the overall MAE (12.92) slightly exceeds v7.0 best (12.69).

Next steps

The most promising paths building on this finding:

  1. S2 ratio transform (price / baseline) — a multiplicative target may handle scale better than additive residuals
  2. Quantile target tuning — with the transform already correcting bias, q=0.50 may outperform q=0.55
  3. S4 quantile ensemble — train q10/q50/q90 separately, reconstruct conditional mean
  4. PyTorch TFT — removes the slope ceiling entirely (no tree compression)
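As a starting point for S2, a ratio target needs a guard against near-zero baselines, since day-ahead prices can sit at or below zero. The sign-preserving clamp below is one assumption about how to keep the target well-defined, not a settled design:

```python
import numpy as np

def ratio_target(price, baseline, eps=1.0):
    """S2 sketch: multiplicative target price / baseline.
    Baselines are clamped at least `eps` EUR away from zero, preserving
    sign, so the ratio stays finite; reconstruct predictions with
    price = ratio * safe_baseline."""
    price = np.asarray(price, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    safe = np.where(baseline >= 0.0,
                    np.maximum(baseline, eps),
                    np.minimum(baseline, -eps))
    return price / safe, safe
```

Whether the multiplicative form actually beats the additive residual on the 80-130 bottleneck is exactly what the S2 scout would measure.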