
v7.2 — Residual Target Transform (Scout)

Date: March 21, 2026 | Status: Scout — 150-day, DA-only, single model

Why This Experiment Exists

After 79 experiments across v6.1–v7.1, all tree-based models hit a hard regression slope ceiling at 0.70-0.72 — predictions capture only 70% of actual price variation due to leaf-node averaging. The v7.0 compression-breaking campaign showed that price weighting can redistribute WHERE the model allocates capacity (toward peaks) but can’t increase total capacity. The v7.1 MLP experiment confirmed compression is model-specific (NNs CAN output 200+ EUR) but sklearn’s optimizer can’t find the right weights.

The hypothesis: Instead of changing the model, change what it predicts. If the model predicts deviation = price - baseline instead of raw EUR, the target distribution becomes roughly symmetric around zero (about -60 to +80 EUR) rather than right-skewed (0 to 200+ EUR). Tree leaf averaging on a symmetric distribution shouldn't cause systematic underprediction.

What We Tested

Two baseline options, each tested standalone and combined with the proven v7.0 config:

| # | Name | Transform | Additional Config | Purpose |
|---|------|-----------|-------------------|---------|
| 1 | res-1w | residual_1w | d12, q=0.55 | Isolate 1w transform impact |
| 2 | res-4w | residual_4w | d12, q=0.55 | Compare smoother baseline |
| 3 | res-1w-pw3-d365 | residual_1w | pw-3x@60 + d365 | Combined with proven config |
| 4 | res-4w-pw3-d365 | residual_4w | pw-3x@60 + d365 | Combined with proven config |

Baseline definitions:

  • residual_1w: deviation = price - price_at_same_hour_7_days_ago — noisy but preserves short-term dynamics
  • residual_4w: deviation = price - mean(same_weekday_price_1w, 2w, 3w, 4w_ago) — smoother, more robust to one-off anomalies
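Under these definitions, and assuming an hourly price series with no gaps (column names here are hypothetical), the two targets can be sketched in pandas as:

```python
import pandas as pd

def add_residual_targets(df: pd.DataFrame, price_col: str = "price") -> pd.DataFrame:
    """Add residual_1w and residual_4w targets to an hourly price frame."""
    out = df.copy()
    hours_per_week = 24 * 7
    # residual_1w: deviation from the price at the same hour 7 days ago
    out["baseline_1w"] = out[price_col].shift(hours_per_week)
    out["residual_1w"] = out[price_col] - out["baseline_1w"]
    # residual_4w: deviation from the mean of the same weekday-hour
    # over the previous four weeks
    out["baseline_4w"] = sum(
        out[price_col].shift(k * hours_per_week) for k in range(1, 5)
    ) / 4
    out["residual_4w"] = out[price_col] - out["baseline_4w"]
    return out
```

The first week (1w) and four weeks (4w) of rows come out as NaN and would be dropped before training.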

All experiments used XGBoost depth=12, 150-day backtest window (2025-10-01 to 2026-03-17), DA-only.
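The train/predict loop then operates entirely in residual space and adds the baseline back at prediction time. A minimal sketch on synthetic data, using sklearn's gradient boosting with a pinball loss as a stand-in for the XGBoost d12, q=0.55 config (data and features here are hypothetical):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor  # stand-in for XGBoost

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                   # hypothetical features
baseline = 60 + 10 * X[:, 0]                    # e.g. price 7 days ago
residual = 20 * X[:, 1] + rng.normal(size=500)  # deviation from baseline
price = baseline + residual

# Fit on the residual target with a quantile (pinball) loss at q=0.55,
# mirroring the d12, q=0.55 setting from the experiment table.
model = GradientBoostingRegressor(loss="quantile", alpha=0.55,
                                  max_depth=12, n_estimators=50,
                                  random_state=0)
model.fit(X, residual)

# Reconstruct EUR-space predictions by adding the baseline back.
pred_price = baseline + model.predict(X)
```

The key point is that the tree only ever averages residuals, not raw prices, so its leaf-averaging bias acts on a near-symmetric distribution.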

Results

| Experiment | MAE | Bias | Slope | Eve MAE | 80-130 MAE | 130+ MAE | Spike Rec | MaxPred |
|------------|-----|------|-------|---------|------------|----------|-----------|---------|
| v7.0 best (ref) | 12.69 | -7.0 | 0.710 | — | 19.8 | — | 76.0% | 135 |
| res-1w-pw3-d365 | 12.92 | -3.56 | 0.749 | 15.82 | 14.6 | 34.2 | 79.9% | 159 |
| res-1w | 12.95 | -3.42 | 0.736 | 16.11 | 15.0 | 34.9 | 77.9% | 160 |
| res-4w | 15.13 | -0.05 | 0.690 | 17.26 | 14.5 | 30.9 | 82.6% | 185 |
| res-4w-pw3-d365 | 15.52 | -0.96 | 0.676 | 17.93 | 15.7 | 32.8 | 79.3% | 168 |

Training residual statistics (confirming the transform works)

| Group | Baseline | Mean | Std | Min | Max |
|-------|----------|------|-----|-----|-----|
| DA1 | 1w | -0.81 | 44.20 | -385.78 | +343.60 |
| DA1 | 4w | -1.87 | 39.78 | -222.40 | +397.21 |

The residual distribution is near-zero centered and roughly symmetric — exactly what the theory predicted.

What We Learned

1. The structural thesis IS confirmed

The residual transform broke through metrics that no previous experiment could move:

| Metric | v7.0 | v7.2 (res-1w-pw3) | Change | Significance |
|--------|------|-------------------|--------|--------------|
| Slope | 0.710 | 0.749 | +5.5% | First time any experiment moved the slope ceiling |
| Bias | -7.0 | -3.56 | -51% | Underprediction halved |
| 80-130 MAE | 19.8 | 14.6 | -26% | Bottleneck range massively improved |
| MaxPred | 135 | 160 | +18.5% | Prediction ceiling raised |
| Spike Rec | 76% | 80% | +5% | Better at catching spikes |

2. Overall MAE didn’t improve — the accuracy redistribution problem

The residual transform makes the model much better at high prices but slightly worse at low prices. Since 66% of data is below 80 EUR (where raw models were already good), the overall MAE is dominated by the low-price degradation. The model is now more balanced but not more accurate on average.

This suggests that the path to sub-10 MAE requires improving both high AND low price accuracy simultaneously, not just trading between them.
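One way to make this redistribution visible is to score MAE per price bucket rather than overall. A small helper, with bucket edges chosen to match the below-80 / 80-130 / 130+ ranges used above:

```python
import numpy as np

def bucketed_mae(y_true, y_pred, edges=(0, 80, 130, np.inf)):
    """MAE per price bucket, keyed by the bucket's [lo, hi) range.
    Bucket membership is decided by the ACTUAL price, so the metric
    shows where errors live on the true price axis."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    out = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (y_true >= lo) & (y_true < hi)
        if mask.any():
            out[f"{lo}-{hi}"] = float(np.abs(y_true[mask] - y_pred[mask]).mean())
    return out
```

Tracking these three numbers separately would have flagged the trade-off immediately: the 80-130 bucket improves while the below-80 bucket degrades.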

3. 1-week baseline dramatically outperforms 4-week

| Baseline | MAE | Bias | Slope |
|----------|-----|------|-------|
| 1w | 12.95 | -3.42 | 0.736 |
| 4w | 15.13 | -0.05 | 0.690 |

The 4w baseline is too smooth — it absorbs too much variation into the baseline, leaving the residuals noisier and harder to predict. However, the 4w baseline achieves near-zero bias (-0.05) and MaxPred 185, showing that smoother baselines reduce compression more aggressively. The 4w baseline may be useful in a future hybrid approach.

4. Price weighting has modest interaction with residual transform

Adding pw-3x + d365 to the residual model improved MAE from 12.95 to 12.92 and slope from 0.736 to 0.749. The gains are smaller than when applied to raw targets (where pw-3x dropped MAE from 13.20 to 12.69). This makes sense: price weighting and residual transform both address the same problem (underprediction of high prices), so their effects partially overlap.
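For reference, pw-3x@60 is interpreted here as a 3x sample weight on hours whose actual price exceeds 60 EUR (my reading of the config name, so treat the threshold and factor as assumptions):

```python
import numpy as np

def price_weights(price, threshold=60.0, factor=3.0):
    """Sample weights for a pw-3x@60-style scheme: `factor` for hours
    priced above `threshold`, 1.0 otherwise. The resulting array is
    passed to the booster via fit(..., sample_weight=w)."""
    price = np.asarray(price, dtype=float)
    return np.where(price > threshold, factor, 1.0)
```

Because the residual transform already pulls high-price hours toward the center of the target distribution, up-weighting them has less marginal effect than it does on raw EUR targets.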

Decision

Scouted — structural thesis confirmed, but not promotion-ready. The residual transform proves that tree compression CAN be reduced by changing the prediction target. However, the overall MAE (12.92) slightly exceeds v7.0 best (12.69).

Next steps

The most promising paths building on this finding:

  1. S2 ratio transform (price / baseline) — a multiplicative target may handle scale better than additive residuals
  2. Quantile target tuning — with the transform already correcting bias, q=0.50 may outperform q=0.55
  3. S4 quantile ensemble — train q10/q50/q90 separately, reconstruct conditional mean
  4. PyTorch TFT — removes the slope ceiling entirely (no tree compression)
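As a starting point for S2, a ratio target needs a guard against near-zero baselines, since day-ahead prices can sit at or below zero. The sign-preserving clamp below is one assumption about how to keep the target well-defined, not a settled design:

```python
import numpy as np

def ratio_target(price, baseline, eps=1.0):
    """S2 sketch: multiplicative target price / baseline.
    Baselines are clamped at least `eps` EUR away from zero, preserving
    sign, so the ratio stays finite; reconstruct predictions with
    price = ratio * safe_baseline."""
    price = np.asarray(price, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    safe = np.where(baseline >= 0.0,
                    np.maximum(baseline, eps),
                    np.minimum(baseline, -eps))
    return price / safe, safe
```

Whether the multiplicative form actually beats the additive residual on the 80-130 bottleneck is exactly what the S2 scout would measure.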