Horizon-Adaptive Calibration
Overview
Forecast uncertainty grows with the prediction horizon: a D+1 forecast carries less uncertainty than a D+7 forecast. The EPF system maintains a separate calibration distribution for each horizon bucket, so confidence interval widths adapt to the uncertainty profile of each prediction distance.
Why Horizon Matters
Consider two forecasts from the same model:
- D+1 hour 14 (28 hours ahead): conditions well constrained by recent data, weather forecast reliable. Typical error: ±3-5 EUR/MWh. 90% CI width: ~15 EUR/MWh.
- D+7 hour 14 (172 hours ahead): conditions uncertain, weather forecast degraded. Typical error: ±8-12 EUR/MWh. 90% CI width: ~35 EUR/MWh.

A single global calibration would set the same interval width for both, making it either too wide for D+1 (overconservative) or too narrow for D+7 (overconfident).
Bucket-Specific Calibration
Each horizon bucket maintains its own residual distribution:

    Conformal calibrator:
    ├── DA1 residuals: [-4.2, 1.5, -2.1, 3.8, ...]    (14-25h ahead)
    ├── DA2 residuals: [-5.1, 2.3, -3.0, 5.2, ...]    (26-37h ahead)
    ├── S1 residuals:  [-6.8, 3.1, -4.2, 7.5, ...]    (33-56h ahead)
    ├── S2 residuals:  [-8.2, 4.5, -5.1, 9.8, ...]    (57-80h ahead)
    ├── S3 residuals:  [-9.1, 5.2, -6.0, 11.3, ...]   (81-104h ahead)
    ├── S4 residuals:  [-9.8, 5.8, -6.5, 12.1, ...]   (105-128h ahead)
    └── S5 residuals:  [-11.5, 7.2, -8.0, 15.3, ...]  (129-176h ahead)

How Widths Grow
The 90% CI quantiles (5th and 95th percentile of residuals) naturally widen across buckets:
| Bucket | 5th percentile (EUR/MWh) | 95th percentile (EUR/MWh) | CI width (EUR/MWh) |
|---|---|---|---|
| DA1 | -6.5 | +8.2 | 14.7 |
| DA2 | -7.8 | +10.5 | 18.3 |
| S1 | -9.2 | +13.1 | 22.3 |
| S2 | -10.8 | +15.5 | 26.3 |
| S3 | -11.5 | +17.2 | 28.7 |
| S4 | -12.1 | +18.0 | 30.1 |
| S5 | -13.8 | +21.5 | 35.3 |
(Values are illustrative)
The monotonic widening from DA1 to S5 creates the characteristic “uncertainty funnel” visible on multi-day forecast charts.
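The per-bucket quantiles in the table can be computed directly from each bucket's residuals. The following is a minimal sketch, assuming NumPy and plain empirical quantiles; `bucket_quantiles` and `residuals_by_bucket` are illustrative names rather than part of the EPF codebase, and the residuals are synthetic.

```python
import numpy as np

def bucket_quantiles(residuals_by_bucket, coverage=0.90):
    """Return {bucket: (lower_q, upper_q, width)} from empirical residual quantiles."""
    alpha = 1.0 - coverage
    out = {}
    for bucket, residuals in residuals_by_bucket.items():
        values = np.asarray(residuals, dtype=float)
        lo, hi = np.quantile(values, [alpha / 2, 1 - alpha / 2])
        out[bucket] = (float(lo), float(hi), float(hi - lo))
    return out

# Synthetic residuals (assumption): error spread grows with the horizon
rng = np.random.default_rng(0)
residuals_by_bucket = {
    "DA1": rng.normal(0.0, 4.0, size=2000),
    "S5": rng.normal(0.0, 10.0, size=2000),
}
quantiles = bucket_quantiles(residuals_by_bucket)
```

With residuals whose spread grows with horizon, the S5 width comes out larger than the DA1 width, which is the funnel shape described above.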
Horizon-to-Bucket Mapping
Each forecast hour is mapped to its bucket:

    bucket_for_horizon = {
        14: "DA1", 15: "DA1", ..., 25: "DA1",
        26: "DA2", 27: "DA2", ..., 37: "DA2",
        33: "S1", 34: "S1", ..., 56: "S1",
        # ...
        129: "S5", 130: "S5", ..., 176: "S5",
    }

At prediction time, each hour’s forecast is paired with its bucket’s residual distribution to compute the appropriate interval width.
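To make the pairing step concrete, here is a minimal sketch of how a point forecast might be turned into an interval. It assumes residuals are defined as actual minus predicted, so the bounds are the forecast shifted by the bucket's quantiles. The quantile values are the illustrative ones from the table above, the mapping is a hypothetical four-entry excerpt, and `interval_for` is not an EPF function.

```python
# Illustrative 5th/95th residual percentiles (EUR/MWh) from the table above
BUCKET_QUANTILES = {"DA1": (-6.5, 8.2), "S5": (-13.8, 21.5)}

# Hypothetical excerpt of the horizon-to-bucket mapping
BUCKET_FOR_HORIZON = {14: "DA1", 25: "DA1", 129: "S5", 176: "S5"}

def interval_for(predicted_price, hours_ahead):
    """Return (lower, upper) bounds for one forecast hour.

    Residuals are assumed to be actual - predicted, so each bound is the
    point forecast plus the corresponding residual quantile of its bucket.
    """
    bucket = BUCKET_FOR_HORIZON[hours_ahead]
    q_lo, q_hi = BUCKET_QUANTILES[bucket]
    return predicted_price + q_lo, predicted_price + q_hi

near_lower, near_upper = interval_for(60.0, 14)    # DA1 bucket
far_lower, far_upper = interval_for(60.0, 176)     # S5 bucket
```

The same 60 EUR/MWh forecast gets a narrower band at 14 hours ahead than at 176 hours ahead, and because the quantiles are asymmetric, the band is not centered on the point forecast.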
Calibration from Live Predictions
In addition to cross-validation residuals, the calibrator can be rebuilt from live prediction data:

    def build_calibrator_from_predictions(predictions_df, min_samples=168):
        # Keep only rows where both the actual and the predicted price are known
        valid = predictions_df.dropna(subset=["actual_price", "predicted_price"])

        if len(valid) < min_samples:
            return None  # insufficient data

        # Residual sign convention: actual minus predicted
        residuals = valid["actual_price"] - valid["predicted_price"]
        hours_ahead = compute_hours_ahead(valid)

        calibrator = ConformalCalibrator()
        calibrator.fit(residuals, hours_ahead, horizon_buckets)
        return calibrator

This allows the intervals to adapt to current model accuracy rather than relying solely on historical CV residuals.
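The source does not show `compute_hours_ahead`. One plausible shape for it is sketched below, assuming pandas and two hypothetical timestamp columns, `issued_at` (when the forecast was produced) and `target_time` (the delivery hour). Both column names and the function name `compute_hours_ahead_sketch` are assumptions made for illustration.

```python
import pandas as pd

def compute_hours_ahead_sketch(df, issued_col="issued_at", target_col="target_time"):
    """Whole hours between forecast issue time and delivery hour (hypothetical columns)."""
    delta = pd.to_datetime(df[target_col]) - pd.to_datetime(df[issued_col])
    return (delta.dt.total_seconds() // 3600).astype(int)

# Toy data (assumption): one D+1 and one D+7 forecast for hour 14, issued at 10:00
predictions = pd.DataFrame({
    "issued_at": ["2024-01-01 10:00", "2024-01-01 10:00"],
    "target_time": ["2024-01-02 14:00", "2024-01-08 14:00"],
    "actual_price": [62.0, 80.0],
    "predicted_price": [60.0, 71.0],
})
hours_ahead = compute_hours_ahead_sketch(predictions)
residuals = predictions["actual_price"] - predictions["predicted_price"]
```

In this toy setup the two rows land 28 and 172 hours ahead, matching the D+1 and D+7 hour 14 examples earlier in this section.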
Minimum Sample Requirements
Reliable quantile estimation requires a minimum number of residuals per bucket:
| Confidence Level | Minimum Samples | Reason |
|---|---|---|
| 50% CI | 50+ | 25th/75th percentiles need moderate sample |
| 90% CI | 168+ | 5th/95th percentiles are tail estimates, need more data |
With fewer samples, quantile estimates are noisy and may produce intervals that are erratically wide or narrow. The system requires at least 168 residuals (one full week of hourly predictions) before generating intervals.
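The effect of sample size on tail quantiles can be checked with a small simulation. The sketch below assumes NumPy and Gaussian synthetic residuals, which is a simplification since real price residuals tend to have heavier tails. `quantile_spread` is an illustrative helper, not part of the EPF system.

```python
import numpy as np

def quantile_spread(n_samples, q=0.05, n_trials=500, seed=0):
    """Std. dev. of the empirical q-quantile across repeated draws of n_samples residuals."""
    rng = np.random.default_rng(seed)
    # Each row is one simulated calibration set of synthetic residuals
    draws = rng.normal(0.0, 5.0, size=(n_trials, n_samples))
    estimates = np.quantile(draws, q, axis=1)
    return float(estimates.std())

spread_p05_n50 = quantile_spread(50, q=0.05)     # tail quantile, small sample
spread_p05_n168 = quantile_spread(168, q=0.05)   # tail quantile, one week of hours
spread_p25_n50 = quantile_spread(50, q=0.25)     # central quantile, small sample
```

Under these assumptions the 5th percentile estimate varies more at 50 samples than at 168, and more than the 25th percentile estimate at the same 50 samples, which is consistent with the higher minimum for the 90% CI.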
Adaptive Behavior
As the system accumulates more live prediction data, the calibrator can be periodically refreshed to reflect current conditions:
- After retraining: New model may have different error characteristics → recalibrate
- After market shift: Volatility regime change → recalibrate with recent residuals
- Routine maintenance: Monthly recalibration ensures intervals stay aligned with recent accuracy
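The three triggers above could be combined into a single check. This is a minimal sketch under stated assumptions: the retraining and market-shift flags are supplied by the caller (how a regime change is detected is not covered here), "monthly" is approximated as 30 days, and `needs_recalibration` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

def needs_recalibration(last_calibrated, now, model_retrained=False,
                        regime_shift=False, max_age=timedelta(days=30)):
    """Return True if any of the three refresh conditions holds."""
    too_old = (now - last_calibrated) >= max_age  # routine maintenance
    return model_retrained or regime_shift or too_old

last = datetime(2024, 1, 1)
fresh = needs_recalibration(last, datetime(2024, 1, 10))
after_retrain = needs_recalibration(last, datetime(2024, 1, 10), model_retrained=True)
stale = needs_recalibration(last, datetime(2024, 2, 15))
```

When the check returns True, the calibrator would be rebuilt, for example with `build_calibrator_from_predictions` on recent predictions.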