Why Conformal over Parametric
The Decision
The EPF system uses split conformal prediction for uncertainty quantification rather than the usual alternatives, Gaussian prediction intervals (a parametric method) or quantile regression (a model-based method). This provides coverage guarantees without assuming any specific error distribution.
Parametric Approaches: Why They Fail
Gaussian Intervals
The simplest parametric approach assumes forecast errors follow a normal distribution:
    Interval = prediction ± z × σ
    where z = 1.645 for 90% CI, σ = standard deviation of errors

This fails for electricity prices because:
- Heavy tails: Price errors have occasional extreme outliers (demand spikes, outages) that make the tails heavier than Gaussian
- Skewness: Errors are right-skewed — large positive errors (underestimation) are more common than large negative errors
- Heteroscedasticity: Error variance changes with hour of day, day of week, and price level. Peak hours have 2–3× the variance of off-peak hours
A Gaussian interval with σ estimated globally will be too wide during stable night hours and too narrow during volatile peak hours.
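To make the failure mode concrete, here is a minimal sketch using only the Python standard library. The residual sample is synthetic (a centered log-normal, chosen only to mimic right-skewed errors) and the function name `gaussian_interval` is hypothetical; none of it comes from the EPF system.

```python
import random
import statistics

Z_90 = 1.645  # two-sided 90% normal quantile, as in the formula above


def gaussian_interval(prediction, residuals, z=Z_90):
    """Symmetric interval: prediction ± z × σ, with σ estimated from residuals."""
    sigma = statistics.stdev(residuals)
    return prediction - z * sigma, prediction + z * sigma


# Synthetic right-skewed residuals (assumption: log-normal, centered at zero).
rng = random.Random(0)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]
mean_raw = statistics.fmean(raw)
residuals = [r - mean_raw for r in raw]

lo, hi = gaussian_interval(0.0, residuals)
miss_below = sum(r < lo for r in residuals) / len(residuals)
miss_above = sum(r > hi for r in residuals) / len(residuals)

# A well-calibrated 90% interval would miss about 5% on each side.
print(f"interval: [{lo:.2f}, {hi:.2f}]")
print(f"misses below: {miss_below:.1%}, misses above: {miss_above:.1%}")
```

On a skewed sample like this one, the misses should land almost entirely above the interval, which means the lower half of the band is wider than it needs to be.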
Quantile Regression
Quantile regression trains separate models for each quantile (e.g., 5th and 95th percentile). This avoids the Gaussian assumption but has its own problems:
- Crossing quantiles: Without constraints, the 5th percentile prediction can exceed the 95th percentile
- Double training: Two additional models per interval level, per horizon group
- No coverage guarantee: Empirical coverage depends entirely on model accuracy
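For context, quantile regression models are typically fit by minimizing the pinball (quantile) loss, and crossing can be detected with a simple comparison. The sketch below shows both. The function names and the sample numbers are illustrative assumptions, not part of the EPF system.

```python
def pinball_loss(actual, predicted, q):
    """Pinball loss for quantile level q (0 < q < 1)."""
    diff = actual - predicted
    return q * diff if diff >= 0 else (q - 1) * diff


def count_crossings(lower_preds, upper_preds):
    """Number of forecasts where the lower quantile exceeds the upper one."""
    return sum(lo > hi for lo, hi in zip(lower_preds, upper_preds))


# For q = 0.95, under-predicting costs q per unit of error,
# while over-predicting costs only (1 - q) per unit.
under = pinball_loss(actual=60.0, predicted=50.0, q=0.95)
over = pinball_loss(actual=40.0, predicted=50.0, q=0.95)

# Two independently trained models (made-up outputs) can disagree on ordering.
q05_preds = [42.0, 48.5, 51.0]
q95_preds = [58.0, 47.9, 66.0]
crossings = count_crossings(q05_preds, q95_preds)  # the second forecast crosses
```

Because each quantile model is trained against its own loss, nothing ties the two outputs together, which is why crossings have to be checked for or constrained separately.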
Conformal Prediction: How It Works
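Split conformal prediction computes intervals from the empirical distribution of actual prediction errors. It requires no assumption about the shape of the error distribution, only that calibration errors and future errors are exchangeable (roughly, that they come from the same process):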
Step 1: Collect Residuals (During Training)
Cross-validation residuals are collected for each horizon bucket:
    residual = actual_price - predicted_price

Positive residuals mean the model underpredicted. Negative residuals mean it overpredicted.
Step 2: Compute Quantiles (Per Horizon Bucket)
For a 90% confidence interval:
    lower_shift = quantile(residuals, 0.05)  # 5th percentile
    upper_shift = quantile(residuals, 0.95)  # 95th percentile

Step 3: Apply at Prediction Time
    lower_bound = prediction + lower_shift
    upper_bound = prediction + upper_shift

Because the shifts come from the empirical error distribution, the intervals are automatically:
- Asymmetric: If underprediction errors are larger than overprediction errors, the upper band extends further
- Horizon-adaptive: Day 1 has tighter intervals than day 7 (separate calibration per bucket)
- Distribution-free: No assumption about error shape
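The three steps above can be sketched end to end with the Python standard library. This is a minimal illustration under stated assumptions: the function names `calibrate_shifts` and `apply_shifts` are hypothetical, the calibration data is made up, and `statistics.quantiles` with `method="inclusive"` stands in for whatever quantile routine the system actually uses.

```python
import statistics


def calibrate_shifts(actuals, predictions):
    """Steps 1-2: signed residuals, then their 5th and 95th percentiles."""
    residuals = [a - p for a, p in zip(actuals, predictions)]
    # n=20 yields cut points at 5%, 10%, ..., 95%; "inclusive" interpolates
    # linearly between the sorted sample values.
    cuts = statistics.quantiles(residuals, n=20, method="inclusive")
    return cuts[0], cuts[-1]


def apply_shifts(prediction, lower_shift, upper_shift):
    """Step 3: shift the point forecast by the calibrated residual quantiles."""
    return prediction + lower_shift, prediction + upper_shift


# Made-up calibration data for one horizon bucket (EUR/MWh).
actuals = [52.0, 61.5, 48.0, 75.0, 55.5, 50.0, 58.0, 90.0, 47.5, 53.0, 66.0]
predictions = [54.0, 58.0, 50.0, 62.0, 56.0, 52.5, 57.0, 70.0, 49.0, 54.0, 60.0]

lower_shift, upper_shift = calibrate_shifts(actuals, predictions)
lower_bound, upper_bound = apply_shifts(55.0, lower_shift, upper_shift)
print(f"shifts: {lower_shift:+.2f} / {upper_shift:+.2f}")
print(f"90% interval: [{lower_bound:.2f}, {upper_bound:.2f}]")
```

Because the made-up data contains a few large underpredictions, the upper shift comes out much larger in magnitude than the lower shift, so the interval is asymmetric without any extra logic.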
Asymmetric Intervals
This is a key advantage. Electricity price errors are not symmetric — there’s more room to be wrong on the upside (price spikes to 200 EUR/MWh) than on the downside (prices rarely go below -20 EUR/MWh). Conformal prediction naturally captures this:
Example (peak hour, day 1):

    lower_shift = -8.2 EUR/MWh   (5th percentile residual: the model overpredicts by more than 8.2 in only 5% of cases)
    upper_shift = +15.7 EUR/MWh  (95th percentile residual: the model underpredicts by more than 15.7 in only 5% of cases)

    Prediction: 55.0 EUR/MWh
    90% CI: [46.8, 70.7]  (asymmetric: 8.2 below, 15.7 above)

A Gaussian interval would have been symmetric (±12.0), overcovering on the downside and undercovering on the upside.
Horizon Buckets
The system maintains separate residual distributions for each horizon bucket, because forecast uncertainty naturally grows with horizon:
| Bucket | Horizon | Typical 90% CI Width |
|---|---|---|
| DA1 | D+1 morning | Narrowest |
| DA2 | D+1 afternoon | Slightly wider |
| S1 | D+2 | Moderate |
| S2–S4 | D+3–D+5 | Wider |
| S5 | D+6–D+7 | Widest |
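Per-bucket calibration amounts to grouping residuals by bucket before taking quantiles. The sketch below assumes a hypothetical record layout of `(bucket, actual, predicted)` tuples and uses made-up residual patterns; the bucket labels are taken from the table above, but the numbers are not real calibration data.

```python
import statistics
from collections import defaultdict


def calibrate_by_bucket(records):
    """Map each horizon bucket to its (lower_shift, upper_shift) for a 90% interval.

    `records` is an iterable of (bucket, actual, predicted) tuples.
    """
    residuals = defaultdict(list)
    for bucket, actual, predicted in records:
        residuals[bucket].append(actual - predicted)

    shifts = {}
    for bucket, values in residuals.items():
        cuts = statistics.quantiles(values, n=20, method="inclusive")
        shifts[bucket] = (cuts[0], cuts[-1])  # 5th and 95th percentiles
    return shifts


# Made-up residual patterns: errors grow with horizon.
records = []
for k in range(-10, 11):
    records.append(("DA1", 50.0 + 0.5 * k, 50.0))  # residuals in [-5, 5]
    records.append(("S5", 50.0 + 2.0 * k, 50.0))   # residuals in [-20, 20]

shifts = calibrate_by_bucket(records)
widths = {bucket: hi - lo for bucket, (lo, hi) in shifts.items()}
print(widths)
```

At prediction time, the forecast's horizon determines which entry of `shifts` is applied, so a D+1 forecast never inherits the wider errors observed at D+6 or D+7.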
Coverage Guarantee
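Conformal prediction provides a theoretical coverage guarantee. In its standard form, when calibration and test errors are exchangeable and the quantile level includes the (n + 1) finite-sample correction, the interval covers the true value with probability at least the nominal level for any calibration set size n. With plain empirical quantiles, as in Step 2, the guarantee is asymptotic: as the calibration set grows, the coverage rate converges to the nominal level. In practice, with 168+ calibration residuals per bucket, coverage rates are within 2–3 percentage points of the target.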
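As an illustration of the finite-sample variant, the sketch below picks order statistics using the (n + 1) correction and then measures coverage on held-out residuals. It is not the EPF system's code: the function names are hypothetical, the residuals are synthetic (a shifted exponential, chosen to be right-skewed), and splitting the miscoverage equally between the two tails is an assumption.

```python
import math
import random


def conformal_shifts(residuals, alpha=0.10):
    """Order-statistic shifts with the finite-sample (n + 1) correction.

    Each tail gets miscoverage alpha / 2, so the two-sided interval
    targets coverage of at least 1 - alpha under exchangeability.
    """
    ordered = sorted(residuals)
    n = len(ordered)
    k_upper = math.ceil((n + 1) * (1 - alpha / 2))  # rank of the upper shift
    k_lower = math.floor((n + 1) * alpha / 2)       # rank of the lower shift
    if k_lower < 1 or k_upper > n:
        raise ValueError("calibration set too small for this alpha")
    return ordered[k_lower - 1], ordered[k_upper - 1]


def empirical_coverage(residuals, lower_shift, upper_shift):
    """Share of held-out residuals that fall inside [lower_shift, upper_shift]."""
    inside = sum(lower_shift <= r <= upper_shift for r in residuals)
    return inside / len(residuals)


# Synthetic right-skewed residuals (assumption: shifted exponential).
rng = random.Random(1)
calibration = [rng.expovariate(1.0) - 1.0 for _ in range(168)]
held_out = [rng.expovariate(1.0) - 1.0 for _ in range(5000)]

lower_shift, upper_shift = conformal_shifts(calibration, alpha=0.10)
coverage = empirical_coverage(held_out, lower_shift, upper_shift)
print(f"empirical coverage on held-out residuals: {coverage:.1%}")
```

The coverage measured for any single calibration set fluctuates around the target, since the guarantee holds on average over calibration draws. Running the same check on real held-out residuals per bucket is one way to monitor how close the system stays to its nominal levels.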
Two Confidence Levels
The system produces two interval levels:
- 50% CI (25th–75th percentile residuals): Tight band capturing the most likely price range
- 90% CI (5th–95th percentile residuals): Wide band capturing all but extreme surprises
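Both levels can be read off the same residual sample, as in this minimal sketch. The function name `interval_shifts` and the residual values are made up for illustration, and `statistics.quantiles` is an assumed stand-in for the system's quantile routine.

```python
import statistics


def interval_shifts(residuals):
    """Return residual-quantile shifts for the 50% and 90% intervals."""
    cuts = statistics.quantiles(residuals, n=20, method="inclusive")
    # cuts[i] is the (i + 1) * 5th percentile.
    return {
        "50%": (cuts[4], cuts[14]),  # 25th and 75th percentiles
        "90%": (cuts[0], cuts[18]),  # 5th and 95th percentiles
    }


# Made-up residuals for illustration (EUR/MWh): -20, -19, ..., 20.
residuals = [float(r) for r in range(-20, 21)]
shifts = interval_shifts(residuals)

prediction = 55.0
bands = {
    level: (prediction + lo, prediction + hi)
    for level, (lo, hi) in shifts.items()
}
print(bands)
```

Since both bands come from quantiles of one sorted sample, the 50% band is always nested inside the 90% band, which avoids the crossing problem described for quantile regression.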
Trade-off Summary
| Property | Conformal | Gaussian | Quantile Regression |
|---|---|---|---|
| Distribution assumption | None | Normal | None |
| Coverage guarantee | Yes (asymptotic) | Only if Gaussian | No |
| Asymmetric intervals | Yes (natural) | No | Yes |
| Horizon adaptation | Yes (per bucket) | Requires grouping | Requires separate models |
| Additional training | None | None | 2× models per level |
| Implementation complexity | Low | Low | Medium |