Why Conformal over Parametric

The Decision

The EPF system uses split conformal prediction for uncertainty quantification rather than parametric Gaussian prediction intervals or model-based quantile regression. This provides a coverage guarantee without assuming any specific error distribution.

Parametric Approaches: Why They Fail

Gaussian Intervals

The simplest parametric approach assumes forecast errors follow a normal distribution:

Interval = prediction ± z × σ
where z = 1.645 for 90% CI, σ = standard deviation of errors

This fails for electricity prices because:

Heavy tails: Price errors have occasional extreme outliers (demand spikes, outages) that make the tails heavier than Gaussian
Skewness: Errors are right-skewed — large positive errors (underestimation) are more common than large negative errors
Heteroscedasticity: Error variance changes with hour of day, day of week, and price level. Peak hours have 2–3× the variance of off-peak hours

A Gaussian interval with σ estimated globally will be too wide during stable night hours and too narrow during volatile peak hours.
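
The effect of a single global σ can be illustrated with a small simulation. The sketch below uses synthetic errors, not EPF data, and assumes peak-hour errors have 3× the variance of night-hour errors (the upper end of the range quoted above). The names gaussian_interval and coverage are illustrative helpers.

```python
import math
import random
import statistics

Z_90 = 1.645  # z for a two-sided 90% interval, as in the formula above


def gaussian_interval(prediction, sigma, z=Z_90):
    """Symmetric interval: prediction +/- z * sigma."""
    return prediction - z * sigma, prediction + z * sigma


def coverage(errors, sigma, z=Z_90):
    """Fraction of errors that fall within +/- z * sigma."""
    half_width = z * sigma
    return sum(abs(e) <= half_width for e in errors) / len(errors)


rng = random.Random(0)  # fixed seed so the run is repeatable
night_sigma = 5.0                          # assumed, for illustration only
peak_sigma = night_sigma * math.sqrt(3.0)  # 3x the variance of night hours
night_errors = [rng.gauss(0.0, night_sigma) for _ in range(5000)]
peak_errors = [rng.gauss(0.0, peak_sigma) for _ in range(5000)]

# One sigma estimated over all hours, as a global Gaussian interval would do.
global_sigma = statistics.pstdev(night_errors + peak_errors)

night_coverage = coverage(night_errors, global_sigma)
peak_coverage = coverage(peak_errors, global_sigma)
print(f"night coverage: {night_coverage:.3f}")  # above the 0.90 target
print(f"peak coverage:  {peak_coverage:.3f}")   # below the 0.90 target
```

Under these assumed variances the night hours are overcovered and the peak hours undercovered, even though coverage pooled over all hours is roughly nominal.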

Quantile Regression

Quantile regression trains separate models for each quantile (e.g., 5th and 95th percentile). This avoids the Gaussian assumption but has its own problems:

Crossing quantiles: Without constraints, the 5th percentile prediction can exceed the 95th percentile
Double training: Two additional models per interval level, per horizon group
No coverage guarantee: Empirical coverage depends entirely on model accuracy
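
For context, quantile regression models are usually fitted by minimizing the pinball (quantile) loss, and crossing is often repaired after the fact by reordering the two predictions. The sketch below shows both. The function names and the example predictions are hypothetical, and nothing here is taken from the EPF codebase.

```python
def pinball_loss(actual, predicted, q):
    """Pinball (quantile) loss for one observation at quantile level q."""
    diff = actual - predicted
    # Underprediction is weighted by q, overprediction by (1 - q).
    return q * diff if diff >= 0 else (q - 1) * diff


def has_crossing(lower_pred, upper_pred):
    """True if the lower-quantile prediction exceeds the upper-quantile one."""
    return lower_pred > upper_pred


def repair_crossing(lower_pred, upper_pred):
    """Post-hoc fix: reorder the two predictions so lower <= upper."""
    return min(lower_pred, upper_pred), max(lower_pred, upper_pred)


# Hypothetical outputs of two independently trained quantile models.
q05_pred, q95_pred = 52.0, 49.5
if has_crossing(q05_pred, q95_pred):
    q05_pred, q95_pred = repair_crossing(q05_pred, q95_pred)
print(q05_pred, q95_pred)
```

Reordering removes the inconsistency but does not make the interval any better calibrated, which is the third problem listed above.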

Conformal Prediction: How It Works

Split conformal prediction computes intervals from the empirical distribution of actual prediction errors, requiring no distributional assumptions:

Step 1: Collect Residuals (During Training)

Out-of-fold cross-validation residuals are collected for each horizon bucket, so every residual comes from a model that did not see that observation during fitting:

residual = actual_price - predicted_price

Positive residuals mean the model underpredicted. Negative residuals mean it overpredicted.

Step 2: Compute Quantiles (Per Horizon Bucket)

For a 90% confidence interval:

lower_shift = quantile(residuals, 0.05)    # 5th percentile
upper_shift = quantile(residuals, 0.95)    # 95th percentile
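
Textbook split conformal prediction replaces the plain empirical quantile with a slightly more conservative rank, ceil((n + 1)(1 - α/2)) out of n residuals per tail, which is what makes a finite-sample guarantee hold under exchangeability. The sketch below shows that variant for signed residuals. Whether the EPF system applies this correction is not stated here, so conformal_shifts should be read as a hypothetical helper.

```python
import math


def conformal_shifts(residuals, alpha=0.10):
    """Lower/upper shifts from signed residuals with the (n + 1) correction.

    Each tail receives alpha / 2. Returns infinite shifts when the
    calibration set is too small to support the requested level.
    """
    xs = sorted(residuals)
    n = len(xs)
    k = math.ceil((n + 1) * (1 - alpha / 2))  # conservative rank, 1-based
    if k > n:
        return -math.inf, math.inf
    upper_shift = xs[k - 1]  # k-th smallest residual
    lower_shift = xs[n - k]  # k-th largest residual
    return lower_shift, upper_shift


# Toy calibration set of 168 residuals: 1, 2, ..., 168.
lower_shift, upper_shift = conformal_shifts(list(range(1, 169)))
print(lower_shift, upper_shift)
```

With 168 residuals and α = 0.10 the rank is ceil(169 × 0.95) = 161, so the upper shift is the 161st smallest residual and the lower shift is the 161st largest, slightly further out than the plain 5th and 95th percentiles.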

Step 3: Apply at Prediction Time

lower_bound = prediction + lower_shift
upper_bound = prediction + upper_shift

Because the shifts come from the empirical error distribution, the intervals are automatically:

Asymmetric: If underprediction errors are larger than overprediction errors, the upper band extends further
Horizon-adaptive: Day 1 has tighter intervals than day 7 (separate calibration per bucket)
Distribution-free: No assumption about error shape
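
The three steps above can be sketched end to end as follows. This is a minimal illustration, not the EPF implementation: calibrate, predict_interval, and the (bucket, actual, predicted) record layout are assumptions, and the quantile helper uses linear interpolation between order statistics.

```python
import math
from collections import defaultdict


def empirical_quantile(values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def calibrate(records, lower_q=0.05, upper_q=0.95):
    """Steps 1 and 2: collect residuals per bucket, then take quantiles.

    records: iterable of (bucket, actual_price, predicted_price).
    Returns {bucket: (lower_shift, upper_shift)}.
    """
    residuals = defaultdict(list)
    for bucket, actual, predicted in records:
        residuals[bucket].append(actual - predicted)
    return {
        bucket: (empirical_quantile(r, lower_q), empirical_quantile(r, upper_q))
        for bucket, r in residuals.items()
    }


def predict_interval(prediction, bucket, shifts):
    """Step 3: add the bucket's shifts to a point prediction."""
    lower_shift, upper_shift = shifts[bucket]
    return prediction + lower_shift, prediction + upper_shift


# Toy calibration data: residuals from -20 to +80 in a single bucket,
# i.e. underprediction errors are much larger than overprediction errors.
records = [("DA1", 50.0 + r, 50.0) for r in range(-20, 81)]
shifts = calibrate(records)
lower_bound, upper_bound = predict_interval(55.0, "DA1", shifts)
print(shifts["DA1"], (lower_bound, upper_bound))
```

Because the toy residuals are skewed upward, the resulting interval extends much further above the prediction than below it, with no distributional assumption involved.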

Asymmetric Intervals

Asymmetry is a key advantage. Electricity price errors are not symmetric: there is more room to be wrong on the upside (price spikes to 200 EUR/MWh) than on the downside (prices rarely go below -20 EUR/MWh). Conformal prediction captures this directly:

Example (peak hour, day 1):
  lower_shift = -8.2 EUR/MWh    (5th percentile: 95% of residuals lie above -8.2)
  upper_shift = +15.7 EUR/MWh   (95th percentile: 95% of residuals lie below +15.7)

Prediction: 55.0 EUR/MWh
  90% CI: [46.8, 70.7]          (asymmetric: 8.2 below, 15.7 above)

A Gaussian interval of similar total width would have been symmetric (about ±12.0), overcovering on the downside and undercovering on the upside.

Horizon Buckets

The system maintains separate residual distributions for each horizon bucket, because forecast uncertainty naturally grows with horizon:

Bucket	Horizon	Typical 90% CI Width
DA1	D+1 morning	Narrowest
DA2	D+1 afternoon	Slightly wider
S1	D+2	Moderate
S2–S4	D+3–D+5	Wider
S5	D+6–D+7	Widest
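
A lookup from forecast target to bucket could look like the sketch below. Two details are assumptions made for illustration and are not stated in the table: "morning" is taken to mean hours 0 to 11, and S2, S3, S4 are mapped one-to-one onto D+3, D+4, D+5. The function name horizon_bucket is hypothetical.

```python
def horizon_bucket(days_ahead, hour):
    """Map a target (days ahead, hour of day 0-23) to a bucket label."""
    if not 0 <= hour <= 23:
        raise ValueError(f"hour must be in 0..23, got {hour}")
    if days_ahead == 1:
        # Assumed split point between morning and afternoon: 12:00.
        return "DA1" if hour < 12 else "DA2"
    if days_ahead == 2:
        return "S1"
    if 3 <= days_ahead <= 5:
        # Assumed one-to-one mapping: D+3 -> S2, D+4 -> S3, D+5 -> S4.
        return f"S{days_ahead - 1}"
    if days_ahead in (6, 7):
        return "S5"
    raise ValueError(f"days_ahead must be in 1..7, got {days_ahead}")


print(horizon_bucket(1, 8), horizon_bucket(4, 18), horizon_bucket(7, 0))
```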

Coverage Guarantee

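Split conformal prediction provides a finite-sample coverage guarantee, not only an asymptotic one: if calibration and future residuals are exchangeable and each shift is taken at the conservative rank ceil((n + 1) × level) out of n calibration residuals, coverage is at least the nominal level for any n. The plain empirical quantiles in Step 2 omit that correction, and electricity price residuals are only approximately exchangeable over time, so coverage here is approximate and approaches the nominal level as the calibration set grows. In practice, with 168+ calibration residuals per bucket, coverage rates are within 2–3 percentage points of the target.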
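
Coverage figures like these are obtained by counting how often held-out actual prices fall inside their intervals. A minimal sketch, with hypothetical function names and toy numbers rather than EPF results:

```python
def empirical_coverage(actuals, lowers, uppers):
    """Fraction of actual prices that fall inside [lower, upper]."""
    inside = sum(lo <= y <= hi for y, lo, hi in zip(actuals, lowers, uppers))
    return inside / len(actuals)


def coverage_gap_pp(actuals, lowers, uppers, target=0.90):
    """Empirical coverage minus target, in percentage points."""
    return 100.0 * (empirical_coverage(actuals, lowers, uppers) - target)


# Toy evaluation set: three of the four actual prices fall inside the band.
actuals = [50.0, 60.0, 70.0, 80.0]
lowers = [45.0, 45.0, 45.0, 45.0]
uppers = [75.0, 75.0, 75.0, 75.0]
print(empirical_coverage(actuals, lowers, uppers))
print(coverage_gap_pp(actuals, lowers, uppers))
```

The check is only meaningful on data that was not used to compute the shifts; evaluating on the calibration residuals themselves reproduces the nominal level by construction.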

Two Confidence Levels

The system produces two interval levels:

50% CI (25th–75th percentile residuals): Tight band that should contain about half of realized prices
90% CI (5th–95th percentile residuals): Wide band that should miss only about 1 in 10 realized prices

Trade-off Summary

Property	Conformal	Gaussian	Quantile Regression
Distribution assumption	None	Normal	None
Coverage guarantee	Yes (asymptotic)	Only if errors are Gaussian	No
Asymmetric intervals	Yes (natural)	No	Yes
Horizon adaptation	Yes (per bucket)	Requires grouping	Requires separate models
Additional training	None	None	2 extra models per level
Implementation complexity	Low	Low	Medium