Why Conformal over Parametric

The Decision

The EPF system uses split conformal prediction for uncertainty quantification rather than parametric methods (e.g., Gaussian prediction intervals, quantile regression). This provides coverage guarantees without assuming any specific error distribution.

Parametric Approaches: Why They Fail

Gaussian Intervals

The simplest parametric approach assumes forecast errors follow a normal distribution:

Interval = prediction ± z × σ
where z = 1.645 for 90% CI, σ = standard deviation of errors

This fails for electricity prices because:

  1. Heavy tails: Price errors have occasional extreme outliers (demand spikes, outages) that make the tails heavier than Gaussian
  2. Skewness: Errors are right-skewed — large positive errors (underestimation) are more common than large negative errors
  3. Heteroscedasticity: Error variance changes with hour of day, day of week, and price level. Peak hours have 2–3× the variance of off-peak hours

A Gaussian interval with σ estimated globally will be too wide during stable night hours and too narrow during volatile peak hours.
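
The effect of a single global σ can be sketched in a few lines. This is an illustrative sketch only: the residual values are made up, and `gaussian_interval` is a hypothetical helper, not part of the EPF system.

```python
import statistics

def gaussian_interval(prediction, residuals, z=1.645):
    """Symmetric interval: prediction +/- z * sigma, with sigma estimated from residuals."""
    sigma = statistics.pstdev(residuals)
    return prediction - z * sigma, prediction + z * sigma

# Hypothetical residuals (EUR/MWh): quiet night hours vs. volatile peak hours.
night_residuals = [-1.0, 0.5, -0.5, 1.0, 0.0, -0.8, 0.8]
peak_residuals = [-6.0, 4.0, -3.0, 12.0, 1.0, -5.0, 9.0]

# One sigma for everything vs. one sigma per regime.
global_lo, global_hi = gaussian_interval(50.0, night_residuals + peak_residuals)
night_lo, night_hi = gaussian_interval(50.0, night_residuals)
peak_lo, peak_hi = gaussian_interval(50.0, peak_residuals)

global_width = global_hi - global_lo
night_width = night_hi - night_lo
peak_width = peak_hi - peak_lo
print(f"night={night_width:.1f} global={global_width:.1f} peak={peak_width:.1f}")
```

With these numbers the global width lands between the night and peak widths, so it is wider than the night data warrants and narrower than the peak data warrants. The interval also stays symmetric around the prediction regardless of how skewed the residuals are.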

Quantile Regression

Quantile regression trains separate models for each quantile (e.g., 5th and 95th percentile). This avoids the Gaussian assumption but has its own problems:

  1. Crossing quantiles: Without constraints, the 5th percentile prediction can exceed the 95th percentile
  2. Double training: Two additional models per interval level, per horizon group
  3. No coverage guarantee: Empirical coverage depends entirely on model accuracy
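
For context, quantile regression models are typically trained with the pinball (quantile) loss, and crossing can be detected with a simple check. The sketch below assumes nothing about the EPF code base; `pinball_loss` and `has_crossing` are hypothetical helper names.

```python
def pinball_loss(actual, predicted, q):
    """Pinball loss for quantile level q (0 < q < 1).

    Underprediction is penalised with weight q, overprediction with weight 1 - q.
    """
    diff = actual - predicted
    return q * diff if diff >= 0 else (q - 1) * diff

def has_crossing(lower_preds, upper_preds):
    """True if any lower-quantile prediction exceeds its upper-quantile prediction."""
    return any(lo > hi for lo, hi in zip(lower_preds, upper_preds))

# For the 95th percentile, underpredicting by 2 costs far more than overpredicting by 2.
under = pinball_loss(10.0, 8.0, 0.95)
over = pinball_loss(8.0, 10.0, 0.95)

# Two independently trained models can disagree: the second pair crosses.
crossed = has_crossing([40.0, 52.0], [60.0, 51.0])
```

Because the lower and upper models are fitted independently, nothing in the loss itself prevents the crossing case, which is why it has to be checked or repaired after the fact.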

Conformal Prediction: How It Works

Split conformal prediction computes intervals from the empirical distribution of actual prediction errors, requiring no distributional assumptions:

Step 1: Collect Residuals (During Training)

Out-of-fold cross-validation residuals are collected for each horizon bucket, so every residual comes from a prediction on data the model did not train on:

residual = actual_price - predicted_price

Positive residuals mean the model underpredicted. Negative residuals mean it overpredicted.

Step 2: Compute Quantiles (Per Horizon Bucket)

For a 90% confidence interval:

lower_shift = quantile(residuals, 0.05) # 5th percentile
upper_shift = quantile(residuals, 0.95) # 95th percentile

Step 3: Apply at Prediction Time

lower_bound = prediction + lower_shift
upper_bound = prediction + upper_shift

Because the shifts come from the empirical error distribution, the intervals are automatically:

  • Asymmetric: If underprediction errors are larger than overprediction errors, the upper band extends further
  • Horizon-adaptive: Day 1 has tighter intervals than day 7 (separate calibration per bucket)
  • Distribution-free: No assumption about error shape
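
The three steps above can be sketched as follows. This is a minimal sketch, not the EPF implementation: `empirical_quantile`, `calibrate`, and `apply_interval` are hypothetical names, the calibration data is synthetic, and linear interpolation between order statistics is an assumed quantile convention.

```python
import math

def empirical_quantile(values, q):
    """Quantile of a non-empty list for 0 <= q <= 1, using linear interpolation."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac

def calibrate(actuals, predictions, coverage=0.90):
    """Steps 1 and 2: collect residuals, then take the two tail quantiles."""
    residuals = [a - p for a, p in zip(actuals, predictions)]
    alpha = 1.0 - coverage
    return (empirical_quantile(residuals, alpha / 2),
            empirical_quantile(residuals, 1 - alpha / 2))

def apply_interval(prediction, shifts):
    """Step 3: shift a new point prediction by the calibrated quantiles."""
    lower_shift, upper_shift = shifts
    return prediction + lower_shift, prediction + upper_shift

# Synthetic calibration set: residuals are exactly -10, -9, ..., 10 EUR/MWh.
predictions = [50.0] * 21
actuals = [50.0 + r for r in range(-10, 11)]

lower_shift, upper_shift = calibrate(actuals, predictions)
lower_bound, upper_bound = apply_interval(55.0, (lower_shift, upper_shift))
print(round(lower_bound, 2), round(upper_bound, 2))
```

No model is retrained at any point: calibration is a sort and two lookups over residuals that already exist.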

Asymmetric Intervals

This is a key advantage. Electricity price errors are not symmetric — there’s more room to be wrong on the upside (price spikes to 200 EUR/MWh) than on the downside (prices rarely go below -20 EUR/MWh). Conformal prediction naturally captures this:

Example (peak hour, day 1):
lower_shift = -8.2 EUR/MWh (in 5% of calibration cases the model overpredicted by more than 8.2)
upper_shift = +15.7 EUR/MWh (in 5% of calibration cases the model underpredicted by more than 15.7)
Prediction: 55.0 EUR/MWh
90% CI: [46.8, 70.7] (asymmetric: 8.2 below, 15.7 above)

A Gaussian interval would have been symmetric (±12.0), overcovering on the downside and undercovering on the upside.
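
The arithmetic of this example can be checked directly. The sketch assumes the symmetric comparison interval has the same total width as the conformal one, which is one plausible reading of the ±12.0 figure rather than something the system documents.

```python
prediction = 55.0
lower_shift, upper_shift = -8.2, 15.7  # values from the example above

# Conformal interval: each side uses its own shift.
conformal = (prediction + lower_shift, prediction + upper_shift)

# Symmetric interval of equal total width (assumption): half of 23.9 on each side.
half_width = (upper_shift - lower_shift) / 2
symmetric = (prediction - half_width, prediction + half_width)

# How far the symmetric interval overshoots below and falls short above.
extra_below = conformal[0] - symmetric[0]
missing_above = conformal[1] - symmetric[1]
print(conformal, symmetric)
```

Under that assumption the symmetric interval reaches about 3.75 EUR/MWh too far on the downside and stops about 3.75 EUR/MWh short on the upside.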

Horizon Buckets

The system maintains separate residual distributions for each horizon bucket, because forecast uncertainty naturally grows with horizon:

Bucket | Horizon       | Typical 90% CI Width
DA1    | D+1 morning   | Narrowest
DA2    | D+1 afternoon | Slightly wider
S1     | D+2           | Moderate
S2–S4  | D+3–D+5       | Wider
S5     | D+6–D+7       | Widest
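
Per-bucket calibration amounts to grouping residuals by bucket before taking quantiles. The sketch below assumes calibration records arrive as (bucket, actual, predicted) tuples; that record layout, the function name `shifts_by_bucket`, and the synthetic data are all hypothetical.

```python
from collections import defaultdict
import statistics

def shifts_by_bucket(records):
    """Return {bucket: (lower_shift, upper_shift)} for a 90% interval."""
    residuals = defaultdict(list)
    for bucket, actual, predicted in records:
        residuals[bucket].append(actual - predicted)

    shifts = {}
    for bucket, values in residuals.items():
        # n=20 yields cut points at 5%, 10%, ..., 95%; keep the first and last.
        cuts = statistics.quantiles(values, n=20, method="inclusive")
        shifts[bucket] = (cuts[0], cuts[-1])
    return shifts

# Synthetic data: the S5 residuals are twice as spread out as the DA1 residuals.
records = [("DA1", 50.0 + r, 50.0) for r in range(-10, 11)]
records += [("S5", 50.0 + 2 * r, 50.0) for r in range(-10, 11)]

shifts = shifts_by_bucket(records)
da1_width = shifts["DA1"][1] - shifts["DA1"][0]
s5_width = shifts["S5"][1] - shifts["S5"][0]
```

Keeping the buckets separate is what lets the near-horizon intervals stay tight: pooling all residuals would inflate DA1 intervals with errors that only occur at longer horizons.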

Coverage Guarantee

Conformal prediction provides a theoretical coverage guarantee, provided the calibration residuals and future residuals are exchangeable (roughly, drawn from the same distribution): as the calibration set size approaches infinity, the coverage rate converges to the nominal level. In practice, with 168+ calibration residuals per bucket, coverage rates are within 2–3 percentage points of the target.
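
Coverage can be verified empirically on held-out data. The sketch below shows a hypothetical `empirical_coverage` check, plus the conservative order-statistic rank that standard split conformal uses in place of a plain empirical quantile. Whether the EPF system applies that finite-sample correction is not stated here, so treat `conservative_rank` as background rather than a description of the system.

```python
import math

def empirical_coverage(actuals, lowers, uppers):
    """Fraction of actual prices that fall inside their interval."""
    hits = sum(lo <= a <= hi for a, lo, hi in zip(actuals, lowers, uppers))
    return hits / len(actuals)

def conservative_rank(n, miscoverage):
    """1-based rank ceil((n + 1) * (1 - miscoverage)) among n sorted residuals.

    Using the residual at this rank as the upper shift bounds the miss rate on
    that side by `miscoverage` under exchangeability. Returns None when n is
    too small for such a rank to exist.
    """
    rank = math.ceil((n + 1) * (1 - miscoverage))
    return rank if rank <= n else None

# Toy held-out check: three of four actuals land inside their interval.
coverage = empirical_coverage(
    actuals=[50.0, 60.0, 70.0, 80.0],
    lowers=[45.0, 55.0, 72.0, 75.0],
    uppers=[55.0, 65.0, 80.0, 85.0],
)

# With 168 residuals and 5% allowed per tail (a 90% interval overall).
rank_168 = conservative_rank(168, 0.05)
rank_10 = conservative_rank(10, 0.05)  # too few residuals for a 5% tail
```

For 168 residuals the conservative rank is 161, close to the plain 95th-percentile position, which is consistent with the small gap between nominal and observed coverage at this calibration size.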

Two Confidence Levels

The system produces two interval levels:

  • 50% CI (25th–75th percentile residuals): Tight band capturing the most likely price range
  • 90% CI (5th–95th percentile residuals): Wide band that should miss only about 1 in 10 outcomes
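
Both levels can be read off the same set of residuals. The sketch below is illustrative: `interval_levels` is a hypothetical helper, the residuals are synthetic, and it only supports coverage levels whose tails fall on whole percentiles.

```python
import statistics

def interval_levels(prediction, residuals, coverages=(0.50, 0.90)):
    """Return {coverage: (lower_bound, upper_bound)} from one residual set."""
    # 99 cut points: cuts[p - 1] is the p-th percentile of the residuals.
    cuts = statistics.quantiles(residuals, n=100, method="inclusive")
    intervals = {}
    for coverage in coverages:
        lower_pct = round(100 * (1 - coverage) / 2)
        if lower_pct < 1:
            raise ValueError("coverage too high for whole-percentile lookup")
        upper_pct = 100 - lower_pct
        intervals[coverage] = (prediction + cuts[lower_pct - 1],
                               prediction + cuts[upper_pct - 1])
    return intervals

# Synthetic residuals -10, -9, ..., 10 EUR/MWh around a 55.0 EUR/MWh forecast.
residuals = [float(r) for r in range(-10, 11)]
intervals = interval_levels(55.0, residuals)
inner, outer = intervals[0.50], intervals[0.90]
```

Because both bands are quantiles of the same residual distribution, the 50% band always sits inside the 90% band, so the two levels cannot contradict each other.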

Trade-off Summary

Property                  | Conformal        | Gaussian          | Quantile Regression
Distribution assumption   | None             | Normal            | None
Coverage guarantee        | Yes (asymptotic) | Only if Gaussian  | No
Asymmetric intervals      | Yes (natural)    | No                | Yes
Horizon adaptation        | Yes (per bucket) | Requires grouping | Requires separate models
Additional training       | None             | None              | 2× models per level
Implementation complexity | Low              | Low               | Medium