Why Direct over Recursive
The Decision
The EPF system uses a direct multi-horizon forecasting strategy: separate models are trained for each horizon group, each predicting target prices directly from origin-time features. The alternative — recursive prediction, where each step’s output feeds as input to the next step — was evaluated and rejected.
Direct vs Recursive: How They Work
Recursive (Rejected)
```
Origin → Model → Predict hour 1 → Use as input → Predict hour 2 → ... → hour 168
```

A single model is trained, and predictions are made sequentially. Each prediction becomes a feature for the next time step. This mirrors the autoregressive structure of time series.
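The feedback loop above can be sketched in a few lines. This is a minimal illustration, not the EPF system's code: the one-step `predict` interface and the `MeanModel` stand-in are hypothetical.

```python
# Minimal sketch of the rejected recursive strategy. The model interface and
# MeanModel stand-in are hypothetical illustrations.

def recursive_forecast(model, history, n_steps=168, window=24):
    """Predict n_steps hours ahead, feeding each prediction back as an input."""
    series = list(history)
    predictions = []
    for _ in range(n_steps):
        features = series[-window:]       # window of observed AND predicted values
        y_hat = model.predict(features)   # one-step-ahead prediction
        predictions.append(y_hat)
        series.append(y_hat)              # the prediction becomes a future input
    return predictions

class MeanModel:
    """Toy one-step model: predicts the mean of its input window."""
    def predict(self, features):
        return sum(features) / len(features)

preds = recursive_forecast(MeanModel(), history=[50.0] * 24)
```

The key detail is the last line of the loop: `series.append(y_hat)` is exactly where a prediction error re-enters the model as if it were data.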
Direct (Chosen)
```
Origin → Model_DA1 → Predict hours 14-25 (D+1 morning)
Origin → Model_DA2 → Predict hours 26-37 (D+1 afternoon)
Origin → Model_S1  → Predict hours 33-56 (D+2)
...
Origin → Model_S5  → Predict hours 129-176 (D+6-D+7)
```

Each horizon group has its own model, trained specifically for that range. All models use only features known at origin time — no predicted values are ever used as inputs.
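The dispatch over horizon groups can be sketched as follows. The group names and hour ranges come from the text above; the `FlatModel` stand-in and the `predict(origin_features, hours)` signature are assumptions for illustration.

```python
# Minimal sketch of the chosen direct strategy. A subset of groups is shown;
# the FlatModel stand-in and predict() signature are hypothetical.

HORIZON_GROUPS = {
    "DA1": range(14, 26),    # D+1 morning
    "DA2": range(26, 38),    # D+1 afternoon
    "S1":  range(33, 57),    # D+2
    "S5":  range(129, 177),  # D+6-D+7
}

class FlatModel:
    """Toy direct model: one fixed price for every hour in its group."""
    def __init__(self, price):
        self.price = price
    def predict(self, origin_features, hours):
        return [self.price] * len(hours)

def direct_forecast(models, origin_features):
    """Each group's model maps origin-time features straight to its hours."""
    forecast = {}
    for group, hours in HORIZON_GROUPS.items():
        for hour, price in zip(hours, models[group].predict(origin_features, hours)):
            forecast[hour] = price
    return forecast

models = {group: FlatModel(42.0) for group in HORIZON_GROUPS}
forecast = direct_forecast(models, origin_features={})
```

Note that `direct_forecast` never writes a prediction back into its inputs — every model sees only `origin_features`.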
Why Direct Wins for EPF
1. No Error Propagation
The fundamental problem with recursive prediction is error compounding:
```
Hour 1 error: ±2 EUR/MWh
Hour 2 input includes hour 1's error → Hour 2 error: ±3 EUR/MWh
Hour 3 input includes hour 2's error → Hour 3 error: ±5 EUR/MWh
...
Hour 168: errors have compounded across 168 steps
```

By day 7, recursive errors can grow to the point where forecasts are no better than naive baselines. Direct prediction eliminates this entirely — each model’s error is independent.
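The compounding effect is easy to demonstrate with a toy simulation. The noise level below is an assumption for illustration, not a measured EPF error; recursive errors accumulate like a random walk, while each direct model draws one independent error.

```python
# Toy simulation of error compounding (illustrative only; STEP_NOISE is an
# assumed per-step error, not a measured value).
import random

random.seed(0)
STEP_NOISE = 2.0  # assumed per-step error, EUR/MWh

def recursive_error(n_steps):
    """Error inherited from the previous step plus fresh noise each hour."""
    err = 0.0
    for _ in range(n_steps):
        err += random.gauss(0, STEP_NOISE)
    return abs(err)

def direct_error():
    """One independent error draw per horizon; nothing carries over."""
    return abs(random.gauss(0, STEP_NOISE))

TRIALS = 2000
mean_recursive = sum(recursive_error(168) for _ in range(TRIALS)) / TRIALS
mean_direct = sum(direct_error() for _ in range(TRIALS)) / TRIALS
# Random-walk error grows roughly like sqrt(n_steps); direct error stays flat.
```

Under this model the 168-step recursive error is an order of magnitude larger than the direct error, which is the qualitative behaviour the figures above describe.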
2. Horizon-Specific Patterns
Different horizons have fundamentally different predictability:
- D+1 morning (DA1): Highly predictable from overnight conditions, demand forecasts
- D+1 afternoon (DA2): Solar generation creates volatility, but D+1 demand forecast is strong
- D+2 (S1): D+1 published prices are a powerful feature
- D+6-D+7 (S5): Weather forecast skill degrades; weekly seasonality dominates
A single recursive model must handle all these regimes. Direct models specialize, learning different feature importances for each horizon.
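This specialization can be made concrete as per-group feature sets. The feature names below are illustrative assumptions derived from the horizon characteristics listed above, not the system's actual configuration.

```python
# Hypothetical per-group feature sets; the names are assumptions based on the
# horizon characteristics described in the text.
FEATURES_BY_GROUP = {
    "DA1": ["overnight_load", "d1_demand_forecast", "hour_of_day"],
    "DA2": ["d1_solar_forecast", "d1_demand_forecast", "hour_of_day"],
    "S1":  ["d1_published_price", "d2_demand_forecast", "hour_of_day"],
    "S5":  ["day_of_week", "weekly_mean_price", "hour_of_day"],
}

def features_for(group):
    """Look up the feature list a given horizon group's model trains on."""
    return FEATURES_BY_GROUP[group]
```

A recursive model would have to express all of these regimes through one shared feature set and one set of learned weights.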
3. Parallelizable Prediction
Direct models are independent and can predict in parallel:
```
Sequential (recursive): ~168 model calls, each waiting for the previous
Parallel (direct):      7 model calls, all simultaneous
```

This is critical for production latency — the 10:00 UTC day-ahead forecast must complete quickly to be useful for trading decisions.
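Because the direct models share no state, fanning them out is straightforward. This sketch uses a thread pool; `ConstModel` is a stand-in, and the real system's model objects and serving stack are not shown here.

```python
# Sketch of parallel direct prediction with a thread pool. ConstModel is a
# hypothetical stand-in for the per-group models.
from concurrent.futures import ThreadPoolExecutor

class ConstModel:
    """Stand-in model: returns one fixed price regardless of features."""
    def __init__(self, price):
        self.price = price
    def predict(self, origin_features):
        return self.price

def predict_all(models, origin_features):
    """The 7 direct models are independent, so all calls run concurrently."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {group: pool.submit(model.predict, origin_features)
                   for group, model in models.items()}
        return {group: future.result() for group, future in futures.items()}

groups = ["DA1", "DA2", "S1", "S2", "S3", "S4", "S5"]
models = {group: ConstModel(40.0 + i) for i, group in enumerate(groups)}
forecast = predict_all(models, origin_features={})
```

The recursive alternative cannot be parallelized this way: step *n*'s input does not exist until step *n − 1* returns.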
4. Independent Retraining
If day-ahead accuracy degrades but strategic accuracy is fine, only the DA1/DA2 models need retraining. Recursive models cannot be retrained at specific horizons without affecting all downstream predictions.
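Selective retraining reduces to a filter over monitored errors. This is a sketch with hypothetical names and made-up error values; the real system's monitoring and training pipeline is not shown.

```python
# Sketch of horizon-selective retraining (names and values hypothetical).
# Only groups whose tracked error breaches the threshold are refit; the
# remaining model artifacts are left untouched.

def retrain_degraded(models, tracked_errors, threshold, train_fn):
    """Refit only the models whose monitored error exceeds the threshold."""
    retrained = []
    for group, error in tracked_errors.items():
        if error > threshold:
            models[group] = train_fn(group)
            retrained.append(group)
    return retrained

models = {"DA1": "old", "DA2": "old", "S1": "old"}
errors = {"DA1": 9.1, "DA2": 8.4, "S1": 3.2}  # EUR/MWh, made-up values
retrained = retrain_degraded(models, errors, threshold=6.0,
                             train_fn=lambda group: "new")
```

With a recursive model there is no analogue of this loop: a refit changes the one-step model, and therefore every horizon downstream of step one.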
The Trade-off
Direct prediction has real costs:
| Aspect | Direct | Recursive |
|---|---|---|
| Number of models | 7 (DA1, DA2, S1–S5) | 1 |
| Model storage | ~7× more artifacts | Minimal |
| Training time | ~7× more (but parallelizable) | Less |
| Feature engineering | Must pre-compute all features at origin | Can use rolling features |
| Boundary effects | Potential discontinuities between groups | Smooth transitions |
Boundary Discontinuities
Adjacent horizon groups may predict slightly different prices at their boundary. For example, DA2’s prediction for D+1 23:00 and S1’s prediction for D+2 00:00 are generated by different models and may not transition smoothly. In practice, this is rarely noticeable because:
- The hours are 1 hour apart in real time
- Both models see similar features at the origin
- The ensemble averaging across three model types further smooths transitions
Horizon Group Design
The groups are designed to balance specialization with training sample efficiency:
| Group | Hours | Size | Rationale |
|---|---|---|---|
| DA1 | 14–25 | 12h | D+1 morning — distinct demand profile |
| DA2 | 26–37 | 12h | D+1 afternoon — solar peak, demand peak |
| S1 | 33–56 | 24h | D+2 — D+1 prices as features |
| S2 | 57–80 | 24h | D+3 |
| S3 | 81–104 | 24h | D+4 |
| S4 | 105–128 | 24h | D+5 |
| S5 | 129–176 | 48h | D+6–D+7 — merged for sample efficiency |
S5 combines two days because forecast skill at 6–7 day horizons is similar, and merging them doubles the training samples available to the model.
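The layout in the table can be captured as a small lookup with a size helper for sanity-checking; this is a sketch, not the system's actual data structure.

```python
# The horizon-group table as a lookup (hour ranges inclusive); a sketch,
# not the EPF system's actual data structure.
GROUP_HOURS = {
    "DA1": (14, 25),
    "DA2": (26, 37),
    "S1": (33, 56),
    "S2": (57, 80),
    "S3": (81, 104),
    "S4": (105, 128),
    "S5": (129, 176),
}

def group_size(group):
    """Number of target hours covered by a horizon group."""
    start, end = GROUP_HOURS[group]
    return end - start + 1
```

Checking `group_size` against the table's Size column is a cheap guard against the boundary errors that off-by-one hour ranges tend to cause.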