FAQ

General

What is EPF?

EPF (Electricity Price Forecasting) is a machine learning system that predicts Spanish day-ahead electricity market (OMIE) prices up to 7 days ahead. It produces two daily forecasts: a D+1 day-ahead product at ~10:00 UTC and a D+2 to D+7 strategic product at ~15:00 UTC.

What market does EPF cover?

EPF currently forecasts prices for the Spanish zone of the OMIE (Iberian electricity market) day-ahead auction. Support for Portugal and other European markets is planned.

How often are forecasts updated?

Twice daily:

~10:00 UTC: D+1 day-ahead forecast (24 hours of prices for tomorrow)
~15:00 UTC: D+2 to D+7 strategic forecast (144 hours of prices covering the six days after tomorrow)

What resolution is available?

Forecasts are available at 15-minute resolution (96 quarter-hours per day), aligned with the EU-wide MTU15 transition that took effect in the OMIE market in October 2025.
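As a rough illustration of what 15-minute resolution means for each product, the sketch below enumerates delivery slots. The helper name `delivery_slots` and the example date are hypothetical, and it treats each delivery day as a fixed 24-hour UTC day, which ignores daylight-saving transitions in the local market calendar.

```python
from datetime import datetime, timedelta, timezone

MTU_MINUTES = 15  # market time unit stated in this FAQ

def delivery_slots(first_day: datetime, n_days: int) -> list[datetime]:
    """Return the start time of every 15-minute slot over n_days (hypothetical helper)."""
    slots_per_day = 24 * 60 // MTU_MINUTES  # 96 quarter-hours per day
    step = timedelta(minutes=MTU_MINUTES)
    return [first_day + i * step for i in range(slots_per_day * n_days)]

run_day = datetime(2026, 1, 10, tzinfo=timezone.utc)  # arbitrary example date

d1 = delivery_slots(run_day + timedelta(days=1), n_days=1)         # D+1
strategic = delivery_slots(run_day + timedelta(days=2), n_days=6)  # D+2 to D+7

print(len(d1), len(strategic))  # 96 576
```

Under these assumptions the day-ahead product covers 96 slots and the strategic product 576.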

Accuracy & Methodology

How accurate are the forecasts?

Product	Model	MAE	Bias	Window
D+1 Day-Ahead	LSTM-XGBoost hybrid (v10.1)	15.73 EUR/MWh	-0.65 EUR/MWh	150-day incl. crisis
D+2–D+7 Strategic	Ensemble (v4.3)	19.79 EUR/MWh	-0.30 EUR/MWh	149-day

The v10.1 day-ahead model has near-zero bias (-0.65 EUR/MWh), roughly 18× smaller in magnitude than the previous -12 EUR/MWh (12 / 0.65 ≈ 18.5). The 150-day validation window includes the March 2026 Iran crisis (prices of 170–247 EUR/MWh); excluding it would lower MAE by an estimated 1–2 EUR/MWh.

Accuracy varies by horizon: near-term forecasts (D+1, D+2) are more accurate than longer-range predictions (D+6, D+7).
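For readers who want to reproduce the two headline numbers on their own data, here is a minimal sketch of MAE and bias. The sign convention (forecast minus actual, so a negative bias means under-forecasting) is an assumption; this FAQ does not state which convention EPF uses. The sample prices are made up.

```python
def mae(forecast, actual):
    """Mean absolute error in the same unit as the inputs (EUR/MWh here)."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)

def bias(forecast, actual):
    """Mean signed error; forecast minus actual is an assumed convention."""
    return sum(f - a for f, a in zip(forecast, actual)) / len(actual)

# Made-up example prices in EUR/MWh
actual = [60.0, 80.0, 120.0, 40.0]
forecast = [55.0, 85.0, 100.0, 45.0]

print(mae(forecast, actual))   # 8.75
print(bias(forecast, actual))  # -3.75
```

The example shows why both numbers are reported: errors of opposite sign partly cancel in the bias but not in the MAE.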

What models does EPF use?

Day-ahead (D+1): A task-aligned LSTM encoder processes 7-day price sequences into 64-dimensional temporal embeddings, augmenting XGBoost’s 90 tabular features. This v10.1 architecture provides temporal context for regime detection and spike forecasting that tree-based splits on lag columns cannot recover.

Strategic (D+2–D+7): An ensemble of three gradient boosting models:

HistGradientBoosting (scikit-learn)
LightGBM
XGBoost

All models are trained with quantile (pinball) loss at q=0.55, so they target a point slightly above the median of the price distribution. See the Ensemble Strategy page for details.
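The effect of q=0.55 is easiest to see in the loss itself. The following is a plain-Python sketch of the standard pinball loss, not EPF's training code; the function name is illustrative.

```python
def pinball_loss(actual, forecast, q=0.55):
    """Average quantile (pinball) loss for quantile level q."""
    total = 0.0
    for a, f in zip(actual, forecast):
        diff = a - f
        # Under-forecasts (diff > 0) cost q per unit, over-forecasts cost (1 - q)
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(actual)

under = pinball_loss([100.0], [90.0])   # forecast 10 EUR/MWh too low, about 5.5
over = pinball_loss([100.0], [110.0])   # forecast 10 EUR/MWh too high, about 4.5
print(under > over)  # True
```

With q=0.55, missing low is penalised slightly more than missing high, which nudges predictions upward relative to a median (q=0.5) fit.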

What data sources feed the models?

Four data sources:

Source	Data	Update Frequency
REE/ESIOS	16 electricity indicators (demand, generation, interconnections, prices)	Hourly
Open-Meteo	Weather from 5 Spanish stations (temperature, wind, solar, precipitation)	Hourly
TTF Gas	Natural gas prices (EUR/MWh) from MIBGAS	Daily
EU ETS	Carbon emission allowance prices (EUR/tCO2)	Daily

What are the 50+ engineered features?

Features are derived from the raw data and include:

Price dynamics: lags, rolling means, volatility, momentum
Temporal: cyclical hour/day/month encoding, holidays, vacation periods
Generation mix: renewable share, wind/solar share, residual demand
Weather interactions: temperature-demand, wind-generation, solar irradiance
Commodity signals: marginal cost proxy, spark spread, gas-demand interaction

See the Feature Engineering section for the full list.
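To make a few of these feature families concrete, here is a small standard-library sketch. The function names are hypothetical, and the exact definitions EPF uses (window lengths, which generation types are subtracted for residual demand) are assumptions rather than documented behaviour.

```python
import math

def cyclical_hour(hour: int) -> tuple[float, float]:
    """Encode hour of day on a circle so 23:00 and 00:00 are neighbours."""
    angle = 2 * math.pi * hour / 24
    return math.sin(angle), math.cos(angle)

def lag(series, k):
    """Value k steps earlier; None where no history exists."""
    return ([None] * k + list(series))[: len(series)]

def rolling_mean(series, window):
    """Mean of the last `window` values up to and including each step."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(series[i + 1 - window : i + 1]) / window)
    return out

def residual_demand(demand, wind, solar):
    """Demand left after wind and solar output (assumed definition)."""
    return [d - w - s for d, w, s in zip(demand, wind, solar)]

prices = [40.0, 50.0, 60.0, 70.0]
print(lag(prices, 1))           # [None, 40.0, 50.0, 60.0]
print(rolling_mean(prices, 2))  # [None, 45.0, 55.0, 65.0]
```

Leaving gaps as missing values rather than imputing them fits models that accept NaN inputs directly.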

What economic quality metrics does EPF track?

Beyond MAE and RMSE, EPF evaluates 7 metrics that measure forecast value for trading, including:

Metric	What it measures
Corr-f (Deviation)	Within-day price shape accuracy (daily mean removed)
Direction Accuracy	% of hours with correct price movement direction
Spike Recall	Ability to identify the most expensive hours
Spread Capture	% of optimal BESS arbitrage captured by following the forecast

These are visible in the Evaluation > Accuracy tab under “Economic Performance”. See Economic Quality Metrics for detailed definitions and thresholds.
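Two of these metrics can be sketched in a few lines. The definitions below are plausible readings of the table, not the official ones: direction accuracy compares the sign of hour-to-hour changes, and spike recall checks how many of the k most expensive actual hours also rank in the forecast's top k. The function names, the value of k, and the sample prices are all illustrative.

```python
def direction_accuracy(forecast, actual):
    """Percent of steps where forecast and actual move in the same direction."""
    hits = 0
    for i in range(1, len(actual)):
        f_move = forecast[i] - forecast[i - 1]
        a_move = actual[i] - actual[i - 1]
        # Same sign counts as a hit; flat versus flat also counts
        if (f_move > 0) == (a_move > 0) and (f_move < 0) == (a_move < 0):
            hits += 1
    return 100.0 * hits / (len(actual) - 1)

def top_k_hours(prices, k):
    """Indices of the k highest prices."""
    order = sorted(range(len(prices)), key=lambda i: prices[i], reverse=True)
    return set(order[:k])

def spike_recall(forecast, actual, k=3):
    """Share of the k most expensive actual hours that the forecast also ranks in its top k."""
    return len(top_k_hours(forecast, k) & top_k_hours(actual, k)) / k

# Made-up prices in EUR/MWh
actual = [50.0, 70.0, 65.0, 90.0, 120.0, 80.0]
forecast = [55.0, 60.0, 62.0, 85.0, 100.0, 95.0]

print(direction_accuracy(forecast, actual))  # 80.0
print(spike_recall(forecast, actual, k=2))   # 0.5
```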

How does the two-product system work?

The D+1 day-ahead product is generated at ~10:00 UTC, before OMIE publishes the day-ahead prices, so traders can use it for bidding. The D+2–D+7 strategic product is generated at ~15:00 UTC, after D+1 prices are known, and uses the published D+1 prices as its strongest predictive feature.

Each product has its own trained models, optimized for its specific information availability. See Two-Product System for details.

What are confidence intervals?

EPF provides 50% and 90% confidence bands around each point forecast using split conformal prediction. The 90% band means: historically, 90% of actual prices have fallen within this range for predictions at the same horizon.

The intervals are asymmetric — wider on the upside to reflect the right-skewed nature of electricity prices (bounded below, occasional high spikes).
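One simple way to obtain asymmetric bands with split conformal prediction is to take separate lower and upper quantiles of signed residuals from a held-out calibration set. The sketch below illustrates that variant; whether EPF uses exactly this construction is not stated here, and the function name and calibration data are hypothetical. To match the per-horizon coverage described above, the calibration residuals would come from forecasts made at the same horizon.

```python
import math

def conformal_band(cal_actual, cal_forecast, point, coverage=0.90):
    """Asymmetric band from signed calibration residuals (actual - forecast)."""
    residuals = sorted(a - f for a, f in zip(cal_actual, cal_forecast))
    n = len(residuals)
    alpha = 1.0 - coverage
    # Conservative finite-sample ranks, alpha / 2 of miscoverage per tail
    hi_rank = min(n, math.ceil((n + 1) * (1 - alpha / 2)))
    lo_rank = max(1, math.floor((n + 1) * (alpha / 2)))
    return point + residuals[lo_rank - 1], point + residuals[hi_rank - 1]

# Made-up, right-skewed calibration residuals around a flat 50 EUR/MWh forecast
skewed = [-8, -6, -5, -4, -3, -3, -2, -2, -1, 0, 1, 2, 3, 4, 6, 8, 12, 20, 35]
cal_forecast = [50.0] * len(skewed)
cal_actual = [50.0 + r for r in skewed]

lo50, hi50 = conformal_band(cal_actual, cal_forecast, point=60.0, coverage=0.50)
lo90, hi90 = conformal_band(cal_actual, cal_forecast, point=60.0, coverage=0.90)
print(lo50, hi50)  # 57.0 66.0
print(lo90, hi90)  # 52.0 95.0
```

Because the calibration residuals are right-skewed, both bands extend further above the point forecast than below it.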

Technical

Is there an API?

Yes. The REST API provides programmatic access to forecasts, market data, and evaluation metrics. See the API Overview for details. Public access is planned for a future release.

What happens when data is missing?

The system has several resilience mechanisms:

Weather data: Open-Meteo has high availability; if a station fails, the weighted average is computed from the remaining stations
Commodity data: Forward-filled over weekends and holidays (these markets do not trade every day)
REE data: The pipeline retries collection for the last 2 days on each run
Model features: Tree-based models handle NaN values natively — no imputation needed
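Two of the mechanisms above are simple enough to sketch. The code assumes that missing readings are represented as `None` and that station weights are renormalised over the stations that reported; both are illustrative choices, as are the function names and sample values.

```python
def forward_fill(values):
    """Carry the last known value across gaps (e.g. weekends for daily prices)."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)  # stays None until the first real observation
    return out

def weighted_average(readings, weights):
    """Average over stations that reported, renormalising their weights."""
    live = [(r, w) for r, w in zip(readings, weights) if r is not None]
    total = sum(w for _, w in live)
    if total == 0:
        return None  # no station reported
    return sum(r * w for r, w in live) / total

gas = [30.0, None, None, 31.5]  # Friday, Saturday, Sunday, Monday (made up)
print(forward_fill(gas))        # [30.0, 30.0, 30.0, 31.5]

temps = [20.0, None, 10.0]      # the second station is down
avg = weighted_average(temps, [0.5, 0.3, 0.2])  # about 17.14
```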

How does the system handle price spikes?

Price spikes (>200 EUR/MWh) are part of the training data and the models learn to predict them. However, extreme events are inherently harder to forecast. The confidence intervals widen during volatile periods.

A sanity check aborts predictions if the maximum predicted price exceeds 500 EUR/MWh, flagging potential data quality issues.
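A guard of this kind might look like the sketch below. The 500 EUR/MWh threshold comes from the text above; the exception class and function name are hypothetical.

```python
MAX_PLAUSIBLE_PRICE = 500.0  # EUR/MWh, threshold stated in this FAQ

class ForecastSanityError(ValueError):
    """Raised when a forecast looks implausible (hypothetical exception type)."""

def check_forecast(prices):
    """Return prices unchanged, or abort if the peak exceeds the threshold."""
    peak = max(prices)
    if peak > MAX_PLAUSIBLE_PRICE:
        raise ForecastSanityError(
            f"max predicted price {peak:.2f} EUR/MWh exceeds "
            f"{MAX_PLAUSIBLE_PRICE:.0f}; check input data quality"
        )
    return prices

check_forecast([85.0, 210.0, 499.9])  # passes
```

Raising instead of clipping keeps a bad input from producing a plausible-looking forecast.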

How does the system handle negative prices?

Negative prices occur during periods of high renewable generation and low demand (typically midday in spring/autumn). The models are trained on historical negative price events and can predict them.

A post-prediction bias correction clips predicted prices at 0 EUR/MWh for hours in which fewer than 5% of historical prices were negative. Hours with frequent negative prices (e.g., midday solar surplus periods) are not clipped at zero; they are only floored at -50 EUR/MWh.
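The rule can be expressed as a per-hour floor. In this sketch the 5% threshold and the -50 EUR/MWh floor come from the text above, while the function name, the input shapes, and the way the historical negative share is looked up are assumptions.

```python
ZERO_CLIP_SHARE = 0.05  # below this historical share of negative prices, clip at zero
HARD_FLOOR = -50.0      # EUR/MWh, floor for hours where negatives are common

def apply_price_floor(predictions, negative_share_by_hour):
    """predictions: list of (hour, price); negative_share_by_hour: hour -> fraction."""
    out = []
    for hour, price in predictions:
        share = negative_share_by_hour.get(hour, 0.0)  # unseen hours treated as never negative
        floor = 0.0 if share < ZERO_CLIP_SHARE else HARD_FLOOR
        out.append((hour, max(price, floor)))
    return out

shares = {3: 0.01, 13: 0.20}                     # made-up historical shares
preds = [(3, -4.0), (13, -12.0), (13, -80.0)]    # made-up raw predictions
print(apply_price_floor(preds, shares))
# [(3, 0.0), (13, -12.0), (13, -50.0)]
```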

How is the system deployed?

EPF runs on secure cloud infrastructure with two automated daily forecast runs — day-ahead at ~10:00 UTC and strategic at ~15:00 UTC. A dedicated REST API provides programmatic access to forecasts and evaluation data.