
System Architecture

EPF is a cloud-hosted electricity price forecasting system covering Spain, Portugal, France, and Germany. It collects market and grid data daily, generates machine-learning price forecasts with XGBoost, and serves them through a REST API and an interactive dashboard.

System Overview

Layer                Capability
─────                ──────────
Data Collection      REE/ESIOS (ES), ENTSO-E (ES/PT/FR/DE), weather forecasts, commodity prices; refreshed daily
Feature Engineering  ~90 tabular features per horizon, country-aware (holidays, generation forecast targets, solar elevation)
Forecasting          Single-XGBoost + residual_1w target transform (all four countries) with per-country hyperparameter tuning
Prediction compute   Cloud Run Jobs per country × horizon group, triggered by Cloud Scheduler
API                  RESTful access to forecasts, actuals, multi-country market data, and evaluation metrics
Dashboard            Interactive price charts, heatmaps, multi-country map view, and evaluation tools

Data Flow

Data Sources                Processing         Compute    Output
────────────                ──────────         ───────    ──────
REE/ESIOS (ES) ─────────┐
ENTSO-E (ES/PT/FR/DE) ──┼── Data Collection ── Feature ──┐
Open-Meteo weather ─────┤   (daily,            Builder   │
Commodity (TTF, Brent, ─┘   hourly ingest)     (~90 ft)  │
  ETS via yfinance)                                      │
                                                         ▼
                            Cloud Run Jobs (per country/horizon)
                              • v12.0-abl DA / v11.0 ST (ES)
                              • v6.0-abl DA / v6.0 ST (PT/FR)
                              • v6.0 DA / v6.0 ST (DE, only w/ cross-prices)
                                                         │
                                                         ▼
                            VM Postgres (predictions, latest_forecasts)
                                                         │
                                                         ▼
                            FastAPI REST layer
                                                         │
                                                         ▼
                            React 19 Frontend

Collection Layer

  • REE/ESIOS (Spain-specific): 16 electricity indicators (demand, generation mix, day-ahead and intraday prices, interconnections, MIBEL data) at hourly + 15-minute resolution.
  • ENTSO-E Transparency Platform (ES/PT/FR/DE): day-ahead prices, actual generation by type, generation forecast, total load — hourly for PT/FR/DE, synchronized to the 15-minute grid for ES where possible.
  • Open-Meteo: weather forecasts (temperature, wind, irradiance, cloud cover) from population-weighted stations per country.
  • Commodities: TTF natural gas, Brent crude, ETS carbon via yfinance with fallback chains.
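A fallback chain for commodity quotes could look roughly like the sketch below. It is not the project's collector: the function names, the two stand-in sources, and the cached value are hypothetical, and no real yfinance call is made. It only illustrates the idea of trying sources in order and using the first one that returns data.

```python
from typing import Callable, List, Optional, Sequence

# A fetcher takes a symbol and returns a price, or None / raises on failure.
Fetcher = Callable[[str], Optional[float]]


def fetch_with_fallback(symbol: str, fetchers: Sequence[Fetcher]) -> float:
    """Try each fetcher in order and return the first usable price."""
    errors: List[str] = []
    for fetch in fetchers:
        name = getattr(fetch, "__name__", repr(fetch))
        try:
            price = fetch(symbol)
        except Exception as exc:  # one failing source must not abort the chain
            errors.append(f"{name}: {exc}")
            continue
        if price is not None:
            return price
        errors.append(f"{name}: no data")
    raise RuntimeError(f"all sources failed for {symbol}: {errors}")


# Hypothetical sources: a primary feed that is down, and a cached last value.
def primary_source(symbol: str) -> Optional[float]:
    raise ConnectionError("upstream unavailable")


def cached_last_value(symbol: str) -> Optional[float]:
    # Illustrative number only, not real market data.
    return {"TTF": 31.5}.get(symbol)
```

With these stand-ins, `fetch_with_fallback("TTF", [primary_source, cached_last_value])` skips the failing primary source and returns the cached value; a symbol that no source knows raises `RuntimeError` with the collected failure reasons.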

Processing Layer

  • Feature Engineering (src/data/feature_engineering.py): builds ~90 tabular features per country. Includes country-aware calendar features, price lags, rolling statistics, renewable mix ratios, weather interactions, commodity dynamics, generation forecast targets (Z2), solar elevation (Z4), and country-aware holidays (Z1).
  • Cross-price gating (EPF_CROSS_PRICE_COUNTRIES env var): cross-country price features are only injected for countries where they measurably improve MAE. Current setting: DE only. See the Z3 ablation decision.
  • Target transform: residual_1w. The model predicts the deviation from the price at the same slot one week earlier. This is the only sanctioned target transform; it has survived every ablation.
  • Training: walk-forward backtest on the most recent 145 days. Optuna was abandoned for hyperparameter search after the v6.1 regression, where it overfit TimeSeriesSplit folds containing 2022 crisis data.
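The residual_1w transform and the cross-price gate described in the list above can be sketched as follows. This is a minimal illustration, not the code in src/data/feature_engineering.py: the function names are hypothetical, prices are held in a plain dict keyed by timestamp, and the environment variable is assumed to hold a comma-separated list of country codes (for example `DE`).

```python
import os
from datetime import datetime, timedelta
from typing import Dict, Mapping, Optional

ONE_WEEK = timedelta(days=7)


def to_residual_1w(prices: Dict[datetime, float]) -> Dict[datetime, float]:
    """Training target: price minus the price at the same slot one week earlier.

    Slots with no value one week back are dropped.
    """
    return {
        ts: price - prices[ts - ONE_WEEK]
        for ts, price in prices.items()
        if ts - ONE_WEEK in prices
    }


def from_residual_1w(ts: datetime, predicted_residual: float,
                     prices: Dict[datetime, float]) -> float:
    """Inverse transform: add the week-old price back to the model output."""
    return prices[ts - ONE_WEEK] + predicted_residual


def cross_prices_enabled(country: str,
                         env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if cross-country price features are enabled for a country.

    Assumption: EPF_CROSS_PRICE_COUNTRIES is a comma-separated list of codes.
    """
    env = os.environ if env is None else env
    raw = env.get("EPF_CROSS_PRICE_COUNTRIES", "")
    enabled = {code.strip().upper() for code in raw.split(",") if code.strip()}
    return country.upper() in enabled
```

The inverse step matters at serving time: the model output is a residual, so the price written to the database would be the week-old price plus the prediction.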

Prediction (Day-Ahead and Strategic)

Single-model XGBoost (depth=12, learning_rate=0.03, q=0.55, price-weighting 3× above 60 EUR/MWh, sample decay with a 365-day halflife). No LSTM: the v10.x LSTM encoder was retracted in April 2026 after two code bugs were found that left the LSTM block contributing no useful signal; see the v11.0 changelog entry. No ensemble: the pre-2026 LightGBM/HistGB ensemble was retired at the same time; single XGBoost is strictly better.
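The two weighting terms in the model description (3× above 60 EUR/MWh, 365-day halflife) can be combined into a per-row training weight roughly as below. The constants come from the text; the function name is hypothetical, and multiplying the two factors is an assumption, since the page does not say how they are combined.

```python
from datetime import date

PRICE_THRESHOLD_EUR_MWH = 60.0  # from the text: extra weight above 60 EUR/MWh
HIGH_PRICE_FACTOR = 3.0         # from the text: 3x weighting
HALFLIFE_DAYS = 365.0           # from the text: 365-day halflife


def sample_weight(price: float, sample_day: date, reference_day: date) -> float:
    """Weight for one training row.

    Assumption: the price factor and the recency decay are multiplied.
    """
    age_days = (reference_day - sample_day).days
    decay = 0.5 ** (age_days / HALFLIFE_DAYS)  # halves every 365 days
    price_factor = HIGH_PRICE_FACTOR if price > PRICE_THRESHOLD_EUR_MWH else 1.0
    return price_factor * decay
```

Under this assumption, a row from exactly 365 days before the reference date priced at 80 EUR/MWh gets weight 3 × 0.5 = 1.5, while a row from the reference date itself priced at 40 EUR/MWh gets weight 1.0.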

Serving Layer

  • API (FastAPI on the VM): 30+ endpoints. /forecast/combined?country=PT returns dayahead + strategic + band config in one call. /countries enumerates supported countries. /market/* exposes ENTSO-E data.
  • Data Store: PostgreSQL 16 on the production VM (migrated from SQLite in Phase 2, April 2026). Primary keys on predictions, latest_forecasts, and every evaluation table are country-aware.
  • Dashboard: React 19 + Vite, with a country selector, interactive Europe map on the Multi-Country page, month/year period pickers, and per-country forecast overlays.
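A client call to the combined forecast endpoint might look like the sketch below. Only the path `/forecast/combined` and the `country` query parameter come from the text; the base URL is a placeholder, the helper names are hypothetical, and the shape of the JSON response is not documented here, so the function returns the parsed body as-is.

```python
import json
from typing import Any
from urllib.parse import urlencode
from urllib.request import urlopen

SUPPORTED_COUNTRIES = ("ES", "PT", "FR", "DE")


def combined_forecast_url(base_url: str, country: str) -> str:
    """Build the URL for /forecast/combined for one supported country."""
    country = country.upper()
    if country not in SUPPORTED_COUNTRIES:
        raise ValueError(f"unsupported country: {country}")
    query = urlencode({"country": country})
    return f"{base_url.rstrip('/')}/forecast/combined?{query}"


def fetch_combined_forecast(base_url: str, country: str,
                            timeout: float = 10.0) -> Any:
    """GET the combined forecast and return the parsed JSON body."""
    with urlopen(combined_forecast_url(base_url, country), timeout=timeout) as resp:
        return json.load(resp)
```

For example, `combined_forecast_url("https://epf.example.com/", "pt")` yields `https://epf.example.com/forecast/combined?country=PT`, where `epf.example.com` stands in for the real host.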

Daily Operations

Two automated prediction runs per country per day, triggered by Cloud Scheduler against Cloud Run Jobs:

Run        Cloud Scheduler (UTC)  Scope                Output
───        ─────────────────────  ─────                ──────
Day-Ahead  10:10                  D+1 (next 24h)       96 quarter-hour prices (ES) / 24 hourly prices (PT/FR/DE)
Strategic  15:10                  D+2 to D+7 (6 days)  576 quarter-hour prices (ES) / 144 hourly prices (PT/FR/DE)
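The output sizes in the table follow from slots per day times days covered, which makes a simple completeness check possible after each run. The sketch below is illustrative; the dictionary keys and the function name are hypothetical.

```python
# Slots per day: 15-minute resolution for ES, hourly for the other countries.
SLOTS_PER_DAY = {"ES": 96, "PT": 24, "FR": 24, "DE": 24}

# Days covered by each run: D+1 for day-ahead, D+2 to D+7 for strategic.
HORIZON_DAYS = {"dayahead": 1, "strategic": 6}


def expected_slots(country: str, run: str) -> int:
    """Number of price rows one run should produce for one country."""
    return SLOTS_PER_DAY[country] * HORIZON_DAYS[run]
```

A run that writes fewer rows than `expected_slots(country, run)` is incomplete, for instance fewer than 576 rows for the Spanish strategic run.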

Each Cloud Run Job cold-starts a torch-free container, downloads the country/version joblib from GCS (gs://epf-models-epriceforecaster/<COUNTRY>/<VERSION>/), runs prediction, and writes rows to the VM PostgreSQL via a VPC connector. See the Cloud Run operations page for the full runbook.
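The model location for a given job can be derived from the country and horizon, as in this sketch. The bucket name and the version labels are taken from this page; using the two-letter code and the version label verbatim as path segments is an assumption, and the joblib file name inside the prefix is not specified here, so it is left out.

```python
MODEL_BUCKET = "epf-models-epriceforecaster"

# Version labels per (country, horizon) as listed in the Data Flow diagram.
# DA = day-ahead, ST = strategic.
MODEL_VERSIONS = {
    ("ES", "DA"): "v12.0-abl", ("ES", "ST"): "v11.0",
    ("PT", "DA"): "v6.0-abl",  ("PT", "ST"): "v6.0",
    ("FR", "DA"): "v6.0-abl",  ("FR", "ST"): "v6.0",
    ("DE", "DA"): "v6.0",      ("DE", "ST"): "v6.0",
}


def model_prefix(country: str, horizon: str) -> str:
    """GCS prefix holding the model for one country and horizon.

    Assumption: <COUNTRY> and <VERSION> are the literal code and label.
    """
    version = MODEL_VERSIONS[(country, horizon)]
    return f"gs://{MODEL_BUCKET}/{country}/{version}/"
```

Keeping the mapping in one place means a version bump changes a single entry rather than each job definition.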

Data collection (REE, ENTSO-E, weather, commodities, news) runs on VM cron independently and must complete before the Cloud Scheduler triggers fire.