# v2.0 — Multi-Model Ensemble

Date: February 17, 2026

## What Changed

### Multi-Model Ensemble
Replaced the single HistGBT model with three gradient boosting implementations:
| Model | Library | Key Strength |
|---|---|---|
| HistGBT | scikit-learn | Fastest CPU training |
| LightGBM | lightgbm | Native categorical features, GPU support |
| XGBoost | xgboost | Strong regularization, GPU support |
All three models train independently on the same features with MSE loss; the ensemble prediction is the equal-weight average of their outputs.
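A minimal sketch of the equal-weight averaging step. The stub models below are stand-ins for the fitted HistGBT, LightGBM, and XGBoost regressors; all names here are illustrative, not from the codebase.

```python
def ensemble_predict(models, X):
    """Average each model's predictions with equal weights."""
    preds = [m.predict(X) for m in models]
    n = len(preds)
    # Element-wise mean across the model axis
    return [sum(p[i] for p in preds) / n for i in range(len(X))]

class _Stub:
    """Stand-in model that returns input plus a constant offset."""
    def __init__(self, offset):
        self.offset = offset
    def predict(self, X):
        return [x + self.offset for x in X]

models = [_Stub(0.0), _Stub(1.0), _Stub(2.0)]
averaged = ensemble_predict(models, [10.0, 20.0])  # [11.0, 21.0]
```

With real regressors, the same loop works unchanged as long as each object exposes a `predict` method over the shared feature matrix.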
### Conformal Prediction
Added split conformal prediction for uncertainty quantification:
- Collects out-of-fold residuals during cross-validation
- Groups residuals by horizon bucket (days 1–7)
- Computes 50% and 90% prediction intervals per bucket
- Intervals widen for distant horizons and narrow for near-term forecasts
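The steps above can be sketched as follows, assuming symmetric intervals built from the conformal quantile of absolute out-of-fold residuals per horizon bucket. Function names and the toy residuals are illustrative assumptions.

```python
import math

def conformal_quantile(sorted_vals, level):
    """Split-conformal quantile: the ceil((n+1)*level)-th order
    statistic of the calibration residuals, clipped to n."""
    n = len(sorted_vals)
    k = min(math.ceil((n + 1) * level), n)  # 1-based rank
    return sorted_vals[k - 1]

def interval(point, abs_residuals, level):
    """Symmetric prediction interval: point +/- residual quantile."""
    q = conformal_quantile(sorted(abs_residuals), level)
    return (point - q, point + q)

# Toy residuals: day-1 errors are small, day-7 errors are large,
# so day-7 intervals come out wider for the same forecast.
residuals = {1: [0.5, 1.0, 1.5, 2.0], 7: [2.0, 4.0, 6.0, 8.0]}
lo1, hi1 = interval(50.0, residuals[1], 0.9)  # day-1 bucket
lo7, hi7 = interval(50.0, residuals[7], 0.9)  # day-7 bucket
```

Bucketing by horizon is what produces the widening pattern: each bucket's interval width is driven only by the residuals observed at that lead time.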
### Walk-Forward Backtesting
Implemented a backtesting framework that simulates production conditions:
- Trains on data available up to each test date
- Generates predictions using only past information
- Stores results with a `_backtest` suffix for separate evaluation
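The walk-forward loop reduces to a simple pattern: at each test step, fit on everything strictly before it, then predict using only that history. The `fit`/`predict` callables and data layout below are illustrative assumptions, not the project's API.

```python
def walk_forward(series, fit, predict, start):
    """Yield (actual, predicted) pairs, training only on past data."""
    results = []
    for t in range(start, len(series)):
        model = fit(series[:t])                   # data available up to t
        results.append((series[t], predict(model, series[:t])))
    return results

# Toy example: the "model" is just the mean of the history seen so far.
fit = lambda history: sum(history) / len(history)
predict = lambda model, history: model
out = walk_forward([1.0, 2.0, 3.0, 4.0], fit, predict, start=2)
# out == [(3.0, 1.5), (4.0, 2.0)]
```

Because the slice `series[:t]` excludes the test point and everything after it, no future information leaks into training, which is what makes the resulting error estimates honest.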
### Naive Benchmarks
Added persistence and weekly seasonal baselines:
- Persistence: Tomorrow’s price = today’s same hour
- Weekly seasonal: Next week’s price = this week’s same hour and day
- Skill scores measure improvement over these naive baselines
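A sketch of the persistence baseline and skill score, assuming hourly prices, MAE as the error metric, and the common convention `skill = 1 - MAE_model / MAE_naive` (positive means the model beats the baseline). The weekly seasonal baseline is the same idea with a lag of 168 hours instead of 24. All names are illustrative.

```python
def mae(y_true, y_pred):
    """Mean absolute error over paired observations."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def persistence(prices, lag=24):
    """Forecast each hour with the value `lag` hours earlier
    (lag=24: today's same hour; lag=168: last week's same hour/day)."""
    return prices[:-lag]

def skill_score(y_true, model_pred, naive_pred):
    """1 - MAE_model / MAE_naive; > 0 means value over the baseline."""
    return 1.0 - mae(y_true, model_pred) / mae(y_true, naive_pred)

# Toy data: 48 hourly prices trending upward, so persistence lags behind.
prices = [float(h) for h in range(48)]
y_true = prices[24:]                       # the second day
naive = persistence(prices)                # MAE = 24 on this series
model_pred = [y - 1.0 for y in y_true]     # hypothetical model, MAE = 1
skill = skill_score(y_true, model_pred, naive)
```

On flat or strongly seasonal series the naive MAE can approach zero, so in practice the denominator needs a floor before computing the ratio.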
## Impact
- Ensemble averaging reduces variance — uncorrelated model errors partially cancel
- Prediction intervals enable risk-aware decision making
- Backtesting provides honest out-of-sample performance estimates
- Skill scores contextualize whether the model adds value vs simple heuristics