Model Artifacts
Overview
After training, each model is serialized as a joblib file containing the trained model object, metadata, and conformal calibration data. These artifacts are loaded at prediction time.
Artifact Structure
Each artifact is a Python dictionary saved with joblib.dump():
{
    "model": trained_model_object,
    "version": "2026-02-27",
    "trained_at": "2026-02-27T14:30:00",
    "feature_names": ["hour_sin", "hour_cos", "price_lag_24h", ...],
    "cv_metrics": [
        {"fold": 0, "mae": 3.2, "rmse": 4.8, "mape": 7.1},
        {"fold": 1, "mae": 3.5, "rmse": 5.1, "mape": 7.6},
        {"fold": 2, "mae": 3.1, "rmse": 4.5, "mape": 6.9},
        {"fold": 3, "mae": 3.4, "rmse": 5.0, "mape": 7.4},
        {"fold": 4, "mae": 3.3, "rmse": 4.9, "mape": 7.2},
    ],
    "conformal_calibrator": {
        "residuals_by_bucket": {
            "DA1": [-2.1, 0.5, -1.3, 3.8, ...],
            "DA2": [-1.8, 1.2, -0.9, 4.5, ...],
        },
        "bucket_for_horizon": {14: "DA1", 15: "DA1", ..., 26: "DA2", ...},
    },
}
Naming Convention
model_{horizon_group}_{model_type}_{run_mode}.joblib
Examples:
| File | Description |
|---|---|
| model_DA1_histgb_dayahead.joblib | HistGBM, D+1 morning, day-ahead run |
| model_DA2_lightgbm_dayahead.joblib | LightGBM, D+1 afternoon, day-ahead run |
| model_S1_xgboost_strategic.joblib | XGBoost, D+2, strategic run |
| model_S5_ensemble_strategic.joblib | Ensemble metadata, D+6-D+7, strategic run |
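The naming pattern above can be sketched as a pair of small helpers. This is a minimal illustration, not the project's own code: the function names `artifact_filename` and `parse_artifact_filename` are hypothetical, and the rule that no field may contain an underscore is an assumption inferred from the examples in the table.

```python
from pathlib import Path

FIELDS = ("horizon_group", "model_type", "run_mode")


def artifact_filename(horizon_group: str, model_type: str, run_mode: str) -> str:
    """Build a filename following model_{horizon_group}_{model_type}_{run_mode}.joblib."""
    for part in (horizon_group, model_type, run_mode):
        # Assumption: underscores separate the fields, so a field
        # containing one would make the name ambiguous to parse.
        if not part or "_" in part:
            raise ValueError(f"invalid filename component: {part!r}")
    return f"model_{horizon_group}_{model_type}_{run_mode}.joblib"


def parse_artifact_filename(filename: str) -> dict:
    """Split an artifact filename back into its three fields."""
    name = Path(filename).name
    prefix, suffix = "model_", ".joblib"
    if not (name.startswith(prefix) and name.endswith(suffix)):
        raise ValueError(f"not an artifact filename: {name!r}")
    parts = name[len(prefix):-len(suffix)].split("_")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}: {name!r}")
    return dict(zip(FIELDS, parts))
```

For example, `artifact_filename("DA1", "histgb", "dayahead")` yields the first filename in the table, and parsing that filename returns the three fields again.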
What Gets Saved
1. Trained Model Object
The scikit-learn, LightGBM, or XGBoost model instance with all learned parameters (tree structures, split values, leaf predictions). This is the core artifact used for inference.
2. Feature Names
The ordered list of feature column names used during training. At prediction time, features must be provided in this exact order. If a feature is missing or added, the prediction pipeline will raise an error rather than silently producing incorrect results.
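A check of this kind might look like the sketch below. It is an assumption about how the verification could be written, not the pipeline's actual code: the function name `check_feature_alignment` is hypothetical, and treating a pure reordering as an error (rather than silently reordering columns) is one strict reading of "this exact order".

```python
def check_feature_alignment(expected: list[str], provided: list[str]) -> None:
    """Raise ValueError unless `provided` matches the training feature list exactly."""
    missing = [name for name in expected if name not in provided]
    unexpected = [name for name in provided if name not in expected]
    if missing or unexpected:
        raise ValueError(
            f"feature mismatch: missing={missing}, unexpected={unexpected}"
        )
    # Same set of names, so any remaining difference is ordering.
    if list(provided) != list(expected):
        raise ValueError("features are present but not in training order")
```

Failing loudly here matters because tree-based models typically consume features by position, so a shifted column would still produce numbers, just wrong ones.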
3. Cross-Validation Metrics
Per-fold MAE, RMSE, and MAPE from the 5-fold TimeSeriesSplit. These metrics serve as the performance baseline for drift detection: drift is flagged when live error exceeds 1.5× the CV MAE.
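The drift rule can be illustrated with a short sketch. Two details are assumptions, since the text does not specify them: the per-fold MAE values are aggregated by their mean, and live accuracy is measured as the MAE over a window of recent absolute errors. The names `cv_mae_baseline` and `is_drifting` are hypothetical.

```python
from statistics import mean

DRIFT_FACTOR = 1.5  # multiplier stated in the text


def cv_mae_baseline(cv_metrics: list[dict]) -> float:
    """Assumption: the baseline is the mean of the per-fold MAE values."""
    return mean(fold["mae"] for fold in cv_metrics)


def is_drifting(live_abs_errors: list[float], cv_metrics: list[dict],
                factor: float = DRIFT_FACTOR) -> bool:
    """Flag drift when the live MAE exceeds `factor` times the CV baseline."""
    live_mae = mean(live_abs_errors)
    return live_mae > factor * cv_mae_baseline(cv_metrics)
```

With the example fold metrics from the artifact structure, the mean CV MAE is 3.3, so under these assumptions drift would be flagged once the live MAE rises above 4.95.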
4. Conformal Calibrator
The residual distributions used for confidence interval generation. Contains:
- Residuals by bucket: Signed residuals (actual - predicted) from out-of-fold predictions, grouped by horizon bucket
- Bucket mapping: Which horizon hours belong to which bucket
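One way such a calibrator could be turned into an interval is sketched below: look up the bucket for the horizon, take empirical quantiles of that bucket's signed residuals, and add them to the point forecast. This is an assumption about the method, not a description of the actual pipeline. In particular, the linear-interpolation quantile and the absence of a finite-sample correction are simplifications, and `prediction_interval` is a hypothetical name.

```python
import math


def empirical_quantile(sorted_values: list[float], q: float) -> float:
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def prediction_interval(calibrator: dict, horizon: int, point_forecast: float,
                        alpha: float = 0.1) -> tuple[float, float]:
    """Return (lower, upper) for a nominal 1 - alpha interval."""
    bucket = calibrator["bucket_for_horizon"][horizon]
    residuals = sorted(calibrator["residuals_by_bucket"][bucket])
    if not residuals:
        raise ValueError(f"no residuals stored for bucket {bucket!r}")
    # Residuals are actual - predicted, so adding a residual quantile
    # to the forecast gives a bound on the actual value.
    lower = point_forecast + empirical_quantile(residuals, alpha / 2)
    upper = point_forecast + empirical_quantile(residuals, 1 - alpha / 2)
    return lower, upper
```

Because the residuals are signed, the resulting interval need not be symmetric around the forecast, which is useful when a model tends to miss in one direction.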
5. Version and Timestamp
The training date and execution timestamp, used for artifact versioning and auditing.
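As an illustration, both fields can be derived from a single timestamp so that they never disagree. This is a sketch under that assumption (the source does not say how the two values are generated), and `version_fields` is a hypothetical helper name.

```python
from datetime import datetime


def version_fields(trained_at: datetime) -> dict:
    """Derive the "version" and "trained_at" entries from one timestamp."""
    return {
        # Date part only, e.g. "2026-02-27"
        "version": trained_at.date().isoformat(),
        # Full timestamp to the second, e.g. "2026-02-27T14:30:00"
        "trained_at": trained_at.isoformat(timespec="seconds"),
    }
```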
Loading at Prediction Time
The prediction pipeline loads artifacts by constructing the expected filename from the horizon group, model type, and run mode:
import joblib

artifact = joblib.load(artifact_path)
model = artifact["model"]
expected_features = artifact["feature_names"]
calibrator = artifact["conformal_calibrator"]

Feature alignment is verified before prediction: if the feature columns don’t match the training signature, an error is raised rather than producing incorrect results.
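The loading step could be wrapped in a helper along the following lines. This is a hedged sketch rather than the pipeline's real loader: `load_artifact`, `REQUIRED_KEYS`, and the `loader` parameter are hypothetical, and the key list is simply taken from the artifact structure documented above. `joblib` is imported lazily so that a different loader can be substituted, for instance in tests.

```python
from pathlib import Path

# Assumption: every artifact carries the keys shown in the documented structure.
REQUIRED_KEYS = (
    "model", "version", "trained_at",
    "feature_names", "cv_metrics", "conformal_calibrator",
)


def load_artifact(models_dir, horizon_group, model_type, run_mode, loader=None):
    """Locate an artifact by naming convention, load it, and check its keys."""
    if loader is None:
        import joblib  # third-party dependency, only needed for the default loader
        loader = joblib.load
    path = Path(models_dir) / f"model_{horizon_group}_{model_type}_{run_mode}.joblib"
    if not path.is_file():
        raise FileNotFoundError(f"no artifact at {path}")
    artifact = loader(path)
    missing = [key for key in REQUIRED_KEYS if key not in artifact]
    if missing:
        raise KeyError(f"artifact {path.name} is missing keys: {missing}")
    return artifact
```

Note that joblib files are pickle-based, so they should only be loaded from trusted locations.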
Artifact Lifecycle
1. Training → joblib.dump() → model artifact saved
2. Prediction → joblib.load() → model.predict(features)
3. Retraining → New artifact overwrites old one (same filename)
4. Versioning → "version" field inside artifact tracks training date

Old artifacts are overwritten during retraining. For versioned history, the training date inside each artifact provides an audit trail.