Evaluation Workflow

Aggregate Results

Run this on your local machine after inpainting completes:

python aggregate_results.py \
  -e "paper-2025-06_inpainting" \
  --compute_wis \
  --create_ensemble \
  --plot_results

Operations

  • Read MLflow results: Collect all individual forecasts
  • Compute WIS scores: Weighted Interval Score across scenarios/dates/configs
  • Create ensemble forecasts: Combine predictions
  • Generate summary tables: Statistical summaries
  • Create plots: Visualizations for papers
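The ensemble step can be illustrated with a small sketch. The source does not say how `aggregate_results.py` combines predictions, so the quantile-wise median used here is an assumption, and the function name `median_ensemble` and the nested-dict input layout are hypothetical.

```python
from statistics import median


def median_ensemble(forecasts):
    """Combine quantile forecasts from several models.

    forecasts: dict mapping model name -> {quantile level: predicted value}.
    Returns {quantile level: median of that quantile across models}.
    Assumption: a quantile-wise median; the actual script may combine differently.
    """
    if not forecasts:
        raise ValueError("at least one forecast is required")
    # Only combine quantile levels that every model provides.
    shared_levels = set.intersection(*(set(f) for f in forecasts.values()))
    return {
        level: median(f[level] for f in forecasts.values())
        for level in sorted(shared_levels)
    }


# Hypothetical forecasts from three model configurations.
example_forecasts = {
    "model_a": {0.05: 5.0, 0.5: 10.0, 0.95: 15.0},
    "model_b": {0.05: 6.0, 0.5: 12.0, 0.95: 20.0},
    "model_c": {0.05: 4.0, 0.5: 11.0, 0.95: 18.0},
}
ensemble = median_ensemble(example_forecasts)
```

A median is less sensitive to a single outlying model than a mean, which is one common reason to prefer it for quantile ensembles.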

WIS Scoring

Weighted Interval Scores (WIS) for forecast evaluation are computed with Adrian Lison's interval scoring library. Lower scores indicate better forecasts.
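To make the score concrete, here is a minimal standalone sketch of the standard WIS definition for a forecast given as a median plus central prediction intervals. It is not the library's API: the function names and the `intervals` layout are hypothetical, and the library used by the script may expose a different interface.

```python
def interval_score(lower, upper, y, alpha):
    """Interval score of a central (1 - alpha) prediction interval for observation y."""
    width = upper - lower
    # Penalties apply only when y falls outside the interval.
    under = (2.0 / alpha) * max(lower - y, 0.0)
    over = (2.0 / alpha) * max(y - upper, 0.0)
    return width + under + over


def weighted_interval_score(median, intervals, y):
    """WIS from a median and a dict {alpha: (lower, upper)} of central intervals.

    Uses weights w_0 = 1/2 for the median term and w_k = alpha_k / 2
    for each interval, normalised by K + 1/2.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


# Hypothetical forecast: median 10 with 50% and 90% intervals, observed value 13.
score = weighted_interval_score(
    median=10.0,
    intervals={0.5: (8.0, 12.0), 0.1: (5.0, 15.0)},
    y=13.0,
)
```

In this example the observation lies outside the 50% interval but inside the 90% interval, so only the narrower interval contributes a penalty beyond its width.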

Output

Results ready for:

  • Paper figures
  • Performance analysis
  • Model comparison