Inpainting Architecture
CoPaint Algorithm
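InfluPaint uses the CoPaint algorithm (replacing the earlier RePaint implementation) to condition forecasts on observed data. Inpainting modifies the reverse diffusion iterations so that the unmasked (observed) regions are sampled using the ground truth, while the masked regions are generated by the model.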
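For intuition only, the earlier RePaint-style form of this idea can be written as a per-step merge of a noised copy of the observations with the model's sample. The symbols here are assumptions, not taken from the InfluPaint code: $m$ is a binary mask equal to 1 where data are observed, $x_0$ is the ground truth, and $\bar\alpha_t$ is the cumulative noise schedule. CoPaint imposes the observations differently, so this should not be read as InfluPaint's exact update.

```latex
\begin{aligned}
x^{\text{known}}_{t-1}   &\sim \mathcal{N}\!\left(\sqrt{\bar\alpha_{t-1}}\,x_0,\ (1-\bar\alpha_{t-1})\,I\right) \\
x^{\text{unknown}}_{t-1} &\sim \mathcal{N}\!\left(\mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right) \\
x_{t-1} &= m \odot x^{\text{known}}_{t-1} + (1-m) \odot x^{\text{unknown}}_{t-1}
\end{aligned}
```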
Atomic Inpainting Design
Each inpainting job is independent and atomic:
- Atomic: Single scenario + single date + single config per job
- Parallel: 20 dates × 3 configs = 60 parallel jobs per scenario
- Independent: Each job can run on different nodes/GPUs
- Fault-tolerant: Failed jobs don't affect others
Example: For scenario 5 with 20 dates and 3 configs:
- Old way: 1 job × 60 sequential runs = slow
- New way: 60 parallel jobs = fast
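The job count above can be sketched as a Cartesian product of dates and configs for one scenario. This is a minimal illustration, not the project's actual job generator: `enumerate_jobs`, the date labels, and the config names are all hypothetical.

```python
from itertools import product


def enumerate_jobs(scenario, dates, configs):
    """Return one (scenario, date, config) tuple per atomic job."""
    return [(scenario, d, c) for d, c in product(dates, configs)]


# Hypothetical inputs: 20 date labels and 3 config names.
dates = [f"date_{i:02d}" for i in range(20)]
configs = ["config_a", "config_b", "config_c"]

jobs = enumerate_jobs(5, dates, configs)
print(len(jobs))  # 60
```

Because each tuple fully describes one job, any job can be rerun on its own after a failure without touching the other 59.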
Model Loading
Inpainting uses dual model loading:
- MLflow-first: Load trained model from MLflow using run ID
- Filesystem fallback: Load from the filesystem if MLflow is unavailable
The get_mlflow_run_id.py utility maps scenario IDs to MLflow run IDs.
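A minimal sketch of the MLflow-first, filesystem-fallback pattern follows. The two loaders are passed in as callables so the example runs without MLflow installed. `load_model`, the stub loaders, the run ID, and the checkpoint path are hypothetical and do not reflect InfluPaint's actual function names or layout.

```python
def load_model(run_id, fs_path, load_from_mlflow, load_from_filesystem):
    """Try MLflow first; fall back to the filesystem on any failure.

    Returns (model, source), where source is "mlflow" or "filesystem".
    """
    if run_id is not None:
        try:
            return load_from_mlflow(run_id), "mlflow"
        except Exception as exc:  # e.g. tracking server unreachable
            print(f"MLflow load failed ({exc}); falling back to filesystem")
    return load_from_filesystem(fs_path), "filesystem"


# Stub loaders standing in for the real ones (assumptions for illustration).
def mlflow_unavailable(run_id):
    raise ConnectionError("tracking server unreachable")


def read_checkpoint(path):
    return {"checkpoint": path}


model, source = load_model(
    "abc123",                     # hypothetical MLflow run ID
    "checkpoints/scenario_5.pt",  # hypothetical checkpoint path
    mlflow_unavailable,
    read_checkpoint,
)
print(source)  # filesystem
```

Returning the source alongside the model makes it easy to log which path was taken for each job.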
Mega-Array Submission
All scenarios are submitted in one job array:
python generate_inpaint_jobs.py -e "paper-2025-06" --scenarios "0-31"
sbatch inpaint_array_paper-2025-06_all_scenarios.run
Example: 32 scenarios × 20 dates × 3 configs = 1,920 parallel jobs
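One way a single array task could recover its (scenario, date, config) triple is to decode the Slurm array index with integer division. This is a sketch under an assumed ordering (scenario-major, then date, then config). The mapping that `generate_inpaint_jobs.py` actually emits may differ, and `decode_task_id` is a hypothetical helper.

```python
import os


def decode_task_id(task_id, n_dates=20, n_configs=3):
    """Map a flat array index to (scenario, date_idx, config_idx)."""
    scenario, rest = divmod(task_id, n_dates * n_configs)
    date_idx, config_idx = divmod(rest, n_configs)
    return scenario, date_idx, config_idx


# Slurm sets SLURM_ARRAY_TASK_ID inside each array task; default to 0 elsewhere.
task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
scenario, date_idx, config_idx = decode_task_id(task_id)
print(scenario, date_idx, config_idx)
```

With 32 scenarios, indices 0 through 1919 each map to a distinct triple, matching the 1,920 jobs in the example.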