From 0d5479cb170b7501d8319f12ecbd33c8c2230b55 Mon Sep 17 00:00:00 2001
From: Ignacio Lopez-Gomez
Date: Fri, 31 Jan 2025 11:06:29 -0800
Subject: [PATCH] Code update

PiperOrigin-RevId: 721834319
---
 .../downscaling/gcm_wrf/README.md | 50 +++++++++++++------
 1 file changed, 35 insertions(+), 15 deletions(-)

diff --git a/swirl_dynamics/projects/probabilistic_diffusion/downscaling/gcm_wrf/README.md b/swirl_dynamics/projects/probabilistic_diffusion/downscaling/gcm_wrf/README.md
index b092ccb..1f8d233 100644
--- a/swirl_dynamics/projects/probabilistic_diffusion/downscaling/gcm_wrf/README.md
+++ b/swirl_dynamics/projects/probabilistic_diffusion/downscaling/gcm_wrf/README.md
@@ -2,12 +2,14 @@
 This repository includes the source code used to design, train, and evaluate
 the Regional Residual Diffusion-based Downscaling (R2-D2) model described in the
-paper "Dynamical-generative downscaling of climate model ensembles" [arXiv](
-https://doi.org/10.48550/arXiv.2410.01776).
+paper "Dynamical-generative downscaling of climate model ensembles"
+(Lopez-Gomez et al., [2024](https://doi.org/10.48550/arXiv.2410.01776)). The
+weights of a pre-trained R2-D2 model, as well as pre-processed input and
+evaluation data, are available on [Google Cloud](https://console.cloud.google.com/storage/browser/dynamical_generative_downscaling).
 
 The R2-D2 model is designed to take coarse-resolution climate data over a
 limited-area region and provide plausible high-resolution samples of the input
-fields over the same domain. The method relies on paired trajectories of
+fields over the same domain. The method relies on loosely paired trajectories of
 coarse-resolution climate fields and a higher-fidelity, high-resolution version
 of the same fields for training. Such paired trajectories may be obtained from
 dynamically downscaled climate projections, which track the trajectory of a
@@ -15,13 +17,12 @@ coarse-resolution input, or by forcing a coarse-resolution model to track a
 high-fidelity, high-resolution observational dataset. This paper explores the
 first approach, which is applicable to future climate downscaling.
 
-
 ## Application to climate projections over the Western United States
 
 The paper explores the application of R2-D2 to downscale future climate
 projections over the western United States, under the SSP3-7.0 forcing
 scenario. The data used to train and evaluate the models comes from the WUS-D3
-dataset, described by Rahimi et al [(2023)](https://doi.org/10.5194/gmd-17-2265-2024).
+dataset, described by Rahimi et al. [(2024)](https://doi.org/10.5194/gmd-17-2265-2024).
 The WUS-D3 dataset is openly available [here](https://registry.opendata.aws/wrf-cmip6/).
 All references to the dataset should follow the guidelines provided by the
 authors of the dataset.
@@ -32,29 +33,48 @@ Models (ESMs): CanESM5, MPI-ESM1-2-HR, EC-Earth3-Veg, UKESM1-0-LL, ACCESS-CM2,
 MIROC6, TaiESM1, and NorESM2-MM. Bias correction follows the procedure
 described by Risser et al. [(2024)](https://doi.org/10.1029/2023GL105979).
 
+## Getting started
+
+The easiest way to learn how to use the models and navigate the codebase is
+through the demo colabs. The starting point for those interested in performing
+inference with the model is the Google Colab *OS_downscaling_inference.ipynb*.
+This first colab walks through installing the code, the expected input and
+output dataset formats, instantiating an R2-D2 model from pre-trained weights,
+and performing inference.
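+
+As a quick orientation (a minimal sketch, not a substitute for the colab),
+the pre-processed inputs can be opened directly with `xarray`. The bucket
+name comes from the link above; the exact object path below is an assumption,
+so check the colab for the authoritative layout:
+
+```python
+# Open the pre-processed coarse-resolution inputs from the public bucket.
+# Requires `xarray`, `zarr`, and `gcsfs`. The object path is an assumption;
+# the inference colab documents the actual layout.
+import xarray as xr
+
+inputs = xr.open_zarr(
+    "gs://dynamical_generative_downscaling/"
+    "hourly_d01_cubic_interpolated_to_d02.zarr"
+)
+print(inputs)  # Coarse fields interpolated to the high-resolution grid.
+```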
+
+The colab *OS_multimodel_distribution_analysis.ipynb* inspects pre-generated
+downscaled datasets and showcases a few of the evaluation metrics used in
+Lopez-Gomez et al. ([2024](https://doi.org/10.48550/arXiv.2410.01776)).
 
 ## Data
 
 We consider data at hourly resolution over the western United States, at
-45 km resolution (`d01` in the bucket), and at 9 km resolution (`d02`). We
-include in parantheses the nomenclature used for these datasets in our source
-code in the following paragraph.
+45 km resolution (`hourly_d01`), and at 9 km resolution (`hourly_d02`). The
+names in parentheses are the identifiers used for these datasets in our source
+code.
 
 For each forcing ESM, we use high-resolution data (`hourly_d02.zarr`), which
 serves as the label, and low-resolution data interpolated to the
 high-resolution grid (`hourly_d01_cubic_interpolated_to_d02.zarr`), which
 serves as the input.
 
 Residual models are trained using the difference between the high-resolution
-data and the interpolated low-resolution data as the target.
+data and the interpolated low-resolution data as the target. For normalization,
+we also compute the mean and standard deviation of each field over a reference
+period, in our case 2020-2090.
 
-For normalization, we also want to compute the mean and standard deviation over
-a certain period of time, for example 2020-2090.
+Samples of this dataset over the evaluation period are available on [Google Cloud](https://console.cloud.google.com/storage/browser/dynamical_generative_downscaling).
 
 ## Modeling framework
 
 The R2-D2 model is a generative diffusion model. The learned denoising network
 is parameterized by a neural network with a UNet architecture. The modeling
-framework used to design the model is `JAX`. The package `orbax` is used for
-checkpointing. The model was trained and evaluated using NVIDIA A100 GPUs, so
-using architectures with similar or greater memory (e.g., NVIDIA H100) is
-recommended to avoid out-of-memory issues.
+framework used to implement the model is `JAX`. The
+[`swirl-dynamics`](https://github.com/google-research/swirl-dynamics) codebase
+contains numerous implementations of popular neural network architectures
+(e.g., UNet, ViT), as well as model templates for probabilistic diffusion.
+The package `orbax` is used for checkpointing. The model was trained and
+evaluated using NVIDIA A100 GPUs, so hardware with similar or greater memory
+(e.g., NVIDIA H100) is recommended to avoid out-of-memory issues.
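+
+As a complement to the colabs, a minimal checkpoint-restoration sketch with
+`orbax`, assuming `orbax-checkpoint` is installed; the checkpoint path below
+is an assumption, and the inference colab documents the actual bucket layout:
+
+```python
+# Restore pre-trained R2-D2 weights with orbax. The checkpoint path is an
+# assumption; see the inference colab for the actual location and usage.
+import orbax.checkpoint as ocp
+
+checkpointer = ocp.PyTreeCheckpointer()
+params = checkpointer.restore(
+    "gs://dynamical_generative_downscaling/checkpoints/r2d2"  # hypothetical
+)
+```
+
+## License
+
+Licensed under the Apache License, Version 2.0.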