Historically, ILAMB is most often used to assess biogeochemical cycles on land, but our methods and software are generic and can be applied to any domain. Below is a tutorial that will teach you the basic steps required to use the ILAMB software on any reference/model data comparisons of your choosing. In this tutorial, you will learn how to:
Set up reference data
Set up model data
Run an ILAMB comparison & visualize the results in a dashboard
Before you start
Before getting started, you should (1) install the ilamb3 software and (2) have in mind a scientific question that you want to answer with your benchmarking study. This will help you decide which variables, datasets, and model outputs you want to use. For this tutorial, our scientific question is simple: How well does the CanESM5 model predict gross primary productivity (GPP)? ILAMB can handle broader, more complex questions, but we will start with this simple one to get familiar with the software.
Reference Data
For this example, you must decide which gpp reference datasets you want to use. For your convenience, ILAMB has catalogs of datasets that have already been formatted for use in ILAMB. We have legacy catalogs for land and ocean variables. We also have a newer catalog, the ilamb3 catalog, whose datasets adhere more closely to community data standards than our legacy datasets. For this tutorial, we will use the
WECANN-1-0/obs4MIPs_ColumbiaU_WECANN-1-0_mon_gpp_gn_v20260302.nc
dataset from the ilamb3 registry. This is a gridded product of gpp that was generated by Columbia University using the WECANN model and then formatted for use in ILAMB.
For ILAMB to know which dataset we want to use, we must add this key to a yaml file, which looks like this:
Ecosystem and Carbon Cycle:
  Gross Primary Productivity:
    WECANN-1-0:
      sources:
        gpp: WECANN-1-0/obs4MIPs_ColumbiaU_WECANN-1-0_mon_gpp_gn_v20260302.nc
      variable_cmap: Greens
Using your preferred text editor, create a yaml file, e.g., my_benchmark_study.yaml, and copy/paste the contents above into it.
The my_benchmark_study.yaml file contains a nested set of dictionaries which are used to organize the benchmarking study. In this case, we call our top-level benchmarking category Ecosystem and Carbon Cycle. Nested below it, we have a section for our variable, Gross Primary Productivity. Within Gross Primary Productivity, we have a section for our reference dataset, WECANN-1-0. You can organize your study in any way you prefer and name the sections whatever you like (see Organizing a Benchmark Study). The organization we use here keeps your study tidy and makes it easier to navigate the results dashboard. This layout also makes it easy to add further benchmarking categories, variables, and reference datasets. The organization you choose will be reflected in the output directory structure and in the results dashboard that is generated at the end of the study.
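To make the nesting concrete, here is a minimal Python sketch that walks the same structure, written as the nested dictionary a YAML parser would return. The helpers `iter_benchmarks` and `output_dir` are hypothetical names invented for this illustration and are not part of ilamb3. The space-stripping rule in `output_dir` is only inferred from the `_build` listing shown later in this tutorial.

```python
from pathlib import PurePosixPath

# The tutorial configuration, written as the nested dict a YAML parser would return.
config = {
    "Ecosystem and Carbon Cycle": {
        "Gross Primary Productivity": {
            "WECANN-1-0": {
                "sources": {
                    "gpp": "WECANN-1-0/obs4MIPs_ColumbiaU_WECANN-1-0_mon_gpp_gn_v20260302.nc"
                },
                "variable_cmap": "Greens",
            }
        }
    }
}


def iter_benchmarks(node, headings=()):
    """Yield (headings, block) for every nested block that defines `sources`."""
    for name, child in node.items():
        if not isinstance(child, dict):
            continue
        if "sources" in child:
            yield headings + (name,), child
        else:
            yield from iter_benchmarks(child, headings + (name,))


def output_dir(headings):
    """Hypothetical helper: drop spaces from each heading and join them as a path.

    Assumption: this matches the directory names seen in the _build listing.
    """
    return PurePosixPath(*(heading.replace(" ", "") for heading in headings))


for headings, block in iter_benchmarks(config):
    print(" > ".join(headings))
    print("  variables:", sorted(block["sources"]))
    print("  directory:", output_dir(headings))
```

Adding a second category, variable, or dataset to `config` produces one more entry in the loop, which is the same way the study grows when you extend the yaml file.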
At minimum, each reference dataset block needs a sources section that tells ILAMB where to find the data. In this section, we must provide the variable name, gpp, as well as the path to the dataset. Usually, the reference data are stored as NetCDF files. The variable gpp should exist inside the NetCDF file that you point to. We also (optionally) set the color palette of the spatial maps that will appear in our results dashboard by adding a key called variable_cmap and assigning it the value Greens. To see what other configuration options are possible, check out this yaml configuration tutorial. You can also get a sense of the configuration options by checking out the yaml files for the comprehensive ILAMB land and/or the IOMB ocean benchmarking studies.
In the sources section, you can show ILAMB where your reference data lives in three different ways:
Using a key from one of the ILAMB data catalogs.
Using an absolute path to a file on your system.
Using a path relative to the ILAMB_ROOT environment variable.
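The three options can be told apart with a small check like the one below. This is only a sketch for inspecting your own configuration: `classify_source` is a hypothetical helper, and the order in which it tests the options is an assumption rather than a description of how ilamb3 resolves sources internally.

```python
import os
from pathlib import Path


def classify_source(source, ilamb_root=None):
    """Guess which of the three styles a `sources` value uses.

    Assumption: a value that is neither an absolute path nor an existing file
    under ILAMB_ROOT is treated as a catalog key. ilamb3 may decide differently.
    """
    path = Path(source)
    if path.is_absolute():
        return "absolute path"
    # Fall back to the environment variable when no root is passed explicitly.
    root = ilamb_root or os.environ.get("ILAMB_ROOT")
    if root and (Path(root) / path).is_file():
        return "relative to ILAMB_ROOT"
    return "catalog key"


print(classify_source("WECANN-1-0/obs4MIPs_ColumbiaU_WECANN-1-0_mon_gpp_gn_v20260302.nc"))
```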
Once you’re finished setting up your yaml file, you should ensure that the reference data you pointed to is downloaded and available on your system. In this example, we specified an item in the ilamb3 data catalog:
WECANN-1-0/obs4MIPs_ColumbiaU_WECANN-1-0_mon_gpp_gn_v20260302.nc
To download that data from the registry, open a terminal window and execute the following command:
ilamb fetch my_benchmark_study.yaml
This command downloads the data for any catalog keys it finds that are not already on your system. Using data from our catalogs is a great way for you to build benchmarking comparisons that are portable to other systems and shareable with your colleagues. If there are data that you want to use that are not in our registries and that you think may be useful to the broader community, you can format the data to be ILAMB-ready and submit a pull request in our ilamb3-data GitHub repository. If approved, your data will be added to the public registry and made available for anyone to use in their benchmarking studies.
Model Data
While we have written ilamb3 hoping that the reference and model data files follow the NetCDF CF-Conventions, this is not strictly required. At minimum, data must be readable by xarray and have some spatiotemporal dimensions (e.g., latitude, longitude, time, etc.). We understand that benchmarking is often used early in model development and requires some flexibility as the model continues to develop.
Just like the reference data, you will need to tell ilamb3 where your model data is located. Currently, ILAMB expects information about the model data to be stored in a tabular CSV file. Here is an example of what that CSV file should look like:
At minimum, the CSV file must:
Contain one row per unique variable/file. If you have a variable split into multiple files, as some CMIP models do, then you would still have a row per unique file. If your model data is closer to raw output where single files contain many variables, you would create a row for each unique variable and file even if you reference the same file multiple times.
Contain a path column which provides the absolute path to the file location on your system.
Contain the columns source_id, member_id, and grid_label, unless you have configured ilamb3 otherwise. If you are using your own model data that has not been standardized, you may have to create these columns manually and fill them with values of your own choosing.
At the moment, generating these CSV files is up to the user.
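One way to generate such a file is sketched below, assuming your model files follow the CMIP6 file naming template (variable, table, source, experiment, member, grid label, and an optional time range, separated by underscores). The helper names are hypothetical, and the `variable_id` column name is an assumption, since this tutorial names only the path, source_id, member_id, and grid_label columns. The example filename is illustrative.

```python
import csv
from pathlib import Path

# `variable_id` is an assumed column name; the other four come from the tutorial text.
COLUMNS = ["path", "variable_id", "source_id", "member_id", "grid_label"]


def describe_cmip6_file(path):
    """Build one CSV row from a CMIP6-style filename."""
    path = Path(path)
    parts = path.stem.split("_")
    if len(parts) < 6:
        raise ValueError(f"{path.name} does not look like a CMIP6 filename")
    variable_id, _table_id, source_id, _experiment_id, member_id, grid_label = parts[:6]
    return {
        "path": str(path.resolve()),  # the CSV must hold absolute paths
        "variable_id": variable_id,
        "source_id": source_id,
        "member_id": member_id,
        "grid_label": grid_label,
    }


def write_model_csv(files, csv_path):
    """Write one row per file to `csv_path`."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        for file in sorted(files):
            writer.writerow(describe_cmip6_file(file))


# Illustrative filename following the CMIP6 template.
print(describe_cmip6_file("gpp_Lmon_CanESM5_historical_r1i1p1f1_gn_185001-201412.nc"))
```

Calling `write_model_csv(Path("/path/to/CanESM5").glob("*.nc"), "CanESM5.csv")`, with the directory replaced by wherever your files live, would then produce a file for the --model-db option. For raw model output where one file holds many variables, you would instead emit one row per variable and repeat the path.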
Running the study
We now have all the ingredients needed to run the test study:
ilamb run my_benchmark_study.yaml --model-db CanESM5.csv
Internally, ilamb3 will use the benchmark definitions found in my_benchmark_study.yaml to query the model data given in CanESM5.csv for relevant variables. For each unique combination of source_id, member_id, and grid_label, we will run a comparison and save the results in the _build directory:
_build/
├── EcosystemandCarbonCycle
│   └── GrossPrimaryProductivity
│       └── WECANN-1-0
│           ├── CanESM5.csv
│           ├── CanESM5.nc
│           ├── CanESM5_None_bias.png
│           ├── CanESM5_None_biasscore.png
│           ├── CanESM5_None_cycle.png
│           ├── CanESM5_None_cyclescore.png
│           ├── CanESM5_None_mean.png
│           ├── CanESM5_None_rmse.png
│           ├── CanESM5_None_rmsescore.png
│           ├── CanESM5_None_shift.png
│           ├── CanESM5_None_tmax.png
│           ├── CanESM5_None_trace.png
│           ├── None_None_taylor.png
│           ├── post.log
│           ├── Reference.nc
│           ├── Reference_None_mean.png
│           ├── Reference_None_tmax.png
│           └── WECANN-1-0.html
├── index.html
├── _lmtUDConfig.json
├── run.yaml
└── scalar_database.json
Let’s unpack what ilamb run has saved in the _build directory. The output is stored in subdirectories that mirror the organization found in my_benchmark_study.yaml. The directory EcosystemandCarbonCycle/GrossPrimaryProductivity/WECANN-1-0 contains the benchmarking study results for WECANN-1-0. Within that directory, you will find the files generated during the two phases of the ilamb run process. In the first phase, ilamb run performs all the comparisons and writes out intermediate files:
CanESM5.csv stores all comparison scalars that were generated.
CanESM5.nc contains the data used for the plots (the .png files).
CanESM5.log contains errors (if any) that occurred during the run.
Reference.nc contains the corresponding plot data for the reference data product, in this case WECANN-1-0.
In the second phase, ilamb run will post-process these files and generate plots and a web page.
Images follow the naming convention {DATA}_{REGION}_{PLOT}.png, where {DATA} is the model name or “Reference” for the reference data. If a plot is composited across all models, then {DATA} may appear as None.
post.log contains errors that occurred during the second phase of ilamb run.
WECANN-1-0.html is a data dashboard that you can open in a browser to explore the results. The page should look something like this.
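If you want to collect or filter these images in your own scripts, the naming convention can be parsed as in the sketch below. `PlotName` and `parse_plot_name` are hypothetical names for this illustration. Treating the literal text None as "no value" is an assumption based on the listing above, where it appears in both the {DATA} and {REGION} positions.

```python
from pathlib import Path
from typing import NamedTuple, Optional


class PlotName(NamedTuple):
    data: Optional[str]    # model name, "Reference", or None for composite plots
    region: Optional[str]  # None where the listing shows the literal text "None"
    plot: str


def parse_plot_name(filename):
    """Split a {DATA}_{REGION}_{PLOT}.png filename into its three parts."""
    parts = Path(filename).stem.rsplit("_", 2)
    if len(parts) != 3:
        raise ValueError(f"{filename} does not match DATA_REGION_PLOT.png")

    def or_none(text):
        return None if text == "None" else text

    data, region, plot = parts
    return PlotName(or_none(data), or_none(region), plot)


print(parse_plot_name("CanESM5_None_bias.png"))
print(parse_plot_name("None_None_taylor.png"))
```

Splitting from the right keeps any underscores in a model name inside the {DATA} part, under the assumption that region and plot names contain none.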
There are also several files located at the root of the _build directory. The file run.yaml is a copy of the benchmark configuration file. This is for reproducibility so that you can always tell what produced any given ilamb3 build. The remaining files were produced as a synopsis of the full ilamb3 run:
index.html is the main entry point into the benchmark results. It leverages the Unified Dashboard to create a dynamic portrait plot that allows you to see how each model performs across all benchmarks. You can use the portrait plot to identify which models are performing well and which benchmarks are driving model performance.
scalar_database.json contains all the scalars that were generated across all comparisons and models. This file is used to create the portrait plot in index.html.
_lmtUDConfig.json contains options that the dashboard uses to create its default view.
Visualizing the study results
Because index.html loads a data file, you cannot simply open it from your filesystem: browser security policies block that request. You need to either move the _build directory to a web-visible location or emulate an HTTP server locally. Navigate to the _build directory and, in your terminal, run:
python -m http.server
Click on the link this command generates or copy and paste it into a browser. Because this tutorial sample contains a single model and benchmark, the portrait plot is minimal. However, as more models and benchmarks are included (see this example), it will become a useful tool for navigating the different benchmarks. Clicking on the row labeled GrossPrimaryProductivity expands it to show the underlying datasets. Clicking on the WECANN-1-0 label will load a data dashboard page for that benchmark.
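The local server step can also be scripted from Python, which is convenient if you want to serve _build without changing directories. The sketch below uses only the standard library. `serve_directory` is a hypothetical helper name, and binding to 127.0.0.1 is a choice made here so that the results are visible only on your own machine.

```python
import functools
import http.server
import threading
from pathlib import Path


def serve_directory(directory, port=8000):
    """Serve `directory` over HTTP on localhost in a background thread."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(Path(directory))
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

From an interactive session, `server = serve_directory("_build")` starts the server, after which index.html is reachable at http://127.0.0.1:8000/index.html. Call `server.shutdown()` when you are done.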
The benchmark dashboard is designed to assist you in discovering patterns in the results. It contains a series of plots that are generated for each model and reference dataset. The plots include maps of bias, RMSE, Taylor diagrams, time series, annual cycles, and more. Each plot is designed to highlight different aspects of the model performance.
Next steps
In order to solidify these basic ilamb3 concepts, we recommend that you attempt the following expansions on your own:
Add another model: This consists of creating another CSV file or expanding the current one. To keep downloads small, you might choose to add UKESM1-0-LL because, like CanESM5, it is relatively coarse. If you create a second CSV file, you can just add another --model-db option when running ilamb3. We will concatenate these files together internally:
ilamb run my_benchmark_study.yaml --model-db CanESM5.csv --model-db UKESM1-0-LL.csv
Add another benchmark: Locate the ilamb3 registry key for the Fluxnet-2015 gpp data from the Dataset Catalogs page. Add another benchmark block to my_benchmark_study.yaml, also under the Gross Primary Productivity heading. Essentially, you need to duplicate the WECANN-1-0 block and then replace the WECANN-specific information with Fluxnet2015-specific information.
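Before running with several --model-db files, it can help to preview which model combinations they describe. The sketch below concatenates the rows of any number of CSV files and groups them by source_id, member_id, and grid_label, the combination this tutorial says each comparison is run for. The function names are hypothetical, and this is an illustration of the idea rather than the code ilamb3 uses internally.

```python
import csv
from collections import defaultdict

MODEL_KEYS = ("source_id", "member_id", "grid_label")


def load_model_rows(csv_paths):
    """Concatenate the rows of several model CSV files into one list."""
    rows = []
    for csv_path in csv_paths:
        with open(csv_path, newline="") as handle:
            rows.extend(csv.DictReader(handle))
    return rows


def group_by_model(rows):
    """Map each (source_id, member_id, grid_label) combination to its file paths."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[key] for key in MODEL_KEYS)].append(row["path"])
    return dict(groups)
```

For example, `group_by_model(load_model_rows(["CanESM5.csv", "UKESM1-0-LL.csv"]))` returns one entry per model combination, each listing the files that belong to it.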