2.5. Testing the SRW App

The SRW App contains a set of end-to-end tests that exercise various workflow configurations of the SRW App. These are referred to as workflow end-to-end (WE2E) tests because they all use the Rocoto workflow manager to run their individual workflows from start to finish. The purpose of these tests is to ensure that new changes to the App do not break existing functionality and capabilities. However, these WE2E tests also provide users with additional sample cases and data beyond the basic config.community.yaml case.

Note that the WE2E tests are not regression tests—they do not check whether current results are identical to previously established baselines. They also do not test the scientific integrity of the results (e.g., they do not check that values of output fields are reasonable). These tests only check that the tasks within each test’s workflow complete successfully. They are, in essence, tests of the workflow generation, task execution (J-jobs, ex-scripts), and other auxiliary scripts to ensure that these scripts function correctly. Tested functions include creating and correctly arranging and naming directories and files, ensuring that all input files are available and readable, calling executables with correct namelists and/or options, etc. Currently, it is up to the external repositories that the App clones (Section 1.2.2) to check that changes to those repositories do not change results, or, if they do, to ensure that the new results are acceptable. (At least two of these external repositories—UFS_UTILS and ufs-weather-model—do have such regression tests.)

2.5.1. WE2E Test Categories

WE2E tests are grouped into two categories that are of interest to code developers: fundamental and comprehensive tests. “Fundamental” tests are a lightweight but wide-reaching set of tests designed to function as a cheap “smoke test” for changes to the UFS SRW App. The fundamental suite of tests runs common combinations of workflow tasks, physical domains, input data, physics suites, etc. The comprehensive suite of tests covers a broader range of combinations of capabilities, configurations, and components, ideally including all capabilities that can be run on a given platform. Because some capabilities are not available on all platforms (e.g., retrieving data directly from NOAA HPSS), the suite of comprehensive tests varies from machine to machine. The list of fundamental and comprehensive tests can be viewed in the ufs-srweather-app/tests/WE2E/machine_suites/ directory and are described in more detail in this table.


There are two additional test suites, coverage (designed for automated testing) and all (includes all tests, including those known to fail). Running these suites is not recommended.

For convenience, the WE2E tests are currently grouped into the following categories (under ufs-srweather-app/tests/WE2E/test_configs/):

  • aqm

    This category tests the AQM configuration of the SRW App.

  • custom_grids

    This category tests custom grids aside from those specified in ufs-srweather-app/ush/predef_grid_params.yaml. These tests help ensure a wide range of domain sizes, resolutions, and locations will work as expected. These test files can also serve as examples for how to set your own custom domain.

  • default_configs

    This category tests example config files provided for user reference. They are symbolically linked from the ufs-srweather-app/ush/ directory.

  • grids_extrn_mdls_suites_community

    This category of tests ensures that the SRW App workflow running in community mode (i.e., with RUN_ENVIR set to "community") completes successfully for various combinations of predefined grids, physics suites, and input data from different external models. Note that in community mode, all output from the Application is placed under a single experiment directory.

  • grids_extrn_mdls_suites_nco

    This category of tests ensures that the workflow running in NCO mode (i.e., with RUN_ENVIR set to "nco") completes successfully for various combinations of predefined grids, physics suites, and input data from different external models. Note that in NCO mode, an operational run environment is used. This involves a specific directory structure and variable names (see Section 3.1.4).

  • ufs_case_studies

    This category tests that the workflow running in community mode completes successfully when running cases derived from the ufs-case-studies repository.

  • verification

    This category specifically tests the various combinations of verification capabilities using METPlus.

  • wflow_features

    This category of tests ensures that the workflow completes successfully with particular features/capabilities activated.


Users should be aware that some tests assume HPSS access.

  • custom_ESGgrid_Great_Lakes_snow_8km and MET_verification_only_vx_time_lag require HPSS access, as well as rstprod access on both RDHPCS and HPSS.

  • On certain machines, the community test assumes HPSS access. If the ush/machine/*.yaml file contains the following lines, and these paths are different from what is provided in TEST_EXTRN_MDL_SOURCE_BASEDIR, users will need to have HPSS access or modify the tests to point to another data source:


Some tests are duplicated among the above categories via symbolic links, both for legacy reasons (when tests for different capabilities were consolidated) and for convenience when a user would like to run all tests for a specific category (e.g., verification tests).

2.5.2. Running the WE2E Tests The Test Script (run_WE2E_tests.py)

The script to run the WE2E tests is named run_WE2E_tests.py and is located in the directory ufs-srweather-app/tests/WE2E. Each WE2E test has an associated configuration file named config.${test_name}.yaml, where ${test_name} is the name of the corresponding test. These configuration files are subsets of the full range of config.yaml experiment configuration options. (See Section 3.1 for all configurable options and Section for information on configuring config.yaml.) For each test, the run_WE2E_tests.py script reads in the test configuration file and generates from it a complete config.yaml file. It then calls the generate_FV3LAM_wflow() function, which in turn reads in config.yaml and generates a new experiment for the test. The name of each experiment directory is set to that of the corresponding test, and a copy of config.yaml for each test is placed in its experiment directory.


The full list of WE2E tests is extensive, and some larger, high-resolution tests are computationally expensive. Estimates of walltime and core-hour cost for each test are provided in this table. Using the Test Script


These instructions assume that the user has already built the SRW App (as described in Section 2.3.4) and loaded the appropriate python environment (as described in Section

The test script has three required arguments: machine, account, and tests.

  • Users must indicate which machine they are on using the --machine or -m option. See ush/machine or Section 3.1.1 for valid values.

  • Users must submit a valid account name using the --account or -a option to run submitted jobs. On systems where an account name is not required, users may simply use -a none.

  • Users must specify the set of tests to run using the --tests or -t option. Users may pass (in order of priority):

    1. The name of a single test or list of tests to the test script.

    2. A test suite name (e.g., “fundamental”, “comprehensive”, “coverage”, or “all”).

    3. The name of a subdirectory under ufs-srweather-app/tests/WE2E/test_configs/

    4. The name of a text file (full or relative path), such as my_tests.txt, which contains a list of the WE2E tests to run (one per line).

Users may run ./run_WE2E_tests.py -h for additional (optional) usage instructions. Examples


  • Users will need to adjust the machine name and account in these examples to run tests successfully.

  • These commands assume that the user is working from the WE2E directory (ufs-srweather-app/tests/WE2E/).

To run the custom_ESGgrid and pregen_grid_orog_sfc_climo tests on Jet, users could run:

./run_WE2E_tests.py -t custom_ESGgrid pregen_grid_orog_sfc_climo -m jet -a hfv3gfs

Alternatively, to run the entire suite of fundamental tests on Hera, users might run:

./run_WE2E_tests.py -t fundamental -m hera -a nems

To run the tests custom_ESGgrid and grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16 on NOAA Cloud, users would enter the following commands:

echo "custom_ESGgrid" > my_tests.txt
echo "grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16" >> my_tests.txt
./run_WE2E_tests.py -t my_tests.txt -m noaacloud -a none

By default, the experiment directory for a WE2E test has the same name as the test itself, and it is created in ${HOMEdir}/../expt_dirs, where HOMEdir is the top-level directory for the ufs-srweather-app repository (usually set to something like /path/to/ufs-srweather-app). Thus, the custom_ESGgrid experiment directory would be located in ${HOMEdir}/../expt_dirs/custom_ESGgrid.

A More Complex Example: To run the fundamental suite of tests on Orion in parallel, charging computational resources to the “gsd-fv3” account, and placing all the experiment directories into a directory named test_set_01, run:

./run_WE2E_tests.py -t fundamental -m orion -a gsd-fv3 --expt_basedir "test_set_01" -q -p 2
  • --expt_basedir: Useful for grouping sets of tests. If set to a relative path, the provided path will be appended to the default path. In this case, all of the fundamental tests will reside in ${HOMEdir}/../expt_dirs/test_set_01/. It can also take a full path as an argument, which will place experiments in the given location.

  • -q: Suppresses the output from generate_FV3LAM_wflow() and prints only important messages (warnings and errors) to the screen. The suppressed output will still be available in the log.run_WE2E_tests file.

  • -p 2: Indicates the number of parallel proceeses to run. By default, job monitoring and submission is serial, using a single task. Therefore, the script may take a long time to return to a given experiment and submit the next job when running large test suites. Depending on the machine settings, running in parallel can substantially reduce the time it takes to run all experiments. However, it should be used with caution on shared resources (such as HPC login nodes) due to the potential to overwhelm machine resources. Workflow Information

For each specified test, run_WE2E_tests.py will generate a new experiment directory and, by default, launch a second function monitor_jobs() that will continuously monitor active jobs, submit new jobs, and track the success or failure status of the experiment in a .yaml file. Finally, when all jobs have finished running (successfully or not), the function print_WE2E_summary() will print a summary of the jobs to screen, including the job’s success or failure, timing information, and (if on an appropriately configured platform) the number of core hours used. An example run would look like this:

$ ./run_WE2E_tests.py -t my_tests.txt -m hera -a gsd-fv3 -q
Checking that all tests are valid
Will run 2 tests:
Calling workflow generation function for test custom_ESGgrid
Workflow for test custom_ESGgrid successfully generated in

Calling workflow generation function for test grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
Workflow for test grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16 successfully generated in

calling function that monitors jobs, prints summary
Writing information for all experiments to WE2E_tests_20230418174042.yaml
Checking tests available for monitoring...
Starting experiment custom_ESGgrid running
Updating database for experiment custom_ESGgrid
Starting experiment grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16 running
Updating database for experiment grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
Setup complete; monitoring 2 experiments
Use ctrl-c to pause job submission/monitoring
Experiment custom_ESGgrid is COMPLETE
Took 0:19:29.877497; will no longer monitor.
Experiment grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16 is COMPLETE
Took 0:29:38.951777; will no longer monitor.
All 2 experiments finished
Calculating core-hour usage and printing final summary
Experiment name                                                  | Status    | Core hours used
custom_ESGgrid                                                     COMPLETE              18.02
grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16   COMPLETE              15.52
Total                                                              COMPLETE              33.54

Detailed summary written to /user/home/expt_dirs/WE2E_summary_20230418181025.txt

All experiments are complete
Summary of results available in WE2E_tests_20230418174042.yaml

As the script runs, detailed debug output is written to the file log.run_WE2E_tests. This can be useful for debugging if something goes wrong. Adding the -d flag will print all this output to the screen during the run, but this can get quite cluttered.

The progress of monitor_jobs() is tracked in a file WE2E_tests_{datetime}.yaml, where {datetime} is the date and time (in yyyymmddhhmmss format) that the file was created. The final job summary is written by the print_WE2E_summary(); this prints a short summary of experiments to the screen and prints a more detailed summary of all jobs for all experiments in the indicated .txt file.

$ cat /user/home/expt_dirs/WE2E_summary_20230418181025.txt
Experiment name                                                  | Status    | Core hours used
custom_ESGgrid                                                     COMPLETE              18.02
grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16   COMPLETE              15.52
Total                                                              COMPLETE              33.54

Detailed summary of each experiment:

Detailed summary of experiment custom_ESGgrid
in directory /user/home/expt_dirs/custom_ESGgrid
                                        | Status    | Walltime   | Core hours used
make_grid_201907010000                    SUCCEEDED          13.0           0.09
get_extrn_ics_201907010000                SUCCEEDED          10.0           0.00
get_extrn_lbcs_201907010000               SUCCEEDED           6.0           0.00
make_orog_201907010000                    SUCCEEDED          65.0           0.43
make_sfc_climo_201907010000               SUCCEEDED          39.0           0.52
make_ics_mem000_201907010000              SUCCEEDED         120.0           1.60
make_lbcs_mem000_201907010000             SUCCEEDED         201.0           2.68
run_fcst_mem000_201907010000              SUCCEEDED         340.0          11.33
run_post_mem000_f000_201907010000         SUCCEEDED          11.0           0.15
run_post_mem000_f001_201907010000         SUCCEEDED          13.0           0.17
run_post_mem000_f002_201907010000         SUCCEEDED          16.0           0.21
run_post_mem000_f003_201907010000         SUCCEEDED          16.0           0.21
run_post_mem000_f004_201907010000         SUCCEEDED          16.0           0.21
run_post_mem000_f005_201907010000         SUCCEEDED          16.0           0.21
run_post_mem000_f006_201907010000         SUCCEEDED          16.0           0.21
Total                                     COMPLETE                         18.02

Detailed summary of experiment grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
in directory /user/home/expt_dirs/grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
                                        | Status    | Walltime   | Core hours used
make_grid_201907010000                    SUCCEEDED           8.0           0.05
get_extrn_ics_201907010000                SUCCEEDED           5.0           0.00
get_extrn_lbcs_201907010000               SUCCEEDED          11.0           0.00
make_orog_201907010000                    SUCCEEDED          49.0           0.33
make_sfc_climo_201907010000               SUCCEEDED          41.0           0.55
make_ics_mem000_201907010000              SUCCEEDED          83.0           1.11
make_lbcs_mem000_201907010000             SUCCEEDED         199.0           2.65
run_fcst_mem000_201907010000              SUCCEEDED         883.0           9.81
run_post_mem000_f000_201907010000         SUCCEEDED          10.0           0.13
run_post_mem000_f001_201907010000         SUCCEEDED          11.0           0.15
run_post_mem000_f002_201907010000         SUCCEEDED          10.0           0.13
run_post_mem000_f003_201907010000         SUCCEEDED          11.0           0.15
run_post_mem000_f004_201907010000         SUCCEEDED          11.0           0.15
run_post_mem000_f005_201907010000         SUCCEEDED          11.0           0.15
run_post_mem000_f006_201907010000         SUCCEEDED          12.0           0.16
Total                                     COMPLETE                         15.52

One might have noticed the line during the experiment run that reads “Use ctrl-c to pause job submission/monitoring”. The monitor_jobs() function (called automatically after all experiments are generated) is designed to be easily paused and re-started if necessary. To stop actively submitting jobs, simply quit the script using ctrl-c to stop the function, and a short message will appear explaining how to continue the experiment:

Setup complete; monitoring 1 experiments
Use ctrl-c to pause job submission/monitoring

User interrupted monitor script; to resume monitoring jobs run:

./monitor_jobs.py -y=WE2E_tests_20230418174042.yaml -p=1

2.5.3. Checking Test Status and Summary

By default, ./run_WE2E_tests.py will actively monitor jobs, printing to console when jobs are complete (either successfully or with a failure), and printing a summary file WE2E_summary_{datetime.now().strftime("%Y%m%d%H%M%S")}.txt. However, if the user is using the legacy crontab option (by submitting ./run_WE2E_tests.py with the --launch cron option), or if the user would like to summarize one or more experiments that either are not complete or were not handled by the WE2E test scripts, this status/summary file can be generated manually using WE2E_summary.py. In this example, an experiment was generated using the crontab option and has not yet finished running. We use the -e option to point to the experiment directory and get the current status of the experiment:

  ./WE2E_summary.py -e /user/home/PR_466/expt_dirs/
Updating database for experiment grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_RRFS_v1beta
Updating database for experiment grid_RRFS_CONUS_25km_ics_GSMGFS_lbcs_GSMGFS_suite_GFS_v16
Updating database for experiment grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_HRRR
Updating database for experiment specify_template_filenames
Updating database for experiment grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_HRRR
Updating database for experiment grid_RRFS_CONUScompact_3km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta
Updating database for experiment grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_2017_gfdlmp_regional
Updating database for experiment grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_RAP_suite_HRRR
Updating database for experiment grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
Updating database for experiment grid_RRFS_SUBCONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
Updating database for experiment specify_DOT_OR_USCORE
Updating database for experiment custom_GFDLgrid__GFDLgrid_USE_NUM_CELLS_IN_FILENAMES_eq_FALSE
Updating database for experiment grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
Experiment name                                             | Status    | Core hours used
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_RRFS_v1  COMPLETE              49.72
grid_RRFS_CONUS_25km_ics_GSMGFS_lbcs_GSMGFS_suite_GFS_v16     DYING                  6.51
grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_HRRR         COMPLETE             411.84
specify_template_filenames                                    COMPLETE              17.36
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_HRRR      COMPLETE              16.03
grid_RRFS_CONUScompact_3km_ics_HRRR_lbcs_RAP_suite_RRFS_v1be  COMPLETE             318.55
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_2017_g  COMPLETE              17.79
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_RAP_suite_HRRR            COMPLETE              17.76
grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16      RUNNING                0.00
grid_RRFS_SUBCONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16   RUNNING                0.00
specify_DOT_OR_USCORE                                         QUEUED                 0.00
custom_GFDLgrid__GFDLgrid_USE_NUM_CELLS_IN_FILENAMES_eq_FALS  QUEUED                 0.00
grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS  QUEUED                 0.00
Total                                                         RUNNING              855.56

Detailed summary written to WE2E_summary_20230306173013.txt

As with all python scripts in the SRW App, additional options for this script can be viewed by calling with the -h argument.

The “Status” as specified by the above summary is explained below:


    The experiment directory has been created, but the monitor script has not yet begun submitting jobs. This is immediately overwritten at the beginning of the “monitor_jobs” function, so this status should not be seen unless the experiment has not yet been started.


    All jobs are in status SUBMITTING or SUCCEEDED (as reported by the Rocoto workflow manager). This is a normal state; we will continue to monitor this experiment.


    One or more tasks have died (status “DEAD”), so this experiment has had an error. We will continue to monitor this experiment until all tasks are either status DEAD or status SUCCEEDED (see next entry).

  • DEAD

    One or more tasks are at status DEAD, and the rest are either DEAD or SUCCEEDED. We will no longer monitor this experiment.


    Could not read the rocoto database file. This will require manual intervention to solve, so we will no longer monitor this experiment.


    One or more jobs are at status RUNNING, and the rest are either status QUEUED, SUBMITTED, or SUCCEEDED. This is a normal state; we will continue to monitor this experiment.


    One or more jobs are at status QUEUED, and some others may be at status SUBMITTED or SUCCEEDED. This is a normal state; we will continue to monitor this experiment.


    All jobs are status SUCCEEDED; we will monitor for one more cycle in case there are unsubmitted jobs remaining.


    All jobs are status SUCCEEDED, and we have monitored this job for an additional cycle to ensure there are no un-submitted jobs. We will no longer monitor this experiment.

2.5.4. WE2E Test Information File

If the user wants to see consolidated test information, they can generate a file that can be imported into a spreadsheet program (Google Sheets, Microsoft Excel, etc.) that summarizes each test. This file, named WE2E_test_info.txt by default, is delimited by the | character and can be created either by running the ./print_test_info.py script, or by generating an experiment using ./run_WE2E_tests.py with the --print_test_info flag.

The rows of the file/sheet represent the full set of available tests (not just the ones to be run). The columns contain the following information (column titles are included in the CSV file):

Column 1
The primary test name followed (in parentheses) by the category subdirectory where it is located.
Column 2
Any alternate names for the test followed by their category subdirectories (in parentheses).
Column 3
The test description.
Column 4
The relative cost of running the dynamics in the test. This gives an idea of how expensive the test is relative to a reference test that runs a single 6-hour forecast on the RRFS_CONUS_25km predefined grid using its default time step (DT_ATMOS: 40). To calculate the relative cost, the absolute cost (abs_cost) is first calculated as follows:
abs_cost = nx*ny*num_time_steps*num_fcsts
Here, nx and ny are the number of grid points in the horizontal (x and y) directions, num_time_steps is the number of time steps in one forecast, and num_fcsts is the number of forecasts the test runs (see Column 5 below). (Note that this cost calculation does not (yet) differentiate between different physics suites.) The relative cost rel_cost is then calculated using:
rel_cost = abs_cost/abs_cost_ref
where abs_cost_ref is the absolute cost of running the reference forecast described above, i.e., a single (num_fcsts = 1) 6-hour forecast (FCST_LEN_HRS = 6) on the RRFS_CONUS_25km grid (which currently has nx = 219, ny = 131, and DT_ATMOS = 40 sec (so that num_time_steps = FCST_LEN_HRS*3600/DT_ATMOS = 6*3600/40 = 540). Therefore, the absolute cost reference is calculated as:
abs_cost_ref = 219*131*540*1 = 15,492,060
Column 5
The number of times the forecast model will be run by the test. This is calculated using quantities such as the number of cycle dates (i.e., forecast model start dates) and the number of ensemble members (which is greater than 1 if running ensemble forecasts and 1 otherwise). The number of cycle dates and/or ensemble members is derived from the quantities listed in Columns 6, 7, ….
Columns 6, 7, …
The values of various experiment variables (if defined) in each test’s configuration file. Currently, the following experiment variables are included:

2.5.5. Modifying the WE2E System

Users may wish to modify the WE2E testing system to suit specific testing needs. Modifying an Existing Test

To modify an existing test, simply edit the configuration file for that test by changing existing variable values and/or adding new variables to suit the requirements of the modified test. Such a change may also require modifications to the test description in the header of the file. Adding a New Test

To add a new test named, e.g., new_test01, to one of the existing test categories, such as wflow_features:

  1. Choose an existing test configuration file that most closely matches the new test to be added. It could come from any one of the category directories.

  2. Copy that file to config.new_test01.yaml and, if necessary, move it to the wflow_features category directory.

  3. Edit the header comments in config.new_test01.yaml so that they properly describe the new test.

  4. Edit the contents of config.new_test01.yaml by modifying existing experiment variable values and/or adding new variables such that the test runs with the intended configuration.