11. Workflow End-to-End (WE2E) Tests
The SRW App contains a set of end-to-end tests that exercise various workflow configurations of the SRW App. These are referred to as workflow end-to-end (WE2E) tests because they all use the Rocoto workflow manager to run their individual workflows. The purpose of these tests is to ensure that new changes to the App do not break existing functionality and capabilities.
Note that the WE2E tests are not regression tests—they do not check whether
current results are identical to previously established baselines. They also do
not test the scientific integrity of the results (e.g., they do not check that values
of output fields are reasonable). These tests only check that the tasks within each test’s workflow complete successfully. They are, in essence, tests of the workflow generation, task execution (j-jobs,
ex-scripts), and other auxiliary scripts to ensure that these scripts function correctly. Tested functions
include creating and correctly arranging and naming directories and files, ensuring
that all input files are available and readable, calling executables with correct namelists and/or options, etc. Currently, it is up to the external repositories that the App clones (Section 1.4) to check that changes to those repositories do not change results, or, if they do, to ensure that the new results are acceptable. (At least two of these external repositories—UFS_UTILS
and ufs-weather-model
—do have such regression tests.)
WE2E tests fall into one of two categories: fundamental or comprehensive. The list of supported fundamental and comprehensive tests can be viewed in ufs-srweather-app/tests/WE2E/machine_suites/
. Fundamental tests are a lightweight set of tests that can be automated and run regularly on each Level 1 platform. These tests verify that there are no major, obvious faults in the underlying code when running common combinations of grids, input data, and physics suites. The remaining tests are called comprehensive tests because they cover a broader range of capabilities, configurations, and components. The complete set of tests (fundamental and comprehensive) can be viewed in this table.
For convenience, the WE2E tests are currently grouped into the following categories (under ufs-srweather-app/tests/WE2E/test_configs
):
grids_extrn_mdls_suites_community
This category of tests ensures that the SRW App workflow running in community mode (i.e., with RUN_ENVIR set to "community") completes successfully for various combinations of predefined grids, physics suites, and input data from different external models. Note that in community mode, all output from the application is placed under a single experiment directory.
grids_extrn_mdls_suites_nco
This category of tests ensures that the workflow running in NCO mode (i.e., with RUN_ENVIR set to "nco") completes successfully for various combinations of predefined grids, physics suites, and input data from different external models. Note that in NCO mode, an operational run environment is used. This involves a specific directory structure and variable names (see Section 9.4).
wflow_features
This category of tests ensures that the workflow completes successfully with particular features/capabilities activated. To reduce computational cost, most tests in this category use coarser grids.
The test configuration files for these categories are located in the following directories, respectively:
ufs-srweather-app/tests/WE2E/test_configs/grids_extrn_mdls_suites_community
ufs-srweather-app/tests/WE2E/test_configs/grids_extrn_mdls_suites_nco
ufs-srweather-app/tests/WE2E/test_configs/wflow_features
The script to run the WE2E tests is named run_WE2E_tests.sh
and is located in the directory ufs-srweather-app/tests/WE2E
. Each WE2E test has an associated configuration file named config.${test_name}.yaml
, where ${test_name}
is the name of the corresponding test. These configuration files are subsets of the full range of config.yaml
experiment configuration options. (See Chapter 9 for all configurable options and Section 5.3.2.2 for information on configuring config.yaml
.) For each test, the run_WE2E_tests.sh
script reads in the test configuration file and generates from it a complete config.yaml
file. It then calls generate_FV3LAM_wflow.py
, which in turn reads in config.yaml
and generates a new experiment for the test. The name of each experiment directory is set to that of the corresponding test, and a copy of config.yaml
for each test is placed in its experiment directory.
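The naming convention above (one config.${test_name}.yaml file per test, stored in a category directory) can be illustrated with a short lookup. This is a minimal sketch, not code from the SRW App: the function name find_test_config is hypothetical, and the search covers only the three category directories named in this chapter.

```python
from pathlib import Path

# Category subdirectories named in this chapter (under tests/WE2E/test_configs).
CATEGORY_SUBDIRS = (
    "grids_extrn_mdls_suites_community",
    "grids_extrn_mdls_suites_nco",
    "wflow_features",
)


def find_test_config(test_configs_dir, test_name):
    """Return the path to config.<test_name>.yaml, or None if no category has it.

    Hypothetical helper for illustration; run_WE2E_tests.sh does its own lookup.
    """
    for category in CATEGORY_SUBDIRS:
        candidate = Path(test_configs_dir) / category / f"config.{test_name}.yaml"
        if candidate.is_file():
            return candidate
    return None
```

For example, find_test_config("ufs-srweather-app/tests/WE2E/test_configs", "custom_ESGgrid") would return the path of that test's configuration file if it exists in one of the listed categories.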
Since run_WE2E_tests.sh
calls generate_FV3LAM_wflow.py
for each test, the
Python modules required for experiment generation must be loaded before run_WE2E_tests.sh
can be called. See Section 5.3.1 for information on loading the Python
environment on supported platforms. Note also that run_WE2E_tests.sh
assumes that all of
the executables have been built (see Section 4.4). If they have not, then run_WE2E_tests.sh
will still generate the experiment directories, but the workflows will fail.
11.1. Supported Tests
The full list of WE2E tests is extensive; it is not recommended to run all the tests, as some are computationally expensive. A subset of the full WE2E test suite is supported for the latest release of the SRW Application. Supported test cases can be viewed in this table.
11.2. Running the WE2E Tests
Users may specify the set of tests to run by creating a text file, such as my_tests.txt
, which contains a list of the WE2E tests to run (one per line). Then, they pass the name of that file to run_WE2E_tests.sh
. For example, to run the tests custom_ESGgrid
and grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
(from the wflow_features
and grids_extrn_mdls_suites_community
categories, respectively), users would enter the following commands from the WE2E
working directory (ufs-srweather-app/tests/WE2E/
):
cat > my_tests.txt
custom_ESGgrid
grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
(and Ctrl + D
to exit). For each test in my_tests.txt
, run_WE2E_tests.sh
will generate a new experiment directory and, by default, create a new cron job in the user’s cron table that will (re)launch the workflow every 2 minutes. This cron job calls the workflow launch script (launch_FV3LAM_wflow.sh
) until the workflow either completes successfully (i.e., all tasks SUCCEEDED) or fails (i.e., at least one task fails).
The cron job is then removed from the user’s cron table.
The examples below demonstrate several common ways that run_WE2E_tests.sh
can be called with the my_tests.txt
file above. These examples assume that the user has already built the SRW App and loaded the regional workflow as described in Section 5.3.1.
To run the tests listed in my_tests.txt on Hera and charge the computational resources used to the “rtrr” account, use:
./run_WE2E_tests.sh tests_file="my_tests.txt" machine="hera" account="rtrr"
This will create the experiment subdirectories for the two sample WE2E tests in the directory ${HOMEdir}/../expt_dirs, where HOMEdir is the top-level directory for the ufs-srweather-app repository (usually set to something like /path/to/ufs-srweather-app). Thus, the following two experiment directories will be created:
${HOMEdir}/../expt_dirs/custom_ESGgrid
${HOMEdir}/../expt_dirs/grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
In addition, by default, cron jobs will be added to the user’s cron table to relaunch the workflows of these experiments every 2 minutes.
To change the frequency with which the cron relaunch jobs are submitted from the default of 2 minutes to 1 minute, use:
./run_WE2E_tests.sh tests_file="my_tests.txt" machine="hera" account="rtrr" cron_relaunch_intvl_mnts="01"
To disable use of cron (which implies that the workflow for each test will have to be relaunched manually from within each experiment directory), use:
./run_WE2E_tests.sh tests_file="my_tests.txt" machine="hera" account="rtrr" use_cron_to_relaunch="FALSE"
In this case, the user will have to go into each test’s experiment directory and either manually run the launch_FV3LAM_wflow.sh script or use the Rocoto commands described in Chapter 10 to (re)launch the workflow. Note that if using the Rocoto commands directly, the log file log.launch_FV3LAM_wflow will not be created; in this case, the status of the workflow can be checked using the rocotostat command (see Section 5.4.1.3 or Section 10.2).
To place the experiment subdirectories in a subdirectory named test_set_01 under ${HOMEdir}/../expt_dirs (instead of immediately under expt_dirs), use:
./run_WE2E_tests.sh tests_file="my_tests.txt" machine="hera" account="rtrr" expt_basedir="test_set_01"
In this case, the full paths to the experiment directories will be:
${HOMEdir}/../expt_dirs/test_set_01/custom_ESGgrid
${HOMEdir}/../expt_dirs/test_set_01/grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
This is useful for grouping various sets of tests.
To use a test list file (again named my_tests.txt) located in a custom location instead of in the same directory as run_WE2E_tests.sh and to have the experiment directories be placed in a specific, non-default location (e.g., /path/to/custom/expt_dirs), use:
./run_WE2E_tests.sh tests_file="/path/to/custom/location/my_tests.txt" machine="hera" account="rtrr" expt_basedir="/path/to/custom/expt_dirs"
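The examples above imply a simple rule for where each experiment directory ends up: a relative expt_basedir is appended to ${HOMEdir}/../expt_dirs, while an absolute one replaces that default. The sketch below restates the rule in Python. The function name expt_dir is hypothetical, and the rule is inferred from the examples rather than taken from the script itself.

```python
import posixpath


def expt_dir(homedir, test_name, expt_basedir=""):
    """Return the expected experiment directory for one WE2E test.

    Inferred from the documented examples; not the script's own code.
    """
    default_base = posixpath.join(homedir, "..", "expt_dirs")
    # posixpath.join discards default_base when expt_basedir is absolute,
    # which matches the "custom location" example.
    base = posixpath.join(default_base, expt_basedir)
    # normpath collapses ".." lexically (it does not follow symlinks).
    return posixpath.normpath(posixpath.join(base, test_name))


print(expt_dir("/path/to/ufs-srweather-app", "custom_ESGgrid", "test_set_01"))
```

With HOMEdir set to /path/to/ufs-srweather-app, the call above yields /path/to/expt_dirs/test_set_01/custom_ESGgrid, the normalized form of the path shown in the third example.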
The full usage statement for run_WE2E_tests.sh
is as follows:
./run_WE2E_tests.sh \
tests_file="..." \
machine="..." \
account="..." \
[expt_basedir="..."] \
[exec_subdir="..."] \
[use_cron_to_relaunch="..."] \
[cron_relaunch_intvl_mnts="..."] \
[verbose="..."] \
[generate_csv_file="..."] \
[machine_file="..."] \
[stmp="..."] \
[ptmp="..."] \
[compiler="..."] \
[build_env_fn="..."]
The arguments in brackets are optional. A complete description of these arguments can be obtained by issuing:
./run_WE2E_tests.sh --help
from within the ufs-srweather-app/tests/WE2E
directory.
11.3. The WE2E Test Information File
In addition to creating the WE2E tests’ experiment directories and optionally creating
cron jobs to launch their workflows, the run_WE2E_tests.sh
script generates a CSV (Comma-Separated Value) file named WE2E_test_info.csv
that contains information
on the full set of WE2E tests. This file serves as a single location where relevant
information about the WE2E tests can be found. It can be imported into Google Sheets
using the “|” (pipe symbol) character as the custom field separator. If the user does not want run_WE2E_tests.sh
to generate this CSV file the first time it runs,
this functionality can be explicitly disabled by including the generate_csv_file="FALSE"
flag as an argument when running this script.
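Outside of a spreadsheet, the file can be read with any CSV parser configured for the pipe separator. The following is a minimal sketch using Python's standard csv module. The sample rows and the column titles test_name and alternate_names are hypothetical placeholders; the titles in the generated file may differ.

```python
import csv
import io


def read_test_info(handle):
    """Parse pipe-separated WE2E test information into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="|"))


# Hypothetical sample standing in for WE2E_test_info.csv.
sample = io.StringIO(
    "test_name|alternate_names|PREDEF_GRID_NAME\n"
    "custom_ESGgrid (wflow_features)||\n"
    "some_test (grids_extrn_mdls_suites_community)||RRFS_CONUS_25km\n"
)
rows = read_test_info(sample)
print(rows[1]["PREDEF_GRID_NAME"])
```

To read the real file, pass an open file handle instead of the sample, e.g., open("WE2E_test_info.csv", newline="").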
The rows of the file/sheet represent the full set of available tests (not just the ones to be run). The columns contain the following information (column titles are included in the CSV file):
Column 1 contains the test’s primary name (and its category), and column 2 contains any alternate names (and their categories).
One column gives the relative cost of each test, measured against a reference test that runs a single 6-hour forecast on the RRFS_CONUS_25km predefined grid using its default time step (DT_ATMOS: 40). To calculate the relative cost, the absolute cost (abs_cost) is first calculated as follows:
abs_cost = nx*ny*num_time_steps*num_fcsts
Here, nx and ny are the number of grid points in the horizontal (x and y) directions, num_time_steps is the number of time steps in one forecast, and num_fcsts is the number of forecasts the test runs (see Column 5). [Note that this cost calculation does not (yet) differentiate between different physics suites.] The relative cost rel_cost is then calculated using:
rel_cost = abs_cost/abs_cost_ref
Here, abs_cost_ref is the absolute cost of running the reference forecast described above, i.e., a single (num_fcsts = 1) 6-hour forecast (FCST_LEN_HRS = 6) on the RRFS_CONUS_25km grid, which currently has nx = 219, ny = 131, and DT_ATMOS = 40 sec (so that num_time_steps = FCST_LEN_HRS*3600/DT_ATMOS = 6*3600/40 = 540). Therefore, the absolute cost reference is calculated as:
abs_cost_ref = 219*131*540*1 = 15,492,060
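The remaining columns report the values of experiment variables from each test’s configuration file, including:
PREDEF_GRID_NAME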
CCPP_PHYS_SUITE
EXTRN_MDL_NAME_ICS
EXTRN_MDL_NAME_LBCS
DATE_FIRST_CYCL
DATE_LAST_CYCL
INCR_CYCL_FREQ
FCST_LEN_HRS
DT_ATMOS
LBC_SPEC_INTVL_HRS
NUM_ENS_MEMBERS
Additional fields (columns) may be added to the CSV file in the future.
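The relative-cost calculation described in this section can be reproduced in a few lines. The sketch below is illustrative only; the function names are hypothetical, and the reference values (nx = 219, ny = 131, DT_ATMOS = 40, FCST_LEN_HRS = 6) are the ones quoted in the text.

```python
def num_time_steps(fcst_len_hrs, dt_atmos):
    """Number of time steps in one forecast: FCST_LEN_HRS*3600/DT_ATMOS."""
    return fcst_len_hrs * 3600 / dt_atmos


def abs_cost(nx, ny, fcst_len_hrs, dt_atmos, num_fcsts=1):
    """Absolute cost: nx*ny*num_time_steps*num_fcsts."""
    return nx * ny * num_time_steps(fcst_len_hrs, dt_atmos) * num_fcsts


# Reference forecast quoted in the text: one 6-hour run on RRFS_CONUS_25km.
ABS_COST_REF = abs_cost(nx=219, ny=131, fcst_len_hrs=6, dt_atmos=40, num_fcsts=1)


def rel_cost(nx, ny, fcst_len_hrs, dt_atmos, num_fcsts=1):
    """Relative cost: abs_cost/abs_cost_ref."""
    return abs_cost(nx, ny, fcst_len_hrs, dt_atmos, num_fcsts) / ABS_COST_REF
```

Under this formula, a test that runs two 12-hour forecasts on the same grid with the same time step has a relative cost of 4, since both the number of time steps and the number of forecasts double.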
Note that the CSV file is not part of the ufs-srweather-app
repository and therefore is
not tracked by the repository. The run_WE2E_tests.sh
script will generate a CSV
file if the generate_csv_file
flag to this script has not explicitly been
set to false and if either one of the following is true:
The CSV file doesn’t already exist.
The CSV file does exist, but changes have been made to one or more of the category subdirectories (e.g., test configuration files modified, added, or deleted) since the creation of the CSV file.
Thus, unless the generate_csv_file
flag is set to "FALSE"
, the
run_WE2E_tests.sh
script will create a CSV file the first time it is run in a
fresh git clone of the SRW App. The generate_csv_file
flag is provided
because the CSV file generation can be slow, so users may wish to skip this
step since it is not a necessary part of running the tests.
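The conditions above amount to a small decision rule, sketched below. This is not the script's implementation: the function names are hypothetical, and comparing file modification times is only one plausible way to detect that a category subdirectory has changed since the CSV file was created.

```python
import os


def newest_mtime(directory):
    """Latest modification time of a directory or anything beneath it."""
    latest = os.lstat(directory).st_mtime
    for root, dirs, files in os.walk(directory):
        for name in dirs + files:
            # lstat avoids errors on symlinks whose targets are missing.
            latest = max(latest, os.lstat(os.path.join(root, name)).st_mtime)
    return latest


def should_generate_csv(csv_path, category_dirs, generate_csv_file=True):
    """Decide whether the test information file needs to be (re)generated."""
    if not generate_csv_file:
        return False  # generation explicitly disabled by the user
    if not os.path.exists(csv_path):
        return True  # condition 1: the CSV file does not exist yet
    csv_time = os.path.getmtime(csv_path)
    # Condition 2 (assumed mechanism): something changed after the CSV was written.
    return any(newest_mtime(d) > csv_time for d in category_dirs)
```

Including each directory's own modification time in the comparison lets the check notice added or deleted configuration files as well as edited ones.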
11.4. Checking Test Status
If cron jobs are used to periodically relaunch the tests, the status of each test can be checked by viewing the end of the log file (log.launch_FV3LAM_wflow
). Otherwise (or alternatively), the rocotorun
/rocotostat
combination of commands can be used. (See Section 5.4.1.3 for details.)
The SRW App also provides the script get_expts_status.sh
in the directory
ufs-srweather-app/tests/WE2E
, which can be used to generate
a status summary for all tests in a given base directory. This script updates
the workflow status of each test by internally calling launch_FV3LAM_wflow.sh
. Then, it prints out the status of the various tests in the command prompt. It also creates
a status report file named expts_status_${create_date}.txt
(where create_date
is a time stamp in YYYYMMDDHHmm
format corresponding to the creation date/time
of the report) and places it in the experiment base directory. By default, this status file
contains the last 40 lines from the end of the log.launch_FV3LAM_wflow
file. This number can be adjusted via the num_log_lines
argument. These lines include the experiment status as well as the task status table generated by rocotostat
so that, in case of failure, it is convenient to pinpoint the task that failed.
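Two details of the status report described above are easy to illustrate: the YYYYMMDDHHmm time stamp in the file name and the selection of the last num_log_lines lines of a log. The sketch below uses hypothetical helper names and is not taken from get_expts_status.sh.

```python
from collections import deque
from datetime import datetime


def report_file_name(create_time):
    """Build expts_status_${create_date}.txt with a YYYYMMDDHHmm time stamp."""
    return f"expts_status_{create_time.strftime('%Y%m%d%H%M')}.txt"


def last_lines(lines, num_log_lines=40):
    """Keep only the final num_log_lines entries of an iterable of log lines."""
    return list(deque(lines, maxlen=num_log_lines))


print(report_file_name(datetime(2022, 4, 21, 14, 40)))
```

The printed name is expts_status_202204211440.txt. Because last_lines accepts any iterable, an open log file can be passed directly without reading it into memory first.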
For details on the usage of get_expts_status.sh
, issue the following command from the WE2E
directory:
./get_expts_status.sh --help
Here is an example of how to call get_expts_status.sh
from the WE2E
directory:
./get_expts_status.sh expts_basedir=/path/to/expt_dirs/set01
The path for expts_basedir
should be an absolute path.
Here is an example of output from the get_expts_status.sh
script:
Checking for active experiment directories in the specified experiments
base directory (expts_basedir):
expts_basedir = "/path/to/expt_dirs/set01"
...
The number of active experiments found is:
num_expts = 2
The list of experiments whose workflow status will be checked is:
'custom_ESGgrid'
'grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16'
======================================
Checking workflow status of experiment "custom_ESGgrid" ...
Workflow status: SUCCESS
======================================
======================================
Checking workflow status of experiment "grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16" ...
Workflow status: IN PROGRESS
======================================
A status report has been created in:
expts_status_fp = "/path/to/expt_dirs/set01/expts_status_202204211440.txt"
DONE.
The “Workflow status” field of each test indicates the status of its workflow. The values that this can take on are “SUCCESS”, “FAILURE”, and “IN PROGRESS”.
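The three status values follow from the rule given in Section 11.2: a workflow succeeds when all tasks have SUCCEEDED and fails when at least one task fails. A minimal sketch of that mapping is shown below. The function name is hypothetical, and treating the Rocoto task state DEAD as the failure indicator is an assumption made for illustration.

```python
def workflow_status(task_states, failed_states=frozenset({"DEAD"})):
    """Map a collection of task states to an overall workflow status.

    failed_states is an assumption; adjust it to the states that should
    count as a task failure.
    """
    states = list(task_states)
    if any(state in failed_states for state in states):
        return "FAILURE"
    if states and all(state == "SUCCEEDED" for state in states):
        return "SUCCESS"
    return "IN PROGRESS"
```

An empty task list is reported as "IN PROGRESS" here, on the reasoning that a workflow with no recorded tasks has not yet completed.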
11.5. Modifying the WE2E System
This section describes various ways in which the WE2E testing system can be modified to suit specific testing needs.
11.5.1. Modifying an Existing Test
To modify an existing test, simply edit the configuration file for that test by changing existing variable values and/or adding new variables to suit the requirements of the modified test. Such a change may also require modifications to the test description in the header of the file.
11.5.2. Adding a New Test
To add a new test named, e.g., new_test01
, to one of the existing test categories, such as wflow_features
:
Choose an existing test configuration file in any one of the category directories that matches most closely the new test to be added. Copy that file to config.new_test01.yaml and, if necessary, move it to the wflow_features category directory.
Edit the header comments in config.new_test01.yaml so that they properly describe the new test.
Edit the contents of config.new_test01.yaml by modifying existing experiment variable values and/or adding new variables such that the test runs with the intended configuration.
11.5.3. Adding a New WE2E Test Category
To create a new test category called, e.g., new_category
:
In the directory ufs-srweather-app/tests/WE2E/test_configs, create a new directory named new_category.
In the file get_WE2Etest_names_subdirs_descs.sh, add the element "new_category" to the array category_subdirs, which contains the list of categories/subdirectories in which to search for test configuration files. Thus, category_subdirs becomes:
category_subdirs=( \
  "." \
  "grids_extrn_mdls_suites_community" \
  "grids_extrn_mdls_suites_nco" \
  "wflow_features" \
  "new_category" \
  )
New tests can now be added to new_category
using the procedure described in Section 11.5.2.
11.5.4. Creating Alternate Names for a Test
To prevent proliferation of WE2E tests, users might want to use the same test for multiple purposes. For example, consider the test
grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
in the grids_extrn_mdls_suites_community
category. This checks for the successful
completion of the Rocoto workflow running a combination of the RRFS_CONUScompact_25km
grid, the FV3GFS
model data for ICs and LBCs, and the FV3_GFS_v16
physics suite. If this test also happens to use the inline post capability of the UFS Weather Model (it currently doesn’t; this is only a hypothetical example), then this test can also be used to ensure that the inline post feature of the SRW App/Weather Model (which is activated in the SRW App by setting WRITE_DOPOST: true
) is working properly. Since this test will serve two purposes, it should have two names — one per purpose.
To set the second (alternate) name to activate_inline_post
, the user needs to create a symlink named config.activate_inline_post.yaml
in the wflow_features
category directory that points to the original configuration file (config.grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16.yaml
) in the grids_extrn_mdls_suites_community
category directory:
ln -fs --relative </path/to/grids_extrn_mdls_suites_community/config.grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16.yaml> </path/to/wflow_features/config.activate_inline_post.yaml>
In this situation, the primary name for the test is grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16
(because config.grid_RRFS_CONUScompact_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16.yaml
is an actual file, not a symlink), and activate_inline_post
is an alternate name. This approach of allowing multiple names for the same test makes it easier to identify the multiple purposes that a test may serve.
Note
A primary test can have more than one alternate test name (by having more than one symlink pointing to the test’s configuration file).
The symlinks representing the alternate test names can be in the same or a different category directory.
The --relative flag makes the symlink relative (i.e., within/below the tests directory) so that it stays valid when copied to other locations. (Note, however, that this flag is platform-dependent and may not exist on some platforms.)
To determine whether a test has one or more alternate names, a user can view the CSV file WE2E_test_info.csv generated by the run_WE2E_tests.sh script. Recall from Section 11.3 that column 1 of this CSV file contains the test’s primary name (and its category) while column 2 contains any alternate names (and their categories).
With this primary/alternate test naming convention, a user can list either the primary test name or one of the alternate test names in the experiments list file (e.g., my_tests.txt) read in by run_WE2E_tests.sh. If more than one name is listed for the same test (e.g., the primary name and an alternate name, two alternate names, etc.), run_WE2E_tests.sh will exit with a warning message and will not run any tests.
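Because alternate names are symlinks to the primary configuration file, a test list that names the same test twice can be spotted by resolving each configuration path to its real file. The sketch below shows one way to do this before calling run_WE2E_tests.sh. The function name is hypothetical, and this is not necessarily how the script performs its own check.

```python
import os
from collections import defaultdict


def find_duplicate_tests(config_paths):
    """Group configuration paths that resolve to the same real file.

    Returns a dict mapping each real file to the listed paths pointing at it,
    keeping only files that are referenced more than once.
    """
    by_target = defaultdict(list)
    for path in config_paths:
        # realpath follows symlinks, so an alternate name maps to its primary file.
        by_target[os.path.realpath(path)].append(path)
    return {target: paths for target, paths in by_target.items() if len(paths) > 1}
```

If the returned dict is non-empty, removing all but one of the listed names for each affected test should avoid the warning described above.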