Description
Expanded preprocessing tools
Additional options and tools to work with jshd files.
Background
Data can be preprocessed into jshd files for improved performance through the use of PreprocessCommand. The resulting files can be used in the online editor (see SandboxInputOutputLayer) or when run locally (see JvmInputOutputLayer). Simulations read them via the external command (see external in examples/guide/grass_shrub_fire_cli.josh). So far, users have been inspecting these files by loading them into the editor and checking the values through the online visualizations.
These files can be created from geotiffs and netCDF files. Originally built for geotiffs, there is an --append option which, instead of creating a new jshd file, adds to an existing file. This is used to accumulate data from multiple files with the intention that each file contains a single time period like a year. See StreamToPrecomputedGridUtil and GridCombiner.
Note that there is a transformation from earth space (see EarthShape) to grid space (see EnginePoint). However, operations generally are performed in grid space during actual simulation execution. The projection is based on the grid defined in the Josh script (see ExtentsTransformer). This provides important performance benefits by doing the expensive transformation operations in advance.
Objective
We want to make it easier to build up these jshd files and inspect them afterwards. While the current functionality is focused on using append to build up data across timepoints (different files with different temporal coverage), we should ensure that it works to build up across different spatial extents (different files with different spatial coverage). This is a slightly riskier operation so there are supporting pieces we should also add:
- Check that joining jshd files works through pre-compute as well.
- To catch mistakes, we should allow users to specify a default value to use if no matching data from external files were found.
- We should allow the user to inspect jshd files using the jar outside of a simulation.
- Ensure testing covers the append case (either temporal or spatial is OK) with an inspection.
Implementation
This requires a few different components to achieve.
Component 1: Allow specifying a default value
We should add a --default-value option to PreprocessCommand. We should add a fill method to UniformPrecomputedGrid that takes a value to apply across all spaces in the grid that are then overwritten where data are available from the source being copied into the grid / jshd file. In other words, the fill should happen after making the new grid but before data from the source file are copied into it. For append, this fill should not be applied.
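The ordering described above (create the grid, fill it, then copy source data over the fill) can be sketched as follows. This is only an illustration: SketchGrid is a hypothetical stand-in that holds a single timestep of doubles, and it does not reflect the actual fields or signatures of UniformPrecomputedGrid.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a precomputed grid; not the real UniformPrecomputedGrid. */
final class SketchGrid {
  private final int width;
  private final int height;
  private final double[] values;

  SketchGrid(int width, int height) {
    this.width = width;
    this.height = height;
    this.values = new double[width * height];
  }

  /** Sets every cell to the given value. Meant to run before source data are copied in. */
  void fill(double value) {
    Arrays.fill(values, value);
  }

  /** Overwrites a single cell, as the copy from the source file would. */
  void setAt(int x, int y, double value) {
    values[index(x, y)] = value;
  }

  double getAt(int x, int y) {
    return values[index(x, y)];
  }

  private int index(int x, int y) {
    if (x < 0 || x >= width || y < 0 || y >= height) {
      throw new IndexOutOfBoundsException("Outside grid: " + x + ", " + y);
    }
    return y * width + x;
  }
}
```

With this shape, a new (non-append) run would call fill once with the default value and then copy source data over it, while an append run would skip the fill so that existing values are preserved.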
Component 2: Ignore default values when copying
If a default value (see component 1) is given, then we should be able to provide it to StreamToPrecomputedGridUtil inside an Optional (empty if no default value) such that values are not copied into the grid if they equal that value. If it is possible, let's have a tolerance of +/- 0.000001 when determining equality to the default value.
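A minimal sketch of the tolerance check, assuming values are compared as doubles (the real code may use a different numeric type). DefaultValueFilter and shouldCopy are hypothetical names for illustration and are not part of StreamToPrecomputedGridUtil.

```java
import java.util.Optional;

/** Hypothetical helper; the names here are illustrative, not existing project API. */
final class DefaultValueFilter {
  /** Tolerance of +/- 0.000001 as proposed in this issue. */
  static final double TOLERANCE = 0.000001;

  /**
   * Decides whether a source value should be copied into the grid.
   *
   * @param value the value read from the source file
   * @param defaultValue the default value, or empty if none was given
   * @return false only if a default is present and value is within TOLERANCE of it
   */
  static boolean shouldCopy(double value, Optional<Double> defaultValue) {
    if (!defaultValue.isPresent()) {
      return true; // No default configured, so every value is copied.
    }
    return Math.abs(value - defaultValue.get()) > TOLERANCE;
  }
}
```

Passing an empty Optional keeps the current behavior, so callers that do not supply --default-value would be unaffected.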
Component 3: Allow pre-compute to use jshd as input
We should validate if we can use a jshd file as input to PreprocessCommand similar to how a netcdf can be given. Like in component 2, values matching the default value should not be copied over.
If possible, this should check that the grid is the same size and copy values over into the other jshd file given. If taking this optional step, they do not need to have the same temporal bounds but they need to have the same spatial bounds as values are likely given in grid space.
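The optional size check and copy could look roughly like the sketch below. It assumes a single timestep represented as a plain double[][] in grid space, which is a simplification of the real jshd structure, and GridCopySketch is a hypothetical name.

```java
/** Hypothetical sketch of a jshd-to-jshd copy for one timestep; not existing project code. */
final class GridCopySketch {
  static final double TOLERANCE = 0.000001;

  /**
   * Copies non-default cells from source into target after checking dimensions.
   *
   * @return the number of cells copied
   * @throws IllegalArgumentException if the grids differ in size
   */
  static int copyInto(double[][] source, double[][] target, double defaultValue) {
    // Validate all dimensions first so a mismatch never leaves a partial copy.
    if (source.length != target.length) {
      throw new IllegalArgumentException("Grids differ in number of rows.");
    }
    for (int y = 0; y < source.length; y++) {
      if (source[y].length != target[y].length) {
        throw new IllegalArgumentException("Grids differ in width at row " + y);
      }
    }

    int copied = 0;
    for (int y = 0; y < source.length; y++) {
      for (int x = 0; x < source[y].length; x++) {
        // Cells holding the default value carry no data and are skipped.
        if (Math.abs(source[y][x] - defaultValue) > TOLERANCE) {
          target[y][x] = source[y][x];
          copied++;
        }
      }
    }
    return copied;
  }
}
```

Validating before copying means a size mismatch fails without modifying the target grid.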
Component 4: Add inspectJshd command
Inside org.joshsim.command, we should add an InspectJshdCommand similar to PreprocessCommand that takes in a jshd file, a variable name, a time point, an x coordinate, and a y coordinate. It should print the value with units found at that time and location or an exception if not found.
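The lookup behavior (print the value with units, or raise an exception if nothing is found) might be sketched as below. InspectSketch and its in-memory map are hypothetical and only illustrate the contract. The real InspectJshdCommand would read from a jshd file and follow the same command-line conventions as PreprocessCommand.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical in-memory lookup illustrating the inspect contract; not project code. */
final class InspectSketch {
  private final Map<String, Double> cells = new HashMap<>();
  private final String units;

  InspectSketch(String units) {
    this.units = units;
  }

  void put(String variable, long timestep, int x, int y, double value) {
    cells.put(key(variable, timestep, x, y), value);
  }

  /** Returns the value with units, or throws if no value exists at that time and place. */
  String describe(String variable, long timestep, int x, int y) {
    Double value = cells.get(key(variable, timestep, x, y));
    if (value == null) {
      throw new NoSuchElementException(
          "No value for " + variable + " at timestep " + timestep + ", x=" + x + ", y=" + y);
    }
    return value + " " + units;
  }

  private static String key(String variable, long timestep, int x, int y) {
    return variable + "@" + timestep + ":" + x + "," + y;
  }
}
```

Throwing on a missing value, rather than printing a placeholder, lets shell-based tests detect failure through a non-zero exit, assuming the command surfaces the exception that way.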
Component 5: Expand on test of basic preprocess
With the addition of the inspectJshd command on the jar, we can expand upon the test of preprocessing seen in testPreprocessTutorial. Specifically, let's rename this to testPreprocess with name Test Preprocessing. Then, as we have known points in GeotiffExternalDataReaderTest, let's use the same geotiff file with a new josh script added to examples/test on which we can run preprocess. We can then inspect the resulting jshd file using inspectJshd. This can all happen from a new shell script in examples.
Component 6: Expand on temporal test
Let's expand test_preprocess.sh which already combines multiple years of geotiffs to use inspectJshd at the end to check values across multiple years.
Component 7: Add spatial combine test
Let's try combining the precipitation datasets from the Grass Shrub Fire tutorial (see grass_shrub_fire.html) and Two Trees (see two_trees.html) for a single year. These have overlapping but non-identical spatial extents. This should use a default value of -1000. See test_preprocess.sh.
In other words, we can convert one file from each set to jshd. We can inspect one point in the first at a location that is not in the second, and then inspect one point in the second at a location that is not in the first. We should then combine and validate that both points are found with the same values in the resulting combined jshd file.
Please first work on the files individually and inspect them to get the expected values. Then, convert the CHC-CMIP6 geotiff data from Grass Shrub Fire to a new jshd file with preprocess. Then, please add the netCDF from Two Trees to that new jshd file we just made from Grass Shrub Fire with append. This avoids a jshd + jshd combination, which would not be valid for spatial non-overlap.