
v0.7.0


@rouille rouille released this 23 Dec 18:37
· 2 commits to main since this release
9bfd60e

This is a new major release that includes new data for 2024 and several enhancements compared to v0.6.1.

2024 Data Release

OGE now includes data for 2024, based on the final release data from EIA (Forms 860, 923, and 930) and EPA (Continuous Emissions Monitoring System data). Along with the new 2024 data, the existing 2005-2023 OGE data has been updated with the latest methodological improvements.

Improvements

EIA-930 data

In previous years, our pipeline used raw data downloads from EIA Form 930 (Hourly and Daily Balancing Authority Operations Report). This data source has now been integrated into PUDL, so we have switched to PUDL's version of the data. As with our other data inputs, we rely on PUDL to clean and organize EIA-930 data into well-modeled tables that facilitate the downstream analysis.

As you may know, EIA recently updated its Form 930 data to add more detailed fuel and generation categories, especially for renewables and storage. The new categories distinguish battery storage, solar (with or without integrated battery), wind (with or without integrated battery), geothermal, pumped storage (separate from hydro), and other storage, which makes it easier to track how these resources are integrated into the grid hour by hour. To date, however, only certain balancing areas have started using the new fuel categories. For this reason, this release maps the new fuel types to the existing ones.
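To illustrate what mapping new fuel types onto existing ones can look like, here is a minimal sketch. The category names, the target assignments in `NEW_TO_LEGACY_FUEL`, and both helper functions are hypothetical and chosen only for illustration; they are not taken from the OGE or PUDL code.

```python
# Hypothetical mapping from the newer EIA-930 fuel categories to legacy ones.
# Both the category names and the chosen targets are assumptions for this sketch.
NEW_TO_LEGACY_FUEL = {
    "battery_storage": "other",
    "solar_with_battery": "solar",
    "wind_with_battery": "wind",
    "geothermal": "other",
    "pumped_storage": "hydro",
    "other_storage": "other",
}


def map_to_legacy_fuel(fuel_category: str) -> str:
    """Return the legacy category for a new one; legacy categories pass through."""
    return NEW_TO_LEGACY_FUEL.get(fuel_category, fuel_category)


def aggregate_by_legacy_fuel(records):
    """Sum net generation (MWh) by legacy fuel category.

    `records` is an iterable of (fuel_category, net_generation_mwh) tuples.
    """
    totals = {}
    for fuel, mwh in records:
        legacy = map_to_legacy_fuel(fuel)
        totals[legacy] = totals.get(legacy, 0.0) + mwh
    return totals
```

With a mapping like this, a balancing area that reports the new categories and one that still reports the old ones end up with the same set of fuel columns downstream.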

Enhancements for non-local data inputs

For projects that use oge as a dependency or call functions that rely on PUDL data, it was previously necessary to download a local copy of PUDL's (multi-GB) sqlite database, since dataframes cannot be read directly from remote sqlite databases. PUDL now publishes parquet versions of its tables, which can be read directly from the cloud without a local download. This version of OGE includes options to read these input files from the cloud (#411).
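As a rough sketch of what reading a remote parquet table looks like with pandas: the base URL below is a placeholder, and the helper names are hypothetical rather than part of the OGE or PUDL APIs. Only `pandas.read_parquet`, which accepts a URL and an optional `columns` list, is a real library call (it needs a parquet engine such as pyarrow installed).

```python
# Placeholder location; substitute the real location of PUDL's parquet files.
PUDL_PARQUET_BASE_URL = "https://example.com/pudl-parquet"


def pudl_parquet_url(table_name: str, base_url: str = PUDL_PARQUET_BASE_URL) -> str:
    """Build the URL of a single table's parquet file (hypothetical layout)."""
    return f"{base_url.rstrip('/')}/{table_name}.parquet"


def read_pudl_table(table_name, columns=None, base_url=PUDL_PARQUET_BASE_URL):
    """Read one table straight from remote storage into a DataFrame."""
    # Imported lazily so the URL helper works without pandas installed.
    import pandas as pd

    # Passing `columns` lets parquet fetch only the columns that are needed.
    return pd.read_parquet(pudl_parquet_url(table_name, base_url), columns=columns)
```

Because parquet is a columnar format, requesting a subset of columns can avoid transferring most of a large table, which is not possible with a remote sqlite file.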

Fix misallocation of generation and fuel to individual generators

Our data pipeline relies on a process that allocates the generation and fuel data reported in EIA-923 to individual generators at each plant. We discovered and fixed a bug affecting plants with generators that retired or came online during the report year, which caused generation and fuel to be misallocated among the generators at those plants. See catalyst-cooperative/pudl#4789 for more details (note: we currently run this pipeline with a forked version of this code, so although the fix has not yet been merged into pudl, it is already in our fork).

Consumed emission calculation enhancements

In addition to improving the cleaning of the EIA-930 data used as an input to the consumed emissions calculation (#430), we made a small update to the methodology for calculating monthly and annual consumed emission rates. Previously, we used implied demand (generation minus interchange) to weight the hourly emission rates when calculating monthly and annual aggregations. However, this approach led to more frequent missing data. With this release, we instead use the demand data reported directly for each BA in EIA-930 (#422).
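The weighting described above can be sketched as a demand-weighted average of hourly rates. This is an illustration only: the function names are hypothetical, and the choice to skip hours with a missing rate or weight is an assumption, not a description of how OGE handles gaps.

```python
from typing import Optional, Sequence


def implied_demand(generation: Optional[float], interchange: Optional[float]) -> Optional[float]:
    """Previous weight: generation minus interchange; missing if either input is missing."""
    if generation is None or interchange is None:
        return None
    return generation - interchange


def demand_weighted_rate(
    hourly_rates: Sequence[Optional[float]],
    hourly_demand: Sequence[Optional[float]],
) -> Optional[float]:
    """Aggregate hourly emission rates into one rate, weighted by hourly demand.

    Hours with a missing rate or missing demand are skipped (an assumption).
    """
    weighted_sum = 0.0
    total_demand = 0.0
    for rate, demand in zip(hourly_rates, hourly_demand):
        if rate is None or demand is None:
            continue
        weighted_sum += rate * demand
        total_demand += demand
    if total_demand == 0:
        return None  # no usable hours in the period
    return weighted_sum / total_demand
```

Implied demand is missing whenever either generation or interchange is missing, so a weight built from two reported series has more ways to be unavailable than a single reported demand series.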

Expanded subplant crosswalk

We previously did not create subplant IDs for proposed generators that were not far enough along in construction. However, we have found ourselves working with more data that requires information about these generators, so we expanded our subplant crosswalk to include more proposed generators (#428). While we currently use a separate pipeline from PUDL for assigning subplant IDs, we hope to harmonize these processes and rely on PUDL's subplant IDs in a future release (catalyst-cooperative/pudl#3691).

Optimized memory use of data pipeline

Each new generator added to the grid increases the amount of hourly data that we work with each year. On certain computers, we had trouble running the full OGE pipeline for more recent years without hitting memory (RAM) errors, so we refactored some of our code to use memory more efficiently (#419). One larger change was to drop data for hours when generators were not operating (#432). We found that observations where all operational data (fuel consumption, generation, emissions) were zero accounted for over 2.5GB of data in our pipeline! Removing this data required some additional downstream changes to ensure data completeness in our outputs.
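The idea of dropping non-operating hours can be sketched as a simple filter. The column names in `OPERATIONAL_COLUMNS` and the helper functions are hypothetical, and plain dictionaries stand in for the hourly records here; this is not the OGE implementation.

```python
# Hypothetical names for the operational data columns.
OPERATIONAL_COLUMNS = ("fuel_consumed_mmbtu", "net_generation_mwh", "co2_mass_lb")


def is_non_operating(row: dict) -> bool:
    """True only when every operational value in the row is exactly zero.

    A missing value (None) is not equal to zero, so such rows are kept.
    """
    return all(row[col] == 0 for col in OPERATIONAL_COLUMNS)


def drop_non_operating_hours(rows):
    """Return only the hourly records in which the generator was operating."""
    return [row for row in rows if not is_non_operating(row)]
```

After a filter like this, an absent hour means "all zeros" rather than "missing", so any output that needs a complete hourly time series has to restore those hours explicitly.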

What's Changed

Full Changelog: v0.6.1...v0.7.0