
Comparison values (e.g. FCN) in the paper do not match up #1

Description

@marvingabler

Hey folks, first of all, congrats on your paper & hard work!

I am wondering where the comparison values in the paper (e.g. the FourCastNet RMSE) are coming from. They seem to be very different from what the authors describe in their papers and from what we could experimentally verify last year:

  • you report an RMSE for FCN of 1.28K for t2m at 6h, while the authors describe roughly 0.74K (which I can verify is correct)
  • you report an RMSE for FCN of 1.68K for t2m at 24h, while the authors describe roughly 0.94K (which I can also verify)

These numbers would obviously change the conclusions of the paper. Before I go deeper into checking the other variables and ClimaX as well, I wanted to reach out and check whether I am missing something.
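
For context on how I'm computing these numbers: here is a minimal sketch of the latitude-weighted RMSE that both the FourCastNet paper and WeatherBench report, assuming `forecast` and `truth` are xarray DataArrays with dims `(time, lat, lon)` for a single variable and lead time. Note that conventions differ slightly between papers (e.g. taking the square root per forecast time and then averaging, vs. averaging the MSE over time first), which can shift numbers a bit, but not by the factors above:

```python
import numpy as np
import xarray as xr


def lat_weighted_rmse(forecast: xr.DataArray, truth: xr.DataArray) -> float:
    """Latitude-weighted RMSE over dims (time, lat, lon)."""
    # cos(latitude) area weights, normalised so they average to 1.
    weights = np.cos(np.deg2rad(forecast["lat"]))
    weights = weights / weights.mean()
    # Weighted spatial mean of the squared error, square root per time step...
    rmse_per_time = np.sqrt(
        ((forecast - truth) ** 2).weighted(weights).mean(dim=["lat", "lon"])
    )
    # ...then average over all initialisation times in the evaluation period.
    return float(rmse_per_time.mean(dim="time"))
```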

To compare your scores easily with other open AI weather models (you can choose target resolutions), I highly recommend WeatherBench's web UI.
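
If you prefer to check programmatically rather than via the UI, the ERA5 ground truth is also available from WeatherBench as public zarr stores. A hypothetical sketch with xarray (the dataset path and variable name below are placeholders; see the WeatherBench docs for the actual catalogue, and note that gs:// paths require gcsfs):

```python
import xarray as xr

# Placeholder path -- look up the real dataset in the WeatherBench catalogue.
ERA5_ZARR = "gs://weatherbench2/<era5-dataset>.zarr"

era5 = xr.open_zarr(ERA5_ZARR)
t2m = era5["2m_temperature"]  # 2 m temperature; variable name may differ
print(t2m)
```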

One more comment:

  • GraphCast and Pangu are open & can be used for comparison (in contrast to your statement in the paper)
