[ENH] add feature scaling support for EncoderDecoderDataModule #1983

Open
PranavBhatP wants to merge 40 commits into sktime:main from PranavBhatP:feature-scaling

Conversation

@PranavBhatP
Contributor

@PranavBhatP PranavBhatP commented Oct 12, 2025

Reference Issues/PRs

fixes #1979

What does this implement/fix? Explain your changes.

Implements feature scaling for continuous targets and features in the D2 layer, using the input scalers and target_normalizer params of the EncoderDecoderDataModule.

Current state of this PR:

  • Extends the functionality of the existing preprocess_data to handle scaling of the target and continuous features using TorchNormalizer, EncoderNormalizer, StandardScaler and RobustScaler.
  • Fits all scalers and normalizers during setup and generates a preprocessed version of the dataset to feed into the respective train-val-test splits of the actual dataset.
  • NOTE: The EncoderNormalizer, unlike the others, needs to be fit on every encoding sequence and transform the sequence "on-the-fly". Its fitting and normalizing logic is kept separate and is done inside __getitem__ when the D2 dataset feeds data into the model during training (see the sketch below).
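A minimal sketch of that on-the-fly EncoderNormalizer path, for illustration only; _build_window, max_encoder_length and target_normalizer are assumed names, not the exact code in this PR:

```python
# Illustrative only: per-window fitting of EncoderNormalizer inside __getitem__.
from pytorch_forecasting.data.encoders import EncoderNormalizer

def __getitem__(self, idx):
    x, y = self._build_window(idx)  # hypothetical helper returning window tensors
    if isinstance(self.target_normalizer, EncoderNormalizer):
        # EncoderNormalizer is fit on each encoder window and applied on the fly
        self.target_normalizer.fit(y[: self.max_encoder_length])
        y = self.target_normalizer.transform(y)
    return x, y
```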

Issues (EDIT: resolved):

  • An issue arises with the usage of sklearn scalers. sklearn scalers accept only numpy or pandas data. However, when we cast the torch data (from the TimeSeries D1 object) into pandas/numpy data, the computation graph is broken. Refer to the comment.

UPDATE: THIS HAS BEEN FIXED. We move preprocessing to the datamodule level (during setup()) instead of calling the transform inside the D2 dataset's __getitem__. This ensures we don't break the computation graph, by moving the normalization pipeline entirely out of the training stage and performing it during the data module's setup stage.

DRAWBACK: We cannot work with larger-than-memory data formats with this method; that is the price of keeping compatibility with sklearn scalers.
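A rough sketch of that setup-time flow (illustrative only; _preprocess_all_data mirrors the diff, while _fit_scalers and the index attributes are assumed names):

```python
# Rough sketch, not the exact PR code: scalers are fitted once on the training split
# during setup(), and each split is transformed up front rather than in __getitem__.
def setup(self, stage=None):
    if stage == "fit":
        self._fit_scalers(self._train_indices)  # fit target_normalizer + feature scalers
        self._train_preprocessed = self._preprocess_all_data(self._train_indices)
        self._val_preprocessed = self._preprocess_all_data(self._val_indices)
    elif stage == "test":
        self._test_preprocessed = self._preprocess_all_data(self._test_indices)
```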

What should a reviewer concentrate their feedback on?

PR checklist

  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
  • Added/modified tests
  • Used pre-commit hooks when committing to ensure that code is compliant with hooks. Install hooks with pre-commit install.
    To run hooks independent of commit, execute pre-commit run --all-files

@codecov

codecov bot commented Oct 16, 2025

Codecov Report

❌ Patch coverage is 87.93970% with 24 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@7125289). Learn more about missing BASE report.

Files with missing lines Patch % Lines
pytorch_forecasting/data/data_module.py 87.79% 21 Missing ⚠️
pytorch_forecasting/base/_base_pkg.py 88.88% 3 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1983   +/-   ##
=======================================
  Coverage        ?   86.63%           
=======================================
  Files           ?      165           
  Lines           ?     9919           
  Branches        ?        0           
=======================================
  Hits            ?     8593           
  Misses          ?     1326           
  Partials        ?        0           
Flag Coverage Δ
cpu 86.63% <87.93%> (?)
pytest 86.63% <87.93%> (?)

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@PranavBhatP PranavBhatP marked this pull request as ready for review October 17, 2025 11:08
@PranavBhatP
Contributor Author

Tests are failing on the notebook. Seems like it's an issue caused by the training script in the notebook. Will try changing accelerator = "cpu"; that might resolve the issue.

@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.



@phoeenniixx
Member

I think we could look at the v1 implementation? Maybe that could help us with this issue: how are the sklearn scalers handled there?

@PranavBhatP
Contributor Author

PranavBhatP commented Oct 21, 2025

Yes, we will need a design doc to analyse the v1 implementation. This is the best way forward.

@fkiraly
Collaborator

fkiraly commented Oct 22, 2025

agree - it seems like one more rabbit hole...

Is there perhaps something we can learn from pytorch-tabular?

@phoeenniixx phoeenniixx mentioned this pull request Oct 28, 2025
@PranavBhatP
Contributor Author

I've replaced the loss function used in the notebook with nn.L1Loss(). I cannot think of a fix for the current issue with PTF metrics. Any thoughts yet about what is causing it? @phoeenniixx @agobbifbk @fkiraly? Do we proceed with this PR by running the model in the notebook with nn metrics?

Collaborator

@fkiraly fkiraly left a comment

May I request an explanation why the notebook is running prior to the changes but now failing?

@PranavBhatP
Contributor Author

PranavBhatP commented Nov 3, 2025

May I request an explanation why the notebook is running prior to the changes but now failing?

I did not get your question. Are you referring to why the tests were failing prior to the changes in the most recent commits? It is passing now... For context, the tests were failing on the notebook because training simply does not seem to work when we use in-house PTF metrics. More context on the initial failing test: https://github.com/sktime/pytorch-forecasting/actions/runs/18570677570/job/52943552548

However, the pipeline works perfectly fine when we use nn metrics. That is why we replace the metric parameter for the TFT model with nn.L1Loss() instead of MAE().

@agobbifbk

Hello @PranavBhatP, I looked at the scalers issue (that we can't fit sklearn scalers in the datamodule, as this breaks the computation graph). I have some comments, please lmk if you think this could work.

Currently we call _preprocess_data inside __getitem__ (see here). This means that whenever we perform scaling (inside _preprocess_data), we perform the scaling INSIDE the nested dataset class, which is then passed to the dataloaders. So scaling is actually happening DURING training, when the dataloader calls the __getitem__ of the dataset. And when we try calling sklearn scalers, it breaks the computation graph.

So, I looked at pytorch-tabular - how they actually solve this issue:

  • They perform scaling BEFORE training, as preprocess_data (where scaling etc. happens) is called not in __getitem__ but in setup. (see here)
  • The training dataloaders are not created at this point; they just cache the datasets (see here). These cached datasets hold the scaled, normalised data - they don't scale one __getitem__ at a time; rather the whole train split, val split etc. are scaled and normalised at once.
  • These cached datasets are then saved in the respective train_dataset, val_dataset properties (see here), which are then sent to the dataloaders (see here).
  • These dataloaders are then called directly during the training (see here).

I think we should do something similar, do the scaling BEFORE the training, not DURING the training.

What are your thoughts about it? Also FYI @agobbifbk @fkiraly

If we want to be able to work with data that cannot fit in memory, the scalers' transform method must happen in the __getitem__ function, during the training. Precomputing the scaled data in the different dataloaders can be done only if we are sure that the whole data can fit in memory. I know that this is time consuming and inefficient, but if we want to keep this feature I don't see any other possible solution. We can rewrite the partial fit for the StandardScaler and MinMaxScaler in pure torch so that we don't need to pass through numpy, but the transformation should still happen in the __getitem__ (see the sketch below). We can decide that this D2 layer is only compliant with a D1 layer holding a single dataframe that fits in memory and postpone the problem, but probably with a little effort we can manage to do both with a single flag (easier also to maintain). @fkiraly what do you think?
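A sketch of what such a pure-torch partial_fit standard scaler could look like, following the suggestion above; the class and method names are illustrative and not part of this PR:

```python
# Hedged sketch of a running standardizer with a partial_fit-style API, in pure torch,
# so batches can be scaled inside __getitem__ without converting to numpy.
import torch

class TorchRunningStandardScaler:
    """Accumulates mean/variance batch by batch (Welford/Chan update)."""

    def __init__(self, eps: float = 1e-8):
        self.count = 0
        self.mean = None
        self.m2 = None  # running sum of squared deviations
        self.eps = eps

    def partial_fit(self, x: torch.Tensor) -> "TorchRunningStandardScaler":
        # expects x of shape (n_samples, n_features)
        x = x.reshape(-1, x.shape[-1]).double()
        n_b = x.shape[0]
        mean_b = x.mean(dim=0)
        m2_b = ((x - mean_b) ** 2).sum(dim=0)
        if self.mean is None:
            self.count, self.mean, self.m2 = n_b, mean_b, m2_b
        else:
            delta = mean_b - self.mean
            total = self.count + n_b
            self.mean = self.mean + delta * n_b / total
            self.m2 = self.m2 + m2_b + delta**2 * self.count * n_b / total
            self.count = total
        return self

    def transform(self, x: torch.Tensor) -> torch.Tensor:
        std = torch.sqrt(self.m2 / max(self.count - 1, 1) + self.eps)
        return (x - self.mean.to(x.dtype)) / std.to(x.dtype)
```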

@PranavBhatP
Contributor Author

PranavBhatP commented Dec 19, 2025

@agobbifbk I say we merge with whatever we have (after review) and extend scaling via partial_fit to larger-than-memory datasets in a new PR?

@agobbifbk

Ok!

@PranavBhatP
Contributor Author

PranavBhatP commented Dec 28, 2025

@fkiraly @phoeenniixx @agobbifbk would appreciate another review, this PR is complete from my side, do take a look whenever you're free. Thanks.

Member

@phoeenniixx phoeenniixx left a comment

Thanks!
Please add the complete docstrings in the functions and also remove the TODOs that are addressed by this PR.
I also have some suggestions:

  • I think we should also add support for the GroupNormalizer and MultiNormalizer of ptf (see the comment below).
  • I am a little worried about the test stage of the datamodule, here: if we are NOT using the same datamodule that we used for fitting, the code may break as the fitted scalers etc. are not present?
    We should also think of saving the scalers/transformers in this case? I think this is where the pkg layer comes in? I think we need some discussion about this?

What do you think? Also FYI @fkiraly @agobbifbk


elif stage == "test":
    if not hasattr(self, "test_dataset"):
        self._test_preprocessed = self._preprocess_all_data(self._test_indices)
Member

If we are creating a new datamodule for testing (not using the dm that we used for fit), will this break as we don't have the fitted scalers etc. here?

Contributor Author

Yes it will break, thanks for pointing that out, we need to figure out a way to persist the scalers...

Contributor Author

I have fixed it, it works as intended in the latest version, please review it when possible.

@PranavBhatP
Contributor Author

PranavBhatP commented Jan 1, 2026

I think we should also add the support of GroupNormalizer, MultiNormalizer of ptf (see the comment below).

MultiNormalizer is something we could take a look at for sure, but the data from it would be unusable at the model layer since we don't support multi-target forecasts in the D2 layer.

GroupNormalizer would be much harder to implement imo. It would demand much more change to the current architecture since it requires handling with pandas and so on. I think supporting groups should be a separate rework/PR; it would require more time.

We should also think of saving the scalers, transformers in this case? I think this is where pkg layer comes in? I think we need some discussion about this?

Yes, that is a valid concern, we need to globalise the transforms/scalers used. We also need to figure out how to save the state across data modules. For now it throws a RuntimeError (see the commit below).
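Roughly, the guard looks like this (illustrative sketch only; the attribute name and message are assumptions, not the exact code from the commit):

```python
# Illustrative sketch of a fitted-scaler guard for the test/predict stages.
def _check_scalers_fitted(self):
    if not getattr(self, "_scalers_fitted", False):  # assumed flag set during setup(stage="fit")
        raise RuntimeError(
            "Scalers/normalizers have not been fitted. Run setup(stage='fit') on this "
            "datamodule, or load previously saved scalers, before the test stage."
        )
```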

@PranavBhatP
Contributor Author

Looks like the lack of fitting will lead to failures... we have to add persistence before we proceed.

@PranavBhatP
Contributor Author

PranavBhatP commented Jan 5, 2026

We should also think of saving the scalers, transformers in this case? I think this is where pkg layer comes in? I think we need some discussion about this?

@phoeenniixx Could you give a rough mental idea about how the package layer would work on this? I am trying to come up with something and a reference would be great!

@PranavBhatP
Contributor Author

I have added saving and loading of scalers. The PR is complete with respect to the current requirements.

@agobbifbk

Nice one!
Just a comment: a more efficient and less memory-demanding approach can rely on partial_fit, like here. Since there are different scalers in this PR, and not all have the partial_fit method implemented, I think we can solve it in a different issue. Can we also add MinMaxScaler, or any scaler in sklearn? (so the user can pass sklearn.preprocessing.MinMaxScaler(feature_range=(0.05, 0.95)) directly as an argument and evaluate the string afterwards (eval(scaler)))?
THX!
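A quick sketch of the string-based option suggested above; the _resolve_scaler helper is illustrative and not part of this PR:

```python
# Illustrative only: accept either a scaler instance or an sklearn expression string.
import sklearn.preprocessing

def _resolve_scaler(scaler):
    if isinstance(scaler, str):
        # eval on user-supplied strings is only reasonable for trusted configuration
        return eval(scaler)
    return scaler

scaler = _resolve_scaler("sklearn.preprocessing.MinMaxScaler(feature_range=(0.05, 0.95))")
```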

@PranavBhatP
Contributor Author

PranavBhatP commented Jan 24, 2026

Since there are different scalers in this PR, and not all have the partial_fit method implemented, I think we can solve it in a different issue. Can we also add MinMaxScaler, or any scaler in sklearn? (so the user can pass sklearn.preprocessing.MinMaxScaler(feature_range=(0.05, 0.95)) directly as an argument and evaluate the string afterwards (eval(scaler)))?

Maybe we need a torch version of partial_fit? That's for another issue I think. We can merge this one and work on that next or someone can take it up.

Adding MinMaxScaler also makes perfect sense, it would be a good addition in a future PR (good first issue?)

@PranavBhatP
Contributor Author

@phoeenniixx @fkiraly I would like a review.

Member

@phoeenniixx phoeenniixx left a comment

Thanks! I have added a few suggestions related to loading and saving scalers.

if self.datamodule is not None:
    self.datamodule.time_series_dataset = data
    self.datamodule.setup(stage="predict")
    return self.datamodule.predict_dataloader()
Member

Why aren't we using the dm from the next line (line 181) here? I think we could directly use that to set up the predict stage? (See line 193)

Contributor Author

@PranavBhatP PranavBhatP Jan 27, 2026

The reason for this (according to my interpretation): the dm in line 181 is a new data module created from scratch using the input data, which in this case is the data you want to generate predictions for. But if the datamodule has already been set up for training and has fitted scalers, there is no necessity to load scalers from a saved state; you simply set the dataset to the data you want to predict on and call setup(stage="predict") on this, right? It also makes the process more efficient, since we don't have to create new dataloaders every time we reuse the same datamodule in the predict stage.

But there is also a tiny concern: since we are replacing the training data entirely, we won't have access to the old data. But I don't see how this would cause any issues.

I hope this logic seems alright, if there is anything wrong do let me know.

Contributor Author

Maybe instead of overriding the data completely, I could create a copy and then pass the scaler states to the new datamodule, without the need for a pickle file.
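Something along these lines, as an illustrative sketch (attribute and method names follow the snippet above; the copy-based reuse is an assumption, not the final implementation):

```python
# Illustrative sketch: reuse the fitted scaler state by copying the trained datamodule
# instead of overwriting its data in place.
import copy

predict_dm = copy.copy(self.datamodule)   # shallow copy keeps references to the fitted scalers
predict_dm.time_series_dataset = data     # point the copy at the data to predict on
predict_dm.setup(stage="predict")
dataloader = predict_dm.predict_dataloader()
```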

save_ckpt: bool = True,
ckpt_dir: str | Path = "checkpoints",
ckpt_kwargs: dict[str, Any] | None = None,
save_scalers: bool = True,
Member

Should we expose this param? I mean if the user is checkpointing the model, they will always save the scalers, no? Are there any cases when the user only uses the models and not the scalers?

Contributor Author

@PranavBhatP PranavBhatP Jan 27, 2026

I am not 100% sure whether this param should be exposed. I was thinking the user might not want to save their scalers after training, due to memory constraints or simply because they know they are not going to be testing or using the scalers afterwards in a new session, and hence won't require any kind of persistence?


return windows

def save_scalers(self, path: str | Path):
Member

Should we separate the concerns here?

  • The datamodule just does the data loading and the preprocessing as intended
  • The pkg layer actually saves and loads these scalers?

What I mean is that the logic of saving and loading should be moved to the pkg layer. The datamodule just returns the scaler instances and accepts scaler instances; the pkg layer saves and loads the scalers and then sends them to the datamodule as instances?

What do you think?
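A minimal sketch of that separation (get_scalers/set_scalers are hypothetical datamodule hooks, not existing API):

```python
# Illustrative only: the pkg layer owns persistence; the datamodule only exposes
# and accepts scaler instances via hypothetical get_scalers()/set_scalers() hooks.
import pickle
from pathlib import Path

def save_scalers(datamodule, path: str | Path) -> None:
    with open(path, "wb") as f:
        pickle.dump(datamodule.get_scalers(), f)

def load_scalers(datamodule, path: str | Path) -> None:
    with open(path, "rb") as f:
        datamodule.set_scalers(pickle.load(f))
```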


Development

Successfully merging this pull request may close these issues.

[ENH] Add Support for Feature Scaling in D2 DataModule Layer.
