Labels: enhancement (New feature or request)
Description
Describe the bug
With pandas 3.0.0, pandas DataFrames are not consistently typed. Even when a DataFrame is a pandera-typed DataFrame inside a function, it reverts to a standard pandas DataFrame in the calling context.
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandera.
- (optional) I have confirmed this bug exists on the main branch of pandera.
Code Sample, a copy-pastable example
The following code no longer works with pandas 3.0. This will also affect tests using pandas.testing.assert_frame_equal(), which checks dataframe type by default.
import pandas as pd
import pandera.typing as pat
from pandera.pandas import DataFrameModel, check_types

DATA_DICT = {
    'a': [1, 2, 3],
    'b': [4, 5, 6]
}

class TestSchema(DataFrameModel):
    """A simple pandera schema for testing."""
    a: int
    b: int

@check_types
def generate_test_dataframe() -> pat.DataFrame[TestSchema]:
    """Generate a test DataFrame conforming to TestSchema."""
    df = pd.DataFrame(DATA_DICT)
    typed_df = df.pipe(pat.DataFrame[TestSchema])
    print(f'Return type: {type(typed_df)}')
    return typed_df

expected_df = pat.DataFrame[TestSchema](DATA_DICT)
return_df = generate_test_dataframe()
assert isinstance(return_df, type(expected_df)), f'Expected {type(expected_df)}, got {type(return_df)}'
print('Successful completion')

pandas 2.3.2 and pandera 0.28.1 ✅
- No mypy warnings/errors
- generate_test_dataframe() prints Return type: <class 'pandera.typing.pandas.DataFrame'>
- Successful completion is printed
pandas 2.3.3 and pandera 0.28.1 ✅
- No mypy warnings/errors
- generate_test_dataframe() prints Return type: <class 'pandera.typing.pandas.DataFrame'>
- Successful completion is printed
pandas 3.0.0 and pandera 0.28.1 ❌
- No mypy warnings/errors
- generate_test_dataframe() prints Return type: <class 'pandera.typing.pandas.DataFrame'>
- The script then fails with:

Traceback (most recent call last):
  File "c:\Users\m277249\Documents\56649_Speech_AI\CASLIM\pandera_typing_check.py", line 33, in <module>
    assert isinstance(return_df, type(expected_df)), f'Expected {type(expected_df)}, got {type(return_df)}'
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Expected <class 'pandera.typing.pandas.DataFrame'>, got <class 'pandas.DataFrame'>

Expected behavior
I would expect the DataFrame type to be consistent within the function, when leaving the function (especially when checked via check_types), and in the calling context.
Desktop (please complete the following information):
- OS: Windows 11
- Version: pandera 0.28.1 and pandas 3.0.0
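
Possible stopgap for affected tests

For tests that trip over the frame-type check in pandas.testing.assert_frame_equal() mentioned above, one possible stopgap is to pass check_frame_type=False. The sketch below does not use pandera: TypedFrame is a hypothetical stand-in for any DataFrame subclass, and frames_match is a helper invented for this illustration. It only shows how the type check behaves. It is not a fix for the regression itself.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


class TypedFrame(pd.DataFrame):
    """Hypothetical stand-in for a typed DataFrame subclass (not pandera's class)."""


def frames_match(result, expected, check_frame_type=True):
    """Return True if assert_frame_equal accepts the pair, False otherwise."""
    try:
        assert_frame_equal(result, expected, check_frame_type=check_frame_type)
    except AssertionError:
        return False
    return True


data = {"a": [1, 2, 3], "b": [4, 5, 6]}
plain = pd.DataFrame(data)   # what the caller receives in the failing case
typed = TypedFrame(data)     # what the test expects

# Default comparison also checks that the result is an instance of the expected type.
strict = frames_match(plain, typed)
# Relaxed comparison ignores the frame class and compares contents only.
relaxed = frames_match(plain, typed, check_frame_type=False)
print(f"strict: {strict}, relaxed: {relaxed}")
```

Relaxing the check hides the type regression rather than resolving it, so it is probably only suitable for tests that care about the data and not about the DataFrame class.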