DataFrame.drop with columns=set(...) is unspecified #1008

Open
cmp0xff opened this issue Sep 25, 2024 · 1 comment
Labels
pandas_docs For issues where there is a conflict in behavior with pandas docs and stubs that needs resolution

Comments

@cmp0xff

cmp0xff commented Sep 25, 2024

Describe the bug

DataFrame.drop with columns=set(...) is unspecified.

To Reproduce

  1. Provide a minimal runnable pandas example that is not properly checked by the stubs.
import pandas as pd

df = pd.DataFrame({1: [2], 3: [4]})
df = df.drop(columns={1})
  2. I am using the mypy type checker.
  3. The error message received from that type checker.
df_drop.py:4:6: error: No overload variant of "drop" of "NDFrame" matches argument type "set[int]"  [call-overload]
df_drop.py:4:6: note: Possible overload variants:
df_drop.py:4:6: note:     def drop(self, labels: None = ..., *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: Hashable | Sequence[Hashable] | Index[Any] = ..., columns: Hashable | Sequence[Hashable] | Index[Any], level: Hashable | int | None = ..., inplace: Literal[True], errors: Literal['ignore', 'raise'] = ...) -> None
df_drop.py:4:6: note:     def drop(self, labels: None = ..., *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: Hashable | Sequence[Hashable] | Index[Any], columns: Hashable | Sequence[Hashable] | Index[Any] = ..., level: Hashable | int | None = ..., inplace: Literal[True], errors: Literal['ignore', 'raise'] = ...) -> None
df_drop.py:4:6: note:     def drop(self, labels: Hashable | Sequence[Hashable] | Index[Any], *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: None = ..., columns: None = ..., level: Hashable | int | None = ..., inplace: Literal[True], errors: Literal['ignore', 'raise'] = ...) -> None
df_drop.py:4:6: note:     def drop(self, labels: None = ..., *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: Hashable | Sequence[Hashable] | Index[Any] = ..., columns: Hashable | Sequence[Hashable] | Index[Any], level: Hashable | int | None = ..., inplace: Literal[False] = ..., errors: Literal['ignore', 'raise'] = ...) -> DataFrame
df_drop.py:4:6: note:     def drop(self, labels: None = ..., *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: Hashable | Sequence[Hashable] | Index[Any], columns: Hashable | Sequence[Hashable] | Index[Any] = ..., level: Hashable | int | None = ..., inplace: Literal[False] = ..., errors: Literal['ignore', 'raise'] = ...) -> DataFrame
df_drop.py:4:6: note:     def drop(self, labels: Hashable | Sequence[Hashable] | Index[Any], *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: None = ..., columns: None = ..., level: Hashable | int | None = ..., inplace: Literal[False] = ..., errors: Literal['ignore', 'raise'] = ...) -> DataFrame
df_drop.py:4:6: note:     def drop(self, labels: None = ..., *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: Hashable | Sequence[Hashable] | Index[Any] = ..., columns: Hashable | Sequence[Hashable] | Index[Any], level: Hashable | int | None = ..., inplace: bool = ..., errors: Literal['ignore', 'raise'] = ...) -> DataFrame | None
df_drop.py:4:6: note:     def drop(self, labels: None = ..., *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: Hashable | Sequence[Hashable] | Index[Any], columns: Hashable | Sequence[Hashable] | Index[Any] = ..., level: Hashable | int | None = ..., inplace: bool = ..., errors: Literal['ignore', 'raise'] = ...) -> DataFrame | None
df_drop.py:4:6: note:     def drop(self, labels: Hashable | Sequence[Hashable] | Index[Any], *, axis: Literal['index', 0] | Literal['columns', 1] = ..., index: None = ..., columns: None = ..., level: Hashable | int | None = ..., inplace: bool = ..., errors: Literal['ignore', 'raise'] = ...) -> DataFrame | None
Found 1 error in 1 file (checked 1 source file)

Please complete the following information

  • OS: Windows
  • OS Version:
[System.Environment]::OSVersion.Version
Major  Minor  Build  Revision
-----  -----  -----  --------
10     0      19045  0
  • python version: Python 3.11.9
  • version of type checker: mypy 1.11.2 (compiled: yes)
  • version of installed pandas-stubs: pandas-stubs==2.2.2.240807

Additional context

Nope

@Dr-Irv

Dr-Irv commented Sep 25, 2024

First, your example is not correct. It should be:

import pandas as pd

df = pd.DataFrame({1: [2], 3: [4]})   # Fix is here
df = df.drop(columns={1})

Secondly, the pandas documentation says that the argument for columns is "single label or list-like". While your code works, it is not clear that it should. The stubs follow what is documented, and a set is not "list-like".
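To make the mismatch concrete: the overloads in the mypy output annotate columns as `Hashable | Sequence[Hashable] | Index[Any]`, and a built-in set fits neither of the first two. The sketch below checks this at runtime with the standard library's abstract base classes only. It illustrates the type relationships and is not taken from the pandas-stubs source.

```python
from collections.abc import Hashable, Sequence

labels = {1}

# A set is mutable, so its __hash__ is None and it is not Hashable.
is_hashable = isinstance(labels, Hashable)

# A set has no positional indexing, so it is not a Sequence either.
is_sequence = isinstance(labels, Sequence)

# A list built from the same labels is a Sequence.
list_is_sequence = isinstance(list(labels), Sequence)

print(is_hashable, is_sequence, list_is_sequence)
```

The runtime checks line up with what mypy reports statically: `set[int]` matches no arm of the union, while a list does.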

I've added a reference to a pandas issue pandas-dev/pandas#59890 to see what the pandas developers say there.
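Until that question is settled, one possible workaround is to turn the set into a list before passing it to drop. The helper below is a minimal sketch; the name `as_label_list` is hypothetical and is not part of pandas or pandas-stubs.

```python
from collections.abc import Hashable, Iterable


def as_label_list(labels: Iterable[Hashable]) -> list[Hashable]:
    """Hypothetical helper: materialize any iterable of labels as a list."""
    # list() keeps whatever order the iterable yields; for a set that order
    # is arbitrary, which does not matter when the labels are only dropped.
    return list(labels)


to_drop = as_label_list({1})
print(to_drop)
```

With this, the call from the report could be written as `df.drop(columns=as_label_list({1}))`, or simply `df.drop(columns=list({1}))`, which should match the `Sequence[Hashable]` arm of the overloads shown in the error output.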

@Dr-Irv Dr-Irv added the pandas_docs For issues where there is a conflict in behavior with pandas docs and stubs that needs resolution label Jan 10, 2025