From b22fc9fa66b26dcbaff0712b6baf73e88a14059c Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Wed, 9 Jul 2025 23:46:12 +0530
Subject: [PATCH 1/4] DOC: Clarify broadcasting behavior when using lists in DataFrame arithmetic (GH18857)

---
 doc/source/user_guide/basics.rst  |  5 +++++
 doc/source/user_guide/dsintro.rst | 13 +++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
index 8155aa0ae03fa..fbfdec6af8759 100644
--- a/doc/source/user_guide/basics.rst
+++ b/doc/source/user_guide/basics.rst
@@ -209,6 +209,11 @@ either match on the *index* or *columns* via the **axis** keyword:
    df.sub(column, axis="index")
    df.sub(column, axis=0)
 
+Be careful when using raw Python lists in binary operations with DataFrames.
+Unlike NumPy arrays or Series, lists are not broadcast across rows or columns.
+Instead, pandas attempts to match the entire list against a single axis, which may lead to confusing results such as Series of arrays.
+To ensure proper broadcasting behavior, use a NumPy array or Series with explicit index or shape.
+See also: :ref:`numpy broadcasting `
 Furthermore you can align a level of a MultiIndexed DataFrame with a Series.
 
 .. ipython:: python
diff --git a/doc/source/user_guide/dsintro.rst b/doc/source/user_guide/dsintro.rst
index 89981786d60b5..c635d9157f557 100644
--- a/doc/source/user_guide/dsintro.rst
+++ b/doc/source/user_guide/dsintro.rst
@@ -650,6 +650,19 @@ row-wise. For example:
 
    df - df.iloc[0]
 
+When using a Python list in arithmetic operations with a DataFrame, the behavior is not element-wise broadcasting.
+Instead, the list is treated as a single object and the operation is performed column-wise, resulting in unexpected output (e.g. arrays inside each cell).
+
+.. ipython:: python
+
+   df = pd.DataFrame(np.arange(6).reshape(2, 3), columns=["A", "B", "C"])
+
+   df + [1, 2, 3]  # Returns a Series of arrays, not a DataFrame
+
+   df + np.array([1, 2, 3])  # Correct broadcasting
+
+   df + pd.Series([1, 2, 3], index=["A", "B", "C"])  # Also correct
+
 For explicit control over the matching and broadcasting behavior, see the
 section on :ref:`flexible binary operations `.
 

From 7bcf683e63fbfb69601236e41efd5c0a495a2843 Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Thu, 10 Jul 2025 00:01:03 +0530
Subject: [PATCH 2/4] DOC: Fix external link formatting in basics.rst

---
 doc/source/user_guide/basics.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
index fbfdec6af8759..e25dca0da91c9 100644
--- a/doc/source/user_guide/basics.rst
+++ b/doc/source/user_guide/basics.rst
@@ -213,7 +213,7 @@ Be careful when using raw Python lists in binary operations with DataFrames.
 Unlike NumPy arrays or Series, lists are not broadcast across rows or columns.
 Instead, pandas attempts to match the entire list against a single axis, which may lead to confusing results such as Series of arrays.
 To ensure proper broadcasting behavior, use a NumPy array or Series with explicit index or shape.
-See also: :ref:`numpy broadcasting `
+See also: `numpy broadcasting `_
 Furthermore you can align a level of a MultiIndexed DataFrame with a Series.
 
 .. ipython:: python

From ec00318d45485321b5fd91a4afcdf715fca30510 Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Thu, 10 Jul 2025 00:03:55 +0530
Subject: [PATCH 3/4] DOC: Removed external link in basics.rst

---
 doc/source/user_guide/basics.rst | 1 -
 1 file changed, 1 deletion(-)

diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
index e25dca0da91c9..d35cc06db9f06 100644
--- a/doc/source/user_guide/basics.rst
+++ b/doc/source/user_guide/basics.rst
@@ -213,7 +213,6 @@ Be careful when using raw Python lists in binary operations with DataFrames.
 Unlike NumPy arrays or Series, lists are not broadcast across rows or columns.
 Instead, pandas attempts to match the entire list against a single axis, which may lead to confusing results such as Series of arrays.
 To ensure proper broadcasting behavior, use a NumPy array or Series with explicit index or shape.
-See also: `numpy broadcasting `_
 Furthermore you can align a level of a MultiIndexed DataFrame with a Series.
 
 .. ipython:: python

From 0e718bc6827ee13335fdf3dc8c4fa58ba177531a Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Tue, 15 Jul 2025 13:23:57 +0530
Subject: [PATCH 4/4] Comment changes

---
 doc/source/user_guide/dsintro.rst | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/doc/source/user_guide/dsintro.rst b/doc/source/user_guide/dsintro.rst
index c635d9157f557..59938a66bb507 100644
--- a/doc/source/user_guide/dsintro.rst
+++ b/doc/source/user_guide/dsintro.rst
@@ -655,13 +655,7 @@ Instead, the list is treated as a single object and the operation is performed c
 
 .. ipython:: python
 
-   df = pd.DataFrame(np.arange(6).reshape(2, 3), columns=["A", "B", "C"])
-
-   df + [1, 2, 3]  # Returns a Series of arrays, not a DataFrame
-
-   df + np.array([1, 2, 3])  # Correct broadcasting
-
-   df + pd.Series([1, 2, 3], index=["A", "B", "C"])  # Also correct
+   df + np.array([1, 2, 3])
 
 For explicit control over the matching and broadcasting behavior, see the
 section on :ref:`flexible binary operations `.
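
Reviewer note: the three operand types discussed in this series can be compared locally with a short script. The sketch below is not part of the patch. It assumes pandas and NumPy are installed and reuses the `df` definition from the example added in PATCH 1/4; the names `from_array`, `from_series`, and `from_list` are made up for illustration. It prints the type of the list result rather than asserting it, since the prose added to `basics.rst` and the installed pandas version may not agree on that point.

```python
import numpy as np
import pandas as pd

# Same frame as the example added to dsintro.rst in PATCH 1/4.
df = pd.DataFrame(np.arange(6).reshape(2, 3), columns=["A", "B", "C"])

# A 1-D NumPy array with one entry per column is added to every row.
from_array = df + np.array([1, 2, 3])

# A Series is aligned on the column labels before the addition.
from_series = df + pd.Series([1, 2, 3], index=["A", "B", "C"])

# A raw Python list: inspect the result instead of assuming its shape.
from_list = df + [1, 2, 3]

print("array operand:")
print(from_array)
print("Series operand matches array operand:",
      np.array_equal(from_array.to_numpy(), from_series.to_numpy()))
print("list operand result type:", type(from_list).__name__)
print(from_list)
```

Running this against the pandas version the docs are built with is a quick way to confirm that the wording about lists in both `basics.rst` and `dsintro.rst` describes what users will actually see.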