ENH: Allow third-party packages to register IO engines #61642

datapythonista · 2025-06-12T21:58:17Z

xref Proposal to allow third-party engines for readers and writers #61584
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Added the new system to the Iceberg connector only, to keep this PR smaller. The idea is to add the decorator to all other connectors; happy to do it here or in a follow-up PR.

mroeschke · 2025-06-16T17:24:42Z

doc/source/development/extending.rst

+method on it with the arguments provided by the user (except the ``engine`` parameter).
+
+To avoid conflicts in the names of engines, we keep an "IO engines" section in our
+[Ecosystem page](https://pandas.pydata.org/community/ecosystem.html#io-engines).


This will need different formatting, since rst hyperlink syntax is different from md.

True, thanks for the heads up. I updated it.
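For readers following the thread, the dispatch described in the docs excerpt under review (call the engine's method with the user's arguments, minus ``engine``) can be sketched roughly as below. This is only an illustration, not the code in this PR: the `_ENGINES` registry, `register_engine`, `allow_engines`, and `DemoEngine` are all hypothetical names, and the in-memory dict stands in for whatever discovery mechanism the PR actually uses.

```python
from functools import wraps
from typing import Any, Callable

# Hypothetical in-memory registry; a stand-in for real engine discovery.
_ENGINES: dict = {}


def register_engine(name: str, engine: Any) -> None:
    """Register an engine object, refusing duplicate names."""
    if name in _ENGINES:
        raise ValueError(f"engine {name!r} is already registered")
    _ENGINES[name] = engine


def allow_engines(func: Callable[..., Any]) -> Callable[..., Any]:
    """Route calls that pass ``engine=<name>`` to the registered engine."""

    @wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        # Remove ``engine`` so it is not forwarded to the engine's method.
        engine_name = kwargs.pop("engine", None)
        if engine_name is None:
            # No engine requested: use the built-in implementation.
            return func(*args, **kwargs)
        if engine_name not in _ENGINES:
            raise ValueError(f"unknown engine {engine_name!r}")
        # Look up the method with the same name as the decorated function.
        method = getattr(_ENGINES[engine_name], func.__name__, None)
        if method is None:
            raise ValueError(
                f"engine {engine_name!r} does not implement {func.__name__}"
            )
        return method(*args, **kwargs)

    return wrapper


@allow_engines
def read_iceberg(table_identifier: str, **kwargs: Any) -> str:
    # Placeholder for the built-in reader; returns a string for illustration.
    return f"default:{table_identifier}"


class DemoEngine:
    """Hypothetical third-party engine exposing a matching method name."""

    @staticmethod
    def read_iceberg(table_identifier: str, **kwargs: Any) -> str:
        return f"demo:{table_identifier}:{sorted(kwargs)}"


register_engine("demo", DemoEngine)
```

In this sketch, `read_iceberg("ns.t", engine="demo", limit=5)` reaches `DemoEngine.read_iceberg` with `limit=5` but without `engine`, and registering a second engine under the name `"demo"` raises, which is the kind of name conflict the Ecosystem page listing is meant to prevent.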

mroeschke · 2025-06-16T17:34:53Z

pandas/io/iceberg.py

@@ -52,6 +56,10 @@ def read_iceberg(
    scan_properties : dict of {str: obj}, optional
        Additional Table properties as a dictionary of string key value pairs to use
        for this scan.
+    engine : str, optional


Should the read_* and to_* signatures also have an engine_kwargs: dict[str, Any] | None argument to allow engine-specific arguments to be passed per implementation?

Very good point. In read_parquet we already have **kwargs for engine-specific arguments. In map, apply, etc. it's a regular engine_kwargs parameter, since **kwargs is used in some cases for the UDF keyword arguments. I think for IO readers/writers, **kwargs as in read_parquet is fine.

I didn't want to add the engine parameter to all connectors in this PR, to keep it simpler, but I'm planning to follow up with another PR that adds it and also adds **kwargs to the connectors that don't have it yet. Happy to add both here if you prefer; I just thought keeping the implementation separate from all the parameter changes would make reviewing simpler.
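To make the two styles discussed above concrete, here is a small sketch contrasting them. The functions `read_table`, `apply_udf`, and `fake_engine_read` are hypothetical stand-ins, not pandas APIs; they only illustrate why `**kwargs` works for readers while a separate `engine_kwargs` dict is needed when `**kwargs` is already reserved for the UDF.

```python
from __future__ import annotations

from typing import Any, Callable


def fake_engine_read(path: str, **options: Any) -> dict[str, Any]:
    # Hypothetical engine entry point; echoes what it received.
    return {"path": path, "options": options}


def read_table(path: str, engine: str | None = None, **kwargs: Any) -> dict[str, Any]:
    # Style 1 (as in read_parquet): any extra keyword arguments are treated
    # as engine-specific options and forwarded as-is.
    return fake_engine_read(path, **kwargs)


def apply_udf(
    values: list[Any],
    func: Callable[..., Any],
    engine: str | None = None,
    engine_kwargs: dict[str, Any] | None = None,
    **kwargs: Any,
) -> dict[str, Any]:
    # Style 2 (as in map/apply): **kwargs belongs to the user function,
    # so engine options need their own explicit dictionary.
    return {
        "result": [func(v, **kwargs) for v in values],
        "engine_options": dict(engine_kwargs or {}),
    }
```

With the first style a caller writes `read_table("t", engine="x", batch_size=10)`; with the second, `apply_udf([1, 2], f, engine_kwargs={"parallel": True}, offset=1)` keeps `offset` for `f` and `parallel` for the engine.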

datapythonista · 2025-06-16T22:39:04Z

/preview

github-actions · 2025-06-16T22:39:25Z

Website preview of this PR available at: https://pandas.pydata.org/preview/pandas-dev/pandas/61642/

datapythonista added 3 commits June 12, 2025 16:23

New third-party IO engines

f33778c

Add tests and fix bugs

555459b

Finishing docs and tests

1ca77c1

datapythonista requested a review from noatamir as a code owner June 12, 2025 21:58

datapythonista added the IO Data and API Design labels Jun 12, 2025

datapythonista added 2 commits June 13, 2025 00:31

typo in doc label and typing issues

d388101

Merge branch 'main' into io_engines

e333510

mroeschke reviewed Jun 16, 2025

datapythonista added 3 commits June 16, 2025 22:29

Fix link in markdown

cb82ffb

Merge branch 'io_engines' of github.com:datapythonista/pandas into io…

088e5de

…_engines

Merge main

9e71a9d

Fix link

ebfc20c
