feat(python/adbc_driver_manager): Implement import_schema for Polars DBAPI backend
#3556
# The version that Polars added support to initialize a schema via the
# __arrow_c_schema__ interface
required_version = (1, 32, 2)
polars_version = polars.__version__
if tuple(int(v) for v in polars_version.split(".")) >= required_version:
    return polars.Schema(handle)
Alternatively, we could skip the version check and just return polars.Schema(handle) immediately.
That would raise the error below on Polars versions before the required one. IMO giving the user a nicer error message is preferable, though.
Traceback (most recent call last):
  File "/home/henry/development/arrow-adbc/python/adbc_driver_manager/try_import_schema.py", line 11, in <module>
    schema = conn.adbc_get_table_schema("test_table_schema")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/henry/development/arrow-adbc/python/adbc_driver_manager/adbc_driver_manager/dbapi.py", line 521, in adbc_get_table_schema
    return self._backend.import_schema(handle)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/henry/development/arrow-adbc/python/adbc_driver_manager/adbc_driver_manager/_dbapi_backend.py", line 183, in import_schema
    return polars.Schema(handle)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/henry/.cache/uv/archive-v0/SC8MnmcDLgKYq5uEoGc4z/lib/python3.11/site-packages/polars/schema.py", line 102, in __init__
    for name, tp in input:
TypeError: 'adbc_driver_manager._lib.ArrowSchemaHandle' object is not iterable
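For comparison, here is a minimal sketch of what the "nicer error message" path could look like. The helper names (`parse_release`, `check_polars_version`) and the use of `RuntimeError` are assumptions made for illustration, not what the PR actually does. Unlike the plain `int(v)` split, this version tolerates suffixes such as `-beta.1` by reading only the leading numeric components, at the cost of treating a pre-release as equal to its final release.

```python
import re

# Minimum Polars version stated in the PR description (1.32.1 was yanked).
REQUIRED_POLARS_VERSION = (1, 32, 2)


def parse_release(version: str) -> tuple[int, ...]:
    """Hypothetical helper: extract the leading numeric release components.

    "1.33.0-beta.1" -> (1, 33, 0). Pre-release suffixes are ignored, so a
    pre-release compares equal to its final release.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def check_polars_version(version: str) -> None:
    """Hypothetical helper: raise a readable error for an old Polars."""
    if parse_release(version) < REQUIRED_POLARS_VERSION:
        required = ".".join(str(part) for part in REQUIRED_POLARS_VERSION)
        # RuntimeError is a placeholder; a real backend would likely raise
        # one of its own DBAPI exception types here.
        raise RuntimeError(
            f"Importing a schema requires polars>={required}, found {version}"
        )
```

In a real backend the check would be called with `polars.__version__` before constructing `polars.Schema(handle)`.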
address: int
is_valid: bool
def release(self) -> None: ...
def __arrow_c_schema__(self) -> object: ...
This makes the type checker not complain about polars.Schema(handle).
Figured it would be worth adding the other __arrow_c_... methods to the stubs too.
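As a rough illustration of why declaring the method in the stub can satisfy a type checker: consumers of the Arrow PyCapsule interface are typically annotated structurally, so any object that declares `__arrow_c_schema__` matches. The sketch below assumes such a protocol; `SupportsArrowCSchema`, `FakeSchemaHandle`, and `consume` are hypothetical names, and this is not Polars' actual annotation.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsArrowCSchema(Protocol):
    """Hypothetical structural type for objects exporting an Arrow schema."""

    def __arrow_c_schema__(self) -> object: ...


class FakeSchemaHandle:
    """Stand-in for ArrowSchemaHandle; not the real class."""

    def __arrow_c_schema__(self) -> object:
        # The real method returns a PyCapsule; a plain object stands in here.
        return object()


def consume(schema: SupportsArrowCSchema) -> object:
    """Hypothetical consumer that only relies on the protocol method."""
    return schema.__arrow_c_schema__()
```

Without the method in the stub, a static checker sees `FakeSchemaHandle`-like objects as not matching the protocol, even though the call works at runtime.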
lidavidm
left a comment
Thanks! LGTM, though I wonder about how to handle the type annotations.
Hmm, I think this works but would be inconvenient for people (I'm guessing this will require type assertions all over the place). But I'm not sure of a better way to do this for now. (IIRC associated types are unsupported? Else we could template Cursor on the backend type and make the return type dependent on the backend type's associated type.)
Yes, I agree it will be inconvenient. FWIW, I think the user manually type hinting the variable assignment works too.
> Else we could template Cursor on the backend type and make the return type dependent on the backend type's associated type.
I will give something a crack
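One possible shape for the "template on the backend type" idea, sketched with `typing.Generic` in place of associated types. All class names here (`SketchBackend`, `DictBackend`, `SketchConnection`) are hypothetical stand-ins, a plain `dict` plays the role of the schema type, and the table name is passed where a real driver would supply a schema handle.

```python
from typing import Generic, TypeVar

# Type variable standing in for "the schema type this backend produces".
SchemaT = TypeVar("SchemaT")


class SketchBackend(Generic[SchemaT]):
    """Hypothetical backend base class, generic over its schema type."""

    def import_schema(self, handle: object) -> SchemaT:
        raise NotImplementedError


class DictBackend(SketchBackend[dict]):
    """Toy backend whose "schema" is a plain dict."""

    def import_schema(self, handle: object) -> dict:
        return {"handle": handle}


class SketchConnection(Generic[SchemaT]):
    """Hypothetical connection parameterized by the backend's schema type."""

    def __init__(self, backend: SketchBackend[SchemaT]) -> None:
        self._backend = backend

    def adbc_get_table_schema(self, table_name: str) -> SchemaT:
        # A real implementation would fetch a schema handle from the driver;
        # the table name is used as a stand-in handle here.
        return self._backend.import_schema(table_name)


conn = SketchConnection(DictBackend())
schema = conn.adbc_get_table_schema("test_table_schema")
```

With this layout a type checker can infer `SketchConnection[dict]` from the constructor argument, so `schema` is typed as `dict` without a manual annotation or type assertion. Whether this fits the real `Connection` and `Cursor` classes is an open question in the thread.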
@henryharbeck it appears polars broke their Python docs? https://docs.pola.rs/api/python/stable/reference/objects.inv needs to exist (and actually index polars.Schema) for the API crosslinking to work.
Seems like the crosslinking is an existing issue (pola-rs/polars#7027), and it looks like it is not linking correctly in the current docs either (e.g., fetch_polars). I was able to find the (an?) objects.inv file here: https://docs.pola.rs/api/python/stable/objects.inv
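For context, Sphinx's intersphinx extension looks for `objects.inv` directly under the base URL given in `intersphinx_mapping`. A minimal `conf.py` fragment pointing at the location found above might look like the following; whether this project's docs configuration uses this key name or this URL is an assumption.

```python
# Sketch of a Sphinx conf.py fragment (assumed, not copied from the project).
extensions = ["sphinx.ext.intersphinx"]

intersphinx_mapping = {
    # Sphinx appends "objects.inv" to this base URL when the second
    # element is None.
    "polars": ("https://docs.pola.rs/api/python/stable/", None),
}
```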
Implements import_schema() for the Polars DBAPI backend.

This means Connection.adbc_get_table_schema, Cursor.adbc_execute_partitions, Cursor.adbc_execute_schema, and Cursor.adbc_prepare can now return a PyArrow Schema or a Polars Schema. Unfortunately it does not seem like the return type can be known statically, so I haven't added any overloads. If the ambiguous return type isn't desired, please feel free to suggest alternatives or close this PR.

Support for initializing a polars.Schema from the __arrow_c_schema__ interface was added in Polars 1.32.1 (pola-rs/polars#23879) (which was yanked, so really 1.32.2).

I've left a few comments where I'm unsure whether some alternative code may be preferred.

Closes #3421