feat: Table.to_dicts() #9185

Open
NickCrews opened this issue May 13, 2024 · 1 comment · May be fixed by #10697
Comments

@NickCrews
Contributor

In many places in my code I do something like table.to_pandas().to_records(). It could be nice to have a method that goes straight there. IDK, not a huge deal, the existing solution is really not that bad. But just wanted to get an issue up for discussion.

Based on @mw3i in #5391 (comment):

Sorry for the late reply. Since ibis doesn't have upsert, our team uses the dataset library for inserts/updates/upserts. We do all our normal code in ibis, and then updates/upserts are done with:

import dataset

params = f"{dialect}://{user}:{password}@{url}:{port}/{database}" # <-- sqlalchemy database uri string
with dataset.connect(params) as dbx:
    dbx[tablename].upsert_many(
        list_of_dictionaries, # <-- each dict is a db entry
        column_to_use_as_identifier,
    )
@NickCrews NickCrews changed the title from "feat: Table.to_records()" to "feat: Table.to_dicts()" on Jan 21, 2025
@NickCrews
Contributor Author

NickCrews commented Jan 21, 2025

I think this implementation would work:

    def to_dicts(self, chunk_size: int = 1_000_000) -> Iterable[dict[str, Any]]:
        """Iterate through each row as a `dict` of column_name -> value."""
        for batch in self.to_pyarrow_batches(chunk_size=chunk_size):
            yield from batch.to_pylist()
  1. it uses pyarrow, which I think we want to prioritize more than pandas
  2. it allows streaming, not materializing the whole thing at once. Users can
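
To make the streaming point concrete, here is a rough standalone sketch of the same idea as a free function rather than a method. It assumes only that `table` exposes `to_pyarrow_batches(chunk_size=...)` and that each batch has `to_pylist()`, as in the snippet above. `head_dicts` is a hypothetical helper added purely for illustration; it is not part of ibis.

```python
from collections.abc import Iterator
from itertools import islice
from typing import Any


def to_dicts(table: Any, chunk_size: int = 1_000_000) -> Iterator[dict[str, Any]]:
    """Yield each row of `table` as a dict of column_name -> value.

    Assumption: `table` is an ibis Table, or anything else exposing
    `to_pyarrow_batches(chunk_size=...)` whose batches have `to_pylist()`.
    """
    for batch in table.to_pyarrow_batches(chunk_size=chunk_size):
        # to_pylist() returns one dict per row in the batch.
        yield from batch.to_pylist()


def head_dicts(table: Any, n: int, chunk_size: int = 1_000) -> list[dict[str, Any]]:
    """Hypothetical helper: collect only the first `n` rows.

    Because `to_dicts` is a generator, no batch after the one containing
    row `n` is requested from the table.
    """
    return list(islice(to_dicts(table, chunk_size=chunk_size), n))
```

Callers who want everything in memory can still wrap the generator in `list(...)`; callers who only need a prefix, or who feed rows into something like `upsert_many`, never hold more than one batch of Python dicts at a time.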

NickCrews added a commit to NickCrews/ibis that referenced this issue Jan 21, 2025
@NickCrews NickCrews linked a pull request Jan 21, 2025 that will close this issue
Projects
Status: backlog