WIP: async stream of Arrow record batches from Parquet file #258

kylebarron · 2024-11-13T22:28:06Z

This takes just 1.1s for the stream to start and then 1.0s more for the first record batch to be fetched, whereas downloading the full file takes >60s on my internet connection.

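from time import time

# Assumes HTTPStore and read_parquet_async are already imported.
# Uses top-level await, so run this in a notebook or an asyncio REPL.
t0 = time()
url = "https://overturemaps-us-west-2.s3.amazonaws.com/release/2024-03-12-alpha.0/theme=buildings/type=building/part-00217-4dfc75cd-2680-4d52-b5e0-f4cc9f36b267-c000.zstd.parquet"
store = HTTPStore.from_url(url)
# The store URL already points at the file, so the path within the store is empty.
stream = await read_parquet_async("", store=store)
t1 = time()
# Fetch only the first record batch from the stream.
first = await stream.__anext__()
t2 = time()

print(t1 - t0) # 1.1302871704101562 (time until the stream is ready)
print(t2 - t1) # 1.0420188903808594 (time to fetch the first record batch)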
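
To compare time-to-first-batch against the time to drain a whole stream, the measurement above could be wrapped in a small helper. This is only a sketch: `fake_batch_stream` and `time_stream` are hypothetical names, and the fake stream merely sleeps to simulate one fetch per batch. With the real stream you would pass the object returned by `read_parquet_async` instead, assuming it supports `async for`, as the `__anext__` call above suggests.

```python
import asyncio
from time import perf_counter


async def fake_batch_stream(num_batches: int, delay_s: float):
    """Hypothetical stand-in for a record batch stream (not a library API)."""
    for i in range(num_batches):
        await asyncio.sleep(delay_s)  # simulates one remote fetch per batch
        yield i


async def time_stream(stream):
    """Return (seconds to first item, seconds to exhaustion, item count)."""
    t0 = perf_counter()
    first_s = None
    count = 0
    async for _ in stream:
        if first_s is None:
            # Record latency of the first batch only once.
            first_s = perf_counter() - t0
        count += 1
    return first_s, perf_counter() - t0, count


first_s, total_s, count = asyncio.run(time_stream(fake_batch_stream(5, 0.01)))
print(f"first batch after {first_s:.3f}s, all {count} batches after {total_s:.3f}s")
```

The gap between the two numbers is the point of streaming: a consumer can start working on the first batch long before the last one arrives.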

kylebarron · 2025-03-25T22:31:59Z

Superseded by #313.

WIP: async stream of Arrow record batches from Parquet file

2dcf90a

kylebarron enabled auto-merge (squash) November 13, 2024 22:28

kylebarron disabled auto-merge November 13, 2024 22:28

kylebarron marked this pull request as draft November 13, 2024 22:28

Working example

20738cb

kylebarron mentioned this pull request Mar 25, 2025

Expand Parquet API: async, filtering, projection pushdown #313

Open

13 tasks

kylebarron closed this Mar 25, 2025
