Split batching of remove calls into 10 MB or 100 request chunks #495

Closed

alxmrs opened this issue Aug 31, 2022 · 6 comments · Fixed by #496

Comments

@alxmrs (Contributor) commented Aug 31, 2022

The GCS API supports batching requests to minimize the number of simultaneous client connections. This could help solve the issue related to #493.

https://cloud.google.com/storage/docs/batch#overview
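
For reference, the batch endpoint takes a single multipart/mixed POST whose parts are individual HTTP subrequests. The sketch below shows roughly how a batched delete body could be assembled; the boundary string, bucket/object names, and the build_batch_delete_body helper are illustrative, not gcsfs's actual request-building code.

    import urllib.parse

    BOUNDARY = "batch_boundary_123"

    def build_batch_delete_body(bucket, object_names):
        # Each part is one embedded DELETE subrequest, per the batch docs above.
        parts = []
        for i, name in enumerate(object_names, start=1):
            quoted = urllib.parse.quote(name, safe="")
            parts.append(
                f"--{BOUNDARY}\r\n"
                "Content-Type: application/http\r\n"
                f"Content-ID: <item{i}>\r\n"
                "\r\n"
                f"DELETE /storage/v1/b/{bucket}/o/{quoted} HTTP/1.1\r\n"
                "\r\n"
            )
        parts.append(f"--{BOUNDARY}--\r\n")
        return "".join(parts)

    # The body would be POSTed to https://storage.googleapis.com/batch/storage/v1
    # with Content-Type: multipart/mixed; boundary=<BOUNDARY> (auth omitted).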

@alxmrs (Contributor, Author) commented Aug 31, 2022

Let me restate @dmaring's recommendation about batching (the initial version of this issue was mistitled).

It looks like there is batching of requests for certain calls, like rm! See:

body = "".join(

However, according to the docs linked above, some batched requests will be too big: if the payload is larger than 10 MB or there are more than 100 calls (as happens when working with certain geospatial datasets), the batch will fail.

The fix is to chunk batch calls so they stay within these limits.
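
A minimal sketch of that chunking idea (this is not the actual patch in #496): split the per-object subrequest bodies so no batch exceeds 100 entries or roughly 10 MB of payload, then send each chunk as its own batch request.

    MAX_BATCH_COUNT = 100         # GCS limit of 100 calls per batch request
    MAX_BATCH_BYTES = 10 * 2**20  # ~10 MB limit on total batch payload

    def chunk_subrequests(subrequest_bodies):
        # Yield lists of subrequest bodies that respect both batch limits.
        chunk, chunk_bytes = [], 0
        for body in subrequest_bodies:
            size = len(body.encode("utf-8"))
            if chunk and (
                len(chunk) >= MAX_BATCH_COUNT
                or chunk_bytes + size > MAX_BATCH_BYTES
            ):
                yield chunk
                chunk, chunk_bytes = [], 0
            chunk.append(body)
            chunk_bytes += size
        if chunk:
            yield chunk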

@alxmrs alxmrs changed the title Automatically batch requests Split batching of remove calls into 10 MB or 100 request chunks Aug 31, 2022
@martindurant (Member) commented:
The fix here ...

Is there a linked PR?

@alxmrs (Contributor, Author) commented Aug 31, 2022

:) Not yet! I'm about to write one. Maybe that was a poor choice of words; I meant to describe the approach for a fix.

This is the next thing I am working on today.

@dmaring commented Aug 31, 2022

A few other things to consider with large delete batch requests:

  1. GCP Cloud Storage buckets autoscale based on usage. The initial IO capacity of a storage bucket is 1000 object write requests per second, which includes deleting objects.

"Consequently, if the request rate on your bucket increases faster than Cloud Storage can perform this redistribution, you may run into temporary limits, specifically higher latency and error rates."

https://cloud.google.com/storage/docs/request-rate#auto-scaling

  2. The batch API will return 200 even if subrequests in the batch request fail. However, the 200 response includes the individual subrequest responses, which you can check if desired (see the sketch after this list).

https://cloud.google.com/storage/docs/batch#batch-example-response

  3. Hotspotting can occur if file names are sequential. If possible, avoid sequential file names and use a hash as a prefix, as shown in the following link.

https://cloud.google.com/storage/docs/request-rate#naming-convention
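
On point 2 above, a crude illustration of scanning the embedded subresponses for failures after a 200 batch response; the regex-based parsing and the failed_subrequest_statuses name are only for sketching the idea.

    import re

    STATUS_RE = re.compile(r"^HTTP/[\d.]+ (\d{3})", re.MULTILINE)

    def failed_subrequest_statuses(batch_response_text):
        # Each part of the multipart response carries its own status line,
        # e.g. "HTTP/1.1 204 No Content"; collect any non-2xx codes.
        codes = [int(c) for c in STATUS_RE.findall(batch_response_text)]
        return [c for c in codes if not 200 <= c < 300]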

@slevang (Contributor) commented Jan 15, 2023

I think we could do better here. Still running into this issue when I have multiple workers rm-ing files on a bucket that can't scale up quickly enough. The traceback is:

    self.fs.rm(path, recursive=True)
  File "/opt/conda/lib/python3.9/site-packages/fsspec/asyn.py", line 113, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/fsspec/asyn.py", line 98, in sync
    raise return_result
  File "/opt/conda/lib/python3.9/site-packages/fsspec/asyn.py", line 53, in _runner
    result[0] = await coro
  File "/opt/conda/lib/python3.9/site-packages/gcsfs/core.py", line 1069, in _rm
    raise exs[0]
  File "/opt/conda/lib/python3.9/site-packages/gcsfs/core.py", line 1034, in _rm_files
    raise OSError(errors)
OSError: ['We encountered an internal error. Please try again.']

This looks like the 503 here:
https://cloud.google.com/storage/docs/json_api/v1/status-codes#503_Service_Unavailable

But if so, I can't figure out why it isn't being caught as an HttpError in validate_response.

Relates to #493, #496, #406

@martindurant (Member) commented:
Yes, I agree that looks like something that should be backoff-retriable. I can't immediately see why this isn't hit by the existing HttpError catch.
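
For illustration, the kind of backoff-and-retry wrapper being discussed might look like the sketch below; retry_on_503 and the send_batch placeholder are hypothetical names, not part of gcsfs.

    import asyncio
    import random

    async def retry_on_503(send_batch, retries=5, base_delay=1.0):
        # Retry a coroutine that fails with the transient "internal error"
        # seen in the traceback above, backing off exponentially (with jitter)
        # so the bucket has time to autoscale.
        for attempt in range(retries):
            try:
                return await send_batch()
            except OSError:
                if attempt == retries - 1:
                    raise
                await asyncio.sleep(base_delay * 2**attempt + random.random())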
