Split batching of remove calls into 10 MB or 100 request chunks #495
Let me specifically recreate @dmaring's recommendation about batching (the initial version of this issue is mistitled). It looks like requests are already batched for certain calls, like Line 922 in 7e8058f
However, according to the docs linked above, certain batched requests will be too big. If the payload is larger than 10 MB or there are more than 100 calls (as we have when working with certain geospatial datasets), the request will fail. The fix here is to chunk batch calls so that each one stays within these limits.
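To make the chunking idea concrete, here is a minimal sketch, not the project's actual implementation. The helper name `chunk_requests` is hypothetical, and it assumes each sub-request is already serialized to `bytes`. It only counts the body bytes, so a real fix would probably need some headroom for multipart boundaries and headers; treating 10 MB as 10 * 1024 * 1024 bytes is also an assumption.

```python
from typing import Iterable, Iterator, List

MAX_CALLS = 100                # per-batch call limit quoted in this issue
MAX_BYTES = 10 * 1024 * 1024   # 10 MB payload limit (binary MB is an assumption)


def chunk_requests(
    requests: Iterable[bytes],
    max_calls: int = MAX_CALLS,
    max_bytes: int = MAX_BYTES,
) -> Iterator[List[bytes]]:
    """Group serialized sub-requests into batches that respect both limits.

    Hypothetical helper: sizes are approximated by the length of each body.
    """
    batch: List[bytes] = []
    batch_bytes = 0
    for body in requests:
        size = len(body)
        if size > max_bytes:
            # A single oversized request can never fit in any batch.
            raise ValueError("single request exceeds the batch payload limit")
        # Flush the current batch if adding this request would break a limit.
        if batch and (len(batch) >= max_calls or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(body)
        batch_bytes += size
    if batch:
        yield batch
```

With something like this, a remove of 250 small objects would go out as three batch calls (100, 100, and 50 sub-requests) instead of one oversized call.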
Is there a linked PR?
:) Not yet! I'm about to write one. Maybe this was a poor choice of words; I meant to describe the approach for a fix. This is the next thing I am working on today.
A few other things to consider with large delete batch requests:
"Consequently, if the request rate on your bucket increases faster than Cloud Storage can perform this redistribution, you may run into temporary limits, specifically higher latency and error rates." https://cloud.google.com/storage/docs/request-rate#auto-scaling
https://cloud.google.com/storage/docs/batch#batch-example-response
https://cloud.google.com/storage/docs/request-rate#naming-convention
I think we could do better here. I'm still running into this issue when I have multiple workers.
This looks like the But, I can't figure out why it isn't being caught as an
Yes, I agree that looks like something that should be backoff-retriable. I can't immediately see why this isn't in the existing HttpError catch.
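For reference, "backoff-retriable" could look roughly like the sketch below. This is an illustration under stated assumptions, not the library's existing retry logic: `RetriableError` is a hypothetical stand-in for whatever exception the rate-limit response ends up raising, and `retry_with_backoff` and its parameters are made-up names.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetriableError(Exception):
    """Hypothetical stand-in for a transient, rate-limit style failure."""


def retry_with_backoff(
    func: Callable[[], T],
    retries: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 32.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on RetriableError with exponential backoff and jitter."""
    for attempt in range(retries):
        try:
            return func()
        except RetriableError:
            if attempt == retries - 1:
                # Out of attempts: surface the last error to the caller.
                raise
            # Double the base wait on each attempt, capped at max_delay,
            # then add random jitter so parallel workers do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay))
    raise ValueError("retries must be at least 1")
```

The jitter matters for the multiple-workers case mentioned above: without it, workers that fail together would also retry together.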
The GCS API supports batching requests to minimize the number of simultaneous client connections. This could help solve the issue related to #493.
https://cloud.google.com/storage/docs/batch#overview