Improve retrying downloads code and testing #1559

yarikoptic · 2024-12-20T21:54:16Z

Follow-up to "Continue retrying downloads on retriable statuses" (#1558):

- retry on 429 as well
- handle Retry-After as a date, not just seconds (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Date)
- adjust logging of sleeping upon retry: do it in a single place
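
To illustrate the second item: a Retry-After header value can be either a number of seconds or an HTTP-date. Below is a minimal sketch of handling both forms, assuming only the standard library. `parse_retry_after` is a hypothetical name used for illustration; it is not the function added in this PR.

```python
from __future__ import annotations

from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value: str, now: datetime | None = None) -> float | None:
    """Return seconds to wait for a Retry-After value, or None if unparsable.

    Hypothetical helper for illustration only.
    """
    now = now or datetime.now(timezone.utc)
    try:
        # Form 1: a delay in seconds, e.g. "120"
        return float(int(value))
    except ValueError:
        pass
    try:
        # Form 2: an HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # older Pythons raise TypeError, newer ones ValueError
        return None
    if when.tzinfo is None:
        # assumption: treat a date without zone information as UTC
        when = when.replace(tzinfo=timezone.utc)
    return (when - now).total_seconds()
```

The result can be negative when the date is already in the past; this sketch leaves the handling of such values to the caller.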

We are into holidays already, so it is understood if review does not come. I will reserve the "right" to merge if there is user demand but likely we could just wait. If anyone sees "clearly" how to improve -- feel welcome to push directly.

codecov · 2024-12-20T21:57:10Z

Codecov Report

Attention: Patch coverage is 90.09009% with 11 lines in your changes missing coverage. Please review.

Project coverage is 88.65%. Comparing base (746650f) to head (2482103).
Report is 25 commits behind head on master.

Files with missing lines	Patch %	Lines
dandi/dandiarchive.py	14.28%	6 Missing ⚠️
dandi/dandiapi.py	33.33%	2 Missing ⚠️
dandi/utils.py	92.30%	2 Missing ⚠️
dandi/download.py	88.88%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1559      +/-   ##
==========================================
+ Coverage   88.34%   88.65%   +0.30%     
==========================================
  Files          78       78              
  Lines       10735    10829      +94     
==========================================
+ Hits         9484     9600     +116     
+ Misses       1251     1229      -22

Flag	Coverage Δ
unittests	`88.65% <90.09%> (+0.30%)`	⬆️


dandi/download.py

candleindark

At this point, _check_if_more_attempts_allowed is equally, if not more, about sleeping/waiting before sending out another request. I think it would be beneficial to rename it, or otherwise indicate this behavior.

If we want to improve the retrying further, I think we can consider incorporating some well-known and recommended algorithms for retrying requests to a server while avoiding overwhelming it. https://chatgpt.com/share/676f8290-86cc-800a-9aa8-1ad952282d27.
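
One widely used approach of this kind is exponential backoff with jitter. The comment above does not name a specific algorithm, so the sketch below is an assumption about what such an improvement could look like, not something proposed in this PR. `backoff_delay`, `base`, and `cap` are hypothetical names.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay in seconds before retry number `attempt` (1-based), "full jitter" variant."""
    # The ceiling doubles with each attempt (base, 2*base, 4*base, ...) up to `cap`.
    ceiling = min(cap, base * 2 ** (attempt - 1))
    # Picking a random point below the ceiling spreads out clients that failed together.
    return random.uniform(0, ceiling)
```

The randomization matters as much as the growth: without it, many clients that failed at the same moment would all retry at the same moment too.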

candleindark · 2024-12-28T04:28:35Z

dandi/download.py

+                        retry_after,
+                        exc,
+                        exc_ve,
+                    )
            lgr.debug(
                "%s - download failed due to response %d with "
                "Retry-After=%d: %s, will sleep and retry",


the "Retry-After=[]" in this log is now inconsistent, and sleep_amount may not be a meaningful number if retry_after failed to be parsed into a datetime object

sorry, I do not get what [] you have in mind since it can't be a list... please re-review when I push changes -- maybe the issue is addressed.

I meant that numerous log messages use the "Retry-After=" label inconsistently. In some of them it is set to the value of the "Retry-After" header (understandable); in others it is set to sleep_amount, which can be derived from, but is not exactly, the passed-in "Retry-After" header value, and can default to -1.0.

please check if it is all consistent now -- seems to me it always points to retry_after.

The Retry-After=[]s in log messages, where [] is used to indicate an expression of %d or %s, are consistent within get_retry_after(), but this last one is still inconsistent with them. It prints the int value of sleep_amount, the result of get_retry_after(), while the others print the raw value of the HTTP Retry-After header, which can be a string holding a specific date.

I think we can just remove the use of Retry-After= here. There are plenty of log messages within get_retry_after().

I actually think this log statement should be removed entirely. Because of the containing if-statement, this log statement doesn't always get executed for all responses that have a Retry-After header. However, at the same time, I think the original intent is to execute this log statement for all responses with a Retry-After header.

We are better off if we just replace the entire if statement:

if (sleep_amount := get_retry_after(exc.response)) is not None:
    lgr.debug(
        "%s - download failed due to response %d with "
        "Retry-After=%d: %s, will sleep and retry",
        path,
        exc.response.status_code,
        sleep_amount,
        exc,
    )

with

sleep_amount = get_retry_after(exc.response)

and unindent the next log statement as I recommended below.

thanks! done that, pushed
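
For readers skimming this thread, the control flow it converges on can be sketched roughly as follows. This is a simplified illustration, not the code in dandi/download.py: `choose_sleep` is a hypothetical name, the value that would come from get_retry_after(exc.response) is passed in as an argument, and the random fallback formula is an assumption.

```python
import logging
import random
from typing import Optional

lgr = logging.getLogger("retry-sketch")


def choose_sleep(retry_after_amount: Optional[float], attempt: int) -> float:
    """Pick how long to sleep and log it once, whatever the source of the value."""
    sleep_amount = retry_after_amount
    if sleep_amount is None:
        # No usable Retry-After: fall back to a random delay (assumed formula).
        sleep_amount = random.random() * 5 * attempt
    # Single log statement, outside the if, so it runs in both cases.
    lgr.debug(
        "download failed on attempt #%d, will sleep %f and retry",
        attempt,
        sleep_amount,
    )
    return sleep_amount
```

Keeping the log call outside the conditional is what makes the sleep duration visible regardless of whether the server supplied it.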

* origin/master: Update CHANGELOG.md [skip ci] Constrain version pin Further refine version pinning Pin `dandischema` to require the latest version Start Zarr download as soon as first page of entries is obtained Update CHANGELOG.md [skip ci] Use Python 3.10 to build docs Support dandischema v0.11.0 Fix typo in `dandi move` docstring Don't use version 0.25.5 of `responses` Update URL for DANDI Docs

yarikoptic · 2025-02-03T22:47:20Z

rerunning windows which puked with

FAILED dandi/tests/test_dandiarchive.py::test_parse_api_url_glob[https://gui.dandiarchive.org/#/dandiset/001001/draft/files?location=sub-RAT123/*.nwb-parsed_url1] - requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

candleindark · 2025-02-05T07:12:28Z

dandi/utils.py

+        elif sleep_amount < 0:
+            sleep_amount = None
+            lgr.warning(
+                "date in Retry-After=%r is in the past (current is %r). "


You may also want to use "Retry-After=%s" instead of "Retry-After=%r" if you want the HTTP header value to appear exactly as received, without the quotes.
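
A quick illustration of the difference, using plain %-formatting (which the logging module also applies to its arguments). The header value is made up for the example.

```python
header = "Wed, 21 Oct 2015 07:28:00 GMT"  # made-up header value for illustration

as_repr = "Retry-After=%r" % header  # repr() adds quotes around the string
as_str = "Retry-After=%s" % header   # str() shows the value as received

print(as_repr)  # Retry-After='Wed, 21 Oct 2015 07:28:00 GMT'
print(as_str)   # Retry-After=Wed, 21 Oct 2015 07:28:00 GMT
```

%r can still be the better choice when surrounding whitespace or an empty value needs to be visible in the log.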

candleindark · 2025-02-05T07:51:31Z

dandi/download.py

        lgr.debug(
-            "%s - download failed on attempt #%d: %s, will sleep a bit and retry",
+            "%s - download failed on attempt #%d: %s, will sleep %f and retry",
            path,
            attempt,
            exc,
+            sleep_amount,


I think you can just unindent these lines to move the log statement out of the if-statement, and remove the last log statement at line 1126. See the comments on that last log statement for more details.

ok, makes sense, dedenting the log so we always log how long we are to sleep, not only when it is a random amount due to no Retry-After. I did remove that other log

github-actions · 2025-02-12T19:58:37Z

🚀 PR was released in 0.66.7 🚀

yarikoptic added 2 commits December 20, 2024 15:34

Simplify: a single log.debug invocation for any case which leads to sleep

4cc65c2

Retry-After is provided by 429 (too many) so retry on that too

88f3901

yarikoptic added the patch Increment the patch version when merged label Dec 20, 2024

yarikoptic requested review from jwodder, asmacdo and candleindark December 20, 2024 21:54

jwodder requested changes Dec 20, 2024

Handle Retry-After as a date, and add tests for all logic of checking attempts

6bb3146

candleindark reviewed Dec 27, 2024

candleindark reviewed Dec 28, 2024

Replace UTC with use of timezone.utc for compatibility

8b3f546

datetime.UTC was introduced only in Python 3.11, so type checking, which uses 3.9 ATM, rightfully fails, as does unit testing on 3.9 and 3.10
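
A minimal illustration of the portable spelling: `timezone.utc` works on all supported Python 3 versions, while `datetime.UTC` is an alias for it that exists only from 3.11 on.

```python
from datetime import datetime, timezone

# Portable across Python 3 versions; `from datetime import UTC` fails before 3.11.
now = datetime.now(timezone.utc)
print(now.tzinfo)  # UTC
```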

yarikoptic force-pushed the gh-1556 branch from 6b57379 to 4bbacd9 Compare January 30, 2025 17:42

RF: centralize retry-after parsing logic, add to other places where RETRY_CODES are used

47cc585

It also adds additional checks/protection against odd retry-after results (too long negatives or positives), which should address some concerns raised in prior code review
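
The kind of protection this commit message describes could look roughly like the sketch below. The review snippet elsewhere on this page shows a date in the past being discarded (sleep_amount set to None); the upper bound and the names `sanitize_sleep` and `MAX_SLEEP` are assumptions made for illustration and are not taken from the PR.

```python
from typing import Optional

MAX_SLEEP = 7 * 24 * 3600  # hypothetical upper bound (one week), not from the PR


def sanitize_sleep(
    sleep_amount: Optional[float], max_sleep: float = MAX_SLEEP
) -> Optional[float]:
    """Return sleep_amount if it looks sane, otherwise None (caller falls back)."""
    if sleep_amount is None:
        return None
    if sleep_amount < 0:
        # Retry-After date already in the past: ignore the header.
        return None
    if sleep_amount > max_sleep:
        # Implausibly long wait: ignore rather than stall (assumed policy).
        return None
    return sleep_amount
```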

yarikoptic force-pushed the gh-1556 branch from 4bbacd9 to 47cc585 Compare January 31, 2025 22:40

yarikoptic added 4 commits February 3, 2025 08:53

ENH: remove stale comment (we retry only on selected), fix typo, inform how long to sleep

7d2a858

RF: rename function to reflect that it would sleep as well

948b7f7

Replace "elif" with "if" for clarity

6c9fa51

yarikoptic requested review from candleindark and jwodder February 3, 2025 14:04

jwodder requested changes Feb 3, 2025

yarikoptic and others added 6 commits February 3, 2025 16:37

Make code cleaner and more Pythonic (use None, not "-1")

b4b2ea1

Co-authored-by: John T. Wodder II <[email protected]>

Another pythonic tuneup

3883afc

Co-authored-by: John T. Wodder II <[email protected]>

Use try/except instead of isdecimal test

48eaa2b

Co-authored-by: John T. Wodder II <[email protected]>
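
The commit title above does not state its motivation. One possible reason to prefer try/except over an isdecimal() pre-check is that the two accept different inputs, as this small sketch shows. Both function names are hypothetical.

```python
from typing import Optional


def seconds_via_isdecimal(value: str) -> Optional[int]:
    # Pre-check: rejects anything that is not purely decimal digits.
    return int(value) if value.isdecimal() else None


def seconds_via_try(value: str) -> Optional[int]:
    # Let int() decide: accepts surrounding whitespace and a sign.
    try:
        return int(value)
    except ValueError:
        return None
```

For example, " 120 " and "-5" are rejected by the isdecimal() variant but parsed by the try/except variant, which leaves the decision about negative values to later checks.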

RF+BF: dedent checking of sleep_amount, do not use if_unparsable

6aeaa61

Fix type info -- only Path is allowed

fa60dfd

RF: use "if" not "elif" after a statement with return in the body

0fe489a

yarikoptic force-pushed the gh-1556 branch from 1a653bb to 0fe489a Compare February 3, 2025 22:10

Add tests for odd times encountered

b2736e4

yarikoptic requested a review from jwodder February 5, 2025 01:22

candleindark reviewed Feb 5, 2025

jwodder approved these changes Feb 5, 2025

yarikoptic commented Feb 11, 2025

yarikoptic commented Feb 11, 2025

yarikoptic added 2 commits February 11, 2025 18:42

Apply suggestions from code review: improved docstring, use of %s for Retry-After= etc

2b43e05

Co-authored-by: Isaac To <[email protected]>
Was amended locally later due to a dirty suggestion applied

RF: Do not bother logging yet another debug msg on Retry-After in download

9cd62a2

yarikoptic force-pushed the gh-1556 branch from d8bd104 to 9cd62a2 Compare February 11, 2025 23:44

yarikoptic added 2 commits February 11, 2025 18:48

Dedent log msg to always inform on how long we would sleep for one reason or another

7d236cf

Merge branch 'master' into gh-1556

2482103

yarikoptic added the release Create a release when this pr is merged label Feb 12, 2025

yarikoptic merged commit 527347c into master Feb 12, 2025
26 checks passed

yarikoptic deleted the gh-1556 branch February 12, 2025 19:58

github-actions bot added the released label Feb 12, 2025
