Right now, running the link checker can fail if the docs have a whole lot of links to the same website, causing it to trigger that website's rate-limiting behavior.
When a remote link returns a 429 Too Many Requests response, the link checker should back off from checking links on the same (sub-)domain and try them again later in the run if possible. For example, it might set aside the link that got the first 429 along with any other links to the same domain for about 1 minute, then retry all of the deferred links after that period, backing off again if it starts getting 429 responses again. The actual backoff behavior should follow industry best practices (for example, exponential backoff and honoring the Retry-After header if one is provided).
While waiting to retry rate-limited links, the checker should continue checking links to other domains.
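A minimal sketch of how the per-domain backoff could work, assuming the checker uses requests; the function name check_remote_links and the backoff constants are placeholders rather than existing dactyl code:

```python
import time
from collections import deque
from urllib.parse import urlsplit

import requests

INITIAL_BACKOFF = 60   # seconds to wait after the first 429 from a domain (assumed default)
BACKOFF_FACTOR = 2     # double the wait each time the same domain returns 429 again
MAX_RETRIES = 5        # give up on a link after this many deferrals

def check_remote_links(urls):
    """Check each URL, deferring any domain that responds with 429."""
    queue = deque((url, 0) for url in urls)  # (url, times_deferred)
    blocked_until = {}                       # domain -> time.time() when it may be retried
    broken = []

    while queue:
        url, retries = queue.popleft()
        domain = urlsplit(url).netloc

        # If this domain is cooling off, push the link to the back of the queue
        # and keep working on links to other domains in the meantime.
        if blocked_until.get(domain, 0) > time.time():
            queue.append((url, retries))
            if all(blocked_until.get(urlsplit(u).netloc, 0) > time.time()
                   for u, _ in queue):
                # Only rate-limited domains are left; sleep until the soonest one clears.
                soonest = min(blocked_until[urlsplit(u).netloc] for u, _ in queue)
                time.sleep(max(0, soonest - time.time()))
            continue

        resp = requests.head(url, timeout=10, allow_redirects=True)
        if resp.status_code == 429 and retries < MAX_RETRIES:
            # Honor Retry-After if the server sends one (assuming it's in seconds);
            # otherwise back off exponentially per domain.
            wait = float(resp.headers.get(
                "Retry-After", INITIAL_BACKOFF * BACKOFF_FACTOR ** retries))
            blocked_until[domain] = time.time() + wait
            queue.append((url, retries + 1))
        elif resp.status_code >= 400:
            broken.append(url)

    return broken
```

The key point is that rate-limited links go to the back of the queue instead of blocking the run, so links to other domains keep getting checked while a domain cools off.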
Furthermore, to speed up most runs and reduce the chances of being rate-limited in the future, the link checker should be able to load a cache of successfully checked links, with the timestamp of the previous check for each link. The link checker should automatically discard results that are older than a threshold, but keep the other results so that it doesn't have to check those links again during the current run. The suggested threshold for successfully checked links is 7 days, but it should be configurable in the dactyl config file. Failed results should not be cached, so they are retried every run unless they're added to the (already existing) known broken links setting in the config.
The link checker should also be able to save the results of a run to such a cache file.
Local links should not be subject to any of these behaviors.
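A rough sketch of what loading and saving such a cache could look like, assuming a simple JSON file that maps each successfully checked URL to the Unix timestamp of its last check; the file format and function names here are illustrative, not part of dactyl today:

```python
import json
import time

DEFAULT_MAX_AGE = 7 * 24 * 60 * 60  # suggested default: 7 days, overridable in the dactyl config

def load_link_cache(path, max_age=DEFAULT_MAX_AGE):
    """Load previously verified links, discarding entries older than max_age seconds."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            cached = json.load(f)  # {url: unix timestamp of last successful check}
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    cutoff = time.time() - max_age
    return {url: ts for url, ts in cached.items() if ts >= cutoff}

def save_link_cache(path, ok_links):
    """Write out only the successfully checked remote links with their check times."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(ok_links, f, indent=2, sort_keys=True)
```

Since only successful checks are written, failed links get retried every run, and local links would simply never be added to the cache.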
Bonus points: adapt the following GitHub Actions job to save & load the cached link checking results on subsequent runs.