
Conversation

yusufozgur

Hello,

I needed to use wiki-rag with a rather large wiki. Since I knew the wiki allows bot scraping, I added an option to disable rate limiting, which is essentially a wait of a random number of seconds between requests. The new option is controlled by an environment variable called ENABLE_RATE_LIMITING. By default it is enabled (ENABLE_RATE_LIMITING=true) to discourage users from abusing servers. With rate limiting disabled, testing on a MediaWiki-based wiki such as https://naruto.fandom.com/ reduced the estimated fetching time from 8 hours to 1.5 hours.
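For context, a minimal sketch of how such a toggle can work. The helper name `rate_limit_pause` is a hypothetical one for illustration, and the 2-3 second range is inferred from the review below; the actual wiki-rag code may differ:

```python
# Minimal sketch of an env-controlled rate-limit toggle; helper name and
# delay range are illustrative assumptions, not the actual wiki-rag code.
import os
import random
import time

# Default to enabled so users don't hammer servers unintentionally.
RATE_LIMITING_ENABLED = os.getenv("ENABLE_RATE_LIMITING", "true").lower() == "true"

def rate_limit_pause(min_seconds: float = 2.0, max_seconds: float = 3.0) -> None:
    """Sleep for a random interval between requests, unless disabled."""
    if RATE_LIMITING_ENABLED:
        time.sleep(random.uniform(min_seconds, max_seconds))
```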

Best,

Yusuf


stronk7 commented Aug 30, 2025

Hi @yusufozgur,

thanks for the contribution, it looks 99% perfect. The only detail is that, for consistency, it would surely be a good idea to also apply the new setting here: https://github.com/moodlehq/wiki-rag/blob/main/wiki_rag/load/util.py#L82 (that's where all the namespace pages metadata is fetched, to build the list of target pages).

Also, it would be good to apply the same 2-3 second randomness there, instead of the current 2-5 second one, again just to keep both rate limits the same.
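For illustration only, a hypothetical before/after of what that change might look like at the linked spot; `rate_limit_pause` is the helper sketched above, and the surrounding fetch loop is omitted:

```python
# Hypothetical before/after; the actual code in wiki_rag/load/util.py
# may differ, and `rate_limit_pause` is the assumed helper from above.
import random
import time

# Before (assumed): an unconditional 2-5 second pause between requests.
time.sleep(random.uniform(2, 5))

# After: the shared, env-controlled pause with the same 2-3 second range
# used for page fetching, so both rate limits behave identically.
rate_limit_pause(min_seconds=2.0, max_seconds=3.0)
```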

With that tiny change, I think this can be applied without problems.

Thanks!

PS: Edited to add: one of the planned improvements is to allow incremental loads, so only changed/new/deleted pages are fetched and re-processed. I expect that will dramatically reduce loading times. Later, I'd like to apply the same to the indexing, also making it incremental, but that's less critical because indexing is orders of magnitude quicker than loading.
