Description
One integration I have with tldextract is in a webapp.
I realized I should prime the internal cache before the server forks, so I do the following before a fork (all operations use _tldextractor):
import tldextract
_tldextractor = tldextract.TLDExtract()
_tldextractor("https://github.com/john-kurkowski/tldextract")
That successfully loads the JSON cache file into memory.
I have a few questions based on a review of the code:
Question 1 - the cache loaded into memory should remain there until I invoke update, correct?
Question 2 - the library doesn't seem to store any metadata about the cache or support a cache expiry, correct?
If my understanding in Question 2 is correct, I think it would be a good feature to support the following: allow TLDExtract to be invoked with a cache_expiry kwarg (in seconds); if the PSL cache is stale, attempt a new download/parse/cache. This would only happen if cache_expiry were passed to __init__() or extract(), so long-running processes would not refresh unless explicitly told to.
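To make the proposal concrete, here is a minimal sketch of the staleness check I have in mind. It is not tldextract code: the class name CacheFreshnessGuard, its methods, and the refresh callable are all hypothetical, and it tracks the load time in memory only, since I don't know where the library would want to persist that metadata.

```python
import time
from typing import Callable, Optional


class CacheFreshnessGuard:
    """Hypothetical helper (not part of tldextract) that decides when to refresh."""

    def __init__(
        self,
        refresh: Callable[[], None],
        cache_expiry: Optional[float] = None,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._refresh = refresh            # callable that re-downloads/parses/caches
        self._cache_expiry = cache_expiry  # seconds; None means "never expire"
        self._clock = clock                # injectable so it can be tested
        self._loaded_at: Optional[float] = None

    def is_stale(self) -> bool:
        if self._loaded_at is None:
            return True                    # nothing loaded yet
        if self._cache_expiry is None:
            return False                   # no expiry requested: keep what we have
        return self._clock() - self._loaded_at >= self._cache_expiry

    def ensure_fresh(self) -> bool:
        """Refresh if stale. Returns True when a refresh was performed."""
        if not self.is_stale():
            return False
        self._refresh()
        self._loaded_at = self._clock()
        return True
```

In a webapp the refresh callable would wrap whatever the library exposes for reloading the suffix list (for example the update method mentioned in Question 1), and ensure_fresh() would be called before an extraction. With cache_expiry left as None, the first call loads and nothing is ever refreshed afterwards, which matches the current behaviour for long-running processes.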