Use requests.Session for all .get downloads #48

Open
wants to merge 1 commit into
base: master

mdavis-xyz
Contributor

This change makes all requests to nemweb go through a single requests.Session. The benefit is that the underlying TCP connection is reused instead of being re-established for each request. (For https URLs the TLS session sits on that same connection, so the handshake is not repeated either.) As a result, downloading lots of small files is far faster. (Another benefit is that things like the user agent only need to be set up once.)
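For illustration, the pattern looks roughly like the sketch below. This is not the actual nemosis code: `download_all`, `make_session` and the user agent string are hypothetical names made up for this example, and the only requests features assumed are `requests.Session()`, `session.headers.update(...)` and `session.get(...)`.

```python
from pathlib import Path
from typing import Iterable, List


def download_all(session, urls: Iterable[str], dest_dir: str) -> List[Path]:
    """Fetch every URL through one shared session and save each body in dest_dir.

    `session` is anything that behaves like requests.Session: it needs a
    .get(url, timeout=...) method whose result has .raise_for_status() and
    .content. Reusing one session lets its connection pool keep the
    connection open between requests.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        response = session.get(url, timeout=60)
        response.raise_for_status()
        # Name the local file after the last path segment of the URL.
        path = dest / url.rsplit("/", 1)[-1]
        path.write_bytes(response.content)
        saved.append(path)
    return saved


def make_session(user_agent: str = "example-downloader/0.1"):
    """Build the shared session; headers set here apply to every request."""
    # Third-party dependency, imported lazily so download_all can be
    # exercised with a stand-in session when requests is not installed.
    import requests

    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session
```

A caller would create the session once with `make_session()` and pass it to every `download_all` call, rather than calling `requests.get` per file.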

I tried downloading DUDETAIL with the script below. It took 93s using a session, and 285s without a session.

(I also changed the few http:// URLs to https://. I think this also helps with reusing the TCP+TLS connections. Also, https is available, so we should use it.)

from tempfile import TemporaryDirectory
from time import time

from nemosis import cache_compiler

# Use a throwaway directory so every file is downloaded, not read from an existing cache.
with TemporaryDirectory() as tmp_dir:
    start_t = time()

    # Download and cache all DUDETAIL files for the period.
    cache_compiler(
        start_time='2020/01/01 00:00:00',
        end_time='2024/11/01 00:05:00',
        table_name='DUDETAIL',
        raw_data_location=tmp_dir,
    )
    end_t = time()
    print(f"Took {end_t - start_t}s")
