Local downloads ignored/overwritten #385

Open
jmealo opened this issue May 16, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@jmealo

jmealo commented May 16, 2024

What version of PgOSM Flex are you using?

latest

Docker image

What did you do exactly?

Tried to download north-america.

What did you expect to happen?

The download to finish in less than half an hour.

What did happen instead?

It stalled and barely got past 100 MB.

What did you do to try analyzing the problem?

I downloaded the file using aria2c and then started pgosm-flex again.
Even though the file was chmod 777, it reported "local file not found, downloading file and .md5".

I believe the cause may be that when the .md5 file isn't found but the .osm.pbf file is, pgosm-flex simply overwrites the existing file.

Update: Even with both the .md5 and .osm.pbf files in the data directory, both chmod 777, it still overwrites them.

Either way, pgosm-flex will redownload everything.

@jmealo jmealo changed the title Issue with local data files (slow downloads + blows out existing local files) Local downloads ignored if .md5 file is missing May 16, 2024
@jmealo jmealo changed the title Local downloads ignored if .md5 file is missing Local downloads ignored/overwritten if .md5 file is missing May 16, 2024
@jmealo jmealo changed the title Local downloads ignored/overwritten if .md5 file is missing Local downloads ignored/overwritten May 16, 2024
@jmealo
Author

jmealo commented May 16, 2024

I'm using --force; I don't know if that's part of the behavior.

@rustprooflabs
Owner

The current behavior doesn't care about files named -latest and will overwrite without doing any checks. In hindsight, probably not ideal!

To use a manually downloaded file, replace latest with yyyy-mm-dd. Instead of us-latest.osm.pbf it'd be us-2024-05-16.osm.pbf, and the same for the .md5 file. Then, when running docker exec, add --pgosm-date 2024-05-16. With --pgosm-date along with the region details, it should find your available files and skip the download.
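
For anyone scripting the rename described above, here is a minimal Python sketch. The helpers `dated_filename` and `rename_downloads` are hypothetical and not part of PgOSM Flex. The sketch also assumes the checksum file sits next to the PBF and is named `<region>-latest.osm.pbf.md5`. It only renames the two files; whether the contents of the .md5 file also need to reference the new name is not stated in this thread.

```python
from datetime import date
from pathlib import Path
from typing import List


def dated_filename(name: str, pgosm_date: str) -> str:
    """Swap the '-latest' marker in a file name for a yyyy-mm-dd date."""
    # Validate the date so a typo does not produce an unusable file name.
    date.fromisoformat(pgosm_date)
    if "-latest" not in name:
        raise ValueError(f"{name!r} does not contain '-latest'")
    return name.replace("-latest", f"-{pgosm_date}", 1)


def rename_downloads(data_dir: Path, region: str, pgosm_date: str) -> List[Path]:
    """Rename <region>-latest.osm.pbf and its .md5 file to the dated form.

    Assumption: both files already exist in data_dir under the -latest names.
    """
    renamed = []
    for suffix in (".osm.pbf", ".osm.pbf.md5"):
        source = data_dir / f"{region}-latest{suffix}"
        target = data_dir / dated_filename(source.name, pgosm_date)
        source.rename(target)
        renamed.append(target)
    return renamed
```

Calling `rename_downloads(Path("/path/to/data"), "us", "2024-05-16")` would leave `us-2024-05-16.osm.pbf` and `us-2024-05-16.osm.pbf.md5` in the directory, after which `--pgosm-date 2024-05-16` is passed to the `docker exec` run as described above.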

I thought I had this documented but a quick search didn't bring anything up. I'll work on adding that in soon.

@jmealo
Author

jmealo commented May 17, 2024

  • Would it be possible to do a HEAD request to get the file size? If the size on the server matches the size on disk, we compute an md5 of the local file and check whether it matches the checksum, and if it does, we skip the download.
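
A rough sketch of that check, to make the idea concrete. None of this is existing PgOSM Flex code: the function names are hypothetical, and the sketch uses the standard library's `urllib` for the HEAD request so that it stays self-contained. It assumes the server returns a `Content-Length` header and that the small .md5 file is already on disk in the usual `md5sum` layout (`<digest>  <filename>`).

```python
import hashlib
import urllib.request
from pathlib import Path
from typing import Optional


def remote_size(url: str, timeout: float = 30.0) -> Optional[int]:
    """Return Content-Length from a HEAD request, or None if it is missing."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None


def file_md5(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash the file in chunks so a multi-gigabyte PBF is never held in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(md5_text: str) -> str:
    """Extract the hex digest from md5sum-style text; empty string if none."""
    parts = md5_text.split()
    return parts[0].lower() if parts else ""


def can_skip_download(pbf_path: Path, md5_path: Path,
                      size_on_server: Optional[int]) -> bool:
    """True only if the local file matches the server size and the checksum."""
    if not (pbf_path.is_file() and md5_path.is_file()):
        return False
    # Cheap check first: a size mismatch avoids hashing a large file.
    if size_on_server is None or pbf_path.stat().st_size != size_on_server:
        return False
    return file_md5(pbf_path) == expected_md5(md5_path.read_text())
```

The size comparison runs before the hash because hashing a continent-sized extract takes a while, whereas the HEAD request and `stat` call are nearly free. If the size is unknown or differs, the sketch falls back to downloading.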

@rustprooflabs
Owner

@jmealo I think that would work. Your suggestion got me thinking about possible side effects and the only negative impact I could think of was "the osm_date column would lie!" Hence, #388. I think this change could be made after that situation is improved.

I'll try to start working on #388 in the near-ish future. If you have time and want to submit a PR to check the file size as suggested against HEAD (assuming via requests), I'd be happy to review/merge.

The geofabrik.py module is the first place that will need adjusting. The logic around what happens when a download isn't needed will also need adjusting. Right now it assumes the correct file is named yyyy-mm-dd and will overwrite the -latest file via this code. Some of that logic looks like I wrote it quickly and never looked back!

There may be some other fidgeting required to make this work properly, but that should be a good path forward.

@rustprooflabs rustprooflabs added the enhancement New feature or request label May 23, 2024
@rustprooflabs rustprooflabs added this to the 1.0.2 milestone May 23, 2024
@rustprooflabs rustprooflabs modified the milestones: 1.0.2, 1.0.3 Jul 6, 2024
@rustprooflabs rustprooflabs removed this from the 1.1.0 milestone Aug 6, 2024