Replies: 4 comments 1 reply
It's not being impacted by repetitions because GDAL will cache the result, and you'll get near-instant responses in subsequent reads. I thought maybe we could turn that off with 'GDAL_CACHEMAX=0', but it doesn't affect it, and there are other things going on (the OS can cache and streamline subsequent reads as well, AFAIK). There's a good article about the cache behaviour (by @ctoney of course!): https://usdaforestservice.github.io/gdalraster/articles/gdal-block-cache.html
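To see the cache effect directly, one can time the same full read twice on an open dataset. This is only a sketch using gdalraster; the filename is a placeholder, and it assumes ds$read() and get_cache_used() behave as described in the linked article.

```r
library(gdalraster)

ds <- new(GDALRaster, "ortho.tif")  # hypothetical local file
xs <- ds$getRasterXSize()
ys <- ds$getRasterYSize()

# first read: blocks are decoded from disk and land in the block cache
t1 <- system.time(ds$read(band = 1, xoff = 0, yoff = 0,
                          xsize = xs, ysize = ys,
                          out_xsize = xs, out_ysize = ys))

cat("block cache used (MB):", get_cache_used(), "\n")

# second read: served from the cache, typically near-instant
t2 <- system.time(ds$read(band = 1, xoff = 0, yoff = 0,
                          xsize = xs, ysize = ys,
                          out_xsize = xs, out_ysize = ys))

ds$close()
print(rbind(first = t1, second = t2))
```

If the second timing is dramatically smaller than the first, the repetitions are hitting the cache rather than re-measuring decode speed.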
I wonder if it's worth comparing with results that use the default single-threaded behavior instead? See https://gdal.org/en/stable/drivers/raster/gtiff.html#open-options:
NUM_THREADS is also a creation option for GTiff: https://gdal.org/en/stable/drivers/raster/gtiff.html#creation-options I believe the same applies for COG. Setting the GDAL_NUM_THREADS configuration option affects those open options/creation options even if you don't set them explicitly at the driver level (i.e., at dataset creation or dataset open). I'm wondering if you're seeing some effect of speeding up slow compression and/or decompression with certain algorithms, while multi-threading may have less effect with algorithms that are not so slow to begin with?
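The configuration-option vs. creation-option distinction can be sketched like this in gdalraster (filenames are hypothetical; assuming set_config_option() and createCopy() as documented):

```r
library(gdalraster)

# global configuration option: picked up implicitly at dataset
# open/creation if NUM_THREADS is not set explicitly on the driver
set_config_option("GDAL_NUM_THREADS", "ALL_CPUS")

# explicit per-dataset creation option, passed at the driver level
createCopy(format = "GTiff", dst_filename = "out_zstd.tif",
           src_filename = "in.tif",
           options = c("COMPRESS=ZSTD", "NUM_THREADS=ALL_CPUS"))

# unset to restore GDAL's default (single-threaded) behavior
set_config_option("GDAL_NUM_THREADS", "")
```

Unsetting the configuration option between benchmark runs is what makes a clean single- vs. multi-threaded comparison possible.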
Thanks for the reply. I tried it with the default settings (single-threaded?) and it seems your thoughts are correct, at least for writing. While there are only minimal differences between single- and multi-threaded writing for some algorithms, the effect is quite obvious for e.g. WEBP or ZSTD. Reading seems not much affected by multi-threading (though I'm not sure whether the GDAL_NUM_THREADS setting directly affects reads?). To me it seems that the similar reading times (which made me curious) simply come from the fact that better-compressed images decode more slowly, but the smaller file size offsets this, so in the end it balances out (in my case).
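A benchmarking loop along these lines could make the single- vs. multi-threaded write comparison explicit. This is only a sketch with hypothetical filenames; WEBP in particular has band-count and band-type restrictions in GTiff that a real test would need to respect, so it is left out here.

```r
library(gdalraster)

src <- "ortho.tif"  # hypothetical source file
for (threads in c("1", "ALL_CPUS")) {
  set_config_option("GDAL_NUM_THREADS", threads)
  for (comp in c("LZW", "DEFLATE", "ZSTD")) {
    out <- sprintf("ortho_%s_%s.tif", comp, threads)
    tw <- system.time(
      createCopy("GTiff", out, src, options = paste0("COMPRESS=", comp))
    )
    cat(sprintf("%-8s threads=%-8s write=%.2fs size=%.1f MB\n",
                comp, threads, tw["elapsed"], file.size(out) / 2^20))
  }
}
set_config_option("GDAL_NUM_THREADS", "")  # restore default
```

Recording file size alongside write time also helps explain the balanced read times: slower decode, smaller file.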
I believe the changes you made removed the effect of the block cache, as originally noted by @mdsumner. Also, the write function as you have it now results in closing the dataset when …

I don't think the block cache is affecting relative performance in the current version of your tests, but FWIW, it should be possible to disable it with …
Also, with … the initial read operation in these tests (during …) … Starting with the source data in cloud storage does mimic the real-world use case. But write time in this test is actually measuring the combination of read from cloud plus write locally (i.e., …).

Please note that you could spend a lot of time on these suggestions and still not see much difference. I'm really not sure. Feel free to use or not accordingly.
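One way to separate the two costs is to stage the remote file locally first and time each step on its own. A sketch assuming a /vsicurl/ source (the URL is hypothetical) and gdalraster's vsi_copy_file():

```r
library(gdalraster)

src_remote <- "/vsicurl/https://example.com/ortho.tif"  # hypothetical URL
src_local  <- file.path(tempdir(), "ortho_local.tif")

# time the network transfer by itself
t_fetch <- system.time(vsi_copy_file(src_remote, src_local))

# the write benchmark now no longer includes reading from cloud storage
t_write <- system.time(
  createCopy("GTiff", file.path(tempdir(), "ortho_zstd.tif"),
             src_local, options = "COMPRESS=ZSTD")
)

cat("fetch:", t_fetch["elapsed"], "s; write:", t_write["elapsed"], "s\n")
```

Comparing t_fetch against t_write shows how much of the original "write time" was actually the cloud read.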
I would like to benchmark read and write times depending on image compression, and I tried to achieve this with the following code:
Is it valid here to simply measure the function read_ds()?

Created on 2025-03-19 with reprex v2.1.0
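For reference, a helper of the kind being timed might look roughly like this. This is a hypothetical sketch, not the poster's actual code, and the filename is a placeholder:

```r
library(gdalraster)

# hypothetical helper; reads band 1 of the whole raster into memory
read_ds <- function(filename) {
  ds <- new(GDALRaster, filename)
  on.exit(ds$close())
  ds$read(band = 1, xoff = 0, yoff = 0,
          xsize = ds$getRasterXSize(), ysize = ds$getRasterYSize(),
          out_xsize = ds$getRasterXSize(), out_ysize = ds$getRasterYSize())
}

t <- system.time(read_ds("ortho_zstd.tif"))  # hypothetical file
print(t)
```

Note that repeated calls on the same file within one session may be served from GDAL's block cache, which matters when interpreting repeated timings.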
I did this with a larger orthophoto in different compressions, and according to my benchmark the read time is approximately the same for all compressions. I didn't expect that, and now I am wondering whether the code above is not doing what I expected, there is an error somewhere else in my code, or there really is no pronounced difference.