-
Notifications
You must be signed in to change notification settings - Fork 101
Remove reference to dataset from groups, dimensions and mdarrays #667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Thank you. I'm a bit on the fence about this. We generally try to match the C++ API (where indeed, you don't have to pass a reference to the dataset when computing the statistics, unlike in the C one). Are you using this API? I assume dropping the lifetime makes the user code a lot nicer? |
|
Thank you for taking a look at this! I wasn't aware that it isn't necessary to pass the dataset to the C++ API and if it's a design decision to model that API I understand your hesitation. Regarding my usage: I would like to use the this library to create new multi-dimensional datasets in a project, but this functionality is not yet implemented if I'm not mistaken. It was when I started to look into adding this functionality that I discovered that the references wasn't strictly necessary. Before I would make a PR to add methods that needed to consider the resulting lifetime annotations I wanted to raise this issue. I haven't found any technical problems with implementing the new methods with the references intact. With this said, I still think it would be a nicer API to get rid of the references and the lifetime annotations from the above mentioned structs. In my mind they impose a restriction and a cognitive load and I'm a strong proponent of keeping API's as simple as possible. But I fully respect if you consider backwards compatibility and/or compatibility with the C++ API more important 🙂 |
|
From https://gdal.org/en/stable/user/multidim_raster_data_model.html#objects-lifetime:
So they cannot be used if you have to pass them a dataset in order to work. Hence, this solution also doesn't cut it, yet.
So it seems to keep some kind of reference. Maybe we could use an |
Sorry, would you mind expanding on this? I don't really understand it.
That's a bit like what I'm doing in #677. GDAL can do reference counting, but it's not thread-safe (atomic). And in addition, I think calling But the phrasing in the MD docs is different ("can be used after the dataset has been closed"). |
Well, you cannot use if after the dataset is closed if you require the dataset to be a parameter of your function(s) of that struct. |
|
Yeah, I think the point is that you don't close it, but keep it around and still avoid the lifetimes on the MD types. |
This is how I imagined it. If it's only I tried to understand how the C++ API handles the dataset by browsing the GDAL repository but unfortunately my C++ is a little to rusty (no pun intended 😉). From what I could gather I also thought about a reference-counted pointer but even if it is the right choice I can't motivate it as easily as this PR. If we disregard |
CHANGES.mdif knowledge of this change could be valuable to users.Greetings!
I was looking into adding methods to create multi-dimensional rasters in this library when I stumbled upon the section about object lifetimes in the user-oriented documentation for GDAL's multidimensional raster data model. It states that it's perfectly legal to keep using e.g. groups and mdarrays even after closing the dataset they belong to. For this reason it surprised me that
Group,DimensionandMDArrayin this library all keep a reference to the dataset the where retrieved from.When I tried to remove these references I discovered that
MDArrayinstances keep a C pointer to a dataset so I guess this is one reason to keep the references around. The only place found it used however was in theget_statisticsmethod and since the alternative of passing the dataset to this method as an argument doesn't sound that bad I'm wondering if this was done since we hold the reference anyway.In this PR I removed the references and added some tests that indicate that it's OK to do so. There may be other reasons for keeping these references around, or the pointer in the mdarray instances is the reason, and in that case I will close this PR. But if they were added on intuition without really being necessary I think it can be worth considering to remove them for the improvement in usability (IMHO). It will of course be a break in backwards compatibility.
Some additional notes:
GroupOrDimensionandGroupOrArrayenums since the are no longer used.datasetargument toMDArray.get_statisticssince the mdarray instance no longer keep a pointer to the dataset around.MDArrayno longer keep a reference to neither a group nor a dimension I replaced thefrom_c_mdarray_and_groupandfrom_c_mdarray_and_dimensionmethods with a singlefrom_c_mdarraymethod._datasetargument fromGroup.from_c_group._parentargument fromDimension.from_c_dimension.Please let me know what you think! I realize this is quite a bold PR for a first time contributor 🙂 If you think this is a good idea I will make sure to update
CHANGES.md.