Pure Julia WKT2 to PROJJSON conversion #156
@Omar-Elrefaei this looks promising. Please let me know when it is ready for review. |
Some deep issues with the design here:
My design is more complex in part because it tackles these issues and more. |
Thank you for sharing @nsajko. We appreciate your input. If you can evolve your PR to a final version to show the technical advantages in practice, we will happily consider it. Both PRs are still work in progress, but it is really nice to see the high quality of the attempts already. |
These end up affecting the "datum" JSON items
I think this is functionally mostly there, save for a few rare edge cases. A few of the concerns brought up by nsajko are justified, some are design tradeoffs, and some are non-issues, I think. I'll take time to elaborate soon. |
It's kind of a hack, I won't disagree. But I think it is an elegant one.

```julia
julia> """hello[123, "world"]""" |> Meta.parse
:(hello[123, "world"])

julia> """hello[123, "world"]""" |> Meta.parse |> eval
ERROR: UndefVarError: `hello` not defined in `Main`
...

julia> """hello[123, "world"]""" |> Meta.parse |> GeoIO.expr2dict
Dict{Symbol, Vector{Any}} with 1 entry:
  :hello => [123, "world"]
```

I'm not a security expert, but I doubt there is any potential for a security vulnerability here
While your design is definitely more academically proper, I took a much more test-driven approach. Before starting with anything JSON-related, I ran all 7000+ WKT files in the dataset through that. Definitely a different set of tradeoffs; it is up to @juliohm to decide which is more appropriate for his project. But I'm definitely impressed at how quickly you wrote up that full-fledged parser. It is partially my bad for not having any clarification of design decisions up front in the PR. |
Looking forward to evaluating the pros and cons of each approach. Thank you all for the amazing work and considerations shared so far. It really helps! |
This reverts commit 33fac70. Turns out ArchGDAL is still needed for other functionality
Yes. This is totally independent from the floating point discussion. |
I wonder if testing against PROJ instead of against GDAL might be simpler. |
@Omar-Elrefaei is there any way to adjust the comparison function to check for these alternative representations? I understand that GDAL is arbitrary in these choices, so we don't have much choice other than checking that any alternative matches. Please let me know if you need any additional input from me before making the final adjustments. The suggestion by @nsajko might be interesting to explore also. |
So you want us to be producing projjson that matches GDAL regarding these alternatives?
I'm not sure what you are asking exactly. That feels like complicating it beyond what is needed, to be honest. If maybe you mean that we check that we produced at least one of the alternative representations: that is what happens in the isvalid schema check. |
I believe we can always use inverse_flattening as you suggested. I'm just wondering how tests will pass in this case if the GDAL output has something else. My suggestion was to make the test comparison more involved for the ellipsoid parameters, but I believe that is not trivial to do given the way diffpaths simply scans the tree without special cases.

What about we delay this decision to after the other remaining fixes? |
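For reference, the two ellipsoid representations being discussed are interconvertible: given the semi-major axis a and inverse flattening 1/f that WKT supplies, the semi-minor axis is b = a(1 - f). A minimal sketch of that conversion (the helper name is illustrative, not an existing GeoIO function):

```julia
# Illustrative: derive semi_minor_axis from the (a, 1/f) pair that WKT
# supplies, so an (a, b)-style output from GDAL could be compared to ours.
semi_minor(a, invf) = a * (1 - 1 / invf)

# WGS 84: a = 6378137.0 m, 1/f = 298.257223563
b = semi_minor(6378137.0, 298.257223563)
isapprox(b, 6356752.3142, atol=1e-3)  # true
```

A comparison helper could use this to normalize both trees to the same representation before diffing.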
Yes we can delay the conversation.
I simply ignored inverse_flattening from the comparison. So yes, if there is a hypothetical instance where the numerical value of our inverse_flattening is different from GDAL's inverse_flattening: that will not be caught with the tests, because inverse_flattening is always ignored. While it arguably ought to be caught, I assumed that is a trivial price we are willing to pay. Is that what you want to avoid? Don't ignore it when we can do the comparison, and ignore it when we can't. |
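A diffpaths-style tree scan with an ignore list, as described above, might look like the sketch below. All names and the exact path format are assumptions for illustration, not GeoIO's actual test helper:

```julia
# Sketch: recursively collect the paths at which two parsed JSON trees
# (nested Dicts/Vectors) differ, skipping keys on the ignore list.
function diffpaths(a, b; ignore = ["inverse_flattening"], path = "")
    diffs = String[]
    if a isa AbstractDict && b isa AbstractDict
        for k in union(keys(a), keys(b))
            k in ignore && continue            # skip ignored keys entirely
            pa = get(a, k, nothing)
            pb = get(b, k, nothing)
            append!(diffs, diffpaths(pa, pb; ignore, path = "$path/$k"))
        end
    elseif a isa AbstractVector && b isa AbstractVector && length(a) == length(b)
        for (i, (x, y)) in enumerate(zip(a, b))
            append!(diffs, diffpaths(x, y; ignore, path = "$path[$i]"))
        end
    elseif a != b
        push!(diffs, path)                     # leaf mismatch: record its path
    end
    diffs
end
```

With this shape, "don't ignore it when we can do the comparison" would amount to emptying the ignore list whenever both trees carry the same ellipsoid representation.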
Yes, something along the lines of your last sentence. If we are producing the wrong ellipsoid parameters and the tests don't catch that, we might take too long to uncover the bug.

But if it is something that requires too much more coding, please ignore this idea for now. We can come back to it in a separate issue. |
Thinking about it more carefully... If GDAL doesn't match the ellipsoid parameters with or without conversion of inverse_flattening, then this is a bug in GDAL. We are consuming the official WKT from the database, so there is no way the bug is on our side.

Please ignore my previous comments about trying to compare the parameters like this. I think we can simply ignore them. If we were writing projjson from scratch, then that would be a different story. We are just converting strings, assuming the input string is undeniably correct. |
@Omar-Elrefaei I believe we only have two remaining issues to discuss before the merge:
|
I'll try.
I agree. I didn't add that. It was in the original code and it did annoy me with some silent failures at some point. |
I remember now that this try-catch block is handling the fallback |
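A try-catch fallback of this sort is exactly what can produce the silent failures mentioned above. A minimal sketch of the pattern (all function names here are hypothetical stand-ins, not GeoIO's actual code):

```julia
# Illustrative fallback pattern: try the primary conversion path first,
# and fall back to a secondary path if it throws.
pure_julia_convert(wkt) = error("parse failure for: $wkt")  # stub that always fails
gdal_convert(wkt) = """{"type": "GeographicCRS"}"""         # stub fallback result

function wkt2projjson(wkt)
    try
        pure_julia_convert(wkt)
    catch
        # Silent fallback: any error in the primary path is swallowed here,
        # which is exactly how failures can go unnoticed.
        gdal_convert(wkt)
    end
end
```

Logging or rethrowing unexpected exceptions inside the `catch`, rather than swallowing them, would keep the fallback behavior without hiding bugs.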
Argh!!! 😫 🤯 Caught it.

(Screenshots of the debugging session omitted; the last one shows the difference between pre and post jsonroundtrip. Will push fixes in less than an hour.)

I had used split(...)[1] at some point in the code, which returns a substring. I just need to pass that to string. |

Wow! That is annoying! I don't know why the strings are considered different just because of their types. Maybe a bug in Julia? Maybe expected behavior that is counter-intuitive? Thanks for diving into it! |
…he jsondict I found the reason `jsonroundtrip` was needed for `JSONSchema.validate` to work properly. Turns out there is a bug in JSONSchema that makes it faultily deal with the underlying String behind a SubString
Not a bug in Julia, but in JSONSchema. A string is … Peeking at their code, it seems that they should use … |
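For context, in Julia a `SubString` compares equal (`==`) to the corresponding `String`, so value equality is not the problem; code that checks the concrete type (e.g. `x isa String`) is what treats them differently. A minimal demonstration of the `split` behavior mentioned above:

```julia
# split returns views into the original string, not fresh Strings.
s = split("4326 WGS84")[1]

typeof(s)          # SubString{String}
s == "4326"        # true: == compares contents, not types
s isa String       # false: SubString{String} is an AbstractString, not a String
string(s) isa String  # true: string() materializes a plain String
```

This is why passing the `split(...)[1]` result through `string` fixed the roundtrip mismatch.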
Thank you! Please link the issue here for future reference. |
Thank you again for another amazing contribution @Omar-Elrefaei. That was a tough one. Be assured that this will have a huge impact on our ecosystem ❤️ 🫶 |
Working towards #150
/claim #150
Todo:
- semi_minor_axis / inverse_flattening discrepancy (4275, 4267): find_diff_paths workaround
- test/jsonutils.jl to project style
- Decide whether tests can actually depend on DeepDiffs (found better alternative solution)
- Tools for live development and debugging