Support units in Table Schema #1002
Replies: 28 comments 8 replies
-
@rufuspollock before we move it here, it would be good to discuss how it is to be implemented. Is the proposal to implement this spec, as is, as part of Table Schema?
-
@pwalsh open to suggestions. I was thinking of keeping it separate but adding support for referencing it from Table Schema, but I'm not sure what is best.
-
I've drafted a pattern #607 and started a discussion on the forum https://discuss.okfn.org/t/table-schema-units-pattern/6573
-
Notes from thread in the PR #607 @Stephen-Gates wrote:
@rufuspollock wrote back
@dr-shorthair wrote
-
@Stephen-Gates (/cc @dr-shorthair) I think our aim is to see if we can extract a subset of UCUM that gives us 80/20 and then say: if you want more, go to UCUM. I have to say I think this could / should go in 2 stages:
This keeps things clean. wdyt? And if so, would that mean we could merge #607 as it is?
-
@Stephen-Gates any luck here on progressing this?
-
Sorry, been focused on a new release of Data Curator. Haven't forgotten.
-
Hi @Stephen-Gates (and also @rufuspollock + @pwalsh)! I wanted to let you know that @mbomhoff from Planet Microbe has been working with data packages for their oceanographic data and has been thinking about what units specs would work best for them. I wanted to tag Matt so he can keep updated on the specs units conversation, and also intro y'all in case you want to connect and discuss what units ideas Matt has. Thanks both 😄
-
@lwinfree Thanks Lilly! Our data packages are in https://github.com/hurwitzlab/planet-microbe-datapackages. For the time being we added a custom property
-
@mbomhoff are you ok with the direction the draft PR was taking if we address the comments above?
-
@rufuspollock do you have any concerns about using UCUM given its licence?
-
Item 2. in the license is problematic:
Unfortunately UCUM now appears to be an infrastructure orphan - I've not been able to make contact with Guenther Schadow for a couple of years now. Possibly retired. I'll try again.
-
@Stephen-Gates It looks like the draft spec is capable of describing all of the units that we use in our project, but I think our application falls under the case of using an existing spec. One of the goals of our project is to use ontologies to unify disparate datasets from various sources. To describe a field we supply an Environment Ontology (ENVO, http://environmentontology.org/) purl in the
For us the UO purl provides stronger semantics and some additional info such as aliases (meter, metre) and a text description.
-
@Stephen-Gates any chance to look at this further? It sounds like we have to steer around UCUM atm.
-
UCUM only provides the terminal symbols, and a grammar to combine them into any UoM. So it is a mistake to talk about 80:20 provided by UCUM with respect to some finite set. UCUM probably provides 95%+, but by utilising the grammar. Meanwhile, I have now tracked down the owner of UCUM so it's not dead yet.
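To illustrate the "terminal symbols plus a grammar" point, here is a rough sketch of how a flat unit expression can be broken into atoms and exponents. It is a toy, not a UCUM implementation: it assumes only plain-letter atoms, "." for multiplication, "/" for division, and an integer suffix as exponent, and it ignores prefixes, brackets, parentheses and annotations. The name `parse_unit` is made up for this example.

```python
import re

# Toy subset only (assumption): atoms are plain letters, "." multiplies,
# "/" divides, and an optional signed integer suffix is an exponent.
TERM = re.compile(r"([A-Za-z]+)(-?\d+)?")


def parse_unit(expr: str) -> dict:
    """Return {atom: exponent} for a flat expression such as 'kg.m/s2'."""
    exponents = {}
    sign = 1  # +1 at the start or after ".", -1 after "/"
    pos = 0
    while True:
        match = TERM.match(expr, pos)
        if match is None:
            raise ValueError(f"cannot parse {expr!r} at position {pos}")
        atom = match.group(1)
        power = int(match.group(2) or 1)
        exponents[atom] = exponents.get(atom, 0) + sign * power
        pos = match.end()
        if pos == len(expr):
            break
        if expr[pos] not in "./":
            raise ValueError(f"unexpected {expr[pos]!r} in {expr!r}")
        # Operators apply left to right, so each term takes the sign
        # of the operator immediately before it.
        sign = 1 if expr[pos] == "." else -1
        pos += 1
    # Drop atoms that cancelled out completely.
    return {atom: n for atom, n in exponents.items() if n != 0}
```

With this, `parse_unit("kg.m/s2")` and `parse_unit("kg.m.s-2")` give the same result, which is the sense in which a small set of symbols plus a grammar covers far more than any finite list of unit strings.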
-
Lilly Winfree directed me to this discussion after I asked her how you handle units in your project at PyData Austin. We have been dealing with unit standardization for over a year and can connect you to some unit specs - at least in the medical domain.
-
@Jacob-Barhak this is great info - if you could share your experience and links that would help, esp any key pointers. Your tip re UCUM is also very helpful. We will look at https://clinicalunitmapping.com/
-
So @rufuspollock, all documentation associated with the project is available in the about page: https://clinicalunitmapping.com/about
-
If support for units were added, frictionless-py could output values with their units (maybe optionally) using https://pint.readthedocs.io/en/stable/, provided the units used in the spec are available in the Pint library.
-
Made a comment about units over here, acep-uaf/aetr-web-book-2024#40, but maybe this issue is a more appropriate place, so I've copied it below: another good units standard: https://www.qudt.org/doc/DOC_VOCAB-UNITS.html FYI in case it's useful / interesting:
-
Are you still looking for solutions for units? You may want to check advances in clinicalunitmapping.com. There is now AI behind this that is pretty good already. This project is still in beta, there is still work to do, and its use is limited to demonstrating feasibility, yet it is getting better. You can find recent publications in: What are your unit needs? Why exactly do you need them? Will be happy to talk via video.
-
Also see https://si-digital-framework.org/ from BIPM who are the authority on SI.
-
Hi, what is the status here? Units are crucial for our users. Related to https://discuss.okfn.org/t/table-schema-units-pattern/6573 Another thing I wonder (which could also be an aspect of the rdfType) is whether, at this level, a reference can be added to a procedure (for example a laboratory method) describing how the value was estimated, and maybe an indication of the accuracy of the value? I noticed that unit is included in the camtrap-dp
-
QUDT is well maintained - https://github.com/qudt/qudt-public-repo/releases If you need a new unit, then it is usually turned around in a week or so - https://github.com/qudt/qudt-public-repo/wiki/Unit-Vocabulary-Submission-Guidelines .
-
Hi Guys,
You are dealing with a much larger issue than you think - units are not
standardized despite what people think and despite multiple standards. UCUM
units are pretty limited. Not sure what is your scope, yet if you dive into
this problem enough, you will end up doing what I am doing already for
several years and it is far from trivial.
I mentioned my work before in this thread.
If you need a specific local solution, then it's easy, yet once you see the
real scope and open to other scenarios like it seems you are starting to
do, it will become much harder and a project on its own.
So try to keep it simple.
Good luck
…On Wed, Nov 6, 2024, 16:07 Jakob Voß ***@***.***> wrote:
What's the advantage of unit if it can be any string? This is already
possible:
{
"name": "speed",
"description": "speed in m/s",
"type": "number"
}
I'd welcome a unit field with defined semantics and either UCUM or unit
URIs (such as QUDT) or both seem to fit well:
Example with valid UCUM unit (m/s) *and* unit URI (redundant):
{
"name": "speed",
"unit": "m/s",
"unitType": "https://qudt.org/vocab/unit/M-PER-SEC",
"type": "number"
}
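A rough sketch of what "defined semantics" could buy a consumer, assuming the `unit` and `unitType` properties from the example above (they are a proposal in this thread, not part of Table Schema). The `station` and `depth` fields and the `unit_report` helper are made up for illustration.

```python
import json

# Hypothetical schema: "unit" and "unitType" follow the proposal quoted
# above; the "station" and "depth" fields are invented for this sketch.
SCHEMA_JSON = """
{
  "fields": [
    {"name": "speed", "type": "number", "unit": "m/s",
     "unitType": "https://qudt.org/vocab/unit/M-PER-SEC"},
    {"name": "station", "type": "string"},
    {"name": "depth", "type": "number", "description": "depth in m"}
  ]
}
"""

NUMERIC_TYPES = {"number", "integer"}


def unit_report(schema: dict) -> dict:
    """Map each numeric field name to (unit, unitType), None when absent."""
    report = {}
    for field in schema.get("fields", []):
        if field.get("type") in NUMERIC_TYPES:
            report[field["name"]] = (field.get("unit"), field.get("unitType"))
    return report


report = unit_report(json.loads(SCHEMA_JSON))
# Numeric fields whose unit only lives in free text show up as missing.
missing = [name for name, (unit, _) in report.items() if unit is None]
```

Here `depth` ends up in `missing` even though its description mentions metres, which is the difference between a unit in a free-text string and a unit in a dedicated property.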
-
John and Roll,
Not sure where you are heading, yet since there is no one standard, the
unit and source should be good enough for now - in some cases you will not
have URI - yet it's important to have some level of context.
URI can change in time as well as standards that have versions. In some
extreme cases a unit may mean something different - for example M can be
understood as meter or a Mega like in Mb. So really there is plenty of
confusion and whatever you do will not answer the entire question.
How about using
{
"unit_symbol": "M",
"unit_uri": "https://qudt.org/vocab/unit/M",
"source": "qudt"
}
This will take you a long way and leave the real bad problems to someone
else to deal with. It's not perfect, yet you really don't want to deal with
the unit mess - if you do, please contact me - I have plenty of boring work
I need help with.
Also in this case if the URI does not exist, you at least know the source so
someone can trace it back.
Yet you can also decide that the source is either a URI or plain text and
continue this way. It's your system - you choose.
Whatever you choose - good luck, units are not as trivial as everyone
thinks and you had better know the limitations; this discussion is actually
important.
…On Thu, Nov 7, 2024 at 3:07 AM roll ***@***.***> wrote:
Currently, the recommended way for using existent dictionaries is like:
{"key": "value", "dct:key": "value"
}
Can this approach be used with units?
-
@Jacob-Barhak This solution of three terms is exactly the generic solution we found for terms expected to come from controlled vocabularies. For example, if the root term in question is "eventType", then it could be populated with a string literal, accompanied by the equivalent of your "source" in a term called "eventTypeVocabulary". The third term, "eventTypeIRI", allows a resolvable controlled vocabulary term to be declared directly. This covers all the situations we need: 1) I made up this value for eventType, 2) this value for eventType can be found in that vocabulary (but they don't have resolvable controlled values), and 3) the exact value I am talking about can be found at this IRI.
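As a minimal sketch of the three situations, assuming the naming convention described above (root term, root + "Vocabulary", root + "IRI"): the helper name `classify_term`, the returned labels, and the "unit" records are all made up for illustration, not taken from any existing spec.

```python
def classify_term(record: dict, root: str) -> str:
    """Say which of the three situations a record is in for a root term."""
    value = record.get(root)
    vocabulary = record.get(f"{root}Vocabulary")
    iri = record.get(f"{root}IRI")
    if iri:
        return "resolvable"  # 3) the exact value can be found at this IRI
    if value and vocabulary:
        return "vocabulary"  # 2) value comes from a named vocabulary
    if value:
        return "literal"     # 1) free-text value with no stated source
    return "missing"


# Hypothetical records applying the same convention to a "unit" root term.
examples = [
    {"unit": "M"},
    {"unit": "M", "unitVocabulary": "qudt"},
    {"unit": "M", "unitVocabulary": "qudt",
     "unitIRI": "https://qudt.org/vocab/unit/M"},
]
kinds = [classify_term(record, "unit") for record in examples]
```

The same function works unchanged for "eventType", since only the root name varies.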
-
Ok John,
If you figured it out already, you are on a good route. I really suggest
not to mess with units at a deeper level unless you want your entire
project to lose focus. If you have a simple solution, just go with it. You
can follow my work to see when a better solution appears specifically for
units. It's a slow process, yet it will get there eventually.
Good luck
…On Thu, Nov 7, 2024, 18:09 John Wieczorek ***@***.***> wrote:
@Jacob-Barhak <https://github.com/Jacob-Barhak> This solution of three
terms is exactly the generic solution we found for terms expected to come
from controlled vocabularies. For example, if the root term in question is
"eventType", then it could be populated with a string literal, accompanied
by the equivalent of your "source" in a term called "eventTypeVocabulary".
The third term, "eventTypeIRI" allows a resolvable controlled vocabulary
term to be declared directly. This covers all the situations we need, 1) I
made up this value for eventType, 2) this value for eventType can be found
in that vocabulary (but they don't have resolvable controlled values), and
3) the exact value I am talking about can be found at this IRI.
-
Move the units draft spec http://specs.okfnlabs.org/units/ back to FD specs
/cc @pwalsh