feat: class_urls could be used to point to the class definitions #19

Closed
markdoerr opened this issue Dec 2, 2024 · 3 comments · Fixed by #47

Comments

@markdoerr

During transformation of the model to other targets, class_urls (class_uri in LinkML terms) might help to point back to the original definition and could be added.

@dalito
Member

dalito commented Jan 9, 2025

LinkML model docs for class_uri: URI of the class that provides a semantic interpretation of the element in a linked data context.

@markdoerr @HendrikBorgelt To which ontologies and vocabularies should we preferably link in class_uri, slot_uri, meaning: FOAF, PROV, DCAT, spar/datacite, ..., voc4cat? (I am not a fan of schema.org for science.)

How important is it to map the classes to exactly matching OWL ontology class definitions? Should the mapping be suitable for reasoning? What should we do if there is no suitable mapping?

These are more general questions. We can nevertheless start with gradually adding class_uri, slot_uri, meaning to the schema.

@HendrikBorgelt
Member

The simplest answer first: since the PID data model is not large enough to be valuable for its own logic (a small pid4cat ontology), a mapping must be implemented from the PID data schema into another terminology. As such, it is helpful if we use ontology classes. However, depending on whom we present these classes to, people will not care about their precise definition and will enter whatever suits them best.

| Format | Representation in a PID | Usefulness for PID creators |
|---|---|---|
| plain name | class | Limited, because you need to find the definition somewhere in the documentation. |
| CURIE name | owl:Class | People might suspect that something special is going on, but they are not required to read through the definition. In my mind the best version. |
| plain URI | http://www.w3.org/2002/07/owl#Class | People know that they must read what is going on and will be enticed to align the respective definition. However, for "programming/semantic beginners" this is pure overkill. |
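The CURIE and plain URI rows differ only by a prefix expansion. A minimal Python sketch of that expansion, assuming a hand-written prefix map (the `PREFIXES` dict and the `expand_curie` helper are illustrative names, not part of LinkML or pid4cat):

```python
# Illustrative prefix map; only the owl namespace is taken from the table above.
PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
}


def expand_curie(curie: str, prefixes: dict[str, str] = PREFIXES) -> str:
    """Expand a CURIE such as 'owl:Class' into a full URI.

    Values without a known prefix (plain names, full URIs) are returned unchanged.
    """
    prefix, sep, local_name = curie.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local_name
    return curie


print(expand_curie("owl:Class"))
```

A plain name such as `class` passes through unchanged, which is why it gives a reader no pointer to a definition.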

Regarding the question "To which ontologies and vocabularies should we preferably link...", I would say yes, whatever is suited best. So the list you mentioned sounds fine (FOAF, PROV, DCAT, spar/datacite, ..., voc4cat).

Since I am not entirely sure which entries must have a class_uri, slot_uri, or meaning, I would appreciate it if you could briefly point me to the MD file where I could find them. I would then make a list of feasible terminology based on best matches in LOV.

@dalito
Member

dalito commented Jan 20, 2025

LinkML provides default class_uri and slot_uri if they are not provided explicitly in the schema. From the docs:

If class and slot uris are omitted, then they are still generated behind the scenes, using the default_prefix slot at the schema level.

The URIs are w3id.org-based and work now (#44). I believe this is enough for most classes and slots in pid4cat-model.
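The fallback behaviour quoted from the docs can be pictured roughly as follows. This is a simplified sketch, not LinkML's actual implementation: the `default_uri` helper, the placeholder prefix URL, and the class name are assumptions made for illustration, and LinkML's generators apply their own name normalization.

```python
from typing import Optional

# Hypothetical expansion of the schema-level default_prefix (placeholder URL).
DEFAULT_PREFIX_URL = "https://example.org/pid4cat-model/"


def default_uri(name: str, explicit_uri: Optional[str] = None,
                base: str = DEFAULT_PREFIX_URL) -> str:
    """Return the explicit class_uri/slot_uri if given, else base + name."""
    if explicit_uri:
        return explicit_uri
    return base + name


# No class_uri in the schema: the URI falls back to the default prefix.
print(default_uri("ExampleClass"))
# An explicit class_uri takes precedence over the generated one.
print(default_uri("ExampleClass", explicit_uri="http://www.w3.org/ns/prov#Agent"))
```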

Mapping pid4cat enum values is also tricky. For example, SAMPLE is a classifying category for the type of resource the PID is used for. This does not match the meaning of sample in SOSA: the category is not something on which observations are made. The link to SOSA is better expressed as a close_mapping.

In summary, I suggest to

  • use by default the URIs for classes and slots that LinkML provides automatically.
  • use SKOS-style mappings to provide a mapping without committing to completely reusing a linked data concept.
  • use meaning for enum members if a linked data concept can be reused; otherwise use a mapping. Both mappings and meaning are optional. More mappings can be provided over time. Adding them does not break compatibility with existing data (cf. versioning).
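The distinction between meaning and a SKOS-style mapping could be sketched like this. The dictionary mimics a fragment of a LinkML enum in plain Python; the enum name, the second member, and its URI are made up for illustration, and only the SAMPLE/SOSA relation comes from the discussion above.

```python
# Plain-Python stand-in for a LinkML enum fragment (names partly hypothetical).
RESOURCE_CATEGORY = {
    "SAMPLE": {
        # Related to, but not identical with, sosa:Sample, so no `meaning`.
        "close_mappings": ["http://www.w3.org/ns/sosa/Sample"],
    },
    "EXAMPLE_MEMBER": {  # hypothetical member that fully reuses a concept
        "meaning": "https://example.org/vocab/ExampleConcept",
    },
}


def describe(member: str) -> str:
    """Summarise how an enum member is linked to linked-data concepts."""
    entry = RESOURCE_CATEGORY[member]
    if "meaning" in entry:
        return f"{member} reuses {entry['meaning']}"
    mappings = entry.get("close_mappings", [])
    if mappings:
        return f"{member} is close to {', '.join(mappings)}"
    return f"{member} has no mapping yet"


for name in RESOURCE_CATEGORY:
    print(describe(name))
```

Because both keys are optional, a member with neither simply reports that no mapping exists yet, which matches the idea of adding mappings over time without breaking existing data.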

It is interesting that the large biolink_model does not use class_uri at all and has only a few slots with slot_uri.
