Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to RDF/JS representation #73

Closed
iddan opened this issue Dec 18, 2018 · 35 comments
Closed

Switch to RDF/JS representation #73

iddan opened this issue Dec 18, 2018 · 35 comments
Assignees

Comments

@iddan
Copy link
Contributor

iddan commented Dec 18, 2018

Currently RDFTerms (Literals, IRIs and prefixed IRIs) are represented as strings.
I suggest to use the RDF/JSON representation for RDF values.
Also, prefixed IRIs should be distinguished from IRIs.

import { Parser } from 'sparqljs';
const parser = new Parser();
parser.parse(`
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
SELECT *
WHERE {
    ?person foaf:name ?name .
    ?person foaf:mbox ?email .
}
`)

Should return

{
  "queryType": "SELECT",
  "variables": [
    "*"
  ],
  "where": [
    {
      "type": "bgp",
      "triples": [
        {
          "subject": {
              "type": "variable",
              "identifier": "person"
          },
          "predicate": {
              "type": "prefixed-iri",
              "prefix": "foaf",
              "value": "name"
          },
          "object": {
              "type": "variable",
              "identifier": "name"
          }
        },
        {
          "subject": {
              "type": "variable",
              "identifier": "person"
          },
          "predicate": {
              "type": "iri",
              "value": "http://xmlns.com/foaf/0.1/mbox"
          },
          "object": {
              "type": "variable",
              "identifier": "email"
          }
        }
      ]
    }
  ],
  "type": "query",
  "prefixes": {
    "foaf": "http://xmlns.com/foaf/0.1/"
  }
}

We'd be happy to offer a PR

@RubenVerborgh
Copy link
Owner

RubenVerborgh commented Dec 18, 2018 via email

@iddan
Copy link
Contributor Author

iddan commented Dec 18, 2018

Amazing one! Please do

@iddan
Copy link
Contributor Author

iddan commented Dec 18, 2018

Something I miss in it is an explicit representation for a Prefixed URI. Do you know if it includes it?

@RubenVerborgh
Copy link
Owner

RubenVerborgh commented Dec 18, 2018 via email

@iddan
Copy link
Contributor Author

iddan commented Dec 18, 2018

I think we should use the RDF/JSON as you suggested and additionally define a prefixed URI and variable types. What do you think?

@akuckartz
Copy link

JSON-LD ?

@RubenVerborgh
Copy link
Owner

I think we should use the RDF/JSON as you suggested and additionally define a prefixed URI and variable types.

RDF/JS does have variables in there already.

It purposely does not have a prefixed URI, because that is not a different term type.
That said, here's my solution for this:

  • The DataFactory interface has a method NamedNode namedNode(string value)
  • Call this method with factory.namedNode('http://xmlns.com/foaf/0.1/knows', { prefix: 'foaf', localName: 'knows' })
  • RDF/JS-compliant implementations ignore the second argument
  • Your implementation can use the second argument as it pleases

@RubenVerborgh
Copy link
Owner

JSON-LD ?

The main function of the parse tree is to use it within JavaScript, not exchanging it across implementations. For exchange, I'd recommend the SPARQL query itself.

@RubenVerborgh
Copy link
Owner

Proposed plan of implementation (AKA how I would do it, but your opinion might differ):

  • accept a { factory: … } options object to the SparqlParser constructor
  • make factory default to the N3.js data factory
  • in /test/, create an implementation of the factory that returns the current object
    • this avoids that you have to change tests themselves, only their setup
    • can easily be implemented by adding a wrapper around the N3.js factory: namedNode = value => n3.DataFactory.namedNode(value).id
  • wire up the tests to use the test factory
    • tests should keep on running
  • replace all code to create terms and triples by calls to N3.js
    • tests should keep on running

@RubenVerborgh RubenVerborgh changed the title Update SPARQL.js to use RDF/JSON representation for RDFTerm and Variable Switch to RDF/JS representation Dec 30, 2018
@RubenVerborgh
Copy link
Owner

FYI Linked to this in the README: 836017e

@namedgraph
Copy link
Contributor

How about using SPIN SPARQL syntax as RDF representation? https://www.w3.org/Submission/spin-sparql/

@iddan
Copy link
Contributor Author

iddan commented Jan 13, 2019

Because this is a JavaScript library it makes the most sense to use a spec made for JavaScript representation of RDF. Also, the main goal is to be able to easily distinguish between term types without further string parsing.

@iddan
Copy link
Contributor Author

iddan commented Jan 13, 2019

@RubenVerborgh does the change require changing the JISON syntax file as well?

@namedgraph
Copy link
Contributor

You'll still need a vocabulary for the JSON representation. Why invent the bike? Stand on the shoulders of giants, and all that...
You could use JSON-LD to write SPIN syntax -- would that achieve your objective?

@RubenVerborgh
Copy link
Owner

RubenVerborgh commented Jan 13, 2019

How about using SPIN SPARQL syntax as RDF representation?

We’re not looking for an RDF representation, but rather for a JavaScript in-memory representation.

@RubenVerborgh
Copy link
Owner

does the change require changing the JISON syntax file as well?

Yes, but only to its JavaScript parts that generate terms.

@iddan
Copy link
Contributor Author

iddan commented Jan 13, 2019

I'd need help with that. Unfortunately, I have never written something like BISON or JISON

@RubenVerborgh
Copy link
Owner

It's pretty easy, as all the Jison stuff has been written already and you don't really have to touch it for this. All you need to alter is the JavaScript embedded within the Jison. For example:

      $$ = resolveIRI(expansion + $1.substr(namePos + 1));

will become something like

      var uriString = resolveIRI(expansion + $1.substr(namePos + 1));
      $$ = factory.namedNode(uriString);

@namedgraph
Copy link
Contributor

How big of a task would this be? Is it only the s/p/o that need to be changed or more than that?
If someone could summarize that, I could take a crack at it (not promising anything :))

@RubenVerborgh
Copy link
Owner

Just the s/p/o; summary right above 🙂 Thanks for taking a stab at this.

@simonvbrae
Copy link
Collaborator

simonvbrae commented Jul 23, 2019

@RubenVerborgh
The way it is right now, the array of variables in a JSON representation sometimes contains a string ("*") and sometimes an array of TermTypes. @rubensworks would like to see this changed and has suggested to make it so "variables" is always a list of strings. That way users don't have to check what exactly it contains.
Personally I think that that would be too much work in our codebase for the reward, and it might complicate code in other places like SPARQLAlgebra where elements of the variables array sometimes have to contain other attributes like an expression.

What do you think? Is this worth it? and what solution do you recommend?

@RubenVerborgh
Copy link
Owner

Hmm, interesting case. @rubensworks is there something to be said for a superclass of Term (SparqlTerm?) of which WildcardTerm is an instance?
I’d like to see consistency in our use of Term; not sometimes strings.

@rubensworks
Copy link
Collaborator

I think if we go down the route of creating a new Term type, we may run into issues with packages downstream, as these will not know how to handle this special Term type.
But it should definitely be possible to do though, but this should be documented in detail then.

Simply making all values of the variables array to be strings would not require special handling and documentation, as it will be immediately clear to users that they are not working with Terms anymore.

In any case, there is no one clean solution I'm afraid, other than changing the RDFJS spec, which would also not be very clean.

@RubenVerborgh
Copy link
Owner

Slight difference perhaps: I wrote superclass of Term. It could still implement all Term methods and properties (mainly equals to false), but would (strictly) not be a Term.

@rubensworks
Copy link
Collaborator

Sure, we could try that.
Perhaps @joachimvh could share his insights on the usability of the superclass approach within SPARQLAlgebra.js.

@namedgraph
Copy link
Contributor

Keep in mind that changes will also propagate to TypeScript bindings:
https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/types/sparqljs/index.d.ts

Type structure changes will break code.

@RubenVerborgh
Copy link
Owner

RubenVerborgh commented Jul 23, 2019 via email

@namedgraph
Copy link
Contributor

namedgraph commented Jul 23, 2019

I am not sure actually. They're simply on a Git repo. From this it looks like versioning is not supported: DefinitelyTyped/DefinitelyTyped#7719

Maybe @AlexeyMz could help, he's the author of the typings. Currently they say:

// Type definitions for sparqljs 2.1

@AlexeyMz
Copy link
Contributor

AlexeyMz commented Jul 23, 2019

@namedgraph DefinitelyTyped covers this situation in the README:

If you intend to continue updating the older version of a library's type declarations, you may create a new subfolder (e.g. /v2/) named for the current (soon to be "old") version, and copy existing files from the current version to it.

https://github.com/DefinitelyTyped/DefinitelyTyped#if-a-library-is-updated-to-a-new-major-version-with-breaking-changes-how-should-i-update-its-type-declaration-package

So we should be fine with moving current typings into types/sparqljs/v2/ and put updated typings for [email protected].

@AlexeyMz
Copy link
Contributor

The way it is right now, the array of variables in a JSON representation sometimes contains a string ("*") and sometimes an array of TermTypes. @rubensworks would like to see this changed and has suggested to make it so "variables" is always a list of strings. That way users don't have to check what exactly it contains.

We extensively use Sparql.js in our projects and usually you don't want to treat *-projection the same way as an array of variables (where each one is a term or an expression), e.g. you can't append another term to variables if it's a *. So I think it make sense to treat it separately like in the grammar:

https://www.w3.org/TR/sparql11-query/#rSelectClause

@iddan
Copy link
Contributor Author

iddan commented Jul 24, 2019

This is very true. The tree should reflect the query

@joachimvh
Copy link
Collaborator

Perhaps @joachimvh could share his insights on the usability of the superclass approach within SPARQLAlgebra.js.

SPARQLAlgebra will need rewriting after this update anyway. But in the conversion I immediately convert '*' to the correct variables so I won't be using much of this new term anyway.

@RubenVerborgh
Copy link
Owner

Released as 3.0.0-beta.0 via #80.

@jacoscaz
Copy link
Contributor

jacoscaz commented Jun 22, 2020

Apologies for resurrecting an old issue but it seems to be the most appropriate place for me to point out what follows.

It looks like the typings at https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/types/sparqljs/index.d.ts have not been updated after the transition to RDF/JS. I double-checked with @rubensworks on gitter.im to make sure I wasn't missing anything and this does seem to be the case.

Has any work ever been put on this?

@RubenVerborgh
Copy link
Owner

@jacoscaz Not yet, tracking in #98

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants