Clarify the RDF Message Stream definition #18

Open
Ostrzyciel wants to merge 6 commits into main from GH-10/stream-definition

Collaborator

@Ostrzyciel Ostrzyciel commented Mar 3, 2026

Based on the comments from the last RDF Messages TF meeting: https://www.w3.org/community/rsp/wiki/RDF_Messages_Task_Force/Meeting_2026-02-13

After we merge this, we will review it again at a TF meeting. If everyone is happy, we will close issue #10.

+@jpcik
+@tobixdev
+@keski

@Ostrzyciel Ostrzyciel force-pushed the GH-10/stream-definition branch from 0bbf883 to 6c01530 on March 3, 2026 11:22
spec/messages.bs Outdated
## Scope of RDF Messages ## {#scope}

Issue: Add a diagram illustrating RDF Messages, an RDF Message Stream, stream producers, and stream consumers.
By default, we assume that [=RDF Messages=] in an [=RDF Message Stream=] are not in the same "world". In other words, what is asserted in one message, is not asserted in other messages.
Collaborator

Giving this a bit more thought: this means we will never be able to fall back to a parser that does not support RDF Messages, as the semantics are different. If we did fall back, the contexts would of course be merged automatically, which is not desired. The proposal however currently is to indeed not allow fallbacks in the serializations, as that was initially proposed by having a pragma in comments.

I think calling it a «world» is also not the clearest choice, as the term is not mentioned anywhere in the RDF semantics. I propose to refer to an implicit context instead:


Each RDF message has an implicit context.

Note: An RDF Message Stream consumer can make this context explicit by putting the triples from the message in a named graph (e.g., named with a blank node) and annotating that graph with the formerly implicit context, such as when the message was retrieved, a link to the conceptual stream, etc.
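
A minimal sketch of what the note above describes, using only the Python standard library. Triples are plain tuples, `_:ctx1` stands in for a blank-node graph name, and the `ex:retrievedAt` and `ex:fromStream` properties are hypothetical names picked for illustration; none of them come from the spec. It also assumes the message contains only triples, not named graphs of its own.

```python
from datetime import datetime, timezone


def make_context_explicit(message, graph_name, stream_iri, retrieved_at):
    """Wrap one message's triples in a named graph and annotate that graph."""
    # Each triple (s, p, o) becomes a quad (s, p, o, graph).
    quads = [(s, p, o, graph_name) for (s, p, o) in message]
    # Annotations about the graph go in the default graph (graph = None).
    # ex:retrievedAt and ex:fromStream are hypothetical property names.
    quads.append((graph_name, "ex:retrievedAt", retrieved_at.isoformat(), None))
    quads.append((graph_name, "ex:fromStream", stream_iri, None))
    return quads


message = [("ex:sensor1", "ex:temperature", '"21.5"')]
quads = make_context_explicit(
    message,
    graph_name="_:ctx1",  # blank node naming the graph
    stream_iri="https://example.org/stream",
    retrieved_at=datetime(2026, 3, 3, 11, 22, tzinfo=timezone.utc),
)
```

The message's own triple ends up inside the graph `_:ctx1`, while the two annotations sit outside it and describe that graph.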

@tobixdev tobixdev commented

I am not entirely up to date on this discussion but here are my 2 cents.

What about just calling it distinct datasets/messages?

By default, we assume that the [=RDF Messages=] in an [=RDF Message Stream=] are distinct and should therefore not be combined, unless an RDF Message Stream Profile overrules this default. This means that, by default, what is asserted in one message, is not asserted in other messages.

Cat Example

The implicit context is something to think about. How would we handle messages with named graphs in this scenario? Do we need an exception?
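
To make the proposed default concrete, here is a rough sketch contrasting a consumer that keeps messages distinct with one that merges them, which is roughly what a parser unaware of message boundaries would do. The sample data and the `ex:` names are made up, and messages are modeled as plain sets of triples.

```python
# Two messages from the same stream; each is a set of (s, p, o) triples.
msg1 = {("ex:sensor1", "ex:temperature", '"21.5"')}
msg2 = {("ex:sensor1", "ex:temperature", '"22.0"')}
stream = [msg1, msg2]


def asserted_in(message, triple):
    """Default reading: a triple is asserted only in the message that contains it."""
    return triple in message


def merge(messages):
    """Union of all messages, i.e. the message boundaries are discarded."""
    merged = set()
    for m in messages:
        merged |= m
    return merged


first_reading = ("ex:sensor1", "ex:temperature", '"21.5"')
kept_separate = [asserted_in(m, first_reading) for m in stream]
merged = merge(stream)
```

With the messages kept distinct, the first reading holds only in the first message. After merging, the sensor appears to have both temperatures at once, which is the outcome the default is meant to rule out.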

Collaborator Author

> The proposal however currently is to indeed not allow fallbacks in the serializations, as that was initially proposed by having a pragma in comments.

Yes, that is the case. The new serializations are to use a completely new (non-backward-compatible) syntax.

> Each RDF message has an implicit context.

I'm not sure this is the right way. Named graphs pull a very large part of the RDF spec into the discussion here, and I'm not convinced that is required to explain what we need to explain. It also invokes the demons of the ancient RDF 1.1 "dataset semantics" discussion. Let's apply Occam's razor here.

> What about just calling it distinct datasets/messages?

Pieter has a point in that we are inventing some new terms here, so I also double-checked what the existing specs / W3C notes say about this.

RDF 1.2 Semantics only discusses datasets and "interpretations". The phrase "interpretation scope" never occurs in this document, but in my opinion it would be at least understandable.

RDF 1.1: On Semantics of RDF Datasets uh... basically says nothing on the subject, at least I can't find anything.

So, I really don't have a better idea other than what you @tobixdev proposed. I will put that in the spec in a second.

Collaborator Author

Done – what do you think of it now?

@tobixdev tobixdev left a comment

Thank you for writing this! This is definitely an improvement.

I've left some remarks.


<div class="example">
An HTTP server exposes a file at `https://example.org/stream`. This file contains an [=RDF Message Log=] serialization of an [=RDF Message Stream=]. A client can consume the stream by sending an HTTP GET request to that URL, and parsing the response as an [=RDF Message Stream=].

In this example, the server is the **stream producer**, and the client is the **stream consumer**. The stream protocol is HTTP. The RDF Message Stream only exists over the course of the HTTP request.

> The stream protocol is HTTP. The RDF Message Stream only exists over the course of the HTTP request.

If I understood you correctly, storing the HTTP GET result to disk and later reading it back from that log yields a different stream instance, even though the two are equivalent, right?

Collaborator Author

Yes, that is the idea. Should I add this clarification to the spec?
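
One way to picture the distinction, as a small sketch rather than anything the spec defines: the hypothetical `StreamInstance` class below models a single occurrence of a stream (one HTTP response, or one read of a saved log), and `equivalent_to` is an assumed helper that compares only the message sequences.

```python
class StreamInstance:
    """One concrete occurrence of a stream, e.g. one HTTP response or one file read."""

    def __init__(self, messages):
        self.messages = tuple(messages)

    def equivalent_to(self, other):
        """Two instances are equivalent if they carry the same messages in the same order."""
        return self.messages == other.messages


# The same serialized log, consumed twice: once over HTTP, once from disk.
log = [
    frozenset({("ex:sensor1", "ex:temperature", '"21.5"')}),
    frozenset({("ex:sensor1", "ex:temperature", '"22.0"')}),
]
over_http = StreamInstance(log)
from_disk = StreamInstance(log)

same_content = over_http.equivalent_to(from_disk)
same_instance = over_http is from_disk
```

Here `same_content` is true while `same_instance` is false: the contents match, but each read is its own stream instance.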

spec/messages.bs Outdated
- The IoT thermometer is a **stream producer** that produces an [=RDF Message Stream=] by publishing RDF Messages to the MQTT broker, which acts as the **stream consumer**.
- For each individual client, the MQTT broker is a **stream producer** that produces an [=RDF Message Stream=] by sending RDF Messages to the client, which acts as the **stream consumer**.

For example, for 5 clients subscribing to the topic, there would be 6 different [=RDF Message Streams=]: one from the IoT thermometer to the MQTT broker, and one from the MQTT broker to each of the 5 clients. Each of these streams may have different messages in them, due to the Quality-of-Service settings and the clients subscribing to the stream at different points in time.

We're getting rather detailed here but I think this does not allow us to fully model the QoS 0 semantics of MQTT.

If messages get dropped during the communication, the producer has a different "view" of the stream than the consumer. The dropped message is included in the producer's view of the stream but not in the consumer's view. Under our definition of RDF Message Streams, even though it's the same stream, the producer sees a different sequence of messages than the consumer. To avoid this "two views" problem, we would need two message streams for each connection, one modeling the producer's view and one modeling the consumer's view, but that quickly gets complicated.

I don't know if we should mention that but maybe it's something to keep in mind.

I think MQTT QoS 0 does not guarantee in-order delivery. This would be another point that leads to the "two views" of the same stream problem.

Collaborator Author

Yeah I was wondering about both of these issues, but tried not to overcomplicate the explanation. I added a note on both issues – could you have a look now?
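
For illustration, a toy model of the fan-out and the "two views" issue discussed in this thread. It does not use a real MQTT client; `deliver` is a hypothetical helper, message loss is simulated with an explicit set of dropped positions, and reordering is not modeled at all.

```python
def deliver(sent, dropped_indexes=()):
    """Return the consumer's view: the sent messages minus those lost in transit."""
    return [m for i, m in enumerate(sent) if i not in dropped_indexes]


# Producer's view: everything the thermometer published.
published = ["m1", "m2", "m3", "m4"]

# Stream 1: thermometer -> broker (nothing lost in this run).
broker_view = deliver(published)

# Streams 2..6: broker -> each of the 5 clients.
client_views = {
    "client-1": deliver(broker_view),
    "client-2": deliver(broker_view, dropped_indexes={1}),  # "m2" was lost
    "client-3": deliver(broker_view[2:]),                   # subscribed late
    "client-4": deliver(broker_view),
    "client-5": deliver(broker_view, dropped_indexes={0, 3}),
}

stream_count = 1 + len(client_views)
```

The count matches the 6 streams in the spec example, and clients 2, 3, and 5 each end up with a sequence that differs from what the broker sent, which is the producer-versus-consumer mismatch raised above.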

@Ostrzyciel Ostrzyciel force-pushed the GH-10/stream-definition branch from 8eb2c4e to 88853d4 on March 10, 2026 13:18