Transport protocol update take 2 #333

hannahhoward · 2022-06-03T23:45:03Z

What

This is an addition to #331, aiming to set us up for multiple transport protocols AS WELL AS protocol detection & transport detection.

How

#331 introduced a wrapped message format containing the transport information, which enables multiple transports to share the libp2p protocol for sending messages.

However, it does not actually help us do protocol negotiation with multiple transports, or transport versioning

Specifically:

I want to know what potential data transfer message formats another data transfer node supports (for an evolving message format)
I want to know whether a data transfer node can perform a transfer over a specific transport
I want to known what potential versions of a well known transport another node supports (for an evolving transport)
I want to connect to another node over a given transport with the latest possible message formats I support and latest transport versions I support, but fall back to those I support for compatibility

Here's what I implemented here:

Introduced a transport version

What is a transport version? A transport version indicates a way of using a transport protocol to facilitate data transfer. For example, in go-graphsync we've ended up switching graphsync extensions over time to support request resuming. At one time, we supported Do-Not-Send-Cids, then we switched to using Do-Not-Send-First-Blocks. It was actually this change that led to the unusual introduction of go-data-transfer 1.2.0 libp2p protocol even though there was no change in the message format (hence the confusingly name "message1_1prime" folder -- our messages haven't changed since 1.1.0, but we upped the libp2p protocol cause how the graphsync transport changed). A transport version doesn't say anything about the version of messages encoded within, but if there are specific mechanics to how a transport works that change, this allows us to version it independently.

While this might seem like another layer of versioning on top of already complex versioning scheme, it actually makes sense so that it's possible to produce clear specifications for how data transfer and a transport protocol interact to a potential implementer.

So "data transfer graphsync transport 1.0.0" (starting at 1.0.0 as the only supported version):

Use Do-Not-Send-First-Blocks for resumes
Use fil/data-transfer/incoming-request/1.1, fil/data-transfer/outgoing-block/1.1, and fil/data-transfer/1.1 extensions to encode messages (why these terrible names? oy cause awful haphazardly evolving design with no intention behind it)

Handle transport support and versioning negotiation by adding them to the protocol string for Libp2p

This is potentially controversial, as it technically means each transport gets its own libp2p stream per version, which implies a fairly rapid enumeration of protocols. In practice it seems to work pretty well given a very clear naming scheme /data-transfer/~message version~/~transport-name~/~transport-version~. This enables us to support an evolution of the Protocolmethod on the network interface that tells me:Given a peer and a transport, tell me if they support that transport, what their best message version is, and what their best transport version is"

The alternative here is just to make a seperate transport-query protocol. I'm torn on this. If we do that, we always need a message round trip between peers to tell us what is supported, which seems poorer for performance (in the other scheme, the peer store conveniently stores the important information about connection after first connection).

Keep Wrapping Messages Including Adding Transport Versions

While encoding transport information in the protocol name actually means we don't necessarily need wrapped messages, the cost to do so is minimal. Moreover, it has good usages for other potential transports that may want to implement how they operate without the libp2p protocol negotiation offered.

move the network to a subdirectory, also cleanup usage of all selector, move gstestdata to the itest directory

introduces a new wrapped protocol that encodes the transport id so that we can support multiple transports using the network protocol

Support multiple transports on the libp2p protocol, via different protocol naming, and using libp2p to do protocol negotiation

transport/helpers/network/libp2p_impl.go

rvagg · 2022-06-06T05:23:10Z

transport/helpers/network/libp2p_impl.go

+		messageVersion, _ := MessageVersion(dtProtocol)
+		if isLegacyProtocol(dtProtocol) {
+
+			if transportID == datatransfer.LegacyTransportID {


There's no check here for if LegacyTransportVersion is in the list of versions, so does that mean that any registration of "graphsync" will set up a legacy protocol regardless of what versions are provided? so there's no way to use this with "graphsync" without also wiring up for legacy? Probably not an immediate concern but we'll presumably want to phase that out at some point in the future.

rvagg

Seems reasonable to me, thanks for the detailed explanation. On your last point, of doubling up on the transport and versioning information being represented in messages and protocol negotiation - it's not entirely free, but can be minimal cost. But can you expand on this?

Moreover, it has good usages for other potential transports that may want to implement how they operate without the libp2p protocol negotiation offered.

Can you enumerate some usages that we may end up with, do you see any on the immediate horizon or are you just anticipating hypotheticals?

Regarding the message format, I wouldn't mind trimming it down a bit, currently it'll be like:

{
  "ID": "graphsync",
  "TV": "1.2.0",
  "Msg": { ... }
}

Perhaps we should go for the single-key union map style again for this, make it:

{
  "graphsync/1.2.0": { ... }
}

I mentioned this in multiformats/js-multiformats#161 too, which just has the version though. We could use union with a representation keyed to do this, so we get to enforce a single-key union with precise matches of transports and versions we support at the message decoding layer. We'd then be pushing the verbosity up into the Go type layer, so we'd have things like type Graphsync120Message struct { Message } (which we could make an interface for with Transport() and Version() methods that give us that information).

hannahhoward · 2022-06-07T19:30:57Z

@rvagg

Can you enumerate some usages that we may end up with, do you see any on the immediate horizon or are you just anticipating hypotheticals?

Honestly, in the back of my mind is always HTTP and whether we could make data transfer work with it.

hannahhoward · 2022-06-07T19:50:56Z

@rvagg I'm open to the keyed union approach, but can we check in about how bindnode works with keyed unions? If I'm reading the code right, if we have a schema like this:

type WrappedMessage union {
  | Bitswap100Message "bitswap/1.0.0"
  | Graphsync120Message "graphsync/1.2.0"
} representation keyed

what we end up in go with is something like this no:

type Bitswap100Message struct { Message }
type Graphsync120Message struct { Message }
type WrappedMessage struct {
    Graphsync120Message *Graphsync120Message
    Bitswap100Message *Bitswap100Message
 }

That seems mildly unwieldy, though perhaps we can just thank go's type system for not supporting unions well.

Anyway let's talk more about that.

Co-authored-by: Rod Vagg <[email protected]>

rvagg · 2022-06-09T07:59:10Z

That struct might be unwieldily as it is but I don't think it's hard to fix if we have strong enough assumptions about it being a single-property union:

type Bitswap100Message struct { Message }
type Graphsync120Message struct { Message }
type WrappedMessage struct {
    Graphsync120Message *Graphsync120Message
    Bitswap100Message *Bitswap100Message
 }

func (wm *WrappedMessage) Message() Message {
  if wm.Graphsync120Message != nil {
    return wm.Graphsync120Message.Message
  }
  return wm.Bitswap100Message.Message
}

func (wm *WrappedMessage) ProtocolName() string {
  if wm.Graphsync120Message != nil {
    return "graphsync"
  }
  return "bitswap"
}

func (wm *WrappedMessage) ProtocolVersion() string {
  // same thing
}

Possibly some panics inserted where the assumptions fail would be appropriate too.

Then we just evolve the parent through its methods as we add protocols and have to deal with compatibility concerns.

hannahhoward added 4 commits June 2, 2022 18:29

refactor(network): move network to subdirectory

04cf71f

move the network to a subdirectory, also cleanup usage of all selector, move gstestdata to the itest directory

feat(network): introduce wrapped protocol

0452d8f

introduces a new wrapped protocol that encodes the transport id so that we can support multiple transports using the network protocol

style(lint): fix imports

f90542e

fix(rebase): fix issues after rebase

4478974

hannahhoward requested review from rvagg and aschmahmann June 3, 2022 23:45

feat(network): transport versioning and detection

f4b0b9b

Support multiple transports on the libp2p protocol, via different protocol naming, and using libp2p to do protocol negotiation

hannahhoward force-pushed the feat/refactor-transport-protocol-update-part-2 branch from b82ce2e to f4b0b9b Compare June 3, 2022 23:51

rvagg reviewed Jun 6, 2022

View reviewed changes

transport/helpers/network/libp2p_impl.go Outdated Show resolved Hide resolved

rvagg reviewed Jun 6, 2022

View reviewed changes

transport/helpers/network/libp2p_impl.go Outdated Show resolved Hide resolved

rvagg reviewed Jun 6, 2022

View reviewed changes

rvagg approved these changes Jun 6, 2022

View reviewed changes

hannahhoward and others added 3 commits June 7, 2022 13:36

Update transport/helpers/network/libp2p_impl.go

dff6c88

Co-authored-by: Rod Vagg <[email protected]>

Update transport/helpers/network/libp2p_impl.go

3bd5161

Co-authored-by: Rod Vagg <[email protected]>

fix(network): add versions check for legacy transport

6b581f2

hannahhoward merged commit 6b581f2 into feat/refactor-transport-part-1 Jun 16, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Transport protocol update take 2 #333

Transport protocol update take 2 #333

hannahhoward commented Jun 3, 2022 •

edited

Loading

rvagg Jun 6, 2022

rvagg left a comment

hannahhoward commented Jun 7, 2022

hannahhoward commented Jun 7, 2022

rvagg commented Jun 9, 2022 •

edited

Loading

Transport protocol update take 2 #333

Transport protocol update take 2 #333

Conversation

hannahhoward commented Jun 3, 2022 • edited Loading

What

How

Introduced a transport version

Handle transport support and versioning negotiation by adding them to the protocol string for Libp2p

Keep Wrapping Messages Including Adding Transport Versions

rvagg Jun 6, 2022

Choose a reason for hiding this comment

rvagg left a comment

Choose a reason for hiding this comment

hannahhoward commented Jun 7, 2022

hannahhoward commented Jun 7, 2022

rvagg commented Jun 9, 2022 • edited Loading

hannahhoward commented Jun 3, 2022 •

edited

Loading

rvagg commented Jun 9, 2022 •

edited

Loading