@@ -334,10 +334,99 @@ the distance to retrieve more nodes from adjacent k-buckets on `B`:
334
334
Node ` A ` now sorts all received nodes by distance to the lookup target and proceeds by
335
335
repeating the lookup procedure on another, closer node.
336
336
337
+ ## Hole-punching Asymmetric NATs
338
+
339
+ This section explains the hole punching mechanism built into the protocol, which is
340
+ enabled by the [ RELAYINIT] and [ RELAYMSG] message types. Compared to other protocol
341
+ messages, these require deeper interaction with the session layer in order to ensure the
342
+ hole punching mechanism operates safely.
343
+
344
+ In the examples below, we assume that node ` A ` (Alice) has the goal of sending a request
345
+ message (e.g. FINDNODE) to node ` B ` (Bob).
346
+
347
+ Bob operates behind a network address translation (NAT) layer, and is unable to receive UDP
348
+ packets from Alice initially. However, Bob has previously communicated with a third node
349
+ ` R ` (Relay), and is able to receive incoming packets from the Relay node. We further
350
+ assume Bob's NAT is 'asymmetric', i.e. the IP/Port of Bob's packets will be the same
351
+ regardless of the host they are sent to. The hole-punching mechanism does not work for
352
+ 'symmetric' NAT where every destination host has a unique mapping.
353
+
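The difference between the two NAT kinds can be illustrated with a toy model. This is only a sketch: `ToyNat`, its `per_destination` flag, the starting port and all addresses are made up for illustration and do not describe any real NAT implementation.

```python
from itertools import count


class ToyNat:
    """Toy model of NAT port mapping; not a real NAT implementation."""

    def __init__(self, wan_ip, per_destination):
        self.wan_ip = wan_ip
        # per_destination=True models the 'symmetric' case from the text.
        self.per_destination = per_destination
        self._ports = count(40000)  # arbitrary starting port for the sketch
        self._mappings = {}

    def external_endpoint(self, lan_endpoint, destination):
        # An 'asymmetric' NAT keys the mapping on the LAN endpoint only, so
        # every destination host sees the same external IP/port.
        key = (lan_endpoint, destination) if self.per_destination else lan_endpoint
        if key not in self._mappings:
            self._mappings[key] = next(self._ports)
        return (self.wan_ip, self._mappings[key])


bob_lan = ("192.168.1.10", 9000)
relay = ("198.51.100.7", 9000)
alice = ("203.0.113.5", 9000)

asymmetric = ToyNat("192.0.2.1", per_destination=False)
symmetric = ToyNat("192.0.2.1", per_destination=True)

# Asymmetric: Relay and Alice observe the same external endpoint for Bob.
same = asymmetric.external_endpoint(bob_lan, relay) == asymmetric.external_endpoint(bob_lan, alice)
# Symmetric: each destination gets its own mapping.
differs = symmetric.external_endpoint(bob_lan, relay) != symmetric.external_endpoint(bob_lan, alice)
```

In the asymmetric case, the endpoint under which Bob is known to the Relay is also the endpoint Alice has to use, which is what the mechanism relies on.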
354
+ Alice may or may not be behind a symmetric NAT.
355
+
356
+ Finally, it is assumed that a common lower bound on the lifetime of NAT mappings is 20
357
+ seconds, and that mappings will be refreshed when any packet is sent through them. For
358
+ more background information about common NAT setups, please consult [ RFC4787] , [ RFC6146]
359
+ and [ this paper] [ natpaper ] .
360
+
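A minimal sketch of how a node might track the lifetime assumption above. The 20 second value comes from the text; the `MappingClock` class and its method names are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
NAT_MAPPING_LIFETIME = 20.0  # seconds; the lower bound assumed in the text


class MappingClock:
    """Remembers when we last sent a packet to each peer (hypothetical helper)."""

    def __init__(self):
        self.last_sent = {}

    def packet_sent(self, peer, now):
        # Any outgoing packet refreshes the NAT mapping towards that peer.
        self.last_sent[peer] = now

    def mapping_alive(self, peer, now):
        # Conservative check: only trust the mapping within the lower bound.
        sent = self.last_sent.get(peer)
        return sent is not None and now - sent < NAT_MAPPING_LIFETIME


clock = MappingClock()
clock.packet_sent("relay", now=100.0)
```

A node behind NAT could use such a check to decide when a keep-alive packet towards its relay is due. Real NATs may keep mappings longer than the assumed bound.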
361
+ ### Message flow
362
+
363
+ In the wire protocol, there are four packet types. Since the NAT-related messages require
364
+ deeper integration with the packet/session layer, the packet type is explicitly shown for
365
+ each message in the diagram below. We use these abbreviations:
366
+
367
+ - ` whoareyou ` : [ WHOAREYOU packet]
368
+ - ` m(X) ` : [ message packet] containing request ` X `
369
+ - ` H(X) ` : [ handshake message packet] containing request ` X `
370
+ - ` s(M) ` : [ session message packet] containing message ` M `
371
+
372
+ ![ Diagram] ( ./img/nat-hole-punching-flow.svg ) <!-- source: ./img/nat-hole-punching-flow.mermaid -->
373
+
374
+ Preconditions: Bob is behind NAT. Bob is contained in Relay's node table, they have an
375
+ established session, and Bob has sent a packet to Relay in the last ~20 seconds, so Relay
376
+ can get through Bob's NAT.
377
+
378
+ As part of a recursive query for peers, Alice sends a [ FINDNODE] request to Bob, whose ENR
379
+ it just received from the Relay. By making an outgoing request to Bob, if Alice is behind
380
+ NAT, Alice's NAT adds a mapping `(Alice's-LAN-ip, Alice's-LAN-port, Bob's-WAN-ip,
381
+ Bob's-WAN-port, entry-lifetime)`. This means a hole is now punched for Bob in Alice's NAT
382
+ for the duration of ` entry-lifetime ` . However, Alice's request is not delivered as Bob is
383
+ behind NAT.
384
+
385
+ Alice detects the timeout, and initiates an attempt to punch a hole in Bob's NAT via
386
+ Relay. Alice resets the request timeout on the timed-out [ FINDNODE] message, wraps the
387
+ message's nonce in a [ RELAYINIT] notification and sends it to Relay. The notification also
388
+ contains Alice's ENR and Bob's node ID.
389
+
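Alice's side of this step might look roughly like the sketch below. The three notification fields follow the text (`nonce`, `initiator-enr`, `target-id`). Everything else is an assumption made for illustration: the `PendingRequest` bookkeeping, the `REQUEST_TIMEOUT` value, and the guard that attempts relaying only once per request.

```python
from dataclasses import dataclass

REQUEST_TIMEOUT = 1.0  # seconds; hypothetical value, not taken from the spec


@dataclass(frozen=True)
class RelayInit:
    nonce: bytes          # nonce of the timed-out request
    initiator_enr: bytes  # Alice's ENR (opaque bytes in this sketch)
    target_id: bytes      # Bob's node ID


@dataclass
class PendingRequest:
    nonce: bytes
    target_id: bytes
    deadline: float
    relay_attempted: bool = False  # assumption: relay at most once per request


def on_request_timeout(pending, local_enr, now):
    """Return a RelayInit to send to the relay, or None if relaying was already tried."""
    if pending.relay_attempted:
        return None
    pending.relay_attempted = True
    # Reset the timeout so the original request stays pending while the
    # hole-punch attempt is in flight.
    pending.deadline = now + REQUEST_TIMEOUT
    return RelayInit(nonce=pending.nonce, initiator_enr=local_enr, target_id=pending.target_id)


pending = PendingRequest(nonce=b"\x01" * 12, target_id=b"\xbb" * 32, deadline=10.0)
notification = on_request_timeout(pending, local_enr=b"enr-alice", now=10.0)
```

Keeping the original request pending matters because the WHOAREYOU packet that eventually arrives from Bob refers to that request's nonce.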
390
+ The Relay node validates the [ RELAYINIT] notification and uses the ` target-id ` to look up
391
+ Bob's ENR in its node table. Bob is very likely to be a member of the Relay's table
392
+ because it was just sent to Alice in a [ NODES] response. Note that, if Bob is not
393
+ contained in the table, communication ends here.
394
+
395
+ The Relay sends a [ RELAYMSG] notification containing Alice's message nonce and ENR to Bob.
396
+
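The Relay's handling can be sketched as follows. The node table is modelled as a plain dictionary from node ID to ENR, which is an assumption for illustration only. The validation of the notification is not modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelayInit:
    nonce: bytes
    initiator_enr: bytes
    target_id: bytes


@dataclass(frozen=True)
class RelayMsg:
    nonce: bytes
    initiator_enr: bytes


def handle_relayinit(node_table, msg):
    """Return (target ENR, RelayMsg to forward), or None if the target is unknown."""
    target_enr = node_table.get(msg.target_id)
    if target_enr is None:
        # Bob is not in the table: communication ends here.
        return None
    # Forward only the nonce and Alice's ENR; the target ID has served its purpose.
    return target_enr, RelayMsg(nonce=msg.nonce, initiator_enr=msg.initiator_enr)


table = {b"\xbb" * 32: b"enr-bob"}
init = RelayInit(nonce=b"\x01" * 12, initiator_enr=b"enr-alice", target_id=b"\xbb" * 32)
forwarded = handle_relayinit(table, init)
unknown = handle_relayinit({}, init)
```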
397
+ Bob disassembles the [ RELAYMSG] and uses the ` nonce ` to assemble a [ WHOAREYOU packet] ,
398
+ then sends it to Alice. Bob knows about Alice's endpoint from the ` initiator-enr ` given in
399
+ RELAYMSG.
400
+
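Bob's step can be sketched like this. The ENR is shown as an already decoded dictionary with `ip` and `udp` entries, and the WHOAREYOU packet is reduced to the nonce it echoes. Both are simplifications for illustration; the remaining packet fields and the actual encoding are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelayMsg:
    nonce: bytes
    initiator_enr: dict  # toy stand-in for a decoded ENR


@dataclass(frozen=True)
class WhoAreYou:
    nonce: bytes  # nonce of Alice's original request; other fields omitted


def handle_relaymsg(msg):
    """Return (destination endpoint, WHOAREYOU packet) for a received RelayMsg."""
    enr = msg.initiator_enr
    destination = (enr["ip"], enr["udp"])
    # Echoing the nonce lets Alice match the challenge to her pending request.
    return destination, WhoAreYou(nonce=msg.nonce)


relayed = RelayMsg(nonce=b"\x01" * 12, initiator_enr={"ip": "203.0.113.5", "udp": 9000})
destination, challenge = handle_relaymsg(relayed)
```

From Alice's perspective the resulting packet looks like a regular, if delayed, reply to her original request.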
401
+ Bob's NAT adds the mapping `(Bob's-LAN-ip, Bob's-LAN-port, Alice's-WAN-ip,
402
+ Alice's-WAN-port, entry-lifetime)`. A hole is punched in Bob's NAT for Alice for the
403
+ duration of ` entry-lifetime ` .
404
+
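The effect of the mapping on Bob's side can be illustrated with a toy filter. This sketch assumes a NAT that admits an inbound packet only if the LAN host has recently sent a packet to that same remote endpoint. The `FilteringNat` class, the timestamps and the addresses are hypothetical.

```python
ENTRY_LIFETIME = 20.0  # seconds; stands in for `entry-lifetime`


class FilteringNat:
    """Toy NAT filter: inbound packets pass only through a live mapping."""

    def __init__(self):
        self.entries = {}  # (lan_endpoint, remote_endpoint) -> expiry time

    def outbound(self, lan_endpoint, remote_endpoint, now):
        # Sending a packet creates or refreshes the mapping.
        self.entries[(lan_endpoint, remote_endpoint)] = now + ENTRY_LIFETIME

    def inbound_allowed(self, remote_endpoint, lan_endpoint, now):
        expiry = self.entries.get((lan_endpoint, remote_endpoint))
        return expiry is not None and now < expiry


bob_lan = ("192.168.1.10", 9000)
alice_wan = ("203.0.113.5", 9000)
bob_nat = FilteringNat()

# t=0: Alice's FINDNODE arrives at Bob's NAT and is dropped.
first_try = bob_nat.inbound_allowed(alice_wan, bob_lan, now=0.0)
# t=1: Bob sends WHOAREYOU to Alice, which punches the hole.
bob_nat.outbound(bob_lan, alice_wan, now=1.0)
# t=2: Alice's next packet gets through.
second_try = bob_nat.inbound_allowed(alice_wan, bob_lan, now=2.0)
# t=21: without further traffic the mapping has expired.
after_expiry = bob_nat.inbound_allowed(alice_wan, bob_lan, now=21.0)
```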
405
+ From here on, the handshake proceeds as usual, with Alice responding to the WHOAREYOU challenge. See [ Sessions] .
406
+
407
+ ### Redundancy of ENRs in NODES responses and connectivity status assumptions about Relay and Bob
408
+
409
+ Often the same peers get passed around in NODES responses by different peers. The chance
410
+ of seeing a peer received in a NODES response again in another NODES response is high as
411
+ k-buckets favour long-lived connections over new ones. This reduces the need to store
412
+ backup relays for peers.
413
+
414
+ Storing only the last peer to send us an ENR as its potential relay saves state. In
415
+ addition, the more time has passed since a peer sent us an ENR, the less assurance we
416
+ have that the peer is still connected to the owner of that ENR, and hence that it is
417
+ still able to relay.
418
+
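One way to keep this bookkeeping is sketched below. It stores only the most recent sender of each ENR as that node's potential relay. The `RelayHints` class and the `max_age` cut-off are assumptions made for illustration; the text does not prescribe a specific age limit.

```python
class RelayHints:
    """Remembers, per node, only the last peer that sent us its ENR."""

    def __init__(self):
        self._hints = {}  # node_id -> (relay_id, received_at)

    def enr_received(self, node_id, from_peer, now):
        # A newer sender overwrites the previous one; no backups are kept.
        self._hints[node_id] = (from_peer, now)

    def relay_for(self, node_id, now, max_age):
        hint = self._hints.get(node_id)
        if hint is None:
            return None
        relay_id, received_at = hint
        # Older hints give less assurance that the relay still reaches the node.
        if now - received_at > max_age:
            return None
        return relay_id


hints = RelayHints()
hints.enr_received("bob", from_peer="relay-1", now=0.0)
hints.enr_received("bob", from_peer="relay-2", now=5.0)
```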
337
419
[ EIP-778 ] : ../enr.md
338
420
[ identity scheme ] : ../enr.md#record-structure
421
+ [ message packet ] : ./discv5-wire.md#ordinary-message-packet-flag--0
422
+ [ session message packet ] : ./discv5-wire.md#session-message-packet-flag--3
339
423
[ handshake message packet ] : ./discv5-wire.md#handshake-message-packet-flag--2
340
424
[ WHOAREYOU packet ] : ./discv5-wire.md#whoareyou-packet-flag--1
341
425
[ PING ] : ./discv5-wire.md#ping-request-0x01
342
426
[ PONG ] : ./discv5-wire.md#pong-response-0x02
343
427
[ FINDNODE ] : ./discv5-wire.md#findnode-request-0x03
428
+ [ NODES ] : ./discv5-wire.md#nodes-response-0x04
429
+ [ RELAYINIT ] : ./discv5-wire.md#relayinit-notification-0x07
430
+ [ RELAYMSG ] : ./discv5-wire.md#relaymsg-notification-0x08
431
+
432
+ [ Sessions ] : ./discv5-theory.md#sessions