Commit e753d72

Emilia Hanefjl
andauthored
discv5: NAT hole punch theory (#228)
Co-authored-by: Felix Lange <[email protected]>
1 parent 26e40cb commit e753d72

3 files changed

+104
-0
lines changed

discv5/discv5-theory.md

Lines changed: 89 additions & 0 deletions
@@ -334,10 +334,99 @@ the distance to retrieve more nodes from adjacent k-buckets on `B`:
Node `A` now sorts all received nodes by distance to the lookup target and proceeds by
repeating the lookup procedure on another, closer node.

## Hole-punching Asymmetric NATs

This section explains the hole punching mechanism built into the protocol, which is
enabled by the [RELAYINIT] and [RELAYMSG] message types. Compared to other protocol
messages, these require deeper interaction with the session layer in order to ensure the
hole punching mechanism operates safely.

In the examples below, we assume that node `A` (Alice) has the goal of sending a request
message (e.g. FINDNODE) to node `B` (Bob).

Bob operates behind a network address translation (NAT) layer, and is initially unable to
receive UDP packets from Alice. However, Bob has previously communicated with a third node
`R` (Relay), and is able to receive incoming packets from the Relay node. We further
assume Bob's NAT is 'asymmetric', i.e. the external IP/port of Bob's packets will be the
same regardless of the host they are sent to. The hole-punching mechanism does not work
for 'symmetric' NAT, where every destination host has a unique mapping.

Alice may or may not be behind a symmetric NAT.

Finally, it is assumed that a common lower bound on the lifetime of NAT mappings is 20
seconds, and that mappings will be refreshed when any packet is sent through them. For
more background information about common NAT setups, please consult [RFC4787], [RFC6146]
and [this paper][natpaper].
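
The NAT assumptions above can be illustrated with a toy model. This is only a sketch: the
`Nat` class, its field names, the port numbering and the choice to refresh a hole on
outgoing packets only are assumptions made for illustration and are not part of the
protocol. The 20 second lifetime is the lower bound assumed in the text.

```python
from dataclasses import dataclass, field

MAPPING_LIFETIME = 20.0  # seconds; the lower bound assumed in the text

Endpoint = tuple[str, int]  # (ip, port)


@dataclass
class Nat:
    """Toy NAT model (hypothetical, for illustration only)."""

    wan_ip: str
    symmetric: bool = False
    next_port: int = 40000
    ports: dict = field(default_factory=dict)  # mapping key -> external port
    holes: dict = field(default_factory=dict)  # (external port, remote) -> expiry time

    def send(self, lan: Endpoint, remote: Endpoint, now: float) -> Endpoint:
        """Outgoing packet: reuse or allocate an external port and (re)open the hole."""
        # An asymmetric NAT keys the mapping on the LAN endpoint alone, so the
        # external endpoint is the same for every destination. A symmetric NAT
        # keys it on (LAN endpoint, destination), giving each host its own port.
        key = (lan, remote) if self.symmetric else lan
        if key not in self.ports:
            self.ports[key] = self.next_port
            self.next_port += 1
        port = self.ports[key]
        self.holes[(port, remote)] = now + MAPPING_LIFETIME  # refresh on send
        return (self.wan_ip, port)

    def accepts(self, remote: Endpoint, port: int, now: float) -> bool:
        """Incoming packet passes only through an unexpired hole for `remote`."""
        return self.holes.get((port, remote), 0.0) > now
```

With an asymmetric NAT, the endpoint that the Relay observes for Bob is also the endpoint
Alice can reach once Bob has sent a packet to her. With `symmetric=True` the two external
ports differ, which is why the mechanism does not work in that case.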

### Message flow

In the wire protocol, there are four packet types. Since the NAT-related messages require
deeper integration with the packet/session layer, the packet type is explicitly shown for
each message in the diagram below. We use these abbreviations:

- `whoareyou`: [WHOAREYOU packet]
- `m(X)`: [message packet] containing request `X`
- `H(X)`: [handshake message packet] containing request `X`
- `s(M)`: [session message packet] containing message `M`

![Diagram](./img/nat-hole-punching-flow.svg) <!-- source: ./img/nat-hole-punching-flow.mermaid -->

Preconditions: Bob is behind NAT. Bob is contained in Relay's node table, the two have an
established session, and Bob has sent a packet to Relay in the last ~20 seconds, so Relay
can get through Bob's NAT.

As part of a recursive query for peers, Alice sends a [FINDNODE] request to Bob, whose ENR
it just received from the Relay. If Alice is behind NAT, this outgoing request to Bob
causes Alice's NAT to add a mapping `(Alice's-LAN-ip, Alice's-LAN-port, Bob's-WAN-ip,
Bob's-WAN-port, entry-lifetime)`. This means a hole is now punched for Bob in Alice's NAT
for the duration of `entry-lifetime`. However, Alice's request is not delivered, as Bob is
behind NAT.

Alice detects the timeout and initiates an attempt to punch a hole in Bob's NAT via
Relay. Alice resets the request timeout on the timed-out [FINDNODE] message, wraps the
message's nonce in a [RELAYINIT] notification, and sends it to Relay. The notification
also contains Alice's ENR and Bob's node ID.
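
Alice's reaction to the timeout could be sketched roughly as follows. This is an in-memory
illustration, not the wire encoding: `PendingRequest`, `RelayInit`, `on_timeout` and the
`REQUEST_TIMEOUT` value are hypothetical names and numbers, and limiting a request to a
single hole punch attempt is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

REQUEST_TIMEOUT = 1.0  # seconds; hypothetical value, not given in the text


@dataclass
class PendingRequest:
    nonce: bytes      # nonce of the packet that carried the FINDNODE request
    target_id: bytes  # Bob's node ID
    deadline: float   # time at which the request counts as timed out
    relayed: bool = False


@dataclass
class RelayInit:
    initiator_enr: bytes  # Alice's ENR, treated as opaque bytes here
    target_id: bytes      # Bob's node ID
    nonce: bytes          # nonce of the timed-out request


def on_timeout(req: PendingRequest, local_enr: bytes, now: float) -> Optional[RelayInit]:
    """Return the RELAYINIT to send to the relay, or None to give up."""
    if req.relayed:
        return None  # assumption: at most one hole punch attempt per request
    req.relayed = True
    req.deadline = now + REQUEST_TIMEOUT  # reset the timeout on the original request
    return RelayInit(initiator_enr=local_enr, target_id=req.target_id, nonce=req.nonce)
```

Reusing the original nonce is what later lets Alice match Bob's WHOAREYOU to the request
that is still pending.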

The Relay node validates the [RELAYINIT] notification and uses the `target-id` to look up
Bob's ENR in its node table. Bob is very likely to be a member of the Relay's table
because it was just sent to Alice in a [NODES] response. Note that, if Bob is not
contained in the table, communication ends here.
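
A minimal sketch of the Relay's side, assuming the node table can be modelled as a plain
dictionary from node ID to ENR. The type and function names are hypothetical, ENRs are
opaque bytes, and the validation step is left out because its details are not described
here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RelayInit:
    initiator_enr: bytes  # Alice's ENR
    target_id: bytes      # Bob's node ID
    nonce: bytes          # nonce of Alice's timed-out request


@dataclass
class RelayMsg:
    initiator_enr: bytes  # Alice's ENR, passed on unchanged
    nonce: bytes          # same nonce, passed on unchanged


def handle_relayinit(node_table: dict, msg: RelayInit) -> Optional[tuple]:
    """Return (target ENR, RELAYMSG to send), or None if the target is unknown."""
    target_enr = node_table.get(msg.target_id)
    if target_enr is None:
        return None  # Bob is not in the table: communication ends here
    return target_enr, RelayMsg(initiator_enr=msg.initiator_enr, nonce=msg.nonce)
```

The Relay only forwards Alice's ENR and nonce. It does not need to inspect or answer the
original request.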

The Relay sends a [RELAYMSG] notification containing Alice's message nonce and ENR to Bob.

Bob disassembles the [RELAYMSG] and uses the `nonce` to assemble a [WHOAREYOU packet],
then sends it to Alice. Bob knows about Alice's endpoint from the `initiator-enr` given in
[RELAYMSG].
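
Bob's handling of the notification might look like the sketch below. The `Enr`,
`RelayMsg` and `Whoareyou` classes are simplified, hypothetical stand-ins for the real
structures. Filling `enr_seq` from the ENR received in the notification is an assumption
of this sketch, and packet encoding is omitted entirely.

```python
import os
from dataclasses import dataclass


@dataclass
class Enr:
    node_id: bytes
    ip: str
    udp_port: int
    seq: int


@dataclass
class RelayMsg:
    initiator_enr: Enr  # Alice's ENR as forwarded by the relay
    nonce: bytes        # nonce of Alice's timed-out request


@dataclass
class Whoareyou:
    request_nonce: bytes  # echoes the nonce of the request being challenged
    id_nonce: bytes       # fresh random challenge data
    enr_seq: int          # sequence number of Alice's ENR known to Bob


def handle_relaymsg(msg: RelayMsg) -> tuple:
    """Return (destination endpoint, WHOAREYOU challenge) for the initiator."""
    # Alice's endpoint is taken from the ENR carried in the notification.
    dest = (msg.initiator_enr.ip, msg.initiator_enr.udp_port)
    challenge = Whoareyou(
        request_nonce=msg.nonce,
        id_nonce=os.urandom(16),
        enr_seq=msg.initiator_enr.seq,
    )
    return dest, challenge
```

Sending the challenge to `dest` is the outgoing packet that opens Bob's NAT for Alice.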

Bob's NAT adds the mapping `(Bob's-LAN-ip, Bob's-LAN-port, Alice's-WAN-ip,
Alice's-WAN-port, entry-lifetime)`. A hole is punched in Bob's NAT for Alice for the
duration of `entry-lifetime`.

From here on it's business as usual. See [Sessions].

### Redundancy of ENRs in NODES responses and connectivity status assumptions about Relay and Bob

Often the same peers get passed around in NODES responses by different peers. The chance
of seeing a peer received in a NODES response again in another NODES response is high, as
k-buckets favour long-lived connections over new ones. This makes the need for storing
backup relays for peers small.

Storing only the last peer that sent us an ENR as its potential relay also saves state.
In addition, the more time has passed since a peer sent us an ENR, the less guarantee we
have that the peer is in fact still connected to the owner of that ENR, and hence of its
ability to relay.
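
One way to picture this bookkeeping is a table that keeps a single relay candidate per
node and overwrites it whenever the node's ENR is seen again. The `RelayBook` class and
the `MAX_RELAY_AGE` cutoff are hypothetical. The text only says that older entries give
weaker guarantees, so the hard cutoff, borrowed here from the assumed NAT mapping
lifetime, is an illustrative choice.

```python
from typing import Optional

MAX_RELAY_AGE = 20.0  # seconds; hypothetical cutoff, not prescribed by the text


class RelayBook:
    """Remembers, per node ID, only the last peer that sent us that node's ENR."""

    def __init__(self) -> None:
        self._last = {}  # node ID -> (relay node ID, time the ENR was received)

    def on_nodes_response(self, sender_id: bytes, node_ids: list, now: float) -> None:
        for node_id in node_ids:
            # Overwrite any earlier entry: no backup relays are kept.
            self._last[node_id] = (sender_id, now)

    def relay_for(self, node_id: bytes, now: float) -> Optional[bytes]:
        entry = self._last.get(node_id)
        if entry is None:
            return None
        relay_id, seen = entry
        # An old entry says little about whether the relay still reaches the node.
        return relay_id if now - seen <= MAX_RELAY_AGE else None
```

Because the same ENRs tend to reappear in later NODES responses, an overwritten or expired
entry is likely to be replaced by a fresher one soon.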

[EIP-778]: ../enr.md
[identity scheme]: ../enr.md#record-structure
[message packet]: ./discv5-wire.md#ordinary-message-packet-flag--0
[session message packet]: ./discv5-wire.md#session-message-packet-flag--3
[handshake message packet]: ./discv5-wire.md#handshake-message-packet-flag--2
[WHOAREYOU packet]: ./discv5-wire.md#whoareyou-packet-flag--1
[PING]: ./discv5-wire.md#ping-request-0x01
[PONG]: ./discv5-wire.md#pong-response-0x02
[FINDNODE]: ./discv5-wire.md#findnode-request-0x03
[NODES]: ./discv5-wire.md#nodes-response-0x04
[RELAYINIT]: ./discv5-wire.md#relayinit-notification-0x07
[RELAYMSG]: ./discv5-wire.md#relaymsg-notification-0x08

[Sessions]: ./discv5-theory.md#sessions
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
sequenceDiagram
    participant Alice
    participant Relay
    participant Bob

    Relay-->>Alice: s(NODES[Bob's ENR])
    Alice->>Bob: m(nonce,FINDNODE)
    Note left of Alice: Hole punched in Alice's NAT for Bob
    Note left of Alice: FINDNODE timed out
    Alice->>Relay: s(RELAYINIT[nonce])
    Relay->>Bob: s(RELAYMSG[nonce])
    Bob-->>Alice: whoareyou(nonce)
    Note right of Bob: Hole punched in Bob's NAT for Alice
    Alice-->>Bob: H(FINDNODE)
