
Commit 3269915

discv5: remove topic stuff (for now)

This will be brought back later.

1 parent 21d1e95 commit 3269915

File tree

3 files changed: +11 -342 lines changed


discv5/discv5-rationale.md

Lines changed: 4 additions & 80 deletions
@@ -176,8 +176,7 @@ discovery mechanism must be chosen.
 Another reason for UDP is communication latency: participants in the discovery protocol
 must be able to communicate with a large number of other nodes within a short time frame
 to establish and maintain the neighbor set and must perform regular liveness checks on
-their neighbors. For the topic advertisement system, registrants collect tickets and must
-use them as soon as the ticket expires to place an ad in a topic queue.
+their neighbors.

 These protocol interactions are difficult to implement in a TCP setting where connections
 require multiple round-trips before application data can be sent and the connection
@@ -207,7 +206,7 @@ understandable while providing a distributed database that scales with the numbe
 participants. Our system also relies on the routing table to allow enumeration and random
 traversal of the whole network, i.e. all participants can be found. Most importantly,
 having a structured network with routing enables thinking about DHT 'address space' and
-'regions of address space'. These concepts are used to build the [topic-based node index].
+'regions of address space'.

 Kademlia is often criticized as a naive design with obvious weaknesses. We believe that
 most issues with simple Kademlia can be overcome by careful programming and the benefits
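The 'address space' discussed in this hunk is defined by Kademlia's XOR metric: discv5 measures the distance between two 256-bit node IDs as their bitwise XOR, and the position of the highest differing bit (`logdistance`) selects the k-bucket. A minimal sketch of the metric:

```python
# Kademlia XOR distance over node IDs, as used by the discv5 routing table.
def distance(a: int, b: int) -> int:
    return a ^ b

def logdistance(a: int, b: int) -> int:
    # Bucket index: position of the highest differing bit (0 means a == b).
    return (a ^ b).bit_length()

# Nodes sharing a long ID prefix are 'close': they fall into a low bucket.
a = 0b11010011
b = 0b11010010    # differs only in the last bit
far = 0b01010011  # differs in the first bit
assert distance(a, b) == 1
assert logdistance(a, b) == 1
assert logdistance(a, far) == 8
```

Because the metric is symmetric and unidirectional, a region of address space near a target ID is the same set of nodes for every observer, which is what makes 'regions of address space' a usable concept.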
@@ -219,8 +218,7 @@ The well-known 'sybil attack' is based on the observation that creating node ide
 essentially free. In any system using a measure of proximity among node identities, an
 adversary may place nodes close to a chosen node by generating suitable identities. For
 basic node discovery through network enumeration, the 'sybil attack' poses no significant
-challenge. Sybils are a serious issue for the topic-based node index, especially for
-topics provided by few participants, because the index relies on node distance.
+challenge.

 An 'eclipse attack' is usually based on generating sybil nodes with the goal of polluting
 the victim node's routing table. Once the table is overtaken, the victim has no way to
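To illustrate why identity creation is 'essentially free': placing a sybil that shares a short ID prefix with a victim takes only a geometric number of random tries (in the real protocol each try costs one keypair generation, since the node ID is derived from the public key). A toy sketch, with hypothetical helper names and uniformly random IDs standing in for derived ones:

```python
import random

# Grinding random 256-bit IDs until one shares an 8-bit prefix with the
# victim takes ~256 tries on average -- cheap enough to be 'free'.
def grind_sybil(victim: int, prefix_bits: int, rng: random.Random):
    tries = 0
    shift = 256 - prefix_bits
    while True:
        tries += 1
        candidate = rng.getrandbits(256)
        if candidate >> shift == victim >> shift:
            return candidate, tries

rng = random.Random(1)
victim = rng.getrandbits(256)
sybil, tries = grind_sybil(victim, 8, rng)
assert sybil >> 248 == victim >> 248
```

Each additional prefix bit doubles the expected work, so closeness to a target is exponentially expensive but never gated by anything other than computation.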
@@ -307,7 +305,7 @@ Go implementation shows that the handshake computation takes 500µs on a 2014-er
 using the default secp256k1/keccak256 identity scheme. That's a lot, but note the cost
 amortizes because nodes commonly exchange multiple packets. Subsequent packets in the same
 conversation can be decrypted and authenticated in just 2µs. The most common protocol
-interaction is a FINDNODE or TOPICQUERY request on an unknown node with 4 NODES responses.
+interaction is a FINDNODE request on an unknown node with 4 NODES responses.

 To put things into perspective: encryption and authentication in Discovery v5 is still a
 significant improvement over the authentication scheme used in Discovery v4, which
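The amortization claim in this hunk can be checked with quick arithmetic. The figures come from the surrounding text (roughly 500µs for the handshake computation, 2µs per subsequent packet), counting just the request and its four responses:

```python
# Back-of-the-envelope check of handshake cost amortization.
HANDSHAKE_US = 500  # one-time handshake computation, per the text above
PACKET_US = 2       # decrypt + authenticate each later packet

def avg_cost_per_packet(n_packets: int) -> float:
    # The first packet pays for the handshake; the rest are cheap.
    return (HANDSHAKE_US + PACKET_US * (n_packets - 1)) / n_packets

# FINDNODE to an unknown node answered by 4 NODES responses: 5 packets.
assert avg_cost_per_packet(5) == (500 + 4 * 2) / 5  # about 102µs per packet
assert avg_cost_per_packet(1) == 500.0
```

So even in the worst common case the per-packet crypto cost drops by roughly 5x versus a single-packet exchange, and keeps falling as the session is reused.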
@@ -342,79 +340,6 @@ disturb the operation of the protocol. Session keys per node-ID/IP generally pre
 replay across sessions. The `request-id`, mirrored in response packets, prevents replay of
 responses within a session.

-## The Topic Index
-
-Using FINDNODE queries with appropriately chosen targets, the entire DHT can be sampled by
-a random walk to find all other participants. When building a distributed application, it
-is often desirable to restrict the search to participants which provide a certain service.
-A simple solution to this problem would be to simply split up the network and require
-participation in many smaller application-specific networks. However, such networks are
-hard to bootstrap and also more vulnerable to attacks which could isolate nodes.
-
-The topic index provides discovery by provided service in a different way. Nodes maintain
-a single node table tracking their neighbors and advertise 'topics' on nodes found by
-randomly walking the DHT. While the 'global' topic index can be also spammed, it makes
-complete isolation a lot harder. To prevent nodes interested in a certain topic from
-finding each other, the entire discovery network would have to be overpowered.
-
-To make the index useful, searching for nodes by topic must be efficient regardless of the
-number of advertisers. This is achieved by estimating the topic 'radius', i.e. the
-percentage of all live nodes which are advertising the topic. Advertisement and search
-activities are restricted to a region of DHT address space around the topic's 'center'.
-
-We also want the index to satisfy another property: When a topic advertisement is placed,
-it should last for a well-defined amount of time. This ensures nodes may rely on their
-advertisements staying placed rather than worrying about keeping them alive.
-
-Finally, the index should consume limited resources. Just as the node table is limited in
-number and size of buckets, the size of the index data structure on each node is limited.
-
-### Why should advertisers wait?
-
-Advertisers must wait a certain amount of time before they can be registered. Enforcing
-this time limit prevents misuse of the topic index because any topic must be important
-enough to outweigh the cost of waiting. Imagine a group phone call: announcing the
-participants of the call using topic advertisement isn't a good use of the system because
-the topic exists only for a short time and will have very few participants. The waiting
-time prevents using the index for this purpose because the call might already be over
-before everyone could get registered.
-
-### Dealing with Topic Spam
-
-Our model is based on the following assumptions:
-
-- Anyone can place their own advertisements under any topics and the rate of placing ads
-  is not limited globally. The number of active ads for any node is roughly proportional
-  to the resources (network bandwidth, mostly) spent on advertising.
-- Honest actors whose purpose is to connect to other honest actors will spend an adequate
-  amount of efforts on registering and searching for ads, depending on the rate of newly
-  established connections they are targeting. If the given topic is used only by honest
-  actors, a few registrations per minute will be satisfactory, regardless of the size of
-  the subnetwork.
-- Dishonest actors may want to place an excessive amount of ads just to disrupt the
-  discovery service. This will reduce the effectiveness of honest registration efforts by
-  increasing the topic radius and/or topic queue waiting times. If the attacker(s) can
-  place a comparable amount or more ads than all honest actors combined then the rate of
-  new (useful) connections established throughout the network will reduce proportionally
-  to the `honest / (dishonest + honest)` registration rates.
-
-This adverse effect can be countered by honest actors increasing their registration and
-search efforts. Fortunately, the rate of established connections between them will
-increase proportionally both with increased honest registration and search efforts. If
-both are increased in response to an attack, the required factor of increased efforts from
-honest actors is proportional to the square root of the attacker's efforts.
-
-### Detecting a useless registration attack
-
-In the case of a symmetrical protocol, where nodes are both searching and advertising
-under the same topic, it is easy to detect when most of the found ads turn out to be
-useless and increase both registration and query frequency. It is a bit harder but still
-possible with asymmetrical (client-server) protocols, where only clients can easily detect
-useless registrations, while advertisers (servers) do not have a direct way of detecting
-when they should increase their advertising efforts. One possible solution is for servers
-to also act as clients just to test the server capabilities of other advertisers. It is
-also possible to implement a feedback system between trusted clients and servers.
-
 # References

 - Petar Maymounkov and David Mazières.
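The square-root claim in the removed spam section can be sanity-checked with a toy model (entirely illustrative, not from the spec): take the useful-connection rate as honest search effort times the honest fraction of ads, scale both honest efforts by a factor `f`, and solve for the `f` that restores the baseline rate.

```python
import math

# Toy model: connection rate = search effort * honest fraction of ads.
# 'effort' scales both honest registration and search; 'attack' is the
# spam ad rate in units of the baseline honest registration rate.
def connection_rate(effort: float, attack: float) -> float:
    return effort * (effort / (effort + attack))

def effort_to_restore(attack: float) -> float:
    # Solve f^2 / (f + attack) = 1  ->  f = (1 + sqrt(1 + 4*attack)) / 2
    return (1 + math.sqrt(1 + 4 * attack)) / 2

assert connection_rate(1.0, 0.0) == 1.0   # no attack: baseline rate
f = effort_to_restore(10_000.0)
assert abs(connection_rate(f, 10_000.0) - 1.0) < 1e-9
assert abs(f / math.sqrt(10_000.0) - 1.0) < 0.01  # f ~ sqrt(attack)
```

For large attacks the `f + attack` denominator is dominated by `attack`, so the restoring factor grows like `sqrt(attack)` rather than linearly, which is the asymmetry the removed text relies on.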
@@ -451,5 +376,4 @@ also possible to implement a feedback system between trusted clients and servers
 <https://eprint.iacr.org/2018/236.pdf>

 [wire protocol]: ./discv5-wire.md
-[topic-based node index]: ./discv5-theory.md#topic-advertisement
 [node records]: ../enr.md

discv5/discv5-theory.md

Lines changed: 2 additions & 195 deletions
@@ -191,13 +191,13 @@ pending when WHOAREYOU is received, as in the following example:

     A -> B FINDNODE
     A -> B PING
-    A -> B TOPICQUERY
+    A -> B TALKREQ
     A <- B WHOAREYOU (nonce references PING)

 When this happens, all buffered requests can be considered invalid (the remote end cannot
 decrypt them) and the packet referenced by the WHOAREYOU `nonce` (in this example: PING)
 must be re-sent as a handshake. When the response to the re-sent is received, the new
-session is established and other pending requests (example: FINDNODE, TOPICQUERY) may be
+session is established and other pending requests (example: FINDNODE, TALKREQ) may be
 re-sent.

 Note that WHOAREYOU is only ever valid as a response to a previously sent request. If
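The buffering rule in this hunk can be sketched as follows (an illustrative structure, not the reference implementation): the request referenced by the WHOAREYOU nonce becomes the handshake message, and the other in-flight requests are held for re-sending once the new session exists.

```python
# Sketch of pending-request handling around a WHOAREYOU challenge.
class Session:
    def __init__(self):
        self.pending = {}  # nonce -> message, in send order

    def send(self, nonce: str, message: str):
        self.pending[nonce] = message

    def on_whoareyou(self, nonce: str):
        # The referenced request is re-sent as the handshake message.
        handshake_msg = self.pending.pop(nonce)
        # All other buffered requests are invalid until a session exists;
        # queue them for re-send after the handshake completes.
        resend_later = list(self.pending.values())
        self.pending.clear()
        return handshake_msg, resend_later

s = Session()
s.send("n1", "FINDNODE")
s.send("n2", "PING")
s.send("n3", "TALKREQ")
handshake, queued = s.on_whoareyou("n2")
assert handshake == "PING"
assert queued == ["FINDNODE", "TALKREQ"]
```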
@@ -334,196 +334,6 @@ the distance to retrieve more nodes from adjacent k-buckets on `B`:
 Node `A` now sorts all received nodes by distance to the lookup target and proceeds by
 repeating the lookup procedure on another, closer node.

-## Topic Advertisement
-
-The topic advertisement subsystem indexes participants by their provided services. A
-node's provided services are identified by arbitrary strings called 'topics'. A node
-providing a certain service is said to 'place an ad' for itself when it makes itself
-discoverable under that topic. Depending on the needs of the application, a node can
-advertise multiple topics or no topics at all. Every node participating in the discovery
-protocol acts as an advertisement medium, meaning that it accepts topic ads from other
-nodes and later returns them to nodes searching for the same topic.
-
-### Topic Table
-
-Nodes store ads for any number of topics and a limited number of ads for each topic. The
-data structure holding advertisements is called the 'topic table'. The list of ads for a
-particular topic is called the 'topic queue' because it functions like a FIFO queue of
-limited length. The image below depicts a topic table containing three queues. The queue
-for topic `T₁` is at capacity.
-
-![topic table](./img/topic-queue-diagram.png)
-
-The queue size limit is implementation-defined. Implementations should place a global
-limit on the number of ads in the topic table regardless of the topic queue which contains
-them. Reasonable limits are 100 ads per queue and 50000 ads across all queues. Since ENRs
-are at most 300 bytes in size, these limits ensure that a full topic table consumes
-approximately 15MB of memory.
-
-Any node may appear at most once in any topic queue, that is, registration of a node which
-is already registered for a given topic fails. Implementations may impose other
-restrictions on the table, such as restrictions on the number of IP-addresses in a certain
-range or number of occurrences of the same node across queues.
-
-### Tickets
-
-Ads should remain in the queue for a constant amount of time, the `target-ad-lifetime`. To
-maintain this guarantee, new registrations are throttled and registrants must wait for a
-certain amount of time before they are admitted. When a node attempts to place an ad, it
-receives a 'ticket' which tells them how long they must wait before they will be accepted.
-It is up to the registrant node to keep the ticket and present it to the advertisement
-medium when the waiting time has elapsed.
-
-The waiting time constant is:
-
-    target-ad-lifetime = 15min
-
-The assigned waiting time for any registration attempt is determined according to the
-following rules:
-
-- When the table is full, the waiting time is assigned based on the lifetime of the oldest
-  ad across the whole table, i.e. the registrant must wait for a table slot to become
-  available.
-- When the topic queue is full, the waiting time depends on the lifetime of the oldest ad
-  in the queue. The assigned time is `target-ad-lifetime - oldest-ad-lifetime` in this
-  case.
-- Otherwise the ad may be placed immediately.
-
-Tickets are opaque objects storing arbitrary information determined by the issuing node.
-While details of encoding and ticket validation are up to the implementation, tickets must
-contain enough information to verify that:
-
-- The node attempting to use the ticket is the node which requested it.
-- The ticket is valid for a single topic only.
-- The ticket can only be used within the registration window.
-- The ticket can't be used more than once.
-
-Implementations may choose to include arbitrary other information in the ticket, such as
-the cumulative wait time spent by the advertiser. A practical way to handle tickets is to
-encrypt and authenticate them with a dedicated secret key:
-
-    ticket      = aesgcm_encrypt(ticket-key, ticket-nonce, ticket-pt, '')
-    ticket-pt   = [src-node-id, src-ip, topic, req-time, wait-time, cum-wait-time]
-    src-node-id = node ID that requested the ticket
-    src-ip      = IP address that requested the ticket
-    topic       = the topic that ticket is valid for
-    req-time    = absolute time of REGTOPIC request
-    wait-time   = waiting time assigned when ticket was created
-    cum-wait    = cumulative waiting time of this node
-
-### Registration Window
-
-The image below depicts a single ticket's validity over time. When the ticket is issued,
-the node keeping it must wait until the registration window opens. The length of the
-registration window is 10 seconds. The ticket becomes invalid after the registration
-window has passed.
-
-![ticket validity over time](./img/ticket-validity.png)
-
-Since all ticket waiting times are assigned to expire when a slot in the queue opens, the
-advertisement medium may receive multiple valid tickets during the registration window and
-must choose one of them to be admitted in the topic queue. The winning node is notified
-using a [REGCONFIRMATION] response.
-
-Picking the winner can be achieved by keeping track of a single 'next ticket' per queue
-during the registration window. Whenever a new ticket is submitted, first determine its
-validity and compare it against the current 'next ticket' to determine which of the two is
-better according to an implementation-defined metric such as the cumulative wait time
-stored in the ticket.
-
-### Advertisement Protocol
-
-This section explains how the topic-related protocol messages are used to place an ad.
-
-Let us assume that node `A` provides topic `T`. It selects node `C` as advertisement
-medium and wants to register an ad, so that when node `B` (who is searching for topic `T`)
-asks `C`, `C` can return the registration entry of `A` to `B`.
-
-Node `A` first attempts to register without a ticket by sending [REGTOPIC] to `C`.
-
-    A -> C REGTOPIC [T, ""]
-
-`C` replies with a ticket and waiting time.
-
-    A <- C TICKET [ticket, wait-time]
-
-Node `A` now waits for the duration of the waiting time. When the wait is over, `A` sends
-another registration request including the ticket. `C` does not need to remember its
-issued tickets since the ticket is authenticated and contains enough information for `C`
-to determine its validity.
-
-    A -> C REGTOPIC [T, ticket]
-
-Node `C` replies with another ticket. Node `A` must keep this ticket in place of the
-earlier one, and must also be prepared to handle a confirmation call in case registration
-was successful.
-
-    A <- C TICKET [ticket, wait-time]
-
-Node `C` waits for the registration window to end on the queue and selects `A` as the node
-which is registered. Node `C` places `A` into the topic queue for `T` and sends a
-[REGCONFIRMATION] response.
-
-    A <- C REGCONFIRMATION [T]
-
-### Ad Placement And Topic Radius
-
-Since every node may act as an advertisement medium for any topic, advertisers and nodes
-looking for ads must agree on a scheme by which ads for a topic are distributed. When the
-number of nodes advertising a topic is at least a certain percentage of the whole
-discovery network (rough estimate: at least 1%), ads may simply be placed on random nodes
-because searching for the topic on randomly selected nodes will locate the ads quickly enough.
-
-However, topic search should be fast even when the number of advertisers for a topic is
-much smaller than the number of all live nodes. Advertisers and searchers must agree on a
-subset of nodes to serve as advertisement media for the topic. This subset is simply a
-region of the node ID address space, consisting of nodes whose Kademlia address is within a
-certain distance to the topic hash `sha256(T)`. This distance is called the 'topic
-radius'.
-
-Example: for a topic `f3b2529e...` with a radius of 2^240, the subset covers all nodes
-whose IDs have prefix `f3b2...`. A radius of 2^256 means the entire network, in which case
-advertisements are distributed uniformly among all nodes. The diagram below depicts a
-region of the address space with topic hash `t` in the middle and several nodes close to
-`t` surrounding it. Dots above the nodes represent entries in the node's queue for the
-topic.
-
-![diagram explaining the topic radius concept](./img/topic-radius-diagram.png)
-
-To place their ads, participants simply perform a random walk within the currently
-estimated radius and run the advertisement protocol by collecting tickets from all nodes
-encountered during the walk and using them when their waiting time is over.
-
-### Topic Radius Estimation
-
-Advertisers must estimate the topic radius continuously in order to place their ads on
-nodes where they will be found. The radius mustn't fall below a certain size because
-restricting registration to too few nodes leaves the topic vulnerable to censorship and
-leads to long waiting times. If the radius were too large, searching nodes would take too
-long to find the ads.
-
-Estimating the radius uses the waiting time as an indicator of how many other nodes are
-attempting to place ads in a certain region. This is achieved by keeping track of the
-average time to successful registration within segments of the address space surrounding
-the topic hash. Advertisers initially assume the radius is 2^256, i.e. the entire network.
-As tickets are collected, the advertiser samples the time it takes to place an ad in each
-segment and adjusts the radius such that registration at the chosen distance takes
-approximately `target-ad-lifetime / 2` to complete.
-
-## Topic Search
-
-Finding nodes that provide a certain topic is a continuous process which reads the content
-of topic queues inside the approximated topic radius. This is a much simpler process than
-topic advertisement because collecting tickets and waiting on them is not required.
-
-To find nodes for a topic, the searcher generates random node IDs inside the estimated
-topic radius and performs Kademlia lookups for these IDs. All (intermediate) nodes
-encountered during lookup are asked for topic queue entries using the [TOPICQUERY] packet.
-
-Radius estimation for topic search is similar to the estimation procedure for
-advertisement, but samples the average number of results from TOPICQUERY instead of
-average time to registration. The radius estimation value can be shared with the
-registration algorithm if the same topic is being registered and searched for.

 [EIP-778]: ../enr.md
 [identity scheme]: ../enr.md#record-structure
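The removed waiting-time rules lend themselves to a compact sketch. The constants are the ones from the deleted text (`target-ad-lifetime = 15min`, 100 ads per queue, 50000 across all queues); the structure and names are illustrative:

```python
import time
from collections import deque

TARGET_AD_LIFETIME = 15 * 60  # seconds, per the removed spec text
QUEUE_LIMIT = 100             # suggested per-queue limit
TABLE_LIMIT = 50_000          # suggested global limit

class TopicTable:
    """Sketch of the removed topic-table rules: one FIFO queue per topic,
    waiting times derived from the lifetime of the oldest ad."""
    def __init__(self, now=time.time):
        self.now = now
        self.queues = {}  # topic -> deque of (node_id, registration_time)
        self.total = 0    # ads across all queues

    def waiting_time(self, topic: str) -> float:
        q = self.queues.setdefault(topic, deque())
        if self.total >= TABLE_LIMIT:
            # Table full: wait for a slot anywhere in the table to free up.
            oldest = min(t for queue in self.queues.values() for _, t in queue)
            return max(0.0, TARGET_AD_LIFETIME - (self.now() - oldest))
        if len(q) >= QUEUE_LIMIT:
            # Queue full: target-ad-lifetime minus the oldest ad's lifetime.
            oldest_lifetime = self.now() - q[0][1]
            return max(0.0, TARGET_AD_LIFETIME - oldest_lifetime)
        return 0.0  # a slot is free: register immediately

clock = [0.0]
table = TopicTable(now=lambda: clock[0])
assert table.waiting_time("T1") == 0.0
# Fill the queue for T1; the oldest ad is 5 minutes old at query time.
table.queues["T1"] = deque((f"node{i}", 0.0) for i in range(QUEUE_LIMIT))
table.total = QUEUE_LIMIT
clock[0] = 5 * 60
assert table.waiting_time("T1") == 10 * 60  # target 15min - oldest 5min
```

Because a slot always opens exactly when the oldest ad reaches `target-ad-lifetime`, the assigned waiting times are precisely the times at which registrations can next succeed, which is what makes ticket expiry times meaningful.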
@@ -532,6 +342,3 @@ registration algorithm if the same topic is being registered and searched for.
 [PING]: ./discv5-wire.md#ping-request-0x01
 [PONG]: ./discv5-wire.md#pong-response-0x02
 [FINDNODE]: ./discv5-wire.md#findnode-request-0x03
-[REGTOPIC]: ./discv5-wire.md#regtopic-request-0x07
-[REGCONFIRMATION]: ./discv5-wire.md#regconfirmation-response-0x09
-[TOPICQUERY]: ./discv5-wire.md#topicquery-request-0x0a
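The removed ticket scheme relies on stateless validation by the issuer. The deleted text suggests AES-GCM with a dedicated secret key; to stay dependency-free, this sketch substitutes HMAC-SHA256, which gives the issuer the same stateless validity check but, unlike AES-GCM, does not hide the ticket contents. Field names follow the deleted `ticket-pt` layout; the encoding is illustrative:

```python
import hmac, hashlib, json, base64

# Per-node secret; never shared. Stands in for the spec's ticket-key.
TICKET_KEY = b"issuer-local-secret"

def issue_ticket(fields: dict) -> bytes:
    # Authenticate the encoded fields so the issuer can later verify the
    # ticket without remembering it. (The spec's AES-GCM would also
    # encrypt; HMAC here only authenticates.)
    body = json.dumps(fields, sort_keys=True).encode()
    tag = hmac.new(TICKET_KEY, body, hashlib.sha256).digest()
    return base64.b64encode(tag + body)

def validate_ticket(ticket: bytes) -> dict:
    raw = base64.b64decode(ticket)
    tag, body = raw[:32], raw[32:]
    expected = hmac.new(TICKET_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("forged or corrupted ticket")
    return json.loads(body)

t = issue_ticket({"src-node-id": "node-a", "topic": "T",
                  "req-time": 1000, "wait-time": 600, "cum-wait": 600})
assert validate_ticket(t)["wait-time"] == 600
```

Checking the source node ID, topic, and registration window against the authenticated fields is then a pure function of the ticket plus the current request, which is why node `C` in the removed protocol never has to remember issued tickets.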
