Conversation

Wi1l-B0t
Contributor

@Wi1l-B0t Wi1l-B0t commented Oct 5, 2025

Description

Use a reservoir sampling algorithm instead of random sorting, so peer selection needs only a single pass over the collection.

The previous random-sorting implementation is inefficient: it shuffles the entire peer set just to pick a few entries.
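For reference, a minimal sketch of the single-pass idea (Algorithm R-style reservoir sampling). It is illustrative only: the Sampling/TakeRandom names are assumptions rather than identifiers from this PR, and Random.Shared assumes .NET 6+.

    using System;
    using System.Collections.Generic;

    static class Sampling
    {
        // Uniformly sample up to k items from `source` in a single pass (Algorithm R).
        public static T[] TakeRandom<T>(IEnumerable<T> source, int k)
        {
            var reservoir = new T[k];
            int seen = 0;
            foreach (var item in source)
            {
                if (seen < k)
                {
                    // Filling phase: the first k items go straight into the reservoir.
                    reservoir[seen] = item;
                }
                else
                {
                    // Replacement phase: the (seen + 1)-th item replaces a random slot
                    // with probability k / (seen + 1), keeping the sample uniform.
                    int j = Random.Shared.Next(seen + 1);
                    if (j < k)
                        reservoir[j] = item;
                }
                seen++;
            }
            if (seen < k)
                Array.Resize(ref reservoir, seen); // source had fewer than k items
            return reservoir;
        }
    }

When the source holds at most k items, everything is copied in during the filling phase; that corner case is what the ordering discussion in the review below is about.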

Type of change

  • Optimization (the change is only an optimization)
  • Style (the change is only a code style for better maintenance or standard purpose)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@Wi1l-B0t Wi1l-B0t changed the title Optimize: sampling peers ramdomly Optimize: sampling peers randomly Oct 5, 2025
@Wi1l-B0t Wi1l-B0t force-pushed the optimize.random-sample-peers branch from 919520b to e9472b2 on October 5, 2025 01:43
@Wi1l-B0t Wi1l-B0t changed the title Optimize: sampling peers randomly Optimize: random sampling peers Oct 5, 2025
@Wi1l-B0t Wi1l-B0t changed the title Optimize: random sampling peers Optimize: better random sampling for peers Oct 5, 2025
@vncoelho
Member

vncoelho commented Oct 5, 2025

Some months ago I was discussing this on the Discord channel and forgot to open the issue.

"Currently, the Peer.OnTimer method selects new peers to connect to randomly from the UnconnectedPeers list. This can lead to a node connecting to many peers within the same network or hosting provider, which can make the node more susceptible to network-level attacks like eclipse attacks and reduce overall network topology health.

Suggestion:

Modify the peer selection logic to prioritize connections to peers from diverse network address ranges. By grouping peers by their IP address prefix (e.g., /16 for IPv4), the node can actively seek to connect to peers in networks it's not yet connected to."
I think that the selection logic is worse than random, no?
"Benefits:

Improved Decentralization: Creates a more resilient and geographically/topologically diverse network mesh.
Increased Resilience: Makes it harder for an attacker to isolate a node from the rest of the network (eclipse attack)."
Basically it uses this to create groups when the node needs to request more peers:

        private static string GetAddressGroup(IPAddress address)
        {
            var unmapped = address.UnMap();
            if (unmapped.AddressFamily == AddressFamily.InterNetwork) // IPv4
            {
                // Use /16 prefix for grouping to avoid connecting to too many peers from the same network.
                var bytes = unmapped.GetAddressBytes();
                return $"v4:{bytes[0]}.{bytes[1]}";
            }
            else // IPv6
            {
                // For IPv6, use /32 prefix.
                var bytes = unmapped.GetAddressBytes();
                return $"v6:{bytes[0]:X2}{bytes[1]:X2}:{bytes[2]:X2}{bytes[3]:X2}";
            }
        }
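
For illustration, one hypothetical way such groups could be applied when choosing candidates. PickDiverse, its parameters, and the usual System.Linq/System.Net usings are assumptions for this sketch (placed next to GetAddressGroup), not project code:

        // Hypothetical helper: pick up to `count` candidates, taking at most one
        // peer per address group so connections spread across different networks.
        private static IEnumerable<IPEndPoint> PickDiverse(IEnumerable<IPEndPoint> candidates, int count)
        {
            return candidates
                .GroupBy(p => GetAddressGroup(p.Address)) // /16 for IPv4, /32 for IPv6
                .Select(g => g.First())                   // one representative per group
                .Take(count);
        }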

Member

@vncoelho vncoelho left a comment


From my side, if it is going to change, it should change to some smarter logic, as I commented above.

@Wi1l-B0t
Contributor Author

Wi1l-B0t commented Oct 6, 2025

Some months ago I was discussing this on the Discord channel and forgot to open the issue.

"Currently, the Peer.OnTimer method selects new peers to connect to randomly from the UnconnectedPeers list. This can lead to a node connecting to many peers within the same network or hosting provider, which can make the node more susceptible to network-level attacks like eclipse attacks and reduce overall network topology health.

Suggestion:

Modify the peer selection logic to prioritize connections to peers from diverse network address ranges. By grouping peers by their IP address prefix (e.g., /16 for IPv4), the node can actively seek to connect to peers in networks it's not yet connected to."

I think that the selection logic is worse than random, no?

"Benefits:

Improved Decentralization: Creates a more resilient and geographically/topologically diverse network mesh.
Increased Resilience: Makes it harder for an attacker to isolate a node from the rest of the network (eclipse attack)."

Basically it uses this to create groups when the node needs to request more peers:

        private static string GetAddressGroup(IPAddress address)
        {
            var unmapped = address.UnMap();
            if (unmapped.AddressFamily == AddressFamily.InterNetwork) // IPv4
            {
                // Use /16 prefix for grouping to avoid connecting to too many peers from the same network.
                var bytes = unmapped.GetAddressBytes();
                return $"v4:{bytes[0]}.{bytes[1]}";
            }
            else // IPv6
            {
                // For IPv6, use /32 prefix.
                var bytes = unmapped.GetAddressBytes();
                return $"v6:{bytes[0]:X2}{bytes[1]:X2}:{bytes[2]:X2}{bytes[3]:X2}";
            }
        }

This is another issue.
You can fix it in another PR.

@cschuchardt88
Member

cschuchardt88 commented Oct 6, 2025

This is another issue. You can fix it in another PR.

That may be true. But if we are going to change this feature, we should do it the right way instead of this way; otherwise it wastes everyone's time. Just remember that other developers have to stop their work to review code. So either do what needs to be done or close this PR, so no more time is wasted.

{
    if (currentIndex < reservoir.Length)
    {
        reservoir[currentIndex] = item;
Member


If we have the same length, there is no random order

Contributor Author

@Wi1l-B0t Wi1l-B0t Oct 6, 2025


If we have the same length, there is no random order

All peers are selected in this case, and the items in a HashSet are unordered.

So order doesn't matter in this case, I think.

Member


So order doesn't matter in this case, I think.

Agree; although the logic changes, it's OK with me.

Member

@cschuchardt88 cschuchardt88 left a comment


Use a solution that fixes the problem @vncoelho described. This PR isn't a workaround for or a fix to the real problem.

Contributor

@roman-khimov roman-khimov left a comment


  1. The #4212 (comment) problem is different from this PR and should be solved separately.
  2. UnconnectedMax is just 1000, and in practice we're much closer to 100 entries, so while this PR is more efficient, I'm not sure it's worth the added complexity.

@Wi1l-B0t
Contributor Author

Wi1l-B0t commented Oct 6, 2025

Use a solution that fixes the current problem that @vncoelho said. This PR isn't a workaround or fix to the real problem.

THAT is another issue!

@cschuchardt88
Member

cschuchardt88 commented Oct 6, 2025

THAT is another issue!

So what issue does this PR solve?

@Wi1l-B0t
Contributor Author

Wi1l-B0t commented Oct 6, 2025

THAT is another issue!

So what issue does this PR solve?

The previous implementation is inefficient: random sorting of the whole peer set is not a good approach.
The reservoir sampling algorithm is a more efficient solution for random selection.
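
For context, a rough sketch of the pattern being replaced versus the single-pass approach. UnconnectedPeers is named in the discussion above, but `needed`, `Sampling.TakeRandom`, and the exact LINQ shape are assumptions for illustration only:

    // Before (conceptually): sort the whole set by a random key, then take a few.
    // This is O(n log n) and touches every element just to pick `needed` of them.
    var picked = UnconnectedPeers
        .OrderBy(_ => Random.Shared.Next())
        .Take(needed)
        .ToArray();

    // After (conceptually): a single pass with a fixed-size reservoir, O(n) time
    // and O(needed) extra memory (see the reservoir sampling sketch in the description).
    var picked2 = Sampling.TakeRandom(UnconnectedPeers, needed);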

Member

@cschuchardt88 cschuchardt88 left a comment


Based on the previous implementation of the code, there could be an issue or bug with ordering.

@shargon shargon added the Discussion (initial issue state - proposed but not yet accepted) and Ready to Merge labels, and removed the Ready to Merge and Discussion labels, Oct 7, 2025
@shargon shargon merged commit d0bcc5e into neo-project:dev Oct 10, 2025
7 checks passed
@Wi1l-B0t Wi1l-B0t deleted the optimize.random-sample-peers branch October 11, 2025 11:24