Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shard rebalancer #517

Open
purplefox opened this issue Aug 12, 2022 · 0 comments
Open

Shard rebalancer #517

purplefox opened this issue Aug 12, 2022 · 0 comments
Assignees

Comments

@purplefox
Copy link
Collaborator

purplefox commented Aug 12, 2022

Each shard has a 1-1 correspondence with a Raft group. We have R replicas of each shard (including the leader), where R is the replication factor.
The leader replica for each shard is determined by which replica won the last leadership election, this is timing dependent and effectively non deterministic.
If we have a three node cluster and start all nodes at the same time, we should have a roughly even distribution of leaders across the nodes.
However, if we start nodes one by one, we will likely find that later nodes don't end up with any leaders as they are joining shard groups that already have leaders. Also if we kill nodes, and then restart them, it's likely the restarted nodes will simply join existing raft groups without a leader election resulting in no leaders on the restarted node.
After #511 is merged we will pin processors to raft leaders and they will be mobile across the cluster. If all leaders are poorly distributed across the cluster then so are processors and therefore processing load is not evenly distributed. This will likely result in reduced overall processing capacity and lack of scalability.

To remedy this issue we should create a new component called Rebalancer.

  • It will be the job of this component to periodically monitor the distribution of leaders across nodes and if there is a significant imbalance to direct Dragonboat to transfer leadership of a raft group from one node to another.
  • Dragonboat has a function RequestLeaderTransfer on NodeHost that looks like it an be used to initiate a transfer
  • The procManager struct maintains the leader state on each node - the Rebalancer can enquire here to get the state.
  • We should make sure we don't trigger too many transfers.
  • Consider a heuristic where we initiate transfers when an imbalance threshold has been crossed. We can compute a measure of imbalance.
  • If there are a lot of leadership elections going on, e.g. at startup, shutdown or if a node has died or restarted, then don't initiate a transfer. A heuristic could be something like "only consider a transfer if there have been no leadership changes for X seconds"
  • The Rebalancer should implement the Service interface and be started/stopped like other services in Server
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants