-
Notifications
You must be signed in to change notification settings - Fork 42
feat: rewrite consensus metrics to not assume epoch monotonicity #698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
88814c4 to
9efe7d0
Compare
3a5a65b to
cf20225
Compare
9038dea to
10b8ed1
Compare
10b8ed1 to
d1cc352
Compare
AshwinSekar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm fine with the overall direction of this change. I'd like to give a chance for @akhi3030 to review this and the histogram change.
He's out this week, but given that we've already removed the problematic assert we should be good to wait until he gets back.
akhi3030
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this problem and for the fix. Generally looks good a few small comments though.
| }; | ||
|
|
||
| let root_epoch = sharable_banks.root().epoch(); | ||
| let epoch_schedule = sharable_banks.root().epoch_schedule().clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember seeing any code so far where we were able to copy the epoch schedule out of the bank. I guess I assumed that it is possible that the epoch schedule can change in future banks (and that is why it is stored in banks). In other words, just want to make sure that this is indeed safe and would the metrics container need access to sharable_banks instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be safe - bank.rs sets its epoch schedule by cloning from GenesisConfig while children build their epoch schedules by cloning from their respective parents.
d1cc352 to
9b751c7
Compare
akhi3030
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Problem and Summary of Changes
consensus_metrics.rsis currently written incorrectly.We can't assume the invariant that
maybe_new_epochis invoked with a monotonically non-decreasingepoch, given that blocks from different forks may play out of order during replay.To rectify, this PR introduces a map from epoch to metrics + updates the appropriate structure upon receiving a block.
When it comes to metrics emission, suppose that slot
Sis the highest possible slot in some epoch calledS.epoch. We emit "end of epoch" metrics forS.epochat the earlier of:S.epoch + 1Sfinalizes (ifSdoesn't finalize, then default to (1)).