fix: make FreeMemWithEvictionStep atomic #4885
Conversation
};

  shard_set->pool()->AwaitBrief(std::move(cb));
  }
}

void DbSlice::SendQueuedInvalidationMessagesAsync() {
We drained it before because we blocked; now we don't.
TODO investigate: we should also call this from heartbeat, so that on the second iteration it drains any items that were added.
After looking at this, the while loop was redundant in the heartbeat flow. Why? Because heartbeat was preempted (it tried to send the queued invalidation messages). pending_send_map_ can only change in two places: 1) during evictions from heartbeat, and 2) through db_slice, drained at the end via OnCbFinish(). The former (1) is guaranteed not to evict, because while heartbeat is preempted no other heartbeat runs on that thread until it completes, while the latter (2) drains pending_send_map_ via OnCbFinish(). Therefore, there is no correctness issue of missing, or failing to drain, items in pending_send_map_. The sketch below illustrates the invariant.
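To make the invariant concrete, here is a minimal illustrative sketch. pending_send_map_, Heartbeat, OnCbFinish(), and SendQueuedInvalidationMessagesAsync() are the names from this thread, but the types and bodies are stand-ins, not the actual implementation:

#include <map>
#include <string>

// Stand-in for the real member; illustrative only.
std::map<std::string, int> pending_send_map_;

void SendQueuedInvalidationMessagesAsync() {
  // Drains pending_send_map_; in the real code this can preempt the fiber.
  pending_send_map_.clear();
}

void Heartbeat() {
  // (1) Evictions may add entries to pending_send_map_ here.
  SendQueuedInvalidationMessagesAsync();
  // While this fiber is preempted inside the call above, no second
  // Heartbeat runs on this thread, so (1) cannot race with the drain.
}

void OnCbFinish() {
  // (2) The command flow through db_slice drains its own additions here.
  SendQueuedInvalidationMessagesAsync();
}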
As for DispatchBrief, I also don't see any issue whatsoever; the messages will reach the connections eventually, and that should be enough.
DispatchBrief can also block, btw; it's just a much rarer event.
@romange the rationale was that this doesn't happen in practice, because we don't reach this state easily (where the internal task queues are full). Your comment is very valid, and maybe we should move this outside of the non-preemptive critical section. I think it should be a simple change. I will update on this.
@romange would it be worth adding a function that won't dispatch if the task queue is full (but also won't preempt, and instead returns to the caller)?
That way, we can send the pending items later if we can't dispatch because the internal queues are full. A sketch of the idea is below.
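Hypothetical sketch of the proposed API; none of these names exist in the codebase, and the bounded queue merely stands in for the proactor's internal task queue:

#include <cstddef>
#include <functional>
#include <queue>

// Hypothetical stand-in for the proactor's internal task queue.
struct BoundedTaskQueue {
  std::queue<std::function<void()>> tasks;
  size_t capacity = 1024;

  // Unlike DispatchBrief, this never preempts: when the queue is full it
  // returns false, so the caller can keep its items in pending_send_map_
  // and retry on a later iteration.
  bool TryDispatchBrief(std::function<void()> f) {
    if (tasks.size() >= capacity)
      return false;
    tasks.push(std::move(f));
    return true;
  }
};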
Signed-off-by: kostas <[email protected]>
src/server/db_slice.cc
Outdated
if (bucket.IsEmpty())
  continue;

if (!bucket.IsBusy(slot_id))
  continue;
if (bucket.IsEmpty() || !bucket.IsBusy(slot_id))
continue;
src/server/db_slice.cc
Outdated
events_.evicted_keys += evicted_items;
DVLOG(2) << "Eviction time (us): " << (time_finish - time_start) / 1000;
return pair<uint64_t, size_t>{evicted_items, evicted_bytes};
return finalize();
No need to call finalize if expired_keys_events_recording_ is false and journal is null.
Actually, it's better to check inside finalize to avoid iterating over keys_to_journal.
bool record_keys = owner_->journal() != nullptr || expired_keys_events_recording_;

and then:

1388   if (record_keys)
1389     keys_to_journal.emplace_back(key);

So keys_to_journal will only be non-empty if we have keyspace notifications or we need to write to the journal; otherwise it stays empty, so no action is really needed there.
What we can save is the call to SendQueuedInvalidationMessagesAsync when pending_send_map is empty, as in the snippet below.
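A minimal sketch of that saving, using the member and method names from this thread (the surrounding code is illustrative, not the actual patch):

// Skip the async send entirely when nothing is queued.
if (!pending_send_map_.empty()) {
  SendQueuedInvalidationMessagesAsync();
}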
src/server/db_slice.cc
Outdated
// send the deletion to the replicas.
for (string_view key : keys_to_journal) {
  if (auto journal = owner_->journal(); journal)
    RecordExpiryBlocking(db_ind, key);
please add a comment saying that even though you call RecordExpiryBlocking, it does not block, due to the JournalFlushGuard above
I find it funny that Adi asked me why I used

// Disable flush journal changes to prevent preemption
journal::JournalFlushGuard journal_flush_guard(shard_owner()->journal());

when the caller of this function already does the same, and now you are asking me to add a comment here. Yet the only reason I added that line at the top of this function is precisely to show intent (that we explicitly disable journal flushing). I will add a comment 😄 Something like the sketch below.
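A sketch of the requested comment in place (illustrative, not the final diff):

// Send the deletion to the replicas. Note: despite its name,
// RecordExpiryBlocking does NOT block here, because the JournalFlushGuard
// above disables journal flushing, so this loop cannot preempt.
for (string_view key : keys_to_journal) {
  if (auto journal = owner_->journal(); journal)
    RecordExpiryBlocking(db_ind, key);
}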
size_t starting_segment_id,
size_t increase_goal_bytes) {
  // Disable flush journal changes to prevent preemption
  journal::JournalFlushGuard journal_flush_guard(shard_owner()->journal());
The caller of this function also uses journal::JournalFlushGuard.
In this class's destructor we do journal_->SetFlushMode(true);
This means the logic is now broken if there are other calls to the journal that are not inside FreeMemWithEvictionStepAtomic:
- I would add a DCHECK in JournalFlushGuard to verify we do not have nested calls to it (see the sketch below).
- Why did you add it here if it's also in the caller?
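A sketch of the suggested check, assuming a guard shaped roughly like the one discussed; this is not the real JournalFlushGuard, and assert() stands in for the DCHECK macro so the sketch is self-contained:

#include <cassert>

// Minimal stand-in so the sketch compiles; the real Journal lives elsewhere.
struct Journal {
  void SetFlushMode(bool) {}
};

class JournalFlushGuard {
 public:
  explicit JournalFlushGuard(Journal* journal) : journal_(journal) {
    assert(!active_ && "nested JournalFlushGuard detected");  // DCHECK in real code
    active_ = true;
    if (journal_)
      journal_->SetFlushMode(false);  // disable flushing: no preemption
  }

  ~JournalFlushGuard() {
    active_ = false;
    if (journal_)
      journal_->SetFlushMode(true);  // re-enables flushing unconditionally
  }

 private:
  Journal* journal_;
  // Per-thread flag: a nested guard on the same thread trips the assert.
  static inline thread_local bool active_ = false;
};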
     ++num_seg_visited, segment_id = GetNextSegmentForEviction(segment_id, db_ind)) {
  const auto& bucket = db_table->prime.GetSegment(segment_id)->GetBucket(bucket_id);
  if (bucket.IsEmpty() || !bucket.IsBusy(slot_id))
    continue;
The only difference is that I combined those two statements.
if ((evicted_items == max_eviction_per_hb) || (evicted_bytes >= increase_goal_bytes))
  goto finish;
}
for (int32_t slot_id = num_slots - 1; slot_id >= 0; --slot_id) {
No functional changes in this block of code. I moved the FiberAtomicGuard above.
@adiholden I can also move the Guard back locally so the diff is cleaner.
FreeMemWithEvictionStep was not atomic, yet it was used by a flow -- RetireExpireAndEvict() -- which should be atomic. Therefore, we make this function atomic (since it is only used by that flow). A sketch of the shape of the change appears below.
Resolves #4875
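For illustration, a minimal sketch of the shape of the fix, assuming the FiberAtomicGuard helper mentioned in this thread and the signature from the diff above (the body is elided; this is not the actual patch):

pair<uint64_t, size_t> DbSlice::FreeMemWithEvictionStepAtomic(
    DbIndex db_ind, size_t starting_segment_id, size_t increase_goal_bytes) {
  // Any preemption inside this scope now fails loudly, which is what makes
  // the eviction step atomic with respect to the fiber scheduler.
  FiberAtomicGuard guard;

  // Disable journal flushing so RecordExpiryBlocking below cannot preempt.
  journal::JournalFlushGuard journal_flush_guard(shard_owner()->journal());

  // ... eviction loop: walk buckets, evict items, record expiries ...
}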