Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP Add LedgerStateCache #4642

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ThomasBrady
Copy link
Contributor

@ThomasBrady ThomasBrady commented Feb 6, 2025

Adds LedgerStateCache to store all Soroban entries:

  • Populates (LedgerStateCache::addEntry) the cache in PopulateLedgerCacheWork (executed in LedgerManagerImpl::loadLastKnownLedger).
  • Updates the cache (LedgerStateCache:::addEntries) each ledger from LedgerManagerImpl::transferLedgerEntriesToBucketList.
  • Reads the cache (LedgerStateCache::readEntry) in LedgerTxnRoot::Impl:::getNewestVersion.

Notes:

  • LedgerStateCache lives in an optional shared pointer and is created at startup, originally in the LedgerManagerImpl
  • LedgerTxnRoot takes an std::optional<LedgerStateCache> in its constructor, and PopulateLedgerCacheWork access it with LedgerManager::getLedgerStateCache.
  • addEntry and addEntries are private and only accessible from friend classes LedgerManagerImpl and PopulateLedgerCacheWork
  • addEntry and addEntries acquire a unique lock on the cache
  • Currently, the LedgerStateCache only supports contract entries. To access them in the index, I've added Bucket / Index::getContractEntryRange.
  • The LedgerStateCache is enabled via Config::IN_MEMORY_SOROBAN_STATE_FOR_TESTING (default true).

Description

Resolves #4556

Checklist

  • Reviewed the contributing document
  • Rebased on top of master (no merge commits)
  • Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
  • Compiles
  • Ran all tests
  • If change impacts performance, include supporting evidence per the performance document

…the cache in LedgerManagerImpl::loadLastKnownLedger via PopulateLedgerCacheWork. Updates the cache each ledger from LedgerManagerImpl::transferLedgerEntriesToBucketList. Reads the cache via LedgerTxnRoot::Impl:::getNewestVersion.
@ThomasBrady ThomasBrady requested a review from SirTyson February 6, 2025 17:51
@ThomasBrady ThomasBrady force-pushed the global-in-memory-cache branch from c1f11fd to 8df0219 Compare February 6, 2025 22:29
Copy link
Contributor

@SirTyson SirTyson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in general this looks good! I think we need explicit cache tests though. Specifically, we need population tests (which isn't covered in any unit tests currently I think). While we might have transitive coverage of actual cache usage, a few explicit unit tests would be useful as well.

I have two sorta high level comments. First, given that this is our base apply time cache and we're making protocol assumptions based on its efficiency, we need to be memory conscious and worry about things like copy semantics, probably preffering const shared_ptrs where we can.

2nd, I think we need to be more defensive in general about who has access with this cache. Many different threads and subsystems access state now after Marta'a background apply changes. However, each thread/subsystem has different expectations wrt that that access looks like. This cache in particular is an apply-time only cache, so I think it should be very limited to the apply time thread. I think to accomplish this we should have LedgerTxnRoot hold a unique_ptr to the cache instead of spreading around shared_ptrs. I think this is probably easiest if you construct a uniqe_ptr of the cache via your work class, then call some function like LedgerManager::setApplyTimeCache(unique_ptr&&). LedgerManager will then set the cache pointer in the ltx root. I think this is important as we want to make sure that only ltx has access to the cache. We might need to add an invalidation function too, such that the addbucketEntriesToCache function is a member of LedgerTxnRoot if that makes sense. This would also allow us to get rid of locks, since we'd be guaranteed everything is happening from the apply time thread. This might be different than what was discussed in the original design doc, but I think makes more sense after Marta's changes, where there is a more clear distinction between "apply time only state" and "state snapshots used by overlay and other subsystems"

CLOG_DEBUG(Ledger, "Not a contract entry, type: {}", lk.type());
}
}
if (!lastContractEntry && firstContractEntry && lk.type() > TTL)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no entry larger than TTL, lastContractEntry.second is guarenteed to be EOF

@@ -103,6 +105,12 @@ class InMemoryIndex
return mOfferRange;
}

std::optional<std::pair<std::streamoff, std::streamoff>>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change this to just return the starting offset, given that the ending offset is guaranteed to always be EOF

// bound is EOF
else
{
CLOG_DEBUG(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These debug messages are super noisy, probably best to remove if tests are passing.

{
if (!getIndex().getContractEntryRange())
{
CLOG_DEBUG(Bucket, "LiveBucket::getContractEntryRange() = nullopt");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: remove noisy message

if (mDiskIndex)
{
// Get the smallest and largest possible contract entry keys
LedgerKey upperBound(TTL /*9*/);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upper bound not required since no entry is larger than TTL

return;
}
mState.erase(entry);
mState.emplace(entry);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this is population, we should just call emplace and assert it actually got inserted. There should be no duplicate inserts on construction.

if (mLedgerStateCache)
{
auto populateLedgerCacheWork =
mApp.getWorkScheduler().executeWork<PopulateLedgerCacheWork>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be better as a step in AssumeStateWork. That's where we do all our "in-memory matches current state" work, such as setting the BucketList, indexing buckets, etc.

@@ -534,6 +535,8 @@ LedgerTxn::Impl::commitChild(EntryIterator iter,
RestoredKeys const& restoredKeys,
LedgerTxnConsistency cons) noexcept
{
// We will want to acquire the write lock to the cache in this function.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove comment

mBucketsToProcess.push_back(bucket);
CLOG_DEBUG(Ledger, "Adding bucket {} to mBucketsToProcess[{}]",
xdr::xdr_to_string(bucket->getHash()), counter);
counter++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: ++counter

{
mBucketsToProcess.push_back(bucket);
CLOG_DEBUG(Ledger, "Adding bucket {} to mBucketsToProcess[{}]",
xdr::xdr_to_string(bucket->getHash()), counter);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: use binToHex() or hexAbbrev() instead of xdr_to_string

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add in-memory Ledger State mode
2 participants