fix: Add back CustomMemoryResource with bug fix#17698
Open
sungwoo-XCENA wants to merge 10 commits into
Open
Conversation
✅ Deploy Preview for meta-velox canceled.
|
The prototype commit registered CustomMemoryResource and gave QueryCtx per-resource root pools, but pool allocations still routed through the MemoryManager default allocator. This wires the dispatch end-to-end: - MemoryPool::Options carries a shared_ptr<CustomMemoryResource>. MemoryPoolImpl resolves allocator_ and arbitrator_ from the resource at construction, falling back to manager defaults when null. - genChild propagates the resource to children, so aggregate and leaf children of a custom-resource root inherit the allocator and tag. - MemoryManager::addRootPool resolves resourceTag to a registered resource once and stores the shared_ptr. Unregistered tags throw. - arbitrator addPool/removePool route through pool->arbitrator() so custom arbitrators participate in pool registration. - customResources_ becomes vector<shared_ptr<CustomMemoryResource>>; pools keep strong references, replacing the implicit manager-outlives-pools contract with a refcount invariant. Tests consolidated into CustomMemoryResourceTest.cpp covering registration, ordering, pointer stability, per-query pool creation, chained-reclaimer ordering, cross-resource spill flow, and root/child allocator dispatch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dispatch - Require arbitrator and reclaimerFactory on registration; registration now throws on any null required field. - Drop pure-virtual MemoryPool::arbitrator() in favor of a nullptr default so external subclassers keep compiling; MemoryPoolImpl still overrides to return the resolved arbitrator. - Move QueryCtx custom-pool creation into an anonymous-namespace helper taking a raw QueryCtx*. - Refresh doc comments on the public API and restore the load-bearing invariant that pools share ownership of CustomMemoryResource. - Update test helper to populate the new required fields, add negative coverage for null arbitrator/factory, and drop the now-contradictory noReclaimerFactory test.
…make invalid states unrepresentable - Move the resource registry out of MemoryManager into a standalone CustomMemoryResourceRegistry that wraps ScopedRegistry. Exposes a process-global singleton via global() and per-query/per-test scopes via createRegistry(parent). Registration and lookup go through the Registry instance (Registry::insert, Registry::find, Registry::clear); aligns with the pattern from facebookincubator#16982/facebookincubator#16986. - Convert CustomMemoryResource from a struct with optional fields into a class with a validating constructor. Required fields are now checked at construction, so post-construction the resource is guaranteed valid. Removes the silent wrong-tier failure mode where a null allocator or arbitrator would let the pool quietly fall back to the default tier. - Add MemoryManager::addCustomRootPool(name, shared_ptr<resource>, debugOpts) as the per-query materialization entry point. Drop the resourceTag overload of addRootPool. - Tests use per-test isolated registries so concurrent runs cannot collide on tag registration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sources Wires CustomMemoryResourceRegistry into Task/Driver/Operator so that for every custom root pool registered on a QueryCtx, Task builds a parallel 'task -> node -> operator' aggregate hierarchy. Operators look up their mirrored leaf pool via Operator::customPool(tag), letting them allocate from non-default memory backends (e.g. GPU, CXL) without disturbing the default pool hierarchy or its accounting. - Adds kCustomMemoryResourceRegistryKey for QueryCtx-scoped lookup. - Task: initCustomTaskPools, getOrAddCustomNodePool / getOrAddCustomJoinNodePool, addCustomOperatorPool, customNodePool, with ownership in customChildPools_. - DriverCtx::addCustomOperatorPools mirrors addOperatorPool per tag. - OperatorCtx caches the per-tag leaf pools and exposes customPool(). - New CustomMemoryHierarchyTest covers the mirrored hierarchy. - Docs: memory.rst and spilling.rst updated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
e67cfd0 to
c9260b4
Compare
Build Impact AnalysisSelective Build Targets (building these covers all 524 affected)Total affected: 524/581 targets
Affected targets (524)Directly changed (441)
Transitively affected (83)
Slow path • Graph generated from PR branch |
bikramSingh91
approved these changes
Jun 4, 2026
|
@bikramSingh91 has imported this pull request. If you are a Meta employee, you can view this in D107459609. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
The original PR#17529 was merged into main, but later it was found that memory leak was suppressed accidentally in CI. The PR merge was reverted by #17667.
This PR is to add back the original #17529 with the memory leak fix.
Root cause
CustomMemoryHierarchyTest creates kSerial tasks (Task::create → init() builds drivers eagerly), forming a Task↔Driver shared_ptr cycle. The tests never terminated them, so every task's pool subtree leaked — exactly the LeakSanitizer failure the reviewers hit.
Changes