Skip to content

✨ Implement mark-and-sweep garbage collection #980

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

q-inho
Copy link

@q-inho q-inho commented Jun 2, 2025

Description

The DD package now uses a mark-and-sweep garbage collector instead of reference counting.
Nodes no longer store reference counts.
The incRef and decRef functions now manage the root set for the new garbage collector.

Resolves #644

@q-inho
Copy link
Author

q-inho commented Jun 2, 2025

@burgholzer , Do you have any tips for testing for the failed test? I locally test using these commands

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
ctest -C Release --test-dir build/test/dd

but it shows follwong on my test


Start 120: DDPackageTest.ResetClearsRoots
120/288 Test #120: DDPackageTest.ResetClearsRoots ................................................................................   Passed    0.00 sec

Copy link
Member

@burgholzer burgholzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey 👋🏻
Many thanks for the changes here. This is really appreciated!

I would not consider this a full review yet. But I wanted to get a first batch of comments out to you. Take everything with a grain of salt.
As I was reading through the PR, my opinion on how this should probably look like evolved a bit. So some of the comments might not perfectly align with each other.
Generally, I believe the high-level incRef/decRef (or track/untrack) functions are here to stay. The key difference is that the actual reference counts are now only evaluated once the unique tables are full enough to trigger collection instead of all the time during computation.

Feel free to ask any questions that pop up. You can see the clang-tidy linter complaints on the summary section on the GitHub Actions summary page.

Comment on lines 235 to 248
// helper to track complex number references recursively
template <class Node> void markComplexRec(Node* p, bool inc) noexcept {
if (Node::isTerminal(p)) {
return;
}
for (const auto& child : p->e) {
if (inc) {
cn.incRef(child.w);
} else {
cn.decRef(child.w);
}
markComplexRec(child.p, inc);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the other comment, I believe to would be best to handle this differently. Specifically, to handle numbers similarly to the nodes here.
Note that the RealNumber class does not have any space for flags.
So you probably have to rely on a similar trick to the one we use for marking negative numbers, manipulating the lower bits of the pointer to the numbers.
A 1 in the LSB of the pointer address indicates the number is negative.
One could use the second bit to mark entries for example.

Comment on lines 22 to 24
* @brief The total number of active entries.
*/
std::size_t numActiveEntries = 0U;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Active entries do not make any sense any more now that they are not being tracked through continuous reference counting anymore.
So any mentions of it can essentially be removed.

It could be helpful to provide a method in the DD package that still allows to compute this by marking all active entries (using the new mark method) and then counting in the unique table.

Comment on lines 119 to 123
// Collecting complex numbers forces collection of the node tables. The
// following calls explicitly use `true` to enforce this.
const auto vCollect = vUniqueTable.garbageCollect(true);
const auto mCollect = mUniqueTable.garbageCollect(true);
const auto dCollect = dUniqueTable.garbageCollect(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

force should only be set to true here if complex numbers have been collected. Similar to the original code.

@@ -120,7 +120,7 @@ std::map<std::string, std::size_t> sample(const qc::QuantumComputation& qc,
auto measurement = dd.measureAll(e, false, mt);
counts.operator[](measurement) += 1U;
}
// reduce reference count of measured state
// remove the measured state edge from the root set
dd.decRef(e);
Copy link
Member

@burgholzer burgholzer Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just based on the changes to the docstrings throughout the files I believe these methods deserve a more appropriate name now. Maybe something like track and untrack?

The rename should also help to revisit all the places where these functions are called, to see whether these calls are still necessary. One of the goals here would actually be to require as little as possible manual intervention.
The garbage collection would ideally check in periodic intervals how full the unique tables are. If one of them could be collected, the marking and collection would start.

@q-inho
Copy link
Author

q-inho commented Jun 2, 2025

Thank you for the detailed review. I began with a minimal core implementation, but realized later many other codes are interacted, resulting existing code paths to be escaped from my purpose. I really appreciate for your feedback.

@burgholzer
Copy link
Member

@burgholzer , Do you have any tips for testing for the failed test? I locally test using these commands

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release

cmake --build build --config Release

ctest -C Release --test-dir build/test/dd

but it shows follwong on my test




Start 120: DDPackageTest.ResetClearsRoots

120/288 Test #120: DDPackageTest.ResetClearsRoots ................................................................................   Passed    0.00 sec



As for testing, the CI is simply running (a version of)

ctest -C Release --test-dir build/ --output-on-failure

So that should allow you to see the same results.

If you use an IDE like CLion (which ai can only recommend. And CLion is free since a couple weeks ago), then you'll also get an "All tests" target that you can execute directly from the IDE.

@q-inho
Copy link
Author

q-inho commented Jun 2, 2025

@burgholzer , Do you have any tips for testing for the failed test? I locally test using these commands

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release

cmake --build build --config Release

ctest -C Release --test-dir build/test/dd

but it shows follwong on my test




Start 120: DDPackageTest.ResetClearsRoots

120/288 Test #120: DDPackageTest.ResetClearsRoots ................................................................................   Passed    0.00 sec

As for testing, the CI is simply running (a version of)

ctest -C Release --test-dir build/ --output-on-failure

So that should allow you to see the same results.

If you use an IDE like CLion (which ai can only recommend. And CLion is free since a couple weeks ago), then you'll also get an "All tests" target that you can execute directly from the IDE.

I didn't know CLion is free. Thats really a good tip to know. Thanks a lot!

Copy link

codecov bot commented Jun 4, 2025

Codecov Report

Attention: Patch coverage is 98.24561% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/dd/RealNumberUniqueTable.cpp 93.7% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@q-inho q-inho requested a review from burgholzer June 4, 2025 12:00
@q-inho
Copy link
Author

q-inho commented Jun 4, 2025

@burgholzer I twisted some strategies and now it is passing the tests. May I ask for your further feedbacks?

Copy link
Member

@burgholzer burgholzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for your continued work on this. It is certainly heading in the right direction. Great progress. 🙌
I think I have now reviewed the full PR in all its detail. You can find respective comments inline.

Please try to address all of them in your next iteration or explicitly ask questions (as reply to the comments) if something is unclear.

This will still be some work, but I am sure this is doable 😌

Edit: note that there are still some unresolved/unaddressed comments from the first review. Some of them partially overlap with the comments in the latest review, but some don't.

Comment on lines +398 to +414
namespace dd {
template <class Node> struct EdgePtrHash {
std::size_t operator()(const Edge<Node>& e) const noexcept {
const auto h1 = murmur64(reinterpret_cast<std::size_t>(e.p));
const auto h2 = murmur64(reinterpret_cast<std::size_t>(e.w.r));
const auto h3 = murmur64(reinterpret_cast<std::size_t>(e.w.i));
return qc::combineHash(qc::combineHash(h1, h2), h3);
}
};

template <class Node> struct EdgePtrEqual {
bool operator()(const Edge<Node>& lhs, const Edge<Node>& rhs) const noexcept {
return lhs.p == rhs.p && lhs.w.r == rhs.w.r && lhs.w.i == rhs.w.i;
}
};
} // namespace dd

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change should not be necessary if I am not overlooking things.
The Edge class defines a hash function at the very end of this file.

And the equality operator defined for the Edge class should also be sufficient here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Edge already provides them.

Comment on lines 273 to 332
template <class Node> bool addRoot(const Edge<Node>& e) {
if constexpr (std::is_same_v<Node, vNode>) {
auto it = vectorRoots.find(e);
if (it == vectorRoots.end()) {
vectorRoots.emplace(e, 1U);
return true;
}
++it->second;
return false;
} else if constexpr (std::is_same_v<Node, mNode>) {
auto it = matrixRoots.find(e);
if (it == matrixRoots.end()) {
matrixRoots.emplace(e, 1U);
return true;
}
++it->second;
return false;
} else if constexpr (std::is_same_v<Node, dNode>) {
auto it = densityRoots.find(e);
if (it == densityRoots.end()) {
densityRoots.emplace(e, 1U);
return true;
}
++it->second;
return false;
}
return false;
}

template <class Node> bool removeRoot(const Edge<Node>& e) {
if constexpr (std::is_same_v<Node, vNode>) {
auto it = vectorRoots.find(e);
if (it != vectorRoots.end()) {
if (--it->second == 0U) {
vectorRoots.erase(it);
return true;
}
return false;
}
} else if constexpr (std::is_same_v<Node, mNode>) {
auto it = matrixRoots.find(e);
if (it != matrixRoots.end()) {
if (--it->second == 0U) {
matrixRoots.erase(it);
return true;
}
return false;
}
} else if constexpr (std::is_same_v<Node, dNode>) {
auto it = densityRoots.find(e);
if (it != densityRoots.end()) {
if (--it->second == 0U) {
densityRoots.erase(it);
return true;
}
return false;
}
}
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fairly sure both of these functions can simply be void functions.
Their return type is not used in the main library and this would also remove any need for casts to void. In the existing incRef and decRef functions.

Comment on lines 275 to 280
auto it = vectorRoots.find(e);
if (it == vectorRoots.end()) {
vectorRoots.emplace(e, 1U);
return true;
}
++it->second;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% sure, but if I remember correctly, this can be simplified to

vectorRoots[e]++;

I might be wrong though.

Comment on lines 302 to 331
template <class Node> bool removeRoot(const Edge<Node>& e) {
if constexpr (std::is_same_v<Node, vNode>) {
auto it = vectorRoots.find(e);
if (it != vectorRoots.end()) {
if (--it->second == 0U) {
vectorRoots.erase(it);
return true;
}
return false;
}
} else if constexpr (std::is_same_v<Node, mNode>) {
auto it = matrixRoots.find(e);
if (it != matrixRoots.end()) {
if (--it->second == 0U) {
matrixRoots.erase(it);
return true;
}
return false;
}
} else if constexpr (std::is_same_v<Node, dNode>) {
auto it = densityRoots.find(e);
if (it != densityRoots.end()) {
if (--it->second == 0U) {
densityRoots.erase(it);
return true;
}
return false;
}
}
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this function should handle the case where the Edge is not in the root set and error out in that case.

Comment on lines 262 to 271
template <class Node> void markNodes(Node* p) noexcept {
if (Node::isTerminal(p) || p->marked()) {
return;
}
p->mark();
for (const auto& child : p->e) {
cn.incRef(child.w);
markNodes(child.p);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this function should be marking Edges now. Because otherwise, you will never be marking the top edge weight.

Comment on lines 122 to 125
// first sweep all node tables
const auto vCollect = vUniqueTable.garbageCollect(true);
const auto mCollect = mUniqueTable.garbageCollect(true);
const auto dCollect = dUniqueTable.garbageCollect(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As pointed out in the previous review, this is not yet the same logic as before this PR with regard to the force flag.

Comment on lines 131 to 152
// unmark all complex numbers again so reference counts are reset for the next
// collection cycle
{
std::unordered_set<vNode*> visited{};
for (const auto& [edge, _] : vectorRoots) {
cn.decRef(edge.w);
markComplexNumbersRec(edge.p, false, &visited);
}
}
{
std::unordered_set<mNode*> visited{};
for (const auto& [edge, _] : matrixRoots) {
cn.decRef(edge.w);
markComplexNumbersRec(edge.p, false, &visited);
}
}
{
std::unordered_set<dNode*> visited{};
for (const auto& [edge, _] : densityRoots) {
cn.decRef(edge.w);
markComplexNumbersRec(edge.p, false, &visited);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would assume this is no longer necessary once the reference counting for the numbers is adjusted.
Maybe the top level edge-weights need to be unmarked, but no recursive traversal should be necessary.

@@ -163,7 +204,6 @@ dEdge Package::makeZeroDensityOperator(const std::size_t n) {
f = makeDDNode(static_cast<Qubit>(p),
std::array{f, dEdge::zero(), dEdge::zero(), dEdge::zero()});
}
incRef(f);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on where this PR is heading, this incRef needs to stay.

@@ -179,7 +219,6 @@ vEdge Package::makeZeroState(const std::size_t n, const std::size_t start) {
for (std::size_t p = start; p < n + start; p++) {
f = makeDDNode(static_cast<Qubit>(p), std::array{f, vEdge::zero()});
}
incRef(f);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on where this PR is heading, this incRef needs to stay.

@q-inho
Copy link
Author

q-inho commented Jun 4, 2025

Thanks a lot for detailed comments. I am really learning a lot from this PR about cpp projects. I will reach out for further questions if I am not clear.

Co-authored-by: Lukas Burgholzer <[email protected]>
Signed-off-by: Inho Choi <[email protected]>
@burgholzer
Copy link
Member

@q-inho may I ask for an update on this? Is there anything blocking your progress?

@burgholzer burgholzer moved this to In Progress in MQT Core Jun 9, 2025
@burgholzer burgholzer added enhancement New feature or request minor Minor version update DD Anything related to the DD package c++ Anything related to C++ code unitaryHack Issues and PRs intended for unitaryHack labels Jun 9, 2025
@burgholzer burgholzer added this to the DD Package Improvements milestone Jun 9, 2025
@q-inho
Copy link
Author

q-inho commented Jun 9, 2025

@burgholzer Thank you for asking. I am currently not passing the test for Grover, so I am reviewing the changes I made, where it is going wrong. I am reviewing the codes myself. 😅

@burgholzer
Copy link
Member

@burgholzer Thank you for asking. I am currently not passing the test for Grover, so I am reviewing the changes I made, where it is going wrong. I am reviewing the codes myself. 😅

Have you already implemented the changes to the handling of the complex number ref counts? If so, it might be worth pushing the changes even if tests fail. Maybe I can quickly point you towards the underlying error.

@q-inho
Copy link
Author

q-inho commented Jun 9, 2025

@burgholzer Thank you for asking. I am currently not passing the test for Grover, so I am reviewing the changes I made, where it is going wrong. I am reviewing the codes myself. 😅

Have you already implemented the changes to the handling of the complex number ref counts? If so, it might be worth pushing the changes even if tests fail. Maybe I can quickly point you towards the underlying error.

I believed I made correct implementation on complex number ref counts. I will check little bit more and will push soon. Thank you for your help

@burgholzer
Copy link
Member

@q-inho
Just fyi: today (anywhere on earth) is officially the last day of unitary hack. Since you submitted the PRs before that deadline, you can keep working on them until next Monday.
Of course, you are welcome to keep working on the pull request after that. It just won't be counted towards the unitary hack bounties.

@burgholzer
Copy link
Member

@q-inho may I ask for an update on this?

@q-inho
Copy link
Author

q-inho commented Jun 13, 2025

@burgholzer Apologize for delay due to urgent works. I just updated current progress. I think this now implement mark & sweep and replaces node reference counts with a mark flag and manages roots through Package::incRef/decRef. Complex numbers are kept in RealNumberUniqueTable and marked via pointer bits.

@q-inho
Copy link
Author

q-inho commented Jun 15, 2025

The Windows build failed because it requires clearMark to be declared as constexpr, while passing on Mac and Linux. However, defining clearMark as constexpr require the use of reinterpret_cast, which is not permitted in constant expressions. As a temporary workaround, I have replaced constexpr with inline.

@burgholzer
Copy link
Member

Hey @q-inho 👋🏼
could you please give me push access to your fork so that I can push some changes to this PR?

@q-inho
Copy link
Author

q-inho commented Jun 15, 2025

@burgholzer Thanks for your help! I've already enabled Allow edits by maintainers for this PR. Did you mean I should grant you direct access by adding you as a collaborator on my repository branch instead?

@burgholzer
Copy link
Member

Yeah, thanks!

I just pushed a work-in-progress commit with some fixups and corrections.
This is not yet functional unfortunately, but I need to get some sleep now.
I'll investigate more tomorrow. If you want to look at in the meantime, feel free to go ahead.
Something about the complex number tracking is still off. The mark and unmark does not seem to work fully.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Anything related to C++ code DD Anything related to the DD package enhancement New feature or request minor Minor version update unitaryHack Issues and PRs intended for unitaryHack
Projects
No open projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

✨ Mark and Sweep Garbage Collection for DD Package
2 participants