
🚧 Mark-And-Sweep Garbage Collection #1020


Draft · MatthiasReumann wants to merge 46 commits into main

Conversation

MatthiasReumann (Collaborator)

Description

Continues the work from #980 and resolves issue #644.

Checklist:

  • The pull request only contains commits that are focused and relevant to this change.
  • I have added appropriate tests that cover the new/changed functionality.
  • I have updated the documentation to reflect these changes.
  • I have added entries to the changelog for any noteworthy additions, changes, fixes or removals.
  • I have added migration instructions to the upgrade guide (if needed).
  • The changes follow the project's style guidelines and introduce no new warnings.
  • The changes are fully tested and pass the CI checks.
  • I have reviewed my own code changes.

q-inho and others added 24 commits June 2, 2025
…ounting and custom hash/equality functions for edges
burgholzer (Member) left a comment:

Just briefly had some time to look over this and wanted to leave some comments just in case. Many thanks for working on this 🙏

Comment on lines 46 to 47
std::uint16_t flags = 0; // TODO: Would it make sense to use a larger datatype
// here since Ref is gone?
burgholzer (Member):

Yeah. I was thinking about that already.
Initially, I would have hoped that we'd get some space savings for the nodes here, but that turned out to be wishful thinking. So we might as well make the padding explicit.
However, that's also kind of awkward, as there is no 48-bit type.

MatthiasReumann (Collaborator, author):

What's your take on using a bit field? All I know is that they exist.

burgholzer (Member):

Hm. Just read through the linked page.
Sounds reasonable in principle. I am a bit worried that the standard text mentions the implementation-defined nature of the allocation details quite often.
We should, at least, make sure that on the platforms we commonly test with, the packing and alignment of the resulting struct are as we would expect.
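A compile-time check along those lines might look like this; a minimal sketch, assuming a hypothetical Flags bit-field struct like the one discussed below:

#include <cstdint>

struct Flags { // hypothetical bit-field layout
  std::uint32_t mark : 1;
  std::uint32_t reduced : 1;
  std::uint32_t dm : 1;
};

// Fail the build on any platform where packing or alignment deviates
// from what we expect.
static_assert(sizeof(Flags) == 4, "Flags must pack into 32 bits");
static_assert(alignof(Flags) == 4, "Flags must be 4-byte aligned");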

MatthiasReumann (Collaborator, author):

Yes. Quick googling tells us that this might not be the most portable or efficient solution.

So I guess we stick with 32 bits for now? 🤔

burgholzer (Member):

Yeah. Let's stick with that for now. We don't use most of the flag bit field anyway.
So this is highly likely the most portable solution for now.
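For illustration, staying with a plain 32-bit field means keeping hand-written masks around; a minimal sketch with hypothetical constant names:

#include <cstdint>

// Hypothetical named masks for a plain 32-bit flags field.
inline constexpr std::uint32_t kMarkFlag = 1U << 0;
inline constexpr std::uint32_t kReducedFlag = 1U << 1;

void example() {
  std::uint32_t flags = 0;
  flags |= kMarkFlag;                            // set the mark bit
  const bool marked = (flags & kMarkFlag) != 0U; // test it
  flags &= ~kMarkFlag;                           // clear it again
  (void)marked;
}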

MatthiasReumann (Collaborator, author) commented Jun 27, 2025:

Thinking about bit fields made me wonder: Wouldn't it be nice to have something like:

struct Flags { // Size: 4 bytes, alignment 4 bytes
  uint32_t mark : 1;
  uint32_t reduced : 1;
  uint32_t dm : 1;
  uint32_t firstPath : 1;
  uint32_t conjugated : 1;
};

Flags f;

The compiler would give us the respective masks for free. LLVM uses bit fields in a similar fashion too, see here.
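As a usage sketch (building on the Flags struct above), the individual bits then read and write like ordinary members, and the compiler emits the mask and shift operations:

void useFlags() {
  Flags f{};        // value-initialized: all bits zero
  f.mark = 1;       // no manual mask needed
  if (f.mark != 0U) {
    f.mark = 0;     // clear the bit again
  }
}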

Anyhow, probably something for a follow-up PR.

burgholzer (Member):

That looks really reasonable and convenient.
Seems to be something for a separate PR, but I definitely like the idea! 👍🏼

@@ -101,8 +101,8 @@ MatrixDD getInverseDD(const qc::Operation& op, Package& dd,
* @brief Apply a unitary operation to a given vector DD.
*
* @details This is a convenience function that realizes @p op times @p in and
* correctly accounts for the permutation of the operation's qubits as well as
* automatically handles reference counting.
burgholzer (Member):

I think I would personally prefer to keep the old wording in most of the places where this was changed. Essentially, we are still doing some kind of reference counting, but only at the top level.

Comment on lines 261 to 274
/// @brief Mark edges contained in @p roots.
template <class Edge> static void mark(const RootSet<Edge>& roots) {
  for (auto& [edge, _] : roots) {
    auto e = edge; // copy the key; see the discussion below
    e.mark();
  }
}

/// @brief Unmark edges contained in @p roots.
template <class Edge> static void unmark(const RootSet<Edge>& roots) {
  for (auto& [edge, _] : roots) {
    auto e = edge; // copy the key; see the discussion below
    e.unmark();
  }
}
burgholzer (Member):

Just so that it's noted down: I am still not quite sure whether this trick of copying the hashmap key is actually valid.

MatthiasReumann (Collaborator, author):

I was wondering that too. Seems to work - but looks awkward. I think something like a const_cast could be useful here.

burgholzer (Member):

Maybe one could also just separately mark edge.p and edge.w. Maybe that would work without the copy because only the pointers are const, but not the data they point to 🤷🏼

MatthiasReumann (Collaborator, author):

Much easier solution: Make Edge::mark and Edge::unmark const functions 🫠
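A minimal sketch of that idea, with Node and the flag bit purely illustrative: marking mutates the pointed-to node rather than the Edge itself, so the member functions can be const and remain callable on const hashmap keys without copying them first.

#include <cstdint>

struct Node {
  std::uint32_t flags = 0;
};

struct Edge {
  Node* p = nullptr;

  // Only the pointer is part of the Edge's constness, not the node it
  // points to, so a const member function may still set the mark bit.
  void mark() const { p->flags |= 1U; }
  void unmark() const { p->flags &= ~1U; }
};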

burgholzer added the enhancement (New feature or request), DD (Anything related to the DD package), and c++ (Anything related to C++ code) labels on Jun 26, 2025
codecov bot commented Jun 26, 2025:

Codecov Report

Attention: Patch coverage is 99.09091% with 2 lines in your changes missing coverage. Please review.

Files with missing lines             | Patch % | Lines
src/dd/FunctionalityConstruction.cpp | 92.3%   | 1 Missing ⚠️
src/dd/RealNumberUniqueTable.cpp     | 96.2%   | 1 Missing ⚠️


MatthiasReumann (Collaborator, author) commented Jun 26, 2025:

👋🏻 @burgholzer

I think this is a good point to ask you for some input. The TODOs in the code highlight the discussion points.

  1. The current implementation holds a static 0.5 in the real unique table. Since it is unused, this number will certainly be deleted, and some tests have failed as a consequence. But what's more interesting is that other tests - Approximation - fail due to numeric issues when this static value is removed. For example, TwoQubitCorrectlyRebuilt doesn't approximate anything with the exact budget of 0.25 but works with 0.26; similarly ThreeQubitRemoveNodeWithChildren.

  2. For some reason, Grover/Grover.Functionality/15_qubits_2 is the only test that fails inconsistently, both on my system and on the CI. I want to believe this is also due to numerical issues - but I'm not entirely sure either. In fact, this also happens for the recursive version of the test.

  3. Using the track and untrack semantics requires a different philosophy for reference counting in the project, I think. As it stands, it feels kind of awkward to use track and untrack throughout. Currently, it also seems a bit inconsistent in the way it is used.

    Here's a suggestion: it is the callee's job to track and untrack states. When returning states, the state should always be untracked. The function itself can change the tracked states but must always "clean up its own garbage". Essentially, this boils down to the end user calling track and untrack for the DDs they want to - well - track. I could even imagine that it is possible to refactor VectorDD into a struct with a constructor (track) and destructor (untrack), similar to the RAII idiom; see the sketch after this list.

    This is a major refactoring - for sure.
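A rough sketch of that RAII idea, assuming the Package::track/untrack interface from this PR; the wrapper name and shape are purely hypothetical:

// Hypothetical RAII wrapper: tracking follows the wrapper's lifetime.
class TrackedVectorDD {
public:
  TrackedVectorDD(Package& dd, const VectorDD& state)
      : dd_(dd), state_(state) {
    dd_.track(state_);
  }
  ~TrackedVectorDD() { dd_.untrack(state_); }

  // Non-copyable so that track/untrack calls stay balanced.
  TrackedVectorDD(const TrackedVectorDD&) = delete;
  TrackedVectorDD& operator=(const TrackedVectorDD&) = delete;

  const VectorDD& get() const { return state_; }

private:
  Package& dd_;
  VectorDD state_;
};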

Thanks 🙇🏻

burgholzer (Member):

> 1. The current implementation holds a static 0.5 in the real unique table. […]

Yeah. This 0.5 turned out to be fairly important for numerical stability in experimental evaluations. Since 0.5 can be exactly represented as a floating-point number, it is highly beneficial to have it represented explicitly. At some point, we even thought about adding more numbers of the form 1/2^k to the table unconditionally.
That is actually one of the main reasons for the RealNumberTable and all the managing we do in that regard, instead of simply relying on complex numbers and straight-up multiplications.
One potential immediate solution here would be to add 0.5 to the statically defined numbers.
That adds another if check to all of the computations, though. So I think I'd much rather have it baked into the respective unique table somehow.
Maybe one needs a way to mark a node/number as "immortal". Another flag?
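One way such an immortal bit could interact with the sweep phase of mark-and-sweep, purely as a sketch with hypothetical names and layout:

// Hypothetical number entry with an "immortal" bit that the sweep skips.
struct RealNumber {
  double value = 0.0;
  bool marked = false;
  bool immortal = false; // e.g. set for 0.5 so it is never reclaimed
  RealNumber* next = nullptr;
};

// Sweep phase: reclaim entries that are neither marked nor immortal,
// and reset the mark bit on the survivors.
void sweep(RealNumber*& head) {
  RealNumber** link = &head;
  while (*link != nullptr) {
    RealNumber* cur = *link;
    if (!cur->marked && !cur->immortal) {
      *link = cur->next; // unlink...
      delete cur;        // ...and reclaim
    } else {
      cur->marked = false;
      link = &cur->next;
    }
  }
}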

> 2. For some reason, Grover/Grover.Functionality/15_qubits_2 is the only test that fails inconsistently […]

Yeah. That is highly likely to be numerical issues as well. Grover is particularly sensitive to these kinds of errors.
In the ideal case, the DD after every Grover iteration consists of only two strands: a uniform-superposition strand of nodes with an edge weight of 1/sqrt(2) on every edge, and a strand that marks the solution.
The big problem is that numerical inaccuracies create a situation where the uniform-superposition branch grows exponentially in terms of the number of nodes, while it remains exponentially close to the ideal state in terms of fidelity.
That's a fairly fundamental problem that the current implementation of the DD package tries very hard to work around for certain system sizes.

> 3. Using the track and untrack semantics requires a different philosophy for reference counting in the project […]

I would very much be open to a more fundamental change to how references are tracked in the package. I agree that it feels awkward at times. I am open to ideas for how to do this better and to automate it a bit more.
I'd just have one request: can we extract some of the useful changes from here into a separate PR, to reduce the size of this one?

I hope this helps.

Comment on lines +100 to +103
const auto next = dd->multiply(iterationOp, iteration);
dd->track(next);
dd->untrack(iteration); // This will automatically untrack the iterationOp.
iteration = next;
MatthiasReumann (Collaborator, author):

Interestingly, adding garbageCollect() here causes the numerical issues.

burgholzer (Member):

I suppose that might be because the tables are actually full enough so that collection happens. Without garbage collection, "dead" entries might become alive again in subsequent computations. With garbage collection, these entries might be gone and the computation might result in slightly different results due to tolerances and such.
The interesting thing is that I would not expect things to actually change based on the changes in this PR. We are still using the same criterion for when to collect garbage and we should be tracking the same DDs as previously with the more fine-grained reference counting.
