Concurrent Immix #311
base: master
Conversation
@@ -0,0 +1,536 @@
#define private public // too lazy to change openjdk...
This is copied from the lxr branch. Basically, when code needs to be patched, we need to call a private method. A proper solution would be to define a friend class in OpenJDK.
#define __ ideal.

void MMTkSATBBarrierSetC2::object_reference_write_pre(GraphKit* kit, Node* src, Node* slot, Node* pre_val, Node* val) const {
  if (can_remove_barrier(kit, &kit->gvn(), src, slot, val, /* skip_const_null */ false)) return;
Be careful of barrier elimination when implementing this in the binding: target == null does not mean the barrier can be eliminated. This bug took me a while to find.
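The reason the new value does not matter can be shown with a small sketch (hypothetical names, not the binding's actual code): a SATB pre-write barrier enqueues the *old* value of the slot, so even a null store still has work to do.

```cpp
#include <cassert>
#include <vector>

// Hedged sketch, not the binding's code: all names here are hypothetical.
// A SATB (snapshot-at-the-beginning) pre-write barrier records the OLD
// value of the field before the store. The NEW value (even null) is
// irrelevant, so "target == null" cannot justify eliminating the barrier.
static bool marking_active = true;     // assume a concurrent mark is running
static std::vector<void*> satb_queue;  // stand-in for the SATB mark queue

void satb_reference_write(void** slot, void* new_val) {
  void* old_val = *slot;
  if (marking_active && old_val != nullptr) {
    satb_queue.push_back(old_val);     // preserve the snapshot invariant
  }
  *slot = new_val;                     // a null store still ran the pre-barrier
}
```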
openjdk/barriers/mmtkSATBBarrier.hpp
Outdated
So this mmtkSATBBarrier is also an object-remembering barrier. Theoretically, we should be able to reuse the existing mmtkObjectBarrier, but due to the current barrier API design, mmtkObjectBarrier has the generational semantics baked in.
_pre_barrier_c1_runtime_code_blob = Runtime1::generate_blob(buffer_blob, -1, "mmtk_pre_write_code_gen_cl", false, &pre_write_code_gen_cl);
MMTkPostBarrierCodeGenClosure post_write_code_gen_cl;
_post_barrier_c1_runtime_code_blob = Runtime1::generate_blob(buffer_blob, -1, "mmtk_post_write_code_gen_cl", false, &post_write_code_gen_cl);
// MMTkBarrierCodeGenClosure write_code_gen_cl_patch_fix(true);
This was in the lxr branch, and after discussing it with Wenyu, we believe it is redundant. The code patching has already been dealt with; there is no need to do anything special here.
@@ -70,13 +72,16 @@ class MMTkBarrierSetRuntime: public CHeapObj<mtGC> {
static void object_reference_array_copy_pre_call(void* src, void* dst, size_t count);
In OpenJDK, array copy does not give the base address of the src or dst arrays, so nothing can be remembered. I have a patch in Iso to pass down the base addresses of both the src and dst arrays (they are required by the publication semantics). I am not sure if it is worthwhile to have that in our OpenJDK fork.
All the tests failed. @tianleq Is there anything we need to do for running ConcurrentImmix?
I did not test compressed pointers; as in Iso, they are not supported yet. Other than that, the min heap is much larger due to it being non-moving Immix.
Thanks. I temporarily disabled compressed pointers for concurrent Immix. We should get compressed pointers working before merging the PR.
openjdk/barriers/mmtkSATBBarrier.cpp
Outdated
constexpr int kUnloggedValue = 1;

static inline intptr_t side_metadata_base_address() {
  return UseCompressedOops ? SATB_METADATA_BASE_ADDRESS : SATB_METADATA_BASE_ADDRESS;
This line is problematic, and I believe the crash is due to it. If compressed pointers are enabled, then we probably need a different base address.
The address of the unlog bits is unrelated to whether we use compressed oops or not. It should be just SATB_METADATA_BASE_ADDRESS.
OOM is expected, as concurrent Immix is non-moving. Tianle and Wenyu clarified the issue about compressed pointers. Currently, the SATB barrier computes the metadata address differently based on whether compressed pointers are in use, which is incorrect. We should use the same approach as how metadata is computed in the object barrier. The current implementation of the SATB barrier is derived from lxr, which uses a field barrier and needs to deal with the difference in field slots with/without compressed pointers.
ea4c968 to fc9d143
fc9d143 to 5018b70
Current CI runs concurrent Immix with a 4x min heap. There are still some benchmarks that ran out of memory; I will increase it to 5x. There are a few correctness issues:
- Segfault in mmtk_openjdk::abi::OopDesc::size: fastdebug h2, fastdebug jython
- old_queue is not empty: fastdebug pmd
- Segfault in mark_lines_for_object: release h2, release jython (this could be the same issue as the first one)
I am aware that 4x is not enough; jython OOMs even with 5x, and this is also true in the stop-the-world non-moving Immix. The jython crash is what I saw when the barrier was incorrectly eliminated; I fixed that and never saw it again. I also looked at my test run and only see OOMs on jython. I will try to reproduce those locally.
I guess the code was copied from the field-logging barrier in the lxr branch. When not using compressed oops, each field is 64 bits, and the granularity of the field-logging bits is one bit per 8 bytes. But when using compressed oops, fields become 32 bits, and the field-logging bits become one bit per 4 bytes. That's why the shift changes between 5 and 6 depending on whether compressed oops is enabled.
But here we are working on the object-remembering barrier, and using the regular global unlog bits. Its granularity is only related to the object alignment, not the field size. On 64-bit machines, objects are always 64-bit aligned. So we should always shift by 6 when computing the address of unlog bits. Similarly, when computing the in-byte shift, we should always shift by 3.
Change this and the segmentation fault disappears, even when using compressed oops.
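The arithmetic described above can be sketched as follows (the base constant is a placeholder, not the real SATB_METADATA_BASE_ADDRESS): with one unlog bit per 8-byte-aligned object start, each metadata byte covers 64 bytes of heap, so the shifts are constant regardless of compressed oops.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the unlog-bit addressing discussed above. kMetadataBase is a
// placeholder, not the real SATB_METADATA_BASE_ADDRESS. With one unlog bit
// per 8-byte-aligned object start, a metadata byte holds 8 bits covering
// 8 * 8 = 64 bytes of heap, hence a constant byte-offset shift of 6.
constexpr intptr_t kMetadataBase = 0;

inline intptr_t unlog_byte_address(intptr_t obj) {
  return kMetadataBase + (obj >> 6);  // always 6, with or without compressed oops
}

inline int unlog_bit_index(intptr_t obj) {
  return (obj >> 3) & 7;              // the in-byte shift is always 3
}
```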
I also observed a crash during a full GC (after patching the shifting operations for unlog bits as I suggested above). It has something to do with finalizers.
The initial implementation does not schedule finalization-related packets in the final pause. This commit should have fixed that: mmtk/mmtk-core@2bbb200
Two tests are still failing with mmtk/mmtk-core@2bbb200:
- old_queue is not empty: fastdebug pmd
- Segfault in ConcurrentTraceObjects: release h2
Because this is an object-remembering barrier, the alignment is only related to object alignment, not field alignment. So we don't need to test whether compressed oops is enabled.
If we enable weak reference processing, GC threads get stuck during reference enqueueing. MMTk's reference processor will be holding a lock during reference enqueueing, and will call the binding's
During the scanning, it encounters a field that points to itself (a weak reference), and will try to add that as a weak reference to MMTk. MMTk currently holds a lock on the weak reference processor, and tries to acquire the lock again to add the weak reference. This causes a deadlock. I am not sure what needs to be fixed here. Clearly, when we are at reference enqueueing, weak reference processing should have been done, so we should not push another weak reference, because it will not be processed in this GC. This could potentially be fixed by setting the reference processor to disallow new references before reference enqueueing. But I am not sure if it is reasonable to allow the write barrier to be triggered here, or whether it is reasonable to scan the object here.
I think the fix is to check if we are currently concurrently marking, and if we are, then the write barrier should fire. If the concurrent marker is not active, then the write barrier is not required, since it means either the write barrier is happening in a pause, which is unnecessary, or no GC is currently triggered, in which case we don't need to keep track of old references. If, however, we end up making this into a generational GC somehow (using sticky mark bits, I guess), then you potentially always need a barrier active, in which case the above solution will not work.
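A minimal sketch of that check, with hypothetical names (the real binding would query MMTk's GC state rather than a flag):

```cpp
#include <cassert>

// Hypothetical sketch of the suggested fix: only fire the SATB pre-barrier
// while a concurrent mark is in progress. In a pause, or with no GC
// triggered, old references need not be remembered.
static bool concurrent_marking_in_progress = false;

bool pre_barrier_required(void* old_val) {
  return concurrent_marking_in_progress && old_val != nullptr;
}
```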
This seems to be an issue with using an object-remembering barrier. The object-remembering barrier requires us to store all of an object's reference fields when it is first encountered. In this particular case, there is no need to keep track of this reference.
barrier in SATB's object_reference_write_pre
This reverts commit b120bae.
This commit follows what G1 does for the pre-write barrier: https://github.com/mmtk/openjdk/blob/28e56ee32525c32c5a88391d0b01f24e5cd16c0f/src/hotspot/share/gc/g1/g1BarrierSet.inline.hpp#L38. Just skip the barrier for
This doesn't work due to #313. I just changed MMTk core to work around the issue in #311 (comment).
Draft PR for concurrent immix