8359646: C1 crash in AOTCodeAddressTable::add_C_string #25841
👋 Welcome back kvn! A progress list of the required criteria for merging this PR into |
@vnkozlov This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 11 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
@adinn and @ashu-mehra please take a look. This is a very rare case because the window for the race is very narrow.
This happens during the assembly phase.
@@ -819,6 +820,9 @@ bool AOTCodeCache::store_code_blob(CodeBlob& blob, AOTCodeEntry::Kind entry_kind
// we need to take a lock to prevent race between compiler threads generating AOT code
// and the main thread generating adapter
MutexLocker ml(Compile_lock);
if (!is_on()) {
Just for the record:
I can see why this is needed to stop compiler threads dereferencing a null cache pointer or null table pointer when some other thread might concurrently be closing the cache.
That led me to wonder why we don't need to further synchronize concurrent execution of non-compiler threads. I convinced myself that whenever a non-compiler thread calls AOTCodeCache::close() the only other running threads that might try to access the AOT cache are compiler threads -- calls to the close() method are from MetaspaceShared::preload_and_dump_impl (metaspaceShared.cpp), before_exit (java.cpp) and Threads::create_vm (threads.cpp). (Well, modulo a rogue jcmd, perhaps?)
Yes. And with jcmd, as @iklam pointed out in another PR, "that would create far worse problems than the bug that we are trying to fix here" (https://git.openjdk.org/jdk/pull/25816#issuecomment-2975238342)
Looks good
Unless I missed something, I think there is still a synchronization problem between the thread adding a string to the AOTCodeCache and the thread closing the AOTCodeCache. See the following pattern of execution: t0: Thread T1 calls add_C_string(), checks for I think at the time of shutting down the AOTCodeCache, the thread should hold both
It can't because T1 holds the lock. See that I added the lock before
Thank you @adinn and @ashu-mehra for the review.
LGTM
Thank you, @iklam
/integrate
Going to push as commit 9607021.
Your commit was automatically rebased without conflicts.
Yup, I missed that lock. All good then.
It is a concurrency issue. The call to AOTCodeAddressTable::add_C_string() happened after the checks that the AOT code cache is still open. But, because there is no synchronization, another thread (VM) closed and deleted the AOT code cache (after dumping) before the code in add_C_string() accessed it.
Added the missing AOTCodeCStrings_lock in places where we modify, store and delete the AOT strings table. Moved MutexLocker from AOTCodeAddressTable::add_C_string() to its caller and added an additional check after it.
I also noticed that we missed a similar check after Compile_lock when we are storing AOT code.
Tested hs-tier1-6,hs-tier10-rt,stress,xcomp
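The check, lock, re-check pattern described above can be sketched outside HotSpot as follows. This is only an illustration under stated assumptions: Cache, StringTable, add_string and close are hypothetical names, and std::mutex with std::lock_guard stands in for AOTCodeCStrings_lock and MutexLocker. It is not the code from this PR.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the AOT strings table; not HotSpot code.
class StringTable {
 public:
  void add(const std::string& s) { entries_.push_back(s); }
  std::size_t size() const { return entries_.size(); }

 private:
  std::vector<std::string> entries_;
};

// Hypothetical stand-in for the AOT code cache.
class Cache {
 public:
  Cache() : table_(std::make_unique<StringTable>()), open_(true) {}

  // Unsynchronized check, playing the role of an is_on()-style test.
  bool is_on() const { return open_.load(); }

  // Caller-side pattern: cheap check, take the lock, then check again.
  bool add_string(const std::string& s) {
    if (!is_on()) return false;  // fast path; the answer may already be stale
    std::lock_guard<std::mutex> guard(lock_);
    if (!is_on()) return false;  // re-check: a closing thread may have won the race
    assert(table_ != nullptr);   // holds because close() takes the same lock
    table_->add(s);
    return true;
  }

  // The closing thread takes the same lock before tearing the table down.
  void close() {
    std::lock_guard<std::mutex> guard(lock_);
    open_.store(false);
    table_.reset();
  }

  std::size_t size() {
    std::lock_guard<std::mutex> guard(lock_);
    return table_ ? table_->size() : 0;
  }

 private:
  std::mutex lock_;
  std::unique_ptr<StringTable> table_;
  std::atomic<bool> open_;
};
```

In this sketch the second is_on() check is what closes the window: without it, a thread that passed the first check could block on the lock while close() runs, then dereference a table that has already been deleted.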
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25841/head:pull/25841
$ git checkout pull/25841
Update a local copy of the PR:
$ git checkout pull/25841
$ git pull https://git.openjdk.org/jdk.git pull/25841/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 25841
View PR using the GUI difftool:
$ git pr show -t 25841
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25841.diff
Using Webrev
Link to Webrev Comment