Skip to content

8359646: C1 crash in AOTCodeAddressTable::add_C_string #25841

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

vnkozlov
Copy link
Contributor

@vnkozlov vnkozlov commented Jun 17, 2025

It is concurrency issue. Call to AOTCodeAddressTable::add_C_string() happened after checks that AOT code cache is still opened. But, because there is no synchronization, other thread (VM) closed/delete AOT code cache (after dumping) before code in add_C_string() accessed it.

Added missed AOTCodeCStrings_lock in places where we modify, store and delete AOT strings table. Moved MutexLocker from AOTCodeAddressTable::add_C_string() to its caller and do additional check after it.

I also noticed that we missed similar check after Compile_lock when we are storing AOT code.

Tested hs-tier1-6,hs-tier10-rt,stress,xcomp


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8359646: C1 crash in AOTCodeAddressTable::add_C_string (Bug - P3)(⚠️ The fixVersion in this issue is [25] but the fixVersion in .jcheck/conf is 26, a new backport will be created when this pr is integrated.)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25841/head:pull/25841
$ git checkout pull/25841

Update a local copy of the PR:
$ git checkout pull/25841
$ git pull https://git.openjdk.org/jdk.git pull/25841/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25841

View PR using the GUI difftool:
$ git pr show -t 25841

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25841.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jun 17, 2025

👋 Welcome back kvn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jun 17, 2025

@vnkozlov This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8359646: C1 crash in AOTCodeAddressTable::add_C_string

Reviewed-by: adinn, iklam

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 11 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 17, 2025
@openjdk
Copy link

openjdk bot commented Jun 17, 2025

@vnkozlov The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@vnkozlov
Copy link
Contributor Author

@adinn and @ashu-mehra please look.

Very rare case because very narrow window for concurrency.

@vnkozlov
Copy link
Contributor Author

This happens during assembly phase.

@mlbridge
Copy link

mlbridge bot commented Jun 17, 2025

Webrevs

@@ -819,6 +820,9 @@ bool AOTCodeCache::store_code_blob(CodeBlob& blob, AOTCodeEntry::Kind entry_kind
// we need to take a lock to prevent race between compiler threads generating AOT code
// and the main thread generating adapter
MutexLocker ml(Compile_lock);
if (!is_on()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for the record:

I can see why this is needed to stop compiler threads dereferencing a null cache pointer or null table pointer when some other thread might concurrently be closing the cache.

That led me to wonder why we don't need to further synchronize concurrent execution of non-compiler threads. I convinced myself that whenever a non-compiler thread calls AOTCodeCache::close() the only other running threads that might try to access the AOT cache are compiler threads -- calls to the close() method are from MetaspaceShared::preload_and_dump_impl (metaspaceShared.cpp), before_exit (java.cpp) and Threads::create_vm (threads.cpp). (Well, modulo a rogue jcmd, perhaps?).

Copy link
Contributor Author

@vnkozlov vnkozlov Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Ans with jcmd, as @iklam pointed in an other PR, "that would create far worse problems than the bug that we are trying to fix here" (https://git.openjdk.org/jdk/pull/25816#issuecomment-2975238342)

Copy link
Contributor

@adinn adinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 17, 2025
@ashu-mehra
Copy link
Contributor

Unless I missed something, I think there is still synchronization problem between the thread adding a string to the AOTCodeCache and the thread closing the AOTCodeCache. See the following pattern of execution:

t0: Thread T1 calls add_C_string(), checks for is_on_for_dump which returns true
t1: Thread T2 calls close() and completes the call to finish_write()
t2: T1 acquires AOTCodeCStrings_lock and checks that _table is not null which returns true
t3: T2 clears the table and sets it to null
t4: T1 tries to call table->add_C_string

I think at the time of shutting down the AOTCodeCache, the thread should hold both Compile_lock and AOTCodeCStrings_lock.

@vnkozlov
Copy link
Contributor Author

t3: T2 clears the table and sets it to null

It can't because T1 holds the lock. See that I added the lock before delete _table.

@vnkozlov
Copy link
Contributor Author

Thank you @adinn and @ashu-mehra for review.

Copy link
Member

@iklam iklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vnkozlov
Copy link
Contributor Author

Thank you, @iklam

@vnkozlov
Copy link
Contributor Author

/integrate

@openjdk
Copy link

openjdk bot commented Jun 17, 2025

Going to push as commit 9607021.
Since your change was applied there have been 11 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jun 17, 2025
@openjdk openjdk bot closed this Jun 17, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jun 17, 2025
@openjdk
Copy link

openjdk bot commented Jun 17, 2025

@vnkozlov Pushed as commit 9607021.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@vnkozlov vnkozlov deleted the 8359646 branch June 17, 2025 15:59
@ashu-mehra
Copy link
Contributor

It can't because T1 holds the lock. See that I added the lock before delete _table.

yup, I missed that lock. All good then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler [email protected] integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

4 participants