Skip to content

Fix possible use-after-free on reference count and some minor CI changes#130

Open
leonardoheld wants to merge 3 commits into
uptane:masterfrom
leonardoheld:master
Open

Fix possible use-after-free on reference count and some minor CI changes#130
leonardoheld wants to merge 3 commits into
uptane:masterfrom
leonardoheld:master

Conversation

@leonardoheld
Copy link
Copy Markdown

@leonardoheld leonardoheld commented Feb 17, 2025

Hi there! I have several commits that I want to try to get feedback on and I'm slowly working them upstream, mainly about getting Aktualizr to build with newer dependencies because Debian Bullseye is getting EOL'd quite soon (2026-08-31) and I'd like to have a working MacOS build with what's available on brew.

Thus this is the first commit that fixes a possible use-after-free that GCC 12+ (on Bookworm, brew) yells about.

The other major fixes I'm working with are related to Boost: a ton of stuff changed from Bullseye. Do you explicitly want to keep compatibility with the older releases? Are there plans to drop bullseye eventually so I can focus my efforts on just bringing the build up to the latest ones?

Thanks!

Signed-off-by: Leonardo Held <leonardoheld@protonmail.com>
…g build

This change is specifically done so that the main build action on bullseye
doesn't fail on forks.

Signed-off-by: Leonardo Held <leonardoheld@protonmail.com>
This fixes a warning with GCC 12 on Bookworm by using the atomics library to
increment/decrement the reference counter instead of relying on raw pointer
arithmetic.

Signed-off-by: Leonardo Held <leonardoheld@protonmail.com>
@cajun-rat
Copy link
Copy Markdown
Collaborator

mainly about getting Aktualizr to build with newer dependencies because Debian Bullseye is getting EOL'd quite soon (2026-08-31) and I'd like to have a working MacOS build with what's available on brew.

Awesome. I'd love to break the dependency on oldstable too and get the full set of CI tests running under Bookworm (aka Debian Stable) as a first step to dropping support for Bullseye.

Thus this is the first commit that fixes a possible use-after-free that GCC 12+ (on Bookworm, brew) yells about.

Are you sure the use-after-free is caused by a thread race condition, and not that the static analyser in GCC can only spot the problem when the refcount is a simple int? Perhaps std::atomic is just confusing the analyser and not fixing a logic bug. I'm keen to be convinced: I've seen the warning before, but I can't figure out why the current code is wrong.

The other major fixes I'm working with are related to Boost: a ton of stuff changed from Bullseye. Do you explicitly want to keep compatibility with the older releases? Are there plans to drop bullseye eventually so I can focus my efforts on just bringing the build up to the latest ones?

Yes we should drop Bullseye support eventually, but I think we need a migration path rather than a hard cutover. I'd suggest that we should support Boost etc versions back to whatever is in either:

  • Debian oldstable
  • The oldest still-supported version of Yocto

What were the Boost things that are broken? I also had some problems getting the PKCS#11 and OSTree tests to working, but in the end all we can go is to chip away at it and slowly get more and more of the CI tests running on Debian stable.

@leonardoheld
Copy link
Copy Markdown
Author

leonardoheld commented Feb 18, 2025

Are you sure the use-after-free is caused by a thread race condition, and not that the static analyser in GCC can only spot the problem when the refcount is a simple int?

Oh, I think it's 100% caused by the compiler being too clever; a toy example doesn't trigger the behaviour https://godbolt.org/z/zPPr9cfWb.

I searched the web and everyone seems to have come up with funny solutions for this exact specific shared_ptr case: scylladb/seastar#1037 (comment). I'll try some others to see if I can prevent from relying on atomics.

a migration path rather than a hard cutover

Makes sense. Those versions would be

1.78 for Yocto and 1.74 for Debian.

The good news is that both will go EOL by 2026, so there's a target: might be Scarthgap with 1.84.0 and Debian Bookworm with who knows what but > 1.83. An integration branch might work, perhaps?

What were the Boost things that are broken?

Errm, I wasn't very clear in the MR description in hindsight: I wanted to get a MacOS build going, because that's what I use daily and I'm salty the brew instructions no longer work 😭 so I got it to build at least on MacOS with whatever is latest on brew. Then I wanted to do the same for Bookworm but matching the Boost version from Mac OS which is 1.87.0 because Boost has a bug that utterly breaks the build on Bookworm for some reason and /then/ some - but not a ton - of stuff breaks/needs changing.

@cajun-rat
Copy link
Copy Markdown
Collaborator

Shall we continue the Boost / brew building topic on a separate PR / issue? I want those things too, but I don't want to let figuring out a solution get in the way of this PR.

I searched the web and everyone seems to have come up with funny solutions for this exact specific shared_ptr case: scylladb/seastar#1037 (comment). I'll try some others to see if I can prevent from relying on atomics.

The #pragma GCC diagnostic push / #pragma GCC diagnostic ignored "-Wuse-after-free" approach in bhalevy/seastar@fd6010f seems like a good solution to me, although I don't think the pre-processor #ifdef is really needed. The nice thing about that is we can leave a comment about the problem and it is obvious how and when to revert back to the simpler solution.

That link above is just an (untested) idea. I'm cool with other solutions that give gcc more information or disable the warning, or even the std::atomic solution if we leave a comment why it is there.

The rest of the PR is a +1 from me. Thank you!

@leonardoheld
Copy link
Copy Markdown
Author

leonardoheld commented Feb 19, 2025

That link above is just an (untested) idea.
With the patch below

diff --git src/libaktualizr-posix/asn1/asn1_message.h src/libaktualizr-posix/asn1/asn1_message.h
index c0565763c..3c7f4b40e 100644
--- src/libaktualizr-posix/asn1/asn1_message.h
+++ src/libaktualizr-posix/asn1/asn1_message.h
@@ -55,12 +55,15 @@ class Asn1Message {
    */
   static Asn1Message::Ptr FromRaw(AKIpUptaneMes_t** msg) { return new Asn1Message(msg); }
 
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wuse-after-free"
   friend void intrusive_ptr_add_ref(Asn1Message* m) { m->ref_count_++; }
   friend void intrusive_ptr_release(Asn1Message* m) {
     if (--m->ref_count_ == 0) {
       delete m;
     }
   }
+#pragma GCC diagnostic pop
 
   AKIpUptaneMes_PR present() const { return msg_.present; }
   Asn1Message& present(AKIpUptaneMes_PR present) {

We get the use-after-free warning from another random place that uses the asn1 shared_ptr implementation

/source/src/aktualizr_secondary/secondary_rpc_test.cc:301:73: error: pointer 'in_msg' may be used after 'void operator delete(void*, std::size_t)' [-Werror=use-after-free]
  301 |     size_t data_size = static_cast<size_t>(in_msg.uploadDataReq()->data.size);
      |                                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~

so we'd probably have to sprinkle the #pragma directive everywhere the code is used if I understood correctly - can't say I'm a fan 🫠.

I can either do the abive or explain better the why/how regarding the atomic operations. Whatever you think is best is probably more important.

Shall we continue the Boost / brew building topic on a separate PR / issue?

Sure! I'll open up a detailed issue as soon as I have the time because I think this will probably span more than one PR.

@cajun-rat
Copy link
Copy Markdown
Collaborator

so we'd probably have to sprinkle the #pragma directive everywhere the code is used if I understood correctly - can't say I'm a fan 🫠.

Yuk. Let's not do that :)

How this? It fixes the issue on my Debian bookworm box:

diff --git a/src/libaktualizr-posix/asn1/asn1_message.h b/src/libaktualizr-posix/asn1/asn1_message.h
index c0565763c..a9ce6aed9 100644
--- a/src/libaktualizr-posix/asn1/asn1_message.h
+++ b/src/libaktualizr-posix/asn1/asn1_message.h
@@ -56,7 +56,8 @@ class Asn1Message {
   static Asn1Message::Ptr FromRaw(AKIpUptaneMes_t** msg) { return new Asn1Message(msg); }
 
   friend void intrusive_ptr_add_ref(Asn1Message* m) { m->ref_count_++; }
-  friend void intrusive_ptr_release(Asn1Message* m) {
+
+  friend __attribute__ ((noinline)) void intrusive_ptr_release(Asn1Message* m) {
     if (--m->ref_count_ == 0) {
       delete m;
     }

(Note to self: you need to disable ccache when doing these tests...)

cajun-rat added a commit that referenced this pull request Mar 24, 2025
gcc 12 reports a use-after-free warning for this code. We believe it is
a FP. More details are in #130
@cajun-rat
Copy link
Copy Markdown
Collaborator

I created #131 for the asn1_message.h fix

cajun-rat added a commit to cajun-rat/aktualizr that referenced this pull request Apr 1, 2025
gcc 12 reports a use-after-free warning for this code. We believe it is
a FP. More details are in uptane/aktualizr#130
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leonardoheld I imagine you'll drop the changes to this file since the actual issue in completely unrelated to multi-threading, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants