Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 10 additions & 1 deletion compiler/rustc_codegen_llvm/src/attributes.rs
Original file line number Diff line number Diff line change
Expand Up @@ -517,7 +517,16 @@ pub(crate) fn llfn_attrs_from_instance<'ll, 'tcx>(
to_add.push(llvm::CreateAllocKindAttr(cx.llcx, AllocKindFlags::Free));
// applies to argument place instead of function place
let allocated_pointer = AttributeKind::AllocatedPointer.create_attr(cx.llcx);
attributes::apply_to_llfn(llfn, AttributePlace::Argument(0), &[allocated_pointer]);
let attrs: &[_] = if llvm_util::get_version() >= (21, 0, 0) {
Copy link
Member

@RalfJung RalfJung Jan 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to have some sort of justification for why this is correct. This operation destroys the provenance given to it, why is it okay to consider that non-capturing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding from the zulip conversation is that since the allocation ends there's nothing further that the callee could do that would be relevant to aliasing rules of this allocation.

Whether or not the allocator touches the memory afterwards (e.g. for zeroing), isn't relevant to this side of the allocator boundary.

I assume we need to tell LLVMs both things separately because they just happen to be tracked separately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nikic just to check the obvious, adding this attribute wouldn't allow

let p = alloc();
foo(p);
dealloc(p);

to be reordered to

let p = alloc();
dealloc(p);
foo(p);

Being an allocator method already inhibits that reordering.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Does not capture provenance" means "if the function call stashes the pointer somewhere, accessing that pointer after the function returns is UB". It does not limit what can be done with the pointer within the function itself.

FWIW, the C free function is marked captures(none).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why we shouldn't match what C is doing then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was that if GlobalAlloc is used, it may inspect the address of the pointer, so using captures(none) may not be correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let attrs: &[_] = if llvm_util::get_version() >= (21, 0, 0) {
let attrs: &[_] = if llvm_util::get_version() >= (21, 0, 0) {
// "Does not capture provenance" means "if the function call stashes the pointer somewhere,
// accessing that pointer after the function returns is UB". That is definitely the case here since
// freeing will destroy the provenance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was that if GlobalAlloc is used, it may inspect the address of the pointer, so using captures(none) may not be correct.

But C allocators would also have to look at the address (e.g. comparing it to memory pools). Is the difference that those are assumed to be compiled separately?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we care about here are effects that are observable outside the allocator. Specifically, what I had in mind is GlobalAlloc::dealloc() doing something like print "I'm now freeing pointer 0xdeadbeef".

// "Does not capture provenance" means "if the function call stashes the pointer somewhere,
// accessing that pointer after the function returns is UB". That is definitely the case here since
// freeing will destroy the provenance.
let captures_addr = AttributeKind::CapturesAddress.create_attr(cx.llcx);
&[allocated_pointer, captures_addr]
} else {
&[allocated_pointer]
};
attributes::apply_to_llfn(llfn, AttributePlace::Argument(0), attrs);
}
if let Some(align) = codegen_fn_attrs.alignment {
llvm::set_alignment(llfn, align);
Expand Down
8 changes: 5 additions & 3 deletions library/alloc/src/slice.rs
Original file line number Diff line number Diff line change
Expand Up @@ -448,9 +448,11 @@ impl<T> [T] {
// SAFETY:
// allocated above with the capacity of `s`, and initialize to `s.len()` in
// ptr::copy_to_non_overlapping below.
unsafe {
s.as_ptr().copy_to_nonoverlapping(v.as_mut_ptr(), s.len());
v.set_len(s.len());
if s.len() > 0 {
unsafe {
s.as_ptr().copy_to_nonoverlapping(v.as_mut_ptr(), s.len());
v.set_len(s.len());
}
}
v
}
Expand Down
6 changes: 5 additions & 1 deletion library/alloc/src/vec/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2818,7 +2818,11 @@ impl<T, A: Allocator> Vec<T, A> {
let count = other.len();
self.reserve(count);
let len = self.len();
unsafe { ptr::copy_nonoverlapping(other as *const T, self.as_mut_ptr().add(len), count) };
if count > 0 {
unsafe {
ptr::copy_nonoverlapping(other as *const T, self.as_mut_ptr().add(len), count)
};
}
self.len += count;
}

Expand Down
30 changes: 30 additions & 0 deletions tests/codegen-llvm/lib-optimizations/append-elements.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
//@ compile-flags: -O -Zmerge-functions=disabled
//@ min-llvm-version: 21
#![crate_type = "lib"]

//! Check that a temporary intermediate allocations can eliminated and replaced
//! with memcpy forwarding.
//! This requires Vec code to be structured in a way that avoids phi nodes from the
//! zero-capacity length flowing into the memcpy arguments.

// CHECK-LABEL: @vec_append_with_temp_alloc
#[no_mangle]
pub fn vec_append_with_temp_alloc(dst: &mut Vec<u8>, src: &[u8]) {
// CHECK-NOT: call void @llvm.memcpy
// CHECK: call void @llvm.memcpy.{{.*}}%dst.i{{.*}}%src.0
// CHECK-NOT: call void @llvm.memcpy
let temp = src.to_vec();
dst.extend(&temp);
// CHECK: ret
}

// CHECK-LABEL: @string_append_with_temp_alloc
#[no_mangle]
pub fn string_append_with_temp_alloc(dst: &mut String, src: &str) {
// CHECK-NOT: call void @llvm.memcpy
// CHECK: call void @llvm.memcpy.{{.*}}%dst.i{{.*}}%src.0
// CHECK-NOT: call void @llvm.memcpy
let temp = src.to_string();
dst.push_str(&temp);
// CHECK: ret
}
Loading