[libc] Fix malloc Block alignment issue on riscv32 #117815

Merged: 6 commits, Jan 16, 2025
Changes from 2 commits
1 change: 1 addition & 0 deletions libc/src/__support/CMakeLists.txt
@@ -12,6 +12,7 @@ add_header_library(
libc.src.__support.CPP.optional
libc.src.__support.CPP.span
libc.src.__support.CPP.type_traits
libc.src.__support.math_extras
)

add_object_library(
229 changes: 110 additions & 119 deletions libc/src/__support/block.h
@@ -18,6 +18,7 @@
#include "src/__support/CPP/type_traits.h"
#include "src/__support/libc_assert.h"
#include "src/__support/macros/config.h"
#include "src/__support/math_extras.h"

#include <stdint.h>

@@ -68,8 +69,8 @@ using cpp::optional;
/// The blocks store their offsets to the previous and next blocks. The latter
/// is also the block's size.
///
/// Blocks will always be aligned to a `ALIGNMENT` boundary. Block sizes will
/// always be rounded up to a multiple of `ALIGNMENT`.
/// All blocks have their usable space aligned to some multiple of max_align_t.
/// This also implies that block outer sizes are aligned to max_align_t.
///
/// As an example, the diagram below represents two contiguous `Block`s. The
/// indices indicate byte offsets:
@@ -129,8 +130,9 @@ class Block {
Block(const Block &other) = delete;
Block &operator=(const Block &other) = delete;

/// Creates the first block for a given memory region, followed by a sentinel
/// last block. Returns the first block.
/// Initializes a given memory region into a first block and a sentinel last
/// block. Returns the first block, which has its usable space aligned to
/// max_align_t.
Comment on lines +119 to +121

Contributor: Could you add a LIBC_ASSERT to the end of init that checks the alignment?

Contributor Author (@mysterymath, Jan 14, 2025): This is an invariant for all Blocks, so I've checked for this in the block constructor. init necessarily calls out to that constructor, so that should be sufficient to establish it for init.

static optional<Block *> init(ByteSpan region);

/// @returns A pointer to a `Block`, given a pointer to the start of the
@@ -186,11 +188,19 @@ class Block {
}

/// @returns A pointer to the usable space inside this block.
///
/// Aligned to some multiple of max_align_t.
LIBC_INLINE cpp::byte *usable_space() {
return reinterpret_cast<cpp::byte *>(this) + BLOCK_OVERHEAD;
auto *s = reinterpret_cast<cpp::byte *>(this) + BLOCK_OVERHEAD;
LIBC_ASSERT(reinterpret_cast<uintptr_t>(s) % alignof(max_align_t) == 0 &&
"usable space must be aligned to a multiple of max_align_t");
return s;
}
LIBC_INLINE const cpp::byte *usable_space() const {
return reinterpret_cast<const cpp::byte *>(this) + BLOCK_OVERHEAD;
const auto *s = reinterpret_cast<const cpp::byte *>(this) + BLOCK_OVERHEAD;
LIBC_ASSERT(reinterpret_cast<uintptr_t>(s) % alignof(max_align_t) == 0 &&
"usable space must be aligned to a multiple of max_align_t");
return s;
}

Member: llvm-project/libc/src/__support/block.h:188: Assertion failed: 'reinterpret_cast<uintptr_t>(s) % alignof(max_align_t) == 0 && "usable space must be aligned to a multiple of max_align_t"' in function: 'const cpp::byte *__llvm_libc_20_0_0_git::Block::usable_space() const'

// @returns The region of memory the block manages, including the header.
@@ -201,11 +211,12 @@
/// Attempts to split this block.
///
/// If successful, the block will have an inner size of at least
/// `new_inner_size`, rounded to ensure that the split point is on an
/// ALIGNMENT boundary. The remaining space will be returned as a new block.
/// Note that the prev_ field of the next block counts as part of the inner
/// size of the returned block.
optional<Block *> split(size_t new_inner_size);
/// `new_inner_size`. The remaining space will be returned as a new block,
/// with usable space aligned to `usable_space_alignment`. Note that the prev_
/// field of the next block counts as part of the inner size of the block.
/// `usable_space_alignment` must be a multiple of max_align_t.
optional<Block *> split(size_t new_inner_size,
size_t usable_space_alignment = alignof(max_align_t));

/// Merges this block with the one that comes after it.
bool merge_next();
@@ -248,46 +259,48 @@ class Block {
/// nullptr.
LIBC_INLINE void mark_last() { next_ |= LAST_MASK; }

LIBC_INLINE constexpr Block(size_t outer_size) : next_(outer_size) {
LIBC_INLINE Block(size_t outer_size) : next_(outer_size) {
LIBC_ASSERT(outer_size % ALIGNMENT == 0 && "block sizes must be aligned");
LIBC_ASSERT(is_usable_space_aligned(alignof(max_align_t)) &&
"usable space must be aligned to a multiple of max_align_t");
}

LIBC_INLINE bool is_usable_space_aligned(size_t alignment) const {
return reinterpret_cast<uintptr_t>(usable_space()) % alignment == 0;
}

/// @returns The new inner size of this block that would give the usable
/// space of the next block the given alignment.
LIBC_INLINE size_t padding_for_alignment(size_t alignment) const {
if (is_usable_space_aligned(alignment))
return 0;

// We need to ensure we can always split this block into a "padding" block
// and the aligned block. To do this, we need enough extra space for at
// least one block.
//
// |block |usable_space |
// |........|......................................|
// ^
// Alignment requirement
//
// Returns the minimum inner size necessary for a block of that size to
// always be able to allocate at the given size and alignment.
//
// Returns 0 if there is no such size.
LIBC_INLINE static size_t min_size_for_allocation(size_t alignment,
size_t size) {
LIBC_ASSERT(alignment >= alignof(max_align_t) &&
alignment % alignof(max_align_t) == 0 &&
"alignment must be multiple of max_align_t");

if (alignment == alignof(max_align_t))
return size;

// We must create a padding block to align the usable space of the next
// block. The next block's usable space can be found by advancing by
// sizeof(Block) then aligning up. The amount advanced is the amount of
// additional inner size required.
//
// |block |space |block |usable_space |
// |........|........|........|....................|
// ^
// Alignment requirement
//
alignment = cpp::max(alignment, ALIGNMENT);
uintptr_t start = reinterpret_cast<uintptr_t>(usable_space());
uintptr_t next_usable_space = align_up(start + BLOCK_OVERHEAD, alignment);
uintptr_t next_block = next_usable_space - BLOCK_OVERHEAD;
return next_block - start + sizeof(prev_);
// The minimum advancement is sizeof(Block), since the resulting position may
// happen to be aligned. What is the maximum? Advancing by sizeof(Block)
// leaves the result still aligned to alignof(Block). So the most additional
// advancement required would be if the point is exactly alignof(Block) past
// alignment. The remaining size to the next alignment would then be
// alignment - alignof(Block). So the total maximum advancement required is
// sizeof(Block) + alignment - alignof(Block).
Contributor: Had some trouble thinking about why the max is calculated this way, but @mysterymath explained offline and I'm going to write down the explanation, which can hopefully be added here to expand on the max calculation.

When picking a new size, we need to ensure we can always pick a point inside that block that is aligned to alignment. We also need to make sure that advancing by the requested size will ensure the address at the end of the new size is Block-aligned (so we can always allocate new Blocks after it).

Assuming the alignment is just one max_align_t, we can always return the size, since the new inner size would always be aligned. Otherwise, we need to advance by a little extra space.

The minimum additional advancement is sizeof(Block), since the resulting position may happen to be aligned. This space is also needed just to hold the next block this can be partitioned into.

What is the maximum? We know that requesting an additional `alignment` bytes will always ensure there's a point in the new allocation aligned to `alignment`, so we can add that to our new size. However, if our new usable_space were already aligned to `alignment`, then advancing by `alignment` would not be necessary. If instead the usable_space were one byte past an alignment point, we'd only need to advance the size by at most `alignment - 1` bytes to find a point in the new block that can be aligned to `alignment`. Over the contiguous memory region Blocks operate under, any Block split always ensures partitions are at minimum Block-aligned. Therefore, the most we need to advance is actually `alignment - alignof(Block)`. This way, even if advancing by `size + sizeof(Block)` happens to be aligned to `alignment`, we don't need to request the full `alignment`. Leaving out the `alignof(Block)` also ensures the partition after this one will always be Block-aligned.

Feel free to correct/expand on anything I got wrong with my understanding here.

Contributor Author: Thanks, I've rewritten this to be more explicit about the goals and mechanisms in light of our discussion.

if (add_overflow(size, sizeof(Block), size))
return 0;
if (add_overflow(size, alignment - alignof(Block), size))
return 0;
return size;
}
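The worst-case bound derived in the comment above can be sanity-checked in isolation. The sketch below is not the libc code: `kBlockSize`, `kBlockAlign`, and the free-function form are stand-ins assumed for illustration, mirroring the patch's `add_overflow`-based arithmetic from `src/__support/math_extras.h`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Stand-ins for sizeof(Block) and alignof(Block); the real values
// depend on the target (e.g. riscv32 vs. 64-bit hosts).
constexpr size_t kBlockSize = 4;
constexpr size_t kBlockAlign = 4;
constexpr size_t kMaxAlign = alignof(std::max_align_t);

// Overflow-checked add in the spirit of add_overflow: the sum wraps
// modulo 2^N, and the wrap is reported to the caller.
static bool add_overflow(size_t a, size_t b, size_t &out) {
  out = a + b;
  return out < a;
}

// Mirrors Block::min_size_for_allocation: the smallest request that
// guarantees an allocation of `size` bytes at `alignment` can be
// carved out. Returns 0 if no such size exists (overflow).
static size_t min_size_for_allocation(size_t alignment, size_t size) {
  assert(alignment >= kMaxAlign && alignment % kMaxAlign == 0);
  if (alignment == kMaxAlign)
    return size; // usable space is already at least this aligned
  // Worst case: sizeof(Block) for the padding block's header, plus
  // alignment - alignof(Block) to reach the next aligned point.
  if (add_overflow(size, kBlockSize, size))
    return 0;
  if (add_overflow(size, alignment - kBlockAlign, size))
    return 0;
  return size;
}
```

For a stricter-than-`max_align_t` request the bound adds `kBlockSize + alignment - kBlockAlign` bytes, and a request near `SIZE_MAX` reports failure as 0 rather than wrapping silently.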

// Check that we can `allocate` a block with a given alignment and size from
// this existing block.
bool can_allocate(size_t alignment, size_t size) const;

// This is the return type for `allocate` which can split one block into up to
// three blocks.
struct BlockInfo {
@@ -309,21 +322,30 @@ class Block {
Block *next;
};

// Divide a block into up to 3 blocks according to `BlockInfo`. This should
// only be called if `can_allocate` returns true.
// Divide a block into up to 3 blocks according to `BlockInfo`. Behavior is
// undefined if allocation is not possible for the given size and alignment.
static BlockInfo allocate(Block *block, size_t alignment, size_t size);

LIBC_INLINE static uintptr_t next_possible_block_start(
uintptr_t ptr, size_t usable_space_alignment = alignof(max_align_t)) {
return align_up(ptr + sizeof(Block), usable_space_alignment) -
sizeof(Block);
}

Contributor: I could be overthinking here, but is it possible for ptr + sizeof(Block) to cause next_possible_block_start to overflow? I imagine it's very unlikely, but I'm thinking it's possible through Block::split if min_outer_size is large enough there(?)

Contributor: Feel free to disregard this since this is probably me nitpicking on potential overflows.

Contributor Author: These conveniently return the correct answer in modulo arithmetic, so I modified their uses to detect overflow and react accordingly.
LIBC_INLINE static uintptr_t prev_possible_block_start(
uintptr_t ptr, size_t usable_space_alignment = alignof(max_align_t)) {
return align_down(ptr, usable_space_alignment) - sizeof(Block);
}
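The arithmetic of these two helpers can be sketched standalone, assuming power-of-two alignments and a stand-in `kBlockSize` (the real `align_up`/`align_down` live in `math_extras.h`): a block may start at any address that puts the end of its header on an `alignment` boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t kBlockSize = 8; // stand-in for sizeof(Block)

// Power-of-two align helpers in the spirit of math_extras.h.
constexpr uintptr_t align_up(uintptr_t p, size_t a) {
  return (p + a - 1) & ~uintptr_t(a - 1);
}
constexpr uintptr_t align_down(uintptr_t p, size_t a) {
  return p & ~uintptr_t(a - 1);
}

// First candidate block start at or after ptr whose usable space
// (the end of the header) is aligned to `align`.
constexpr uintptr_t next_possible_block_start(uintptr_t ptr, size_t align) {
  return align_up(ptr + kBlockSize, align) - kBlockSize;
}
// Last candidate block start whose usable space is at or before ptr.
constexpr uintptr_t prev_possible_block_start(uintptr_t ptr, size_t align) {
  return align_down(ptr, align) - kBlockSize;
}
```

As discussed in the review thread, the unsigned arithmetic wraps modulo 2^N on overflow, which is why the callers detect overflow separately rather than these helpers.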

private:
/// Construct a block to represent a span of bytes. Overwrites only enough
/// memory for the block header; the rest of the span is left alone.
LIBC_INLINE static Block *as_block(ByteSpan bytes) {
LIBC_ASSERT(reinterpret_cast<uintptr_t>(bytes.data()) % alignof(Block) ==
0 &&
"block start must be suitably aligned");
return ::new (bytes.data()) Block(bytes.size());
}

/// Like `split`, but assumes the caller has already checked the parameters to
/// ensure the split will succeed.
Block *split_impl(size_t new_inner_size);

/// Offset from this block to the previous block. 0 if this is the first
/// block. This field is only alive when the previous block is free;
/// otherwise, its memory is reused as part of the previous block's usable
@@ -343,81 +365,46 @@ class Block {
/// previous block is free.
/// * If the `last` flag is set, the block is the sentinel last block. It is
/// summarily considered used and has no next block.
} __attribute__((packed, aligned(cpp::max(alignof(max_align_t), size_t{4}))));
};

inline constexpr size_t Block::BLOCK_OVERHEAD =
align_up(sizeof(Block), ALIGNMENT);

LIBC_INLINE ByteSpan get_aligned_subspan(ByteSpan bytes, size_t alignment) {
if (bytes.data() == nullptr)
return ByteSpan();

auto unaligned_start = reinterpret_cast<uintptr_t>(bytes.data());
auto aligned_start = align_up(unaligned_start, alignment);
auto unaligned_end = unaligned_start + bytes.size();
auto aligned_end = align_down(unaligned_end, alignment);

if (aligned_end <= aligned_start)
return ByteSpan();

return bytes.subspan(aligned_start - unaligned_start,
aligned_end - aligned_start);
}

LIBC_INLINE
optional<Block *> Block::init(ByteSpan region) {
optional<ByteSpan> result = get_aligned_subspan(region, ALIGNMENT);
if (!result)
if (!region.data())
return {};

region = result.value();
// Two blocks are allocated: a free block and a sentinel last block.
if (region.size() < 2 * BLOCK_OVERHEAD)
return {};
uintptr_t start = reinterpret_cast<uintptr_t>(region.data());
uintptr_t end = start + region.size();

if (cpp::numeric_limits<size_t>::max() < region.size())
uintptr_t block_start = next_possible_block_start(start);
uintptr_t last_start = prev_possible_block_start(end);
if (block_start + sizeof(Block) > last_start)
return {};

Block *block = as_block(region.first(region.size() - BLOCK_OVERHEAD));
Block *last = as_block(region.last(BLOCK_OVERHEAD));
auto *last_start_ptr = reinterpret_cast<cpp::byte *>(last_start);
Block *block =
as_block({reinterpret_cast<cpp::byte *>(block_start), last_start_ptr});
Block *last = as_block({last_start_ptr, sizeof(Block)});
block->mark_free();
last->mark_last();
return block;
}
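The fitting check in the rewritten `init` can be exercised in isolation. This sketch re-implements just the address arithmetic with assumed stand-ins (`kBlockSize`, power-of-two `align_up`/`align_down`); it is an illustration, not the libc implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t kBlockSize = 8; // stand-in for sizeof(Block)

constexpr uintptr_t align_up(uintptr_t p, size_t a) {
  return (p + a - 1) & ~uintptr_t(a - 1);
}
constexpr uintptr_t align_down(uintptr_t p, size_t a) {
  return p & ~uintptr_t(a - 1);
}
constexpr uintptr_t next_possible_block_start(uintptr_t p, size_t a) {
  return align_up(p + kBlockSize, a) - kBlockSize;
}
constexpr uintptr_t prev_possible_block_start(uintptr_t p, size_t a) {
  return align_down(p, a) - kBlockSize;
}

// Mirrors Block::init's check: place the first block and the sentinel
// last block inside [start, start + len) and require at least one
// header's worth of room between them.
constexpr bool region_fits(uintptr_t start, size_t len, size_t align) {
  uintptr_t end = start + len;
  uintptr_t block_start = next_possible_block_start(start, align);
  uintptr_t last_start = prev_possible_block_start(end, align);
  return block_start + kBlockSize <= last_start;
}
```

A misaligned 64-byte region still fits two headers and some usable space, while a 16-byte region is rejected, matching the early return in `init`.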

LIBC_INLINE
bool Block::can_allocate(size_t alignment, size_t size) const {
if (inner_size() < size)
return false;
if (is_usable_space_aligned(alignment))
return true;

// Alignment isn't met, so a padding block is needed. Determine amount of
// inner_size() consumed by the padding block.
size_t padding_size = padding_for_alignment(alignment) - sizeof(prev_);

// Check that there is room for the allocation in the following aligned block.
size_t aligned_inner_size = inner_size() - padding_size - BLOCK_OVERHEAD;
return size <= aligned_inner_size;
}

LIBC_INLINE
Block::BlockInfo Block::allocate(Block *block, size_t alignment, size_t size) {
LIBC_ASSERT(
block->can_allocate(alignment, size) &&
"Calls to this function for a given alignment and size should only be "
"done if `can_allocate` for these parameters returns true.");
LIBC_ASSERT(alignment % alignof(max_align_t) == 0 &&
"alignment must be a multiple of max_align_t");

BlockInfo info{block, /*prev=*/nullptr, /*next=*/nullptr};

if (!info.block->is_usable_space_aligned(alignment)) {
Block *original = info.block;
optional<Block *> maybe_aligned_block =
original->split(info.block->padding_for_alignment(alignment));
// The padding block has no minimum size requirement.
optional<Block *> maybe_aligned_block = original->split(0, alignment);
LIBC_ASSERT(maybe_aligned_block.has_value() &&
"This split should always result in a new block. The check in "
"`can_allocate` ensures that we have enough space here to make "
"two blocks.");
"it should always be possible to split for alignment");

if (Block *prev = original->prev_free()) {
// If there is a free block before this, we can merge the current one with
@@ -441,37 +428,41 @@ }
}

LIBC_INLINE
optional<Block *> Block::split(size_t new_inner_size) {
optional<Block *> Block::split(size_t new_inner_size,
size_t usable_space_alignment) {
LIBC_ASSERT(usable_space_alignment % alignof(max_align_t) == 0 &&
"alignment must be a multiple of max_align_t");

if (used())
return {};
// The prev_ field of the next block is always available, so there is a
// minimum size to a block created through splitting.
if (new_inner_size < sizeof(prev_))
new_inner_size = sizeof(prev_);

size_t old_inner_size = inner_size();
new_inner_size =
align_up(new_inner_size - sizeof(prev_), ALIGNMENT) + sizeof(prev_);
if (old_inner_size < new_inner_size)
return {};

if (old_inner_size - new_inner_size < BLOCK_OVERHEAD)
return {};
size_t old_outer_size = outer_size();

return split_impl(new_inner_size);
}
// Compute the minimum outer size that produces a block of at least
// `new_inner_size`.
size_t min_outer_size = outer_size(cpp::max(new_inner_size, sizeof(prev_)));

LIBC_INLINE
Block *Block::split_impl(size_t new_inner_size) {
size_t outer_size1 = outer_size(new_inner_size);
LIBC_ASSERT(outer_size1 % ALIGNMENT == 0 && "new size must be aligned");
ByteSpan new_region = region().subspan(outer_size1);
uintptr_t start = reinterpret_cast<uintptr_t>(this);
uintptr_t next_block_start =
next_possible_block_start(start + min_outer_size, usable_space_alignment);
size_t new_outer_size = next_block_start - start;
LIBC_ASSERT(new_outer_size % alignof(max_align_t) == 0 &&
"new size must be aligned to max_align_t");

if (old_outer_size < new_outer_size ||
old_outer_size - new_outer_size < sizeof(Block))
return {};

ByteSpan new_region = region().subspan(new_outer_size);
next_ &= ~SIZE_MASK;
next_ |= outer_size1;
next_ |= new_outer_size;

Block *new_block = as_block(new_region);
mark_free(); // Free status for this block is now stored in new_block.
new_block->next()->prev_ = new_region.size();

LIBC_ASSERT(new_block->is_usable_space_aligned(usable_space_alignment) &&
"usable space must have requested alignment");
return new_block;
}
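The outer-size computation in the rewritten `split` can be traced with a small sketch. All constants here are stand-ins (the real `sizeof(Block)`, `sizeof(prev_)`, and `outer_size` conversion live in block.h); per the docs above, the next block's `prev_` field counts toward this block's inner size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t kBlockSize = 8;     // stand-in for sizeof(Block)
constexpr size_t kPrevFieldSize = 4; // stand-in for sizeof(prev_)

constexpr uintptr_t align_up(uintptr_t p, size_t a) {
  return (p + a - 1) & ~uintptr_t(a - 1);
}
constexpr uintptr_t next_possible_block_start(uintptr_t p, size_t a) {
  return align_up(p + kBlockSize, a) - kBlockSize;
}

// Inner size counts the next block's prev_ field, so outer = inner
// minus that field plus the full header.
constexpr size_t outer_size(size_t inner) {
  return inner - kPrevFieldSize + kBlockSize;
}

// Returns the first block's new outer size after a split, or 0 when
// the block (old_outer bytes at `start`) is too small to split.
size_t split_outer_size(uintptr_t start, size_t old_outer,
                        size_t new_inner, size_t align) {
  size_t min_outer = outer_size(std::max(new_inner, kPrevFieldSize));
  uintptr_t next_start = next_possible_block_start(start + min_outer, align);
  size_t new_outer = next_start - start;
  if (old_outer < new_outer || old_outer - new_outer < kBlockSize)
    return 0;
  return new_outer;
}
```

When `start` is itself a valid block position, `new_outer` comes out as a multiple of `align`, which is what the `LIBC_ASSERT` on the new outer size is checking in the patch.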

23 changes: 6 additions & 17 deletions libc/src/__support/freelist_heap.h
@@ -89,28 +89,14 @@ LIBC_INLINE void *FreeListHeap::allocate_impl(size_t alignment, size_t size) {
if (!is_initialized)
init();

size_t request_size = size;

// TODO: usable_space should always be aligned to max_align_t.
if (alignment > alignof(max_align_t) ||
(Block::BLOCK_OVERHEAD % alignof(max_align_t) != 0)) {
// TODO: This bound isn't precisely calculated yet. It assumes one extra
// Block::ALIGNMENT to accommodate the possibility for padding block
// overhead. (alignment - 1) ensures that there is an aligned point
// somewhere in usable_space, but this isn't tight either, since
// usable_space is also already somewhat aligned.
if (add_overflow(size, (alignment - 1) + Block::ALIGNMENT, request_size))
return nullptr;
}
size_t request_size = Block::min_size_for_allocation(alignment, size);
if (!request_size)
return nullptr;

Block *block = free_store.remove_best_fit(request_size);
if (!block)
return nullptr;

LIBC_ASSERT(block->can_allocate(alignment, size) &&
"block should always be large enough to allocate at the correct "
"alignment");

auto block_info = Block::allocate(block, alignment, size);
if (block_info.next)
free_store.insert(block_info.next);
@@ -135,6 +121,9 @@ LIBC_INLINE void *FreeListHeap::aligned_allocate(size_t alignment,
if (size % alignment != 0)
return nullptr;

// The minimum alignment supported by Block is max_align_t.
alignment = cpp::max(alignment, alignof(max_align_t));

return allocate_impl(alignment, size);
}
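The front-end contract after this change can be sketched as a hypothetical standalone check, not the libc entry point: sizes must be a multiple of the alignment, and any alignment below `max_align_t` is clamped up, since that is the minimum Block now supports. The zero-alignment guard is an assumption added here to keep the sketch total; the visible diff only shows the modulo check and the clamp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Validate the aligned_allocate preconditions shown in the diff and
// clamp the alignment to Block's minimum. Returns 0 for a rejected
// request (the real code returns nullptr), else the effective
// alignment to pass down to allocate_impl.
static size_t effective_alignment(size_t alignment, size_t size) {
  if (alignment == 0 || size % alignment != 0)
    return 0;
  return std::max(alignment, alignof(std::max_align_t));
}
```

A small alignment request is widened transparently; the caller still sees memory aligned at least as strictly as it asked for.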
