Improve `Vec2` #9830

Dunqing · 2025-03-17T12:59:42Z

overlookmotel · 2025-03-17T15:51:43Z

Personally, I would say the best next step would be to update all the methods so they follow std. After that, we can start changing Vec to our desired shape and simplifying it, removing ZSTs support, optimizing etc. But if we start from std's implementation, we know we're starting from a good place.

The only thing I'm aware of which will be really tricky porting from std is the Extend trait. std::vec::Vec has specialized implementations of extend for slice iterators etc, where the number of elements the iterator yields is guaranteed - so it can cut out bounds checks etc. But this requires nightly features (specialization and TrustedLen trait), so unfortunately it's not possible for us.

To get some of those optimizations, we'd need separate methods e.g. extend_from_slice (which std has) and extend_from_array (which it doesn't).

The only thing I think that's missing from your list is removing the #[expect(...)] attrs and fixing the warnings that are currently suppressed by them. But, in my opinion, probably best to do std changes first - hopefully that will resolve most of the things which would cause lint warnings anyway.

overlookmotel · 2025-03-17T15:57:16Z

It would also be good to remove Bump from the implementation - so we completely divorce Vec from bumpalo.

We could wrap Bump in another structure called e.g. Arena and Vec::new_in etc could take an &'a Arena instead of &'b Bump. resize could become a method on Arena. That would make it easier later on to turn Arena into our own custom allocator, and remove bumpalo entirely.

By the way, the reason why I think we should have separate Allocator and Arena types is that I have in mind further down the line that we may want Allocator to contain multiple arenas. e.g.:

struct Allocator {
    /// Stores the AST (same as now)
    ast: Arena,
    /// Stores `Semantic`
    semantic: Arena,
    /// "Scratch" space for temporary data in transformer etc, to avoid heap allocations
    temp: Arena,
}

But we can make Allocator deref to Arena, so you can also call Vec::new_in with an &Allocator. So there'd be no API changes for all our code that uses Vec.

How about we split the work - I'll work on the Arena changes, and you work on Vec?

overlookmotel · 2025-03-17T16:10:11Z

A couple more vague thoughts:

`Vec` and `RawVec`

The len and capacity fields are going to have to be in the same struct to be able to have them as u32s and reduce the size of Vec. That's inescapable, but is unfortunate in a way, because the Vec / RawVec division is useful for 2 reasons:

"Division of labour" with RawVec handling the low-level tasks of allocating etc, and Vec adding higher-level abstractions on top of it.
RawVec is not generic over T, so it is better for compilation times.

I wonder if we can preserve this multi-level approach somehow? (not sure how)

Index Vec?

If we're changing the type of len and capacity, do we want to make that generic rather than a hard-coded u32?

We'd then also have an arena-allocated IndexVec<K, T> type, using the same code.

Or maybe that's unnecessary complication. All our IndexVec keys are NonMaxU32 anyway, so we don't actually need that flexibility.

Dunqing · 2025-03-18T01:51:48Z

A couple more vague thoughts:

Vec and RawVec

The len and capacity fields are going to have to be in the same struct to be able to have them as u32s and reduce the size of Vec. That's inescapable, but is unfortunate in a way, because the Vec / RawVec division is useful for 2 reasons:

"Division of labour" with RawVec handling the low-level tasks of allocating etc, and Vec adding higher-level abstractions on top of it.

RawVec is not generic over T, so it is better for compilation times.

I wonder if we can preserve this multi-level approach somehow? (not sure how)

I was thinking of changing the RawVec to a kind of a trait, then we can still impl the trait for other higher-level abstractions, but I am not sure, I will try while working on the above tasks

Index Vec?

If we're changing the type of len and capacity, do we want to make that generic rather than a hard-coded u32?

We'd then also have an arena-allocated IndexVec<K, T> type, using the same code.

Or maybe that's unnecessary complication. All our IndexVec keys are NonMaxU32 anyway, so we don't actually need that flexibility.

Dunqing · 2025-03-18T01:54:03Z

It would also be good to remove Bump from the implementation - so we completely divorce Vec from bumpalo.

We could wrap Bump in another structure called e.g. Arena and Vec::new_in etc could take an &'a Arena instead of &'b Bump. resize could become a method on Arena. That would make it easier later on to turn Arena into our own custom allocator, and remove bumpalo entirely.

By the way, the reason why I think we should have separate Allocator and Arena types is that I have in mind further down the line that we may want Allocator to contain multiple arenas. e.g.:

struct Allocator {
/// Stores the AST (same as now)
ast: Arena,
/// Stores Semantic
semantic: Arena,
/// "Scratch" space for temporary data in transformer etc, to avoid heap allocations
temp: Arena,
}
But we can make Allocator deref to Arena, so you can also call Vec::new_in with an &Allocator. So there'd be no API changes for all our code that uses Vec.

How about we split the work - I'll work on the Arena changes, and you work on Vec?

That's a good idea, I am okay to work on Vec changes. I guess this should be started when all tasks are done, right?

Dunqing · 2025-03-18T01:55:04Z

Personally, I would say the best next step would be to update all the methods so they follow std. After that, we can start changing Vec to our desired shape and simplifying it, removing ZSTs support, optimizing etc. But if we start from std's implementation, we know we're starting from a good place.

The only thing I'm aware of which will be really tricky porting from std is the Extend trait. std::vec::Vec has specialized implementations of extend for slice iterators etc, where the number of elements the iterator yields is guaranteed - so it can cut out bounds checks etc. But this requires nightly features (specialization and TrustedLen trait), so unfortunately it's not possible for us.

To get some of those optimizations, we'd need separate methods e.g. extend_from_slice (which std has) and extend_from_array (which it doesn't).

The only thing I think that's missing from your list is removing the #[expect(...)] attrs and fixing the warnings that are currently suppressed by them. But, in my opinion, probably best to do std changes first - hopefully that will resolve most of the things which would cause lint warnings anyway.

Thanks for your suggestions, I've updated the checklist and adjusted the order.

Dunqing added the A-ast Area - AST label Mar 17, 2025

Dunqing self-assigned this Mar 17, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve `Vec2` #9830

Improve `Vec2` #9830

Dunqing commented Mar 17, 2025 •

edited

Loading

overlookmotel commented Mar 17, 2025 •

edited

Loading

overlookmotel commented Mar 17, 2025

overlookmotel commented Mar 17, 2025 •

edited

Loading

Dunqing commented Mar 18, 2025

`Vec` and `RawVec`

Index Vec?

Dunqing commented Mar 18, 2025

Dunqing commented Mar 18, 2025 •

edited

Loading

Improve Vec2 #9830

Improve Vec2 #9830

Comments

Dunqing commented Mar 17, 2025 • edited Loading

overlookmotel commented Mar 17, 2025 • edited Loading

overlookmotel commented Mar 17, 2025

overlookmotel commented Mar 17, 2025 • edited Loading

Vec and RawVec

Index Vec?

Dunqing commented Mar 18, 2025

Vec and RawVec

Index Vec?

Dunqing commented Mar 18, 2025

Dunqing commented Mar 18, 2025 • edited Loading

Improve `Vec2` #9830

Improve `Vec2` #9830

Dunqing commented Mar 17, 2025 •

edited

Loading

overlookmotel commented Mar 17, 2025 •

edited

Loading

overlookmotel commented Mar 17, 2025 •

edited

Loading

`Vec` and `RawVec`

`Vec` and `RawVec`

Dunqing commented Mar 18, 2025 •

edited

Loading