Components as entities (v2)#24728
Conversation
…ents-as-entities-alt
Performance TestingI did some reasonably extensive performance testing. I ran both the entire ecs benchmark and several of our stress test examples. I've got the following data:
And the three examples I ran through were
ConclusionFrom the measurements I took I've come to the conclusion that this PR mostly keeps performance the same. At worst it degrades slightly (mostly based on the microbenchmarks). The best solution to fix the resource performance issue is likely still going to be #24058 or something like it. However! I still think this PR is worth merging, because
Important NoteWhen review and discussing this PR, I prefer if you mostly kept it anchored to some line of code (even if only vaguely related). This makes it easier to follow particular threads of argument. |
There was a problem hiding this comment.
Looks good! I don't believe the #24102 is necessary for this, and can be left for later.
With this change, it becomes possible to express uniqueness regarding ComponentId collections/iterators as well. There are now likely several sections of code where it would make sense to convert the previous ComponentId collection/iterator types to their EntitySet version. That would make for a good follow-up PR!
One place for which this is especially relevant is DynamicComponentFetch, the current dynamic way to access the Components of an entity disjointly.
This functionality should fall under similar design considerations as #18234, though that PR has gotten stalled.
chescock
left a comment
There was a problem hiding this comment.
Oh, nice! Reserving the Entity early and then using spawn_at is a really nice approach. I was worried this would be really complex, but it's very readable and even net negative lines!
I'm surprised that using HashMap for access didn't affect the benchmarks! It might be because all of the access checks are done at startup, while we mostly benchmark steady-state performance.
Most of my comments are just style nits, but I do think we should have a clear story for how users should spawn resource entities that have additional components, and we should try to ensure that ResMut<R> and Query<&mut R, Without<IsResource>> are sound.
| (true, true) => self_set.bitand_assign(other_set), | ||
| (true, false) => self_set.sub_assign(other_set), | ||
| (false, true) => { | ||
| *self_inverted = true; | ||
| self_set.difference_from(other_set); | ||
| *self_set = other_set.clone().sub(self_set); | ||
| } | ||
| (false, false) => self_set.union_with(other_set), | ||
| (false, false) => self_set.bitor_assign(other_set), |
There was a problem hiding this comment.
Since these are the operator traits in core::ops, would it be more clear to use the actual operators? (Here and elsewhere.)
| (true, true) => self_set.bitand_assign(other_set), | |
| (true, false) => self_set.sub_assign(other_set), | |
| (false, true) => { | |
| *self_inverted = true; | |
| self_set.difference_from(other_set); | |
| *self_set = other_set.clone().sub(self_set); | |
| } | |
| (false, false) => self_set.union_with(other_set), | |
| (false, false) => self_set.bitor_assign(other_set), | |
| (true, true) => *self_set &= other_set, | |
| (true, false) => *self_set -= other_set, | |
| (false, true) => { | |
| *self_inverted = true; | |
| *self_set = other_set - self_set; | |
| } | |
| (false, false) => *self_set |= other_set, |
(Oh, neat, GitHub finally lets me add a suggestion that crosses deleted lines! ... It does not render it very helpfully, though.)
There was a problem hiding this comment.
I personally prefer union_with, difference_with and intersect_with, as they more clearly communicate the idea of sets compared to bit operations. I did have to resort to using them for the non-in-place methods as the .difference method simply doesn't return the right type, but it should still be an improvement in readabillity.
There was a problem hiding this comment.
I personally prefer
union_with,difference_withandintersect_with, as they more clearly communicate the idea of sets compared to bit operations. I did have to resort to using them for the non-in-place methods as the.differencemethod simply doesn't return the right type, but it should still be an improvement in readabillity.
Yeah, I prefer those, too! Would it be worth leaving the ComponentIdSet wrapper in place so that we can preserve those names (and so we can use ComponentId instead of &ComponentId)?
If you don't like the operators, another option is to write them out using extend, retain, or difference().copied().collect(). I just find bitor_assign even harder to understand than |=, so I really want to avoid calling operator methods by name.
There was a problem hiding this comment.
Currently, I've added union_with, difference_with and intersect_with to EntityEquivalentHashSet, which works fine enough. Take a look at the access file again to see how it looks.
| let resource = func(self); | ||
| move_as_ptr!(resource); | ||
| let entity_mut = self.spawn_with_caller(resource, caller); // ResourceCache is updated automatically | ||
| let entity_mut = self.spawn_at_with_caller(entity, resource, caller).unwrap(); |
There was a problem hiding this comment.
If someone despawns a resource, these spawn_at(...).unwrap() calls will panic, right? Ah, but that's what we do today in the IsResource hook, so that's no worse.
Do we want a "Resource entity {} of {} has been despawned, when it's not supposed to be." message in the panic, like we have in the IsResource hooks?
There was a problem hiding this comment.
In this specific case, it's impossible for the unwrap to panic, because at the top of the function it calls register_component, which allocates the entity. I've switched to spawn_at_unchecked and added an explanation in a comment.
I'll do the same for the other spawn_at(entity).unwrap() or, if they really could fail, throw an appropriate error.
There was a problem hiding this comment.
In this specific case, it's impossible for the unwrap to panic, because at the top of the function it calls
register_component, which allocates the entity.
I meant if it was previously initialized, and then despawned, and then this was called. I was able to get this unwrap() to fail with:
#[derive(Resource, Default)]
struct R;
let mut world = World::new();
world.init_resource::<R>();
let entity = world.register_component::<R>().entity();
let mut entity = world.entity_mut(entity);
entity.remove::<IsResource>();
entity.despawn();
world.init_resource::<R>();There was a problem hiding this comment.
Hmm, would it work to have the on_discard hook just ... panic? Most of the edge cases I'm coming up with involve removing IsResource or despawning, and a panic there should prevent those from happening. It's not great because there's no place to handle the panic. But allowing the removal just puts you in a place where something else will panic soon, which isn't all that much better.
There was a problem hiding this comment.
Given that there currently isn't any way to remove components properly, I think you might be right. During resources as components, we settled on warnings as the harshest error we could throw. But any added component to world.components will just stay there, becoming a problem for the rest of the application's runtime.
| #[cfg_attr(feature = "bevy_reflect", derive(Reflect), reflect(Component, Debug))] | ||
| #[derive(Component, Debug)] | ||
| #[component(on_insert, on_discard, on_despawn)] | ||
| pub struct IsResource(ComponentId); |
There was a problem hiding this comment.
This would be for a follow-up and not this PR, but: Do we still need IsResource?
At the very least, we must not need it to store ComponentId, since that's always going to be equal to the Entity, right?
There was a problem hiding this comment.
This is a good question and depends quite a bit on where we want to have the logic that handles uniqueness. Right now, we need some hooks to ensure that a resource isn't spawned where it isn't supposed to be spawned.
If we put this logic on a component hook on every Resource (which I think makes sense), this is trivial. You just check if context.component_id matches context.entity.
But you only get one hook, and a user might want to use that hook for their own stuff. Using IsResource like this offloads this logic to another component and leaves room for user hooks on resources. However, the link between them is much weaker (what if you remove IsResource, but keep the resource? and other questions), and it would still need to keep a hold of ComponentId.
I think the former is a better solution, but we currently offer the latter, so changing this would be a regression. This is not major, since lifetime observers exist and you get an unlimited amount of them, but observers aren't quite as good as hooks in some cases.
There was a problem hiding this comment.
Yeah, that makes sense! Plus I guess we still want it for Without<IsResource> filters.
But you only get one hook
This actually got fixed! #24000 lets relation hooks get combined with user-defined hooks, so the plumbing is in place if we ever wanted to add hooks to all resources!
There was a problem hiding this comment.
Cool! I think that the solution I want to move to then is as follows: Keep IsResource but make it a ZST, and move the uniqueness logic over to a hook on each resource. Following #24000 we can still allow people to add a hook if they want to.
All this, of course, is not for this PR, but a seperate one.
| schedule.add_systems(iter); | ||
| }); | ||
| add_archetypes(&mut world, archetype_count); | ||
| world.clear_entities(); |
There was a problem hiding this comment.
Is clear_entities() completely unusable now? Any resource that had been spawned can never be created again, right? Should we just remove it completely?
And, should this bench still despawn the entities that are spawned by add_archetypes? We could presumably add a marker component to identify them.
There was a problem hiding this comment.
It is completely unusable except for the one niche use case of benchmarking. In the case of add_archetypes I was able to maintain the behaviour by adding e.despawn() within add_archetypes.
I do not want to redo a bunch of benchmarks right now, but I agree we should remove (or alter) clear_entities in the future.
| self.components.resize_with(least_len, || None); | ||
| // SAFETY: The id has never been registered before. | ||
| unsafe { | ||
| self.components.insert_unique_unchecked(id, info); |
There was a problem hiding this comment.
How important is the perf from using insert_unique_unchecked here instead of an ordinary insert? I'd be inclined to use the safer version if we can.
There was a problem hiding this comment.
There is a genuine performance difference, and I think it'd be best to keep using it.
However, "never having been registered before" meaning "does not exist in this collection" should be more extensively documented somewhere.
There was a problem hiding this comment.
There is a genuine performance difference, and I think it'd be best to keep using it.
Does that difference matter for our case? The linked issue says it makes the insert 30% faster, but that's only a part of component registration, and I'm not sure we're even all that perf-sensitive for something that only runs once per component.
And thinking about it more, I'm not sure we can guarantee that the id is unique. A user can call world.entity_allocator_mut().free(entity) from safe code with the same Entity in a loop, and then register_component could allocate that same Entity for multiple components! It's fine to panic for something that bizarre, but I don't think it should allow UB.
Which also means we need to panic rather than silently overwrite if the key exists, so something like try_insert(id, info).expect(...).
There was a problem hiding this comment.
Registering components does not happen often, so I think it's fine to trade a minor bit of perf for some peace of mind with regards to UB.
| // We only expose `&ResourceCache` to code with access to a resource (such as `&World`), | ||
| // and that would conflict with the `DeferredWorld` passed to the resource hook. | ||
| unsafe { &mut *cache.0.get() }.insert(resource_component_id, context.entity); | ||
| if original_entity != context.entity { |
There was a problem hiding this comment.
Does this mean world.spawn((R, OtherComponents)) no longer works for adding a resource that also has other components? It needs to be something like
let entity = world.register_component::<R>().entity();
world.spawn_at(entity, (R, OtherComponents));?
That should go in the migration guide, but maybe we also need a convenience method like world.insert_resource_with(R, OtherComponents) or world.get_or_spawn_resource_entity::<R>().insert((R, OtherComponents)).
There was a problem hiding this comment.
Correct.
I'm hesitant to add methods like world.insert_resource_with, because currently bevy_world_serialize drops any extra components you add to a resource entity during serialization. I'd much prefer fixing that bug before adding convenience methods.
There was a problem hiding this comment.
This is the same reason why I didn't add required components in 0.19.
UPDATE: nevermind someone else did #24322, let's just hope no one notices before 0.20.
| pub unsafe fn get_resource_by_id(self, component_id: ComponentId) -> Option<Ptr<'w>> { | ||
| // SAFETY: We have permission to access the resource of `component_id`. | ||
| let entity = unsafe { self.resource_entities() }.get(component_id)?; | ||
| let entity = component_id.entity(); |
There was a problem hiding this comment.
How are we ensuring soundness for things like ResMut<R> and Query<&mut R, Without<IsResource>>?
Previously that was handled by IsResource being the source of truth for the resource_entities map, so if the IsResource component got removed then ResMut would no longer see the resource. But now we're going directly to the Entity, so I think it will be found by ResMut even if IsResource is missing and it will also match the Query.
There was a problem hiding this comment.
We are ensuring soundness just like we had before, since IsResource is still the source of truth, which is the only thing (together with the entity) queries care about.
There was a problem hiding this comment.
We are ensuring soundness just like we had before, since
IsResourceis still the source of truth, which is the only thing (together with the entity) queries care about.
But Res and ResMut don't check that IsResource is present, so they aren't seeing that truth. In 0.19 that's still correct because removing IsResource will remove it from the ResourceEntities cache, but with this PR they can find the resource directly from the ComponentId.
... I had started writing a very complex reproduction here, but then I remembered that Resource isn't an unsafe trait. A user can just manually impl Resource for Foo without adding IsResource as a required component! In 0.19, that would mean Res wouldn't find the resource, but with these changes it can cause mutable aliasing.
#[derive(Component, Default)]
struct R(usize);
impl Resource for R {}
let mut world = World::new();
world.init_resource::<R>();
world
.run_system_cached(
|res: Option<Res<R>>, single: Option<Single<&mut R, Without<IsResource>>>| {
let r1 = res.map(|r| ptr::from_ref(&*r));
let r2 = single.map(|mut s| ptr::from_mut(&mut **s).cast_const());
// These values are the same, so the `&mut` and `&` were aliasing!
println!("{r1:?} {r2:?}");
},
)
.unwrap();There was a problem hiding this comment.
Right, I see this comment of yours, @chescock:
#20934 (comment)
Back then it seems to have been decided that relying on the cache as a source of (past) truth was the way to go.
It seems to me that we would then be reliant on such a cache implementation if we don't come up with a proper solution to this problem without such caches.
I have felt that conceptually, this problem has been the same, and quite simple (!= easy), from the start:
We want to access entity data within the process of borrowing entity data.
We currently have 2 ways of "returning" entity data, & and &mut. Both can interfere with one another.
However, there really is a third, that is data that requires no access. This has somewhat come up before.
So I have wondered, can we solve this problem with this third option?
In my understanding, data like Entity, &Archetype and EntityLocation require no access because they are used by the data retrieval process itself/are encoded in the ECS infrastructure itself.
ResourceEntities functioned similarly.
The X-As-Entities efforts are struggling because they try to move ECS-wise "access-less" patterns from outside to the inside of the ECS, where we do not yet have a matching, general "access-less" retrieval pattern.
My point being, I think to properly solve this problem we need to explicitly avoid using access when retrieving entity data.
This either means we need to be outside the ECS, or add a general way to store and retrieve (probably Copy) data to the ECS.
I'm curious about the latter, though one could likely also mix the two approaches.
What do you think?
There was a problem hiding this comment.
Before this conversation becomes too speculative, I would like to ask to @chescock in particular: What things do still need to happen in this PR before you give your approval?
I'm of course happy to answer questions, but these discussions have a tendency to become abstract and I'd rather focus on the code.
In regards to the manual impl Resource problem, this is already a bug (#24686) which this PR doesn't intend to solve.
There was a problem hiding this comment.
I would like to ask to @chescock in particular: What things do still need to happen in this PR before you give your approval?
Oh, sorry if I came off as too negative! I mostly just haven't had time to do another careful review since you updated it. But I wanted to try to reply to your comments when I could.
My main sticking point will be soundness. If I can think of a way to cause UB with only safe code, I'll whine about it :). But I'm willing to approve anyway and let the SMEs make the final call if you argue that the UB is acceptable!
In this thread, I was mostly trying to establish the facts. I thought there was new UB, but you thought that nothing had changed, so I thought I had failed to explain my concern and owed you a concrete test case.
In regards to the manual
impl Resourceproblem, this is already a bug (#24686) which this PR doesn't intend to solve.
The difference here is that we're relying on it for soundness. We could resolve that by making it be unsafe trait Resource, though, and making it a safety requirement to include IsResource as a required component.
I think unsafe trait Resource plus the panic in on_discard will actually make it impossible to have an impl Resource component without IsResource is safe code, closing most of the holes!
The last hole is FilteredResources. The simplest fix is probably to modify get_by_id and get_mut_by_id to look up the entity and check for the presence of an IsResource component. Alternately, we could justify removing it in favor of Query<FilteredEntityRef, With<IsResource>>, since query.get(component_id.entity())?.get_by_id(component_id)? is now possible!
There was a problem hiding this comment.
To be clear, I agree that in the current state of the PR there is UB, and I see UB as blocking to this PR.
Making the Resource trait unsafe may be part of the solution, but I am unsure; do we rely on required components for soundness anywhere already? Under my impression (and this may very well be wrong, I don't have overview there), this infrastructure was in the space of "if something goes wrong, that is a logic error".
If so, we should be careful to make sure that the required component space aligns with being designed for soundness, otherwise we may run into other design complications over there.
Oh, wow, that's been stalled for more than a year. And I still don't feel motivated to pick it back up, so I'm just going to go mark it |
There was a problem hiding this comment.
Not related to this PR, but I discovered while writing one of those tests that this remove command will panic if the resource entity is being despawned. Which we aren't trying to support, but it gives a somewhat unhelpful panic message.
See #23988.
Change
The primary difference between this and #23988, is that this is not reliant on Entity Ranges (#24102). Because of this, there is a possibility for a performance regression with regards to
Components.componentsgoing from aSparseArraytoHashMap. There is an additional hash operation. Additionally, the fields onAccesshave changed from aFixedBitSetto aHashSet, so the set operations are likely slower.