You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Consider two implementations of the same algorithm:
match map.get(&key){Some(old_value) => map.insert(key,merge(old_value, new_value)),None => map.insert(key, new_value),}
match map.entry(key){Entry::Occupied(occupied) => occupied.insert(merge(occupied.get(), new_value)),Entry::Empty(empty) => empty.insert(new_value),};
With std’s HashMap, using entry is more efficient since the returned Entry carries extra information to speed up its methods. The code above effectively only does one "map lookup", whereas the get+insert variant does two.
With OrdMap however, using entry is measurably slower in a real-world Mercurial benchmark. The returned Entry does carry extra information, OccupiedEntry::get just calls OrdMap::get, and OccupiedEntry::insert call OrdMap::get then OrdMap::get_mut. So in the occupied case, the entry-based code does four map lookups instead of one. (Three if get_mut is optimized away as its result is unused.)
Better efficiency through fewer lookups is what motivates the existence of the Entry API in std, but on OrdMap it has the opposite effect.
Could this be optimized, by having OccupiedEntry and VacantEntry each contain a PoolRef<Node> and whatever else is needed to make their methods very cheap? If not, should the entry be considered for removal since instead of bringing efficiency benefits it is actively worse?
The text was updated successfully, but these errors were encountered:
A word of caution since you mentioned real-world, there are some cripling bugs in OrdMap once the tree gains a 3rd level. I've been trying to keep it fixed here #154
Consider two implementations of the same algorithm:
With
std
’sHashMap
, usingentry
is more efficient since the returnedEntry
carries extra information to speed up its methods. The code above effectively only does one "map lookup", whereas theget
+insert
variant does two.With
OrdMap
however, usingentry
is measurably slower in a real-world Mercurial benchmark. The returnedEntry
does carry extra information,OccupiedEntry::get
just callsOrdMap::get
, andOccupiedEntry::insert
callOrdMap::get
thenOrdMap::get_mut
. So in the occupied case, the entry-based code does four map lookups instead of one. (Three ifget_mut
is optimized away as its result is unused.)im-rs/src/ord/map.rs
Lines 1551 to 1558 in 3f4e01a
im-rs/src/ord/map.rs
Lines 1608 to 1615 in 3f4e01a
Better efficiency through fewer lookups is what motivates the existence of the
Entry
API instd
, but onOrdMap
it has the opposite effect.Could this be optimized, by having
OccupiedEntry
andVacantEntry
each contain aPoolRef<Node>
and whatever else is needed to make their methods very cheap? If not, should the entry be considered for removal since instead of bringing efficiency benefits it is actively worse?The text was updated successfully, but these errors were encountered: