Table Micro-optimizations #59
Replies: 6 comments 2 replies
-
This optimization sounds too risky. Branches are expensive, but even if the branch predictor misses every time, the new approach may still take more cycles: it reads/writes upvalues 2-3 times more per iteration. Luckily it is an increment/decrement by 1, so it may be treated as a pure write rather than a read. Still, the benefit to dicts or tables may not be enough; it merely shifts work from those onto arrays.
-
"type(element)" is still needed at the top, since we need to skip the entire iteration when the element is a thread/function. Otherwise we would need to be able to roll back the changes by shifting pos back. The cost isn't that high, though, so maybe we do it: just add a tempPos and, in the branch, set pos to it.
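A minimal sketch of the tempPos idea, assuming hypothetical names (`serializeValue`, `buf`, `pos`) that are not necessarily the library's real identifiers:

```luau
for _, element in array do
	local elementType = type(element)
	if elementType == "thread" or elementType == "function" then
		continue -- skip the entire iteration; nothing was written, so no rollback needed
	end
	-- Speculatively advance into tempPos; commit by setting pos to it in the branch.
	local tempPos = serializeValue(buf, pos, element, elementType)
	pos = tempPos
end
```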
-
For both serialization and deserialization, localizing the position value could improve performance. For deserialization we could localize outside the loop, but we would need to update deserialCache before and after; for serialization we could do the same. We read/write pos a lot, so it could sit in the L1 cache the entire time, in which case localizing wouldn't help. However, Luau is an interpreted language and may not have this sort of optimization for upvalues yet, so while the improvement under codegen may be minimal (possibly even a regression), the improvement without codegen could be larger.
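The localize-then-write-back pattern might look like this sketch; `readValue`, `count`, and the shape of `deserialCache` are assumptions, only the pattern itself is the point:

```luau
-- Pull the hot position value out of the cache table into a local,
-- so the loop body touches a register/stack slot instead of a table field.
local localPos = deserialCache.pos
local result = {}
for i = 1, count do
	result[i], localPos = readValue(buf, localPos)
end
-- Single write-back after the loop keeps deserialCache consistent.
deserialCache.pos = localPos
```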
-
We could pass the id we obtained from table to the respective value functions, which cuts one fastcall per value. It does break compatibility if a user relies on the current signatures, which they would for userdata, so we would need to make it an optional parameter. Boolean deserialization would get much faster (~15%); the rest would likely benefit by ~2-5%.
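As a hedged sketch of the optional-id parameter, with an illustrative wire format (the value packed into the type byte) that may not match the library's real encoding:

```luau
-- Hypothetical per-value function; only the optional `id` parameter is the proposal.
local function deserializeBoolean(buf: buffer, pos: number, id: number?): (boolean, number)
	-- Reuse the type byte the table loop already read; fall back to a second
	-- buffer.readu8 fastcall only when called directly without an id.
	local typeByte = id or buffer.readu8(buf, pos)
	-- Assumption: the boolean value is encoded in the low bit of the type byte,
	-- which is why booleans would benefit the most from skipping the re-read.
	return typeByte % 2 == 1, pos + 1
end
```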
-
This is not strictly about table, but tables are impacted the most by this suggestion performance-wise, since inflate is triggered far more often for them. What if we start out with a 2 MB buffer? We can store said buffer in init.luau and use it internally for serialization instead of an empty one. We do assume that running in parallel is impossible, but if it turns out to be possible we can fall back. This improvement still allows us to support the user passing their own buffer/pos. I expect a minor performance improvement for larger tables (2-5%) and a larger improvement for smaller tables (5-10%). This idea was inspired by another serialization library (Sera) I looked into a few months ago, and of course by the JEP draft from issue 8329758 (Faster Startup and Warmup with ZGC), since it reminded me of Sera's approach.
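A sketch of the preallocated-buffer idea; `SCRATCH_SIZE`, the in-use flag, and the function names are assumptions, not existing library code:

```luau
-- Module-level scratch buffer, created once (e.g. in init.luau).
local SCRATCH_SIZE = 2 * 1024 * 1024 -- the 2 MB suggested above
local scratch = buffer.create(SCRATCH_SIZE)
local scratchInUse = false

local function acquireBuffer(userBuf: buffer?): buffer
	if userBuf then
		return userBuf -- still support a caller-provided buffer/pos
	end
	if scratchInUse then
		-- Fallback for unexpected reentrant/parallel use: a fresh buffer
		-- that grows via the existing inflate path.
		return buffer.create(1024)
	end
	scratchInUse = true
	return scratch
end
```

On success the serializer would copy the used prefix out of the scratch buffer and clear `scratchInUse`, so the big allocation is paid once instead of on every inflate.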
-
When looking through table.luau I realized that some serialization aspects could be adjusted to possibly squeeze out more performance.
The isArray branch in the loop could be removed and replaced with a tempPos and updates to it. We would need to know when to skip a pos, though. Likely to cause regressions for arrays.
In the dict section, we could move "type(key)" as late as possible and stop adding functions/threads to the constant table. Likely to cause a noticeable improvement for datasets that use the existing table for keys.
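The dict-section change might look like the following sketch, where `constantTable`, `addConstant`, and `serializePair` are hypothetical names; the point is that keys already in the constant table never pay for the type call:

```luau
for key, value in dict do
	local keyId = constantTable[key]
	if keyId == nil then
		-- type(key) moved as late as possible: only on a constant-table miss.
		local keyType = type(key)
		if keyType == "function" or keyType == "thread" then
			continue -- never add these to the constant table
		end
		keyId = addConstant(constantTable, key, keyType)
	end
	serializePair(buf, keyId, value)
end
```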