[WIP] Cache runes in chars mode #252

triallax · 2025-07-28T01:15:47Z

WIP implementation of my proposal in #35 (comment)

unsophisticated and needs further testing and benchmarking, but already quite promising: running gron.awk on a big JSON file in chars mode goes from 12(!) minutes to 4 seconds on my laptop with this branch in its current form.
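
To illustrate why a cache can help this much, here is a minimal sketch of the general idea, not the code on this branch. The `cachedString` type and its methods are hypothetical names chosen for illustration. It assumes the cost in chars mode comes from re-decoding UTF-8 on every `length` or `substr` call, so the sketch decodes once and reuses the result.

```go
package main

import "fmt"

// cachedString is a hypothetical stand-in for an interpreter string value.
type cachedString struct {
	s       string
	runes   []rune // decoded characters, valid only once decoded is true
	decoded bool
}

// chars returns the decoded runes, decoding the string at most once.
func (c *cachedString) chars() []rune {
	if !c.decoded {
		c.runes = []rune(c.s)
		c.decoded = true
	}
	return c.runes
}

// length returns the number of characters, like AWK's length(s) in chars mode.
func (c *cachedString) length() int { return len(c.chars()) }

// substr returns up to n characters starting at 1-based position pos.
// Out-of-range arguments are clamped (a simplification of AWK's substr rules).
func (c *cachedString) substr(pos, n int) string {
	r := c.chars()
	start := pos - 1
	if start < 0 {
		n += start // drop the characters that fall before the string
		start = 0
	}
	if start >= len(r) || n <= 0 {
		return ""
	}
	if n > len(r)-start {
		n = len(r) - start
	}
	return string(r[start : start+n])
}

func main() {
	v := &cachedString{s: "مرحبا"}
	fmt.Println(v.length())     // 5 characters (10 bytes in UTF-8)
	fmt.Println(v.substr(2, 3)) // رحب
}
```

With the runes cached, a script that walks a long string one character at a time with `substr` does constant-time slicing per call instead of decoding from the start of the string each time.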

triallax · 2025-07-28T01:24:12Z

using caching for the other functions (e.g. index) is a bit more complicated, because they require knowing which bytes in the string map to which rune. for example, to use the cached runes in index("مرحبا", "ر"), we'd need to call strings.Index, which takes a string and not a slice of runes, and then somehow map the resulting byte index back to the rune/character index
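
One possible way to do that mapping, sketched under assumptions: keep a table of byte offsets next to the cached runes and binary-search it. `runeOffsets` and `charIndex` are hypothetical helpers, not GoAWK functions, and the sketch assumes valid UTF-8 so that matches always start on a character boundary.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// runeOffsets returns the byte offset at which each character of s starts.
// In a cached design this table would be built once, alongside the runes.
func runeOffsets(s string) []int {
	offsets := make([]int, 0, len(s))
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

// charIndex mimics AWK's index(): it returns the 1-based character position
// of needle in haystack, or 0 if needle does not occur.
func charIndex(haystack string, offsets []int, needle string) int {
	b := strings.Index(haystack, needle)
	if b < 0 {
		return 0
	}
	// Binary search maps the byte offset back to a character position.
	return sort.SearchInts(offsets, b) + 1
}

func main() {
	s := "مرحبا"
	offsets := runeOffsets(s)
	fmt.Println(charIndex(s, offsets, "ر")) // 2
	fmt.Println(charIndex(s, offsets, "x")) // 0
}
```

The table costs one `int` per character on top of the cached runes. A table-free alternative is `utf8.RuneCountInString(haystack[:b]) + 1`, which scans the prefix on every call instead.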

an alternative in this case is to convert the needle to runes and search for it in the haystack slice of runes, but i'm not sure if that has subtle edge cases we may care about (e.g. string normalisation? does strings.Index even handle that? do we care or even want this in the first place?)

on the other hand even with just implementing this optimisation for length and substr we're already landing decent improvements, so maybe the status quo is fine

thoughts?

triallax · 2025-08-15T00:27:08Z

@benhoyt gentle bump, no rush but i'd love to know how you think this looks for a prototype

benhoyt · 2025-08-15T20:55:38Z

Hi @triallax, I'm not opposed to the approach itself. However, the main thing I'm worried about is that it significantly slows down normal (bytes-mode) performance, presumably due to the additional allocation from the new([]rune) when creating each new value. I did a test locally (with simple.awk from https://github.com/benhoyt/countwords and kjvbible_x10.txt), and in bytes mode it took about twice as long with this change.

We could try to have a switch in the interpreter which avoids the allocation if in bytes mode. The wiring for that might be annoying, but at least we could test performance. Even then I'd want to make sure the if/else in the hot path didn't slow bytes-mode down much -- GoAWK's been fairly focussed on performance, and I'd hate to regress much.

In any case, it'd be good if you could show some performance numbers, and we could come up with a way to avoid the extra new() / allocation for every string value.
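
As a starting point for that measurement, here is a rough sketch of the mode switch together with an allocation count. The `value` type, `newValue`, and `allocsPerValue` are hypothetical and much simpler than GoAWK's real value type; only `testing.AllocsPerRun` is a real standard-library call. It assumes the cache slot is a `*[]rune` that is left nil in bytes mode.

```go
package main

import (
	"fmt"
	"testing"
)

// value is a hypothetical, simplified interpreter value.
type value struct {
	s     string
	runes *[]rune // cache slot; only allocated in chars mode
}

// newValue skips the cache allocation entirely when charsMode is false.
func newValue(s string, charsMode bool) value {
	v := value{s: s}
	if charsMode {
		v.runes = new([]rune)
	}
	return v
}

// sink keeps the result reachable so the allocation is not optimised away.
var sink value

// allocsPerValue reports the average heap allocations per newValue call.
func allocsPerValue(charsMode bool) float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = newValue("hello", charsMode)
	})
}

func main() {
	fmt.Println("bytes mode allocs per value:", allocsPerValue(false))
	fmt.Println("chars mode allocs per value:", allocsPerValue(true))
}
```

This only counts allocations. Whether the extra branch in the hot path costs anything measurable would still need a real benchmark, for example the simple.awk run mentioned above.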

Cache runes in chars mode

0f06d1c

triallax force-pushed the runes-caching branch from f98acfa to 0f06d1c · July 28, 2025 01:25
