Faster integer formatting #767

Merged
merged 2 commits into from
Mar 25, 2025
radiospiel
Contributor

@radiospiel radiospiel commented Mar 16, 2025

This PR provides an alternative implementation for a long → decimal conversion.

The main difference is that it uses an algorithm pulled from https://github.com/jeaiii/itoa. The source there is C++; it was converted by hand to C for inclusion in this gem. A writeup of this algorithm can be found here. jeaiii's algorithm is covered by the MIT License, see source code.

In addition, this version now generates the string directly into the fbuffer, forgoing the need for a separate memory copy.

As a result, I see a speedup of 32% on Apple Silicon M1 for an integer set of benchmarks.
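
To illustrate the "write directly into the buffer" idea, here is a minimal sketch. It is not jeaiii's algorithm and not the code in this PR: it uses the simpler, well-known two-digits-per-lookup technique, and `write_long_decimal`, `count_digits` and `DIGIT_PAIRS` are hypothetical names chosen for this example. The point it shows is that once the length is known, the digits can be filled in back to front at their final position, so no scratch buffer and no copy are needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* "00", "01", ..., "99": one lookup yields two output characters. */
static const char DIGIT_PAIRS[] =
    "00010203040506070809"
    "10111213141516171819"
    "20212223242526272829"
    "30313233343536373839"
    "40414243444546474849"
    "50515253545556575859"
    "60616263646566676869"
    "70717273747576777879"
    "80818283848586878889"
    "90919293949596979899";

/* Number of decimal digits in v (at least 1). */
static size_t count_digits(uint64_t v)
{
    size_t n = 1;
    while (v >= 10) {
        v /= 10;
        n++;
    }
    return n;
}

/* Hypothetical helper: writes the decimal form of `value` at `dst`
 * (no NUL terminator) and returns the number of bytes written.
 * A 64-bit value needs at most 20 bytes, including the sign. */
static size_t write_long_decimal(char *dst, size_t capacity, int64_t value)
{
    /* Negate in unsigned arithmetic so INT64_MIN does not overflow. */
    uint64_t magnitude = value < 0 ? 0 - (uint64_t)value : (uint64_t)value;
    size_t sign = value < 0 ? 1 : 0;
    size_t len = sign + count_digits(magnitude);
    assert(len <= capacity);

    if (sign) {
        dst[0] = '-';
    }

    /* Fill from the last digit backwards, two digits per step. */
    char *p = dst + len;
    while (magnitude >= 100) {
        p -= 2;
        memcpy(p, DIGIT_PAIRS + 2 * (magnitude % 100), 2);
        magnitude /= 100;
    }
    if (magnitude >= 10) {
        p -= 2;
        memcpy(p, DIGIT_PAIRS + 2 * magnitude, 2);
    } else {
        *--p = (char)('0' + magnitude);
    }
    return len;
}
```

A caller would reserve 20 bytes at the end of its output buffer, call the helper with a pointer to that spot, and then advance the buffer length by the returned count.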

== Encoding ints (45025 bytes)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
        json (local)     1.566k i/100ms
Calculating -------------------------------------
        json (local)     16.322k (± 3.9%) i/s   (61.27 μs/i) -     82.998k in   5.093499s

Normalize to 8009 byte
== Encoding ints (45025 bytes)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
       json (2.10.2)     1.228k i/100ms
Calculating -------------------------------------
       json (2.10.2)     12.352k (± 0.5%) i/s   (80.96 μs/i) -     62.628k in   5.070472s
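
For reference, the 32% figure follows from the two iterations-per-second numbers above. The small helper below (a hypothetical name, shown only to make the arithmetic explicit) computes the relative speedup as new / old - 1.

```c
#include <assert.h>

/* Relative speedup in percent, given throughput in iterations per second. */
static double speedup_percent(double new_ips, double old_ips)
{
    assert(old_ips > 0.0);
    return (new_ips / old_ips - 1.0) * 100.0;
}
```

With 16,322 i/s for the local build and 12,352 i/s for json 2.10.2, `speedup_percent(16322.0, 12352.0)` comes out at roughly 32.1. The per-iteration times (80.96 μs vs. 61.27 μs) give the same ratio.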

These benchmarks are run via a script (link) which is based on the gem's benchmark/encoder.rb file. There are probably better ways to run benchmarks :) My version allows combining multiple test cases into a single one.

The script adds a "cents" test case (link) which explicitly covers integers in a range that is relevant for financial institutions. (Comparing benchmarks for that case shows a speedup of 41%.)

The dumps benchmark, which covers the JSON files in benchmark/data/*.json, sees a speedup of ~8%.

@radiospiel radiospiel changed the title Adds an improved ltoa implementation Adds an improved ltoa implementation [NOT READY YET] Mar 16, 2025
@radiospiel radiospiel changed the title Adds an improved ltoa implementation [NOT READY YET] Faster integer formatting [NOT READY YET] Mar 16, 2025
@radiospiel radiospiel force-pushed the eno/pr/fixnums branch 2 times, most recently from f68e7bc to a79d618 Compare March 16, 2025 18:52
@radiospiel radiospiel changed the title Faster integer formatting [NOT READY YET] Faster integer formatting Mar 16, 2025
@radiospiel
Contributor Author

Note: see this for a discussion of the licensing concerns. Marking this PR as draft for the time being.

@radiospiel radiospiel marked this pull request as draft March 17, 2025 08:52
@radiospiel
Contributor Author

radiospiel commented Mar 21, 2025

The Windows build errors have been fixed. This PR is ready for review.

The changes are most impactful for long integers. A test on 19-digit numbers shows a 65% speedup; the "cents" test (a mix of numbers between 1 and 7 digits) shows 41%. On 2-digit numbers the gains are still ~11%. (All tested on macOS, Apple Silicon M1.)

I think a 32-bit version could be built and might show some small gains. The C++ version, which is the base for these changes, specialises templates for various integer widths. I didn't want to go /that/ far here.
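
As a rough illustration of what width specialisation could look like in plain C (without templates), here is a sketch that dispatches to a 32-bit routine when the value fits. This is not part of the PR, all function names are hypothetical, and the digit loop is a deliberately simple one-digit-at-a-time version rather than the algorithm used here. Whether the 32-bit path is actually faster depends on the target; on 64-bit hardware the difference may be negligible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit path: uses only 32-bit division and modulo. */
static size_t write_u32(char *dst, uint32_t v)
{
    char tmp[10];                    /* UINT32_MAX has 10 digits */
    size_t pos = sizeof tmp;
    do {
        tmp[--pos] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);
    memcpy(dst, tmp + pos, sizeof tmp - pos);
    return sizeof tmp - pos;
}

/* 64-bit path: same loop, but with 64-bit arithmetic. */
static size_t write_u64(char *dst, uint64_t v)
{
    char tmp[20];                    /* UINT64_MAX has 20 digits */
    size_t pos = sizeof tmp;
    do {
        tmp[--pos] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);
    memcpy(dst, tmp + pos, sizeof tmp - pos);
    return sizeof tmp - pos;
}

/* Picks the narrower routine when the value fits into 32 bits.
 * Writes no NUL terminator; returns the number of bytes written. */
static size_t write_unsigned(char *dst, size_t capacity, uint64_t v)
{
    assert(capacity >= 20);
    if (v <= UINT32_MAX) {
        return write_u32(dst, (uint32_t)v);
    }
    return write_u64(dst, v);
}
```

The C++ original gets this kind of selection at compile time through template specialisation; in C the choice has to be made by a runtime branch like the one above, or by separate entry points per integer type.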

@radiospiel radiospiel marked this pull request as ready for review March 21, 2025 23:06
This commit provides an alternative implementation for a
long → decimal conversion.

The main difference is that it uses an algorithm pulled from
https://github.com/jeaiii/itoa.
The source there is C++; it was converted by hand to C for
inclusion in this gem.
jeaiii's algorithm is covered by the MIT License, see source code.

In addition, this version now generates the string directly into
the fbuffer, forgoing the need for a separate memory copy.

As a result, I see a speedup of 32% on Apple Silicon M1 for an
integer set of benchmarks.
Some relatively minor changes to bring the library more in line
with the gem: some renaming, etc.
@byroot byroot merged commit 2614b12 into ruby:master Mar 25, 2025
33 checks passed
@radiospiel radiospiel deleted the eno/pr/fixnums branch March 27, 2025 15:08