Improve algorithm to count digits in Long #413

Open: wants to merge 1 commit into base: develop
58 changes: 29 additions & 29 deletions core/common/src/Sinks.kt
@@ -87,35 +87,7 @@ public fun Sink.writeDecimalLong(long: Long) {
         negative = true
     }

-    // Binary search for character width which favors matching lower numbers.
-    var width =
-        if (v < 100000000L)
-            if (v < 10000L)
-                if (v < 100L)
-                    if (v < 10L) 1
-                    else 2
-                else if (v < 1000L) 3
-                else 4
-            else if (v < 1000000L)
-                if (v < 100000L) 5
-                else 6
-            else if (v < 10000000L) 7
-            else 8
-        else if (v < 1000000000000L)
-            if (v < 10000000000L)
-                if (v < 1000000000L) 9
-                else 10
-            else if (v < 100000000000L) 11
-            else 12
-        else if (v < 1000000000000000L)
-            if (v < 10000000000000L) 13
-            else if (v < 100000000000000L) 14
-            else 15
-        else if (v < 100000000000000000L)
-            if (v < 10000000000000000L) 16
-            else 17
-        else if (v < 1000000000000000000L) 18
-        else 19
+    var width = countDigitsIn(v)
     if (negative) {
         ++width
     }
@@ -135,6 +107,34 @@ public fun Sink.writeDecimalLong(long: Long) {
     }
 }

+private fun countDigitsIn(v: Long): Int {
+    val guess = ((64 - v.countLeadingZeroBits()) * 10) ushr 5
+    return guess + (if (v > DigitCountToLargestValue[guess]) 1 else 0)

@fzhinkin (Collaborator), Nov 14, 2024:

IIRC from when I read Romain's blog post, extending DigitCountToLargestValue to the next power-of-two length (32 in this case) and replacing DigitCountToLargestValue[guess] with DigitCountToLargestValue[guess.and(0x1f)] can gain a few extra percent of performance on the JVM, as it should be able to optimize out the bounds check performed on array access.

Author:

DigitCountToLargestValue is actually slightly different from the table used in the blog post:

private val PowersOfTen = longArrayOf(
    0,
    10,
    100,
    1000,
    10000,
    100000,
    1000000,
    10000000,
    100000000,
    1000000000,
    10000000000,
    100000000000,
    1000000000000,
    10000000000000,
    100000000000000,
    1000000000000000,
    10000000000000000,
    100000000000000000,
    1000000000000000000
)

The main reason is that the original table doesn't work when the input is Long.MAX_VALUE: that value is bigger than 10^18 (the last value in the array), but 10^19 is outside the Long range.

I wonder if the one in the PR performs better? Worth benchmarking them against each other?
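
To make the Long.MAX_VALUE point concrete, here is a small sketch. It assumes the bit-length estimate from this PR's countDigitsIn is applied unchanged to the 19-entry table quoted above; the names powersOfTen, guessFor and tableEntryOrNull are hypothetical and come from neither the blog post nor the PR.

```kotlin
// The 19-entry table quoted above (indices 0..18), rebuilt programmatically.
// Index 0 keeps the default value 0; index i holds 10^i for i in 1..18.
val powersOfTen = LongArray(19).also { table ->
    var power = 1L
    for (i in 1..18) {
        power *= 10
        table[i] = power
    }
}

// Assumption: the same estimate as the PR's countDigitsIn, (bit length * 10) / 32.
fun guessFor(v: Long): Int = ((64 - v.countLeadingZeroBits()) * 10) ushr 5

// Returns the table entry the estimate points at, or null when the index falls outside the table.
fun tableEntryOrNull(v: Long): Long? = powersOfTen.getOrNull(guessFor(v))
```

Under that assumption the estimate for Long.MAX_VALUE is 19, which has no slot in a table ending at 10^18, and the missing entry 10^19 cannot be stored in a Long. The table in the PR avoids this by storing the largest value per digit count and ending with Long.MAX_VALUE.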

Collaborator:

What I meant is that loads from the DigitCountToLargestValue table are compiled into code that checks whether the index is within the array's bounds before performing the load.
However, if the compiler can prove that indices are always in bounds, it will skip generating the check.
By expanding the table to a power-of-two length (filling the meaningless cells with, say, -1) and then explicitly truncating the index's most significant bits (i.e., taking the remainder of dividing the index by the table's length), we can hint to the compiler that the index is always in bounds, and it will generate faster code: https://gist.github.com/fzhinkin/42997a2cfc18a437f88e9c31bef969c9
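
A minimal sketch of the suggestion above, not a copy of the linked gist. The names paddedDigitCountToLargestValue and countDigitsMasked are hypothetical, and the sketch assumes a non-negative input, as in writeDecimalLong after the sign has been handled.

```kotlin
// Hypothetical padded variant of the PR's table: 32 entries instead of 20.
// Indices 0..19 match DigitCountToLargestValue; indices 20..31 are filler (-1) and never reached.
val paddedDigitCountToLargestValue = LongArray(32) { -1L }.also { table ->
    var largest = 0L
    for (digits in 1..18) {
        largest = largest * 10 + 9 // 9, 99, 999, ... up to eighteen nines
        table[digits] = largest
    }
    table[19] = Long.MAX_VALUE
}

fun countDigitsMasked(v: Long): Int {
    require(v >= 0) { "sketch only handles non-negative values" }
    val guess = ((64 - v.countLeadingZeroBits()) * 10) ushr 5
    // guess is at most 19, so the mask never changes the index; it only makes
    // the 0..31 range visible to a compiler that can use it to drop the bounds check.
    return guess + if (v > paddedDigitCountToLargestValue[guess and 0x1f]) 1 else 0
}
```

Whether the bounds check actually disappears depends on the runtime and its compiler, so any gain would need to be measured per platform.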

BTW I checked and on Android the power-of-two array + truncation doesn't remove the bounds check. It just adds an extra instruction. See https://godbolt.org/z/jdTzMcxbf

+}
+
+private val DigitCountToLargestValue = longArrayOf(
+    -1, // Every value has more than 0 digits.
+    9L, // For 1 digit (index 1), the largest value is 9.
+    99L,
+    999L,
+    9999L,
+    99999L,
+    999999L,
+    9999999L,
+    99999999L,
+    999999999L,
+    9999999999L,
+    99999999999L,
+    999999999999L,
+    9999999999999L,
+    99999999999999L,
+    999999999999999L,
+    9999999999999999L,
+    99999999999999999L,
+    999999999999999999L, // For 18 digits (index 18), the largest value is 999999999999999999.
+    Long.MAX_VALUE, // For 19 digits (index 19), the largest value is MAX_VALUE.
+)
+
 /**
  * Writes [long] to this sink in hexadecimal form (i.e., as a string in base 16).
  *
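
The new countDigitsIn relies on the estimate (bit length * 10) ushr 5, which approximates multiplying by log10(2) ≈ 0.301 with 10/32 = 0.3125, being either the exact digit count or one less, so that a single table comparison can correct it. The sketch below checks that property for the smallest and largest value of every bit length; estimate and estimateIsCloseForAllBitLengths are hypothetical names used only for this illustration.

```kotlin
// Same estimate as in countDigitsIn: bit length of v scaled by 10/32.
fun estimate(v: Long): Int = ((64 - v.countLeadingZeroBits()) * 10) ushr 5

// The estimate is constant within a bit length and the digit count only grows with v,
// so checking the smallest and largest value of each bit length covers all positive inputs.
fun estimateIsCloseForAllBitLengths(): Boolean {
    for (bits in 1..63) {
        val smallest = 1L shl (bits - 1)
        val largest = if (bits == 63) Long.MAX_VALUE else (1L shl bits) - 1
        for (v in longArrayOf(smallest, largest)) {
            val digits = v.toString().length
            val guess = estimate(v)
            if (guess != digits && guess != digits - 1) return false
        }
    }
    return true
}
```

Zero has bit length 0 and an estimate of 0, which is why index 0 of the table holds -1: the comparison v > -1 always adds the missing digit.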
36 changes: 36 additions & 0 deletions core/common/test/AbstractSinkTest.kt
@@ -343,6 +343,42 @@ abstract class AbstractSinkTest internal constructor(
         assertLongDecimalString("10000000000000000", 10000000000000000L)
         assertLongDecimalString("100000000000000000", 100000000000000000L)
         assertLongDecimalString("1000000000000000000", 1000000000000000000L)
+        assertLongDecimalString("-9", -9L)
+        assertLongDecimalString("-99", -99L)
+        assertLongDecimalString("-999", -999L)
+        assertLongDecimalString("-9999", -9999L)
+        assertLongDecimalString("-99999", -99999L)
+        assertLongDecimalString("-999999", -999999L)
+        assertLongDecimalString("-9999999", -9999999L)
+        assertLongDecimalString("-99999999", -99999999L)
+        assertLongDecimalString("-999999999", -999999999L)
+        assertLongDecimalString("-9999999999", -9999999999L)
+        assertLongDecimalString("-99999999999", -99999999999L)
+        assertLongDecimalString("-999999999999", -999999999999L)
+        assertLongDecimalString("-9999999999999", -9999999999999L)
+        assertLongDecimalString("-99999999999999", -99999999999999L)
+        assertLongDecimalString("-999999999999999", -999999999999999L)
+        assertLongDecimalString("-9999999999999999", -9999999999999999L)
+        assertLongDecimalString("-99999999999999999", -99999999999999999L)
+        assertLongDecimalString("-999999999999999999", -999999999999999999L)
+        assertLongDecimalString("-10", -10L)
+        assertLongDecimalString("-100", -100L)
+        assertLongDecimalString("-1000", -1000L)
+        assertLongDecimalString("-10000", -10000L)
+        assertLongDecimalString("-100000", -100000L)
+        assertLongDecimalString("-1000000", -1000000L)
+        assertLongDecimalString("-10000000", -10000000L)
+        assertLongDecimalString("-100000000", -100000000L)
+        assertLongDecimalString("-1000000000", -1000000000L)
+        assertLongDecimalString("-10000000000", -10000000000L)
+        assertLongDecimalString("-100000000000", -100000000000L)
+        assertLongDecimalString("-1000000000000", -1000000000000L)
+        assertLongDecimalString("-10000000000000", -10000000000000L)
+        assertLongDecimalString("-100000000000000", -100000000000000L)
+        assertLongDecimalString("-1000000000000000", -1000000000000000L)
+        assertLongDecimalString("-10000000000000000", -10000000000000000L)
+        assertLongDecimalString("-100000000000000000", -100000000000000000L)
+        assertLongDecimalString("-1000000000000000000", -1000000000000000000L)
     }

     private fun assertLongDecimalString(string: String, value: Long) {