Hello Alex! I checked the claim on the Benchmarks page that sqlean-re is 6.57x slower than sqlite-regex and found the results unreproducible. Also, the test itself is very unfair.
The unfair test
I'll start with the latter, as it is independent of the hardware. You are benchmarking sqlite-regex with this pattern:
\d{4}-\d{2}-\d{2}
For sqlean-re, however, you use this pattern:
([0-9])([0-9])([0-9])([0-9])-([0-9])([0-9])-([0-9])([0-9])
These are very different patterns. You introduced eight capturing groups into the second pattern, which makes this regexp significantly slower. The first pattern, on the other hand, has zero capturing groups. I don't think you can compare these two patterns in the same benchmark.
To make the patterns roughly equivalent, the second one should be:
[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]
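The difference between the three patterns can be checked with a small sketch. It uses Python's `re` module purely as a stand-in (neither extension uses this engine, so it says nothing about their speed), and the sample string and helper names are made up for illustration. The `re.ASCII` flag is there so that `\d` means `[0-9]`, as it does in the other patterns.

```python
import re

# The three patterns quoted in this issue.
PLAIN = r"\d{4}-\d{2}-\d{2}"
GROUPED = r"([0-9])([0-9])([0-9])([0-9])-([0-9])([0-9])-([0-9])([0-9])"
UNGROUPED = r"[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]"

# Hypothetical sample text, not taken from the benchmark data.
SAMPLE = "released 2023-01-15, patched 2023-02-03, not a date: 12-34-56"


def group_count(pattern):
    """Number of capturing groups the compiled pattern contains."""
    return re.compile(pattern, re.ASCII).groups


def find_all(pattern, text):
    """Full text of every match, ignoring any capturing groups."""
    return [m.group(0) for m in re.finditer(pattern, text, re.ASCII)]


for name, pat in [("plain", PLAIN), ("grouped", GROUPED), ("ungrouped", UNGROUPED)]:
    print(f"{name}: groups={group_count(pat)} matches={find_all(pat, SAMPLE)}")
```

All three patterns find the same two dates in the sample, but only the grouped one carries eight capturing groups; the other two have none.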
The unreproducible results
After changing the pattern for sqlean-re to be equivalent to the one for sqlite-regex, I ran your benchmark on a MacBook Air (M1, 2020) and got these results:
Benchmark 1: ./sqlite-regex.sh
Time (mean ± σ): 433.4 ms ± 1.6 ms [User: 426.2 ms, System: 5.6 ms]
Range (min … max): 431.7 ms … 436.4 ms 10 runs
Benchmark 2: ./sqlean-re.sh
Time (mean ± σ): 474.4 ms ± 3.5 ms [User: 465.9 ms, System: 6.0 ms]
Range (min … max): 469.3 ms … 479.0 ms 10 runs
Summary
'./sqlite-regex.sh' ran
1.09 ± 0.01 times faster than './sqlean-re.sh'
So much for "6.57x slower".
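The summary figure can be re-derived from the two means reported above. The sketch below assumes the uncertainty on the ratio comes from standard propagation of the relative errors; that formula is my assumption, not something taken from the benchmark tool's documentation.

```python
import math

# Means and standard deviations (ms) copied from the output above.
mean_regex, sd_regex = 433.4, 1.6   # ./sqlite-regex.sh
mean_re, sd_re = 474.4, 3.5         # ./sqlean-re.sh

# How many times slower sqlean-re is, based on mean wall time.
ratio = mean_re / mean_regex

# Assumed error propagation for a quotient of two independent means.
rel_err = math.sqrt((sd_regex / mean_regex) ** 2 + (sd_re / mean_re) ** 2)
ratio_err = ratio * rel_err

print(f"{ratio:.2f} ± {ratio_err:.2f}")  # prints "1.09 ± 0.01"
```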
I don't think that the disclaimer "Benchmarks are hard and easy to game" justifies your claims about the relative performance of different regexp implementations.