Draft: feat: --max-bytes-used (#2) #6

Draft: wants to merge 1 commit into base: master

thomie
Contributor

@thomie thomie commented Mar 3, 2025

When the --max-bytes-used flag is supplied, quickbench also reports the maximum residency of the program under test. This works for Haskell programs only, and they must be compiled with -rtsopts.

Example output:

+----------------------------------------------++---------------------------++-----------------------------+
|                                              ||     Time (s)              || Max bytes used              |
+==============================================++===========================++=============================+
|                                              || hledger-1.40 hledger-1.41 ||   hledger-1.40 hledger-1.41 |
+==============================================++===========================++=============================+
| -f examples/1ktxns-1kaccts.journal balance   ||         0.21         0.21 ||          3.84M        4.28M |
| -f examples/2ktxns-1kaccts.journal balance   ||         0.35         0.33 ||          7.43M        6.39M |
| -f examples/3ktxns-1kaccts.journal balance   ||         0.49         0.49 ||         10.48M       11.59M |
| -f examples/4ktxns-1kaccts.journal balance   ||         0.53         0.36 ||         14.61M       11.49M |
| -f examples/5ktxns-1kaccts.journal balance   ||         0.47         0.41 ||         18.32M       15.52M |
| -f examples/6ktxns-1kaccts.journal balance   ||         0.47         0.49 ||         21.72M       21.60M |
| -f examples/7ktxns-1kaccts.journal balance   ||         0.55         0.57 ||         22.35M       25.17M |
| -f examples/8ktxns-1kaccts.journal balance   ||         0.61         0.64 ||         22.28M       24.02M |
| -f examples/9ktxns-1kaccts.journal balance   ||         0.70         0.67 ||         31.41M       24.03M |
| -f examples/10ktxns-1kaccts.journal balance  ||         0.77         0.78 ||         36.00M       35.56M |
| -f examples/20ktxns-1kaccts.journal balance  ||         1.52         1.55 ||         72.62M       72.62M |
| -f examples/30ktxns-1kaccts.journal balance  ||         2.19         2.29 ||         85.87M       96.99M |
| -f examples/40ktxns-1kaccts.journal balance  ||         2.91         3.04 ||        120.17M      130.21M |
| -f examples/50ktxns-1kaccts.journal balance  ||         3.60         3.62 ||        129.01M      140.33M |
| -f examples/60ktxns-1kaccts.journal balance  ||         4.28         4.44 ||        162.47M      175.05M |
| -f examples/70ktxns-1kaccts.journal balance  ||         4.99         5.07 ||        195.87M      207.53M |
| -f examples/80ktxns-1kaccts.journal balance  ||         5.60         5.76 ||        211.37M      219.93M |
| -f examples/90ktxns-1kaccts.journal balance  ||         6.34         6.49 ||        241.11M      250.92M |
| -f examples/100ktxns-1kaccts.journal balance ||         6.90         7.01 ||        255.24M      264.18M |
+----------------------------------------------++---------------------------++-----------------------------+

@thomie thomie marked this pull request as draft March 3, 2025 07:44
@thomie
Contributor Author

thomie commented Mar 3, 2025

Hello @simonmichael,

In #6, I just added --max-bytes-used.

It would be quite easy to support any or all of the following ghc runtime stats:

 [("bytes allocated", "25546720")
 ,("num_GCs", "9")
 ,("average_bytes_used", "280828")
 ,("max_bytes_used", "836352")
 ,("num_byte_usage_samples", "4")
 ,("peak_megabytes_allocated", "8")
 ,("init_cpu_seconds", "0.000944")
 ,("init_wall_seconds", "0.000727")
 ,("mut_cpu_seconds", "0.019297")
 ,("mut_wall_seconds", "0.762342")
 ,("GC_cpu_seconds", "0.005914")
 ,("GC_wall_seconds", "0.006002")
 ,("exit_cpu_seconds", "0.000326")
 ,("exit_wall_seconds", "0.001238")
 ,("total_cpu_seconds", "0.026521")
 ,("total_wall_seconds", "0.770320")
 ,("major_gcs", "4")
 ,("allocated_bytes", "25546720")
 ,("max_live_bytes", "836352")
 ,("max_large_objects_bytes", "102448")
 ,("max_compact_bytes", "0")
 ,("max_slop_bytes", "45504")
 ,("max_mem_in_use_bytes", "8388608")
 ,("cumulative_live_bytes", "1123312")
 ,("copied_bytes", "4343224")
 ,("par_copied_bytes", "0")
 ,("cumulative_par_max_copied_bytes", "0")
 ,("cumulative_par_balanced_copied_bytes", "0")
 ,("fragmentation_bytes", "0")
 ,("alloc_rate", "1323842590")
 ,("productivity_cpu_percent", "0.729142")
 ,("productivity_wall_percent", "0.989658")
 ,("bound_task_count", "1")
 ,("sparks_count", "0")
 ,("sparks_converted", "0")
 ,("sparks_overflowed", "0")
 ,("sparks_dud ", "0")
 ,("sparks_gcd", "0")
 ,("sparks_fizzled", "0")
 ,("work_balance", "0.000000")
 ,("n_capabilities", "1")
 ,("task_count", "4")
 ,("peak_worker_count", "3")
 ,("worker_count", "3")
 ,("gen_0_collections", "5")
 ,("gen_0_par_collections", "0")
 ,("gen_0_cpu_seconds", "0.002513")
 ,("gen_0_wall_seconds", "0.002565")
 ,("gen_0_max_pause_seconds", "0.000854")
 ,("gen_0_avg_pause_seconds", "0.000513")
 ,("gen_1_collections", "4")
 ,("gen_1_par_collections", "0")
 ,("gen_1_cpu_seconds", "0.003401")
 ,("gen_1_wall_seconds", "0.003437")
 ,("gen_1_max_pause_seconds", "0.002737")
 ,("gen_1_avg_pause_seconds", "0.000859")
 ]
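
As far as I can tell, the list above is what a GHC-compiled program prints for `+RTS -t --machine-readable`, and it is already a valid Haskell literal of type `[(String, String)]`. A minimal sketch of pulling one value out of it, using only `base` (the names `parseRtsStats` and `lookupStat` are hypothetical, not taken from quickbench):

```haskell
import Text.Read (readMaybe)

-- | Parse the association list shown above.
-- Hypothetical helper; returns Nothing if the text is not a valid list literal.
parseRtsStats :: String -> Maybe [(String, String)]
parseRtsStats = readMaybe

-- | Look up one stat by name and read its value as an integer.
lookupStat :: String -> [(String, String)] -> Maybe Integer
lookupStat key stats = lookup key stats >>= readMaybe

main :: IO ()
main = do
  -- A shortened copy of the stats list, including the leading spaces.
  let sample = " [(\"bytes allocated\", \"25546720\")\n ,(\"max_bytes_used\", \"836352\")\n ]"
  print (parseRtsStats sample >>= lookupStat "max_bytes_used")
  -- prints: Just 836352
```

Non-integer stats such as total_wall_seconds would need a `Double` variant of `lookupStat`.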

Rather than a separate flag for each, something like --rts-stats=max_bytes_used,peak_megabytes_allocated perhaps?

What do you think?
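
A rough sketch of how the value of such an --rts-stats option could be handled: split it on commas, then pick the requested entries out of the parsed stats, reporting any unknown names. Both function names are hypothetical and nothing here reflects quickbench's actual option parsing.

```haskell
import Data.Maybe (isNothing)

-- | Split a comma-separated option value into stat names.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (name, [])       -> [name]
  (name, _ : rest) -> name : splitOnComma rest

-- | Select the requested stats, in the order requested.
-- Returns the stats that were found and the names that were not.
selectStats :: [String] -> [(String, String)] -> ([(String, String)], [String])
selectStats wanted stats =
  ( [(k, v) | k <- wanted, Just v <- [lookup k stats]]
  , [k | k <- wanted, isNothing (lookup k stats)]
  )

main :: IO ()
main = do
  let stats  = [("max_bytes_used", "836352"), ("peak_megabytes_allocated", "8")]
      wanted = splitOnComma "max_bytes_used,peak_megabytes_allocated,no_such_stat"
  print (selectStats wanted stats)
```

Returning the unknown names separately would let the tool warn about typos instead of silently dropping a column.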

@simonmichael
Owner

Woo that's a lot! Let's keep it simple, max_bytes_used looks the most useful.

This is quite a specific feature, but useful to Haskell devs like us. How about calling it -m/--mem (or -m/--mem-ghc if we want to be more accurate)?

@simonmichael
Owner

If used with a non-Haskell program, or a Haskell program not compiled the right way, what happens?

Is it worth using GHC's RTS stats, or could we get the same info by a more general technique, not depending on GHC?

@thomie
Contributor Author

thomie commented Apr 6, 2025

I think "bytes allocated" could be quite useful as well, as a proxy for runtime.

For example from https://hasura.io/blog/hasura-and-well-typed-collaborate-on-haskell-tooling#fn1:

> The number of bytes allocated acts as a proxy for the amount of computation performed, since Haskell programs tend to allocate frequently, and allocations are more consistent than CPU or wall clock time.

I can't find a better reference now, but it's what the ghc project itself used to do / does. It would be a partial answer to your question in simonmichael/hledger#2122 (comment):

> PS any ideas for simple robust automated performance testing

@thomie
Contributor Author

thomie commented Apr 6, 2025

> If used with a non-haskell program, what happens?

You get some inexplicable error message:

hGetLine: end of file

> or a haskell program not compiled the right way,

error: hledger-1.42: Most RTS options are disabled. Link with -rtsopts to enable them.

I guess it would be better to first check if the program is a haskell program and whether it's linked with -rtsopts (by checking +RTS --info), and show a warning and ignore the -m flag if it isn't. TODO.
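
A sketch of the first half of that check, assuming the output of `prog +RTS --info` has already been captured as a string. GHC's runtime prints a list of key/value pairs there, including `("GHC RTS", "YES")`. The function name `isGhcRtsInfo` is hypothetical and the version string in the sample is made up. Whether the program was linked with -rtsopts is not covered here and would need a separate probe.

```haskell
import Text.Read (readMaybe)

-- | True when the text looks like `+RTS --info` output from a GHC-compiled program.
-- Hypothetical helper: it only checks for the ("GHC RTS", "YES") entry.
isGhcRtsInfo :: String -> Bool
isGhcRtsInfo out =
  case readMaybe out :: Maybe [(String, String)] of
    Just info -> lookup "GHC RTS" info == Just "YES"
    Nothing   -> False

main :: IO ()
main = do
  -- Shortened, illustrative sample of --info output.
  print (isGhcRtsInfo " [(\"GHC RTS\", \"YES\")\n ,(\"GHC version\", \"9.4.7\")\n ]")
  -- A non-Haskell program would typically print something unrelated.
  print (isGhcRtsInfo "usage: some-other-program [options]")
```

With a check like this, quickbench could print a warning and skip the memory column rather than fail with hGetLine: end of file.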

> --mem-ghc

There are multiple different memory measurements collected by the ghc runtime system, so just "--mem-ghc" is perhaps not accurate enough (is it max or average or total allocation?).

A --rts-stats option would be the most flexible.

@simonmichael
Owner

simonmichael commented Apr 9, 2025

Currently, quickbench is positioned as a really quick and easy (quick and dirty, some might say) reporting tool - "a better time". I think adding a lot of specialist measurements doesn't fit with this. Do we think it's worthwhile/affordable to expand the scope? Our competition would be tools like bench (haskell) and hyperfine (rust).

@simonmichael
Owner

I'm not against adding a simple "memory" measurement for haskell programs, as I'd personally find that very handy, but I wonder how far to go.

@simonmichael
Owner

simonmichael commented Apr 9, 2025

I'd also love to show "transactions per second" in these reports when benchmarking hledger or other PTA apps. Supporting such custom metrics, somehow, would be another nice feature (and scope expansion).

@thomie
Contributor Author

thomie commented Apr 10, 2025

We could add a --custom-metrics-parser option to quickbench.

Its argument should be the path to a program whose task it is to parse the output of the program under test and output custom metrics in some standardized format (json/csv). quickbench would then take those metrics and report them alongside the time measurements.

In hledger, there would be a hledger-metrics.sh script which outputs, for a single run, something like:

{"memory": 123000000, "transactions per second": 20000}

Then quickbench --custom-metrics-parser=hledger-metrics.sh bench.sh reports:

                                             time  memory     transactions per second
-f examples/1ktxns-1kaccts.journal balance   0.21  123000000  20000
...                                          ...   ...        ...
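
To make the idea concrete, here is a minimal sketch of the quickbench side for the CSV variant of the metrics format, assuming the parser script prints one `name,value` line per metric. That layout and the function names are assumptions for illustration; the JSON variant would need a JSON library and is not shown.

```haskell
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- | Parse one "name,value" line; Nothing if the line is malformed.
-- The name is everything before the first comma.
parseMetricLine :: String -> Maybe (String, Double)
parseMetricLine line = case break (== ',') line of
  (name, ',' : value) | not (null name) -> (,) name <$> readMaybe value
  _                                     -> Nothing

-- | Parse the whole output of the (hypothetical) metrics script,
-- silently skipping lines that do not parse.
parseMetrics :: String -> [(String, Double)]
parseMetrics = mapMaybe parseMetricLine . lines

main :: IO ()
main = print (parseMetrics "memory,123000000\ntransactions per second,20000\n")
```

Each parsed name would become an extra column next to the time column, as in the report above.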

@simonmichael
Owner

Increasingly off topic, but while we're brainstorming: I've wished for an easy display of changes (perhaps only between two executables). E.g. to summarise results for two hledger versions I made this one by hand:

command        1k txns    10k txns  100k txns
time:
 all commands  =          =         =
memory:
 print         +slightly  -10%      -1%
 register      -10%       -18%      -5%
 balance       +slightly  -13%      -5%
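
A small sketch of how such a change summary could be computed from two measurements. The function name and the 0.5% and 1% thresholds for "=" and "slightly" are arbitrary assumptions, chosen only to produce labels like those in the hand-made table.

```haskell
-- | Describe how `new` compares with `old`, e.g. "=", "+slightly", "-10%".
-- Thresholds are illustrative assumptions, not taken from any tool.
describeChange :: Double -> Double -> String
describeChange old new
  | old <= 0      = "n/a"                 -- avoid dividing by zero
  | abs pct < 0.5 = "="                   -- treat tiny differences as noise
  | abs pct < 1   = sign ++ "slightly"
  | otherwise     = sign ++ show (round (abs pct) :: Int) ++ "%"
  where
    pct  = (new - old) / old * 100
    sign = if pct < 0 then "-" else "+"

main :: IO ()
main = do
  putStrLn (describeChange 100 90)     -- a 10% decrease
  putStrLn (describeChange 100 100)    -- no change
  putStrLn (describeChange 100 100.7)  -- a small increase
```

A real report would probably want the noise threshold to depend on how much the measurements vary between runs.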
