bench: Add memory-limited FIO benchmarks with breach detection by yerzhan7 · Pull Request #1808 · awslabs/mountpoint-s3

yerzhan7 · 2026-04-16T14:11:31Z

Summary

Extend the existing FIO throughput benchmarks CI so that each supported workload also runs with Mountpoint built using --features mem_limiter and mounted with --max-memory-target=512 (a fixed value, in MiB). Under this extra pressure, Mountpoint must stay within the configured budget; the new jobs surface per-test peak memory usage in a GitHub Actions step summary table so that regressions and improvements in the memory limiter are easy to spot.

Only the throughput (non-latency) benchmarks get a memory-limited variant. The same FIO job definitions are used for both the regular and memory-limited variants.

- The existing bench (S3 Standard), cache-bench, and S3 Express bench jobs are extended with a strategy.matrix that fans each one out to two variants:
  - default: unchanged behaviour.
  - mem-limited: builds with --features mem_limiter, runs with S3_MAX_MEMORY_TARGET_MIB=512, and emits an extra GitHub Actions summary table.

  The matrix also drives per-variant job name suffixes, gh-pages chart sub-paths, and S3 results sub-prefixes, so the two variants don't collide.
- A new script, .github/actions/scripts/render-mem-summary.sh, renders a Markdown table to $GITHUB_STEP_SUMMARY with per-test peak RSS, the limit, a breach flag, and peak reserved memory per area/kind. The step is gated on matrix.variant == 'mem-limited'.
- The shared benchmark scripts (fs_bench.sh, fs_cache_bench.sh) are now parameterised via the S3_MAX_MEMORY_TARGET_MIB env variable. When it is set, they:
  - Build with --features mem_limiter.
  - Mount with --max-memory-target=<N>.
  - Ask mount-s3-log-analyzer for an additional JSON file via --mem-limit-mib=<N> --extra-metrics-out=<PATH>.

  When it is unset, behaviour is unchanged.
- mount-s3-log-analyzer gains two optional flags, --mem-limit-mib and --extra-metrics-out, wired together with clap's requires so that either both are set or neither is. When both are set, the analyzer also parses Mountpoint metric log lines for:
  - mem.bytes_reserved[area=prefetch]
  - mem.bytes_reserved[area=upload]
  - pool.reserved_bytes[kind=get_object]
  - pool.reserved_bytes[kind=put_object]

  It then writes JSON with the test name, peak RSS in MiB, memory limit in MiB, a `breached` flag derived from `peak_rss_mib` and the limit, and the peak reserved values for the metrics above.
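The analyzer behaviour described above can be sketched roughly as follows. This is a Python illustration, not the Rust code in mount-s3-log-analyzer. The `<metric>: <value>` layout of the log lines, the helper name `extra_metrics`, every JSON key other than `breached` and `peak_rss_mib`, and the strict `>` used for the breach check are all assumptions made for the example.

```python
import json
import re

# Metric names come from the PR description. The output key names and the
# "<metric>: <value>" log layout are assumptions for this sketch.
METRICS = {
    "peak_prefetch_reserved_mib": "mem.bytes_reserved[area=prefetch]",
    "peak_upload_reserved_mib": "mem.bytes_reserved[area=upload]",
    "peak_pool_get_object_mib": "pool.reserved_bytes[kind=get_object]",
    "peak_pool_put_object_mib": "pool.reserved_bytes[kind=put_object]",
}

# Unanchored patterns, because each log line starts with a timestamp,
# level and target before the metric name.
PATTERNS = {
    key: re.compile(re.escape(name) + r":\s*(\d+)")
    for key, name in METRICS.items()
}

MIB = 1024 * 1024


def extra_metrics(test_name, peak_rss_mib, mem_limit_mib, log_lines):
    """Build the extra-metrics record for one test (hypothetical helper)."""
    peaks = {}  # holds only the metrics that actually appeared in the logs
    for line in log_lines:
        for key, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                peaks[key] = max(peaks.get(key, 0), int(match.group(1)))

    record = {
        "name": test_name,
        "peak_rss_mib": peak_rss_mib,
        "mem_limit_mib": mem_limit_mib,
        # Assumption: a breach means peak RSS strictly above the limit.
        "breached": peak_rss_mib > mem_limit_mib,
    }
    # Metrics that never appeared are left out instead of being reported as 0.
    record.update({key: value / MIB for key, value in peaks.items()})
    return record


# Made-up sample lines; the prefix is only meant to resemble a log prefix.
logs = [
    "2026-04-21T10:00:00Z  INFO example::metrics: mem.bytes_reserved[area=prefetch]: 33554432",
    "2026-04-21T10:00:05Z  INFO example::metrics: mem.bytes_reserved[area=prefetch]: 71827456",
    "2026-04-21T10:00:05Z  INFO example::metrics: pool.reserved_bytes[kind=get_object]: 67108864",
]
print(json.dumps(extra_metrics("rand_read_4t_direct", 22.0625, 512, logs), indent=2))
```

Because absent metrics are simply missing from the record, a consumer can tell "never emitted" apart from a genuine zero reservation.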

Example GitHub Actions summary

https://github.com/awslabs/mountpoint-s3/actions/runs/24720419069

| Test | Peak RSS (MiB) | Memory Limit (MiB) | Status | Peak Prefetch Reserved (MiB) | Peak Upload Reserved (MiB) | Peak Pool GetObject (MiB) | Peak Pool PutObject (MiB) |
|---|---|---|---|---|---|---|---|
| mix_1r4w | 1562.546875 | 512 | ❌ BREACHED | 32 | N/A | 32 | 1376 |
| rand_read_4t_direct | 22.0625 | 512 | ✅ OK | 68.5 | N/A | 64 | N/A |

Notes:

- Breach is non-fatal: the ❌ is informational, and the CI job does not fail on a breach.
- A metric is rendered as N/A only when Mountpoint never emitted it in the logs (e.g. pool.reserved_bytes[kind=get_object] in a write-only workload). If the metric was emitted with value 0, the column shows 0.0.
- The _extra_metrics.json file is consumed only by the memory summary step. It is not fed into the gh-pages benchmark charts.
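The N/A rule from the notes can be illustrated with a small sketch. The real rendering is done by the shell script render-mem-summary.sh; this Python version only mirrors the described behaviour. The JSON key names, the helper names, and the test name `seq_write` are hypothetical.

```python
import os

# Column order follows the example table; the JSON key names are hypothetical.
METRIC_KEYS = [
    "peak_prefetch_reserved_mib",
    "peak_upload_reserved_mib",
    "peak_pool_get_object_mib",
    "peak_pool_put_object_mib",
]


def render_row(record):
    """Render one Markdown table row for a single test record."""
    status = "❌ BREACHED" if record["breached"] else "✅ OK"
    cells = [
        record["name"],
        str(record["peak_rss_mib"]),
        str(record["mem_limit_mib"]),
        status,
    ]
    # A missing key means the metric never appeared in the logs: show N/A.
    # A key that is present with value 0.0 is rendered as "0.0".
    cells += [str(record[k]) if k in record else "N/A" for k in METRIC_KEYS]
    return "| " + " | ".join(cells) + " |"


def append_to_step_summary(lines):
    """Append lines to the file named by GITHUB_STEP_SUMMARY, if it is set."""
    path = os.environ.get("GITHUB_STEP_SUMMARY")
    if path:
        with open(path, "a", encoding="utf-8") as summary:
            summary.write("\n".join(lines) + "\n")


row = render_row({
    "name": "seq_write",
    "peak_rss_mib": 100.5,
    "mem_limit_mib": 512,
    "breached": False,
    "peak_upload_reserved_mib": 0.0,  # emitted with value 0, so not N/A
})
print(row)
```

In a workflow step, `append_to_step_summary([header, separator, row])` would add the table to the job summary, since GitHub Actions exposes the summary file path through `GITHUB_STEP_SUMMARY`.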

Where results are stored

Memory-limited results are stored under distinct mem_limited sub-paths so they don't collide with the existing charts:

| Workload | Throughput chart path | Peak-memory chart path |
|---|---|---|
| S3 Standard throughput | `dev/bench/mem_limited` | `dev/bench/mem_limited/peak_mem_usage` |
| S3 Standard cache | `dev/cache_bench/mem_limited` | `dev/cache_bench/mem_limited/peak_mem_usage` |
| S3 Express One Zone throughput | `dev/s3-express/bench/mem_limited` | `dev/s3-express/bench/mem_limited/peak_mem_usage` |

Why a separate `_extra_metrics.json`?

The existing <test>_peak_mem.json follows the {name, value, unit} schema required by benchmark-action/github-action-benchmark and feeds the gh-pages charts. Adding more fields there would pollute the charts for non-memory-limited runs. Keeping the file separate lets each consumer (benchmark-action vs. GH Actions summary) receive only what it needs.

Does this change impact existing behavior?

No - adding new benchmarks only.

Does this change need a changelog entry? Does it require a version change?

No.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

Add memory-limited siblings for the throughput benchmarks in bench.yml (S3 Standard bench + cache-bench) and bench_s3express.yml (bench). The new jobs build Mountpoint with --features mem_limiter, mount with --max-memory-target=512, and publish charts under dev/.../mem_limited/.

The FIO benchmark scripts are now shared between the regular and memory-limited variants, driven by the S3_MAX_MEMORY_TARGET_MIB env variable. When set, they pass --features mem_limiter to cargo run, --max-memory-target to Mountpoint, and --mem-limit-mib to the log analyzer.

Extend mount-s3-log-analyzer with an optional --mem-limit-mib flag that, when set, also emits <out_dir>/<test>_extra_metrics.json with peak RSS, reserved memory peaks for prefetch/upload areas and get_object/put_object pool kinds, and a breached flag.

The _extra_metrics.json files are consumed only by the memory-limited CI jobs' Render memory summary step (render-mem-summary.sh), which renders a Markdown table to GITHUB_STEP_SUMMARY. They are not fed into the gh-pages benchmark charts. Breach is informational only; it does not fail the job.

Signed-off-by: Yerzhan Mazhkenov <20302932+yerzhan7@users.noreply.github.com>
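The env-variable switch described in this commit message can be pictured as follows. The real logic lives in the shell scripts fs_bench.sh and fs_cache_bench.sh; this Python sketch only mirrors the described behaviour, and the function name `mem_limit_args` and the returned dictionary layout are hypothetical.

```python
def mem_limit_args(env):
    """Return the extra arguments implied by S3_MAX_MEMORY_TARGET_MIB.

    `env` is a mapping such as os.environ. When the variable is unset or
    empty, every list is empty and the benchmark behaves as before.
    """
    target = env.get("S3_MAX_MEMORY_TARGET_MIB", "")
    if not target:
        return {"cargo": [], "mount": [], "analyzer": []}
    return {
        # Build Mountpoint with the memory limiter compiled in.
        "cargo": ["--features", "mem_limiter"],
        # Mount with the configured memory budget (MiB).
        "mount": [f"--max-memory-target={target}"],
        # Tell the log analyzer which limit to compare peak RSS against.
        "analyzer": [f"--mem-limit-mib={target}"],
    }


print(mem_limit_args({}))
print(mem_limit_args({"S3_MAX_MEMORY_TARGET_MIB": "512"}))
```

Keeping a single switch means the regular and memory-limited variants share the same scripts and FIO job definitions, and differ only in these appended arguments.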

chore(ci): Temporarily disable non-mem-limited CI jobs and reduce iterations

Disable bench, latency-bench, and cache-bench jobs in bench.yml and bench_s3express.yml to only run mem-limited-bench and mem-limited-cache-bench jobs during memory limiter development. Reduce FIO iterations from 10 to 1 in fs_bench.sh and fs_cache_bench.sh for faster CI turnaround. This commit is intended to be reverted once development iteration is complete. Signed-off-by: Yerzhan Mazhkenov <20302932+yerzhan7@users.noreply.github.com>

Mountpoint log lines are prefixed with a tracing-subscriber timestamp, level, and target, so patterns anchored with ^ never match. Restore the original unanchored peak-RSS pattern and apply the same approach to the new labeled-metric pattern for mem.bytes_reserved and pool.reserved_bytes. Also restore the original Vec<u64> + max collection for peak RSS to keep the diff minimal. Without this fix, memory-limited CI runs produced extra_metrics.json files with all-zero values because every log line failed to match. Signed-off-by: Yerzhan Mazhkenov <20302932+yerzhan7@users.noreply.github.com>
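The anchoring problem is easy to reproduce in isolation. The sketch below uses Python's `re` module rather than the analyzer's Rust regexes, and the sample log line, including its prefix and the `<metric>: <value>` layout, is made up for illustration.

```python
import re

# Made-up log line: a timestamp, level and target precede the metric name.
line = (
    "2026-04-17T12:47:03.123456Z  INFO example::metrics: "
    "pool.reserved_bytes[kind=put_object]: 1442840576"
)

metric = re.escape("pool.reserved_bytes[kind=put_object]")

# Anchored at the start of the line: the prefix prevents any match.
anchored = re.compile(r"^" + metric + r":\s*(\d+)")
# Unanchored: the metric is found wherever it appears in the line.
unanchored = re.compile(metric + r":\s*(\d+)")

print(anchored.search(line))
print(unanchored.search(line).group(1))
```

With the anchored pattern every line fails to match, which is consistent with the all-zero extra_metrics.json files described in the commit message.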

A gauge metric is not emitted in Mountpoint logs until something actually reserves against it (e.g. pool.reserved_bytes[kind=get_object] does not appear in a write-only workload). Previously we defaulted absent keys to 0.0 in extra_metrics.json, which is indistinguishable from a true zero reservation. Drop absent keys from extra_metrics.json entirely and render them as "N/A" in the GitHub Actions summary table. This also makes the schema forward-compatible with future label values such as pool.reserved_bytes[kind=append] for incremental uploads. Signed-off-by: Yerzhan Mazhkenov <20302932+yerzhan7@users.noreply.github.com>

Replace the combined mem.bytes_reserved|pool.reserved_bytes regex and the (metric,labels) HashMap with four dedicated regexes and a fixed [Option<u64>; 4] table keyed by output field name. Drops the unreachable unit-suffix branch and simplifies the match-and-update code. Output schema and N/A rendering behaviour are unchanged. Signed-off-by: Yerzhan Mazhkenov <20302932+yerzhan7@users.noreply.github.com>

Add links to the three new memory-limited chart paths under the Regression Testing section of BENCHMARKING.md. Add a "## Memory Breach Detection" heading above the GitHub Actions summary table emitted by render-mem-summary.sh. Signed-off-by: Yerzhan Mazhkenov <20302932+yerzhan7@users.noreply.github.com>

- Fold --max-memory-target into optional_args in fs_bench.sh and fs_cache_bench.sh instead of keeping a separate mem_target_arg.
- Take the extra-metrics output path as an explicit --extra-metrics-out CLI flag in mount-s3-log-analyzer instead of deriving it from the peak-memory output file's parent directory; pair it with --mem-limit-mib via clap's `requires` so either both are set or neither.
- Simplify peak tracking with Option::max.

Signed-off-by: Yerzhan Mazhkenov <20302932+yerzhan7@users.noreply.github.com>

Replace the separate *-mem-limited jobs with a strategy.matrix on the existing bench/cache-bench jobs. Each job now fans out to `default` and `mem-limited` variants driven by matrix vars (features, env, name/path suffixes, results subdir). The `Render memory summary` step is gated on the `mem-limited` variant. Results are uploaded to a `matrix.results_subdir`-derived S3 prefix so the two variants don't collide (github.job no longer disambiguates across matrix variants). Signed-off-by: Yerzhan Mazhkenov <20302932+yerzhan7@users.noreply.github.com>

passaro

LGTM!

Extend the S3 Express throughput benchmark matrix with two new variants that mount Mountpoint with `--incremental-upload`:
- `incremental-upload`: default memory.
- `incremental-upload-mem-limited`: also built with `--features mem_limiter` and mounted with `--max-memory-target=512`.

Both variants run only the `write` and `mix` fio categories (incremental upload is an upload-path feature, so the `read` category is skipped).

Matrix changes:
- Two new columns on the existing `bench` matrix: `incremental_upload` (wired to `S3_INCREMENTAL_UPLOAD`) and `bench_categories` (wired to `S3_BENCH_CATEGORIES`).
- The "Render memory summary" step now gates on `matrix.max_memory_target_mib != ''` so it fires for any mem-limited entry, not just the original `mem-limited` variant.

Script changes (`fs_bench.sh`):
- When `S3_INCREMENTAL_UPLOAD` is set, pass `--incremental-upload` to the mount command.
- The fio categories are now driven by `S3_BENCH_CATEGORIES`; when unset, behaviour is unchanged (`read write mix`).

gh-pages layout (follows the nesting convention from #1808):
- dev/s3-express/bench/incremental_upload
- dev/s3-express/bench/incremental_upload/mem_limited
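The two script switches described above can be sketched as follows. The actual change is in the shell script `fs_bench.sh`; this Python version is only an illustration. The helper names are hypothetical, and it assumes that `S3_BENCH_CATEGORIES` is a whitespace-separated list and that any non-empty `S3_INCREMENTAL_UPLOAD` value enables the flag.

```python
DEFAULT_CATEGORIES = ["read", "write", "mix"]


def fio_categories(env):
    """Categories to run; falls back to the default set when unset or empty."""
    requested = env.get("S3_BENCH_CATEGORIES", "").split()
    return requested or list(DEFAULT_CATEGORIES)


def incremental_upload_args(env):
    """Extra mount arguments for the incremental-upload variants."""
    return ["--incremental-upload"] if env.get("S3_INCREMENTAL_UPLOAD") else []


# An incremental-upload variant skips the read category.
variant_env = {"S3_INCREMENTAL_UPLOAD": "true", "S3_BENCH_CATEGORIES": "write mix"}
print(fio_categories(variant_env), incremental_upload_args(variant_env))

# With neither variable set, behaviour is unchanged.
print(fio_categories({}), incremental_upload_args({}))
```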

yerzhan7 added the performance label (PRs to run benchmarks on) Apr 16, 2026

yerzhan7 force-pushed the wf-changes/mem-limited-benchmarks branch from 3ec17f3 to 7bd5172 Compare April 16, 2026 14:26

yerzhan7 force-pushed the wf-changes/mem-limited-benchmarks branch from 7bd5172 to 2163d32 Compare April 17, 2026 10:06

yerzhan7 force-pushed the wf-changes/mem-limited-benchmarks branch from 2163d32 to 2b4e145 Compare April 17, 2026 11:31

yerzhan7 added 11 commits April 21, 2026 12:39

Revert "chore(ci): Temporarily disable non-mem-limited CI jobs and reduce iterations"

1979541

This reverts commit 1f0507b. Signed-off-by: Yerzhan Mazhkenov <20302932+yerzhan7@users.noreply.github.com>

Revert "feat(bench): Add memory-limited FIO benchmarks with breach detection"

abf5db0

This reverts commit 2b4e145. Signed-off-by: Yerzhan Mazhkenov <20302932+yerzhan7@users.noreply.github.com>

Cleanup

7268a62

Signed-off-by: Yerzhan Mazhkenov <20302932+yerzhan7@users.noreply.github.com>

cleanup

05a7fee

Signed-off-by: Yerzhan Mazhkenov <20302932+yerzhan7@users.noreply.github.com>

Revert "chore(ci): Temporarily disable non-mem-limited CI jobs and reduce iterations"

2abee1a

This reverts commit 042ce4c. Signed-off-by: Yerzhan Mazhkenov <20302932+yerzhan7@users.noreply.github.com>

yerzhan7 force-pushed the wf-changes/mem-limited-benchmarks branch from b7e033b to f0ca8ea Compare April 21, 2026 11:41

yerzhan7 marked this pull request as ready for review April 21, 2026 11:46

yerzhan7 mentioned this pull request Apr 21, 2026

[DRAFT] bench: Add incremental-upload throughput benchmark to S3 Express CI #1813

Draft

yerzhan7 requested a review from passaro April 21, 2026 14:10

passaro reviewed Apr 21, 2026

yerzhan7 added 2 commits April 21, 2026 21:19

passaro approved these changes Apr 24, 2026

yerzhan7 added this pull request to the merge queue Apr 24, 2026

Merged via the queue into main with commit 72e3a50 Apr 24, 2026
171 of 176 checks passed

yerzhan7 deleted the wf-changes/mem-limited-benchmarks branch April 24, 2026 14:40

yerzhan7 merged 19 commits into main from wf-changes/mem-limited-benchmarks
