Skip to content

use zlib-rs for gzip compression in rust code #15417

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 15, 2025

Conversation

folkertdev
Copy link
Contributor

What does this PR try to resolve?

This PR uses zlib-rs via the flate2 crate. It is used for (de)compressing gzip files (e.g. in cargo package).

Using zlib-rs is significantly faster, and produces output of roughly similar size. For the windows-bindgen crate
(a very large crate), the speedup is over 60%, or about 2 seconds.

> time cargo package -p windows-bindgen --no-verify
   Packaging windows-bindgen v0.61.0 (/home/folkertdev/rust/windows-rs/crates/libs/bindgen)
    Updating crates.io index
    Packaged 76 files, 31.2MiB (8.2MiB compressed)

________________________________________________________
Executed in    3.30 secs    fish           external
   usr time    3.19 secs  424.00 micros    3.19 secs
   sys time    0.05 secs   61.00 micros    0.05 secs

> time ~/rust/cargo/target/release/cargo package -p windows-bindgen --no-verify
   Packaging windows-bindgen v0.61.0 (/home/folkertdev/rust/windows-rs/crates/libs/bindgen)
    Updating crates.io index
    Packaged 76 files, 31.2MiB (8.3MiB compressed)

________________________________________________________
Executed in    1.25 secs    fish           external
   usr time    1.15 secs    0.00 micros    1.15 secs
   sys time    0.04 secs  589.00 micros    0.04 secs

How should we test and review this PR?

Generally CI/the test suite should handle correctness.

Something to look out for is zlib-rs producing larger binaries than before. So far we're seeing output sizes that are roughly the same (sometimes a bit better, sometimes a bit worse) as the status quo.

We've not specifically looked at decompression yet, mostly because we could not come up with a good command to benchmark. In general zlib-rs is much faster than stock zlib there too.

Additional information

For the time being, cargo still depends on stock zlib via e.g. curl and git. As far as I know it is the intention to eventually move away from these C dependencies, at which point the dependency on stock zlib also disappears.

Various C dependencies (curl, git) still rely on `libz-sys`, so for the time being a system libc is still required. But, using zlib-rs via flate2 is straightforward, and gives good speedup for `cargo package`. It is also extremely portable, because it's just rust code.
@rustbot
Copy link
Collaborator

rustbot commented Apr 10, 2025

r? @epage

rustbot has assigned @epage.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-documenting-cargo-itself Area: Cargo's documentation S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 10, 2025
@joshtriplett joshtriplett added I-nominated-to-discuss To be discussed during issue triage on the next Cargo team meeting and removed A-documenting-cargo-itself Area: Cargo's documentation labels Apr 10, 2025
@joshtriplett
Copy link
Member

Thanks, @folkertdev!

Nominating for next week's Cargo meeting.

@ehuss ehuss moved this to Nominated in Cargo status tracker Apr 15, 2025
@ehuss ehuss removed the I-nominated-to-discuss To be discussed during issue triage on the next Cargo team meeting label Apr 15, 2025
@joshtriplett joshtriplett added the relnotes Release-note worthy label Apr 15, 2025
@epage
Copy link
Contributor

epage commented Apr 15, 2025

We talked about how we want to approach these kind of changes in today's Cargo team meeting.

For gitoxide, its still under active development and each feature needs a lot of vetting before adoption, so we're taking an incremental, feature-at-at-ime, approach with an opt-in and opt-out windows to ensure as smooth of a path as possible.

For something like zlib-rs, it feels smaller and more mature. We likely don't need a a multi-release transition window. We do want it merged at the beginning of a release window to give as much runtime within the release as possible. We also appreciate how easy it would be to back this out during beta if need be. The main risk we see with zlib-rs is platform support.

For our -sys crates, we allow people building cargo to decide whether to use the vendored or system version of the package. The question came up if we should still support those cases in addition to the pure-Rust implementation. I don't feel that that model needs to be applied when adopting a pure-Rust variant of C code. We could use C implementations of json, blake3, toml, base64, etc and the question didn't come up of whether that should be made configurable.

As this is still early in the release Window, we can move forward with this.

@epage epage added this pull request to the merge queue Apr 15, 2025
Merged via the queue into rust-lang:master with commit d811228 Apr 15, 2025
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Release-note worthy S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants