Conversation

@SvizelPritula

@SvizelPritula SvizelPritula commented May 16, 2024

In #328, @kazk implemented support for the permessage-deflate extension. Implementing this extension required the ability to parse the Sec-WebSocket-Extensions header, so @kazk submitted a pull request to the headers crate. However, the maintainers of the headers crate don't want to support this header yet, as they don't want to commit to any particular API. As such, they suggested that it should be implemented in this or some other crate, tweaked as necessary, and potentially moved into the headers crate in the future.

This PR reverts the commit that reverted @kazk's commit that added permessage-deflate support. It also copies his implementation of SecWebsocketExtensions intended for the headers crate.

The implementation of SecWebsocketExtensions relied on some internal utilities of the headers crate. I've removed some of those dependencies and re-implemented others. I have also gated the implementation, along with any code that relies on it, behind the existing handshake feature, since it depends on the headers crate, which in turn pulls in many other HTTP crates.
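For illustration, here is a minimal sketch of the kind of feature gating described above; the module names are hypothetical and not taken from the PR:

    // Hypothetical module layout, for illustration only. The extension-header
    // parsing only compiles when the `handshake` feature (and with it the
    // `headers` dependency) is enabled.
    #[cfg(feature = "handshake")]
    pub mod extensions {
        // `SecWebsocketExtensions` and its parser would live here.
    }

    // permessage-deflate builds on the negotiation above, so in this sketch it
    // is gated on both features.
    #[cfg(all(feature = "handshake", feature = "deflate"))]
    pub mod deflate {
        // DEFLATE compression and decompression contexts would live here.
    }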

Blocking issues

  • @nakedible-p found an error in the implementation. As I know very little about the WebSocket spec, I'm unsure if I can fix it properly.
  • The headers crate is licensed under the MIT license, while this crate uses a dual MIT+Apache license. We would need to add a proper license notice, or @kazk would need to consent to the re-licensing of his PR.

Unresolved questions

  • It is impossible to use the deflate feature without having tungstenite handle the WebSocket handshake. There is a WebSocket::from_raw_socket_with_extensions method to allow for exactly that, but it takes an Extensions struct, which has no public constructor.
  • The only way to get such a struct is from the WebSocketConfig::accept_offers method, which implements the server part of extension negotiation and requires the handshake feature. This method is intended to allow tungstenite to be integrated into web frameworks. No similar public method exists for clients, although that might not be a big issue, since a client knows up front whether a request will be a WebSocket request.
  • To me, the WebSocketConfig::accept_offers feels out of place on the WebSocketConfig. We could make it a standalone function, perhaps in the extensions module.
  • I have made the SecWebsocketExtensions implementation public. This is necessary to make the WebSocketConfig::accept_offers method usable. This does mean that if SecWebsocketExtensions were to be added to headers in the future, switching to that implementation would (I think) be a breaking change. An alternative would be to have WebSocketConfig::accept_offers take and return a HeaderValue, which it would parse itself. This would allow us to make SecWebsocketExtensions a private implementation detail.
  • This is a breaking change, since a new field was added to WebSocketConfig and a new variant was added to the Error enum. Since this field and variant depend on the deflate feature, I've annotated both with #[non_exhaustive]. I've done the same to ProtocolError, which currently has variants gated behind handshake. This unfortunately means that you cannot create a WebSocketConfig using struct initializer syntax; instead, you have to create an instance with WebSocketConfig::default and mutate it afterward (see the sketch below). I see no way to avoid this. It would be possible to disable #[non_exhaustive] if all relevant features are enabled.
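To illustrate that last point, here is a minimal sketch of what downstream code has to do once WebSocketConfig is #[non_exhaustive]; the max_message_size field is used purely as an example and may not reflect the current API:

    use tungstenite::protocol::WebSocketConfig;

    fn make_config() -> WebSocketConfig {
        // With #[non_exhaustive], struct initializer syntax (even with
        // `..Default::default()`) no longer compiles outside the defining
        // crate, so the config has to be built by mutating a default instance.
        let mut config = WebSocketConfig::default();
        config.max_message_size = Some(64 << 20); // hypothetical limit
        config
    }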

@nakedible-p

I will be happy to go through the details of the permessage-deflate spec – or rather even fix the bug myself, along with adding the missing window bits support. However, the bug is extremely niche, as these extension fields are very rarely used, so I'd prefer to try to get this merged and then do an improvement pull on top of that.

@SvizelPritula
Author

@kazk Would you be willing to relicense your PR under the MIT + Apache dual license as used by this crate?

If not, I believe we can use the code under MIT + Apache anyway, as long as we add a NOTICE file to the repo that looks a little bit like this:

This crate contains code copied from the headers crate, licensed as follows:

*Insert copy of headers license here.*

Additionally, we would probably have to add "Copyright (c) 2014-2023 Sean McArthur" to LICENSE-MIT, which is odd, given that no part of the code in this PR was actually written by Sean McArthur.

@zitsen

zitsen commented Nov 13, 2024

What's the status of this PR?

@SvizelPritula
Author

What's the status of this PR?

I forgot about it, sorry.

I'd like to modify it to meet the necessary licensing obligations, which shouldn't be too hard.

@SvizelPritula SvizelPritula marked this pull request as ready for review November 14, 2024 18:09
@SvizelPritula
Author

I'm not a lawyer, but I think that the notices I've added to the license files should be compliant with all relevant licenses. As such, this PR is now ready for review! 🥳 Sorry for the delay.

@kazk
Contributor

kazk commented Nov 15, 2024

@SvizelPritula

I apologize for the delay.

@kazk Would you be willing to relicense your hyperium/headers#88 under the MIT + Apache dual license as used by this crate?

Yes, you can do whatever is necessary for what I did :)
Thanks for your work!

kazk has agreed to relicense his original PR under MIT+Apache, which means we don't have to add extra notices to our license files.

This reverts commit 049c753.
@zitsen

zitsen commented Dec 11, 2024

Glad to see it's still going on, thanks for your great work @SvizelPritula @kazk.

So are there any other problems blocking this?

Sorry for the disturbance, but it seems that I should ping you, @nakedible-p @daniel-abramov.

@daniel-abramov
Member

I remember I checked #328 back then, and it seems like there are many common parts (back then it was reverted due to dependencies). So yeah, I'm not against merging it.

I'd also appreciate it if someone who tracked the issue could approve it, though (to ensure that I did not miss anything since I did not go as thoroughly through all the changes as I typically try to do).

@goriunov

goriunov commented Dec 18, 2024

We have been running a copy of this branch for a couple of weeks now in production. At least in the standard cases it seems to work well, in both directions (server to client and client to server), with clients in different languages connecting to the server.

@daniel-abramov
Member

We have been running a copy of this branch for a couple of weeks now in production. At least in the standard cases it seems to work well, in both directions (server to client and client to server), with clients in different languages connecting to the server.

This sounds pretty good! It's always good to know that someone tested it in production!

@SvizelPritula, would you be interested in rebasing on top of the master branch? (There have been quite significant changes recently that affect performance and change the API surface a bit.)

@goriunov

@daniel-abramov @SvizelPritula just checking in. Is there anything else that needs to be done before this branch can be merged?

@SvizelPritula
Author

The tests are now fixed, everything should be working.

Sorry for taking so long, I was busy with exams.

Member

@daniel-abramov daniel-abramov left a comment

Thanks for the update!

I had to quickly go through the changes again (albeit skipping all deflate-specific logic), as I keep forgetting things between reviews (the PR is pretty large).

Generally it looks good. I'm only a bit concerned with the slightly more complicated logic in protocol/** (we have more branching and more conditions), but making it better would possibly require a bigger overhaul.

I also decided to run our benchmarks locally and it seems like this version is somewhat slower than the version in master when it comes to reading. I consistently get a regression of around 15-17%.

read+unmask 100k small messages (server)
                        time:   [6.6915 ms 6.7087 ms 6.7273 ms]
                        change: [+14.968% +15.822% +16.508%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 17 outliers among 100 measurements (17.00%)
  6 (6.00%) low mild
  5 (5.00%) high mild
  6 (6.00%) high severe

read 100k small messages (client)
                        time:   [6.4633 ms 6.4821 ms 6.5024 ms]
                        change: [+17.048% +17.722% +18.370%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

Unfortunately, I did not have time to dig in and investigate the root causes, but I assume that it might be related to the changes around IncompleteMessage / extend_incomplete(), and/or handling of OpData::Continue.

Comment on lines +4 to +5
autobahn/client/
autobahn/server/
Member

Suggested change
autobahn/client/
autobahn/server/

Probably leftovers from local testing? :)

Author

Those folders contain the test results after running scripts/autobahn-client.sh or scripts/autobahn-server.sh. Those shouldn't be committed, so I added them to .gitignore.

.as_mut()
.and_then(|x| x.compression.as_mut())
.unwrap()
.decompress(payload.to_vec(), fin)?
Member

Note that this involves copying the data, which might get expensive. Since the original payload is not required after the decompression, it's probably better to use into() to directly convert it into Vec<u8>.
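As a hedged illustration of the difference (assuming the payload is a bytes::Bytes here, which is not necessarily the exact type used at this point in the PR):

    use bytes::Bytes;

    fn main() {
        let payload = Bytes::from(vec![1u8, 2, 3, 4]);

        // `to_vec()` always allocates a new Vec and copies the bytes.
        let copied: Vec<u8> = payload.to_vec();

        // `into()` consumes the `Bytes` and can hand over the underlying
        // allocation without copying when it is uniquely owned.
        let moved: Vec<u8> = payload.into();

        assert_eq!(copied, moved);
    }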

SvizelPritula and others added 2 commits February 21, 2025 16:44
Removes unused code from a test and fixes a typo in a comment.

Co-authored-by: Daniel Abramov <[email protected]>
@zitsen

zitsen commented Mar 18, 2025

Any update? @SvizelPritula @daniel-abramov

@daniel-abramov
Member

Any update? @SvizelPritula @daniel-abramov

Unfortunately, I did not have an opportunity to investigate what exactly causes the performance regression.

I understand that some people would benefit from permessage-deflate merged and released with the new version, but unfortunately, I can't promise that it will be part of the next release as of now. I'd need to take a closer look to see why the performance degraded and if there are easy ways to solve this.

PS: It takes some time to check this PR each time I see changes made to it, since it is very large; whenever I want to check any updates, I often need to go through all of the changes again to recall things.

@SvizelPritula
Author

Any update? @SvizelPritula @daniel-abramov

I have tried to investigate the regression a bit, but I haven't been successful yet.

This PR is very old, and the crate has seen a lot of performance improvements since it began. (In fact, IIRC this PR is older than the benchmark that regressed.) This unfortunately makes it difficult to find out which change exactly is to blame.

@goriunov

Hey, wondering if there is any progress on this PR?

It would be great to get it into main at some point. We are currently running some production builds from an older version of this branch, which is not ideal.

Unfortunately, we cannot run the system without compression, as we have very traffic-dependent systems that stream terabytes per month, and if we remove compression the AWS bill goes crazy.

Unfortunately, there is no properly tested alternative among Rust WebSocket crates with compression built in, so we kind of have to hack around it.

@daniel-abramov
Member

@goriunov, I can understand that. However, merging it "as is" is a bit complicated because there is a minor performance degradation, and no one has had time to investigate its cause. I would be somewhat uncomfortable merging it in the current state, knowing that it may cause some issues that I won't be able to debug/solve quickly (it's a pretty large PR after all).

I thought about it for a bit, and I think one of the ways to approach the problem would be as follows:

  1. 70% of the code is deflate and WebSocket extension parsing. Perhaps this could be merged separately under a feature gate or be moved to a separate crate (unless there is already one for it), as it may also help other websocket crates that might want to implement the feature.
  2. Then, the rest of the code could be submitted as separate (smaller) PRs that are easier to review and discuss. This would also allow us to see at which point the performance degradation starts to happen and debug it appropriately.

Unfortunately, I did not participate that much in the development of this feature and could not take it over, as I did not really use deflate in my projects and I rarely use WebSockets nowadays, so I always struggled to find time to justify the effort (I believe it would probably take a couple of days to properly go through the changes one more time, merge them one by one, investigate the issues, benchmark, etc.).

@SvizelPritula
Author

I think I managed to resolve the performance regression.

Unfortunately, the performance of the benchmarks still regresses when the deflate feature is enabled - but that at least doesn't impact existing users.

// Only `permessage-deflate` is supported at the moment.
pub(crate) fn generate_offers(&self) -> Option<SecWebsocketExtensions> {
#[cfg(feature = "deflate")]
{
@erebe erebe May 11, 2025

Can be simplified for readability and to avoid a mutable variable:

        #[cfg(feature = "deflate")]
        {
            if let Some(compression) = self.compression.map(|c| c.generate_offer()) {
                Some(SecWebsocketExtensions::new(vec![compression]))
            } else {
                None
            }
        }

&self,
#[allow(unused)] offers: &SecWebsocketExtensions,
) -> Option<(SecWebsocketExtensions, Extensions)> {
#[cfg(feature = "deflate")]

read: avoid mutable and make flow explicit

        #[cfg(feature = "deflate")]
        {
            // To support more extensions, store extension context in `Extensions` and
            // concatenate negotiation responses from each extension.
            if let Some(compression) = &self.compression {
                if let Some((agreed, compression)) = compression.accept_offer(offers) {
                    let extensions = Extensions { compression: Some(compression) };
                    return Some((SecWebsocketExtensions::new(vec![agreed]), extensions))
                }
            } 
            
            None
        }


/// Value for `Sec-WebSocket-Extensions` request header.
pub(crate) fn generate_offer(&self) -> WebsocketExtension {
let mut offers = Vec::new();

The allocation can be avoided by unrolling the if

        // > a client informs the peer server of a hint that even if the server doesn't include the
        // > "client_no_context_takeover" extension parameter in the corresponding
        // > extension negotiation response to the offer, the client is not going
        // > to use context takeover.
        // > https://www.rfc-editor.org/rfc/rfc7692#section-7.1.1.2
        match (self.server_no_context_takeover, self.client_no_context_takeover) {
            (true, true) => to_header_value(&[HeaderValue::from_static(SERVER_NO_CONTEXT_TAKEOVER), HeaderValue::from_static(CLIENT_NO_CONTEXT_TAKEOVER)]),
            (true, false) => to_header_value(&[HeaderValue::from_static(SERVER_NO_CONTEXT_TAKEOVER)]),
            (false, true) => to_header_value(&[HeaderValue::from_static(CLIENT_NO_CONTEXT_TAKEOVER)]),
            (false, false) => to_header_value(&[]), 
        }

}

#[cfg(feature = "handshake")]
fn to_header_value(params: &[HeaderValue]) -> WebsocketExtension {

It can be sped up for the nominal case, where params is empty, instead of creating a new buffer each time and parsing the header just to hold the known value permessage-deflate:

if params.is_empty() {
   return WebsocketExtension::default()
}

// Otherwise, pay the cost of creating a new buffer, creating the header value, and parsing it into a websocket extension

// https://datatracker.ietf.org/doc/html/rfc7692#section-7.2.1
// 1. Compress all the octets of the payload of the message using DEFLATE.
let mut output = Vec::with_capacity(data.len());
let before_in = self.compressor.total_in() as usize;
@erebe erebe May 11, 2025

You should not cast a u64 into a usize, as on 32-bit architectures this can overflow easily.
Instead, turn the usize from data.len() into a u64, which is safer (as 128-bit architectures are not really a thing).

The offset can be converted into a usize more safely, or use usize::try_from().
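A sketch of the direction being suggested, using flate2's Compress::total_in (which returns u64); the helper functions here are hypothetical:

    use flate2::Compress;

    // Keep byte counters in u64 and widen `data.len()` instead of narrowing
    // `total_in()` down to usize, which can be lossy on 32-bit targets.
    fn consumed_since(compressor: &Compress, before_in: u64) -> u64 {
        compressor.total_in() - before_in
    }

    fn remaining_input(data: &[u8], consumed: u64) -> usize {
        let total = data.len() as u64; // usize -> u64 is always lossless
        usize::try_from(total - consumed).expect("remaining input fits in usize")
    }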

// After this step, the last octet of the compressed data contains
// (possibly part of) the DEFLATE header bits with the "BTYPE" bits
// set to 00.
output.truncate(output.len() - 4);
@erebe erebe May 11, 2025

The 4 can be replaced by TRAILER.len().

data.extend_from_slice(&TRAILER);
}

let before_in = self.decompressor.total_in() as usize;

Same comment regarding turning a u64 into a usize.

is_final: bool,
) -> Result<Vec<u8>, DeflateError> {
if is_final {
data.extend_from_slice(&TRAILER);
@erebe erebe May 11, 2025

If possible, it would be better to avoid the extend_from_slice, as it can reallocate and copy the whole vector to make more room.

Instead, it might be better to do an extra decompress_vec call on error if is_final is set.

}

// Compress the data of message.
pub(crate) fn compress(&mut self, data: &[u8]) -> Result<Vec<u8>, DeflateError> {

To be more aligned with the library, the signature should use Bytes instead of &[u8]:

fn compress(&mut self, data: Bytes) -> Result<Bytes, DeflateError>

It would avoid turning the Vec<u8> into a Bytes when creating the Frame object.
For the data argument, it does not change anything; it simply forwards the Bytes buffer.

Ok(output)
}

pub(crate) fn decompress(
@erebe erebe May 11, 2025

Same here, it should use Bytes (or BytesMut); it would avoid a conversion.

.map_err(|e| DeflateError::Decompress(e.into()))?
{
Status::Ok => output.reserve(2 * output.len()),
Status::BufError | Status::StreamEnd => break,
Author

This doesn't check at all whether the stream has actually ended with the final packet, I think.

pub fn verify_response(
&self,
response: Response,
_config: &Option<WebSocketConfig>,

Since _config is always being used in the function (not behind a feature gate), it should probably not be marked as intentionally unused

@akonradi-signal
Contributor

I'd like to request that the additions of #[non_exhaustive], especially on enum types, be removed. For authors of libraries that use tungstenite, being able to rely on exhaustive matching lets us leverage the compiler to ensure at compile time that we've handled every possible error case appropriately. Without this guarantee, we have to include a branch for the "unknown" case in every match on an error so that there's a generic fallback path. The danger is that updating our tungstenite dependency from one version to another would change those generic fallback paths from never-used to sometimes-used. I'd much rather be forced, as a tungstenite consumer, to update my code when new error cases are added because my crate fails to compile.

Messages of doom aside, there's precedent already for having error cases that can't be produced without certain features being turned on. My library doesn't enable tungstenite's TLS support, but we still have to handle the Error::Tls case. I'd rather do that (with an unreachable!("tungstenite TLS isn't used")) than have to be forever vigilant for new error variants on upgrade, because #[non_exhaustive] means the compiler is no longer helping me.
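For illustration, here is a small self-contained sketch of the trade-off being described; the enum is a stand-in, not tungstenite's actual Error type:

    // Hypothetical library error type, used only to illustrate the point.
    pub enum LibError {
        Io(std::io::Error),
        Protocol(String),
        Tls, // only produced when a feature this consumer doesn't enable is on
    }

    pub fn describe(err: &LibError) -> &'static str {
        // With an exhaustive enum, adding a new variant upstream turns this
        // match into a compile error, forcing the consumer to handle it.
        match err {
            LibError::Io(_) => "I/O error",
            LibError::Protocol(_) => "protocol violation",
            LibError::Tls => unreachable!("TLS support isn't enabled"),
            // With #[non_exhaustive], the compiler would instead demand a
            // `_ => ...` arm, and new upstream variants would silently fall
            // into it after an upgrade.
        }
    }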

@0x676e67 0x676e67 mentioned this pull request Jul 8, 2025
akonradi-signal added a commit to signalapp/tungstenite-rs that referenced this pull request Sep 12, 2025
Add support for the "permessage-deflate" websocket protocol extension as
specified by RFC 7692.

This is based off of snapview#426 but adds
- separation between header parsing and negotiation logic
- support for the client/server max window bits parameters
- more maintainable feature-guarding
- additional unit testing