Add permessage-deflate support, again
#426
This causes the library not to build. This reverts commit 42b8797.
Still doesn't build.
Re-implements a couple of internal utilities from the header crate. Also makes some changes to the SecWebsocketExtensions code. The build is fixed again, tests pass.
|
I will be happy to go through the details of the permessage-deflate spec – or rather even fix the bug myself, while adding some missing window bits support. However, the bug is extremely niche as these extension fields are very rarely used, so I'd prefer to try to get this merged and then do an improvement pull on top of that.
|
@kazk Would you be willing to relicense your PR under the MIT + Apache dual license as used by this crate? If not, I believe we can use the code under MIT + Apache anyways, as long as we add a Additionally, we would probably have to add "Copyright (c) 2014-2023 Sean McArthur" to |
|
What's the status of this pr? |
I forgot about it, sorry. I'd like to modify it to meet the necessary licensing obligations, which shouldn't be too hard. |
|
I'm not a lawyer, but I think that the notices I've added to the license files should be compliant with all relevant licenses. As such, this PR is now ready for review! 🥳 Sorry for the delay.
|
I apologize for the delay.
Yes, you can do whatever is necessary for what I did :) |
kazk has agreed to relicense his original PR under MIT+Apache, which means we don't have to add extra notices to our license files. This reverts commit 049c753.
|
Glad to see this is still moving, and thanks for your great work, @SvizelPritula and @kazk. Are there any other problems blocking this? Sorry for disturbing, but it seems that I should ping you, @nakedible-p and @daniel-abramov.
|
I remember I checked #328 back then, and it seems like there are many common parts (back then it was reverted due to dependencies). So yeah, I'm not against merging it. I'd also appreciate it if someone who tracked the issue could approve it, though (to ensure that I did not miss anything, since I did not go through all the changes as thoroughly as I typically try to do).
|
We have been running a copy of this branch in production for a couple of weeks now. At least in the standard cases it seems to work well in both directions (server to client and client to server), with clients written in different languages connecting to the server.
This sounds pretty good! It's always good to know that someone tested it in production! @SvizelPritula, would you be interested in rebasing on top of a |
|
@daniel-abramov @SvizelPritula just checking in. Is there anything else that needs to be done before this branch can be merged? |
|
The tests are now fixed, everything should be working. Sorry for taking so long, I was busy with exams. |
Thanks for the update!
I had to quickly go through the changes again (albeit skipping all deflate-specific logic) as I keep forgetting the things between reviews (the PR is pretty large).
Generally it looks good, I'm only a bit concerned with a slightly more complicated logic in protocol/** (we have more branching and more conditions), but making it better would possibly require a bigger overhaul.
I also decided to run our benchmarks locally and it seems like this version is somewhat slower than the version in master when it comes to reading. I consistently get a regression of around 15-17%.
read+unmask 100k small messages (server)
time: [6.6915 ms 6.7087 ms 6.7273 ms]
change: [+14.968% +15.822% +16.508%] (p = 0.00 < 0.05)
Performance has regressed.
Found 17 outliers among 100 measurements (17.00%)
6 (6.00%) low mild
5 (5.00%) high mild
6 (6.00%) high severe
read 100k small messages (client)
time: [6.4633 ms 6.4821 ms 6.5024 ms]
change: [+17.048% +17.722% +18.370%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
Unfortunately, I did not have time to dig in and investigate the root causes, but I assume that it might be related to the changes around IncompleteMessage / extend_incomplete(), and/or handling of OpData::Continue.
autobahn/client/
autobahn/server/
Probably leftovers from a local testing? :)
Those folders contain the test results after running scripts/autobahn-client.sh or scripts/autobahn-server.sh. Those shouldn't be committed, so I added them to .gitignore.
src/protocol/mod.rs
Outdated
    .as_mut()
    .and_then(|x| x.compression.as_mut())
    .unwrap()
    .decompress(payload.to_vec(), fin)?
Note that this involves copying the data which might get expensive. Since the original payload is not required after the decompression, it's probably better to use into() to directly convert it into Vec<u8>.
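A minimal sketch of the difference between copying and moving, using only the standard library. The `consume` function is a hypothetical stand-in for `decompress`, and the payload here is assumed to be a plain `Vec<u8>`; the payload type in the PR may differ.

```rust
/// Hypothetical stand-in for a function that takes ownership of its input.
fn consume(data: Vec<u8>) -> usize {
    data.len()
}

fn main() {
    let payload: Vec<u8> = vec![0u8; 1024];
    let original_ptr = payload.as_ptr();

    // `to_vec()` allocates a new buffer and copies every byte into it.
    let copied = payload.to_vec();
    assert_ne!(copied.as_ptr(), original_ptr);
    assert_eq!(consume(copied), 1024);

    // Moving the payload (here via `into()`) hands over the existing
    // allocation, so no bytes are copied.
    let moved: Vec<u8> = payload.into();
    assert_eq!(moved.as_ptr(), original_ptr);
    assert_eq!(consume(moved), 1024);
}
```

The move only works because the original payload is not needed afterwards, which is the situation described above.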
Removes unused code from a test and fixes a typo in a comment. Co-authored-by: Daniel Abramov <[email protected]>
|
Any update? @SvizelPritula @daniel-abramov |
Unfortunately, I did not have an opportunity to investigate what exactly causes the performance regression. I understand that some people would benefit from PS: It takes some time to check the PR each time I see changes made to this PR since it is very large and so whenever I want to check any updates, I oftentimes need to go through all of the changes to recall things. |
I have tried to investigate the regression a bit, but I haven't been successful yet. This PR is very old, and the crate has seen a lot of performance improvements since it began. (In fact, IIRC this PR is older than the benchmark that regressed.) This unfortunately makes it difficult to find out which change exactly is to blame. |
|
Hey, wondering if there is any progress on this PR? It would be great to get it into main at some point. We are currently running some production builds from an older version of this branch, which is not ideal. Unfortunately, we cannot run our system without compression: it is very traffic-dependent and streams terabytes per month, and if we remove compression our AWS bill goes crazy. There is also no properly tested alternative among Rust WebSocket libraries that has compression built in, so we kind of have to hack around it.
|
@goriunov, I can understand that. However, merging it "as is" is a bit complicated because there is a minor performance degradation, and no one has had time to investigate its cause. I would be somewhat uncomfortable merging it in the current state, knowing that it may cause some issues that I won't be able to debug/solve quickly (it's a pretty large PR after all). I thought about it for a bit, and I think one of the ways to approach the problem would be as follows:
Unfortunately I did not participate that much in the development of this feature and could not take it over as I did not really use deflate in my projects and I rarely use WebSockets nowadays, so I always struggled to find time to justify the effort (I believe it would probably take a couple of days to properly go through changes one more time, merge them one-by-one, investigate the issues, benchmark, etc). |
|
I think I managed to resolve the performance regression. Unfortunately, the performance of the benchmarks still regresses when the |
// Only `permessage-deflate` is supported at the moment.
pub(crate) fn generate_offers(&self) -> Option<SecWebsocketExtensions> {
    #[cfg(feature = "deflate")]
    {
This can be simplified for readability and to avoid the mutable variable:
#[cfg(feature = "deflate")]
{
    if let Some(compression) = self.compression.map(|c| c.generate_offer()) {
        Some(SecWebsocketExtensions::new(vec![compression]))
    } else {
        None
    }
}

    &self,
    #[allow(unused)] offers: &SecWebsocketExtensions,
) -> Option<(SecWebsocketExtensions, Extensions)> {
    #[cfg(feature = "deflate")]
read: avoid mutable and make flow explicit
#[cfg(feature = "deflate")]
{
    // To support more extensions, store extension context in `Extensions` and
    // concatenate negotiation responses from each extension.
    if let Some(compression) = &self.compression {
        if let Some((agreed, compression)) = compression.accept_offer(offers) {
            let extensions = Extensions { compression: Some(compression) };
            return Some((SecWebsocketExtensions::new(vec![agreed]), extensions))
        }
    }
    None
}

/// Value for `Sec-WebSocket-Extensions` request header.
pub(crate) fn generate_offer(&self) -> WebsocketExtension {
    let mut offers = Vec::new();
The allocation can be avoided by unrolling the if
// > a client informs the peer server of a hint that even if the server doesn't include the
// > "client_no_context_takeover" extension parameter in the corresponding
// > extension negotiation response to the offer, the client is not going
// > to use context takeover.
// > https://www.rfc-editor.org/rfc/rfc7692#section-7.1.1.2
match (self.server_no_context_takeover, self.client_no_context_takeover) {
    (true, true) => to_header_value(&[
        HeaderValue::from_static(SERVER_NO_CONTEXT_TAKEOVER),
        HeaderValue::from_static(CLIENT_NO_CONTEXT_TAKEOVER),
    ]),
    (true, false) => to_header_value(&[HeaderValue::from_static(SERVER_NO_CONTEXT_TAKEOVER)]),
    (false, true) => to_header_value(&[HeaderValue::from_static(CLIENT_NO_CONTEXT_TAKEOVER)]),
    (false, false) => to_header_value(&[]),
}

}

#[cfg(feature = "handshake")]
fn to_header_value(params: &[HeaderValue]) -> WebsocketExtension {
This can be sped up for the nominal case, where params is empty, instead of creating a new buffer each time and parsing a header value just to hold the known value permessage-deflate:
if params.is_empty() {
    return WebsocketExtension::default()
}
// Pay the cost of creating a new buffer, creating header value, and parsing it for websocket extension

// https://datatracker.ietf.org/doc/html/rfc7692#section-7.2.1
// 1. Compress all the octets of the payload of the message using DEFLATE.
let mut output = Vec::with_capacity(data.len());
let before_in = self.compressor.total_in() as usize;
You should not cast a u64 into a usize, as on 32-bit architectures this can easily overflow.
Instead, convert the usize from data.len() into a u64, which is safer (as 128-bit architectures are not really a thing).
The offset can then be converted into a usize more safely, or you can use usize::try_from().
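A minimal sketch of the checked conversion, assuming the running totals are `u64` values like the ones returned by `total_in()`. The helper name `consumed_since` is hypothetical and not part of the PR.

```rust
use std::convert::TryFrom;

/// Bytes consumed between two readings of a 64-bit running total, as `usize`.
/// Returns `None` if the counter went backwards or the difference does not
/// fit into `usize` on the current target.
fn consumed_since(before: u64, after: u64) -> Option<usize> {
    let delta = after.checked_sub(before)?;
    usize::try_from(delta).ok()
}

fn main() {
    // A total that has already passed 4 GiB, which `as usize` would
    // silently truncate on a 32-bit target.
    let before: u64 = 5_000_000_000;
    let after: u64 = before + 1_500;
    assert_eq!(consumed_since(before, after), Some(1_500));

    // A counter that appears to go backwards is reported, not wrapped.
    assert_eq!(consumed_since(after, before), None);
}
```

Subtracting in `u64` first keeps the large totals out of `usize` entirely; only the small per-call difference is converted.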
// After this step, the last octet of the compressed data contains
// (possibly part of) the DEFLATE header bits with the "BTYPE" bits
// set to 00.
output.truncate(output.len() - 4);
4 can be replaced by TRAILER.len()
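A small sketch of what that could look like, assuming `TRAILER` holds the four octets that RFC 7692 section 7.2.1 says to remove from the tail of the compressed data. The helper `strip_trailer` is hypothetical; it also checks that the trailer is actually present, so the subtraction cannot underflow on a short buffer.

```rust
/// The 4 octets removed from the tail of each compressed message
/// (RFC 7692, section 7.2.1). Assumed to match the PR's `TRAILER` constant.
const TRAILER: [u8; 4] = [0x00, 0x00, 0xff, 0xff];

/// Hypothetical helper: drops the trailer if it is present and reports
/// whether anything was removed.
fn strip_trailer(output: &mut Vec<u8>) -> bool {
    if output.ends_with(&TRAILER) {
        output.truncate(output.len() - TRAILER.len());
        true
    } else {
        false
    }
}

fn main() {
    // Compressed "Hello" from the example in RFC 7692, section 7.2.3.1.
    let mut output: Vec<u8> =
        vec![0xf2, 0x48, 0xcd, 0xc9, 0xc9, 0x07, 0x00, 0x00, 0x00, 0xff, 0xff];
    assert!(strip_trailer(&mut output));
    assert_eq!(output, [0xf2, 0x48, 0xcd, 0xc9, 0xc9, 0x07, 0x00]);

    // A buffer shorter than the trailer is left untouched.
    let mut short: Vec<u8> = vec![0xff, 0xff];
    assert!(!strip_trailer(&mut short));
    assert_eq!(short, [0xff, 0xff]);
}
```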
    data.extend_from_slice(&TRAILER);
}

let before_in = self.decompressor.total_in() as usize;
same comment regarding turning u64 into a usize
    is_final: bool,
) -> Result<Vec<u8>, DeflateError> {
    if is_final {
        data.extend_from_slice(&TRAILER);
If possible, it would be better to avoid the extend_from_slice, as it can reallocate and copy the whole vector to make more room.
Instead, it should be better to do an extra decompress_vec on error if is_final is set.
}

// Compress the data of message.
pub(crate) fn compress(&mut self, data: &[u8]) -> Result<Vec<u8>, DeflateError> {
To be more aligned with the lib, the signature should use Bytes instead of &[u8]:
fn compress(&mut self, data: Bytes) -> Result<Bytes, DeflateError>
It would avoid turning the Vec<u8> into a Bytes when creating the Frame object.
For the data argument, it does not change anything; it simply forwards the Bytes buffer.
    Ok(output)
}

pub(crate) fn decompress(
Same it should use Bytesxxx, it would avoid conversion
    .map_err(|e| DeflateError::Decompress(e.into()))?
{
    Status::Ok => output.reserve(2 * output.len()),
    Status::BufError | Status::StreamEnd => break,
This doesn't check at all if the stream has actually ended with the final packet, I think.
pub fn verify_response(
    &self,
    response: Response,
    _config: &Option<WebSocketConfig>,
Since _config is always being used in the function (not behind a feature gate), it should probably not be marked as intentionally unused
|
I'd like to request that the additions of Messages of doom aside, there's precedent already for having error cases that can't be produced without certain features being turned on. My library doesn't enable |
Add support for the "permessage-deflate" websocket protocol extension as specified by RFC 7692. This is based off of snapview#426 but adds
- separation between header parsing and negotiation logic
- support for the client/server max window bits parameters
- more maintainable feature-guarding
- additional unit testing
In #328, @kazk implemented support for the permessage-deflate extension. Implementing this extension required the ability to parse the Sec-WebSocket-Extensions header, so @kazk submitted a pull request to the headers crate. However, the maintainers of the headers crate don't want to support this header yet, as they don't want to commit to any particular API. As such, they suggested that it should be implemented in this or some other crate, tweaked as necessary and potentially moved into the headers crate in the future.

This PR reverts the commit that reverted @kazk's commit that added permessage-deflate support. It also copies his implementation of SecWebsocketExtensions intended for the headers crate.

The implementation of SecWebsocketExtensions relied on some internal utilities of the headers crate. I've removed some of those dependencies and re-implemented others. I have also gated the implementation behind the existing handshake feature, as well as any code that relies on it, since it relies on the headers crate, which in turn relies on many other HTTP crates.

Blocking issues
- The headers crate is licensed under the MIT license, while this crate uses a dual MIT+Apache license. We would need to add a proper license notice, or @kazk would need to consent to the re-licensing of his PR.

Unresolved questions
- deflate feature without having tungstenite handle the Websocket handshake. There is a WebSocket::from_raw_socket_with_extensions method to allow for exactly that, but it takes an Extensions struct, which has no public constructor.
- WebSocketConfig::accept_offers method, which implements the server part of extension negotiation and requires the handshake feature. This method is intended to allow for tungstenite to be integrated into web frameworks. No similar public method exists for clients, although that might not be a big issue, since a client knows whether a request will be a Websocket request up front.
- WebSocketConfig::accept_offers feels out of place on the WebSocketConfig. We could make it a standalone function, perhaps in the extensions module.
- SecWebsocketExtensions implementation public. This is necessary to make the WebSocketConfig::accept_offers method usable. This does mean that if SecWebsocketExtensions were to be added to headers in the future, switching to that implementation would (I think) be a breaking change. An alternative would be to have WebSocketConfig::accept_offers take and return a HeaderValue, which it would parse itself. This would allow us to make SecWebsocketExtensions a private implementation detail.
- WebSocketConfig and a new variant was added to the Error enum. Since this field and variant depends on the deflate feature, I've annotated both with #[non_exhaustive]. I've done the same to ProtocolError, which currently has variants gated behind handshake. This sadly means that you cannot create WebSocketConfig using the initializer syntax; instead, you have to create an instance with WebSocketConfig::default and mutate it afterward. Sadly, I see no way to avoid this. It would be possible to disable #[non_exhaustive] if all relevant features are enabled.
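
To illustrate the last point, here is a minimal sketch of the default-then-mutate pattern that #[non_exhaustive] forces on downstream users. The `Config` struct and its fields are hypothetical stand-ins, not the real WebSocketConfig. Within a single crate the attribute has no effect, so this compiles as one file; from another crate, a `Config { .. }` struct literal would be rejected by the compiler.

```rust
/// Hypothetical stand-in for a config struct that may gain
/// feature-gated fields in the future.
#[non_exhaustive]
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Config {
    pub max_message_size: Option<usize>,
    pub compression_enabled: bool,
}

fn main() {
    // Downstream crates cannot write `Config { .. }` literals for a
    // `#[non_exhaustive]` struct, so they start from the default value
    // and assign the public fields they care about.
    let mut config = Config::default();
    config.max_message_size = Some(64 << 20);
    config.compression_enabled = true;

    assert_eq!(config.max_message_size, Some(67_108_864));
    assert!(config.compression_enabled);
}
```

A builder or a set of chained setter methods would be another way to keep construction ergonomic, at the cost of more API surface.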