@akonradi-signal
Contributor

Reduce the amount of nested code by returning earlier on errors and from self-contained logic instead of branching with if/else and match expressions. Use an enum type to consolidate common code and avoid a panic! call.

/// Try to decode one message frame. May return None.
fn read_message_frame(&mut self, stream: &mut impl Read) -> Result<Option<Message>> {
-    if let Some(frame) = self
+    let frame = match self
Contributor Author

@akonradi-signal commented Sep 8, 2025

It would be nice if this could be written as let Some(frame) = ... else { ... }; but that's not available until Rust 1.65 and the MSRV here is 1.63.
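For reference, here are the two forms side by side as a minimal sketch. `Frame` and `next_frame` are hypothetical stand-ins, not the tungstenite types. The `match` form compiles on Rust 1.63, while the `let ... else` form needs 1.65 or later.

```rust
/// Hypothetical stand-in for a decoded frame; not the tungstenite type.
#[derive(Debug)]
struct Frame(Vec<u8>);

/// Hypothetical frame source: yields a frame only for non-empty input.
fn next_frame(input: &[u8]) -> Option<Frame> {
    if input.is_empty() {
        None
    } else {
        Some(Frame(input.to_vec()))
    }
}

/// Early return written with `match`; works on Rust 1.63.
fn payload_len_match(input: &[u8]) -> Result<usize, &'static str> {
    let frame = match next_frame(input) {
        Some(frame) => frame,
        None => return Err("connection closed"),
    };
    Ok(frame.0.len())
}

/// The same early return with `let ... else`; requires Rust 1.65 or later.
fn payload_len_let_else(input: &[u8]) -> Result<usize, &'static str> {
    let Some(frame) = next_frame(input) else {
        return Err("connection closed");
    };
    Ok(frame.0.len())
}
```

Both functions behave identically; only the second one is blocked by the MSRV.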

Member

@daniel-abramov daniel-abramov left a comment

I agree with these ones:

  • ✅ Early return from the first match. It definitely looks better!
  • ✅ Replacing if let Some(ref mut msg) = self.incomplete { .. } else { return .. } with self.incomplete.as_mut().ok_or(..). Definitely looks more elegant!
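To make the second point concrete, here is a small sketch of both shapes. `Receiver` and `ProtoError` are hypothetical, simplified types invented for this illustration; only the `as_mut().ok_or(..)?` chain mirrors the change under review.

```rust
#[derive(Debug, PartialEq)]
enum ProtoError {
    UnexpectedContinueFrame,
}

/// Hypothetical receiver state: a partially assembled message, if any.
struct Receiver {
    incomplete: Option<Vec<u8>>,
}

impl Receiver {
    /// Nested form: the happy path lives inside the `if let` body.
    fn extend_nested(&mut self, chunk: &[u8]) -> Result<usize, ProtoError> {
        if let Some(ref mut msg) = self.incomplete {
            msg.extend_from_slice(chunk);
            Ok(msg.len())
        } else {
            Err(ProtoError::UnexpectedContinueFrame)
        }
    }

    /// Flat form: turn `None` into an error and bail out with `?`.
    fn extend_flat(&mut self, chunk: &[u8]) -> Result<usize, ProtoError> {
        let msg = self
            .incomplete
            .as_mut()
            .ok_or(ProtoError::UnexpectedContinueFrame)?;
        msg.extend_from_slice(chunk);
        Ok(msg.len())
    }
}
```

The flat form keeps the rest of the function at one indentation level, which is where the readability gain comes from.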

As for excessive branching: while I generally agree that it may make code less readable and that read_message_frame() would benefit from some refactoring, I'm not sure that the rest of the changes bring the desired simplification for the following reasons:

  1. The current handling of the frame's opcode is just under 70 LOC. I may be biased, but I find 70 LOC that contain a "flat representation" of all possible cases (including errors) quite useful. With the new changes, I get a bit more confused, because, e.g., the handling of the reserved code for the Ctl opcode is done within the match, but the handling of the reserved code for the Data opcode is done in a different place (inside the DataMessageType::try_from() conversion function) for no apparent reason.
  2. I'm generally confused about the new DataMessageType. It feels like it was only introduced to remove a single match case inside read_message_frame() and that it has no utility otherwise. The TryFrom<Data> for DataMessageType essentially states that we can create a DataMessageType from Data, e.g. via Data::Binary => Self::Initial(IncompleteMessageType::Binary), which feels logically incorrect, because we cannot state that the message is incomplete before checking fin first (frame.header().is_final). Another problem is that it changes the order of operations: since try_from() transforms a reserved code into an error, DataMessageType::try_from(data)? would return early with an error when a reserved code is used. However, the previous implementation would not do so if self.incomplete is Some(..) (it would return a different error), because the match arm for _ if self.incomplete.is_some() would be checked before the one for OpData::Reserved(i).
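The ordering concern can be reproduced in isolation. The sketch below uses simplified, hypothetical types (`OpData`, `Kind`, `ProtoError`) and a plain `bool` in place of `self.incomplete.is_some()`; it is not the code from the PR, only a model of the two check orders.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum OpData {
    Continue,
    Text,
    Binary,
    Reserved(u8),
}

#[derive(Debug, PartialEq)]
enum ProtoError {
    ExpectedFragment(OpData),
    UnknownDataFrameType(u8),
}

#[derive(Debug, PartialEq)]
enum Kind {
    Continue,
    Text,
    Binary,
}

/// Guard first: the in-progress message is checked before the reserved opcode.
fn classify_guard_first(data: OpData, has_incomplete: bool) -> Result<Kind, ProtoError> {
    match data {
        OpData::Continue => Ok(Kind::Continue),
        _ if has_incomplete => Err(ProtoError::ExpectedFragment(data)),
        OpData::Text => Ok(Kind::Text),
        OpData::Binary => Ok(Kind::Binary),
        OpData::Reserved(i) => Err(ProtoError::UnknownDataFrameType(i)),
    }
}

/// Conversion first: the reserved opcode is rejected before the state is consulted.
fn classify_convert_first(data: OpData, has_incomplete: bool) -> Result<Kind, ProtoError> {
    let kind = match data {
        OpData::Continue => Kind::Continue,
        OpData::Text => Kind::Text,
        OpData::Binary => Kind::Binary,
        OpData::Reserved(i) => return Err(ProtoError::UnknownDataFrameType(i)),
    };
    if has_incomplete && kind != Kind::Continue {
        return Err(ProtoError::ExpectedFragment(data));
    }
    Ok(kind)
}
```

For a reserved opcode that arrives while a message is in progress, the first function reports `ExpectedFragment` and the second reports `UnknownDataFrameType`; for every other input the two agree.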

@akonradi-signal
Contributor Author

I've pulled out the uncontroversial changes into separate commits while leaving the total change the same. As for the rest of the feedback:

  1. I agree 70 LOC is not bad now. I'm working on adding support for the "permessage-deflate" extension (another take on Add permessage-deflate support, again #426), which makes this code a bit less compact, and wanted to try to get ahead of that. If separating the data and control frame handling doesn't seem useful now I can omit that.
  2. The DataMessageType is my attempt to not move a match but to remove a panic! that is required to satisfy the compiler but can't be hit in practice. I believe that transforming an input into a custom enum type that is matched exhaustively is a good pattern for ensuring that this kind of "knowledge at a (for now, small) distance" doesn't fail to get updated in the future. I had originally written DataMessageType as an inline definition in read_message_frame but figured that might be stylistically unpalatable and so moved it; I think that scoping might make it more obvious that the intended usage is only for control flow. The early try_from introducing a different error path is a great observation. It's not inherent to the usage of a separate enum type, though, and I'm happy to preserve the original error path if you agree with me that this "custom enum" path is still worth pursuing.
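As a rough illustration of the "custom enum" pattern from point 2, the sketch below contrasts a catch-all `panic!` arm with an exhaustive match on a narrowed type. `OpData`, `MessageKind`, and both functions are hypothetical and much smaller than the real `DataMessageType`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum OpData {
    Continue,
    Text,
    Binary,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum MessageKind {
    Text,
    Binary,
}

/// Before: the inner match re-inspects `op`, so the compiler demands a
/// catch-all arm even though the outer arm already rules it out.
fn label_with_panic(op: OpData) -> Option<&'static str> {
    match op {
        OpData::Continue => None,
        OpData::Text | OpData::Binary => Some(match op {
            OpData::Text => "text",
            OpData::Binary => "binary",
            _ => panic!("bug: not text or binary"),
        }),
    }
}

/// After: narrow to `MessageKind` once, then match it exhaustively.
fn label_exhaustive(op: OpData) -> Option<&'static str> {
    let kind = match op {
        OpData::Continue => return None,
        OpData::Text => MessageKind::Text,
        OpData::Binary => MessageKind::Binary,
    };
    Some(match kind {
        MessageKind::Text => "text",
        MessageKind::Binary => "binary",
    })
}
```

If a variant is later added to `OpData`, the second version fails to compile until the narrowing match is updated, whereas the first would silently route the new variant toward the `panic!` arm only if the outer pattern were widened.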

@akonradi-signal
Contributor Author

Funnily enough I tried writing a test against the current master for the existing behavior of read_message_frame:

#[test]
fn reserved_data_frame_type_in_incomplete_message() {
    let mut incoming = WriteMoc(Cursor::new(&[
        0x01, 0x06, 0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x2c, // "Hello,"
        0x06, // reserved frame code
        0x00, 0x06, 0x20, 0x74, 0x68, 0x65, 0x72, 0x65, // (continuation) " there"
    ]));

    let mut context = WebSocketContext::new(Role::Client, None);

    let err = context.read(&mut incoming).unwrap_err();
    assert!(
        matches!(err, Error::Protocol(ProtocolError::UnknownDataFrameType(0x06))),
        "err was {err:?}"
    );
}

An error is produced by WebSocketContext::read, but it's not that one! It turns out that there's an extra check for the reserved opcodes elsewhere:

    match opcode {
        OpCode::Control(Control::Reserved(_)) | OpCode::Data(Data::Reserved(_)) => {
            return Err(Error::Protocol(ProtocolError::InvalidOpcode(first & 0x0F)))
        }

So the match block below shouldn't even encounter those reserved opcodes.
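A minimal sketch of why the test sees a different error, assuming the reserved-opcode check runs while the first header byte is decoded (the `first & 0x0F` expression suggests this, but the location is an assumption here). `Op`, `ProtoError`, and `parse_opcode` are hypothetical names; the opcode values are the ones assigned by RFC 6455.

```rust
#[derive(Debug, PartialEq)]
enum Op {
    Continue,
    Text,
    Binary,
    Close,
    Ping,
    Pong,
}

#[derive(Debug, PartialEq)]
enum ProtoError {
    InvalidOpcode(u8),
}

/// Decode the low nibble of the first header byte and reject every
/// reserved opcode up front, before any message-level logic runs.
fn parse_opcode(first: u8) -> Result<Op, ProtoError> {
    match first & 0x0F {
        0x0 => Ok(Op::Continue),
        0x1 => Ok(Op::Text),
        0x2 => Ok(Op::Binary),
        0x8 => Ok(Op::Close),
        0x9 => Ok(Op::Ping),
        0xA => Ok(Op::Pong),
        reserved => Err(ProtoError::InvalidOpcode(reserved)),
    }
}
```

With a check like this in front, a later arm for a reserved data opcode can never be reached from the read path, so its error variant is never observed by callers.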

@daniel-abramov
Member

> I've pulled out the uncontroversial changes into separate commits while leaving the total change the same.

Thanks! If you prefer, you could create a separate PR with those commits, so that I can merge them right away.

> The DataMessageType is my attempt to not move a match but to remove a panic! that is required to satisfy the compiler but can't be hit in practice.

I agree that having panic!() for an unreachable branch is less than ideal (we could replace it with unreachable!() though, to make the intention clear), but I'm still not quite convinced that it's worth introducing a new entity (type) that has somewhat unclear semantics and whose only utility is to remove that single match.

Eagerly check conditions for a data frame and encode the result
explicitly in a custom enum type. This lets us combine code from two
match arms and eliminate a panic! that was impossible to hit but
required by the compiler.
@akonradi-signal
Contributor Author

I've pushed a new final commit on the branch that takes a more focused approach to the refactoring. I've preserved the top-level match on the frame opcode and the precedence of checks for error conditions.

As noted in the commit description, this does let us get rid of a panic!() and lets us collapse the max size checks for text and binary messages. If you remain unconvinced I'll go ahead and open a separate PR for the first two commits and we can iterate more or just drop the last one.

@daniel-abramov
Member

Thanks! I've pondered over the changes in order to understand what exactly I do not like about the current (master) implementation of that logic, and why the suggested change (final commit) still feels a bit wrong to me.

I think the main thing I find suboptimal about the implementation in master is not the use of panic!() in the unreachable branch (we could use unreachable!() after all; this does not make the code unreadable). My main concern is that the match in question is used as a replacement for a long and very nested if { } else if { } else if { } state machine, which leads to the use of numerous different match arms, including several conditionals with if. This makes the execution flow feel somewhat complicated (at least for those unfamiliar with the codebase or who do not recall all the details), and the order of the match checks becomes significant. I assume that it might have been the reason why you did not like that match block as well?

The only way to get rid of this complication (without some major re-thinking / refactoring) is to split the processing of the data frame into 2 stages. I believe that your latest change tries to achieve it: judging from the code, the first match stage seems to be trying to process the data opcode by ensuring that the received frame adheres to the protocol and matches the expectation of the current state of the websocket (e.g., self.incomplete). However, the next match again checks that the (unchanged) internal state (self.incomplete) is in a proper state and returns a protocol error otherwise. In other words, I don't feel like it simplifies the execution flow much (but I may be biased).

I attempted to formulate a suggestion based on your changes and what I believe might address the points you dislike, while considering my concerns and maintaining code readability.

Your version:

enum FrameType {
    Continue,
    Initial(IncompleteMessageType),
}

let fin = frame.header().is_final;
let frame_type = match data {
    OpData::Continue => Ok(FrameType::Continue),
    _ if self.incomplete.is_some() => Err(ProtocolError::ExpectedFragment(data)),
    OpData::Text => Ok(FrameType::Initial(IncompleteMessageType::Text)),
    OpData::Binary => Ok(FrameType::Initial(IncompleteMessageType::Binary)),
    OpData::Reserved(i) => Err(ProtocolError::UnknownDataFrameType(i)),
}?;

match frame_type {
    FrameType::Continue => {
        let msg = self
            .incomplete
            .as_mut()
            .ok_or(ProtocolError::UnexpectedContinueFrame)?;
        msg.extend(frame.into_payload(), self.config.max_message_size)?;

        if fin {
            Ok(Some(self.incomplete.take().unwrap().complete()?))
        } else {
            Ok(None)
        }
    }
    FrameType::Initial(data_type) => {
        if fin {
            check_max_size(frame.payload().len(), self.config.max_message_size)?;
            Ok(Some(match data_type {
                IncompleteMessageType::Text => Message::Text(frame.into_text()?),
                IncompleteMessageType::Binary => {
                    Message::Binary(frame.into_payload())
                }
            }))
        } else {
            let mut incomplete = IncompleteMessage::new(data_type);
            incomplete
                .extend(frame.into_payload(), self.config.max_message_size)?;
            self.incomplete = Some(incomplete);
            Ok(None)
        }
    }
}

My suggestion:

let fin = frame.header().is_final;

let payload = match (data, self.incomplete.as_mut()) {
    (OpData::Continue, None) => Err(ProtocolError::UnexpectedContinueFrame),
    (OpData::Continue, Some(incomplete)) => {
        incomplete.extend(frame.into_payload(), self.config.max_message_size)?;
        Ok(None)
    }
    (_, Some(_)) => Err(ProtocolError::ExpectedFragment(data)),
    (OpData::Text, _) => Ok(Some((frame.into_payload(), MessageType::Text))),
    (OpData::Binary, _) => Ok(Some((frame.into_payload(), MessageType::Binary))),
    (OpData::Reserved(i), _) => Err(ProtocolError::UnknownDataFrameType(i)),
}?;

match (payload, fin) {
    (None, true) => Ok(Some(self.incomplete.take().unwrap().complete()?)),
    (None, false) => Ok(None),
    (Some((payload, t)), true) => {
        check_max_size(payload.len(), self.config.max_message_size)?;
        match t {
            MessageType::Text => Ok(Some(Message::Text(payload.try_into()?))),
            MessageType::Binary => Ok(Some(Message::Binary(payload))),
        }
    }
    (Some((payload, t)), false) => {
        let mut incomplete = IncompleteMessage::new(t);
        incomplete.extend(payload, self.config.max_message_size)?;
        self.incomplete = Some(incomplete);
        Ok(None)
    }
}
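To try the two-stage tuple match outside the crate, here is a self-contained model of it. Every type below (`Context`, `Incomplete`, `ProtoError`, and so on) is a simplified, hypothetical stand-in; the size limits and the reserved opcode are left out, so treat it as a sketch of the control flow only.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum OpData { Continue, Text, Binary }

#[derive(Debug, Clone, Copy, PartialEq)]
enum MessageType { Text, Binary }

#[derive(Debug, PartialEq)]
enum Message { Text(String), Binary(Vec<u8>) }

#[derive(Debug, PartialEq)]
enum ProtoError { UnexpectedContinueFrame, ExpectedFragment(OpData), InvalidUtf8 }

/// A fragmented message that is still being assembled.
struct Incomplete { kind: MessageType, buf: Vec<u8> }

/// Turn a fully received buffer into a message, validating UTF-8 for text.
fn finish(kind: MessageType, buf: Vec<u8>) -> Result<Message, ProtoError> {
    match kind {
        MessageType::Text => String::from_utf8(buf)
            .map(Message::Text)
            .map_err(|_| ProtoError::InvalidUtf8),
        MessageType::Binary => Ok(Message::Binary(buf)),
    }
}

#[derive(Default)]
struct Context { incomplete: Option<Incomplete> }

impl Context {
    fn on_data(&mut self, data: OpData, fin: bool, payload: Vec<u8>)
        -> Result<Option<Message>, ProtoError>
    {
        // Stage 1: validate the opcode against the current state.
        // `None` means "appended to the message already in progress".
        let start = match (data, self.incomplete.as_mut()) {
            (OpData::Continue, None) => return Err(ProtoError::UnexpectedContinueFrame),
            (OpData::Continue, Some(msg)) => {
                msg.buf.extend_from_slice(&payload);
                None
            }
            (_, Some(_)) => return Err(ProtoError::ExpectedFragment(data)),
            (OpData::Text, None) => Some((payload, MessageType::Text)),
            (OpData::Binary, None) => Some((payload, MessageType::Binary)),
        };

        // Stage 2: decide what to emit based on the FIN flag alone.
        match (start, fin) {
            (None, true) => {
                let msg = self.incomplete.take().expect("stage 1 matched Some");
                finish(msg.kind, msg.buf).map(Some)
            }
            (None, false) => Ok(None),
            (Some((buf, kind)), true) => finish(kind, buf).map(Some),
            (Some((buf, kind)), false) => {
                self.incomplete = Some(Incomplete { kind, buf });
                Ok(None)
            }
        }
    }
}
```

Feeding it a non-final text frame with "Hello" followed by a final continuation with ", there" yields one `Message::Text`, and a non-continuation frame sent in between is rejected with `ExpectedFragment` without disturbing the partial message.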

What do you think?

@daniel-abramov
Member

@akonradi-signal
Contributor Author

akonradi-signal commented Sep 26, 2025

> I'm working on adding support for the "permessage-deflate" extension (another take on Add permessage-deflate support, again #426), which makes this code a bit less compact, and wanted to try to get ahead of that.

Closing the loop: https://github.com/signalapp/tungstenite-rs supports permessage-deflate. I incorporated the refactors from #518 as 119f4d7. I had been worried they would be difficult to adopt, but I'm pleasantly surprised with how the overall flow turned out!
