
Conversation

@julianz-
Contributor

When a client sends plaintext HTTP to a TLS-only port, the SSL layer detects the invalid handshake and may close the underlying socket. The server attempts to send a 400 Bad Request error response, but the socket may already be closed, causing OSError during the flush.

With pyOpenSSL, the response usually succeeds. With the builtin SSL adapter, the socket is typically closed before the write can occur.

This fix overrides _flush_unlocked() in StreamWriter to catch OSError and clear the buffer. This allows:

  • The explicit flush to fail gracefully when sending the 400 response
  • Object finalization (__del__) to complete without errors

Note that swallowing OSError here should not affect normal communication: write() calls _flush_unlocked() on StreamWriter, which in turn calls the base-class implementation _flush_unlocked() in BufferedWriter. The base implementation handles errors of type BlockingIOError, such as WantWriteError and WantReadError. But if the socket is dead, an OSError is raised, which propagates up to StreamWriter, where _flush_unlocked() now handles it.
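A minimal sketch of the idea, using the pure-Python `_pyio` module (whose `BufferedWriter` exposes `_flush_unlocked()` and `_write_buf`, like the base class the PR description refers to). `DeadSocketRaw` is a made-up stand-in for a socket already torn down by a fatal TLS alert; the real patch targets Cheroot's `StreamWriter`:

```python
import _pyio  # pure-Python io, where _flush_unlocked()/_write_buf are visible


class DeadSocketRaw(_pyio.RawIOBase):
    """Stand-in for a socket the TLS layer has already torn down."""

    def writable(self):
        return True

    def write(self, b):
        raise OSError(9, 'Bad file descriptor')


class ResilientWriter(_pyio.BufferedWriter):
    """Sketch of the override: swallow OSError and drop the buffer."""

    def _flush_unlocked(self):
        try:
            super()._flush_unlocked()
        except OSError:
            # The socket is dead; keeping the bytes around would only
            # make the same error fire again from __del__() at GC time.
            self._write_buf.clear()


w = ResilientWriter(DeadSocketRaw())
w.write(b'HTTP/1.1 400 Bad Request\r\n\r\n')
w.flush()  # would raise OSError(9) without the override
assert not w._write_buf  # buffer cleared, so finalization stays clean
```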

@codecov

codecov bot commented Nov 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.55%. Comparing base (c94db0c) to head (0540de5).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #802      +/-   ##
==========================================
- Coverage   77.55%   77.55%   -0.01%     
==========================================
  Files          41       41              
  Lines        4688     4692       +4     
  Branches      542      541       -1     
==========================================
+ Hits         3636     3639       +3     
- Misses        909      911       +2     
+ Partials      143      142       -1     

@julianz- force-pushed the fix_http_over_https branch from 2aa4578 to 2e1424b (November 28, 2025 08:07)
Comment on lines +330 to +331
wfile.flush()
wfile.close()
Member

Please avoid having more than one instruction in try-blocks.

Member

Also, this should probably be a with-block anyway.
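Sketched as a with-block (hypothetical; `wfile` here is an in-memory stand-in for the real per-connection writer): `contextlib.closing()` guarantees `close()` runs, and `suppress(OSError)` keeps exactly one risky instruction in scope, addressing both review comments:

```python
import contextlib
import io

wfile = io.BytesIO()  # stand-in for the real per-connection writer
wfile.write(b'HTTP/1.1 400 Bad Request\r\n\r\n')

# closing() always closes wfile on exit; suppress() confines OSError
# handling to the flush that may hit a dead socket.
with contextlib.suppress(OSError), contextlib.closing(wfile):
    wfile.flush()

assert wfile.closed  # closed even if flush() had raised
```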

Comment on lines +75 to +80
We override this method because when a client sends plaintext HTTP
to a TLS-only port, SSL detects a bad handshake and may
invalidate the underlying socket. At that point, the socket
may not be writable. Attempting to write (e.g., sending a
400 Bad Request) may succeed or raise OSError. This override
prevents OSError from propagating.
Member

This part looks like a code comment, not a docstring.

noqa
PIL
pipelining
plaintext
Member

We might not need this if we don't put that comment into a docstring but have it as a comment.

# fatal alert triggered by invalid handshake).
# Clearing the buffer prevents the error from happening
# again during cleanup.
self._write_buf.clear()
Member

I'm not sure if we should be suppressing problems on the low level like this. It's likely the caller that should perform the suppression since the stream writer does not know the context it's used in (nor should it).
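For contrast, caller-side suppression as suggested here would look roughly like this (hypothetical sketch; `DeadSocket` fakes a connection the peer has already reset):

```python
import contextlib
import io


class DeadSocket(io.RawIOBase):
    """Fake raw stream whose peer has already reset the connection."""

    def writable(self):
        return True

    def write(self, b):
        raise OSError(32, 'Broken pipe')


wfile = io.BufferedWriter(DeadSocket())
try:
    # The caller knows it is emitting a best-effort 400 over a socket
    # that may already be dead, so *it* decides the OSError is ignorable.
    wfile.write(b'HTTP/1.1 400 Bad Request\r\n\r\n')
    wfile.flush()
except OSError:
    with contextlib.suppress(OSError):
        wfile.close()  # best effort; the flush inside close() fails again

assert wfile.closed  # the generic StreamWriter itself stays untouched
```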

Member

I don't like that we mess with private attributes (although, the existing code already accesses it). In particular, it's weird how TLS quirks leak into makefile. I'd wait for other bits of refactoring, especially the removal of makefile in favor of classmethods before doing anything about this.

@webknjaz
Member

webknjaz commented Dec 5, 2025

@julianz- did this become redundant after the last merge? Anything worth salvaging?

@julianz-
Contributor Author

julianz- commented Dec 5, 2025

I think this fix is still useful. On my last commit on my fork for #800, I saw "OSError: [Errno 9] Bad file descriptor" coming from one environment because StreamWriter was still trying to complete a write() to a dead socket. The main idea in this PR is to make StreamWriter more resilient when dealing with sockets that are being torn down. I can work on rebasing and updating this PR, but I wasn't sure whether you were dead set against the idea because of the need to override private methods?

@webknjaz
Member

webknjaz commented Dec 6, 2025

The underlying issue should be fixed, yes. However, I'm not convinced that this is the right place to patch. This is not only because of the private method. It's because the stream writer/reader are generic. They don't know what they operate on, nor should they (to maintain abstraction separation). It's the caller that would know if it's working in the TLS context or plain TCP. And I'm pretty sure the calling side needs to be handling any I/O-related errors that arise from using these objects exactly where the calls are made. Suppressing such problems will influence any caller in any context, making it look like the operations they invoke complete just fine and they can proceed with whatever they were doing, potentially issuing more calls that would definitely fail. This is a recipe for disaster.

The main root cause is likely mismatched lifetimes of objects (sockets vs. stream readers/writers). We seem to be forgetting to close resources after use, as you've identified in one part of the PR. This was causing the garbage collector to invoke .close() from within the destructor method __del__() at an inconsistent point in time: https://github.com/python/cpython/blob/eba449a1989265a923174142dd67dee074f90967/Lib/_pyio.py#L484.

To address that, we should make sure such resources are freed and discarded (.close()) once no longer needed. The idiomatic way of doing this is context managers, but the code base is so old that they didn't exist originally, and later on they haven't been adopted everywhere they should be. With that, we wouldn't need to mess with close()/flush() manually, since that'll happen upon exiting said blocks. That said, we'll still need those with-blocks wrapped with I/O error handling.
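A sketch of the deterministic-teardown point (the `Recorder` class is hypothetical; real code would wrap the socket's makefile objects): the with-block closes the writer on exit, so `__del__()` never has cleanup left to do at some arbitrary GC moment:

```python
import io


class Recorder(io.RawIOBase):
    """Records teardown order to show when close() actually happens."""

    def __init__(self):
        super().__init__()
        self.events = []

    def writable(self):
        return True

    def write(self, b):
        self.events.append('write')
        return len(b)

    def close(self):
        if not self.closed:
            self.events.append('close')
        super().close()


raw = Recorder()
with io.BufferedWriter(raw) as w:  # exit flushes and closes the writer
    w.write(b'HTTP/1.1 200 OK\r\n')
# close() has already run here -- no reliance on GC/__del__() timing.
assert raw.events == ['write', 'close']
```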


Fixing this is definitely a larger effort that should be coherent across the code base. I'd perhaps prioritize other PRs before handling this one properly.

@webknjaz
Member

webknjaz commented Dec 6, 2025

(another bit of context)

Here's the issue that exposed this problem originally: #734.

The warning was suppressed in #735 for Python 3.13 and must be reverted as a part of the PR fixing the problem.

@webknjaz
Member

webknjaz commented Dec 6, 2025

It's possible that #779 was closer to the solution partially, but incomplete. I'd still postpone this one for now, though.
