Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix endless 5xx responses leading to pages #5392

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

segiddins
Copy link
Member

@segiddins segiddins commented Jan 14, 2025

The root cause here is that rails was returning
responses to the reverse proxy where the total
size of all the headers was greater than 4k. This
would lead to an
upstream sent too big header while reading response header from upstream
error in nginx, resulting in a default 502 bad
gateway response.

It took us dozens of hours of searching to find
the root cause because our nginx log parsing was
throwing away all of the error logs, and I only
saw it when directly tailing the logs from the
container in k8s. I have since fixed the log pipeline so the errors will show up as errors, with the error message.

As to how the rails app was returning so many
header bytes:
The Redirector middleware generates a 301
whenever it sees a host that does not match a
predefined list. Notably, api.rubygems.org
points to prod, but is not in that list.
The 301 includes a Location header with the
correct host, while maintaining the rest of the
path and query string. This means that the Location
header could contain up to the nginx limit of almost
8k bytes. In combination with the other headers returned
by the rails app, this was sufficient to exceed the
default limit of 4k bytes.

Verified that this fixes the 502s on staging by manually applying.

Tested via curl http://localhost:8080/versions/$(printf 'HelloWorld%.0s' {1..395})/abcdef -H 'X-Forwarded-Proto: https' -H 'X-Edge-Proto: https' -H 'Host: indexs.rubygems.org' -H 'Accept: sam/test' -v

Signed-off-by: Samuel Giddins [email protected]

The root cause here is that rails was returning
responses to the reverse proxy where the total
size of all the headers was greater than 4k. This
would lead to an
`upstream sent too big header while reading response header from upstream`
error in nginx, resulting in a default 502 bad
gateway response.

It took us dozens of hours of searching to find
the root cause because our nginx log parsing was
throwing away all of the error logs, and I only
saw it when directly tailing the logs from the
container in k8s.

As to how the rails app was returning so many
header bytes:
The `Redirector` middleware generates a 301
whenever it sees a host that does not match a
predefined list. Notably, `api.rubygems.org`
points to prod, but is _not_ in that list.
The 301 includes a `Location` header with the
correct host, while maintaining the rest of the
path and query string. This means that the `Location`
header could contain up to the nginx limit of almost
8k bytes. In combination with the other headers returned
by the rails app, this was sufficient to exceed the
default limit of 4k bytes.

Signed-off-by: Samuel Giddins <[email protected]>
Copy link

codecov bot commented Jan 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.06%. Comparing base (cecf79b) to head (10cd26c).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5392   +/-   ##
=======================================
  Coverage   97.06%   97.06%           
=======================================
  Files         451      451           
  Lines        9391     9391           
=======================================
  Hits         9115     9115           
  Misses        276      276           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@simi
Copy link
Member

simi commented Jan 14, 2025

Are those valid requests (according to app logic) with such a huge headers?

@segiddins
Copy link
Member Author

They are valid HTTP, yes, as is the response.

@martinemde
Copy link
Member

Is there any chance we would instead disable buffering so that the response is sent to the client (fastly) synchronously without buffering the entire response and potentially writing files to disk?

https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_buffering

@segiddins
Copy link
Member Author

No, buffering (of the headers) is necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

3 participants