-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attempt to improve plug call exception handling and logging #398
Conversation
- Telementry events - Also emit span for throw - Also emit span for exit - Logger - Lets logger handlers know about the original crash reason when handling bandit errors
lib/bandit/pipeline.ex
Outdated
rescue | ||
error -> handle_error(error, __STACKTRACE__, transport, span, opts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's redundant with the catch block below (raise
being realized by try/catch internally), but I don't think it's part of official spec; can we leave this clause here to make it explicit? Alternately, if the language is authoritative on this, that's fine too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right.
Despite working like that doesn't seem to be actually documented in https://hexdocs.pm/elixir/try-catch-and-rescue.html that you can catch all (including :error
) with just catch
.
Re-instating the rescue
clause just in case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. It sounds like it's an implementation detail then, so we're safer to be explicit in case they ever change it in the language.
lib/bandit/telemetry.ex
Outdated
* `reason`: The exception, throw value or exit reason which caused this unexpected termination | ||
* `stacktrace`: The stacktrace of the location which caused this unexpected termination | ||
|
||
* `[:bandit, :request, :exception]` (DEPRECATED) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize that exception
is (and always has been) the wrong term here, but I've deliberately been using it to mirror the naming conventions of :telemetry.span/3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha.
I only introduced a new event name just to not break the shape of the actual one when supporting more kind
s.
Updating back to only [:bandit, :request, :exception]
, with the correct kind
and reason
field, and maybe keeping the exception
field in case is_exception(reason)
, so that this doesn't break peoples telemetry handlers listening and matching on that metadata field.
It will break anyone matching on kind: :exit
and exception: exception
at the same time. This will now be corrected to kind: :error
and exception: exception
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine
A couple of changes requested to keep things consistent and explicit; other than those changes the rest of this looks great! |
7765728
to
c6347e1
Compare
12d7cc5
to
8dc344f
Compare
Fixed format. Can't reproduce locally the |
@mtrudel Just realized that Bandit was originally taking the approach of re-raising plug call errors in 1.4 instead of just "logging or silence" with changes in 1.5 with https://github.com/mtrudel/bandit/pull/342/files#diff-68626306df11d3e10964941997c6cdafbf587a9ec7de087141840655a47a1855L65. Curious about the reason not to continue re-raising? Similar to what
In case there's the possibility of introducing re-raise again, this pull request could be re-worked. |
Reading this thread (https://groups.google.com/g/elixir-lang-core/c/pWz-uTVMEVM/m/QGXncxD1AQAJ) also got me thinking about it. |
Interesting! It seems like they're using exactly the 'catch The main purpose of that PR was to ensure that logging of errors takes into consideration the accompanying result code (ie: only log errors that get raised with a 500..599 response code). This generally means that we do the logging ourselves (as we do in that PR) instead of reraising and letting the runtime log it All of which is to say that I think we're doing the right thing by not re-raising? |
Mmm... not sure to he honest. I'm exploring the "re-raising approach" in a separate branch to compare. I'll let you know how it goes. |
Opening draft with what I got from playing with that alternative approach: #400. |
error -> handle_error(:error, error, __STACKTRACE__, transport, span, opts) | ||
catch | ||
kind, reason -> | ||
handle_error(kind, reason, __STACKTRACE__, transport, span, opts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not right.
It will catch and handle "normal" exists as errors.
Logger.error( | ||
Exception.format(kind, reason, stacktrace), | ||
domain: [:bandit], | ||
crash_reason: crash_reason(kind, reason, stacktrace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the record, reported separate issue for "missing crash_reason
" so it can be tackled separately: #413.
Supersedes #396.
Telementry events
Logger
crash_reason
: https://github.com/elixir-lang/elixir/blob/4bbddcc053498ca28ea5ebb5039de9b3a97ddf82/lib/logger/lib/logger.ex#L130-L135crash_reason
: https://github.com/dashbitco/broadway/blob/9eaf2140b6c1f36b25f990a7c74052bece26aadc/lib/broadway/acknowledger.ex#L104-L113crash_reason
: https://github.com/elixir-plug/plug_cowboy/blob/746d848baca3436ca74fdab2750d14070b92002b/lib/plug/cowboy/translator.ex#L42-L46