Add logger filtering on process level #2384

Open

hauleth wants to merge 1 commit into master from kernel/per-process-log-level

Conversation

@hauleth (Contributor) commented Sep 10, 2019

This adds the possibility to filter logger messages on a per-process basis. This feature is mostly useful for disabling logging in selected processes or for increasing logging verbosity during debugging.

@CLAassistant commented Sep 10, 2019

CLA assistant check
All committers have signed the CLA.

@garazdawi (Contributor)

If you do this:

logger:add_primary_filter(pid_filter,
    {fun(_Log, Pid) ->
         if
             self() =:= Pid -> stop;
             true -> ignore
         end
     end,
     PidToIgnore}).

or possibly a variant that uses the process dictionary, you can achieve the same result as this PR.

It is also possible to raise the log level in a filter. Note, however, that the primary filters run only after the primary level check, so the level filtering then needs to be done in the handler, as the sketch below shows.
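
For illustration, a minimal sketch of that handler-side approach (VerbosePid is a placeholder, and this filter is not part of logger itself): open up the primary and handler levels, then re-apply the default notice cut-off in a handler filter for everyone except the chosen pid.

ok = logger:set_primary_config(level, all).
ok = logger:set_handler_config(default, level, all).
ok = logger:add_handler_filter(default, verbose_pid,
       {fun(#{level := Level, meta := #{pid := Pid}} = LogEvent, VerbosePid) ->
                case Pid =:= VerbosePid orelse
                     logger:compare_levels(Level, notice) =/= lt of
                    true  -> LogEvent; %% keep: chosen pid, or severe enough
                    false -> stop      %% drop low-severity events from others
                end;
           (_LogEvent, _VerbosePid) ->
                ignore                 %% events without a pid fall through
        end,
        VerbosePid}).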

Is this good enough, or is filtering by pid common enough that it would be worth adding a standardized way to do it for all users of logger?

@hauleth (Contributor, Author) commented Sep 11, 2019

This came out of a proposal for core while working on elixir-lang/elixir#9333, which supports disabling logging per process (with this change, that can be achieved by logger:set_process_level(none)). Currently I have implemented it as a primary filter of the form:

logger:add_primary_filter(process_disable,
    {fun(_Log, _Conf) ->
         case get(logger_disabled) of
             true -> stop;
             _ -> ignore
         end
     end, []}).
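
With that filter installed, a process opts out of logging from the inside; a small usage sketch (noisy_section/0 is a placeholder):

put(logger_disabled, true),   %% log events from this process are now dropped
noisy_section(),
erase(logger_disabled).       %% logging resumes for this process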

I cannot say much about the popularity of that functionality in the Elixir community; I was just porting it in order to merge the loggers of both worlds.

@garazdawi (Contributor) commented Sep 11, 2019

The thing about adding this check to the logger API, rather than as a primary filter, is that we would then have to check the process dictionary for every log call made anywhere, which is not free.

So unless many people find this a very useful feature, I'm going to reject this PR.

@hauleth hauleth force-pushed the kernel/per-process-log-level branch from 77246b9 to 4215f41 Compare September 11, 2019 09:12
@hauleth (Contributor, Author) commented Sep 11, 2019

@garazdawi I understand; it was just a proposal. An alternative solution could be to merge the process metadata and this new entry into one, but that would still require a process-dictionary lookup on each request. I will try to benchmark it to see if there is any noticeable impact.

@essen (Contributor) commented Sep 11, 2019

I think it's very useful to be able to do this. The process dictionary solution is good (because you can choose to silence a specific part of a process's execution, like a noisy shutdown) but also not so good (you can't always enable it easily from outside the process), so I do not think it should be in the API as it currently stands. For the feature to be truly useful, I would need something equivalent to logger:silence_pid(Pid) that I can call from anywhere, including the shell.
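
For reference, something close to logger:silence_pid/1 could be built today on top of a primary filter. A minimal sketch, where the module name, the persistent_term key, and the whole API are hypothetical (persistent_term updates are deliberately expensive, which is acceptable for an occasional debugging toggle):

-module(pid_silencer).
-export([install/0, silence/1, unsilence/1]).

-define(KEY, {?MODULE, silenced}).

%% Install a primary filter that drops events from any silenced pid.
%% Filters run in the process that emits the event, so self() works here.
install() ->
    persistent_term:put(?KEY, #{}),
    logger:add_primary_filter(?MODULE,
        {fun(_LogEvent, _Args) ->
             case maps:is_key(self(), persistent_term:get(?KEY)) of
                 true  -> stop;
                 false -> ignore
             end
         end, []}).

%% Callable from anywhere, including the shell.
silence(Pid) ->
    persistent_term:put(?KEY, (persistent_term:get(?KEY))#{Pid => true}).

unsilence(Pid) ->
    persistent_term:put(?KEY, maps:remove(Pid, persistent_term:get(?KEY))).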

@garazdawi (Contributor)

Silencing something is achievable with a primary filter. What I'm wondering about is the use case of making a process more verbose. That is not possible today without building multiple filters and using the module logger level.

@josevalim (Contributor)

> but that would still require a process-dictionary lookup on each request. I will try to benchmark it to see if there is any noticeable impact.

To clarify: this means we would read from the process dictionary once (instead of zero times) whenever a filter skips the event, but the number of reads when you actually log would remain the same.

> That is not possible today without building multiple filters and using the module logger level.

Oh, I hadn't thought about this, and it would actually be really nice. Yesterday I was having a discussion with someone who wanted SASL reports during shutdown for a specific supervisor, but those require debug level or similar, which meant enabling them for the whole system. If we had something like logger:set_process_level and it could be called externally, as @essen proposes, then it should just work?

But the issue is that I don't know how to implement logger:set_process_level for external processes efficiently. It would require a coordinating process and more ETS reads. Maybe a good compromise would be:

  1. Make it pdict-based and store the level alongside the metadata in the pdict, so that the number of pdict reads does not increase (see the sketch after this list)

  2. Add sys:set_logger_level and sys:set_logger_metadata functions, which allow us to set those for existing behaviours
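
A minimal sketch of what option 1 could look like on the logging hot path; the '$logger_process$' key and the allow/1 helper are made up for illustration and are not logger internals:

%% One pdict read decides: the per-process level if set, the primary
%% level otherwise.
allow(Level) ->
    case get('$logger_process$') of
        #{level := ProcLevel} ->
            logger:compare_levels(Level, ProcLevel) =/= lt;
        _ ->
            #{level := Primary} = logger:get_primary_config(),
            logger:compare_levels(Level, Primary) =/= lt
    end.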

Thoughts?

@essen (Contributor) commented Sep 11, 2019

> Silencing something is achievable with a primary filter. What I'm wondering about is the use case of making a process more verbose. That is not possible today without building multiple filters and using the module logger level.

Sorry, lack of sleep. I should have written logger:set_pid_level(Pid, Level). Basically, sometimes it would be useful to disable logging when it's too much, but the opposite is also true, for example when debugging (and of course, if you temporarily enable debug logs for a process in prod, you probably want to disable them when you're done). I've avoided debug logging many times in the past because I can't easily target what I really want to look at, compared to tracing, where I can trace a single process. I have not used logger enough so far to know how it would fit, though.

I suppose there's overlap with sys:debug, but in this case it wouldn't be limited to debug logging.

@hauleth (Contributor, Author) commented Sep 11, 2019

There is one more possible option: an additional server that would watch over the filtered-out processes and update the filter argument accordingly. Unfortunately, that still has the disadvantage of not being able to increase verbosity beyond the primary/module level, as those always take precedence.

@hauleth hauleth force-pushed the kernel/per-process-log-level branch 2 times, most recently from df2e6f0 to 853e61a Compare September 11, 2019 11:56
@ferd (Contributor) commented Sep 11, 2019

One of the big things I have in mind when working on a larger system with multiple containerized nodes is how to turn all of these choices about what to log and what not to log into something that is long-term maintainable and versioned.

To me, the weakness of approaches that are global and remote, such as logger:set_pid_level(Pid, Level) or setting a global filter with a predefined list of pids, is that they encode what is ultimately a very local decision that isn't set by a policy: if a process restarts, the pid changes and the process starts logging again. And if, after a few weeks or months, the process counter rolls over after the death of a process, you may start silencing processes you never expected to.

So to me the only safe, logical, and tractable approach is something that is set locally by the executing process: have the process put a value in its local logging context or in its process dictionary. Then I can audit it alongside the code, and if the worker restarts, its logging level is configured consistently.

The weakness of this approach is obviously when you want to silence a process whose code you do not own, but I feel that if we had a structured approach to logging, then adding semantic filters (e.g. "filter out the logs related to the inets application, but keep those related to httpc") would make sense and be more robust to code changes.

But then again, maybe I'm debugging things in an entirely different way than the folks asking for this feature. I pretty much never increase the logging level at run time (I'll usually use tracing or add a temporary sys module probe instead), and when I do, I'm usually not connected to stdout anyway and only consume things off a file, where manual filtering is likely easy.

My gut feeling here is that adding pid-based filtering at a distance and sticking it in the logger API is addressing deeper (and broader) design issues in the wrong place. I keep thinking that the existing filter mechanisms ought to be good enough.

@josevalim (Contributor) commented Sep 11, 2019

@ferd my concern is that whatever we have today doesn't address libraries that provide you a behaviour that you end up running. Those are things like poolboy, cowboy, etc. I can have many poolboy pools in the system and I want to debug one of them. How do I do that? I can do it per module, but they all share the same module. The same applies to supervisors. I don't want to sys:trace a supervisor; it will print a lot more information than I actually care about. And while I would possibly be OK with debugging a supervisor at this level, that's not the pattern I would generally recommend.

So unless I am missing something, I really don't buy that we are addressing design issues in the wrong place, given that the issue occurs with a built-in component such as the supervisor. I agree, though, that we don't need to be able to do it at a distance. The combination of pdict + sys:set_logger_level I mentioned earlier would also do the trick.

@hauleth hauleth force-pushed the kernel/per-process-log-level branch from 853e61a to 0cea267 Compare September 11, 2019 13:11
@essen (Contributor) commented Sep 11, 2019

There are two types of use: yours (@ferd), which is controlling what logs get printed on a per-app/process basis, and general debugging (not necessarily in production). Obviously, in the latter case it doesn't matter if the level gets reset when the process restarts; it's no longer the same process or state you were looking into, and it may not display the behavior you were investigating anymore.

Let's say I have a customer connected with a session process named {session, ID}. The customer reports an issue; I set debug level temporarily for that customer's session, ask them to reproduce it, then set the level back to normal (or just leave the level as-is until the session expires) and look into the logs. This is a simplified example of course, as when investigating I may enable this in more processes.

I see this as a great alternative to -ifdef(DEBUG) in that scenario.

That said, as long as processes handle sys messages, even a pdict approach would work: you could use sys:replace_state to update the log level of the other process. It only breaks down when normal processes are involved.
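
A sketch of that trick; the logger_level key is hypothetical (it would have to be read by a corresponding pdict-based filter), and it only works for processes that handle system messages:

%% The fun passed to sys:replace_state/2 runs inside the target process,
%% so the put/2 side effect lands in that process's dictionary.
set_pid_level(Pid, Level) ->
    sys:replace_state(Pid,
        fun(State) ->
            put(logger_level, Level), %% picked up by a pdict-reading filter
            State                     %% the state itself is left unchanged
        end),
    ok.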

@hauleth (Contributor, Author) commented Sep 11, 2019

> To me, the weakness of approaches that are global and remote, such as logger:set_pid_level(Pid, Level) or setting a global filter with a predefined list of pids, is that they encode what is ultimately a very local decision that isn't set by a policy: if a process restarts, the pid changes and the process starts logging again. And if, after a few weeks or months, the process counter rolls over after the death of a process, you may start silencing processes you never expected to.

That is why this proposal uses the pdict. Alternatively, if we decide that there should be a "filtered out" list, then there should be a process that monitors all processes with an altered logging level (as this is meant mostly as a debugging and testing tool, there should not be many of them) and removes their pids from the list when they die (see the sketch below). This would prevent your second scenario.
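
A minimal sketch of such a watcher (everything here is hypothetical; it assumes a primary filter like the pid_silencer sketch above, consulting a shared pid set in persistent_term):

-module(logger_pid_watcher).
-behaviour(gen_server).
-export([start_link/0, silence/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(KEY, {?MODULE, silenced}).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

silence(Pid) ->
    gen_server:call(?MODULE, {silence, Pid}).

init([]) ->
    persistent_term:put(?KEY, #{}),
    {ok, #{}}.

handle_call({silence, Pid}, _From, Monitors) ->
    Ref = monitor(process, Pid),
    persistent_term:put(?KEY, (persistent_term:get(?KEY))#{Pid => true}),
    {reply, ok, Monitors#{Ref => Pid}}.

handle_cast(_Msg, Monitors) ->
    {noreply, Monitors}.

%% When a silenced process dies, drop its pid from the set so that a
%% later reuse of the same pid is never silenced by accident.
handle_info({'DOWN', Ref, process, _Pid, _Reason}, Monitors) ->
    {Pid, Rest} = maps:take(Ref, Monitors),
    persistent_term:put(?KEY, maps:remove(Pid, persistent_term:get(?KEY))),
    {noreply, Rest}.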

> then adding semantic filters (e.g. "filter out the logs related to the inets application, but keep those related to httpc") would make sense and be more robust to code changes.

This is possible nowadays via logger:set_module_level/2 and logger:set_application_level/2 (the second function does it indirectly by using the first one). However, as @josevalim and @essen said, this is only reasonable if you want to monitor everything happening within a given module; if one module is shared between processes (as is done in poolboy, cowboy, ecto, and a lot of other libraries), it ends up not being helpful at all, as you still need to somehow filter the messages later.
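
As a quick illustration of those two functions, using ferd's inets/httpc example (a sketch; the module-level setting for httpc overrides the application-wide one precisely because set_application_level/2 is built on set_module_level/2):

%% Only errors from inets as a whole, but info and above from httpc.
ok = logger:set_application_level(inets, error).
ok = logger:set_module_level(httpc, info).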

> I keep thinking that the existing filter mechanisms ought to be good enough.

It is almost possible to do what is requested just by using filters. The only thing not possible right now is raising the logging level of a process above the primary/module level. The "original" intent was to implement the Elixir Logger.disable/1 and Logger.enable/1 functionality, but since then the proposal has expanded a little to align more with the logger:*_module_level functions.

@hauleth (Contributor, Author) commented Sep 11, 2019

Also, I have started a Twitter poll to check how many people have used that functionality (or were aware of it). So far it shows that it wasn't commonly known or used (at least directly). I know this isn't a representative sample of the community, but it is something.

cc @josevalim @garazdawi

@ferd (Contributor) commented Sep 11, 2019

> @ferd my concern is that whatever we have today doesn't address libraries that provide you a behaviour that you end up running. Those are things like poolboy, cowboy, etc. I can have many poolboy pools in the system and I want to debug one of them. How do I do that? I can do it per module, but they all share the same module. The same applies to supervisors. I don't want to sys:trace a supervisor; it will print a lot more information than I actually care about. And while I would possibly be OK with debugging a supervisor at this level, that's not the pattern I would generally recommend.
>
> So unless I am missing something, I really don't buy that we are addressing design issues in the wrong place, given that the issue occurs with a built-in component such as the supervisor. I agree, though, that we don't need to be able to do it at a distance. The combination of pdict + sys:set_logger_level I mentioned earlier would also do the trick.

I usually just use the pid in the metadata, as part of what the formatter does by default, and then I can filter based on that, either in the text logs or, when doing point-in-time debugging, by adding a handler to do just that.

The one thing this does not address is what @hauleth referred to: allowing the perceived level of a process to be increased. But this is the kind of thing I never do in prod, just because I tend to assume that data logged at debug level is sure as hell not safe to put in production. If it's not logged by default, it likely hasn't been reviewed properly, has likely not seen proper load testing, and could contain all kinds of PII and implementation details that I do not want sent to log files or forwarded to external services.

Now, that's probably a question of preference, and we're unlikely to reach agreement. If I had to do it despite all of that, though, I would probably set per-handler log levels (e.g. default and disk set to info, crash logs at a higher level) and forgo the base logger level (set it to debug so everything goes through). When I do want to debug a particular thing, I would add a handler just for that setup, where I can filter things in a custom manner, and remove it once I'm done (sketched below).
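
A sketch of that setup; the handler id, file name, and TargetPid are placeholders, and the logger_std_h file configuration uses the OTP 21-style type option:

%% Primary level wide open; per-handler levels do the real gating.
ok = logger:set_primary_config(level, all).
ok = logger:set_handler_config(default, level, info).

%% Temporary point-in-time debug handler with a custom filter,
%% removed again once the investigation is done.
ok = logger:add_handler(debug_probe, logger_std_h,
       #{level => debug,
         config => #{type => {file, "debug_probe.log"}},
         filters =>
             [{only_target,
               {fun(#{meta := #{pid := Pid}} = Event, TargetPid)
                      when Pid =:= TargetPid ->
                        Event;
                   (_Event, _TargetPid) ->
                        stop
                end, TargetPid}}]}).
%% ... investigate ...
ok = logger:remove_handler(debug_probe).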

But I usually never make it there; by the time I reach that point, I have usually had the time to trace something and figure it out without going through the logs.

@garazdawi garazdawi added the 'feature' and 'team:VM' (Assigned to OTP team VM) labels Sep 12, 2019
@garazdawi garazdawi self-assigned this Sep 12, 2019
@garazdawi (Contributor)

Hello all! I'm putting this PR on hold for now, as we need to figure out what direction we want to take logger in relation to what sys and erlang:trace/3 can do before proceeding.

@garazdawi garazdawi added the 'stalled' (waiting for input by the Erlang/OTP team) label Sep 19, 2019
@hauleth (Contributor, Author) commented Oct 28, 2019

@ferd @garazdawi what about another approach that would not allow increasing the verbosity of the system, but only reducing the verbosity of a given process? That way a user could silence noisy processes if needed, but not raise the level.
