Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instrument some real world projects #9

Open
ethercrow opened this issue Apr 19, 2020 · 14 comments
Open

Instrument some real world projects #9

ethercrow opened this issue Apr 19, 2020 · 14 comments

Comments

@ethercrow
Copy link
Owner

Would be nice to find and fix pain points in the process and build a portfolio of instrumentation examples to point users to.

@mpickering do you have some project in mind?

@mpardalos you might also be interested.

@mpickering
Copy link
Collaborator

mpickering commented Apr 19, 2020

@ethercrow I can try on GHC if you tell me how to emit the right traces. If it doesn't involve a library that would be better or I should be able to copy the library into the GHC tree somehow.

The other project it would be useful on would be ghcide but I'll leave that for now..

@ethercrow
Copy link
Owner Author

@mpickering

I think these two files should be enough for GHC

https://github.com/ethercrow/opentelemetry-haskell/blob/master/opentelemetry/src/OpenTelemetry/Eventlog.hs
https://github.com/ethercrow/opentelemetry-haskell/blob/master/opentelemetry/src/OpenTelemetry/SpanContext.hs

but you'd have to replace the Control.Monad.Catch.generalBracket with the GHC-specific one.

After you have an eventlog, you can use eventlog-to-zipkin or eventlog-to-lightstep to export it somewhere, the README has some examples.

Maybe hie-bios is a nicer first target since it can depend on exceptions?

@mpickering
Copy link
Collaborator

mpickering commented Apr 20, 2020

I instrumented GHC and it seems there is something broken in the zipkin exporter because each event seems to end up in its own trace.

Also problems with

  • Eventlog parsing logic assumes threaded runtime (Just tid in UserMessage case)
  • Zipkin returned 202 response code rather than 200

I will email you the eventlog.

image

@ethercrow
Copy link
Owner Author

"202 Accepted" is fine, I'll fix the status code check.

Yeah, I haven't thought of non-threaded runtimes yet, I'll take a look, thanks for the eventlog!

@ethercrow
Copy link
Owner Author

@mpickering with this change in place it looks better, some traces look meaningful.

> git diff opentelemetry-extra/
diff --git a/opentelemetry-extra/src/OpenTelemetry/EventlogStreaming_Internal.hs b/opentelemetry
-extra/src/OpenTelemetry/EventlogStreaming_Internal.hs
index 77dd6c8..fd4e961 100644
--- a/opentelemetry-extra/src/OpenTelemetry/EventlogStreaming_Internal.hs
+++ b/opentelemetry-extra/src/OpenTelemetry/EventlogStreaming_Internal.hs
@@ -101,7 +101,7 @@ processEvent (Event ts ev m_cap) st@(S {..}) =
         (EndGC, _, _) -> popSpansAcrossAllThreads now st
         (HeapAllocated {allocBytes}, _, Just tid) ->
           (modifySpan tid (addEvent now "heap_alloc_bytes" (showT allocBytes)) st, [])
-        (UserMessage {msg}, _, Just tid) -> case T.words msg of
+        (UserMessage {msg}, _, fromMaybe 0 -> tid) -> case T.words msg of
           ("ot1" : "begin" : "span" : name) ->
             (pushSpan tid (T.intercalate " " name) now st, [])
           ("ot1" : "end" : "span" : _) -> popSpan tid now st

Screen Shot 2020-04-20 at 12 48 31 PM

@mpickering
Copy link
Collaborator

mpickering commented Apr 20, 2020

@ethercrow That is the change I made locally. How did you view that trace? If I search for "serviceName = ghc" then I only see 10 traces but there should probably just be one?

@mpickering
Copy link
Collaborator

mpickering commented Apr 20, 2020

Perhaps the issue is that there is no span in the log which covers all of the child spans? I will try adding that.

EDIT: This fixes the import. Unfortunatey Zipkin does not deal with visualising how many traces there are well, would be better to make the tracing more fine-grained. In fact, I am not sure what I am supposed to discover from looking at the trace, it seems that I can just view it without doing anything meaningful.

@ethercrow
Copy link
Owner Author

The UX of all these tools is a problem to me as well. Zipkin and Jaeger I don't get at all and Lightstep is I think the most sane, possibly because I've been using it at work for a year.

@ethercrow
Copy link
Owner Author

ghc-eventlog

@ethercrow
Copy link
Owner Author

@mpickering I just added the eventlog-to-chrome executable that I used to convert your ghc eventlog into a thing above. All the traces are on a single timeline.

@ethercrow
Copy link
Owner Author

ghc-eventlog-tracy

@ethercrow
Copy link
Owner Author

An attempt to instrument stack is here commercialhaskell/stack#5260

@ethercrow
Copy link
Owner Author

@broadwaylamb
Copy link

broadwaylamb commented May 17, 2020

I've been working on almost the same thing in GHC itself.

The implementation: https://gitlab.haskell.org/broadwaylamb/ghc/-/tree/timetrace-better-metadata

The ghc-devs post: https://mail.haskell.org/pipermail/ghc-devs/2020-May/018827.html

The difference is that you don't need to compile GHC with -eventlog to get it to work, the Chrome Trace Event file can be generated by passing the -ddump-time-trace flag to GHC. Also there is support for merging logs generated by multiple compiler invocations (with this script).

I've also modified the withTiming function a little bit, so that an event's metadata is not encoded in the event's name, but is stored separately, which allows sorting events just by their name. For example, instead of generating events with name Parser [Main], events are generated with the name Parser and metadata "Main", so if we're compiling multiple modules, we can see how much time parsing took overall. Here's how it looks:
Screenshot 2020-05-17 at 19 14 10

Because the feedback on the ghc-devs mailing list showed lack of interest in incorporating this feature in GHC as implemented, I'm not going to continue this effort. Feel free to use my code in your implementation if you want.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants