-
Notifications
You must be signed in to change notification settings - Fork 73
Add basic Open Telemetry instrumentation for all requests #729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This commit wraps all LLM model calls in an Open Telemetry span that abides by the (still nascent) semantic conventions for Generative AI clients [0]. It's very similar in approach to what was done for `httr2`, and in fact the two of them complement one another nicely: r-lib/httr2#729. For example: library(otelsdk) Sys.setenv(OTEL_TRACES_EXPORTER = "stderr") chat <- ellmer::chat_databricks(model = "databricks-claude-3-7-sonnet") chat$chat("Tell me a joke in the form of an SQL query.") [0]: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/ Signed-off-by: Aaron Jacobs <[email protected]>
This commit wraps all LLM model calls in an Open Telemetry span that abides by the (still nascent) semantic conventions for Generative AI clients [0]. It's very similar in approach to what was done for `httr2`, and in fact the two of them complement one another nicely: r-lib/httr2#729. For example: library(otelsdk) Sys.setenv(OTEL_TRACES_EXPORTER = "stderr") chat <- ellmer::chat_databricks(model = "databricks-claude-3-7-sonnet") chat$chat("Tell me a joke in the form of an SQL query.") [0]: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/ Signed-off-by: Aaron Jacobs <[email protected]>
This is a significant rework over the original implementation after I discovered issues with it in practice. It ended up being pretty hard to get the scoping/side effect issues right, but when I run this PR with However: right now this PR doesn't support Finally, there aren't any unit tests. I'd like to add some (especially considering the likelihood of regressions), but right now the testing story for |
@atheriel If you are using concurrency on a single thread, then you'll need to use sessions to get the relationships between the spans right. Essentially, you'll need to manage the sessions manually and pass the correct session to
The API for this might get a little nicer soonish, I think the mechanics are pretty solid. There is an example for this kind of manual session management in the |
@gaborcsardi have you thought at all about testing? It would be nice if otel offered some way to collect the spans into a data frame (or other data structure), so that we could then inspect and have some simple tests that the right information is getting passed through. |
Yes, you can write them to a file, and then read them back, e.g. tmp <- tempfile(fileext = "otel")
trc_prv <- tracer_provider_stdstream_new(tmp)
[...]
spns <- parse_spans(tmp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! I think folks are going to be super excited once this all comes together.
Is there already an otel way to suppress instrumentation just for httr2?
@gaborcsardi maybe it's worth wrapping that up into a little helper? capture_spans <- function(code) {
tmp <- tempfile(fileext = "otel")
trc_prv <- tracer_provider_stdstream_new(tmp)
code
parse_spans(tmp)
} |
I suggest we move testing discussion to: r-lib/otelsdk#13 |
Good point. We could have an |
Unit tests are now included. |
Some improvements you can cherry-pick: gaborcsardi#1, including spans for |
This commit wraps all requests in an Open Telemetry span that abides by the semantic conventions for HTTP clients [0] (insofar as I understand them). We also propagate the trace context [1] when there is one. The main subtlety is that I had to tweak some of httr2's internals so that request signing can take into account new headers. Luckily there is fairly comprehensive test coverage so I'm fairly sure at this point that I haven't broken anything. Right now this instrumentation is opt in: `otel` is in `Suggests`, and tracing must be enabled (e.g. via the `OTEL_TRACES_EXPORTER` environment variable). Otherwise this is costless at runtime. For example: library(otelsdk) Sys.setenv(OTEL_TRACES_EXPORTER = "stderr") request("https://google.com") |> req_perform() I'm not sure that `otel` needs to move to `Imports`, because by design users actually need the `otelsdk` package to enable tracing anyway. Unit tests are included. [0]: https://opentelemetry.io/docs/specs/semconv/http/http-spans/#http-client-span [1]: https://www.w3.org/TR/trace-context/ Signed-off-by: Aaron Jacobs <[email protected]> Co-authored-by: Gábor Csárdi <[email protected]>
I've completely rewritten this in concert with Gabor as the
The main new subtlety is that I had to tweak some of httr2's internals so that request signing can take into account new headers while preserving the scope-centric needs of |
I'm lining up for a httr2 patch release in the next week or two. I assume that this probably won't make it, since we'll need sometime for otel/otelsdk to get to CRAN? |
Probably not. Should we aim to merge it after the patch, then? |
This commit wraps all requests in an Open Telemetry span that abides by the semantic conventions for HTTP clients (insofar as I understand them).
Right now this instrumentation is opt in:
otel
is inSuggests
, and tracing must be enabled (e.g. via theOTEL_TRACES_EXPORTER
environment variable). Otherwise this is costless at runtime.For example:
I'm not sure that
otel
needs to move toImports
, because by design users actually need theotelsdk
package to enable tracing anyway.The major limitation right now is that we don't propagate the trace context to the server, becauseotel
doesn't have an explicit mechanism for this yet.