Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loss of context using async-http-client #13023

Open
annettejanewilson opened this issue Jan 10, 2025 · 0 comments · May be fixed by #13041
Open

Loss of context using async-http-client #13023

annettejanewilson opened this issue Jan 10, 2025 · 0 comments · May be fixed by #13041
Labels
bug Something isn't working needs triage New issue that requires triage

Comments

@annettejanewilson
Copy link

annettejanewilson commented Jan 10, 2025

Describe the bug

When the async-http-client instrumentation is enabled, OpenTelemetry context is not propagated to code scheduled to run after an async-http-client request using .toCompletableFuture().thenAccept(...) or similar methods.

Steps to reproduce

See our minimal reproduction.

Enable at least the executors, netty and async-http-client instrumentations.

Set baggage in the current OpenTelemetry context, then make a network request with async-http-client and have the response handler observe the baggage in its OpenTelemetry context.

client
    .prepareGet("http://www.example.com/")
    .execute()
    .toCompletableFuture()
    .thenAccept(response -> {
        LOGGER.info("Response status code: {}", response.getStatusCode());
        LOGGER.info("Baggage now in `thenAccept` is {}", Baggage.current());
        observedBaggage.set(Baggage.current());
    })
    .get();

Expected behavior

We expected the response handling code to observe the same baggage that was previously set. In conjunction with the executors-instrumentation, we did not expect to have to manually propagate context in such scenarios. We expected that the callback would observe the same baggage as was observable by the code that initiated the request. This surprise is compounded when disabling the async-http-client instrumentation does allow context to be propagated.

Actual behavior

No context (e.g. baggage, trace) is present when the callback runs.

In actual fact, this does not appear to be strictly true. If somehow the entire HTTP request has already completed by the time .thenAccept(...) is invoked, the callback appears to be called synchronously on the current thread. This was observed when stepping through the code in a debugger, but does not typically occur in practice. This is even more upsetting: the context observed is in fact non-deterministic.

Javaagent or library instrumentation version

Agent v2.11.0 (also with v2.0.0)

Environment

JDK: temurin-22.0.1+8
OS: MacOS 14.7.1 (23H222)

Additional context

We tried this out with various combinations of instrumentations, as shown in the table below. It appears that it is the netty instrumentation that is actually propagating the context in the scenarios where it works, while the async-http-client instrumentation appears to be clobbering the context before netty can propagate it. We tested with and without the executors instrumentation but it doesn't make a difference.

Instrumentations         Executors on   Executors off   Executors on   Executors off
                         Netty on       Netty on        Netty off      Netty on
Async HTTP Client on       ❌FAIL         ❌FAIL          ❌FAIL         ❌FAIL
Async HTTP Client off      ❌FAIL         ✅PASS          ❌FAIL         ✅PASS

I'm not 100% clear on what is the intended behaviour, but this has surprised everyone here that's looked at it. We had hoped that tracing and baggage would propagate automatically without code changes so long as all relevant instrumentations were enabled.

This doesn't seem to be a recent change, and may have always behaved like this. We observed this first with opentelemetry-javaagent v2.0.0 and tested it again in the current v2.11.0.

@annettejanewilson annettejanewilson added bug Something isn't working needs triage New issue that requires triage labels Jan 10, 2025
@annettejanewilson annettejanewilson changed the title Surprising loss of opentelemetry context using async-http-client Loss of context using async-http-client Jan 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs triage New issue that requires triage
Projects
None yet
1 participant