feat: add more spans to opentelemetry plugin#12686
Conversation
Signed-off-by: Nic <[email protected]>
Signed-off-by: Nic <[email protected]>
Signed-off-by: Nic <[email protected]>
dba7342 to
d3f9bd4
Compare
9e29643 to
9616ff5
Compare
There was a problem hiding this comment.
Pull request overview
Adds a new “comprehensive request lifecycle tracing” mode for the OpenTelemetry integration, instrumenting more APISIX internals and documenting/testing the resulting spans.
Changes:
- Introduces a core tracing utility (
apisix/tracer.lua) and span model (apisix/utils/span.lua) to capture APISIX-internal spans across phases. - Instruments multiple request lifecycle points (SSL/SNI matching, DNS resolve, plugin/global-rule execution, phases) and injects them into OpenTelemetry export.
- Adds
apisix.tracingconfig flag, updates EN/ZH docs, and adds a new tracing-focused test.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
t/plugin/opentelemetry6.t |
New test covering additional spans being emitted and exported. |
docs/zh/latest/plugins/opentelemetry.md |
Documents enabling comprehensive lifecycle tracing and shows updated collector output. |
docs/en/latest/plugins/opentelemetry.md |
Same as ZH docs, for English readers. |
conf/config.yaml.example |
Adds apisix.tracing example + explanation. |
apisix/cli/config.lua |
Adds default apisix.tracing = false in CLI default config. |
apisix/tracer.lua |
New internal tracer used to create/finish APISIX core spans. |
apisix/utils/span.lua |
New span container type used by the internal tracer. |
apisix/init.lua |
Starts/finishes internal spans across HTTP + SSL phases; finishes all spans in log phase and releases tracing state. |
apisix/plugin.lua |
Wraps plugin execution + global rules with spans. |
apisix/utils/upstream.lua |
Adds a resolve_dns span around domain resolution. |
apisix/ssl/router/radixtree_sni.lua |
Adds a sni_radixtree_match span around SNI routing. |
apisix/secret.lua |
Adds a fetch_secret span around secret retrieval. |
apisix/plugins/opentelemetry.lua |
Adds attributes + injects internal APISIX spans into OpenTelemetry export in log phase. |
apisix/core/response.lua |
Attempts to mark spans as error on >= 400 exits. |
Comments suppressed due to low confidence (1)
apisix/secret.lua:157
- A span is started (
span = tracer.start(...)) but on thepcall(require, ...)failure path the function returns without finishing the span, which will leak an unfinished span intongx.ctx.tracing. Finish the span with an error status before returning.
local span = tracer.start(ngx.ctx, "fetch_secret", tracer.kind.client)
local ok, sm = pcall(require, "apisix.secret." .. opts.manager)
if not ok then
return nil, "no secret manager: " .. opts.manager
end
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
apisix/secret.lua:157
- A span is started before
pcall(require, ...), but whenrequirefails this function returns without finishing the span. That leaves an open child span inngx.ctx.tracingand can skew parent/child tracking untilfinish_allruns. Finish the span on this error path as well (and set ERROR status/message).
local span = tracer.start(ngx.ctx, "fetch_secret", tracer.kind.client)
local ok, sm = pcall(require, "apisix.secret." .. opts.manager)
if not ok then
return nil, "no secret manager: " .. opts.manager
end
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
I have no more problems to say.
I've already expressed my views twice; this would be the third time.
A span is an instance. Whatever it means to Lua—a metatable or anything else—I don't care.
We can always terminate a span using the following way:
local span = tracer.start()
span:finsih(api_ctx.ngx_ctx or ngx.ctx, status, message)
-- Even `api_ctx.ngx_ctx` can be omitted, since `ngx.ctx` is always available in any context where `api_ctx` can be accessed.Simply attaching an identical metatable to the span table and obtaining the span instance from self is sufficient to achieve the same result internally as you are currently doing.
I don't understand why the API design persists in making function calls statelessly from the tracer. I've already repeated this three times, and I don't want to repeat it again.
Throughout the entire process, the PR never addressed my concerns regarding the API design.
I'm not sure how these design conflicts arose, as maintaining API compatibility with other popular implementations shouldn't be difficult. There is no external evidence to support this API design.
Since this API is unexpected for me, I won't approve this PR. However, I won't block the merge either—you'll need approval from other maintainers.
73eb186 to
83be609
Compare
Signed-off-by: Nic <[email protected]>
83be609 to
9be87a0
Compare
Description
Run jaeger in local by https://www.jaegertracing.io/docs/2.11/getting-started/#all-in-one
New Features
tracingconfiguration option to enable detailed observability.Tests
Checklist