Frapschen
Contributor

What type of PR is this?
/kind feature

What this PR does / why we need it:
Set up tracing instrumentation and update the manifests of the inferencepool chart.

This is the initial PR for tracing, which simply sets up a global TracerProvider via an init function. Subsequent tracing-related PRs can then focus solely on adding spans:

ctx, span := tracer.Start(r.Context(), "hello-span")
defer span.End()

// do some work to track with hello-span

Which issue(s) this PR fixes:
issue: #1520

Does this PR introduce a user-facing change?:

Set up tracing instrumentation and update the manifests of the inferencepool chart.

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 23, 2025

netlify bot commented Sep 23, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit c6a6e42
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/68e867fc2c18e6000884b062
😎 Deploy Preview https://deploy-preview-1638--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 23, 2025
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Sep 23, 2025
@k8s-ci-robot
Contributor

Hi @Frapschen. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 23, 2025
Contributor

@JeffLuoo JeffLuoo left a comment

Just one nit: there is a typo in the logs, where MeterProvider should be replaced with TracerProvider. Otherwise LGTM. I'll let @liu-cong, @nirrozenbaum, and @ahg-g take another look.

@ahg-g
Contributor

ahg-g commented Sep 25, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 25, 2025
@Frapschen
Contributor Author

The error is:

go: downloading github.com/x448/float16 v0.8.4
go: downloading golang.org/x/text v0.17.0
# golang.org/x/tools/internal/tokeninternal
../../../pkg/mod/golang.org/x/[email protected]/internal/tokeninternal/tokeninternal.go:64:9: invalid array length -delta * delta (constant -256 of type int64)
make: *** [Makefile:400: /home/prow/go/src/sigs.k8s.io/gateway-api-inference-extension/bin/controller-gen] Error 1
+ EXIT_VALUE=2
+ set +o xtrace
Cleaning up after docker in docker.
================================================================================
Waiting 30 seconds for pods stopped with terminationGracePeriod:30

Running make test works fine on my Mac.

@JeffLuoo
Contributor

/retest

Contributor

@liu-cong liu-cong left a comment

@JeffLuoo If telemetry.go looks good to you, can you LGTM it? I can LGTM the flag and Helm changes.

enabled: false
trace:
  enabled: false
  otelExporterEndpoint: "http://localhost:4317"
Contributor

Do we think it's "rare" for users to configure the endpoint and sampling params? If so, I suggest not including them here and letting those rare, advanced users configure them directly via the env var section.

Contributor Author

@Frapschen Frapschen Sep 30, 2025

@liu-cong Most of the time, users should set a real OTel collector address, so I think it's a very useful setting.

Sampling is also a commonly configured tracing setting.
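For illustration, the two settings under discussion could be resolved roughly as sketched below. This is not the code from this PR: the traceConfig struct, the resolveTraceConfig helper, and the default ratio of 1.0 are assumptions made for the example. The only names taken from elsewhere are the standard OpenTelemetry environment variables OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_TRACES_SAMPLER_ARG, and the http://localhost:4317 default shown in the chart values.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// traceConfig holds the two settings discussed in this thread.
// The struct and its field names are hypothetical, not taken from the PR.
type traceConfig struct {
	Endpoint    string  // address of the OTLP collector
	SampleRatio float64 // fraction of traces to keep, in [0, 1]
}

// resolveTraceConfig reads the standard OpenTelemetry environment variables
// through lookup and keeps the defaults when a value is unset or invalid.
// The default ratio of 1.0 is an assumption made for this sketch.
func resolveTraceConfig(lookup func(string) (string, bool)) traceConfig {
	cfg := traceConfig{Endpoint: "http://localhost:4317", SampleRatio: 1.0}
	if v, ok := lookup("OTEL_EXPORTER_OTLP_ENDPOINT"); ok && v != "" {
		cfg.Endpoint = v
	}
	if v, ok := lookup("OTEL_TRACES_SAMPLER_ARG"); ok {
		// Accept only a parseable ratio inside [0, 1]; NaN fails both comparisons.
		if r, err := strconv.ParseFloat(v, 64); err == nil && r >= 0 && r <= 1 {
			cfg.SampleRatio = r
		}
	}
	return cfg
}

func main() {
	cfg := resolveTraceConfig(os.LookupEnv)
	fmt.Printf("endpoint=%s ratio=%.2f\n", cfg.Endpoint, cfg.SampleRatio)
}
```

Passing the lookup function as a parameter keeps the helper easy to exercise without touching the process environment.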

Contributor

+1 that the endpoint and the sampling rate are commonly configured fields.

Contributor

Sorry, naive question: does the above mean we expect a process to be listening within the pod on port 4317? Who starts it? Also, what does a "real OTel collector" mean? An external service?

Contributor

The user is expected to start an OpenTelemetry Collector that exposes ports 4317 and 4318 for gRPC and HTTP requests, respectively.

The SDK then sends traces to the collector, which forwards them to other backends. So yes, it's an external service that requires manual deployment.
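As a rough sketch of the port convention described here, the helper below derives both addresses from a collector host name. The collectorEndpoints function and the otel-collector.observability.svc host are hypothetical. The /v1/traces path is the default OTLP/HTTP traces path, and plain http is assumed for the scheme.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

const (
	otlpGRPCPort = "4317" // conventional OTLP/gRPC port
	otlpHTTPPort = "4318" // conventional OTLP/HTTP port
)

// collectorEndpoints returns the gRPC target and the OTLP/HTTP traces URL
// for a collector reachable at host. It assumes the collector listens on
// the conventional ports without TLS.
func collectorEndpoints(host string) (grpcTarget, httpTracesURL string) {
	grpcTarget = net.JoinHostPort(host, otlpGRPCPort)
	u := url.URL{
		Scheme: "http",
		Host:   net.JoinHostPort(host, otlpHTTPPort),
		Path:   "/v1/traces",
	}
	return grpcTarget, u.String()
}

func main() {
	// Hypothetical in-cluster service name for a user-deployed collector.
	grpcTarget, httpURL := collectorEndpoints("otel-collector.observability.svc")
	fmt.Println("gRPC:", grpcTarget)
	fmt.Println("HTTP:", httpURL)
}
```

With the chart default of localhost, the same helper yields localhost:4317 for gRPC, which only resolves if a collector runs in the same pod or on the same host.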

Contributor

So the default we expect is for the user to start it as a sidecar?

Contributor

@liu-cong liu-cong left a comment

/lgtm

@liu-cong
Contributor

/lgtm cancel

I have a question about the error return; I left a comment.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 30, 2025
@Frapschen Frapschen requested a review from liu-cong September 30, 2025 04:20
Contributor

@JeffLuoo JeffLuoo left a comment

Just a few nits from my side. Other than that it LGTM. @liu-cong please give it another look.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Frapschen, JeffLuoo
Once this PR has been reviewed and has the lgtm label, please assign danehans for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@liu-cong
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 30, 2025
@JeffLuoo
Contributor

JeffLuoo commented Oct 6, 2025

@ahg-g for review and approval

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 7, 2025
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 9, 2025
@k8s-ci-robot
Contributor

New changes are detected. LGTM label has been removed.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 9, 2025
@Frapschen
Contributor Author

/test pull-gateway-api-inference-extension-test-unit-main

@JeffLuoo
Contributor

JeffLuoo commented Oct 9, 2025

/retest

@Frapschen Frapschen requested review from ahg-g and JeffLuoo October 10, 2025 01:57
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
6 participants