Skip to content

chore(llmobs): dac strip io from vertex #13693

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

jsimpher
Copy link
Contributor

@jsimpher jsimpher commented Jun 17, 2025

Remove potentially sensitive i/o data from apm spans. This way, prompt and completion data will only appear on the llm obs spans, which are/will be subject to data access controls.

Mostly, this just removes io tag sets. A few things (mostly metrics) have llmobs tags dependent on span tags, so there is a bit more refactoring there.

Let me know if I removed anything that should really stay, or if I missed something that should be restricted.

Checklist

  • PR author has checked that all the criteria below are met
  • The PR description includes an overview of the change
  • The PR description articulates the motivation for the change
  • The change includes tests OR the PR description describes a testing strategy
  • The PR description notes risks associated with the change, if any
  • Newly-added code is easy to change
  • The change follows the library release note guidelines
  • The change includes or references documentation updates if necessary
  • Backport labels are set (if applicable)

Reviewer Checklist

  • Reviewer has checked that all the criteria below are met
  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Newly-added code is easy to change
  • Release note makes sense to a user of the library
  • If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

@jsimpher jsimpher changed the title Jsimpher/dac strip io from vertex dac strip io from vertex Jun 17, 2025
Copy link
Contributor

github-actions bot commented Jun 17, 2025

CODEOWNERS have been resolved as:

.github/CODEOWNERS                                                      @DataDog/python-guild @DataDog/apm-core-python
ddtrace/contrib/internal/vertexai/_utils.py                             @DataDog/ml-observability
ddtrace/contrib/internal/vertexai/patch.py                              @DataDog/ml-observability
ddtrace/llmobs/_integrations/utils.py                                   @DataDog/ml-observability
ddtrace/llmobs/_integrations/vertexai.py                                @DataDog/ml-observability
tests/snapshots/tests.contrib.vertexai.test_vertexai.test_vertexai_completion.json  @DataDog/apm-python @DataDog/ml-observability
tests/snapshots/tests.contrib.vertexai.test_vertexai.test_vertexai_completion_error.json  @DataDog/apm-python @DataDog/ml-observability
tests/snapshots/tests.contrib.vertexai.test_vertexai.test_vertexai_completion_multiple_messages.json  @DataDog/apm-python @DataDog/ml-observability
tests/snapshots/tests.contrib.vertexai.test_vertexai.test_vertexai_completion_stream.json  @DataDog/apm-python @DataDog/ml-observability
tests/snapshots/tests.contrib.vertexai.test_vertexai.test_vertexai_completion_stream_error.json  @DataDog/apm-python @DataDog/ml-observability
tests/snapshots/tests.contrib.vertexai.test_vertexai.test_vertexai_completion_stream_tool.json  @DataDog/apm-python @DataDog/ml-observability
tests/snapshots/tests.contrib.vertexai.test_vertexai.test_vertexai_completion_system_prompt.json  @DataDog/apm-python @DataDog/ml-observability
tests/snapshots/tests.contrib.vertexai.test_vertexai.test_vertexai_completion_tool.json  @DataDog/apm-python @DataDog/ml-observability

Copy link
Contributor

github-actions bot commented Jun 17, 2025

Bootstrap import analysis

Comparison of import times between this PR and base.

Summary

The average import time from this PR is: 281 ± 5 ms.

The average import time from base is: 281 ± 5 ms.

The import time difference between this PR and base is: 0.1 ± 0.2 ms.

The difference is not statistically significant (z = 0.27).

Import time breakdown

The following import paths have shrunk:

ddtrace.auto 1.826 ms (0.65%)
ddtrace.bootstrap.sitecustomize 1.154 ms (0.41%)
ddtrace.bootstrap.preload 1.154 ms (0.41%)
ddtrace.internal.remoteconfig.client 0.615 ms (0.22%)
ddtrace 0.672 ms (0.24%)
ddtrace.internal._unpatched 0.028 ms (0.01%)
json 0.028 ms (0.01%)
json.decoder 0.028 ms (0.01%)
re 0.028 ms (0.01%)
enum 0.028 ms (0.01%)
types 0.028 ms (0.01%)

@pr-commenter
Copy link

pr-commenter bot commented Jun 17, 2025

Benchmarks

Benchmark execution time: 2025-06-25 15:29:34

Comparing candidate commit 1e5ac47 in PR branch jsimpher/dac-strip-io-from-vertex with baseline commit 40f2c37 in branch main.

Found 0 performance improvements and 2 performance regressions! Performance is the same for 559 metrics, 3 unstable metrics.

scenario:iastaspects-replace_aspect

  • 🟥 execution_time [+529.696ns; +610.039ns] or [+11.261%; +12.969%]

scenario:iastaspectsospath-ospathsplitdrive_aspect

  • 🟥 execution_time [+392.961ns; +447.254ns] or [+10.806%; +12.299%]

@jsimpher jsimpher changed the title dac strip io from vertex chore(llmobs): dac strip io from vertex Jun 22, 2025
@@ -3,6 +3,7 @@
from typing import Dict

import vertexai
from vertexai.generative_models import GenerativeModel # noqa:F401
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this import (which used to exist on ./_utils.py but is no longer used there) breaks tests. As far as I can tell, it may have to do with vertex lazy loading, meaning that .generative_models may not exist yet when patching.
If this is a known thing that we have a standard way to deal with, let me know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm interesting, I would probably put this import in the patch function directly (as long as that works) so that we only import it when we actually do the patching.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still fails. Looking closer, it will actually fail on the assertion that we are unpatched before calling .patch, since the thing we check isn't imported. There might be a way to move this to the test itself (a little tricky since its the base tests, not the vertex ones). Do you think that would be worth it?

@jsimpher jsimpher marked this pull request as ready for review June 24, 2025 16:01
@jsimpher jsimpher requested review from a team as code owners June 24, 2025 16:01
@jsimpher jsimpher requested review from juanjux and mabdinur June 24, 2025 16:01
Comment on lines -3 to -10
from vertexai.generative_models import GenerativeModel
from vertexai.generative_models import Part

from ddtrace.internal.utils import get_argument_value
from ddtrace.llmobs._integrations.utils import get_generation_config_google
from ddtrace.llmobs._integrations.utils import get_system_instructions_from_google_model
from ddtrace.llmobs._integrations.utils import tag_request_content_part_google
from ddtrace.llmobs._integrations.utils import tag_response_part_google
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we are still using these helpers in the google generative ai integration. Once we remove the APM tagging there, we can get rid of these helpers altogether :)

Copy link
Contributor

@ncybul ncybul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments which should hopefully be quick. Going to approve for now, but if there's anything you are unsure about, feel free to reach out for another review!

@brettlangdon
Copy link
Member

This should probably get a release note?

@jsimpher jsimpher requested review from a team as code owners June 25, 2025 14:39
@@ -177,6 +175,8 @@ tests/contrib/crewai @DataDog/ml-observ
tests/contrib/openai_agents @DataDog/ml-observability
tests/contrib/litellm @DataDog/ml-observability
.gitlab/tests/llmobs.yml @DataDog/ml-observability
# MLObs snapshot tests
tests/snapshots/tests.contrib.vertexai.* @DataDog/apm-python @DataDog/ml-observability
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think apm-python needs to be included on this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ended up pulling this into its own pr to reassign all the llmobs integration test snapshots; with your feedback integrated

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants