Skip to content

Conversation

tylfin
Copy link
Member

@tylfin tylfin commented Jul 30, 2025

This came up again so I'll backport to 2.21 - https://datadoghq.atlassian.net/browse/DEBUG-4257


While dogfooding, we noticed that code origins was still sending third-party code for bazel, e.g. this path:

/app/domains/dashboardsnotebooks_backend/sharing/apps/apis/shared_dd_query/shared_dd_query.runfiles/dd_source/external/pypi_linux_x86_64_ddtrace/site-packages/ddtrace/_trace/tracer.py

We have code specifically to handle the *.runfiles path, but we pass the string to a Path and call the resolve which removes the *.runfiles segment.

To fix this, we update the path resolution logic in ddtrace/internal/packages.py to improve handling of symlinks and add a new test to ensure correct behavior when symlinks are involved.

  • ddtrace/internal/packages.py: Updated _root_module to use path instead of path.resolve() to ensure symlinks are searched properly. Similarly, updated filename_to_package to use path for consistent path handling. [1] [2]

  • tests/internal/test_packages.py: Added test_third_party_packages_symlinks to verify that symlinks, particularly those with .runfiles in their paths, are correctly excluded from being identified as user code. This ensures the logic remains robust in environments with symlinked directories.

Refs: DEBUG-3926, DEBUG-4257

(cherry picked from commit 197dbe2)

Checklist

  • PR author has checked that all the criteria below are met
  • The PR description includes an overview of the change
  • The PR description articulates the motivation for the change
  • The change includes tests OR the PR description describes a testing strategy
  • The PR description notes risks associated with the change, if any
  • Newly-added code is easy to change
  • The change follows the library release note guidelines
  • The change includes or references documentation updates if necessary
  • Backport labels are set (if applicable)

Reviewer Checklist

  • Reviewer has checked that all the criteria below are met
  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Newly-added code is easy to change
  • Release note makes sense to a user of the library
  • If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

While dogfooding, we noticed that code origins was still sending
third-party code for bazel, e.g. this path:

```
/app/domains/dashboardsnotebooks_backend/sharing/apps/apis/shared_dd_query/shared_dd_query.runfiles/dd_source/external/pypi_linux_x86_64_ddtrace/site-packages/ddtrace/_trace/tracer.py
```

We have code specifically to handle the `*.runfiles` path, but we pass
the string to a Path and call the resolve which removes the `*.runfiles`
segment.

To fix this, we update the path resolution logic in
`ddtrace/internal/packages.py` to improve handling of symlinks and add a
new test to ensure correct behavior when symlinks are involved.

*
[`ddtrace/internal/packages.py`](diffhunk://#diff-362935abfa156805410cbcdbfa453d011dbc4c77ce21a20c68402ec130c2e76fL116-R117):
Updated `_root_module` to use `path` instead of `path.resolve()` to
ensure symlinks are searched properly. Similarly, updated
`filename_to_package` to use `path` for consistent path handling.
[[1]](diffhunk://#diff-362935abfa156805410cbcdbfa453d011dbc4c77ce21a20c68402ec130c2e76fL116-R117)
[[2]](diffhunk://#diff-362935abfa156805410cbcdbfa453d011dbc4c77ce21a20c68402ec130c2e76fL225-R226)

*
[`tests/internal/test_packages.py`](diffhunk://#diff-bd71856276db697cbc2cd49e26c7598838b0ca0c67af35876ad8f8944014cd10R107-R136):
Added `test_third_party_packages_symlinks` to verify that symlinks,
particularly those with `.runfiles` in their paths, are correctly
excluded from being identified as user code. This ensures the logic
remains robust in environments with symlinked directories.

Refs: DEBUG-3926, DEBUG-4257

(cherry picked from commit 197dbe2)
@tylfin tylfin requested review from a team as code owners July 30, 2025 19:13
@tylfin tylfin requested a review from emmettbutler July 30, 2025 19:13
Copy link
Contributor

CODEOWNERS have been resolved as:

ddtrace/internal/packages.py                                            @DataDog/apm-core-python
tests/debugging/origin/test_span.py                                     @DataDog/debugger-python
tests/internal/test_packages.py                                         @DataDog/apm-core-python

@tylfin tylfin added the changelog/no-changelog A changelog entry is not required for this PR. label Jul 30, 2025
@tylfin tylfin changed the title feat(packages): symlinked .runfile pkgs should not be usercode (#13474) feat(packages): symlinked .runfile pkgs should not be usercode (#13474) [backport 2.21] Jul 30, 2025
@tylfin tylfin enabled auto-merge (squash) August 1, 2025 14:38
@github-actions github-actions bot added the stale label Aug 31, 2025
Copy link
Contributor

github-actions bot commented Sep 1, 2025

This pull request has been automatically closed after a period of inactivity.
After this much time, it will likely be easier to open a new pull request with the
same changes than to update this one from the base branch. Please comment or reopen
if you think this pull request was closed in error.

@github-actions github-actions bot closed this Sep 1, 2025
auto-merge was automatically disabled September 1, 2025 00:09

Pull request was closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog/no-changelog A changelog entry is not required for this PR. stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants