-
Notifications
You must be signed in to change notification settings - Fork 405
Revert "Switch to OpenTracing's ContextVarsScopeManager (#18849)"
#19007
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revert "Switch to OpenTracing's ContextVarsScopeManager (#18849)"
#19007
Conversation
This reverts commit 27fc338. Conflicts: synapse/logging/context.py tests/logging/test_opentracing.py
…_available` This test is still expected to fail with our current `LogContextScopeManager` implementation but there were some flaws in how we were handling the logcontext before (like pumping the reactor in a non-sentinel logcontext)
…pe_still_available`" This reverts commit 500a187.
| own previous custom `_LogContextScope.close(...)` would clear | ||
| `LoggingContext.scope` preventing further tracing spans from having the correct | ||
| parent. | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I think ideally this test should work and it would be nice to keep.
It's expected to fail with our current implementation of LogContextScopeManager so it makes sense to remove in the revert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exploring if we can keep this test and get this functionality with LogContextScopeManager in #19008
This functionality was originally introduced in #18932 but previously worked fine because we were using the `ContextVarsScopeManager` which had no problem tracking the tracing scope across logcontext boundaries. Now that we've switched back to our own custom `LogContextScopeManager` which tracks the tracing scope in the logcontext itself, we need to get the active tracing span before we switch logcontext. (we switch to the sentinel logcontext because of the `PreserveLoggingContext()`)
| # Since we track the tracing scope in the `LoggingContext`, before we move to the | ||
| # sentinel logcontext (or a new new `LoggingContext`), grab the currently active | ||
| # tracing span (if any) so that we can create a cross-link to the background process | ||
| # trace. | ||
| original_active_tracing_span = active_span(tracer=test_only_tracer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to maintain the functionality introduced in #18932 (cross-links between the background process trace and currently active trace), we need this change.
Previously, when we were using ContextVarsScopeManager, it tracked the tracing scope across the logcontext changes without issue. Now that we're using our own custom LogContextScopeManager again, we need to capture the active span from the logcontext before we reset to the sentinel context because of the PreserveLoggingContext() below
| ["fixture_awaitable_return_func"], | ||
| ) | ||
|
|
||
| async def test_run_as_background_process_standalone(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some tests to ensure we maintain the run_as_background tracing behavior regardless of the tracing scope manager we use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this was the best choice, though somewhat sad to see.
Good idea to salvage what we can in a separate PR.
Co-authored-by: Andrew Morgan <[email protected]>
Co-authored-by: Andrew Morgan <[email protected]>
|
Thanks for the review @anoadragon453 🐋 |
Deployments that make use of the [synapse-s3-storage-provider](https://github.com/matrix-org/synapse-s3-storage-provider) module must upgrade to [v1.6.0](https://github.com/matrix-org/synapse-s3-storage-provider/releases/tag/v1.6.0). Using older versions of the module with this release of Synapse will prevent users from being able to upload or download media. No significant changes since 1.140.0rc1. - Add [a new Media Query by ID Admin API](https://element-hq.github.io/synapse/v1.140/admin_api/media_admin_api.html#query-a-piece-of-media-by-id) that allows server admins to query and investigate the metadata of local or cached remote media via the `origin/media_id` identifier found in a [Matrix Content URI](https://spec.matrix.org/v1.14/client-server-api/#matrix-content-mxc-uris). ([\element-hq#18911](element-hq#18911)) - Add [a new Fetch Event Admin API](https://element-hq.github.io/synapse/v1.140/admin_api/fetch_event.html) to fetch an event by ID. ([\element-hq#18963](element-hq#18963)) - Update [MSC4284: Policy Servers](matrix-org/matrix-spec-proposals#4284) implementation to support signatures when available. ([\element-hq#18934](element-hq#18934)) - Add experimental implementation of the `GET /_matrix/client/v1/rtc/transports` endpoint for the latest draft of [MSC4143: MatrixRTC](matrix-org/matrix-spec-proposals#4143). ([\element-hq#18967](element-hq#18967)) - Expose a `defer_to_threadpool` function in the Synapse Module API that allows modules to run a function on a separate thread in a custom threadpool. ([\element-hq#19032](element-hq#19032)) - Fix room upgrade `room_config` argument and documentation for `user_may_create_room` spam-checker callback. ([\element-hq#18721](element-hq#18721)) - Compute a user's last seen timestamp from their devices' last seen timestamps instead of IPs, because the latter are automatically cleared according to `user_ips_max_age`. ([\element-hq#18948](element-hq#18948)) - Fix bug where ephemeral events were not filtered by room ID. Contributed by @frastefanini. ([\element-hq#19002](element-hq#19002)) - Update Synapse main process version string to include git info. ([\element-hq#19011](element-hq#19011)) - Explain how `Deferred` callbacks interact with logcontexts. ([\element-hq#18914](element-hq#18914)) - Fix documentation for `rc_room_creation` and `rc_reports` to clarify that a `per_user` rate limit is not supported. ([\element-hq#18998](element-hq#18998)) - Remove deprecated `LoggingContext.set_current_context`/`LoggingContext.current_context` methods which already have equivalent bare methods in `synapse.logging.context`. ([\element-hq#18989](element-hq#18989)) - Drop support for unstable field names from the long-accepted [MSC2732](matrix-org/matrix-spec-proposals#2732) (Olm fallback keys) proposal. ([\element-hq#18996](element-hq#18996)) - Cleanly shutdown `SynapseHomeServer` object, allowing artifacts of embedded small hosts to be properly garbage collected. ([\element-hq#18828](element-hq#18828)) - Update OEmbed providers to use 'X' instead of 'Twitter' in URL previews, following a rebrand. Contributed by @HammyHavoc. ([\element-hq#18767](element-hq#18767)) - Fix `server_name` in logging context for multiple Synapse instances in one process. ([\element-hq#18868](element-hq#18868)) - Wrap the Rust HTTP client with `make_deferred_yieldable` so it follows Synapse logcontext rules. ([\element-hq#18903](element-hq#18903)) - Fix the GitHub Actions workflow that moves issues labeled "X-Needs-Info" to the "Needs info" column on the team's internal triage board. ([\element-hq#18913](element-hq#18913)) - Disconnect background process work from request trace. ([\element-hq#18932](element-hq#18932)) - Reduce overall number of calls to `_get_e2e_cross_signing_signatures_for_devices` by increasing the batch size of devices the query is called with, reducing DB load. ([\element-hq#18939](element-hq#18939)) - Update error code used when an appservice tries to masquerade as an unknown device using [MSC4326](matrix-org/matrix-spec-proposals#4326). Contributed by @tulir @ Beeper. ([\element-hq#18947](element-hq#18947)) - Fix `no active span when trying to log` tracing error on startup (when OpenTracing is enabled). ([\element-hq#18959](element-hq#18959)) - Fix `run_coroutine_in_background(...)` incorrectly handling logcontext. ([\element-hq#18964](element-hq#18964)) - Add debug logs wherever we change current logcontext. ([\element-hq#18966](element-hq#18966)) - Update dockerfile metadata to fix broken link; point to documentation website. ([\element-hq#18971](element-hq#18971)) - Note that the code is additionally licensed under the [Element Commercial license](https://github.com/element-hq/synapse/blob/develop/LICENSE-COMMERCIAL) in SPDX expression field configs. ([\element-hq#18973](element-hq#18973)) - Fix logcontext handling in `timeout_deferred` tests. ([\element-hq#18974](element-hq#18974)) - Remove internal `ReplicationUploadKeysForUserRestServlet` as a follow-up to the work in element-hq#18581 that moved device changes off the main process. ([\element-hq#18988](element-hq#18988)) - Switch task scheduler from raw logcontext manipulation to using the dedicated logcontext utils. ([\element-hq#18990](element-hq#18990)) - Remove `MockClock()` in tests. ([\element-hq#18992](element-hq#18992)) - Switch back to our own custom `LogContextScopeManager` instead of OpenTracing's `ContextVarsScopeManager` which was causing problems when using the experimental `SYNAPSE_ASYNC_IO_REACTOR` option with tracing enabled. ([\element-hq#19007](element-hq#19007)) - Remove `version_string` argument from `HomeServer` since it's always the same. ([\element-hq#19012](element-hq#19012)) - Remove duplicate call to `hs.start_background_tasks()` introduced from a bad merge. ([\element-hq#19013](element-hq#19013)) - Split homeserver creation (`create_homeserver`) and setup (`setup`). ([\element-hq#19015](element-hq#19015)) - Swap near-end-of-life `macos-13` GitHub Actions runner for the `macos-15-intel` variant. ([\element-hq#19025](element-hq#19025)) - Introduce `RootConfig.validate_config()` which can be subclassed in `HomeServerConfig` to do cross-config class validation. ([\element-hq#19027](element-hq#19027)) - Allow any command of the `release.py` script to accept a `--gh-token` argument. ([\element-hq#19035](element-hq#19035)) * Bump Swatinem/rust-cache from 2.8.0 to 2.8.1. ([\element-hq#18949](element-hq#18949)) * Bump actions/cache from 4.2.4 to 4.3.0. ([\element-hq#18983](element-hq#18983)) * Bump anyhow from 1.0.99 to 1.0.100. ([\element-hq#18950](element-hq#18950)) * Bump authlib from 1.6.3 to 1.6.4. ([\element-hq#18957](element-hq#18957)) * Bump authlib from 1.6.4 to 1.6.5. ([\element-hq#19019](element-hq#19019)) * Bump bcrypt from 4.3.0 to 5.0.0. ([\element-hq#18984](element-hq#18984)) * Bump docker/login-action from 3.5.0 to 3.6.0. ([\element-hq#18978](element-hq#18978)) * Bump lxml from 6.0.0 to 6.0.2. ([\element-hq#18979](element-hq#18979)) * Bump phonenumbers from 9.0.13 to 9.0.14. ([\element-hq#18954](element-hq#18954)) * Bump phonenumbers from 9.0.14 to 9.0.15. ([\element-hq#18991](element-hq#18991)) * Bump prometheus-client from 0.22.1 to 0.23.1. ([\element-hq#19016](element-hq#19016)) * Bump pydantic from 2.11.9 to 2.11.10. ([\element-hq#19017](element-hq#19017)) * Bump pygithub from 2.7.0 to 2.8.1. ([\element-hq#18952](element-hq#18952)) * Bump regex from 1.11.2 to 1.11.3. ([\element-hq#18981](element-hq#18981)) * Bump serde from 1.0.224 to 1.0.226. ([\element-hq#18953](element-hq#18953)) * Bump serde from 1.0.226 to 1.0.228. ([\element-hq#18982](element-hq#18982)) * Bump setuptools-rust from 1.11.1 to 1.12.0. ([\element-hq#18980](element-hq#18980)) * Bump twine from 6.1.0 to 6.2.0. ([\element-hq#18985](element-hq#18985)) * Bump types-pyyaml from 6.0.12.20250809 to 6.0.12.20250915. ([\element-hq#19018](element-hq#19018)) * Bump types-requests from 2.32.4.20250809 to 2.32.4.20250913. ([\element-hq#18951](element-hq#18951)) * Bump typing-extensions from 4.14.1 to 4.15.0. ([\element-hq#18956](element-hq#18956))
Revert #18849
Go back to our custom
LogContextScopeManagerafter trying OpenTracing'sContextVarsScopeManager.Fix #19004
Why revert?
For reference, with the normal reactor,
ContextVarsScopeManagerworked just as good as our customLogContextScopeManageras far as I can tell (and even better in some cases). But since Twisted appears to not fully supportContextVar's, it doesn't work as expected in all cases. Compounding things,ContextVarsScopeManagerwas causing errors with the experimentalSYNAPSE_ASYNC_IO_REACTORoption.Since we're not getting the full benefit that we originally desired, we might as well revert and figure out alternatives for extending the logcontext lifetimes to support the use case we were trying to unlock (c.f. #18804).
See #19004 (comment) for more info.
Does this require backporting and patch releases?
No. Since
ContextVarsScopeManageroperates just as good with the normal reactor and was only causing actual errors with the experimentalSYNAPSE_ASYNC_IO_REACTORoption, I don't think this requires us to backport and make patch releases at all.Pull Request Checklist
EventStoretoEventWorkerStore.".code blocks.