Remove `sentinel` logcontext in `Clock` utilities (`looping_call`, `looping_call_now`, `call_later`) #18907

MadLittleMods · 2025-09-10T22:25:03Z

Remove sentinel logcontext in Clock utilities (looping_call, looping_call_now, call_later)

Lints to ensure we use Clock.call_later instead of reactor.callLater, etc. are coming in #18944

Testing strategy

1. Configure Synapse to log at the DEBUG level
2. Start Synapse: poetry run synapse_homeserver --config-path homeserver.yaml
3. Wait 10 seconds for the database profiling loop to execute
4. Notice the logcontext being used for the Total database time log line

Before (sentinel):

2025-09-10 16:36:58,651 - synapse.storage.TIME - 707 - DEBUG - sentinel - Total database time: 0.646% {room_forgetter_stream_pos(2): 0.131%, reap_monthly_active_users(1): 0.083%, get_device_change_last_converted_pos(1): 0.078%}

After (looping_call):

2025-09-10 16:36:58,651 - synapse.storage.TIME - 707 - DEBUG - looping_call - Total database time: 0.646% {room_forgetter_stream_pos(2): 0.131%, reap_monthly_active_users(1): 0.083%, get_device_change_last_converted_pos(1): 0.078%}

Dev notes

SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.util.test_logcontext.LoggingContextTestCase

Fix logcontext leak in http pusher test, a267c2e (Fix some logcontext leaks matrix-org/synapse#4204)

Other related fixes matrix-org/synapse#4209

Pull Request Checklist

Pull request is based on the develop branch
Pull request includes a changelog file. The entry should:
- Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
- Use markdown where necessary, mostly for code blocks.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
Code style is correct (run the linters)

MadLittleMods · 2025-09-11T01:24:38Z

tests/util/test_logcontext.py

        # test is done once d2 finishes
        return d2

+    @logcontext_clean


I added @logcontext_clean everywhere as it makes sense to fail when we see logcontext_error used in the logcontext tests.

Interestingly, I still see some logcontext_error logs (e.g. 2025-09-10 20:20:17-0500 [-] synapse.logging.context - 90 - WARNING - one - Re-starting finished log context one) when running these run_in_background tests, but the tests still aren't failing 🤔

These logs also happen on develop (even develop from days ago before any logcontext changes) so this is a long-standing problem.

Something to solve in another PR ⏩

MadLittleMods · 2025-09-11T01:46:54Z

synapse/util/__init__.py

-            with context.PreserveLoggingContext():
-                callback(*args, **kwargs)
-
-        with context.PreserveLoggingContext():


This PreserveLoggingContext() doesn't seem necessary but please double-check. We aren't calling anything at this point, only scheduling something for the reactor to run later.

Unless we think that a reactor implementation might call immediately when delay is 0 (reactor.callLater(0, func)). Although, I'm assuming this has the same semantics as setTimeout(code, 0) in JavaScript where it should be run as soon as possible but not right away (put it on the queue).

Please cross-check whether test_call_later is good enough to stress this.

seems fair to me. I can't rule out some weird non-obvious interaction but if that was the case, there should've been a comment to that effect.
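To illustrate the "queued, not called right away" semantics discussed in this thread, here is a minimal sketch. It uses the standard library's asyncio event loop purely as a stand-in, not Twisted's reactor, so it shows the general scheduling model rather than proving anything about a particular reactor implementation. The function name `demo_call_later_zero` is made up for this example.

```python
import asyncio
from typing import List


def demo_call_later_zero() -> List[str]:
    """Schedule a callback with a delay of 0 and record the order of events."""
    events: List[str] = []
    loop = asyncio.new_event_loop()
    try:
        # Scheduling only queues the callback; nothing is called here.
        loop.call_later(0, events.append, "callback ran")
        events.append("scheduled")
        # The callback only runs once the event loop gets a chance to iterate.
        loop.run_until_complete(asyncio.sleep(0.01))
    finally:
        loop.close()
    return events
```

If a reactor behaves the same way, no callback code runs while we are still inside the scheduling call, which is the reasoning for dropping the outer PreserveLoggingContext().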

MadLittleMods · 2025-09-11T02:24:49Z

synapse/util/__init__.py

+            assert context.current_context() is context.SENTINEL_CONTEXT, (
+                "Expected `call_later` callback from the reactor to start with the sentinel logcontext "
+                f"but saw {context.current_context()}. In other words, another task shouldn't have "
+                "leaked their logcontext to us."
+            )


Are we ok with breaking the world when we run into this kind of problem?

It would be much nicer to catch this and get a report (what we're doing now) than logging in the background.

For example, this has already caught the problem in our email pusher tests where we were leaking logcontexts -> 9116e74

MadLittleMods · 2025-09-11T02:26:09Z

tests/util/test_logcontext.py

+        self.assertTrue(
+            callback_finished,
+            "Callback never finished which means the test probably didn't wait long enough",
+        )


Added this to ensure we're actually getting through everything as expected.

Previously, we didn't run through the callback completely and missed the competing assertion for example.
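A small sketch of the `callback_finished` pattern, including the `try: ... finally:` wrapping from the later "Wrap callback to see underlying failure" commit. The `run_scheduled` helper is a toy stand-in for the reactor (an assumption for this example); the real test drives the callback through the test reactor instead.

```python
from typing import Callable, List, Tuple


def run_scheduled(callbacks: List[Callable[[], None]]) -> List[Exception]:
    """Toy stand-in for a reactor: run queued callbacks and collect their errors."""
    errors: List[Exception] = []
    for cb in callbacks:
        try:
            cb()
        except Exception as exc:
            errors.append(exc)
    return errors


def demo(fail: bool) -> Tuple[bool, List[Exception]]:
    callback_finished = False

    def competing_callback() -> None:
        nonlocal callback_finished
        try:
            if fail:
                # Stands in for a failing assertion inside the callback.
                raise ValueError("underlying failure")
        finally:
            # Always record that we got here, so a failure surfaces as the
            # underlying error rather than as "callback never finished".
            callback_finished = True

    errors = run_scheduled([competing_callback])
    return callback_finished, errors
```

With the `finally`, a failing callback still sets the flag, so the test reports the real error instead of the less helpful "didn't wait long enough" message.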

MadLittleMods · 2025-09-11T05:59:28Z

tests/push/test_email.py

            d: Deferred = Deferred()
            self.email_attempts.append((d, args, kwargs))
-            return d
+            return make_deferred_yieldable(d)


We were leaking the logcontext into the reactor.

This took some debug diving to find the culprit, but it turns out we already had the same fix in the HTTP pusher tests since a267c2e (matrix-org/synapse#4204)

Previously, this was causing some tests to fail:

SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.push.test_email.EmailPusherTests.test_invite_to_empty_room_sends_email

_trial_temp/test.log

2025-09-11 01:01:23-0500 [-] synapse.util - 308 - ERROR - emailpush.process-2 - Looping call died
Traceback (most recent call last):
  File "/virtualenvs/matrix-synapse-xCtC9ulO-py3.13/lib/python3.13/site-packages/twisted/internet/defer.py", line 216, in maybeDeferred
    result = f(*args, **kwargs)
  File "synapse/synapse/util/__init__.py", line 202, in wrapped_f
    assert context.current_context() is context.SENTINEL_CONTEXT, (
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Expected `looping_call` callback from the reactor to start with the sentinel logcontext but saw emailpush.process-2. In other words, another task shouldn't have leaked their logcontext to us.

We caught this because we added that new sentinel logcontext assertion (discussed in another thread above)
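As a rough mental model of what "leaking the logcontext into the reactor" means, here is a toy sketch built on the standard library's contextvars. It does not use Twisted or Synapse; `leaky_send_email`, `careful_send_email` and `context_seen_after` are hypothetical names, and the reset in `careful_send_email` is only a loose analogue of what make_deferred_yieldable does when it hands back a pending Deferred.

```python
import contextvars
from typing import Callable

SENTINEL = "sentinel"
current = contextvars.ContextVar("logcontext", default=SENTINEL)


def leaky_send_email() -> str:
    # Runs under a named context and returns to the caller (the "reactor")
    # with that context still set.
    current.set("emailpush.process-2")
    return "pending"


def careful_send_email() -> str:
    token = current.set("emailpush.process-2")
    try:
        return "pending"
    finally:
        # Put the sentinel back before control returns to the "reactor".
        current.reset(token)


def context_seen_after(task: Callable[[], str]) -> str:
    """Return the context the next reactor callback would start with."""

    def reactor_turn() -> str:
        task()
        return current.get()

    # A fresh Context keeps each demo run isolated from the others.
    return contextvars.Context().run(reactor_turn)
```

In this model `context_seen_after(leaky_send_email)` yields the leaked name, which is the situation the new sentinel assertion trips on, while the careful variant leaves the sentinel in place.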

reivilibre · 2025-09-22T13:04:12Z

synapse/util/__init__.py

+        Note that the function will be called with generic `looping_call` logcontext, so
+        if it is anything other than a trivial task, you probably want to wrap it in
+        `run_as_background_process` to give it more specific label and track metrics.


this is not your problem to deal with, but I've often wondered why we don't just make looping_call take a name and do this wrapping itself. I can't think of a generic looping_call ctx ever being the desirable outcome.

A possible future 👍

reivilibre · 2025-09-22T13:06:04Z

synapse/util/__init__.py

-            with context.PreserveLoggingContext():
-                callback(*args, **kwargs)
-
-        with context.PreserveLoggingContext():


seems fair to me. I can't rule out some weird non-obvious interaction but if that was the case, there should've been a comment to that effect.

MadLittleMods · 2025-09-22T19:51:29Z

Thanks for the review @reivilibre 🦔

MadLittleMods added 9 commits September 10, 2025 16:08

Add logcontext to looping calls

6170762

Remove server_name arg

5d2e4c4

We will need this in #18868 but not for now

No need for a description

414b178

Fix db loop

788cd19

Revert keyword arg changes

77b3228

Multiple paragraphs

91f7bb3

Update docstring

2e88eb3

Apply the same treatment to call_later

fbf5946

Add comments for why we PreserveLoggingContext()

3809f3f

MadLittleMods added the A-Logging label Sep 10, 2025

Add changelog

442dbc0

MadLittleMods mentioned this pull request Sep 10, 2025

Remove sentinel logcontext where we log in Synapse #18905

Closed

MadLittleMods changed the title ~~Add logcontext to Clock utilities (looping_call, looping_call_now, call_later)~~ Remove sentinel logcontext in Clock utilities (looping_call, looping_call_now, call_later) Sep 10, 2025

MadLittleMods added 7 commits September 10, 2025 18:29

Sanity check that we start in the sentinel logcontext

13b938f

Improve Clock.sleep test

04825eb

Fix copy/paste typo

f4ad07d

Fixup looping_call

a8e66e7

All logcontext tests should use @logcontext_clean

0780183

Add test_looping_call

c535d8a

Add test_call_later and align call_later with looping_call

aec7065

Align tests

2aa15b0

MadLittleMods added 3 commits September 10, 2025 20:48

No need to return something

1c001c9

Revert "No need to return something"

13a5a36

This reverts commit 1c001c9.

Be more specific on what to return

eae83e8

Fix typo in assertion message

3797515

Fix logcontext leak in email pusher test

9116e74

MadLittleMods mentioned this pull request Sep 11, 2025

Remove sentinel logcontext where we log in setup, start and exit #18870

Merged

5 tasks

MadLittleMods marked this pull request as ready for review September 11, 2025 06:15

MadLittleMods requested a review from a team as a code owner September 11, 2025 06:15

MadLittleMods mentioned this pull request Sep 12, 2025

Explain how Deferred callbacks interact with logcontexts #18914

Merged

4 tasks

MadLittleMods added 4 commits September 12, 2025 18:03

Clarify why

57cc675

See #18907 (comment)

Use "foo" instead of "one"

45e6b78

See #18907 (comment) Prior art here uses `one` here but I find it difficult to write about in a comment. For example, "in one context" is kinda generic sounding when I'm talking about the specific context.

Wrap callback to see underlying failure

4145160

See #18907 (comment) > We should wrap this in a `try: ... finally: callback_finished = True` so we can see any potential underlying error that occurs in the `competing_callback()`

Merge branch 'develop' into madlittlemods/looping-call-logcontext

f2ae33a

This was referenced Sep 19, 2025

Introduce Clock.call_when_running(...) to include logcontext by default #18944

Merged

Introduce Clock.add_system_event_trigger(...) to include logcontext by default #18945

Merged

reivilibre approved these changes Sep 22, 2025

View reviewed changes

MadLittleMods added 3 commits September 22, 2025 11:19

Merge branch 'develop' into madlittlemods/looping-call-logcontext

160eb63

Conflicts: synapse/util/__init__.py

Cancelling the deferred doesn't matter AFACT

c8d0f97

See #18914 for more docs on how deferreds interact with logcontexts

Merge branch 'develop' into madlittlemods/looping-call-logcontext

774d25f

MadLittleMods merged commit e7d98d3 into develop Sep 22, 2025
76 of 78 checks passed

MadLittleMods deleted the madlittlemods/looping-call-logcontext branch September 22, 2025 19:51
