Rooms can't be purged if the server is still trying to backfill #18379

Open
@ll-SKY-ll

Description

I got an invite to a matrix.org room, accepted it, and joined, but my server can't get state from matrix.org (error 403):

Matrix-Codestorm matrix-synapse[418]: 2025-04-25 09:20:44,223 - synapse.handlers.federation - 1994 - ERROR - sync_partial_state_room-0 - Failed to get state for !UmYaLNyAOwqMkqWkjY:matrix.org at <FrozenEventV3 event_id=$1IT0igWCYZpC3sNM8EMLcK3tcCTG_IX7l5rK-ket3ok, type=m.room.member, state_key=@sky:codestorm.net, membership=join, outlier=False> from matrix.org because ERROR 403: We can't get valid state history., giving up!

Matrix-Codestorm matrix-synapse[415]: 2025-04-20 02:28:06,960 - synapse.http.matrixfederationclient - 857 - WARNING - sync_partial_state_room-0-$1IT0igWCYZpC3sNM8EMLcK3tcCTG_IX7l5rK-ket3ok-$Qyl-AtVEmAWEiWwLqezNhxSpcpd2S4clxCHePKWzW1Q - {GET-O-1} [matrix.org] Request failed: GET matrix-federation://matrix.org/_matrix/federation/v1/state_ids/%21UmYaLNyAOwqMkqWkjY%3Amatrix.org?event_id=%24Qyl-AtVEmAWEiWwLqezNhxSpcpd2S4clxCHePKWzW1Q: HttpResponseException('403: Forbidden')

All events of that room from the events table:

synapse=# select * from events where room_id='!UmYaLNyAOwqMkqWkjY:matrix.org' order by origin_server_ts asc;
 topological_ordering |                   event_id                   |           type            |            room_id             | content | unrecognized_keys | processed | outlier | depth | origin_server_ts |  received_ts  |        sender        | contains_url | instance_name  | stream_ordering |      state_key       | rejection_reason 
----------------------+----------------------------------------------+---------------------------+--------------------------------+---------+-------------------+-----------+---------+-------+------------------+---------------+----------------------+--------------+----------------+-----------------+----------------------+------------------
                    1 | $CD_Hq8n-95SONfg65LfWqm9rixBhIq2jVTlTe-JJH6g | m.room.create             | !UmYaLNyAOwqMkqWkjY:matrix.org |         |                   | t         | t       |     1 |    1744978496904 | 1744978515323 | @zan34342:matrix.org | f            | stream_writer1 |       -22682921 |                      | 
                    2 | $_naci6p9GzjfG6XaDzn41XqOr7m2Rk_GmuwWzSFvukw | m.room.member             | !UmYaLNyAOwqMkqWkjY:matrix.org |         |                   | t         | t       |     2 |    1744978497274 | 1744978515323 | @zan34342:matrix.org | f            | stream_writer1 |       -22682922 | @zan34342:matrix.org | 
                    3 | $gNm1xEuj4TeZVOpLwN9fMUkWs_6mAe01HdMX4L8vw0g | m.room.power_levels       | !UmYaLNyAOwqMkqWkjY:matrix.org |         |                   | t         | t       |     3 |    1744978497578 | 1744978515323 | @zan34342:matrix.org | f            | stream_writer1 |       -22682923 |                      | 
                    4 | $f_d-0jS_CN-Kn57S21SAWX-vn8F8X_eNWQz_sv4FPKc | m.room.join_rules         | !UmYaLNyAOwqMkqWkjY:matrix.org |         |                   | t         | t       |     4 |    1744978497605 | 1744978515323 | @zan34342:matrix.org | f            | stream_writer1 |       -22682927 |                      | 
                    5 | $bBOjD-1Uz-B1ZWPSlkVwjXSYz-3tznYKHRcr6uYYmvg | m.room.history_visibility | !UmYaLNyAOwqMkqWkjY:matrix.org |         |                   | t         | t       |     5 |    1744978497606 | 1744978515323 | @zan34342:matrix.org | f            | stream_writer1 |       -22682926 |                      | 
                    7 | $Tu9QsgZZNPgbp_9Lcrpx9Qn9yWmM9P1pGKK7UxONuvw | m.room.encryption         | !UmYaLNyAOwqMkqWkjY:matrix.org |         |                   | t         | t       |     7 |    1744978497607 | 1744978515323 | @zan34342:matrix.org | f            | stream_writer1 |       -22682925 |                      | 
                    6 | $5nybUiWum_43Ago6MYGM_g2X6Jt51F4tcIFiQEeV0xc | m.room.guest_access       | !UmYaLNyAOwqMkqWkjY:matrix.org |         |                   | t         | t       |     6 |    1744978497607 | 1744978515323 | @zan34342:matrix.org | f            | stream_writer1 |       -22682924 |                      | 
                    8 | $SZ9XPJmeVjv3tqPKx1lL5ZZYbAzVX_eov8VTiybhG6M | m.room.member             | !UmYaLNyAOwqMkqWkjY:matrix.org |         |                   | t         | t       |     8 |    1744978498587 | 1744978498874 | @zan34342:matrix.org | f            | stream_writer1 |         2168407 | @sky:codestorm.net   | 
                   11 | $1IT0igWCYZpC3sNM8EMLcK3tcCTG_IX7l5rK-ket3ok | m.room.member             | !UmYaLNyAOwqMkqWkjY:matrix.org |         |                   | t         | f       |    11 |    1744978514311 | 1744978515384 | @sky:codestorm.net   | f            | stream_writer1 |         2168408 | @sky:codestorm.net   | 
                   12 | $vxHpFmdI6br8ETiikv9STkFm1vund0Xp6XTUgSZlHMg | m.room.member             | !UmYaLNyAOwqMkqWkjY:matrix.org |         |                   | t         | f       |    12 |    1744987930052 | 1744987930079 | @sky:codestorm.net   | f            | stream_writer1 |         2168799 | @sky:codestorm.net   | 

I left the room and wanted to purge it from the DB, as I usually do to keep the DB clean and small.

I used v2 of the room delete API.
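For anyone trying to reproduce this, the purge request can be sketched roughly as below. This is a minimal sketch, not the exact command used in this report: the base URL and access token are placeholders, the helper names are hypothetical, and the endpoint paths follow the Synapse Admin API documentation as I understand it, so check them against the docs for your version. The helpers only build the requests; nothing is sent until they are passed to `urllib.request.urlopen`.

```python
import json
import urllib.parse
import urllib.request


def build_room_delete_request(base_url, room_id, access_token, purge=True):
    """Build (but do not send) a v2 room delete request.

    base_url and access_token are placeholders supplied by the caller.
    """
    # Room IDs contain '!' and ':', so percent-encode the whole ID.
    quoted_room = urllib.parse.quote(room_id, safe="")
    url = f"{base_url.rstrip('/')}/_synapse/admin/v2/rooms/{quoted_room}"
    body = json.dumps({"purge": purge}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="DELETE",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def build_delete_status_request(base_url, delete_id, access_token):
    """Build (but do not send) a status query for a running delete job."""
    quoted_id = urllib.parse.quote(delete_id, safe="")
    url = f"{base_url.rstrip('/')}/_synapse/admin/v2/rooms/delete_status/{quoted_id}"
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": f"Bearer {access_token}"},
    )
```

In the situation described here, the status query would be expected to keep reporting the job as unfinished, since the purge never gets its lock.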

The purge job is stuck indefinitely.

From the Postgres logs I can see it complaining that the key (lock_name, lock_key, write_lock)=(purge_pagination_lock, !UmYaLNyAOwqMkqWkjY:matrix.org, t) is not present in the "worker_read_write_locks_mode" table.

Here are the relevant tables from my db:

synapse=# select * from worker_read_write_locks;
       lock_name       |            lock_key            | instance_name | write_lock | token  | last_renewed_ts 
-----------------------+--------------------------------+---------------+------------+--------+-----------------
 purge_pagination_lock | !UmYaLNyAOwqMkqWkjY:matrix.org | master        | f          | hGGoKQ |   1746089930283
synapse=# select * from worker_read_write_locks_mode;
       lock_name       |            lock_key            | write_lock | token  
-----------------------+--------------------------------+------------+--------
 purge_pagination_lock | !UmYaLNyAOwqMkqWkjY:matrix.org | f          | hGGoKQ
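The failure mode can be illustrated with a simplified model of these two tables. This is only a sketch inferred from the Postgres error message and the columns shown above: it uses SQLite instead of PostgreSQL, the helper names (`build_lock_tables`, `set_mode`, `try_acquire`) are hypothetical, and Synapse's real schema may contain more than this (for example triggers). The point it shows is that while the mode row for a room says `write_lock = f`, any insert of a lock row with `write_lock = t` violates the foreign key.

```python
import sqlite3

LOCK_NAME = "purge_pagination_lock"
ROOM_ID = "!UmYaLNyAOwqMkqWkjY:matrix.org"


def build_lock_tables():
    """Create an in-memory, simplified copy of the two lock tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FKs off by default
    conn.executescript(
        """
        CREATE TABLE worker_read_write_locks_mode (
            lock_name  TEXT NOT NULL,
            lock_key   TEXT NOT NULL,
            write_lock BOOLEAN NOT NULL,
            token      TEXT NOT NULL,
            PRIMARY KEY (lock_name, lock_key),          -- one mode per room
            UNIQUE (lock_name, lock_key, write_lock)    -- target of the FK
        );
        CREATE TABLE worker_read_write_locks (
            lock_name       TEXT NOT NULL,
            lock_key        TEXT NOT NULL,
            instance_name   TEXT NOT NULL,
            write_lock      BOOLEAN NOT NULL,
            token           TEXT NOT NULL,
            last_renewed_ts INTEGER NOT NULL,
            FOREIGN KEY (lock_name, lock_key, write_lock)
                REFERENCES worker_read_write_locks_mode
                    (lock_name, lock_key, write_lock)
        );
        """
    )
    return conn


def set_mode(conn, room_id, write, token):
    """Record the current lock mode (read or write) for a room."""
    with conn:
        conn.execute(
            "INSERT INTO worker_read_write_locks_mode VALUES (?, ?, ?, ?)",
            (LOCK_NAME, room_id, write, token),
        )


def try_acquire(conn, room_id, write, instance, token, now_ms=0):
    """Try to insert a lock row; return False on a foreign key violation."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO worker_read_write_locks VALUES (?, ?, ?, ?, ?, ?)",
                (LOCK_NAME, room_id, instance, write, token, now_ms),
            )
    except sqlite3.IntegrityError:
        return False
    return True


if __name__ == "__main__":
    conn = build_lock_tables()
    set_mode(conn, ROOM_ID, write=False, token="hGGoKQ")  # read mode, as above
    print(try_acquire(conn, ROOM_ID, False, "master", "hGGoKQ"))
    print(try_acquire(conn, ROOM_ID, True, "client_worker1", "PzRlse"))
```

In this model the read lock insert succeeds and the write lock insert is rejected, which mirrors the repeated failing `INSERT` statements in the Postgres log.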

I believe the "maybe_backfill" function is the source of the problem: it still holds a lock on that room, but with "write_lock: f". See here:

PURGE_PAGINATION_LOCK_NAME, room_id, write=False

Log output from after a server restart is attached below. It looks like the server first tries to backfill the events, which takes the lock on the room. Only after that is the purge job picked up again, but by then the room is already locked again.

Steps to reproduce

Join a room and have the remote server not return events for it to you, then leave and try to purge the room.

Homeserver

codestorm.net

Synapse Version

1.128

Installation Method

Debian packages from packages.matrix.org

Database

PostgreSQL | single server | not ported | yes, during migrations

Workers

Multiple workers

Platform

Debian LXC container on Proxmox

Configuration

No response

Relevant log output

Matrix-Codestorm matrix-synapse[129813]: 2025-05-01 11:18:12,478 - synapse.http.matrixfederationclient - 857 - WARNING - sync_partial_state_room-0-$1IT0igWCYZpC3sNM8EMLcK3tcCTG_IX7l5rK-ket3ok-$Qyl-AtVEmAWEiWwLqezNhxSpcpd2S4clxCHePKWzW1Q - {GET-O-1} [matrix.org] Request failed: GET matrix-federation://matrix.org/_matrix/federation/v1/state_ids/%21UmYaLNyAOwqMkqWkjY%3Amatrix.org?event_id=%24Qyl-AtVEmAWEiWwLqezNhxSpcpd2S4clxCHePKWzW1Q: HttpResponseException('403: Forbidden')
Matrix-Codestorm matrix-synapse[129813]: 2025-05-01 11:18:12,478 - synapse.handlers.federation_event - 1192 - WARNING - sync_partial_state_room-0-$1IT0igWCYZpC3sNM8EMLcK3tcCTG_IX7l5rK-ket3ok - Error attempting to resolve state at missing prev_events: 403: Forbidden
Matrix-Codestorm matrix-synapse[129813]: 2025-05-01 11:18:12,478 - synapse.handlers.federation - 1994 - ERROR - sync_partial_state_room-0 - Failed to get state for !UmYaLNyAOwqMkqWkjY:matrix.org at <FrozenEventV3 event_id=$1IT0igWCYZpC3sNM8EMLcK3tcCTG_IX7l5rK-ket3ok, type=m.room.member, state_key=@sky:codestorm.net, membership=join, outlier=False> from matrix.org because ERROR 403: We can't get valid state history., giving up!
Matrix-Codestorm postgres[130006]: [7-2] 2025-05-01 11:19:13.402 CEST [130006] synapse_user@synapse DETAIL:  Key (lock_name, lock_key, write_lock)=(purge_pagination_lock, !UmYaLNyAOwqMkqWkjY:matrix.org, t) is not present in table "worker_read_write_locks_mode".
Matrix-Codestorm postgres[130006]: [7-3] 2025-05-01 11:19:13.402 CEST [130006] synapse_user@synapse STATEMENT:  INSERT INTO worker_read_write_locks (lock_name, lock_key, write_lock, instance_name, token, last_renewed_ts) VALUES('purge_pagination_lock', '!UmYaLNyAOwqMkqWkjY:matrix.org', true, 'client_worker1', 'PzRlse', 1746091153401)
Matrix-Codestorm postgres[130020]: [7-2] 2025-05-01 11:19:13.496 CEST [130020] synapse_user@synapse DETAIL:  Key (lock_name, lock_key, write_lock)=(purge_pagination_lock, !UmYaLNyAOwqMkqWkjY:matrix.org, t) is not present in table "worker_read_write_locks_mode".
Matrix-Codestorm postgres[130020]: [7-3] 2025-05-01 11:19:13.496 CEST [130020] synapse_user@synapse STATEMENT:  INSERT INTO worker_read_write_locks (lock_name, lock_key, write_lock, instance_name, token, last_renewed_ts) VALUES('purge_pagination_lock', '!UmYaLNyAOwqMkqWkjY:matrix.org', true, 'client_worker1', 'hnypGP', 1746091153495)
Matrix-Codestorm postgres[130018]: [7-2] 2025-05-01 11:19:18.836 CEST [130018] synapse_user@synapse DETAIL:  Key (lock_name, lock_key, write_lock)=(purge_pagination_lock, !UmYaLNyAOwqMkqWkjY:matrix.org, t) is not present in table "worker_read_write_locks_mode".
Matrix-Codestorm postgres[130018]: [7-3] 2025-05-01 11:19:18.836 CEST [130018] synapse_user@synapse STATEMENT:  INSERT INTO worker_read_write_locks (lock_name, lock_key, write_lock, instance_name, token, last_renewed_ts) VALUES('purge_pagination_lock', '!UmYaLNyAOwqMkqWkjY:matrix.org', true, 'client_worker1', 'kUeGJG', 1746091158835)
Matrix-Codestorm postgres[130020]: [8-2] 2025-05-01 11:19:29.835 CEST [130020] synapse_user@synapse DETAIL:  Key (lock_name, lock_key, write_lock)=(purge_pagination_lock, !UmYaLNyAOwqMkqWkjY:matrix.org, t) is not present in table "worker_read_write_locks_mode".
Matrix-Codestorm postgres[130020]: [8-3] 2025-05-01 11:19:29.835 CEST [130020] synapse_user@synapse STATEMENT:  INSERT INTO worker_read_write_locks (lock_name, lock_key, write_lock, instance_name, token, last_renewed_ts) VALUES('purge_pagination_lock', '!UmYaLNyAOwqMkqWkjY:matrix.org', true, 'client_worker1', 'pLgfKa', 1746091169835)
Matrix-Codestorm postgres[130021]: [7-2] 2025-05-01 11:19:48.484 CEST [130021] synapse_user@synapse DETAIL:  Key (lock_name, lock_key, write_lock)=(purge_pagination_lock, !UmYaLNyAOwqMkqWkjY:matrix.org, t) is not present in table "worker_read_write_locks_mode".
Matrix-Codestorm postgres[130021]: [7-3] 2025-05-01 11:19:48.484 CEST [130021] synapse_user@synapse STATEMENT:  INSERT INTO worker_read_write_locks (lock_name, lock_key, write_lock, instance_name, token, last_renewed_ts) VALUES('purge_pagination_lock', '!UmYaLNyAOwqMkqWkjY:matrix.org', true, 'client_worker1', 'kCpOdx', 1746091188483)
Matrix-Codestorm postgres[130015]: [7-2] 2025-05-01 11:20:29.519 CEST [130015] synapse_user@synapse DETAIL:  Key (lock_name, lock_key, write_lock)=(purge_pagination_lock, !UmYaLNyAOwqMkqWkjY:matrix.org, t) is not present in table "worker_read_write_locks_mode".
Matrix-Codestorm postgres[130015]: [7-3] 2025-05-01 11:20:29.519 CEST [130015] synapse_user@synapse STATEMENT:  INSERT INTO worker_read_write_locks (lock_name, lock_key, write_lock, instance_name, token, last_renewed_ts) VALUES('purge_pagination_lock', '!UmYaLNyAOwqMkqWkjY:matrix.org', true, 'client_worker1', 'ipHmVk', 1746091229518)

Anything else that would be useful to know?

No response
