Skip to content

Crash on closing windows: heap-use-after-free in transaction_apply animation code #519

@sim590

Description

@sim590

Summary

SwayFX crashes when multiple windows are closed rapidly. The root cause is a heap-use-after-free in the animation code added by PR #388. When transaction_apply processes container instructions, it accesses con->animation_state on a container that has already been freed by view_unmap.

This is likely the same underlying bug reported in #477 and #490, both of which were closed without a fix.

SwayFX version

swayfx version 0.5.3 (based on sway 1.11.0) — commit 434d9924 (current master)

Steps to reproduce

  1. Open ~10 windows from the same application (e.g., Qutebrowser, or any app that can open/close multiple windows at once)
  2. Close all windows simultaneously

A reliable reproduction using qb-sway-session, a Qutebrowser userscript that saves/restores Sway sessions:

  1. Open Qutebrowser with ~10 windows across multiple workspaces
  2. Run :wq — Qutebrowser closes all windows at once
  3. SwayFX crashes

Importantly, this crash does NOT occur on upstream Sway 1.11. I verified this by temporarily installing the sway package (which replaces swayfx-git) and repeatedly closing ~10 Qutebrowser windows with :wq — no crash.

Root cause analysis

The crash was identified by building swayfx-git with AddressSanitizer (-fsanitize=address).

ASan report:

==131487==ERROR: AddressSanitizer: heap-use-after-free on address 0x7bb826ace428 at pc 0x556a4336a484 bp 0x7ffd8802b970 sp 0x7ffd8802b960
READ of size 1 at 0x7bb826ace428 thread T0
    #0 0x556a4336a483 in add_animation ../swayfx/sway/animation_manager.c:53
    #1 0x556a4336a483 in transaction_apply ../swayfx/sway/desktop/transaction.c:926
    #2 0x556a4336a483 in transaction_progress ../swayfx/sway/desktop/transaction.c:951
    #3 0x556a4336a5d4 in handle_timeout ../swayfx/sway/desktop/transaction.c:970
    #4 0x7f8828f1d426 in wl_event_loop_dispatch
    ...

0x7bb826ace428 is located 24 bytes inside of 32-byte region [0x7bb826ace410,0x7bb826ace430)
freed by thread T0 here:
    #0 0x7f882891f79d in free
    #1 0x556a43410399 in view_unmap ../swayfx/sway/tree/view.c:962
    #2 0x556a4336d8c3 in handle_unmap ../swayfx/sway/desktop/xdg_shell.c:465
    #3 0x7f8828f1b4cf in wl_signal_emit_mutable
    #4 0x7f882805d892 in wlr_surface_unmap ../wlroots0.19/types/wlr_compositor.c:868
    ...

previously allocated by thread T0 here:
    #0 0x7f8828920cb5 in malloc
    #1 0x556a433fd8df in container_create ../swayfx/sway/tree/container.c:179
    ...

SUMMARY: AddressSanitizer: heap-use-after-free ../swayfx/sway/animation_manager.c:53 in add_animation

What happens:

  1. When windows close, view_unmap (view.c:962) frees the container
  2. A queued transaction still holds an instruction referencing this now-freed container
  3. When transaction_progresstransaction_apply fires (via handle_timeout), it iterates instructions and reaches the freed container
  4. At line 914, should_con_new_animation(con, ...) dereferences the freed container
  5. At line 926, add_animation(con->animation_state.animation) reads animation->initialized (offset +24) from freed memory
  6. Depending on the state of the freed memory, this causes SIGSEGV, SIGABRT ("corrupted double-linked list"), or corruption in unrelated data structures (like pixman regions in SceneFX)

The animation code was introduced in PR #388 and does not check whether the container's node is being destroyed before accessing it. Upstream Sway's transaction_apply simply calls apply_container_state without any animation logic, which is why upstream Sway is not affected.

Coredump patterns observed (without ASan)

Without ASan, the corruption manifests in various ways depending on timing:

  1. SIGSEGV in __strdup / json_object_object_add_ex — IPC serialization accesses corrupted node data
  2. SIGABRT "corrupted double-linked list"free() detects heap corruption from earlier write to freed memory
  3. SIGABRT in pixman_region32_intersect via SceneFX — scene tree traversal hits corrupted node->visible region

All three patterns stem from the same use-after-free in the animation code.

Fix

The fix is a one-line guard — skip the animation setup when the node is being destroyed:

--- a/sway/desktop/transaction.c
+++ b/sway/desktop/transaction.c
@@ -911,7 +911,8 @@ static void transaction_apply(struct sway_transaction *transaction) {
 			break;
 		case N_CONTAINER:
 			struct sway_container *con = node->sway_container;
-			if (should_con_new_animation(con, &instruction->container_state)) {
+			if (!node->destroying &&
+					should_con_new_animation(con, &instruction->container_state)) {
 				should_start_new_animation = true;
 
 				// TODO: reset animation state on going to scratchpad

With this fix applied, the crash no longer occurs and ASan reports no errors.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions