
Binary patching rust hot-reloading, sub-second rebuilds, independent server/client hot-reload #3797

Draft · jkelleyrtp wants to merge 36 commits into main

Conversation

@jkelleyrtp (Member) commented Feb 25, 2025

Inlines the work from https://github.com/jkelleyrtp/ipbp to bring pure Rust hot-reloading to Dioxus.

fast_reload.mp4

The approach we're taking works across all platforms, though each requires some bespoke logic. The object crate is thankfully generic over macOS/Windows/Linux/WASM, though we need to handle the system linkers differently.

This change also enables dx to operate as a faster linker, allowing sub-second (in many cases, sub-200ms) incremental rebuilds.
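
To make the "dx as the linker" idea concrete, here is a loose sketch (my illustration, not dx's actual implementation) of how a tool can interpose on the link step: rustc is pointed at the tool via -Clinker=..., so every incremental rebuild hands it the fresh object files, which it can stash for diffing before deferring to the real system linker.

use std::process::Command;

fn main() {
    // Invoked by rustc as if we were the linker (e.g. via `-Clinker=dx`), so argv
    // contains the object files and linker flags for this incremental rebuild.
    let args: Vec<String> = std::env::args().skip(1).collect();

    // Record the incremental codegen objects; the patch engine can diff these
    // against the previous build's objects.
    let objects: Vec<&String> = args.iter().filter(|a| a.ends_with(".o")).collect();
    eprintln!("captured {} object files for patching", objects.len());

    // Fall through to a real linker so the build still produces a binary.
    // "cc" is an assumption for this sketch; the PR handles platform linkers separately.
    let status = Command::new("cc")
        .args(&args)
        .status()
        .expect("failed to spawn system linker");
    std::process::exit(status.code().unwrap_or(1));
}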

Todo:

  • Add logic to the devtools types and generic integration
  • Wire up desktop
  • Rework existing hot-reload engine to be properly compatible
  • Remove old binaries
  • Wire up iOS
  • Wire up macOS
  • Wire up Android
  • Wire up Linux
  • Wire up wasm
  • Wire up windows
  • Wire up server
  • Clean up the app/server impl (support more than two executables in prep for dioxus.json)
  • Fix integration with the old hot-reload engine

Notes:

This unfortunately brings a very large refactor to the build system, since we need to persist app bundles while allowing new builds to be "merged" into them. I ended up flattening BuildRequest + Bundle together and Runner + Builder together, since we need knowledge of previous bundles and currently running processes to get patching to work properly.
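
For orientation, here is a rough sketch of the flattened shape described above; the field names are illustrative (my own), not the PR's actual definitions. The point is that one long-lived value now owns the build request, the persisted bundle, and the running process, so a fresh build can be merged into an existing bundle and patched into the live app.

use std::path::PathBuf;
use std::process::Child;

// Illustrative stubs; the PR's real types carry much more state.
struct BuildRequest { profile: String, target: String }
struct AppBundle { exe: PathBuf, patches: Vec<PathBuf> }

struct AppBuilder {
    request: BuildRequest,     // what to build (formerly a standalone BuildRequest)
    bundle: Option<AppBundle>, // the bundle produced by previous builds, kept around
    child: Option<Child>,      // the currently running app we patch against
}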

@jkelleyrtp (Member, Author) commented Mar 18, 2025

progress update

I've migrated everything over from ipbp, so now anyone should be able to run the demos on macOS/iOS. Going to add Linux + Android support next.

I've been tinkering with the syntax for subsecond a bit and am generally happy with the API now. You can wrap any closure with ::call() and that closure becomes "hot":

pub fn launch() {
    loop {
        std::thread::sleep(std::time::Duration::from_secs(1));
        subsecond::call(|| tick());
    }
}

fn tick() {
    println!("boom!");
}

If you need more granular control over "hot" functions, you'll want to use ::current(closure), which gives you a HotFn with extra flags and methods for running a callback. It also lets you run closures that are FnOnce, which ::call() currently does not support.
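
For illustration, here is a sketch of what that finer-grained path could look like; the exact names and signatures on HotFn are assumptions, not the finalized API:

fn drain(queue: Vec<String>) {
    // ::current can wrap an FnOnce (this closure consumes `queue`), which ::call
    // (FnMut-only) currently cannot.
    let hot = subsecond::current(move || {
        for item in queue {
            println!("processing {item}");
        }
    });
    // Hypothetical: invoke the (possibly patched) closure once.
    hot.call(());
}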

::call() taking an FnMut is meant to provide an "unwind" point that our assembly-diffing logic can bounce up to by emitting panics. This supports cases where you add a field to a struct and need to "rebuild" the app from a higher checkpoint (aka re-instancing).

For example, a TUI app with some state:

struct App {
    should_exit: bool,
    temperatures: Vec<u8>,
}

might implement a "run" method that calls subsecond:

    fn run(&mut self, terminal: &mut DefaultTerminal) -> Result<()> {
        while !self.should_exit {
            subsecond::call(|| self.tick(terminal))?;
        }
        Ok(())
    }

If the struct's size/layout changes, then we want to rebuild the app from scratch. Alternatively, we could somehow migrate it; that's out of scope for this PR, but implementations can be found in libraries like dexterous. We might end up taking an approach that unwinds the stack to the app's constructor and then copies the old state into the new size/layout, merging the new fields in. TODO on what this should look like.
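
Here is a minimal sketch of the rebuild-from-scratch path, under the assumption that a layout-changing patch surfaces as a panic/unwind at the ::call boundary (the struct and field names are just for illustration):

use std::panic::{catch_unwind, AssertUnwindSafe};

struct App {
    running: bool,
    counter: u64,
}

impl App {
    fn new() -> Self {
        App { running: true, counter: 0 }
    }

    fn tick(&mut self) {
        self.counter += 1;
        println!("tick {}", self.counter);
        std::thread::sleep(std::time::Duration::from_millis(500));
    }
}

fn main() {
    loop {
        // Re-instancing checkpoint: if a patch unwinds out of the hot loop because
        // App's size/layout changed, we land back here and construct a fresh App.
        let mut app = App::new();
        let _ = catch_unwind(AssertUnwindSafe(|| {
            while app.running {
                subsecond::call(|| app.tick());
            }
        }));
    }
}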

Here's a video of the tui_demo in the subsecond_harness crate:

subsecond-tui.mp4

runtime integration

Originally I wanted to use LLDB to drive the patching system (and we still might need to for proper "patching"), but I ran into a bunch of segfaults and LLDB crashes when we sigstopped the program in the middle of malloc/dealloc. Apparently there's a long list of things you cannot do while a program is sigstopped, and using allocators is one of them. We could look into using a dedicated bump allocator and continue using LLDB, but for now I have an adapter built on websockets. We might end up migrating to a shared-memory system so the HOST and DUT can share the patch table freely. The challenge with those approaches is that they're not very portable, whereas websockets are available practically everywhere.
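
As a rough stand-in for that adapter (plain TCP here instead of websockets to keep the sketch dependency-light, and using the libloading crate), the running app could listen for paths to freshly linked patch libraries and load them as they arrive. The port and the line-based protocol are arbitrary choices for illustration:

use std::io::{BufRead, BufReader};
use std::net::TcpListener;

fn patch_listener() -> std::io::Result<()> {
    // The host connects here and sends one dylib path per line after each rebuild.
    let listener = TcpListener::bind("127.0.0.1:9000")?;
    for stream in listener.incoming() {
        let reader = BufReader::new(stream?);
        for line in reader.lines() {
            let path = line?;
            // Loading the library maps the patch into the process; the real adapter
            // then resolves the patched symbols and updates the patch table.
            match unsafe { libloading::Library::new(&path) } {
                Ok(lib) => std::mem::forget(lib), // keep the patch mapped for the program's lifetime
                Err(e) => eprintln!("failed to load patch {path}: {e}"),
            }
        }
    }
    Ok(())
}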

zero-link / thinlink

One cool thing spun out of this work is "zerolink" (thinlink, maybe?): our new approach for drastically speeding up Rust compile times by automatically using dynamic linking. This is super useful for tests, benchmarks, and general development, since we can automatically split your workspace crates from your "true" dependencies and skip linking your dependencies on every build.

This means you can turn up opt levels and keep debug symbols (two things that generally slow down builds) as a one-time cost, then continuously dynamically link your incremental object files against the dependencies dylib. Most OSes support a dyld_cache equivalent that keeps the dependencies dylib memory-mapped and cached between invocations, which also greatly speeds up launch times.

ZeroLink isn't really an "incremental linker" per se, but it behaves like one thanks to Rust's incremental compilation system. In spirit it's very similar to marking a crate in your crate graph as a dylib (see bevy/dynamic), but it doesn't require you to change any of your crates and it supports WASM.

dx is standalone

I wanted to use zerolink with non-Dioxus projects, so this PR also makes dx a standalone Rust runner. You can dx run your project, and dioxus does not need to be part of your crate graph for it to work. This lets us bootstrap dx by running dx with itself, which makes it easy to update the TUI without fully rebuilding the CLI.

wasm work

WASM does not support dynamic linking, so we need to mess with the binaries ourselves. Fortunately this is as simple as linking the deps together into a relocatable object file, lifting the symbols into the export table, and recording the element segments.

When the patches load, they need two things:

  • addresses within the ifunc table for ifuncs
  • imports from the main module

Unfortunately, the wasm-bindgen pass runs ::gc, so I don't think there's any clever combination of flags we can pass to wasm-ld to do this for us automatically. However, all the work we put into wasm_split really comes in handy here.
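
For a sense of what that post-wasm-bindgen massaging involves, here is a rough sketch using the walrus crate (which wasm_split already builds on) and anyhow for errors; the real pass has to deal with relocations, duplicate names, and recording the actual function indices in each element segment rather than just counting them:

use std::collections::HashSet;
use std::path::Path;

fn lift_symbols(input: &Path, output: &Path) -> anyhow::Result<()> {
    let mut module = walrus::Module::from_file(input)?;

    // "Lift the symbols into the export table": export every named function so a
    // later patch module can import it from the main module.
    let already_exported: HashSet<String> =
        module.exports.iter().map(|e| e.name.clone()).collect();
    let named: Vec<(String, walrus::FunctionId)> = module
        .funcs
        .iter()
        .filter_map(|f| f.name.clone().map(|n| (n, f.id())))
        .collect();
    for (name, id) in named {
        if !already_exported.contains(&name) {
            module.exports.add(&name, walrus::ExportItem::Function(id));
        }
    }

    // "Record the element segments": patches need the ifunc table layout to resolve
    // indirect calls; here we only count the segments to keep the sketch short.
    println!("element segments: {}", module.elements.iter().count());

    module.emit_wasm_file(output)?;
    Ok(())
}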

What's left

There are three avenues of work left here:

  • Propagating the change graph through the HotFn points
  • More platform support (windows, wasm, server_fn)
  • Bugs (better handling of statics, destructors, renaming symbols, changing signatures, and dioxus integration like Global)

I expect Windows + WASM to take the longest to get proper support, so I'll prioritize those over propagating the change graph. Dioxus can function properly without a sophisticated change graph, but other libraries will want the richer detail it provides.

@DrewRidley

Awesome work here! I might recommend adding .arg("-Zcodegen-backend=cranelift") as an optional, user-facing argument when hot-reloading.

I found that on my M3 Pro MacBook it brings the average time down from ~600ms to ~300ms. The backend ships as a rustup component now, so it should be a drop-in replacement for desktop and possibly mobile platforms.

@jkelleyrtp (Member, Author) commented Mar 19, 2025

Awesome work here! I might recommend adding .arg("-Zcodegen-backend=cranelift") as an optional, user-facing argument when hot-reloading.

I found that on my M3 Pro MacBook it brings the average time down from ~600ms to ~300ms. The backend ships as a rustup component now, so it should be a drop-in replacement for desktop and possibly mobile platforms.

Wow that's incredible!

On my M1 I've been getting around 900ms on the Dioxus harness with the default dev profile, and 500-600ms with the subsecond-dev profile:

[profile.subsecond-dev]
inherits = "dev"
debug = 0
strip = "debuginfo"

I'll add the cranelift backend option and report back. In the interim, you can check whether that profile speeds up your cranelift builds at all.
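
As a sketch of what that opt-in might look like on the CLI side (hypothetical, not actual dx code): forward the flag through cargo rustc when the user asks for it, on a nightly toolchain with the Cranelift codegen backend installed.

use std::process::Command;

fn build_command(use_cranelift: bool) -> Command {
    let mut cmd = Command::new("cargo");
    // The subsecond-dev profile is the one shown above.
    cmd.arg("rustc").arg("--profile").arg("subsecond-dev");
    if use_cranelift {
        // Everything after `--` is handed to rustc for the final crate.
        cmd.arg("--").arg("-Zcodegen-backend=cranelift");
    }
    cmd
}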

I did some profiling of rustc, and about 100-300ms is spent copying incremental artifacts on disk. That's pretty substantial given the whole process takes around 500ms. Hopefully this is improved here:

rust-lang/rust#128320

I would like to see that time drop to 0ms at some point; then we'd basically have "blink and you miss it" hot-patching.

@DrewRidley commented Mar 19, 2025

I tried the profile, and with or without it, it's consistently ~300ms on my Mac. When doing a self-profile, I noticed that register allocation takes a huge portion of the total time spent.

I discovered this (https://docs.wasmtime.dev/api/cranelift_codegen/settings/enum.RegallocAlgorithm.html), which might help if it's been backported to codegen_clif as a flag or option.

That turned out to be a fluke in testing, though; the remaining time is actually mostly incremental-cache-related file IO. Not sure how much can be done about that.

Regardless, this is super exciting work. Let me know if there's any other way I can help.

@jkelleyrtp (Member, Author) commented Mar 22, 2025

I switched to a slightly modified approach (lower level, faster, more reliable, more complex).

This is implemented to work around a number of very challenging Android issues:

  • pointer tagging
  • MTE (memory tagging extension)
  • linker namespaces
  • read/write permissions

Since this approach is more flexible, it should work across Linux and Windows (Android and Linux are the same). The last target is WASM.
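
One way to picture why the lower-level route ports so well (a simplified sketch of the idea, not the PR's exact mechanism): hot calls go through an atomically swappable pointer rather than a linker-resolved symbol, so applying a patch is just a table update and we never have to rewrite executable pages or fight pointer tagging/MTE at the call site.

use std::sync::atomic::{AtomicPtr, Ordering};

// Starts null; call sites fall back to the statically linked function until a patch lands.
static TICK_PTR: AtomicPtr<()> = AtomicPtr::new(std::ptr::null_mut());

fn tick() {
    println!("original tick");
}

fn call_tick() {
    let raw = TICK_PTR.load(Ordering::Relaxed);
    let f: fn() = if raw.is_null() {
        tick as fn()
    } else {
        // Safety: only ever stored from a `fn()` in apply_patch below.
        unsafe { std::mem::transmute(raw) }
    };
    f();
}

// Invoked by the runtime after it loads a patch dylib and resolves the new symbol.
fn apply_patch(new_tick: fn()) {
    TICK_PTR.store(new_tick as *const () as *mut (), Ordering::Relaxed);
}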

Here's the android demo:

hotpatch-android.mp4

iOS:

ios-binarypatch.mp4
