
feat: implement JIT runtime#139

Draft
DaniPopes wants to merge 155 commits into main from dani/runtime

@DaniPopes
Contributor

Add a new revmc::runtime module implementing a coordinator-owned JIT compilation system with O(1) compiled-function lookup.

What

A reusable runtime for reth and other revm/revmc embedders that provides:

  • Startup AOT preload: loads pre-compiled artifacts from an ArtifactStore into an in-memory DashMap at startup.
  • O(1) lookup: JitCoordinatorHandle::lookup() only probes the resident map — no storage, no blocking, no waiting.
  • Fire-and-forget tracking: every lookup emits a non-blocking event to the coordinator thread.
  • Background JIT compilation: the coordinator tracks hotness per key and promotes hot bytecodes to JIT compilation on a worker pool when a configurable threshold is crossed.
  • Correct lifetime management: JIT-compiled function pointers are kept alive via Arc<WorkerBacking> — worker threads block until all program references are dropped.
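The lifetime rule in the last bullet can be illustrated with a minimal, std-only sketch. `WorkerBacking` is the name used in this PR, but its field, `make_program`, and `wait_for_release` are hypothetical stand-ins, and a plain `fn` pointer replaces the real JIT entry point. The idea shown is only that the backing signals its owner when the last `Arc` reference is dropped.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Arc;

/// Hypothetical stand-in for the memory that owns JIT-compiled code.
struct WorkerBacking {
    released: Sender<()>,
}

impl Drop for WorkerBacking {
    fn drop(&mut self) {
        // Runs exactly once, when the last `Arc<WorkerBacking>` goes away.
        let _ = self.released.send(());
    }
}

/// Hypothetical compiled program; every clone shares the same backing.
#[derive(Clone)]
struct CompiledProgram {
    entry: fn(u64) -> u64,
    _backing: Arc<WorkerBacking>,
}

/// Creates a program plus the receiver its worker would wait on.
fn make_program(entry: fn(u64) -> u64) -> (CompiledProgram, Receiver<()>) {
    let (released, on_release) = channel();
    let backing = Arc::new(WorkerBacking { released });
    (CompiledProgram { entry, _backing: backing }, on_release)
}

/// What a worker might call before tearing down its compiler.
fn wait_for_release(on_release: &Receiver<()>) {
    // Blocks until the `Drop` impl above has fired.
    let _ = on_release.recv();
}
```

In this sketch a worker that calls `wait_for_release` cannot free its compiler while any `CompiledProgram` clone is still alive, which is the property the bullet describes.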

Architecture

  • mod.rs: JitCoordinator (owns thread + map) and JitCoordinatorHandle (clonable lookup handle).
  • coordinator.rs: single-threaded event loop processing lookup-observed events and worker results.
  • worker.rs: thread pool with per-worker LLVM compilers, round-robin job dispatch.
  • api.rs: public types (LookupRequest, LookupDecision, CompiledProgram, etc.).
  • config.rs: RuntimeConfig and RuntimeTuning with sensible defaults.
  • storage.rs: ArtifactStore trait and data model (ArtifactKey, StoredArtifact, ArtifactManifest).
  • stats.rs: atomic counters with point-in-time snapshots.
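As an illustration of the `stats.rs` entry, the following is a rough sketch of atomic counters with a point-in-time snapshot. The type and field names (`RuntimeStats`, `StatsSnapshot`, `lookups`, `hits`) are assumptions made for the example and may not match the module's actual API.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counters shared between lookup handles and the coordinator.
#[derive(Default)]
struct RuntimeStats {
    lookups: AtomicU64,
    hits: AtomicU64,
}

/// Plain-data copy of the counters at one moment.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct StatsSnapshot {
    lookups: u64,
    hits: u64,
}

impl RuntimeStats {
    /// Records one lookup; takes `&self` so callers never need a lock.
    fn record_lookup(&self, hit: bool) {
        self.lookups.fetch_add(1, Ordering::Relaxed);
        if hit {
            self.hits.fetch_add(1, Ordering::Relaxed);
        }
    }

    /// Reads each counter once. The fields are loaded independently, so
    /// under concurrent updates the snapshot is approximate, not atomic.
    fn snapshot(&self) -> StatsSnapshot {
        StatsSnapshot {
            lookups: self.lookups.load(Ordering::Relaxed),
            hits: self.hits.load(Ordering::Relaxed),
        }
    }
}
```

`Ordering::Relaxed` is enough here because the counters are only observed for reporting and do not guard other memory.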

Testing

  • Unit tests covering startup, lookup, enable/disable, event tracking, store failures.
  • State test integration via CompileMode::Runtime in revmc-statetest — AOT-compiles all bytecodes, loads through the coordinator, and verifies correct execution against the interpreter.

Status

Implements Phase 0 (AOT preload + O(1) lookup) and Phase 1 (coordinator hotness tracking + JIT background compile) from the plan. Phase 2 (explicit prepare_aot APIs) and Phase 3 (eviction, lifecycle hardening) are future work.

Startup AOT preload from ArtifactStore::load_all() into immutable
FxHashMap. O(1) lookup via JitCoordinatorHandle. Fire-and-forget
lookup-observed events to coordinator thread via bounded sync_channel.
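A minimal sketch of the lookup path described in this commit, using only the standard library. `Handle`, `LookupObserved`, and `new_handle` are hypothetical names, `std::collections::HashMap` stands in for `FxHashMap`, and a plain `fn` pointer stands in for a compiled function. The point is that `lookup` probes an immutable map and reports the observation with `try_send`, so it never blocks when the bounded queue is full.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::sync::Arc;

type CodeHash = [u8; 32];
type CompiledFn = fn() -> u64;

/// Hypothetical event sent to the coordinator for every lookup.
struct LookupObserved {
    key: CodeHash,
    hit: bool,
}

/// Hypothetical clonable lookup handle.
#[derive(Clone)]
struct Handle {
    resident: Arc<HashMap<CodeHash, CompiledFn>>,
    events: SyncSender<LookupObserved>,
}

impl Handle {
    fn lookup(&self, key: &CodeHash) -> Option<CompiledFn> {
        // O(1) expected-time probe of the in-memory map only.
        let found = self.resident.get(key).copied();
        // Fire-and-forget: if the queue is full the event is dropped.
        let _ = self.events.try_send(LookupObserved {
            key: *key,
            hit: found.is_some(),
        });
        found
    }
}

/// Builds a handle and the receiving end a coordinator thread would own.
fn new_handle(
    resident: HashMap<CodeHash, CompiledFn>,
    queue_capacity: usize,
) -> (Handle, Receiver<LookupObserved>) {
    let (events, rx) = sync_channel(queue_capacity);
    (Handle { resident: Arc::new(resident), events }, rx)
}
```

Dropping events under load trades tracking accuracy for a lookup that cannot stall the caller.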
Add CompileMode::Runtime that boots a JitCoordinator with an empty
store, runs all lookups through JitCoordinatorHandle::lookup(), and
falls back to CompileCache for misses (including CREATE/CREATE2).
Exercises the full lookup → miss → event tracking path.

On Linux, load shared libraries via memfd_create + dlopen(/proc/self/fd/{fd})
instead of writing to temp files. Entirely in-memory, no filesystem I/O.
Non-Linux platforms fall back to tempfile.

Remove memfd/tempfile machinery from the coordinator. The storage trait
now owns dylib files on disk and returns paths. The coordinator simply
dlopen's the path and resolves symbols.

- StoredArtifact: dylib_bytes Vec<u8> -> dylib_path PathBuf
- ArtifactStore::store() takes raw bytes, store writes to disk
- LoadedLibraryOwner simplified to LoadedLibrary (no backing variants)
- Remove rustix and tempfile runtime dependencies
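The shape of a store that owns dylib files on disk and hands back paths might look roughly like the sketch below. The trait name `ArtifactStoreSketch`, the `&str` key, the `.so` suffix, and both method signatures are assumptions for illustration; the real `ArtifactStore` trait, `ArtifactKey`, and `StoredArtifact` types are not reproduced here.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical, simplified version of a path-returning artifact store.
trait ArtifactStoreSketch {
    /// Takes raw dylib bytes, writes them to disk, and returns the path.
    fn store(&self, key: &str, dylib_bytes: &[u8]) -> io::Result<PathBuf>;
    /// Lists every stored artifact as (key, path), sorted by key.
    fn load_all(&self) -> io::Result<Vec<(String, PathBuf)>>;
}

/// Store backed by a single directory.
struct DirStore {
    root: PathBuf,
}

impl ArtifactStoreSketch for DirStore {
    fn store(&self, key: &str, dylib_bytes: &[u8]) -> io::Result<PathBuf> {
        fs::create_dir_all(&self.root)?;
        let path = self.root.join(format!("{key}.so"));
        fs::write(&path, dylib_bytes)?;
        Ok(path)
    }

    fn load_all(&self) -> io::Result<Vec<(String, PathBuf)>> {
        let mut artifacts = Vec::new();
        for entry in fs::read_dir(&self.root)? {
            let path = entry?.path();
            if path.extension().and_then(|ext| ext.to_str()) != Some("so") {
                continue;
            }
            // The file stem is the key that `store` was called with.
            let key = path.file_stem().and_then(|stem| stem.to_str()).map(str::to_owned);
            if let Some(key) = key {
                artifacts.push((key, path));
            }
        }
        artifacts.sort();
        Ok(artifacts)
    }
}
```

With this split, a coordinator only needs the returned path to open the library; it does not have to manage temporary files itself.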
Use a done-signal channel with recv_timeout instead of blocking
indefinitely on thread::join. Configurable via RuntimeTuning::shutdown_timeout
(default 5s).
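A sketch of the done-signal approach, assuming a hypothetical `Coordinator` wrapper and `spawn_coordinator` helper (neither is taken from the PR). The thread sends on a channel just before it exits, and shutdown waits on that channel with `recv_timeout` so a stuck thread cannot hang the caller.

```rust
use std::sync::mpsc::{channel, Receiver, RecvTimeoutError};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical owner of the coordinator thread.
struct Coordinator {
    thread: JoinHandle<()>,
    done: Receiver<()>,
}

/// Spawns `run` and arranges for a done signal when it returns.
fn spawn_coordinator<F>(run: F) -> Coordinator
where
    F: FnOnce() + Send + 'static,
{
    let (done_tx, done) = channel();
    let thread = thread::spawn(move || {
        run();
        let _ = done_tx.send(());
    });
    Coordinator { thread, done }
}

impl Coordinator {
    /// Returns `true` if the thread finished cleanly within `timeout`.
    fn shutdown(self, timeout: Duration) -> bool {
        match self.done.recv_timeout(timeout) {
            // Signal received: the thread is exiting, so join returns promptly.
            Ok(()) => self.thread.join().is_ok(),
            // Sender dropped without a signal, e.g. the thread panicked.
            Err(RecvTimeoutError::Disconnected) => {
                let _ = self.thread.join();
                false
            }
            // Still running: give up and detach instead of blocking forever.
            Err(RecvTimeoutError::Timeout) => false,
        }
    }
}
```

On timeout the `JoinHandle` is simply dropped, which detaches the thread; whether the real runtime logs or escalates in that case is not shown here.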
- Coordinator tracks per-key hotness from lookup miss events.
- When hotness crosses jit_hot_threshold (default 8), enqueue JIT compile.
- Worker threads own long-lived EvmCompiler<EvmLlvmBackend> instances.
- Workers block on exit until all Arc<WorkerBacking> refs are dropped,
  ensuring JIT function pointers remain valid.
- Resident map uses ArcSwap for lock-free reads + coordinator-only writes.
- Coordinator publishes new map snapshot after each successful JIT.
- Miss events carry bytecode (Arc<[u8]>) so coordinator can compile.
- New tuning knobs: jit_hot_threshold, max_pending_jit_jobs,
  jit_worker_count, jit_worker_queue_capacity, jit_opt_level.
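The hotness rule in this commit can be sketched as a small state machine. `HotnessTracker`, `observe_miss`, and `finish` are hypothetical names; only `jit_hot_threshold` and `max_pending_jit_jobs` come from the list above, and the exact promotion policy in the PR may differ. In this sketch a key is enqueued once, when its miss count has reached the threshold and there is room in the pending set.

```rust
use std::collections::{HashMap, HashSet};

type CodeHash = [u8; 32];

/// Hypothetical per-key hotness state owned by the coordinator thread.
struct HotnessTracker {
    jit_hot_threshold: u32,
    max_pending_jit_jobs: usize,
    misses: HashMap<CodeHash, u32>,
    pending: HashSet<CodeHash>,
}

impl HotnessTracker {
    fn new(jit_hot_threshold: u32, max_pending_jit_jobs: usize) -> Self {
        Self {
            jit_hot_threshold,
            max_pending_jit_jobs,
            misses: HashMap::new(),
            pending: HashSet::new(),
        }
    }

    /// Records one lookup miss. Returns `true` when the caller should
    /// enqueue a JIT job for `key`.
    fn observe_miss(&mut self, key: CodeHash) -> bool {
        let count = self.misses.entry(key).or_insert(0);
        *count = count.saturating_add(1);
        let hot = *count >= self.jit_hot_threshold;
        let has_room = self.pending.len() < self.max_pending_jit_jobs;
        if hot && has_room && !self.pending.contains(&key) {
            self.pending.insert(key);
            true
        } else {
            false
        }
    }

    /// Called when a worker reports a result for `key`.
    fn finish(&mut self, key: &CodeHash) {
        self.pending.remove(key);
        self.misses.remove(key);
    }
}
```

Because only the coordinator thread mutates this state, no locking is needed in the sketch; a hot key that arrives while the pending set is full is retried on its next miss.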
Coordinator inserts directly into the shared DashMap. Handles read
from it via DashMap::get. No more snapshot cloning or publish step.

Remove custom RuntimeError and StorageError types in favor of
eyre::Result throughout the runtime module.
@codspeed-hq

codspeed-hq bot commented Mar 18, 2026

Merging this PR will not alter performance

✅ 69 untouched benchmarks


Comparing dani/runtime (9d2692b) with main (d627e93)


Workers now handle AOT jobs separately from JIT jobs. AOT jobs create
a temporary AOT compiler, translate bytecode, write an object file,
link it to a shared library, and return the raw .so bytes to the
coordinator.

The coordinator persists the artifact via ArtifactStore::store(), then
loads it back (dlopen + symbol resolution) into the resident map as
a ProgramKind::Aot entry.

Adds aot_opt_level to RuntimeTuning (defaults to Aggressive per plan).

- prepare_aot_persist_and_load: single AOT compile, persist, and load.
- prepare_aot_batch_persist_and_load: batch of 2.
- aot_artifacts_survive_restart: compile+persist, shut down, restart
  coordinator, verify artifact preloaded from store at startup.

Adds TempDirStore, a test ArtifactStore backed by a temp directory.
RuntimeHandler now borrows &CompileCache instead of cloning an Arc.
Eliminated the duplicate cache/cache_arc parameter threading.

…test

- Call clear_ir() after each JIT compilation in worker loop to reset
  the finalized module state, allowing subsequent compilations.
- Add get_compiled() to JitCoordinatorHandle for event-free resident
  map lookups.
- Rewrite statetest runtime mode to use the coordinator's JIT pipeline
  directly instead of CompileCache. Contracts are enqueued via
  compile_jit() and polled via get_compiled().
- Handle empty bytecode (EOA calls) by falling back to the interpreter
  instead of blocking forever on compilation.

DaniPopes added a commit that referenced this pull request Apr 7, 2026
- Add `ExecutionSessionRef::get_symbol_string_pool()` and derive
  `Clone + Copy` on `JITDylibRef`.
- Replace pool-based JITDylib recycling with per-compiler create/remove
  lifecycle via `Arc<JitDylibGuard>`, ensuring JIT function pointers
  remain valid as long as any guard holder exists.
- Add `take_last_resource_tracker()` and `jit_dylib_guard()` APIs.
- Periodically clear dead `SymbolStringPool` entries to avoid unbounded
  interned-string growth.

Extracted from #139.

Replace JitBackend::start with JitBackend::new that defers thread
spawning until set_enabled(true). Add JitBackend::disabled() as a
convenience for a no-op default. Add enabled() getter, backend_mut(),
set_backend(), and JitEvm::disabled() helpers.
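A sketch of deferred thread spawning in the spirit of this commit. `LazyBackend`, `is_started`, `submit`, and `shutdown` are hypothetical, and the behaviour on `set_enabled(false)` (keep the thread, reject new work) is an assumption rather than something the commit message states. The worker here only counts jobs so the example stays self-contained.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical backend that spawns its worker only when first enabled.
struct LazyBackend {
    enabled: bool,
    worker: Option<(Sender<Vec<u8>>, JoinHandle<usize>)>,
}

impl LazyBackend {
    /// Constructs a backend without spawning any thread.
    fn new() -> Self {
        Self { enabled: false, worker: None }
    }

    fn enabled(&self) -> bool {
        self.enabled
    }

    fn is_started(&self) -> bool {
        self.worker.is_some()
    }

    /// Spawns the worker on the first `set_enabled(true)` call only.
    fn set_enabled(&mut self, enabled: bool) {
        self.enabled = enabled;
        if enabled && self.worker.is_none() {
            let (jobs, rx) = channel::<Vec<u8>>();
            // Placeholder worker: counts jobs until all senders are dropped.
            let thread = thread::spawn(move || rx.iter().count());
            self.worker = Some((jobs, thread));
        }
    }

    /// Queues bytecode if the backend is enabled and started.
    fn submit(&self, bytecode: Vec<u8>) -> bool {
        match &self.worker {
            Some((jobs, _)) if self.enabled => jobs.send(bytecode).is_ok(),
            _ => false,
        }
    }

    /// Stops the worker and returns how many jobs it received.
    fn shutdown(mut self) -> usize {
        match self.worker.take() {
            Some((jobs, thread)) => {
                drop(jobs); // closing the channel ends the worker loop
                thread.join().unwrap_or(0)
            }
            None => 0,
        }
    }
}
```

A backend built with `new()` and never enabled costs no threads, which is what makes a no-op default such as `JitBackend::disabled()` cheap.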
Replay logs and selfdestruct to the inspector after JIT execution
without calling step/step_end, and invoke frame_end on completion.

Add on_log field to EvmContext as Option<&mut dyn FnMut(&Log)> that the
LOG builtin invokes before Host::log. The inspect_frame_run override
installs a closure that forwards to Inspector::log during JIT execution.

Also adds call_with_interpreter_with to EvmCompilerFn for configuring
the EvmContext (e.g. installing callbacks) before the compiled function
runs.
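The callback mechanism can be sketched with heavily simplified types. The `Log` and `EvmContext` structs below are hypothetical reductions (the real ones carry far more state), `host_logs` stands in for `Host::log`, and `builtin_log` stands in for the LOG builtin. Only the `on_log: Option<&mut dyn FnMut(&Log)>` field shape follows the commit message.

```rust
/// Hypothetical, reduced log record.
struct Log {
    topic: u64,
}

/// Hypothetical, reduced execution context.
struct EvmContext<'a> {
    /// Optional observer invoked before the host sees the log.
    on_log: Option<&'a mut dyn FnMut(&Log)>,
    /// Stand-in for the host's log sink.
    host_logs: Vec<Log>,
}

/// Stand-in for the LOG builtin: notify the observer, then the host.
fn builtin_log(ctx: &mut EvmContext<'_>, log: Log) {
    if let Some(callback) = ctx.on_log.as_deref_mut() {
        callback(&log);
    }
    ctx.host_logs.push(log);
}
```

An inspector integration would install a closure that forwards each `&Log` to its own `log` hook, and leave `on_log` as `None` when no inspector is attached so the builtin pays only a branch.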
DaniPopes added 20 commits April 9, 2026 02:49
When a JUMPI's fall-through block gets deduped, the leader mark on
the first dead instruction was silently lost. This caused the JUMPI
block to absorb the next alive instruction (e.g. INVALID) into the
same block. When INVALID became the terminator, DSE treated all exit
stack positions as dead, incorrectly NOOP-ing live PUSH instructions
needed for correct execution.

Add a `pending_leader` flag that remembers leader marks on dead
instructions and applies them to the next alive instruction, forcing
a correct block boundary.
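The fix can be illustrated on a toy instruction list. `Inst` and `block_starts` are hypothetical and much simpler than the real analysis; the sketch only shows how a `pending_leader` flag carries a leader mark from dead instructions to the next alive one.

```rust
/// Hypothetical, reduced instruction record.
#[derive(Clone, Copy)]
struct Inst {
    /// Removed by an earlier pass (e.g. block dedup).
    dead: bool,
    /// Marked as the first instruction of a basic block.
    leader: bool,
}

/// Returns the indices of alive instructions that start a basic block.
fn block_starts(insts: &[Inst]) -> Vec<usize> {
    let mut starts = Vec::new();
    let mut pending_leader = false;
    for (index, inst) in insts.iter().enumerate() {
        if inst.dead {
            // Remember the mark instead of silently dropping it.
            pending_leader |= inst.leader;
            continue;
        }
        // The first alive instruction always opens a block.
        if inst.leader || pending_leader || starts.is_empty() {
            starts.push(index);
        }
        pending_leader = false;
    }
    starts
}
```

For a JUMPI followed by a dead fall-through leader and then an alive INVALID, this yields two blocks, so INVALID no longer becomes the terminator of the JUMPI block.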

Found via mainnet block 24,640,000 tx 0xc5accd8...c4365d82735148ad144b2c2847
where the JIT produced wrong gas (181,870, reverts) vs interpreter
(419,718, succeeds).

Addresses actionable findings from the cyclops full scan
(full-scan-20260410-152658). 10 fixes applied, 6 wontfix, 1 deferred.

## Fixes

- **invalidate_cache**: evict `Interpret` entries per-tx while retaining
`Compiled` entries; add invalidation to `system_call` and
`inspect_system_call` paths (stale cache bypass)
- **budget eviction**: skip AOT entries in Phase 2 since they don't
contribute to `jit_total_bytes()` (AOT cache wipe DoS)
- **compile_jit/compile_jit_sync/prepare_aot_batch**: call
`ensure_started()` so commands are never enqueued into an unstarted
backend (indefinite hang)
- **ensure_started**: restore `lazy_spawn` on preload failure so retries
work (permanent brick on transient failure)
- **handle_aot_success**: remove entry on failure instead of marking
`Failed`, preventing AOT errors from permanently poisoning JIT for a key
- **handle_compile_jit**: enforce `jit_max_bytecode_len` policy (policy
bypass)
- **handle_lookup_observed**: cap cold entry count to prevent unbounded
memory growth
- **`__revmc_builtin_difficulty`**: use `unwrap()` to match interpreter
panic semantics on missing prevrandao
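The `ensure_started` fix follows a take-and-restore pattern, sketched below under assumptions: `Backend`, `SpawnConfig`, its `worker_count` field, and the `preload` closure parameter are hypothetical, and only the `lazy_spawn` name comes from the list above. The spawn configuration is taken out of its `Option` before preloading and put back if preloading fails, so a later call can retry.

```rust
/// Hypothetical configuration consumed when the backend starts.
struct SpawnConfig {
    worker_count: usize,
}

/// Hypothetical backend with deferred startup.
struct Backend {
    /// `Some` until the backend has started successfully.
    lazy_spawn: Option<SpawnConfig>,
    started: bool,
}

impl Backend {
    /// Starts the backend on first use. `preload` stands in for loading
    /// artifacts from the store and may fail transiently.
    fn ensure_started<F>(&mut self, preload: F) -> Result<(), String>
    where
        F: FnOnce(&SpawnConfig) -> Result<(), String>,
    {
        // Nothing left to spawn: assume an earlier call already started us.
        let Some(config) = self.lazy_spawn.take() else {
            return Ok(());
        };
        if let Err(err) = preload(&config) {
            // Restore the config so the next call can try again.
            self.lazy_spawn = Some(config);
            return Err(err);
        }
        self.started = true;
        Ok(())
    }
}
```

Without the restore step, a single failed preload would leave `lazy_spawn` empty and every later call would return `Ok(())` against a backend that never started.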

## Wontfix

- AOT artifact key tamper / dylib integrity — requires compromised
filesystem, by-design trust boundary
- Control commands dropped under saturation — by-design `try_send`
fire-and-forget
- code_hash spoofing via compile APIs — `JitEvm` always derives hash
from bytecode; internal API surface
- set_enabled bypass for compile APIs — `enabled` is documented as
lookup-only flag
- log vs log_full in compiled inspect — JIT can't provide
`&mut Interpreter` during execution; architectural limitation
- SharedBuffer divergence in CALL builtins — architectural limitation of
JIT memory model

## Deferred

- Stack section overflow/underflow ordering — requires richer section
analysis, complex architectural change