feat(block): block-level COW overlay with dirty bitmap tracking#5
feat(block): block-level COW overlay with dirty bitmap tracking#5meAmitPatil wants to merge 9 commits into
Conversation
Signed-off-by: Amit Patil <iamitpatil2001@gmail.com>
Signed-off-by: Amit Patil <iamitpatil2001@gmail.com>
Signed-off-by: Amit Patil <iamitpatil2001@gmail.com>
…stence Signed-off-by: Amit Patil <iamitpatil2001@gmail.com>
Signed-off-by: Amit Patil <iamitpatil2001@gmail.com>
Signed-off-by: Amit Patil <iamitpatil2001@gmail.com>
…rruption Signed-off-by: Amit Patil <iamitpatil2001@gmail.com>
Signed-off-by: Amit Patil <iamitpatil2001@gmail.com>
The implementation diverged from our stub API in four ways: - DirtyBitmap::new takes (disk_size_bytes, block_size) not (num_blocks) - bm.set(offset, len) not bm.set(block_idx) - bm.clear() not bm.clear_all(); bm.dirty_count() not bm.dirty_block_count() - OverlayFileEngine::from_files() not with_block_size(); read/write take GuestMemoryMmap + GuestAddress not plain byte slices - No reset() method — reset bench now uses update_overlay() to swap in a fresh file, which is the actual public reset path Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
dirty_bitmap.rs and overlay_io.rs were TDD stubs written before the real implementation landed in feat/block-overlay-cow (PR #5). Now that PR #5 has the authoritative implementation, the stubs would only cause merge conflicts. The bench files are all that belongs here. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
pavitrabhalla
left a comment
There was a problem hiding this comment.
Code Review: Block-level COW overlay with dirty bitmap tracking
Overall: Strong implementation. The architecture is clean with well-separated modules (dirty_bitmap, overlay_io, delta). The COW pattern, delta format with CRC64 integrity, DISCARD support, and snapshot persistence are all well thought out. 40 unit tests provide good coverage of the core modules.
CRITICAL
1. write_block_deltas silently swallows per-device errors (src/vmm/src/device_manager/mod.rs:352-371)
The method logs errors with error!() but always returns Ok(()). If a delta write fails for any device, the snapshot completes successfully but the delta file is missing/incomplete. On clone-restore, apply_delta will fail with a confusing I/O error instead of failing at snapshot time where the problem actually occurred.
if let Err(e) = block.write_delta(&delta_path) {
error!("Failed to write block delta for {}: {:?}", block.id(), e);
// BUG: should return Err here, or at least collect errors
}Suggest: collect the first error and return it, or return a Vec<(String, DeltaError)> so the caller can decide.
MAJOR
2. FileEngine::file() panics for overlay (src/vmm/src/devices/virtio/block/virtio/io/mod.rs:98)
FileEngine::Overlay(_) => unimplemented!("overlay engine has two files"),Any code path that calls file() on an overlay device will panic. This should return a Result or at minimum provide a meaningful alternative (e.g., return the overlay file). Check all callers of file() to ensure none can reach this for overlay devices.
3. from_file for Overlay returns misleading error (src/vmm/src/devices/virtio/block/virtio/io/mod.rs:70-77)
Returns SizeMismatch { base_size: 0, overlay_size: 0 } which is semantically incorrect. Add a dedicated error variant like OverlayIoError::NotConstructibleFromFile to avoid confusion during debugging.
4. DISCARD length truncation (src/vmm/src/devices/virtio/block/virtio/request.rs:395-410)
let len = u64::from(num_sectors) << SECTOR_SHIFT;
if let Err(e) = disk.file_engine.discard(offset, len as u32) {len is u64 but discard() takes u32. For disks >2GB with a single large TRIM, len as u32 silently truncates. Either:
- Change
discard()to acceptu64, or - Validate/cap
lenbefore the cast and split large discards into multiple calls
5. update_overlay doesn't reset bitmap (src/vmm/src/devices/virtio/block/virtio/io/overlay_io.rs:421-423)
After update_overlay(new_file), the existing bitmap may not match the new overlay's contents. Since update_disk_image already rejects overlay devices (good!), document this method as internal-only or make it pub(crate).
MINOR
6. unsafe impl Send may be unnecessary (overlay_io.rs:394)
The safety comment ("File is send and ultimately a POD") is incorrect — File is already Send because it wraps a file descriptor. Check whether OverlayFileEngine auto-derives Send without the unsafe impl. If it does, remove the manual impl.
7. Unused variable start_block (overlay_io.rs:2584)
let start_block = offset / block_size; // computed but never read8. No sync_all() after delta write (delta.rs:1470)
write_delta calls writer.flush() but not sync_all(). If the system crashes between flush and the snapshot metadata write, the delta file may be partially written to the page cache but not persisted. The data CRC would catch corruption on apply, but the failure mode is "delta file exists but is truncated/corrupt" rather than "delta file doesn't exist" — the latter is easier to handle.
9. read_obj silently defaults to 0 on failure (request.rs:3222-3225)
let sector: u64 = mem.read_obj(seg_addr).unwrap_or(0);
let num_sectors: u32 = mem.read_obj(GuestAddress(seg_addr.0 + 8)).unwrap_or(0);If guest memory read fails, this silently discards offset 0 with length 0 instead of returning an error. This could mask guest memory corruption.
OBSERVATIONS
- Snapshot version bump 9.0.0 → 9.1.0 is correct for backward-compatible additions. The
#[serde(default)]onoverlay_stateensures old snapshots deserialize correctly. ✅ - DISCARD code path has no test coverage (only the overlay_io unit test covers discard indirectly through the bitmap).
- Benchmark scripts use hardcoded paths (
~/firecracker/build/...) — fine for manual use but won't work in CI. - The
iter_dirtyfilter uses*bit == trueinstead of idiomatic**bit— cosmetic only.
Summary
The core COW overlay, dirty bitmap, and delta format are solid and well-tested. The main concern is the silent error swallowing in write_block_deltas (CRITICAL) and the DISCARD u32 truncation (MAJOR). Fix those and this is ready to merge.
Signed-off-by: Amit Patil <iamitpatil2001@gmail.com>
Signed-off-by: Amit Patil <iamitpatil2001@gmail.com>
f58bb4f to
440a303
Compare
|
Tested end-to-end on a real host (GCE us-central1, ext4 persistent disk, kernel 6.17.0). Built this branch at commit 440a303 for Cold-boot → snapshot → restore round-trip
LGTM — good to merge. |
Signed-off-by: Amit Patil <iamitpatil2001@gmail.com>
Summary
Adds a block-level copy-on-write overlay filesystem to Firecracker's virtio block device, enabling efficient snapshot/clone/reset for sandbox workloads.
io_engine: "Overlay",base_path,block_delta_diron create/load snapshotBenchmark Results (bare metal — AMD Ryzen 9 7950X3D, NVMe)
Snapshot / Clone
Reset (return to clean state)
Test plan
./tools/devtool checkbuild --all