[EVFS] Per-segment encryption, integrity, and secure deletion #54

Description

@Adel-Ayoub

Summary

Implement HKDF-based key derivation for domain-separated sub-keys, a per-segment compress-then-encrypt pipeline with
generation-aware nonce derivation, BLAKE3 integrity checksums (computed on the original plaintext, before compression)
with constant-time verification, vault pre-allocation with CSPRNG fill, and secure deletion (overwriting deleted
regions with random bytes).


What to Build

1. New file: rust/src/api/evfs/segment.rs

use crate::api::compression::{self, CompressionAlgorithm, CompressionConfig};
use crate::core::error::CryptoError;
use crate::core::format::Algorithm;
use crate::core::secret::SecretBuffer;
use std::fs::File;
use std::io::{Seek, SeekFrom, Write};

/// HKDF domain-separation info strings.
const CIPHER_KEY_INFO: &[u8] = b"msec-vault-cipher-key";
const NONCE_KEY_INFO: &[u8]  = b"msec-vault-nonce-key";
const INDEX_KEY_INFO: &[u8]  = b"msec-vault-index-key";

/// Derived sub-keys from the master key.
/// All fields use SecretBuffer for ZeroizeOnDrop.
pub struct VaultKeys {
    pub cipher_key: SecretBuffer,  // 32 bytes — for segment AEAD
    pub nonce_key: SecretBuffer,   // 32 bytes — for nonce derivation
    pub index_key: SecretBuffer,   // 32 bytes — for index AEAD
}

/// Derive three sub-keys from the user's master key via HKDF-SHA256.
///
/// Each sub-key uses a unique info string for domain separation:
/// - cipher_key: HKDF-Expand(master, info="msec-vault-cipher-key", 32)
/// - nonce_key:  HKDF-Expand(master, info="msec-vault-nonce-key",  32)
/// - index_key:  HKDF-Expand(master, info="msec-vault-index-key",  32)
pub fn derive_vault_keys(master_key: &[u8]) -> Result<VaultKeys, CryptoError> { ... }

/// Derive a deterministic nonce for a segment.
///
/// nonce = HKDF-Expand(nonce_key, info = segment_index_le_bytes || generation_le_bytes, nonce_len)
///
/// Including the generation counter ensures that overwriting a segment
/// at the same index produces a different nonce, preventing nonce reuse.
///
/// nonce_len: 12 for AES-GCM / ChaCha20-Poly1305.
pub fn derive_segment_nonce(
    nonce_key: &[u8],
    segment_index: u64,
    generation: u64,
    nonce_len: usize,
) -> Result<Vec<u8>, CryptoError> { ... }

/// Compress-then-encrypt a segment's plaintext data.
///
/// Pipeline:
/// 1. If compression != None AND segment name is not an already-compressed
///    format (MIME-aware skip via `should_skip_compression`), compress the data
/// 2. Derive nonce from (nonce_key, segment_index, generation)
/// 3. Encrypt compressed data with cipher_key using the derived nonce
/// 4. Return: (nonce || ciphertext || tag, effective_algorithm)
///
/// The effective algorithm may differ from the requested one if MIME-aware
/// skip was triggered (e.g., requested Zstd but segment name is "photo.jpg"
/// → effective = None).
///
/// BLAKE3 checksum is computed by the caller on the original plaintext
/// BEFORE calling this function, so integrity covers user data, not
/// the compressed form.
pub fn encrypt_segment(
    cipher_key: &[u8],
    nonce_key: &[u8],
    algorithm: Algorithm,
    segment_index: u64,
    generation: u64,
    plaintext: &[u8],
    segment_name: &str,
    compression: &CompressionConfig,
) -> Result<(Vec<u8>, CompressionAlgorithm), CryptoError> {
    // Determine effective compression (MIME-aware skip)
    let effective_algo = if compression.algorithm != CompressionAlgorithm::None
        && compression::should_skip_compression(segment_name)
    {
        CompressionAlgorithm::None
    } else {
        compression.algorithm
    };

    // Compress
    let data = if effective_algo != CompressionAlgorithm::None {
        compression::compress(plaintext, &CompressionConfig {
            algorithm: effective_algo,
            level: compression.level,
        })?
    } else {
        plaintext.to_vec()
    };

    // Derive nonce and encrypt
    let nonce = derive_segment_nonce(nonce_key, segment_index, generation, 12)?;
    // ... AEAD encrypt `data` with nonce ...
    // Return (nonce || ciphertext || tag, effective_algo)
    Ok((encrypted, effective_algo))
}

/// Decrypt-then-decompress a segment's encrypted data.
///
/// Pipeline:
/// 1. Extract nonce from the front of the encrypted data
/// 2. Derive the expected nonce from (nonce_key, segment_index, generation)
/// 3. Verify the extracted nonce matches the derived nonce
/// 4. Decrypt with cipher_key
/// 5. If compression != None, decompress
/// 6. Return original plaintext
///
/// The `compression` argument comes from the `SegmentEntry.compression`
/// field stored in the index — no guessing needed.
pub fn decrypt_segment(
    cipher_key: &[u8],
    nonce_key: &[u8],
    algorithm: Algorithm,
    segment_index: u64,
    generation: u64,
    encrypted: &[u8],
    compression: CompressionAlgorithm,
) -> Result<Vec<u8>, CryptoError> {
    // Derive expected nonce
    let expected_nonce = derive_segment_nonce(nonce_key, segment_index, generation, 12)?;
    // ... extract nonce, verify, AEAD decrypt ...

    // Decompress if needed
    let plaintext = if compression != CompressionAlgorithm::None {
        compression::decompress(&decrypted, compression)?
    } else {
        decrypted
    };

    Ok(plaintext)
}

/// Encrypt the segment index using the index sub-key.
///
/// Uses a fixed info string for nonce derivation so the index
/// can be decrypted without knowing any segment metadata.
pub fn encrypt_index(
    index_key: &[u8],
    algorithm: Algorithm,
    plaintext: &[u8],
) -> Result<Vec<u8>, CryptoError> { ... }

/// Decrypt the segment index using the index sub-key.
pub fn decrypt_index(
    index_key: &[u8],
    algorithm: Algorithm,
    encrypted: &[u8],
) -> Result<Vec<u8>, CryptoError> { ... }

/// Compute BLAKE3 checksum of plaintext data (pre-compression).
pub fn compute_checksum(data: &[u8]) -> [u8; 32] {
    blake3::hash(data).into()
}

/// Verify BLAKE3 checksum using constant-time comparison.
///
/// Uses subtle::ConstantTimeEq to prevent timing side-channels.
pub fn verify_checksum(data: &[u8], expected: &[u8; 32]) -> bool {
    use subtle::ConstantTimeEq;
    let actual = compute_checksum(data);
    actual.ct_eq(expected).into()
}

/// Pre-allocate a vault file filled with CSPRNG random data.
///
/// Writes in 64KB chunks to keep memory constant for large vaults.
pub fn preallocate_vault(path: &str, total_size: u64) -> Result<(), CryptoError> { ... }

/// Securely erase a region of the vault file by overwriting with CSPRNG bytes.
///
/// Writes random data in 64KB chunks, then fsyncs.
/// Used when deleting a segment to destroy old ciphertext.
pub fn secure_erase_region(
    file: &mut File,
    offset: u64,
    size: u64,
) -> Result<(), CryptoError> { ... }

2. Update rust/src/api/evfs/mod.rs

Add pub mod segment; (if not already present).


Files to Create / Modify

| File | Action |
|------|--------|
| rust/src/api/evfs/segment.rs | Create: key derivation, compress+encrypt/decrypt+decompress, checksums, pre-allocation, secure erase |
| rust/src/api/evfs/mod.rs | Edit: add pub mod segment; |

Tests

Key derivation

  • test_derive_vault_keys_deterministic — same master key always produces same sub-keys
  • test_derive_vault_keys_domain_separation — cipher_key, nonce_key, index_key are all different
  • test_derive_vault_keys_different_master — different master keys produce different sub-keys

Nonce derivation

  • test_nonce_deterministic — same (nonce_key, index, generation) always produces same nonce
  • test_nonce_unique_by_index — different segment indices produce different nonces
  • test_nonce_unique_by_generation — same index with different generation produces different nonces
  • test_nonce_length_12 — produces 12-byte nonce for AES-GCM / ChaCha20
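
The uniqueness tests above depend on the HKDF info input being unambiguous for every (segment_index, generation) pair. A minimal, std-only sketch of how that info could be assembled follows; the helper name `segment_nonce_info` is hypothetical, and the HKDF-Expand call itself is left out because it depends on the crate chosen for HKDF.

```rust
/// Build the HKDF `info` input for a segment nonce:
/// segment_index (8 bytes, little-endian) || generation (8 bytes, little-endian).
///
/// Hypothetical helper for illustration; not part of the specified API.
pub fn segment_nonce_info(segment_index: u64, generation: u64) -> [u8; 16] {
    let mut info = [0u8; 16];
    // Both fields are fixed-width, so the concatenation cannot be
    // misparsed: (1, 0) and (0, 1) always yield different byte strings.
    info[..8].copy_from_slice(&segment_index.to_le_bytes());
    info[8..].copy_from_slice(&generation.to_le_bytes());
    info
}
```

Because each field is always 8 bytes, no length prefix or separator is needed between them; a variable-width encoding would need one to stay collision-free.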

Segment encrypt/decrypt (no compression)

  • test_segment_encrypt_decrypt_roundtrip — encrypt then decrypt with CompressionAlgorithm::None, verify identical plaintext
  • test_segment_wrong_generation_fails — encrypt with gen=0, decrypt with gen=1 fails
  • test_segment_wrong_key_fails — decrypt with wrong cipher_key fails
  • test_segment_tampered_ciphertext_fails — flip one byte in ciphertext, decryption fails
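
The wrong-generation test relies on steps 1 to 3 of the decrypt pipeline: split the stored nonce off the front of the blob and compare it with the derived one before attempting AEAD decryption. A rough sketch of that framing check, assuming a 12-byte nonce and a 16-byte tag (the tag size used by AES-GCM and ChaCha20-Poly1305); `split_frame` and `FrameError` are hypothetical names, and the real code would map failures to `CryptoError`.

```rust
const NONCE_LEN: usize = 12;
const TAG_LEN: usize = 16;

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
pub enum FrameError {
    TooShort,
    NonceMismatch,
}

/// Split `nonce || ciphertext || tag` and check the stored nonce against
/// the nonce derived from (nonce_key, segment_index, generation).
/// Returns the `ciphertext || tag` part on success.
pub fn split_frame<'a>(
    encrypted: &'a [u8],
    expected_nonce: &[u8; NONCE_LEN],
) -> Result<&'a [u8], FrameError> {
    // A valid frame holds at least a nonce and a tag (empty plaintext).
    if encrypted.len() < NONCE_LEN + TAG_LEN {
        return Err(FrameError::TooShort);
    }
    let (nonce, body) = encrypted.split_at(NONCE_LEN);
    // The nonce is stored in the clear, so a plain comparison is enough here.
    if nonce != &expected_nonce[..] {
        return Err(FrameError::NonceMismatch);
    }
    Ok(body)
}
```

Even without this check, decrypting with the wrong generation would fail at the AEAD tag; the explicit comparison only makes the failure reason clearer.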

Segment compress+encrypt/decrypt+decompress

  • test_segment_zstd_roundtrip — compress+encrypt then decrypt+decompress with Zstd, verify identical plaintext
  • test_segment_brotli_roundtrip — compress+encrypt then decrypt+decompress with Brotli, verify identical plaintext
  • test_segment_mime_skip — segment named "photo.jpg" with Zstd config results in effective_algo = None, roundtrip still works
  • test_segment_compressed_smaller — compressible data with Zstd produces smaller encrypted output than CompressionAlgorithm::None
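
For readers unfamiliar with the MIME-aware skip, the sketch below shows one way such a check could work, keyed on the file extension. This is not the actual `compression::should_skip_compression` implementation, which this issue does not show; the function name `skip_by_extension` and the extension list are assumptions for illustration.

```rust
/// Hypothetical stand-in for a MIME-aware skip check: returns true when the
/// segment name ends in an extension that usually denotes compressed data.
pub fn skip_by_extension(segment_name: &str) -> bool {
    // Assumed list; the real module may recognise a different set.
    const COMPRESSED_EXTS: [&str; 8] =
        ["jpg", "jpeg", "png", "mp4", "zip", "gz", "zst", "br"];

    match segment_name.rsplit_once('.') {
        // Compare case-insensitively so "PHOTO.JPG" is treated like "photo.jpg".
        Some((_, ext)) => COMPRESSED_EXTS
            .iter()
            .any(|known| ext.eq_ignore_ascii_case(known)),
        // No extension: nothing to go on, so do not skip.
        None => false,
    }
}
```

Whatever rule is used, the effective algorithm has to be recorded in the index entry so that decryption knows whether to decompress.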

Index encryption

  • test_index_encrypt_decrypt_roundtrip — encrypt index bytes, decrypt, verify identical

Checksums

  • test_checksum_roundtrip — compute and verify returns true
  • test_checksum_tampered — flip one byte in data, verify returns false
  • test_checksum_covers_original_plaintext — checksum computed on pre-compression data; decompress+verify succeeds
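
To illustrate what constant-time comparison means for the 32-byte checksum, here is a std-only sketch of the underlying idea: fold every byte difference into one accumulator instead of returning at the first mismatch. It is not a replacement for `subtle::ConstantTimeEq`, which the spec requires; `ct_eq_32` is a hypothetical name.

```rust
/// Compare two 32-byte digests without an early exit.
///
/// Illustration only: an optimizing compiler is free to rewrite this loop,
/// which is why the spec calls for `subtle::ConstantTimeEq` instead.
pub fn ct_eq_32(a: &[u8; 32], b: &[u8; 32]) -> bool {
    let mut diff = 0u8;
    for i in 0..32 {
        // XOR is zero only for equal bytes; OR accumulates any difference.
        diff |= a[i] ^ b[i];
    }
    diff == 0
}
```

A plain `a == b` may stop at the first differing byte, so its running time can reveal how long the matching prefix is.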

Pre-allocation and secure erase

  • test_preallocate_size — file is exactly total_size bytes
  • test_preallocate_random_fill — file is not all zeros
  • test_preallocate_streaming — 10MB allocation succeeds without large memory spike
  • test_secure_erase_region — after erase, the overwritten region no longer equals the original data
  • test_secure_erase_fsyncs — verify fsync is called (integration test)
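
Both preallocate_vault and secure_erase_region come down to the same loop: fill a fixed 64KB buffer, write it, repeat, then fsync. A sketch of that loop using only std follows. The `fill` closure stands in for the CSPRNG, which std does not provide, and `overwrite_region` is a hypothetical name; the real functions would return `CryptoError` rather than `io::Error`.

```rust
use std::fs::File;
use std::io::{self, Seek, SeekFrom, Write};

const CHUNK: usize = 64 * 1024;

/// Overwrite `size` bytes starting at `offset`, one 64KB chunk at a time,
/// then flush to stable storage. Memory use stays at one chunk regardless
/// of `size`. `fill` is a placeholder for a CSPRNG fill routine.
pub fn overwrite_region<F: FnMut(&mut [u8])>(
    file: &mut File,
    offset: u64,
    size: u64,
    mut fill: F,
) -> io::Result<()> {
    let mut buf = vec![0u8; CHUNK];
    file.seek(SeekFrom::Start(offset))?;

    let mut remaining = size;
    while remaining > 0 {
        // The last chunk may be shorter than CHUNK.
        let n = remaining.min(CHUNK as u64) as usize;
        fill(&mut buf[..n]); // fresh bytes for every chunk
        file.write_all(&buf[..n])?;
        remaining -= n as u64;
    }
    // sync_all asks the OS to flush data and metadata to the device.
    file.sync_all()
}
```

Pre-allocation is the same call with offset 0 and size total_size on a newly created file. On copy-on-write filesystems or flash storage with wear levelling, an in-place overwrite may not reach the original physical blocks, so the overwrite should be treated as best effort.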

How to Run

cd rust
cargo test evfs::segment --features compression

Acceptance Criteria

  • HKDF derives three distinct sub-keys with domain separation from a single master key
  • Nonce derivation includes generation counter — overwrite produces a different nonce
  • Encrypting with gen=0 and decrypting with gen=1 fails (nonce mismatch detected)
  • Compress-then-encrypt pipeline works with Zstd, Brotli, and None
  • MIME-aware skip: segment named "photo.jpg" bypasses compression even when Zstd is requested
  • decrypt_segment reads CompressionAlgorithm from the SegmentEntry — no guessing
  • BLAKE3 checksum is computed on original plaintext (pre-compression), not compressed data
  • BLAKE3 checksum comparison uses constant-time via subtle::ConstantTimeEq
  • Index encryption/decryption roundtrips correctly using separate index_key
  • Pre-allocation writes in 64KB chunks (constant memory for large vaults)
  • secure_erase_region overwrites deleted data with CSPRNG bytes and calls fsync
  • cargo test passes all tests
