Skip to content

feat: add an option to disable caching#49

Open
antouhou wants to merge 1 commit into
mainfrom
feat-option-disable-cache
Open

feat: add an option to disable caching#49
antouhou wants to merge 1 commit into
mainfrom
feat-option-disable-cache

Conversation

@antouhou
Copy link
Copy Markdown
Owner

@antouhou antouhou commented Apr 18, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added configurable cache control to optimize memory usage based on your needs.
    • Added cache and pool statistics reporting for performance monitoring.
  • API Changes

    • Introduced RendererOptions to the public API for advanced configuration.
    • Added new renderer constructors (new_with_options, try_new_headless_with_options, new_headless_with_options) for explicit cache settings.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 18, 2026

📝 Walkthrough

Walkthrough

This PR introduces conditional cache management across the renderer by adding a new RendererOptions struct that controls whether reusable caches (tessellation, gradient, bind group, and texture pooling) are enabled. The option is threaded through renderer construction and affects cache behavior in multiple components via runtime flags.

Changes

Cohort / File(s) Summary
Cache Infrastructure
src/cache.rs, src/util.rs
Made tessellation and gradient caches optional via Option<LruCache> fields. Both Cache::new and GradientCache::new now accept an enabled: bool parameter. Cache read/write operations gracefully skip when caches are disabled. PoolManager::new threads enable_reusable_caches to both caches.
Renderer Options & Public API
src/renderer.rs, src/lib.rs
Introduced new RendererOptions struct with enable_reusable_caches: bool field, Default impl, and without_reusable_caches() constructor. Exported RendererOptions as part of public API. Extended Renderer to store renderer_options field.
Renderer Construction
src/renderer/construction.rs
Added new_with_options(), try_new_headless_with_options(), and new_headless_with_options() constructors accepting RendererOptions. Existing constructors delegate to new variants with RendererOptions::default(). Threaded enable_reusable_caches to TextureManager and PoolManager. Enhanced print_memory_usage_info() with deduplicated geometry tracking, MSAA texture estimates, and pool statistics.
Cache Application in Rendering
src/renderer/rendering.rs, src/texture_manager.rs
Added conditional caching logic: offscreen texture recycling respects enable_reusable_caches flag (clears pool when disabled). TextureManager conditionally caches bind groups based on cache_bind_groups parameter wired from renderer_options.
Offscreen Pool Diagnostics
src/effect.rs
Added OffscreenTexturePoolStats struct with texture count and byte estimates. Implemented OffscreenTexturePool::clear() and stats() methods for pool introspection.
Test Updates
src/shape.rs
Updated rect_tessellation_uses_shared_quad_corners test to pass new enable_reusable_caches parameter to PoolManager.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 Caches now dance to a configurable beat,
Optional tessellations, gradients so sweet,
Bind groups and textures can choose their own way,
Reuse or clear—we'll decide every day!
✨🎨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title directly and accurately summarizes the main objective of the changeset: introducing a runtime option to disable caching throughout the renderer.
Docstring Coverage ✅ Passed Docstring coverage is 94.59% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat-option-disable-cache

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
src/texture_manager.rs (1)

271-320: Behavior correct; the cache is still touched from allocate_texture/remove_texture.

When cache_bind_groups is false, get_or_create_shape_bind_group correctly skips both the read and the insert, so no entries ever accumulate. However, allocate_texture (Line 129) and remove_texture (Line 229) still take a write lock on shape_bind_group_cache and run retain(...) on every call. On an empty map this is a cheap no-op, but it still takes an exclusive RwLock write. For hot texture churn, gating those retain calls behind self.cache_bind_groups would avoid unnecessary write-lock contention when caching is disabled.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/texture_manager.rs` around lines 271 - 320, The allocate_texture and
remove_texture methods still take a write lock and call retain on
shape_bind_group_cache even when cache_bind_groups is false; change both
allocate_texture and remove_texture to first check self.cache_bind_groups and
only then acquire the write lock and call retain (i.e., guard the retain(...)
and the write() unwrap behind if self.cache_bind_groups { ... }) so no exclusive
RwLock write is taken when caching is disabled, leaving other logic unchanged.
src/renderer.rs (1)

63-82: Consider #[non_exhaustive] + builder-style constructors for forward compatibility.

RendererOptions exposes enable_reusable_caches as a public field with Default set to true. Any future option added here will be a breaking change for callers that construct the struct with a struct literal (e.g., RendererOptions { enable_reusable_caches: false }). Two low-cost options:

♻️ Suggested shape
+#[non_exhaustive]
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 pub struct RendererOptions {
     pub enable_reusable_caches: bool,
 }

 impl RendererOptions {
+    pub const fn new() -> Self {
+        Self { enable_reusable_caches: true }
+    }
+
+    pub const fn with_reusable_caches(mut self, enabled: bool) -> Self {
+        self.enable_reusable_caches = enabled;
+        self
+    }
+
     pub const fn without_reusable_caches() -> Self {
         Self { enable_reusable_caches: false }
     }
 }

With #[non_exhaustive], external crates must go through Default/constructors, so adding fields later stays source-compatible.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer.rs` around lines 63 - 82, RendererOptions currently exposes a
public field and Default, which makes adding fields a breaking change; mark the
struct with #[non_exhaustive] and make fields private, then provide
constructor(s) and builder-style methods (e.g., RendererOptions::default(),
RendererOptions::without_reusable_caches(), and a chainable
.enable_reusable_caches(bool) or .set_enable_reusable_caches(bool)) so external
callers must use the provided APIs rather than struct literals; update impl
Default and the existing without_reusable_caches() to construct via the private
fields and ensure Clone/Copy/Debug/PartialEq/Eq derive remain valid.
src/renderer/rendering.rs (1)

391-397: Cache-disable path looks correct; minor efficiency note.

When enable_reusable_caches is off you both clear textures_to_recycle (which would otherwise be appended into the pool) and call offscreen_texture_pool.clear(). The second call is only necessary if some other path can leave entries behind (e.g., trim on resize). If acquire is the only producer and everything acquired ends up in textures_to_recycle, the pool is already empty here and clear() is redundant — but keeping it as a defensive invariant is fine. Consider a brief comment to note the intent ("ensure no pool retention when caches are disabled").

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/rendering.rs` around lines 391 - 397, The cache-disable branch
currently clears textures_to_recycle and calls
self.offscreen_texture_pool.clear(), which is redundant if acquire is the only
producer but intentionally defensive; add a brief comment near the
enable_reusable_caches conditional (referencing
renderer_options.enable_reusable_caches, offscreen_texture_pool.recycle,
textures_to_recycle, and offscreen_texture_pool.clear) explaining that clear()
is kept intentionally to ensure no pool retention when caches are disabled
(covers other producers like trim on resize) so future readers understand the
invariant.
src/renderer/construction.rs (1)

653-669: new_transparent silently ignores custom RendererOptions.

Since new_transparent now delegates through newnew_with_options(.., RendererOptions::default()), callers have no way to disable reusable caches for transparent windows. Consider adding a new_transparent_with_options counterpart for API symmetry with the other constructors introduced in this PR.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/construction.rs` around lines 653 - 669, new_transparent
currently forces RendererOptions::default() via the new → new_with_options path,
preventing callers from supplying custom RendererOptions (e.g., to disable
reusable caches); add a new constructor new_transparent_with_options(window:
impl Into<SurfaceTarget<'static>>, physical_size: (u32,u32), scale_factor: f64,
vsync: bool, msaa_samples: u32, options: RendererOptions) that mirrors
new_transparent but forwards the provided options into new_with_options(...,
options, true) (or the equivalent parameter order), so callers can pass custom
RendererOptions for transparent windows and preserve API symmetry with the other
constructors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/effect.rs`:
- Around line 199-219: The stats() implementation in effect.rs assumes 4 bytes
per pixel for color and depth/stencil which is incorrect for variable formats;
update stats() to compute per-texture byte sizes using each texture's actual
format instead of hard-coded 4 (e.g., call
texture.color_texture.format().block_copy_size(None) or equivalent to get
bytes-per-block and multiply by pixel count and sample_count), and similarly
derive the depth/stencil size from the created depth texture's format used in
create_pooled_texture (or document that the numbers are approximate if you
prefer not to query formats); ensure OffscreenTexturePoolStats fields
(estimated_color_bytes, estimated_resolve_bytes, estimated_depth_stencil_bytes)
accumulate these computed sizes and keep pooled_textures = self.available.len()
unchanged so acquire callers relying on format selection are accurately
reflected.

In `@src/renderer/construction.rs`:
- Around line 484-493: The diagnostic byte estimate for depth/stencil uses the
color-texture formula (size.width * size.height * 4 * self.msaa_sample_count)
which is wrong for depth formats; update the logic in the block that checks
self.depth_stencil_texture (and prints "Depth/stencil texture") to query the
texture format (e.g. via the texture's format()/descriptor or matching the
format used when creating the depth texture) and compute bytes_per_sample
accordingly (use 8 for Depth32FloatStencil8, 4 for Depth24PlusStencil8 or
platform-dependent formats, otherwise fall back to a conservative/approximate
value), multiply by width*height*msaa_sample_count, and include the actual
format or mark the printed "~bytes" as an approximation in the message (also
reference performance_measurement / DEPTH32FLOAT_STENCIL8 where relevant).

In `@src/util.rs`:
- Around line 137-159: get_or_create_default_ramp_texture currently skips
storing the singleton default_ramp_texture when self.enabled is false, causing
create_default_ramp_texture to be called on every draw; change the logic so the
default ramp (constructed via create_default_ramp_texture and wrapped in
Arc<CachedGradientRampTexture>) is cached into self.default_ramp_texture
unconditionally (regardless of the self.enabled flag) and returned, while
keeping the existing behavior for other cached items controlled by enabled;
update references to self.enabled and self.default_ramp_texture and ensure the
function still returns Arc<CachedGradientRampTexture>.

---

Nitpick comments:
In `@src/renderer.rs`:
- Around line 63-82: RendererOptions currently exposes a public field and
Default, which makes adding fields a breaking change; mark the struct with
#[non_exhaustive] and make fields private, then provide constructor(s) and
builder-style methods (e.g., RendererOptions::default(),
RendererOptions::without_reusable_caches(), and a chainable
.enable_reusable_caches(bool) or .set_enable_reusable_caches(bool)) so external
callers must use the provided APIs rather than struct literals; update impl
Default and the existing without_reusable_caches() to construct via the private
fields and ensure Clone/Copy/Debug/PartialEq/Eq derive remain valid.

In `@src/renderer/construction.rs`:
- Around line 653-669: new_transparent currently forces
RendererOptions::default() via the new → new_with_options path, preventing
callers from supplying custom RendererOptions (e.g., to disable reusable
caches); add a new constructor new_transparent_with_options(window: impl
Into<SurfaceTarget<'static>>, physical_size: (u32,u32), scale_factor: f64,
vsync: bool, msaa_samples: u32, options: RendererOptions) that mirrors
new_transparent but forwards the provided options into new_with_options(...,
options, true) (or the equivalent parameter order), so callers can pass custom
RendererOptions for transparent windows and preserve API symmetry with the other
constructors.

In `@src/renderer/rendering.rs`:
- Around line 391-397: The cache-disable branch currently clears
textures_to_recycle and calls self.offscreen_texture_pool.clear(), which is
redundant if acquire is the only producer but intentionally defensive; add a
brief comment near the enable_reusable_caches conditional (referencing
renderer_options.enable_reusable_caches, offscreen_texture_pool.recycle,
textures_to_recycle, and offscreen_texture_pool.clear) explaining that clear()
is kept intentionally to ensure no pool retention when caches are disabled
(covers other producers like trim on resize) so future readers understand the
invariant.

In `@src/texture_manager.rs`:
- Around line 271-320: The allocate_texture and remove_texture methods still
take a write lock and call retain on shape_bind_group_cache even when
cache_bind_groups is false; change both allocate_texture and remove_texture to
first check self.cache_bind_groups and only then acquire the write lock and call
retain (i.e., guard the retain(...) and the write() unwrap behind if
self.cache_bind_groups { ... }) so no exclusive RwLock write is taken when
caching is disabled, leaving other logic unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 31ce16e7-f55e-41e8-8100-c1b6dcf838ee

📥 Commits

Reviewing files that changed from the base of the PR and between f7de809 and b445e20.

📒 Files selected for processing (9)
  • src/cache.rs
  • src/effect.rs
  • src/lib.rs
  • src/renderer.rs
  • src/renderer/construction.rs
  • src/renderer/rendering.rs
  • src/shape.rs
  • src/texture_manager.rs
  • src/util.rs

Comment thread src/effect.rs
Comment on lines +199 to +219
pub(crate) fn stats(&self) -> OffscreenTexturePoolStats {
let mut estimated_color_bytes = 0_u64;
let mut estimated_resolve_bytes = 0_u64;
let mut estimated_depth_stencil_bytes = 0_u64;

for texture in &self.available {
let pixels = texture.width as u64 * texture.height as u64;
estimated_color_bytes += pixels * 4 * texture.sample_count as u64;
if texture.resolve_texture.is_some() {
estimated_resolve_bytes += pixels * 4;
}
estimated_depth_stencil_bytes += pixels * 4 * texture.sample_count as u64;
}

OffscreenTexturePoolStats {
pooled_textures: self.available.len(),
estimated_color_bytes,
estimated_resolve_bytes,
estimated_depth_stencil_bytes,
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Memory estimate assumes 4 B/pixel color & depth/stencil.

estimated_color_bytes multiplies by a hard-coded 4, but pooled textures use the surface config.format (see acquire call sites), which can be e.g. Rgba16Float (8 B) or Rgba8Unorm (4 B). Similarly, the depth/stencil is hard-coded Depth24PlusStencil8 in create_pooled_texture, which on most backends is 4 B — fine in practice but driver‑dependent. Since this is reported via print_memory_usage_info, consider either documenting it as an approximation or deriving the color byte size from color_texture.format().block_copy_size(None) to keep the number meaningful when formats change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/effect.rs` around lines 199 - 219, The stats() implementation in
effect.rs assumes 4 bytes per pixel for color and depth/stencil which is
incorrect for variable formats; update stats() to compute per-texture byte sizes
using each texture's actual format instead of hard-coded 4 (e.g., call
texture.color_texture.format().block_copy_size(None) or equivalent to get
bytes-per-block and multiply by pixel count and sample_count), and similarly
derive the depth/stencil size from the created depth texture's format used in
create_pooled_texture (or document that the numbers are approximate if you
prefer not to query formats); ensure OffscreenTexturePoolStats fields
(estimated_color_bytes, estimated_resolve_bytes, estimated_depth_stencil_bytes)
accumulate these computed sizes and keep pooled_textures = self.available.len()
unchanged so acquire callers relying on format selection are accurately
reflected.

Comment on lines +484 to +493
if let Some(tex) = &self.depth_stencil_texture {
let size = tex.size();
println!(
"Depth/stencil texture: {}x{} samples={} (~{} bytes)",
size.width,
size.height,
self.msaa_sample_count,
size.width as u64 * size.height as u64 * 4 * self.msaa_sample_count as u64
);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Depth/stencil byte estimate uses a color-texture formula.

The estimate width * height * 4 * sample_count reuses the MSAA color formula, but depth/stencil formats aren't 4 bytes per sample (Depth32FloatStencil8 is 8 bytes per sample including padding; Depth24PlusStencil8 is 4 but backend-dependent). Given this is purely diagnostic output, it's low impact, but the "~bytes" number will be noticeably off for the performance_measurement feature which forces DEPTH32FLOAT_STENCIL8. Consider labeling it as an approximation or matching against the actual format if available.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/construction.rs` around lines 484 - 493, The diagnostic byte
estimate for depth/stencil uses the color-texture formula (size.width *
size.height * 4 * self.msaa_sample_count) which is wrong for depth formats;
update the logic in the block that checks self.depth_stencil_texture (and prints
"Depth/stencil texture") to query the texture format (e.g. via the texture's
format()/descriptor or matching the format used when creating the depth texture)
and compute bytes_per_sample accordingly (use 8 for Depth32FloatStencil8, 4 for
Depth24PlusStencil8 or platform-dependent formats, otherwise fall back to a
conservative/approximate value), multiply by width*height*msaa_sample_count, and
include the actual format or mark the printed "~bytes" as an approximation in
the message (also reference performance_measurement / DEPTH32FLOAT_STENCIL8
where relevant).

Comment thread src/util.rs
Comment on lines 137 to 159
fn get_or_create_default_ramp_texture(
&mut self,
device: &wgpu::Device,
queue: &wgpu::Queue,
) -> Arc<CachedGradientRampTexture> {
if let Some(default_ramp_texture) = &self.default_ramp_texture {
return default_ramp_texture.clone();
if self.enabled {
if let Some(default_ramp_texture) = &self.default_ramp_texture {
return default_ramp_texture.clone();
}
}

let (texture, view) = create_default_ramp_texture(device, queue);
let default_ramp_texture = Arc::new(CachedGradientRampTexture {
_texture: texture,
view: Arc::new(view),
});
self.default_ramp_texture = Some(default_ramp_texture.clone());

if self.enabled {
self.default_ramp_texture = Some(default_ramp_texture.clone());
}

default_ramp_texture
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Performance: default ramp texture is reallocated on every call when caches are disabled.

With enabled == false, default_ramp_texture is never stored, so every gradient draw with a constant color reaches this path and calls create_default_ramp_texture(device, queue) — creating a fresh wgpu::Texture + view + queue upload per shape, per frame. This can be a significant hidden cost compared to the "enabled" path and is a behavior-level regression for users that just want to disable reusable content caches but still get the same per-frame work budget.

Consider always caching the default ramp texture (it has no key/memory pressure dimension — it's a singleton) even when enabled is false:

♻️ Proposed fix
     fn get_or_create_default_ramp_texture(
         &mut self,
         device: &wgpu::Device,
         queue: &wgpu::Queue,
     ) -> Arc<CachedGradientRampTexture> {
-        if self.enabled {
-            if let Some(default_ramp_texture) = &self.default_ramp_texture {
-                return default_ramp_texture.clone();
-            }
+        if let Some(default_ramp_texture) = &self.default_ramp_texture {
+            return default_ramp_texture.clone();
         }

         let (texture, view) = create_default_ramp_texture(device, queue);
         let default_ramp_texture = Arc::new(CachedGradientRampTexture {
             _texture: texture,
             view: Arc::new(view),
         });
-
-        if self.enabled {
-            self.default_ramp_texture = Some(default_ramp_texture.clone());
-        }
-
+        self.default_ramp_texture = Some(default_ramp_texture.clone());
         default_ramp_texture
     }

The default ramp is a fixed, tiny singleton — it does not participate in the LRU/memory-budget concern that motivated disabling the caches.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn get_or_create_default_ramp_texture(
&mut self,
device: &wgpu::Device,
queue: &wgpu::Queue,
) -> Arc<CachedGradientRampTexture> {
if let Some(default_ramp_texture) = &self.default_ramp_texture {
return default_ramp_texture.clone();
if self.enabled {
if let Some(default_ramp_texture) = &self.default_ramp_texture {
return default_ramp_texture.clone();
}
}
let (texture, view) = create_default_ramp_texture(device, queue);
let default_ramp_texture = Arc::new(CachedGradientRampTexture {
_texture: texture,
view: Arc::new(view),
});
self.default_ramp_texture = Some(default_ramp_texture.clone());
if self.enabled {
self.default_ramp_texture = Some(default_ramp_texture.clone());
}
default_ramp_texture
}
fn get_or_create_default_ramp_texture(
&mut self,
device: &wgpu::Device,
queue: &wgpu::Queue,
) -> Arc<CachedGradientRampTexture> {
if let Some(default_ramp_texture) = &self.default_ramp_texture {
return default_ramp_texture.clone();
}
let (texture, view) = create_default_ramp_texture(device, queue);
let default_ramp_texture = Arc::new(CachedGradientRampTexture {
_texture: texture,
view: Arc::new(view),
});
self.default_ramp_texture = Some(default_ramp_texture.clone());
default_ramp_texture
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/util.rs` around lines 137 - 159, get_or_create_default_ramp_texture
currently skips storing the singleton default_ramp_texture when self.enabled is
false, causing create_default_ramp_texture to be called on every draw; change
the logic so the default ramp (constructed via create_default_ramp_texture and
wrapped in Arc<CachedGradientRampTexture>) is cached into
self.default_ramp_texture unconditionally (regardless of the self.enabled flag)
and returned, while keeping the existing behavior for other cached items
controlled by enabled; update references to self.enabled and
self.default_ramp_texture and ensure the function still returns
Arc<CachedGradientRampTexture>.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant