feat: add an option to disable caching#49
Conversation
📝 WalkthroughWalkthroughThis PR introduces conditional cache management across the renderer by adding a new Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (4)
src/texture_manager.rs (1)
271-320: Behavior correct; the cache is still touched fromallocate_texture/remove_texture.When
cache_bind_groupsisfalse,get_or_create_shape_bind_groupcorrectly skips both the read and the insert, so no entries ever accumulate. However,allocate_texture(Line 129) andremove_texture(Line 229) still take a write lock onshape_bind_group_cacheand runretain(...)on every call. On an empty map this is a cheap no-op, but it still takes an exclusiveRwLockwrite. For hot texture churn, gating thoseretaincalls behindself.cache_bind_groupswould avoid unnecessary write-lock contention when caching is disabled.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/texture_manager.rs` around lines 271 - 320, The allocate_texture and remove_texture methods still take a write lock and call retain on shape_bind_group_cache even when cache_bind_groups is false; change both allocate_texture and remove_texture to first check self.cache_bind_groups and only then acquire the write lock and call retain (i.e., guard the retain(...) and the write() unwrap behind if self.cache_bind_groups { ... }) so no exclusive RwLock write is taken when caching is disabled, leaving other logic unchanged.src/renderer.rs (1)
63-82: Consider#[non_exhaustive]+ builder-style constructors for forward compatibility.
RendererOptionsexposesenable_reusable_cachesas a public field withDefaultset totrue. Any future option added here will be a breaking change for callers that construct the struct with a struct literal (e.g.,RendererOptions { enable_reusable_caches: false }). Two low-cost options:♻️ Suggested shape
+#[non_exhaustive] #[derive(Debug, Clone, Copy, PartialEq, Eq)] pub struct RendererOptions { pub enable_reusable_caches: bool, } impl RendererOptions { + pub const fn new() -> Self { + Self { enable_reusable_caches: true } + } + + pub const fn with_reusable_caches(mut self, enabled: bool) -> Self { + self.enable_reusable_caches = enabled; + self + } + pub const fn without_reusable_caches() -> Self { Self { enable_reusable_caches: false } } }With
#[non_exhaustive], external crates must go throughDefault/constructors, so adding fields later stays source-compatible.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/renderer.rs` around lines 63 - 82, RendererOptions currently exposes a public field and Default, which makes adding fields a breaking change; mark the struct with #[non_exhaustive] and make fields private, then provide constructor(s) and builder-style methods (e.g., RendererOptions::default(), RendererOptions::without_reusable_caches(), and a chainable .enable_reusable_caches(bool) or .set_enable_reusable_caches(bool)) so external callers must use the provided APIs rather than struct literals; update impl Default and the existing without_reusable_caches() to construct via the private fields and ensure Clone/Copy/Debug/PartialEq/Eq derive remain valid.src/renderer/rendering.rs (1)
391-397: Cache-disable path looks correct; minor efficiency note.When
enable_reusable_cachesis off you bothcleartextures_to_recycle(which would otherwise be appended into the pool) and calloffscreen_texture_pool.clear(). The second call is only necessary if some other path can leave entries behind (e.g.,trimon resize). Ifacquireis the only producer and everything acquired ends up intextures_to_recycle, the pool is already empty here andclear()is redundant — but keeping it as a defensive invariant is fine. Consider a brief comment to note the intent ("ensure no pool retention when caches are disabled").🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/renderer/rendering.rs` around lines 391 - 397, The cache-disable branch currently clears textures_to_recycle and calls self.offscreen_texture_pool.clear(), which is redundant if acquire is the only producer but intentionally defensive; add a brief comment near the enable_reusable_caches conditional (referencing renderer_options.enable_reusable_caches, offscreen_texture_pool.recycle, textures_to_recycle, and offscreen_texture_pool.clear) explaining that clear() is kept intentionally to ensure no pool retention when caches are disabled (covers other producers like trim on resize) so future readers understand the invariant.src/renderer/construction.rs (1)
653-669:new_transparentsilently ignores customRendererOptions.Since
new_transparentnow delegates throughnew→new_with_options(.., RendererOptions::default()), callers have no way to disable reusable caches for transparent windows. Consider adding anew_transparent_with_optionscounterpart for API symmetry with the other constructors introduced in this PR.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/renderer/construction.rs` around lines 653 - 669, new_transparent currently forces RendererOptions::default() via the new → new_with_options path, preventing callers from supplying custom RendererOptions (e.g., to disable reusable caches); add a new constructor new_transparent_with_options(window: impl Into<SurfaceTarget<'static>>, physical_size: (u32,u32), scale_factor: f64, vsync: bool, msaa_samples: u32, options: RendererOptions) that mirrors new_transparent but forwards the provided options into new_with_options(..., options, true) (or the equivalent parameter order), so callers can pass custom RendererOptions for transparent windows and preserve API symmetry with the other constructors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/effect.rs`:
- Around line 199-219: The stats() implementation in effect.rs assumes 4 bytes
per pixel for color and depth/stencil which is incorrect for variable formats;
update stats() to compute per-texture byte sizes using each texture's actual
format instead of hard-coded 4 (e.g., call
texture.color_texture.format().block_copy_size(None) or equivalent to get
bytes-per-block and multiply by pixel count and sample_count), and similarly
derive the depth/stencil size from the created depth texture's format used in
create_pooled_texture (or document that the numbers are approximate if you
prefer not to query formats); ensure OffscreenTexturePoolStats fields
(estimated_color_bytes, estimated_resolve_bytes, estimated_depth_stencil_bytes)
accumulate these computed sizes and keep pooled_textures = self.available.len()
unchanged so acquire callers relying on format selection are accurately
reflected.
In `@src/renderer/construction.rs`:
- Around line 484-493: The diagnostic byte estimate for depth/stencil uses the
color-texture formula (size.width * size.height * 4 * self.msaa_sample_count)
which is wrong for depth formats; update the logic in the block that checks
self.depth_stencil_texture (and prints "Depth/stencil texture") to query the
texture format (e.g. via the texture's format()/descriptor or matching the
format used when creating the depth texture) and compute bytes_per_sample
accordingly (use 8 for Depth32FloatStencil8, 4 for Depth24PlusStencil8 or
platform-dependent formats, otherwise fall back to a conservative/approximate
value), multiply by width*height*msaa_sample_count, and include the actual
format or mark the printed "~bytes" as an approximation in the message (also
reference performance_measurement / DEPTH32FLOAT_STENCIL8 where relevant).
In `@src/util.rs`:
- Around line 137-159: get_or_create_default_ramp_texture currently skips
storing the singleton default_ramp_texture when self.enabled is false, causing
create_default_ramp_texture to be called on every draw; change the logic so the
default ramp (constructed via create_default_ramp_texture and wrapped in
Arc<CachedGradientRampTexture>) is cached into self.default_ramp_texture
unconditionally (regardless of the self.enabled flag) and returned, while
keeping the existing behavior for other cached items controlled by enabled;
update references to self.enabled and self.default_ramp_texture and ensure the
function still returns Arc<CachedGradientRampTexture>.
---
Nitpick comments:
In `@src/renderer.rs`:
- Around line 63-82: RendererOptions currently exposes a public field and
Default, which makes adding fields a breaking change; mark the struct with
#[non_exhaustive] and make fields private, then provide constructor(s) and
builder-style methods (e.g., RendererOptions::default(),
RendererOptions::without_reusable_caches(), and a chainable
.enable_reusable_caches(bool) or .set_enable_reusable_caches(bool)) so external
callers must use the provided APIs rather than struct literals; update impl
Default and the existing without_reusable_caches() to construct via the private
fields and ensure Clone/Copy/Debug/PartialEq/Eq derive remain valid.
In `@src/renderer/construction.rs`:
- Around line 653-669: new_transparent currently forces
RendererOptions::default() via the new → new_with_options path, preventing
callers from supplying custom RendererOptions (e.g., to disable reusable
caches); add a new constructor new_transparent_with_options(window: impl
Into<SurfaceTarget<'static>>, physical_size: (u32,u32), scale_factor: f64,
vsync: bool, msaa_samples: u32, options: RendererOptions) that mirrors
new_transparent but forwards the provided options into new_with_options(...,
options, true) (or the equivalent parameter order), so callers can pass custom
RendererOptions for transparent windows and preserve API symmetry with the other
constructors.
In `@src/renderer/rendering.rs`:
- Around line 391-397: The cache-disable branch currently clears
textures_to_recycle and calls self.offscreen_texture_pool.clear(), which is
redundant if acquire is the only producer but intentionally defensive; add a
brief comment near the enable_reusable_caches conditional (referencing
renderer_options.enable_reusable_caches, offscreen_texture_pool.recycle,
textures_to_recycle, and offscreen_texture_pool.clear) explaining that clear()
is kept intentionally to ensure no pool retention when caches are disabled
(covers other producers like trim on resize) so future readers understand the
invariant.
In `@src/texture_manager.rs`:
- Around line 271-320: The allocate_texture and remove_texture methods still
take a write lock and call retain on shape_bind_group_cache even when
cache_bind_groups is false; change both allocate_texture and remove_texture to
first check self.cache_bind_groups and only then acquire the write lock and call
retain (i.e., guard the retain(...) and the write() unwrap behind if
self.cache_bind_groups { ... }) so no exclusive RwLock write is taken when
caching is disabled, leaving other logic unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 31ce16e7-f55e-41e8-8100-c1b6dcf838ee
📒 Files selected for processing (9)
src/cache.rssrc/effect.rssrc/lib.rssrc/renderer.rssrc/renderer/construction.rssrc/renderer/rendering.rssrc/shape.rssrc/texture_manager.rssrc/util.rs
| pub(crate) fn stats(&self) -> OffscreenTexturePoolStats { | ||
| let mut estimated_color_bytes = 0_u64; | ||
| let mut estimated_resolve_bytes = 0_u64; | ||
| let mut estimated_depth_stencil_bytes = 0_u64; | ||
|
|
||
| for texture in &self.available { | ||
| let pixels = texture.width as u64 * texture.height as u64; | ||
| estimated_color_bytes += pixels * 4 * texture.sample_count as u64; | ||
| if texture.resolve_texture.is_some() { | ||
| estimated_resolve_bytes += pixels * 4; | ||
| } | ||
| estimated_depth_stencil_bytes += pixels * 4 * texture.sample_count as u64; | ||
| } | ||
|
|
||
| OffscreenTexturePoolStats { | ||
| pooled_textures: self.available.len(), | ||
| estimated_color_bytes, | ||
| estimated_resolve_bytes, | ||
| estimated_depth_stencil_bytes, | ||
| } | ||
| } |
There was a problem hiding this comment.
Memory estimate assumes 4 B/pixel color & depth/stencil.
estimated_color_bytes multiplies by a hard-coded 4, but pooled textures use the surface config.format (see acquire call sites), which can be e.g. Rgba16Float (8 B) or Rgba8Unorm (4 B). Similarly, the depth/stencil is hard-coded Depth24PlusStencil8 in create_pooled_texture, which on most backends is 4 B — fine in practice but driver‑dependent. Since this is reported via print_memory_usage_info, consider either documenting it as an approximation or deriving the color byte size from color_texture.format().block_copy_size(None) to keep the number meaningful when formats change.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/effect.rs` around lines 199 - 219, The stats() implementation in
effect.rs assumes 4 bytes per pixel for color and depth/stencil which is
incorrect for variable formats; update stats() to compute per-texture byte sizes
using each texture's actual format instead of hard-coded 4 (e.g., call
texture.color_texture.format().block_copy_size(None) or equivalent to get
bytes-per-block and multiply by pixel count and sample_count), and similarly
derive the depth/stencil size from the created depth texture's format used in
create_pooled_texture (or document that the numbers are approximate if you
prefer not to query formats); ensure OffscreenTexturePoolStats fields
(estimated_color_bytes, estimated_resolve_bytes, estimated_depth_stencil_bytes)
accumulate these computed sizes and keep pooled_textures = self.available.len()
unchanged so acquire callers relying on format selection are accurately
reflected.
| if let Some(tex) = &self.depth_stencil_texture { | ||
| let size = tex.size(); | ||
| println!( | ||
| "Depth/stencil texture: {}x{} samples={} (~{} bytes)", | ||
| size.width, | ||
| size.height, | ||
| self.msaa_sample_count, | ||
| size.width as u64 * size.height as u64 * 4 * self.msaa_sample_count as u64 | ||
| ); | ||
| } |
There was a problem hiding this comment.
Depth/stencil byte estimate uses a color-texture formula.
The estimate width * height * 4 * sample_count reuses the MSAA color formula, but depth/stencil formats aren't 4 bytes per sample (Depth32FloatStencil8 is 8 bytes per sample including padding; Depth24PlusStencil8 is 4 but backend-dependent). Given this is purely diagnostic output, it's low impact, but the "~bytes" number will be noticeably off for the performance_measurement feature which forces DEPTH32FLOAT_STENCIL8. Consider labeling it as an approximation or matching against the actual format if available.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/renderer/construction.rs` around lines 484 - 493, The diagnostic byte
estimate for depth/stencil uses the color-texture formula (size.width *
size.height * 4 * self.msaa_sample_count) which is wrong for depth formats;
update the logic in the block that checks self.depth_stencil_texture (and prints
"Depth/stencil texture") to query the texture format (e.g. via the texture's
format()/descriptor or matching the format used when creating the depth texture)
and compute bytes_per_sample accordingly (use 8 for Depth32FloatStencil8, 4 for
Depth24PlusStencil8 or platform-dependent formats, otherwise fall back to a
conservative/approximate value), multiply by width*height*msaa_sample_count, and
include the actual format or mark the printed "~bytes" as an approximation in
the message (also reference performance_measurement / DEPTH32FLOAT_STENCIL8
where relevant).
| fn get_or_create_default_ramp_texture( | ||
| &mut self, | ||
| device: &wgpu::Device, | ||
| queue: &wgpu::Queue, | ||
| ) -> Arc<CachedGradientRampTexture> { | ||
| if let Some(default_ramp_texture) = &self.default_ramp_texture { | ||
| return default_ramp_texture.clone(); | ||
| if self.enabled { | ||
| if let Some(default_ramp_texture) = &self.default_ramp_texture { | ||
| return default_ramp_texture.clone(); | ||
| } | ||
| } | ||
|
|
||
| let (texture, view) = create_default_ramp_texture(device, queue); | ||
| let default_ramp_texture = Arc::new(CachedGradientRampTexture { | ||
| _texture: texture, | ||
| view: Arc::new(view), | ||
| }); | ||
| self.default_ramp_texture = Some(default_ramp_texture.clone()); | ||
|
|
||
| if self.enabled { | ||
| self.default_ramp_texture = Some(default_ramp_texture.clone()); | ||
| } | ||
|
|
||
| default_ramp_texture | ||
| } |
There was a problem hiding this comment.
Performance: default ramp texture is reallocated on every call when caches are disabled.
With enabled == false, default_ramp_texture is never stored, so every gradient draw with a constant color reaches this path and calls create_default_ramp_texture(device, queue) — creating a fresh wgpu::Texture + view + queue upload per shape, per frame. This can be a significant hidden cost compared to the "enabled" path and is a behavior-level regression for users that just want to disable reusable content caches but still get the same per-frame work budget.
Consider always caching the default ramp texture (it has no key/memory pressure dimension — it's a singleton) even when enabled is false:
♻️ Proposed fix
fn get_or_create_default_ramp_texture(
&mut self,
device: &wgpu::Device,
queue: &wgpu::Queue,
) -> Arc<CachedGradientRampTexture> {
- if self.enabled {
- if let Some(default_ramp_texture) = &self.default_ramp_texture {
- return default_ramp_texture.clone();
- }
+ if let Some(default_ramp_texture) = &self.default_ramp_texture {
+ return default_ramp_texture.clone();
}
let (texture, view) = create_default_ramp_texture(device, queue);
let default_ramp_texture = Arc::new(CachedGradientRampTexture {
_texture: texture,
view: Arc::new(view),
});
-
- if self.enabled {
- self.default_ramp_texture = Some(default_ramp_texture.clone());
- }
-
+ self.default_ramp_texture = Some(default_ramp_texture.clone());
default_ramp_texture
}The default ramp is a fixed, tiny singleton — it does not participate in the LRU/memory-budget concern that motivated disabling the caches.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn get_or_create_default_ramp_texture( | |
| &mut self, | |
| device: &wgpu::Device, | |
| queue: &wgpu::Queue, | |
| ) -> Arc<CachedGradientRampTexture> { | |
| if let Some(default_ramp_texture) = &self.default_ramp_texture { | |
| return default_ramp_texture.clone(); | |
| if self.enabled { | |
| if let Some(default_ramp_texture) = &self.default_ramp_texture { | |
| return default_ramp_texture.clone(); | |
| } | |
| } | |
| let (texture, view) = create_default_ramp_texture(device, queue); | |
| let default_ramp_texture = Arc::new(CachedGradientRampTexture { | |
| _texture: texture, | |
| view: Arc::new(view), | |
| }); | |
| self.default_ramp_texture = Some(default_ramp_texture.clone()); | |
| if self.enabled { | |
| self.default_ramp_texture = Some(default_ramp_texture.clone()); | |
| } | |
| default_ramp_texture | |
| } | |
| fn get_or_create_default_ramp_texture( | |
| &mut self, | |
| device: &wgpu::Device, | |
| queue: &wgpu::Queue, | |
| ) -> Arc<CachedGradientRampTexture> { | |
| if let Some(default_ramp_texture) = &self.default_ramp_texture { | |
| return default_ramp_texture.clone(); | |
| } | |
| let (texture, view) = create_default_ramp_texture(device, queue); | |
| let default_ramp_texture = Arc::new(CachedGradientRampTexture { | |
| _texture: texture, | |
| view: Arc::new(view), | |
| }); | |
| self.default_ramp_texture = Some(default_ramp_texture.clone()); | |
| default_ramp_texture | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/util.rs` around lines 137 - 159, get_or_create_default_ramp_texture
currently skips storing the singleton default_ramp_texture when self.enabled is
false, causing create_default_ramp_texture to be called on every draw; change
the logic so the default ramp (constructed via create_default_ramp_texture and
wrapped in Arc<CachedGradientRampTexture>) is cached into
self.default_ramp_texture unconditionally (regardless of the self.enabled flag)
and returned, while keeping the existing behavior for other cached items
controlled by enabled; update references to self.enabled and
self.default_ramp_texture and ensure the function still returns
Arc<CachedGradientRampTexture>.
Summary by CodeRabbit
Release Notes
New Features
API Changes
RendererOptionsto the public API for advanced configuration.new_with_options,try_new_headless_with_options,new_headless_with_options) for explicit cache settings.