Skip to content

Conversation

@lilith
Copy link
Contributor

@lilith lilith commented Dec 29, 2025

Summary

RCT (Reversible Color Transform) operations in modular decoding perform integer arithmetic that can overflow with certain input values. In debug mode, this causes a panic. In release mode, the behavior is undefined (though typically wraps on most platforms).

This PR:

  • Adds wrapping_add and wrapping_sub methods to the I32SimdVec trait
  • Implements these for all SIMD backends (scalar, SSE4.2, AVX, AVX-512, NEON)
  • Uses wrapping arithmetic in RCT operations to ensure deterministic behavior matching libjxl

For SIMD types, integer addition/subtraction already wraps (this is inherent to SIMD integer ops), so the new methods simply delegate to +/-. For scalar i32, they call the standard library's wrapping_add/wrapping_sub.

Test plan

  • Added unit tests for RCT overflow handling
  • Added unit tests verifying correct RCT operation results
  • Existing test suite passes

🤖 Generated with Claude Code

@google-cla
Copy link

google-cla bot commented Dec 29, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Copy link
Member

@veluca93 veluca93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should instead have the scalar implementations of the SIMD operations have wrapping semantics.

@github-actions
Copy link

Benchmark @ 90926e8


====================================================================================================
MULTI-FILE BENCHMARK RESULTS (4 files, 5 revisions)
  https://github.com/zond/jxl-perfhistory
  CPU architecture: x86_64
  WARNING: System appears noisy: high system load (2.46). Results may be unreliable.
====================================================================================================

Statistics:
  Revisions:                        5
  Confidence:                    99.0%
  Max relative error:             3.0%

[ 1] 9bf697f0 Merge 90926e8eaa01698211ae65c9b17ddd09dfac9a7d into 50794...
     vs 90926e8e fix: use wrapping arithmetic for RCT transforms to preven...
----------------------------------------------------------------------------------------------------------------------------------------------------------------
     bike.jxl                   |                            ├───────────│──────┤                                  |     23204746 px/s | 0.999 / prev 
     green_queen_modular_e3.jxl |                               ├───────│╂┤                                        |      6261117 px/s | 0.996 / prev 
     green_queen_vardct_e3.jxl  |                         ├───────────│──╂┤                                        |     19576745 px/s | 0.991 / prev 
     sunset_logo.jxl            |                                   ├────│┤                                        |      2238610 px/s | 0.999 / prev 
                                  Scale: 0.89 to 1.11 (1.0 = same speed, '┃' marks 1.0)

[ 2] 90926e8e fix: use wrapping arithmetic for RCT transforms to preven...
     vs 50794263 Report benchmark failures in comments
----------------------------------------------------------------------------------------------------------------------------------------------------------------
     bike.jxl                   |                                   ├────╂│─────┤                                  |     23227283 px/s | 1.002 / prev 
     green_queen_modular_e3.jxl |                                   ├────╂│┤                                       |      6286841 px/s | 1.002 / prev 
     green_queen_vardct_e3.jxl  |                               ├────────╂──│───┤                                  |     19753425 px/s | 1.009 / prev 
     sunset_logo.jxl            |                            ├───────────╂│┤                                       |      2241071 px/s | 1.002 / prev 
                                  Scale: 0.89 to 1.11 (1.0 = same speed, '┃' marks 1.0)

[ 3] 50794263 Report benchmark failures in comments
     vs f7d6dc98 Make AUTHORS check optional
----------------------------------------------------------------------------------------------------------------------------------------------------------------
     bike.jxl                   |                                  ├─────│──────┤                                  |     23184628 px/s | 0.999 / prev 
     green_queen_modular_e3.jxl |                                     ├─│╂┤                                        |      6275948 px/s | 0.997 / prev 
     green_queen_vardct_e3.jxl  |                       ├────────────│───╂┤                                        |     19582241 px/s | 0.989 / prev 
     sunset_logo.jxl            |                                ├──────│╂┤                                        |      2237646 px/s | 0.998 / prev 
                                  Scale: 0.89 to 1.11 (1.0 = same speed, '┃' marks 1.0)

[ 4] f7d6dc98 Make AUTHORS check optional
     vs 2268d66b Replace `std::cell::RefCell` with `AtomicRefCell` (#593)
----------------------------------------------------------------------------------------------------------------------------------------------------------------
     bike.jxl                   |                              ├────────│╂───┤                                     |     23204471 px/s | 0.998 / prev 
     green_queen_modular_e3.jxl |                                      ├─│─┤                                       |      6292703 px/s | 1.001 / prev 
     green_queen_vardct_e3.jxl  |                                   ├────╂────│─────┤                              |     19794369 px/s | 1.013 / prev 
     sunset_logo.jxl            |                                        ├│─┤                                      |      2242301 px/s | 1.002 / prev 
                                  Scale: 0.89 to 1.11 (1.0 = same speed, '┃' marks 1.0)

[ 5] 2268d66b Replace `std::cell::RefCell` with `AtomicRefCell` (#593) (oldest, baseline for comparisons)

================================================================================================================================================================

…antics

SIMD integer instructions wrap on overflow. The scalar fallback was using
bare i32, which panics on overflow in debug builds.

Changed scalar I32Vec from i32 to std::num::Wrapping<i32>. This makes +/-/*
operators wrap automatically, matching SIMD behavior.
@lilith lilith force-pushed the pr/fix-rct-overflow branch from 90926e8 to 0cdd96b Compare January 2, 2026 09:13
@lilith
Copy link
Contributor Author

lilith commented Jan 2, 2026

Updated to use std::num::Wrapping<i32> as the scalar I32Vec type instead of adding trait methods. Is this what you had in mind?

This makes +/-/* operators wrap automatically for the scalar path, matching SIMD hardware behavior.

@veluca93
Copy link
Member

veluca93 commented Jan 5, 2026

Can we do this with u32 too for consistency? Also, can you update the PR description?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants