Collaborator

@hjanuschka hjanuschka commented Dec 22, 2025

Summary

Adds optimized fast paths for all 8 orientation variants in the low-memory save stage.

Non-transposing orientations:

  • Identity: existing fast path (memcpy/SIMD interleave)
  • FlipVertical: reuses identity with reversed output row
  • FlipHorizontal: reverses pixel order within each row
  • Rotate180: combines both (reversed row + reversed pixels)
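
A minimal sketch of how the four non-transposing cases can share one row-store routine, assuming interleaved `u8` pixel data and a tightly packed output buffer. The function `store_row` and its parameters are hypothetical names chosen for illustration; this is not the PR's actual code, which also has SIMD paths and other data formats.

```rust
/// Hypothetical helper (not taken from the PR): stores one input row of
/// interleaved pixels into `out`, optionally mirrored on either axis.
///
/// flip_v = false, flip_h = false -> Identity
/// flip_v = true,  flip_h = false -> FlipVertical
/// flip_v = false, flip_h = true  -> FlipHorizontal
/// flip_v = true,  flip_h = true  -> Rotate180
fn store_row(
    out: &mut [u8],
    row: &[u8],
    y: usize,
    height: usize,
    bytes_per_pixel: usize,
    flip_v: bool,
    flip_h: bool,
) {
    // Assumption: output rows are tightly packed, so the stride is the row length.
    let stride = row.len();
    // Vertical flip only changes which output row receives the data.
    let out_y = if flip_v { height - 1 - y } else { y };
    let dst = &mut out[out_y * stride..(out_y + 1) * stride];
    if flip_h {
        // Horizontal flip reverses the order of whole pixels; the channel
        // bytes inside each pixel keep their original order.
        for (d, s) in dst
            .chunks_exact_mut(bytes_per_pixel)
            .zip(row.chunks_exact(bytes_per_pixel).rev())
        {
            d.copy_from_slice(s);
        }
    } else {
        // Identity and FlipVertical reduce to a plain row copy.
        dst.copy_from_slice(row);
    }
}
```

In this sketch FlipVertical costs nothing beyond the identity copy, since only the destination row index changes.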

Transposing orientations (input row → output column):

  • Transpose: output_x = input_y, output_y = input_x
  • Rotate90Cw: output_x = height-1-input_y, output_y = input_x
  • Rotate90Ccw: output_x = input_y, output_y = width-1-input_x
  • AntiTranspose: output_x = height-1-input_y, output_y = width-1-input_x
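
The coordinate mappings above can be sketched as follows, assuming `width` and `height` are the input dimensions (so the output is `height` pixels wide and `width` pixels tall). The enum `Transposing` and the functions `map_xy` and `scatter_row` are hypothetical stand-ins for illustration, limited to single-channel `u8` data; they are not the crate's own types or the PR's implementation.

```rust
/// Local stand-in for the four transposing orientations named in the PR
/// description (hypothetical type, not the crate's own enum).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Transposing {
    Transpose,
    Rotate90Cw,
    Rotate90Ccw,
    AntiTranspose,
}

/// Maps an input pixel (x, y) to its output position (output_x, output_y).
/// `width` and `height` are the input dimensions.
fn map_xy(o: Transposing, x: usize, y: usize, width: usize, height: usize) -> (usize, usize) {
    match o {
        Transposing::Transpose => (y, x),
        Transposing::Rotate90Cw => (height - 1 - y, x),
        Transposing::Rotate90Ccw => (y, width - 1 - x),
        Transposing::AntiTranspose => (height - 1 - y, width - 1 - x),
    }
}

/// Scatters one input row into the output buffer, one pixel per output row
/// (input row -> output column). Single-channel u8 data for brevity.
fn scatter_row(out: &mut [u8], row: &[u8], y: usize, height: usize, o: Transposing) {
    let width = row.len();
    for (x, &px) in row.iter().enumerate() {
        let (ox, oy) = map_xy(o, x, y, width, height);
        // The output is `height` pixels wide, so that is its row stride.
        out[oy * height + ox] = px;
    }
}
```

Each input row touches every output row once, which is why these cases are harder to make cache-friendly than the non-transposing ones.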

Supports all data formats (U8, U16/F16, F32) with 1-4 channels.

Test plan

  • All 32 orientation tests pass

github-actions bot commented Dec 22, 2025

Benchmark @ f6de453

MULTI-FILE BENCHMARK RESULTS (4 files)
  CPU architecture: x86_64
  WARNING: System appears noisy: high system load (2.10). Results may be unreliable.
Statistics:
  Confidence:               99.0%
  Max relative error:        3.0%

Comparing: 37288890 (Base) vs 0be4cd63 (PR)

File                          Base (MP/s)   PR (MP/s)   Δ%
bike.jxl                      21.676        21.795      +0.55% ±1.6%
green_queen_modular_e3.jxl     8.281         8.307      +0.30% ±0.4%
green_queen_vardct_e3.jxl     19.908        20.044      +0.69% ±2.3%
sunset_logo.jxl                2.325         2.333      +0.32% ±0.9%

Member

@veluca93 veluca93 left a comment

I'd prefer to just add FlipVertical for now, and think later about how to best implement other transformations.

Non-transposing orientations:
- Identity: existing fast path (memcpy/SIMD interleave)
- FlipVertical: reuses identity with reversed output row
- FlipHorizontal: reverses pixel order within each row
- Rotate180: combines both (reversed row + reversed pixels)

Transposing orientations (input row → output column):
- Transpose: output_x = input_y, output_y = input_x
- Rotate90Cw: output_x = height-1-input_y, output_y = input_x
- Rotate90Ccw: output_x = input_y, output_y = width-1-input_x
- AntiTranspose: output_x = height-1-input_y, output_y = width-1-input_x

All formats (U8, U16/F16, F32) with 1-4 channels supported.
All 32 orientation tests pass.

Address review feedback: remove flip_horizontal.rs and transpose.rs,
keeping only the FlipVertical optimization for now. Other orientation
fast paths can be added in a follow-up PR.
@veluca93 veluca93 merged commit 4ac4460 into libjxl:main Jan 5, 2026
17 checks passed