Skip to content

Fallback fft_r2c #1708

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 29, 2025
Merged

Fallback fft_r2c #1708

merged 4 commits into from
May 29, 2025

Conversation

CuiYifeng
Copy link
Contributor

@CuiYifeng CuiYifeng commented May 28, 2025

  • Removed conditional compilation logic for USE_ONEMKL_XPU in _fft_r2c_xpu and _fft_r2c_xpu_out functions to perform the FFT computation on the CPU and transferring the result back to the XPU.
  • Added a test case, test_mem_eff_attention_fail_with_batch_size_geq_65536, to the skip list due to a known issue with CUDA not being enabled.

@Copilot Copilot AI review requested due to automatic review settings May 28, 2025 17:58
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the XPU FFT routines to always use a CPU-based MKL fallback for _fft_r2c_xpu and _fft_r2c_xpu_out, removing the USE_ONEMKL_XPU conditional paths and consolidating logic for CPU computation followed by device transfer.

  • Always call native::_fft_r2c_mkl on CPU and transfer result to XPU.
  • Removed conditional headers and unified includes for FFT ops.
  • Simplified output resizing and copying logic in the _out variant.
Comments suppressed due to low confidence (3)

src/ATen/native/xpu/SpectralOps.cpp:6

  • These FFT includes (_fft_c2c_native.h and _fft_c2r_native.h) are not used in the r2c fallback logic and can be removed to reduce unused dependencies.
#include <ATen/ops/_fft_c2c_native.h>

src/ATen/native/xpu/SpectralOps.cpp:90

  • Consider adding or updating tests to cover this CPU-to-XPU fallback path to ensure correctness and performance characteristics on XPU when MKL is not available.
Tensor out_cpu = native::_fft_r2c_mkl(

src/ATen/native/xpu/SpectralOps.cpp:87

  • [nitpick] The CPU fallback transfer logic is duplicated across both functions; consider extracting the pattern (CPU compute + XPU copy) into a shared helper to reduce duplication.
Tensor _fft_r2c_xpu(

@CuiYifeng CuiYifeng added the mkl label May 29, 2025
@xytintel xytintel added this pull request to the merge queue May 29, 2025
Merged via the queue into main with commit a66c0fd May 29, 2025
7 checks passed
@xytintel xytintel deleted the yifeng/fallback_r2c branch May 29, 2025 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants