
Conversation

mayank31398 (Collaborator)

No description provided.

Signed-off-by: Mayank Mishra <[email protected]>
gemini-code-assist bot (Contributor)

Summary of Changes

Hello @mayank31398, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces foundational support for multi-accelerator environments by refactoring how kernel backends are managed and selected. It centralizes the KernelBackend enum, implements dynamic detection of the appropriate backend based on the device in use, and provides context managers for explicit backend control. This change streamlines the API for various operations by removing the need for explicit backend parameters, making the system more adaptable to different hardware configurations.

Highlights

  • Refactored Kernel Backend Management: The KernelBackend enum has been moved from fma/enums.py to a new dedicated module fma/kernel_backend.py, centralizing backend-related logic.
  • Dynamic Accelerator Detection: Introduced a new static method KernelBackend.get_kernel_backend_from_device that dynamically determines the appropriate kernel backend (e.g., CPU, CUDA, ROCm, TPU, XPU, Triton, Torch) based on the input tensor's device type.
  • Backend Forcing Context Managers: Added force_torch_backend and force_triton_backend context managers to allow explicit overriding of the dynamically detected kernel backend for specific code blocks, useful for testing or specific execution requirements (a rough usage sketch follows this list).
  • Simplified API Signatures: The explicit kernel_backend parameter has been removed from the public API signatures of several functions and methods, including gru, bmm, gemm, and rmsnorm, as the backend is now determined automatically.
  • Updated Test Infrastructure: Existing tests for gru and bmm have been updated to utilize the new force_torch_backend context manager, ensuring compatibility and proper testing of the dynamic backend selection.
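
The exact signatures are not spelled out in this summary, so the snippet below is only a rough sketch of how the pieces named above might fit together. The call style of force_torch_backend (used as a context manager factory) and the argument accepted by get_kernel_backend_from_device (a tensor rather than a torch.device) are assumptions, and rmsnorm merely stands in for any of the simplified ops.

```python
import torch

from fma.kernel_backend import KernelBackend, force_torch_backend

x = torch.randn(4, 8, device="cuda" if torch.cuda.is_available() else "cpu")

# The backend is now inferred from the device of the input tensor instead of
# being passed explicitly (CPU, CUDA, ROCm, TPU, XPU, Triton, Torch, ...).
backend = KernelBackend.get_kernel_backend_from_device(x)

# Ops such as gru, bmm, gemm and rmsnorm no longer take a kernel_backend
# argument; they perform the same detection internally, e.g.:
#     y = rmsnorm(x, weight, eps=1e-6)

# For tests or debugging, the detected backend can be overridden explicitly.
with force_torch_backend():
    # Expected behaviour under the override (exact semantics may differ).
    assert KernelBackend.get_kernel_backend_from_device(x) == KernelBackend.torch
```
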
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder at the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request refactors the kernel backend selection to support multiple accelerators by determining the backend from the device of the input tensor. This is a good architectural improvement.

I've found a few critical issues:

  • A NameError in the new fma/kernel_backend.py file due to a missing global variable definition.
  • A test in tests/ops/bmm_test.py is broken because it still passes the removed kernel_backend argument.
  • The rmsnorm op is not correctly updated, which will cause it to fail on non-CUDA devices.

Additionally, some tests in tests/modules/gru_test.py (test_gru_varlen_torch and test_gru_varlen) were not updated and will also fail because they still pass the removed kernel_backend argument. Please update them to use the new force_..._backend context managers.

Once these issues are addressed, this will be a solid contribution.



fma/kernel_backend.py:

_IS_ROCM_AVAILABLE = torch.version.hip is not None
_FORCE_TRITON_BACKEND = False
gemini-code-assist bot (Contributor)

critical

The global variable _FORCE_TORCH_BACKEND is used in force_torch_backend and get_kernel_backend_from_device but it's not defined at the module level. This will cause a NameError when force_torch_backend is called. It should be defined and initialized to False here, similar to _FORCE_TRITON_BACKEND.
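
For reference, here is a minimal sketch of how the module-level flags and the context manager could fit together once _FORCE_TORCH_BACKEND is defined. The contextlib-based implementation and the save/restore behaviour are assumptions for illustration, not the actual code in this PR.

```python
# fma/kernel_backend.py (sketch of the suggested fix)
from contextlib import contextmanager

import torch

_IS_ROCM_AVAILABLE = torch.version.hip is not None
_FORCE_TRITON_BACKEND = False
_FORCE_TORCH_BACKEND = False  # the missing module-level default


@contextmanager
def force_torch_backend():
    # Temporarily override the device-based detection with the torch backend,
    # restoring the previous value even if the wrapped block raises.
    global _FORCE_TORCH_BACKEND
    previous = _FORCE_TORCH_BACKEND
    _FORCE_TORCH_BACKEND = True
    try:
        yield
    finally:
        _FORCE_TORCH_BACKEND = previous
```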

Comment on lines +81 to +90 of tests/ops/bmm_test.py
output_expected = bmm(
    A=A,
    B=B,
    C=C,
    alpha=alpha,
    beta=beta,
    is_A_transposed=is_A_transposed,
    is_B_transposed=is_B_transposed,
    kernel_backend=KernelBackend.torch,
)
gemini-code-assist bot (Contributor)

critical

The bmm function no longer accepts the kernel_backend argument. This call will raise a TypeError. The force_torch_backend context manager already ensures the torch backend is used, so the kernel_backend argument should be removed from this call.
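
A sketch of the corrected call, with the removed argument dropped and the override made explicit via the context manager; the variable names simply mirror the snippet quoted above.

```python
with force_torch_backend():
    output_expected = bmm(
        A=A,
        B=B,
        C=C,
        alpha=alpha,
        beta=beta,
        is_A_transposed=is_A_transposed,
        is_B_transposed=is_B_transposed,
    )
```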

Signed-off-by: Mayank Mishra <[email protected]>