
Backward register #423

Open · wants to merge 22 commits into master from bwd
Conversation

@StrongSpoon (Collaborator) commented Jan 16, 2025

PR Category

Operator

Type of Change

New Feature

Description

Register backward functions as aten interfaces (a registration sketch follows below).
Implement the threshold operator along the way.
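
For illustration only, a minimal sketch (not the PR's actual code) of what registering a backward op under the aten namespace can look like with PyTorch's torch.library API; `threshold_backward_sketch` is a hypothetical stand-in for the Triton-backed implementation:

```python
import torch

# aten schema for reference:
#   threshold_backward(Tensor grad_output, Tensor self, Scalar threshold) -> Tensor
def threshold_backward_sketch(grad_output, self_tensor, threshold):
    # Pass the gradient through where the input exceeded the threshold, else zero.
    return torch.where(self_tensor > threshold, grad_output, torch.zeros_like(grad_output))

# Hypothetical registration: override aten::threshold_backward for the CUDA backend.
aten_lib = torch.library.Library("aten", "IMPL")
aten_lib.impl("threshold_backward", threshold_backward_sketch, "CUDA")
```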

Issue

Progress

  • Change is properly reviewed (1 reviewer required, 2 recommended).
  • Change responds to an issue.
  • Change is fully covered by a UT.

Performance

@StrongSpoon force-pushed the bwd branch 2 times, most recently from 9f79739 to 01bee17 on February 6, 2025 09:26
@StrongSpoon marked this pull request as ready for review on February 11, 2025 02:04
save_invstd=None,
train=False,
eps=1e-05,
output_mask=None,

Contributor:
The last argument should be grad_input_mask.


Collaborator Author:
The definition of native_batch_norm_backward in the aten lib is as follows, and we keep it the same:
native_batch_norm_backward(Tensor grad_out, Tensor input, Tensor? weight, Tensor? running_mean, Tensor? running_var, Tensor? save_mean, Tensor? save_invstd, bool train, float eps, bool[3] output_mask) -> (Tensor, Tensor, Tensor)
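
For reference, a small example of calling the aten op with that schema; the final bool[3] output_mask selects which of (grad_input, grad_weight, grad_bias) are computed. Plain CPU tensors are used here just to illustrate the schema, not the Triton path:

```python
import torch

x = torch.randn(8, 4, 16, 16)
weight, bias = torch.ones(4), torch.zeros(4)
running_mean, running_var = torch.zeros(4), torch.ones(4)

out, save_mean, save_invstd = torch.ops.aten.native_batch_norm(
    x, weight, bias, running_mean, running_var, True, 0.1, 1e-5
)
grad_out = torch.randn_like(out)
# output_mask = [compute grad_input, compute grad_weight, compute grad_bias]
grad_input, grad_weight, grad_bias = torch.ops.aten.native_batch_norm_backward(
    grad_out, x, weight, running_mean, running_var,
    save_mean, save_invstd, True, 1e-5, [True, True, True],
)
```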

affine: tl.constexpr,
input_grad_mask: tl.constexpr,
weight_grad_mask: tl.constexpr,
bias_grad_mask: tl.constexpr,

Contributor:
The backward kernel may also need an is_train arg, to distinguish between the train and non-train cases.


Contributor:
We can leave it for future work, though.


Collaborator Author:
It's a bit complex; we'll fix it later.
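
A possible shape for that future work, sketched: a tl.constexpr flag lets Triton specialize the kernel for the train and eval paths at compile time. The math below is only a placeholder and the pointer names are hypothetical; the flag-threading pattern is the point.

```python
import triton
import triton.language as tl


@triton.jit
def bn_bwd_sketch(
    grad_out_ptr, grad_in_ptr, save_invstd_ptr, running_var_ptr,
    n_elements, eps,
    IS_TRAIN: tl.constexpr,
    BLOCK_SIZE: tl.constexpr,
):
    pid = tl.program_id(0)
    offs = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offs < n_elements
    grad_out = tl.load(grad_out_ptr + offs, mask=mask)
    if IS_TRAIN:
        # Training: reuse the inverse std saved by the forward pass.
        invstd = tl.load(save_invstd_ptr + offs, mask=mask)
    else:
        # Eval: recompute it from the running variance.
        var = tl.load(running_var_ptr + offs, mask=mask)
        invstd = 1.0 / tl.sqrt(var + eps)
    tl.store(grad_in_ptr + offs, grad_out * invstd, mask=mask)
```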

running_var=None,
save_mean=None,
save_invstd=None,
train=False,

Contributor:
The kernel should be able to handle the train=True case.


def native_dropout(x, p=0.5, train=True):
return NativeDropout.apply(x, p, train)
def dropout(input, p, train):

Contributor:
The train arg should be optional.


Collaborator Author:
done
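
The change might look roughly like this (a sketch only; torch.ops.aten.native_dropout stands in for the PR's Triton-backed NativeDropout, and the defaults mirror torch.nn.functional.dropout):

```python
import torch

def native_dropout(x, p=0.5, train=True):
    # Returns (output, mask), matching aten::native_dropout.
    return torch.ops.aten.native_dropout(x, p, train)

def dropout(input, p=0.5, train=True):
    # aten::dropout returns only the output tensor.
    output, _mask = torch.ops.aten.native_dropout(input, p, train)
    return output
```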

logging.debug("GEMS NATIVE DROPOUT FORWARD")
assert p > 0.0 and p < 1.0, "p must be in (0, 1)"
device = input.device
input = input.contiguous()

Contributor:
Add a note that we'll remove contiguous enforcement in the future.


Collaborator Author:
done
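
The requested note could be as simple as (hypothetical wording):

```python
# TODO: drop this contiguous() enforcement once the kernels support strided inputs.
input = input.contiguous()
```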

Comment on lines +119 to +120
indices = indices.contiguous()
weight = weight.contiguous()

Contributor:
Add refactoring this to the TODOs.


Collaborator Author:
done

mean = mean.contiguous()
rstd = rstd.contiguous()
weight = None if weight is None else weight.contiguous()
group_size = C // group

Contributor:
cdiv?


Collaborator Author:
fixed.
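
For context, a quick illustration of the difference; triton.cdiv rounds up, so trailing channels are still covered when C is not evenly divisible by group:

```python
import triton

C, group = 10, 4
print(C // group)             # 2 -> floor division drops the remainder
print(triton.cdiv(C, group))  # 3 -> ceiling division, i.e. (C + group - 1) // group
```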

BLOCK_GROUP_SIZE=triton.next_power_of_2(C // num_groups),
BLOCK_HW_SIZE=triton.next_power_of_2(HW),
HxW,
BLOCK_GROUP_SIZE=triton.next_power_of_2(C // group),

Contributor:
cdiv(C, group)?


Collaborator Author:
ditto


def native_dropout(x, p=0.5, train=True):
return NativeDropout.apply(x, p, train)
def dropout(input, p, train):

Contributor:
I realized we didn't handle train=False correctly in the previous version. Let's fix that.


Collaborator Author:
done.
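
For reference, the eval-mode semantics being referred to, sketched against aten::native_dropout (when train is False the input passes through unscaled and the mask is all ones); this is a reference composite, not the PR's Triton path:

```python
import torch

def native_dropout_ref(input, p=0.5, train=True):
    if not train:
        # Eval mode: identity output, all-true mask, no rescaling.
        return input.clone(), torch.ones_like(input, dtype=torch.bool)
    # Train mode: keep each element with probability (1 - p), then rescale.
    mask = torch.rand_like(input) >= p
    return input * mask / (1.0 - p), mask
```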
