
[POC] Support of dynamic shapes for the fill_ op #3100


Closed
Wants to merge 27 commits

Conversation

miladm (Collaborator) commented Aug 25, 2021

Op lowering of fill_ to study the requirements for supporting dynamic shapes in PT/XLA.

@miladm miladm added the DO_NOT_MERGE Not for merging. label Aug 25, 2021
@miladm miladm requested a review from JackCaoG August 25, 2021 03:50
@miladm miladm self-assigned this Aug 25, 2021
@miladm miladm force-pushed the fill_base_dynamic_shapes branch from f68e515 to 64adec2 Compare August 30, 2021 16:48
JackCaoG (Collaborator) left a comment

You should probably do a git pull --all in your xla dir and rebase this branch. You should also do a submodule sync and update. (For this PR you might need to revert the TF-side change, but if it is not intended to merge, I am fine with it.) It seems like the local TF version you have is out of date, so every time you submit a PR it will also try to update the TF version.

@@ -1213,6 +1213,7 @@ void XLATensor::eye_out(XLATensor& out, xla::int64 lines, xla::int64 cols) {
void XLATensor::fill_(XLATensor& input, const at::Scalar& value) {
ir::Value constant =
GetIrValueForScalar(value, input.shape(), input.GetDevice());
constant = ir::ops::ExpandAsDynamicShapes(constant, input.GetIrValue());
Collaborator:

I think you shouldn't pass input.shape() to GetIrValueForScalar in the line above. What happens is that GetIrValueForScalar will do a static expand to input.shape(), which is not necessary.

Collaborator (Author):

addressed

@@ -882,6 +882,27 @@ NodePtr NanToNum(const Value& input, const Value& nan, const Value& posinf,
input.shape(), std::move(lower_fn));
}

NodePtr ExpandAsDynamicShapes(const Value& static_input,
Collaborator:

DynamicExpand might be a better name, since we also have DynamicReshape in here. In the future we should just make expand support dynamic shapes, but that would require a frontend API change.

Collaborator (Author):

I agree and it makes sense.

@miladm miladm requested a review from JackCaoG August 31, 2021 00:33
JackCaoG (Collaborator) left a comment

Mostly LGTM, some naming nits

@@ -112,6 +112,23 @@ xla::XlaOp BuildExpand(xla::XlaOp input,
xla::util::Iota<xla::int64>(output_sizes.size()));
}

xla::XlaOp BuildDynamicExpand(xla::XlaOp static_input,
Collaborator:

Nit: BuildDynamicExpandAs, since the target is a value tensor, not a shape tensor.

@@ -37,6 +37,10 @@ xla::XlaOp SqueezeAllTrivialDimensions(xla::XlaOp input);
xla::XlaOp BuildExpand(xla::XlaOp input,
absl::Span<const xla::int64> output_sizes);

// Dynamic Shape version of BuildExpand()
xla::XlaOp BuildDynamicExpand(xla::XlaOp static_input,
Collaborator:

ditto

@@ -882,6 +882,26 @@ NodePtr NanToNum(const Value& input, const Value& nan, const Value& posinf,
input.shape(), std::move(lower_fn));
}

NodePtr DynamicExpand(const Value& static_input, const Value& dynamic_target) {
Collaborator:

Nit: DynamicExpandAs.

JackCaoG (Collaborator) left a comment

some ideas

debug = False
BYPASS_XLA = False #Performance profiling experimentation

class DynamicTensorXLASize(object):
Collaborator:

One thing I am not sure about right now is whether we need DynamicTensorXLASize to be a class, which I think is more expensive than a namedtuple (which is immutable).

Collaborator (Author):

size() doesn't accept a tuple return type. I will try namedtuple.

miladm (Collaborator, Author) commented Sep 22, 2021:

I made this change. It turns out this optimization leads to a small speedup improvement.
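
For illustration, a minimal sketch of the namedtuple version suggested above (the field names here are assumptions, not the PR's actual ones):

    from collections import namedtuple

    # Hypothetical immutable replacement for the DynamicTensorXLASize class;
    # field names are illustrative only.
    DynamicTensorXLASize = namedtuple('DynamicTensorXLASize',
                                      ['static_size', 'dynamic_size'])

    s = DynamicTensorXLASize(static_size=(3, 1), dynamic_size=3)
    print(s.static_size, s.dynamic_size)  # (3, 1) 3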

def __init__(self, data, **kwargs):
    super().__init__()
    if debug: print('[__init__]')
    self.t_ = torch.as_tensor(data, dtype=torch.float32, **kwargs)
Collaborator:

I think this can be removed once we pass the actual dynamic shape as a tensor, right?

Collaborator (Author):

With the current XLATensor inheritance configuration, we need to keep self.t_ since it's the only variable holding the data for this object.

Collaborator:

A PyTorch tensor with XLA device type will use pt/xla's tensor_impl, which carries a C++ XLATensor. The C++ XLATensor holds a Data object which carries one of:

  1. ComputationData --> a handle referring to an allocation on the XRT server side
  2. IR --> a pending computation
  3. at::Tensor --> a CPU tensor that we need to upload to the XRT server at a future time
  4. View

The Python XLATensor should do the same thing; it should not always carry Python data. We need a way to hook into the tensor_impl like we do when we set device=xm.xla_device().
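
Purely as an illustration of the Data alternatives listed above, a hypothetical Python-side mirror (none of these names exist in pt/xla; exactly one field would be populated at a time):

    from dataclasses import dataclass
    from typing import Any, Optional

    @dataclass
    class XLATensorData:
        # Hypothetical mirror of the four alternatives above.
        computation_data: Optional[Any] = None  # handle to an XRT-side allocation
        ir_value: Optional[Any] = None          # pending (lazy) computation
        cpu_tensor: Optional[Any] = None        # CPU tensor to upload to the XRT server later
        view: Optional[Any] = None              # view into another tensor's data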

miladm (Collaborator, Author) commented Sep 20, 2021:

Thanks for the input! I have a couple of follow-ups that I will ask you about offline.

QQ: should we consider broadening the scope of the Python XLATensor (using a tensor_impl-like module) now, or after we have additional clarity on whether this methodology is viable for our goal?

Collaborator:

Let's get it working first before doing any benchmarking. You can work with Ed and the PyTorch team to figure out how to make

torch.tensor([123], device='xla:0')

be a Python XLATensor while still using pt/xla's tensor_impl.

Collaborator:

With the code as it is today, self should already be a valid device='xla' tensor. I wonder if you were trying to work around some other problem with t_.

def size(self):
    if debug: print('[size] ')
    static_size = super(XLATensor, self).size()
    if self.device == xm.xla_device():
Collaborator:

After we figure out how to pass the device to XLATensor, we should check whether a dynamic shape is included to decide which function to call (instead of checking for the device).

Collaborator (Author):

That's a fair point. Can we check for the dynamic shape condition without calling xla::GetDimensionSize() first?
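
A rough sketch of that dispatch (_has_dynamic_dims and _dynamic_size are hypothetical placeholders, not existing pt/xla APIs; answering the query without an eager xla::GetDimensionSize() call is exactly the open question here):

    import torch

    class XLATensor(torch.Tensor):
        def size(self, *args):
            # Dispatch on a dynamic-shape query instead of comparing devices.
            # _has_dynamic_dims and _dynamic_size() stand in for whatever
            # pt/xla would expose to answer this cheaply.
            if getattr(self, '_has_dynamic_dims', False):
                return self._dynamic_size()
            return super().size(*args)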


def __repr__(self):
    if debug: print ('[__repr__]')
    return "XLATensor:\n{}\n".format(self.t_)
Collaborator:

XLATensor should always be a tensor with XLA device type. This means it will be a lazy tensor and should not carry CPU storage. I think for a regular PyTorch tensor with XLA device type, we call a to_cpu() to get the scalar value of the tensor.
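
For illustration, a __repr__ along those lines that materializes the value on the host instead of reading a stored self.t_ (a sketch, not the PR's code):

    import torch

    class XLATensor(torch.Tensor):
        def __repr__(self):
            # Transfer to the host and print as a plain tensor; a lazy tensor
            # would not keep a separate CPU copy of its data in self.t_.
            host = self.cpu().as_subclass(torch.Tensor)
            return "XLATensor:\n{}\n".format(host)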

return func(*args, **kwargs)

if debug: print ('[main] make b')
t = XLATensor([[1], [2], [3]], device=xm.xla_device())
Collaborator:

I think Ed also mentioned this in your issue: ideally we should just do torch.tensor and auto-cast all tensors to XLATensor in the backend when the device is an XLA device.

miladm (Collaborator, Author) commented Sep 20, 2021:

Yes. Though, AFAIU, such an auto-cast requires additional support from PyTorch that they kept as an open topic to investigate on their side. I'd like to seek some clarity from them next time we meet. Do you have a different understanding @JackCaoG?

…de (e.g. debug) to improve speedup, this commit assumes all input tensors have dynamic shape, added performance profiling code
return "XLATensor:\n{}\n".format(self.t_)

@classmethod
def __torch_function__(cls, func, types, args=(), kwargs=None):
Collaborator:

This isn't doing anything nontrivial, so we can get rid of it. (If we do end up needing to override a torch namespace function, that will be a global tax everywhere; see pytorch/pytorch#62888, cc @anjali411. NB: that issue is about __torch_dispatch__, but the need here is for __torch_function__.)

anjali411 commented Oct 5, 2021:

It seems like XLATensor might not need to be a tensor subclass at all. I am not sure if the size() overload is expected to be in C++. If it is, then it still needs to be a tensor subclass, and we should use _make_wrapper_subclass if the size needs to be used in C++ as defined above. (cc @albanD)

Also, @ezyang, the overhead here won't be as bad because we don't call from C++ to Python for every single function, right?

Collaborator:

There are a few prototypes floating around, but in the version where torch.empty(device='xla') transparently returns an XLATensor, it needs to be a tensor subclass.


class XLATensor(torch.Tensor):
    def __new__(cls, data, **kwargs):
        return torch.Tensor._make_subclass(cls, torch.as_tensor(data, dtype=torch.float32, **kwargs))


It seems like you probably don't need the t_ below since you are storing it here anyway? (given you are using _make_subclass and not _make_wrapper_subclass)
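
For contrast with the _make_subclass version above, a sketch of the _make_wrapper_subclass pattern mentioned earlier (names and structure are illustrative; a real implementation would also override __torch_dispatch__ or __torch_function__ to route ops to the wrapped tensor):

    import torch

    class WrapperXLATensor(torch.Tensor):
        @staticmethod
        def __new__(cls, elem):
            # The wrapper advertises size/dtype/device to C++ without owning
            # storage; the real data lives in .elem.
            r = torch.Tensor._make_wrapper_subclass(
                cls, elem.size(), dtype=elem.dtype,
                device=elem.device, requires_grad=elem.requires_grad)
            r.elem = elem
            return r

    t = WrapperXLATensor(torch.zeros(3, 1))
    print(t.size())  # torch.Size([3, 1]), answered from the wrapper's metadata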

@@ -882,26 +883,50 @@ NodePtr NanToNum(const Value& input, const Value& nan, const Value& posinf,
input.shape(), std::move(lower_fn));
}

NodePtr DynamicExpand(const Value& static_input, const Value& dynamic_target) {
auto lower_fn = [](const Node& node, LoweringContext* loctx) -> XlaOpVector {
NodePtr DynamicExpand(const Value& input, const std::vector<xla::int64> static_size, const xla::int64 dynamic_size, const Value& dynamic_target) {
Contributor:

Probably

, const std::vector<xla::int64> static_size

was meant to be passed by reference? It also seems to be passed by value into shape_fn.

@miladm miladm added this to the Dynamic Shape milestone Feb 10, 2022
@miladm miladm changed the title Support of dynamic shapes for the fill_ op [POC] Support of dynamic shapes for the fill_ op Feb 25, 2022
@miladm miladm marked this pull request as draft May 15, 2022 22:14
miladm (Collaborator, Author) commented May 21, 2022

An up-to-date version of this PR is at #3558.
Closing this PR.

@miladm miladm closed this May 21, 2022
@miladm miladm added the dynamism Dynamic Shape Features label May 21, 2022
Labels
DO_NOT_MERGE (Not for merging.), dynamism (Dynamic Shape Features)
5 participants