
Conversation

pruthvistony
Contributor

  • Add Hipify as a git submodule
  • Trigger hipify from cmake build
  • TP_USE_ROCM controls the trigger, which will be set to ON
    when building on ROCm

@pruthvistony
Contributor Author

@lw, @jeffdaily, @jithunnair-amd,
Please review the changes.

@jithunnair-amd

@pruthvistony I think we discussed this before, but just to make sure: could the build_amd.py be part of hipify-torch so that it doesn't have to be added to the hipifying project's sources? Is there anything that's project-specific that cannot be passed in as arguments to build_amd.py?

@pruthvistony
Contributor Author

@pruthvistony I think we discussed this before, but just to make sure: could the build_amd.py be part of hipify-torch so that it doesn't have to be added to the hipifying project's sources? Is there anything that's project-specific that cannot be passed in as arguments to build_amd.py?

As discussed, build_amd.py can be moved to the hipify-torch repo and triggered directly from cmake. It is just that passing all the parameters, such as regex lists, can be tricky within cmake. Currently I don't see any specific problem with passing arguments through build_amd.py, and I have kept the current change the same as it is used in other projects. We can update this as an improvement later, since it is not a blocker for the other ROCm build related changes.
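
For illustration, a minimal sketch of how regex-like include/ignore lists could be forwarded from cmake as repeated command-line flags; the flag names below are assumptions, not the current build_amd.py interface:

# Hypothetical sketch: repeated --include / --ignore flags collect pattern
# lists, so cmake can forward filters one flag at a time instead of passing
# a single quoted list.
import argparse

parser = argparse.ArgumentParser(description='Forward hipify filters from cmake')
parser.add_argument('--include', action='append', default=[],
                    help='Pattern of files to hipify (may be repeated)')
parser.add_argument('--ignore', action='append', default=[],
                    help='Pattern of files to skip (may be repeated)')
args = parser.parse_args()

includes = args.include or ['*']  # hipify everything unless told otherwise
ignores = args.ignore
print('includes:', includes, 'ignores:', ignores)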

Contributor

@lw left a comment

I could be OK with merging this, but (correct me if I'm wrong) I don't think this is yet usable, is it? If anyone tried to run with TP_USE_ROCM the build would most likely fail. Should we fix all that and ensure everything works before we do that?

Also, while I guess we cannot run any ROCm tests on CircleCI (because it's lacking the necessary hardware), I think we should be able to at least add a ROCm build job. Do you think you could do that? Thanks!

@@ -0,0 +1,49 @@
#!/usr/bin/env python
Contributor

We have been putting some assorted tools in the tensorpipe/misc folder. It would be good to use a single directory for all these.

Contributor Author

I am in the process of moving this file into the hipify-torch repo, which is added as a git submodule:
https://github.com/ROCmSoftwarePlatform/hipify-torch.git
Once the PRs into hipify-torch are merged, I will update this PR.

Comment on lines 13 to 17
sys.path.append(os.path.realpath(os.path.join(
os.path.dirname(__file__),
os.path.pardir,
os.path.pardir,
'third_party')))
Contributor

This could work, though it means that this file "assumes" a certain structure in the repo, which means it's not really standalone and thus possibly brittle. I now wonder whether it could be more robust to have the path of the hipify module passed in via a command-line argument?
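
As a hedged sketch of that alternative (the --hipify-dir flag name is hypothetical, not part of the current script):

# Accept the hipify location on the command line instead of hard-coding a
# path relative to this file's position in the repo.
import argparse
import sys

parser = argparse.ArgumentParser()
parser.add_argument('--hipify-dir', required=True,
                    help='Directory containing the hipify Python package')
args, _ = parser.parse_known_args()

# Make the caller-supplied directory importable before loading hipify.
sys.path.insert(0, args.hipify_dir)
from hipify import hipify_python  # noqa: E402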

Contributor Author

Since this file will be moved to the hipify-torch repo, the above code will not be required.

output_directory=args.output_directory,
includes=includes,
ignores=ignores,
is_pytorch_extension=True)
Contributor

What does the is_pytorch_extension flag do? TensorPipe is not a PyTorch extension.

Contributor Author

Yes, TensorPipe is not an extension, but this parameter is used internally in hipify.
I will let @jithunnair-amd comment with more info here.

That nomenclature made sense until now :) Essentially, we needed a way to preserve the "legacy" way of hipifying for PyTorch source code, but use the "new and improved" way for other code which used hipify, which were basically PyTorch extensions so far. We'll assess the best way to generalize this, but this flag is needed for now until that happens.

Contributor

Ok, got it, just wanted to check. It would be nice if we could add a comment to explain this, so I won't wonder about it again the next time I read this code. (Though if this code is moved to the hipify repo it won't matter anyways)
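
For example, the requested comment could look roughly like this; a sketch only, with placeholder paths, and wording inferred from the reply above rather than from hipify's documentation:

from hipify import hipify_python

hipify_python.hipify(
    project_directory='.',   # placeholder; build_amd.py fills in real paths
    output_directory='.',
    includes=['*'],
    ignores=[],
    # TensorPipe is not a PyTorch extension, but hipify currently keys its
    # newer (non-legacy) hipification behaviour off this flag, so it stays
    # True here until hipify generalizes the option.
    is_pytorch_extension=True)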


from hipify import hipify_python

parser = argparse.ArgumentParser(description='Top-level script for HIPifying, filling in most common parameters')
Contributor

It is generally more idiomatic to put all the logic of a runnable Python entry-point script in a def main(): function and then invoke it with if __name__ == "__main__": main().
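
A minimal sketch of that structure, reusing the argparse entry point from the diff above (the two arguments shown are illustrative, not the full option set):

#!/usr/bin/env python
import argparse

from hipify import hipify_python


def main():
    parser = argparse.ArgumentParser(
        description='Top-level script for HIPifying, filling in most common parameters')
    parser.add_argument('--project-directory', required=True)
    parser.add_argument('--output-directory', required=True)
    args = parser.parse_args()

    hipify_python.hipify(
        project_directory=args.project_directory,
        output_directory=args.output_directory,
        is_pytorch_extension=True)


if __name__ == "__main__":
    main()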

Contributor Author

Will check on it and update the code.

Contributor Author

Updated the code as suggested and committed it to the hipify repo.

Comment on lines +37 to +39
if(TP_USE_CUDA AND TP_USE_ROCM)
message(FATAL_ERROR "Tensorpipe can be built either for CUDA or ROCm, TP_USE_CUDA and TP_USE_ROCM both are set, erroring out!!!!")
endif()
Contributor

I am not suggesting to do this now, but how difficult would that limitation be to lift?

Contributor

For the first version of ROCm support we intend to still use the CudaBuffer name (and the cuda_ipc, ... ones) for the HIP version of TensorPipe as well, out of simplicity, hence using the CUDA and HIP versions together would cause name conflicts. Eventually I think we could/should fix this.

Contributor Author

We are thinking the TensorPipe build for CUDA or ROCm will be mutually exclusive, since the PyTorch lib is also mutually exclusive.
Even for channels there will only be HIP-based channels: hip_basic, hip_gdr, hip_ipc, ... .
So the builds will have the corresponding flags turned ON, and I didn't get what the concern is here. Please let me know more details.

Contributor

I think the question here was motivated by the fact that in principle there shouldn't be any hard blocker to having both CUDA and ROCm (once we fix the name conflicts), right? Hence this is mainly a comment about "code style". In practice though, yes, when used from PyTorch they will be mutually exclusive, hence no worries about this.

@pruthvistony
Contributor Author

I could be OK with merging this, but (correct me if I'm wrong) I don't think this is yet usable, is it? If anyone tried to run with TP_USE_ROCM the build would most likely fail. Should we fix all that and ensure everything works before we do that?

Also, while I guess we cannot run any ROCm tests on CircleCI (because it's lacking the necessary hardware), I think we should be able to at least add a ROCm build job. Do you think you could do that? Thanks!

Currently, PyTorch ROCm tests are executed on Jenkins.
The changes are not yet usable; I am raising the changes in multiple PRs, as was suggested previously. The next PR, which builds TensorPipe with 'TP_USE_ROCM' enabled, is in draft mode, but it is not complete due to an unsupported API, for which we are working with the HIP team.

Regarding the ROCm build, I believe a job can be set up. Regarding the tests, I will check on the necessary hardware and get back.

@lw
Contributor

lw commented Jul 13, 2021

Currently, PyTorch ROCm tests are executed on Jenkins.

Yes, I saw that. However, that's because they are both built and run on AMD GPUs, right? If we only built, without running, we could do so on machines without any GPUs. (PyTorch already builds the CUDA version on CPU-only machines.) It wouldn't be perfect, but it would at least catch build issues, for example ones caused by hipification.

@pruthvistony
Contributor Author

Moved all the hipify related files into the hipify-torch repo.

[submodule "third_party/libnop"]
path = third_party/libnop
url = https://github.com/google/libnop.git
[submodule "third_party/hipify"]
Contributor

OOC, this dependency does not seem to exist in PyTorch's .gitmodules. Any reason for this difference?

Yes, that's because PyTorch currently has hipify logic as part of its source code at https://github.com/pytorch/pytorch/blob/master/torch/utils/hipify
We can't use that for Tensorpipe as it'll create a circular dependency.

Contributor Author

Yes, in PyTorch hipify is not used as a git submodule. For new projects to hipify, we are using the hipify-torch repo as a git submodule, so that all the hipification code is in a single place, providing many interfaces.

Contributor

Do we plan to change PyTorch's hipify to use the same strategy?

Yes, we would like to in the long run, but PyTorch being a much bigger codebase, and having a more complex hipification strategy, will require much more coordination and effort.
