-
Notifications
You must be signed in to change notification settings - Fork 5
Documentation Update for NCCL #132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just left a few comments on what you have done so far, this needs much more to be ready but hopefully with the checklist that I put in the description of the PR you will have a easier life to navigate the documentation and add bits related to NCCL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good work @tharittk, I left some additional mostly stylistic comments... once you have address them, I think for now this is already a very good improvement to the documentation to include NCCL related stuff and we can continue revising it as we progress with the code development 🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tharittk this looks great to me!
@rohanbabbar04 you want to have a look or shall I go ahead and merge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ongoing update - in parallel to NCCL implementation PR (#130)
Tasks
README
mentioning the possibility to use NCCL instead of MPI for distributed cupy arrays, updating the install, example and tests sections with NCCL-related commandsindex.rst
similar to README to reflect new NCCL enginegpu.rst
documenting the new env variable (NCCL_PYLOPS_MPI
), adding NCCL to the example, and perhaps consider adding a table like in https://pylops.readthedocs.io/en/stable/gpu.html to document what features are supported in NCCL and what are not, eg the missing support for complex numbers (this can also serve as a live roadmap for you work, as we progress we should see more and more features being supported by both MPI and NCCL)