fix ps+keras without unique #488

Open
wants to merge 3 commits into
base: master
@jq (Collaborator) commented Mar 4, 2025

Description

TFRA relies on DistributedVariable to get the correct device placement, i.e. the ._get_on_device_or_primary method. However, under PSV2 this method only returns device 0, which breaks TFRA with parameter servers (PS).
This PR fixes that by using the resource variable directly, without the DistributedVariable. The unique backward computation still has a problem; let's fix it in another PR.
Co-authored-by: @MoFHeka
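
To make the failure mode above easier to picture, here is a toy sketch in plain Python. It is not the code in this PR and it does not use TensorFlow or TFRA: `ToyResourceVar`, `ToyDistributedWrapper`, `placement_via_wrapper`, `placement_direct`, and the device strings are all hypothetical names made up for illustration. It only models the reported behaviour, assuming the wrapper lookup always resolves to the first component.

```python
# Toy model only: none of these classes are TensorFlow or TFRA APIs.


class ToyResourceVar:
    """Stands in for a per-device resource variable (hypothetical)."""

    def __init__(self, device):
        self.device = device


class ToyDistributedWrapper:
    """Stands in for a wrapper holding one component per device (hypothetical)."""

    def __init__(self, components):
        self.components = components

    def get_on_device_or_primary(self):
        # Models the behaviour reported in this PR: the lookup always
        # resolves to the first (primary) component, i.e. device 0.
        return self.components[0]


def placement_via_wrapper(wrapper, wanted_device):
    # wanted_device is deliberately ignored; that is the bug being modelled.
    return wrapper.get_on_device_or_primary().device


def placement_direct(resource_var):
    # The approach described above: read the device from the resource
    # variable itself instead of going through the wrapper.
    return resource_var.device


# Illustrative device names, assumed for this sketch.
ps_devices = ["/job:ps/task:0", "/job:ps/task:1"]
shards = [ToyResourceVar(d) for d in ps_devices]
wrapper = ToyDistributedWrapper(shards)

via_wrapper = [placement_via_wrapper(wrapper, d) for d in ps_devices]
direct = [placement_direct(s) for s in shards]
```

In this toy model, going through the wrapper reports the first device for every shard, while reading each resource variable directly preserves the per-shard placement.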

Fixes #182, #401, #365

Type of change

  • Bug fix
  • New Tutorial
  • Updated or additional documentation
  • Additional Testing
  • New Feature

Checklist:

  • I've properly formatted my code according to the guidelines
    • By running yapf
    • By running clang-format
  • This PR addresses an already submitted issue for TensorFlow Recommenders-Addons
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

How Has This Been Tested?

If you're adding a bugfix or new feature please describe the tests that you ran to verify your changes:

@jq requested a review from rhdong as a code owner March 4, 2025 22:43
@jq force-pushed the ps branch 29 times, most recently from a6bd824 to 9109fa8 March 11, 2025 14:56
@jq force-pushed the ps branch 28 times, most recently from 7dd7d7b to 99f2cfb April 11, 2025 00:56