Update activation sharding to include the 'dp' axis. Validated in an internal notebook: with (dp, fsdp, tp) set to (2, 1, 4), weights are replicated along the dp axis. #849
+20
−15
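A minimal sketch of the kind of layout the title describes, not the PR's actual code. Assumptions: the mesh axes are named `dp`, `fsdp`, and `tp` with shape (2, 1, 4); activations shard their batch dimension over `('dp', 'fsdp')`; weights use only `fsdp` and `tp`, which leaves them replicated along `dp`. The array shapes, the partition specs, and the simulated 8-device CPU host are all illustrative.

```python
import os

# Assumption: simulate 8 host devices so a (2, 1, 4) mesh can be built anywhere.
# The flag must be set before JAX initializes its backends.
os.environ["XLA_FLAGS"] = (
    os.environ.get("XLA_FLAGS", "") + " --xla_force_host_platform_device_count=8"
)

import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

devices = jax.devices()
if len(devices) < 8:
    # Fall back to the simulated CPU devices if the default backend is too small.
    devices = jax.devices("cpu")

# Mesh of shape (dp, fsdp, tp) = (2, 1, 4).
mesh = Mesh(np.array(devices[:8]).reshape(2, 1, 4), ("dp", "fsdp", "tp"))

# Hypothetical specs: the activation batch dimension is split over dp and fsdp,
# the weight spec never names dp, so each weight shard is copied across dp.
act_sharding = NamedSharding(mesh, P(("dp", "fsdp"), None, "tp"))
weight_sharding = NamedSharding(mesh, P("fsdp", "tp"))

activations = jax.device_put(jnp.zeros((8, 16, 32)), act_sharding)
weights = jax.device_put(jnp.zeros((32, 64)), weight_sharding)

# Count distinct weight slices: fewer distinct slices than devices means replication.
unique_weight_shards = {
    tuple((s.start, s.stop) for s in shard.index)
    for shard in weights.addressable_shards
}
act_shard_shape = activations.addressable_shards[0].data.shape
weight_shard_shape = weights.addressable_shards[0].data.shape

print("activation shard shape:", act_shard_shape)
print("weight shard shape:", weight_shard_shape)
print("distinct weight shards:", len(unique_weight_shards), "on", len(devices[:8]), "devices")
```

With eight devices and only four distinct weight slices, every slice lives on two devices, one per `dp` index, which is the replication the title reports.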
Google CLA / cla/google
succeeded
Dec 5, 2025 in 3s
✅ All contributors are covered under a CLA with Google
See https://cla.developers.google.com/ for more info about Google's Contributor License Agreement (CLA).
Details
The following contributors were found for this pull request:
✅ 935c88d PR Opener: @copybara-service[bot]
✅ 935c88d Author: @wang2yn84 <la*****ng@google.com>
(Only the first commit for a unique contributor is listed.)