Skip to content

Conversation

@daranday
Copy link
Contributor

@daranday daranday commented Apr 28, 2023

Stack from ghstack (oldest at bottom):

As Liam investigated, there is a copy collision with the GPT-NeoX demo currently. This change mitigates that. We might still need to convert to a shared_fs approach in the future to take advantage of larger scale data.

…pired by Liam

As Liam investigated, there is a copy collision with the GPT-NeoX demo currently. This change mitigates that. We might still need to convert to a shared_fs approach in the future to take advantage of larger scale data.

[ghstack-poisoned]
@liamcli
Copy link
Contributor

liamcli commented May 2, 2023

LGTM for now since merging to your own fork.

We'll also need to update the docker image for all experiment configs once my PR to update the gpt-neox image is merged to the environments repo.
determined-ai/environments#205

In particular, we might break still when not using shared fs for multinode dtrain if the logic to build index is not updated such that we build index on local rank 0 when not using shared fs:
https://github.com/determined-ai/gpt-neox/blob/determined/megatron/data/gpt2_dataset.py#L139

@daranday
Copy link
Contributor Author

daranday commented May 9, 2023

Hey @liamcli, thanks for the review!

About code. It looks like it's using local index now on the gp2_dataset.py, so we should be good now?

About PR format. This is actually targeted for the main branch. The weird base branch (gh/daranday/3/head) is just the Sapling's way of creating code reviews. When landing using sl land ghstack <PR url>, it knows to rebase into main.

@sandrews-hpe sandrews-hpe deleted the branch gh/daranday/3/base March 7, 2024 19:11
@sandrews-hpe sandrews-hpe deleted the gh/daranday/3/head branch March 7, 2024 19:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants