@daviswer
Collaborator

@daviswer daviswer commented Oct 10, 2024

Updates get_latest and get_oldest to use the same sorting function, and lets the dataloader checkpoint handler pass in its own custom sort manually. Removes a bug where excessive path joins led to repeated path prefixes in dataloader checkpoint loading. Fixes the GPTBigCode signatures used for speculator training so that they match the superclass signatures (the mismatch is currently preventing other PRs from landing).

Includes and subsumes #110 and #96. Full credit to @weiji14 and @Akash-Nayak, respectively.
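
For readers unfamiliar with the two checkpoint fixes described above, here is a minimal sketch of the idea, not the code from this PR. Only the names `get_latest` and `get_oldest` come from the description; the `qualifier` and `key` parameters, the `_candidates` helper, the `step_<N>_ckp` folder naming, and the `step_number` sort are assumptions made up for illustration. Both functions share one candidate list and one `key`, so a caller such as a dataloader checkpoint handler can pass its own ordering, and both return paths that are already joined with the target directory.

```python
import os
import tempfile


def _candidates(targdir, qualifier):
    """Hypothetical helper: full paths of entries in targdir that pass qualifier."""
    if not os.path.isdir(targdir):
        return []
    paths = [os.path.join(targdir, name) for name in os.listdir(targdir)]
    return [p for p in paths if qualifier(p)]


def get_latest(targdir, qualifier=lambda p: True, key=os.path.getctime):
    """Return the highest-ranked entry under `key`, or None if nothing qualifies."""
    found = _candidates(targdir, qualifier)
    return max(found, key=key) if found else None


def get_oldest(targdir, qualifier=lambda p: True, key=os.path.getctime):
    """Return the lowest-ranked entry under the same `key`, or None."""
    found = _candidates(targdir, qualifier)
    return min(found, key=key) if found else None


def step_number(path):
    """Assumed custom sort: parse N out of a folder named 'step_<N>_ckp'."""
    return int(os.path.basename(path).split("_")[1])


def demo():
    """Create three fake checkpoint folders and rank them by step number."""
    with tempfile.TemporaryDirectory() as root:
        for step in (5, 100, 20):
            os.mkdir(os.path.join(root, f"step_{step}_ckp"))
        latest = get_latest(root, key=step_number)
        oldest = get_oldest(root, key=step_number)
        return os.path.basename(latest), os.path.basename(oldest)


def doubled_prefix(targdir, name):
    """Show the bug class: joining an already-joined relative path repeats the prefix."""
    already_joined = os.path.join(targdir, name)
    return os.path.join(targdir, already_joined)


if __name__ == "__main__":
    print(demo())
    print(doubled_prefix("ckpts", "step_100_ckp"))
```

Because the helpers return full paths, a caller that joins the result with the directory again gets a repeated prefix whenever the directory is a relative path, which is the kind of failure the description refers to.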

daviswer and others added 4 commits October 10, 2024 14:22
Signed-off-by: Johannes Schmude <[email protected]>
Signed-off-by: Davis Wertheimer <[email protected]>
@daviswer daviswer requested a review from sahilsuneja1 October 10, 2024 18:45
@daviswer
Collaborator Author

@sahilsuneja1 can you verify that this update to the EmbedGPTBigCode forward function arguments won't affect speculator training?

@sahilsuneja1
Collaborator

@daviswer Confirming speculator training for EmbedGPTBigCode works fine with this change

@daviswer daviswer merged commit 408c751 into foundation-model-stack:main Oct 11, 2024
@daviswer daviswer deleted the loader_ckp_fixes branch October 11, 2024 20:11