
feat: Add support for phi4 #764

Open
wants to merge 11 commits into base: main

Conversation


@jlonge4 jlonge4 commented Jan 18, 2025

This PR adds support for Microsoft's Phi-4 model by adapting the existing LLaMA implementation.

The Phi-4 architecture follows the LLaMA architecture closely; the main difference is how the weights are stored (fused qkv_proj and gate_up tensors versus separate projections).
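For readers following along, a minimal sketch of the weight-splitting this adaptation implies (the tensor names follow the convention above; the dimensions are illustrative, not the real Phi-4 config, and the sketch assumes q, k, and v all have the same output width, which grouped-query attention would change):

```python
import numpy as np

hidden_size = 8          # illustrative only; Phi-4's real hidden size is much larger
intermediate_size = 16   # illustrative MLP width

# Phi-4-style checkpoint stores the attention weights fused:
# q, k and v stacked along the output dimension.
qkv_proj = np.zeros((3 * hidden_size, hidden_size), dtype=np.float32)
q_proj, k_proj, v_proj = np.split(qkv_proj, 3, axis=0)

# The MLP gate and up projections are likewise fused into one tensor.
gate_up_proj = np.zeros((2 * intermediate_size, hidden_size), dtype=np.float32)
gate_proj, up_proj = np.split(gate_up_proj, 2, axis=0)
```

Each split tensor can then be assigned to the corresponding separate projection in the LLaMA-style module.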

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@dacorvo
Collaborator

dacorvo commented Feb 4, 2025

@jlonge4 thank you very much for this pull request: adding support for phi4 would be awesome.

We are, however, heavily refactoring the export mechanism to remove the dependency on transformers-neuronx and simplify the contribution of new models.

Can you take a look at that pull request and see if it would make it easier for you to add support for phi4 based on the new HLO backend?

@jlonge4
Author

jlonge4 commented Feb 5, 2025

Hi there @dacorvo, I just took a look at the difference and it certainly seems a lot slimmer! I think my effort would be about the same for the most important part of this, the load_weights function, but it would obviously get rid of a lot of boilerplate. I am happy to rework this PR and merge it into the add_hlo branch if you prefer.
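As an illustration of the kind of translation a reworked load_weights would center on, here is a purely hypothetical helper (the function name, key suffixes, and equal-width q/k/v assumption are all illustrative, not the actual optimum-neuron API) that maps fused Phi-4 checkpoint keys onto separate LLaMA-style keys:

```python
import numpy as np

def split_fused_state_dict(state_dict):
    """Translate fused Phi-4-style checkpoint keys into separate
    LLaMA-style keys. Hypothetical sketch, not a real library API."""
    out = {}
    for name, tensor in state_dict.items():
        if name.endswith("qkv_proj.weight"):
            # Assumes q, k and v are equal-width thirds of the fused tensor.
            prefix = name[: -len("qkv_proj.weight")]
            q, k, v = np.split(tensor, 3, axis=0)
            out[prefix + "q_proj.weight"] = q
            out[prefix + "k_proj.weight"] = k
            out[prefix + "v_proj.weight"] = v
        elif name.endswith("gate_up_proj.weight"):
            prefix = name[: -len("gate_up_proj.weight")]
            gate, up = np.split(tensor, 2, axis=0)
            out[prefix + "gate_proj.weight"] = gate
            out[prefix + "up_proj.weight"] = up
        else:
            # Non-fused tensors pass through unchanged.
            out[name] = tensor
    return out
```

With a translation like this, the rest of the LLaMA-style loading code can stay untouched, which is where most of the boilerplate savings would come from.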
