Skip to content

Review the architecture split #237

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
4 tasks
jlamypoirier opened this issue Apr 17, 2025 · 0 comments · Fixed by #252
Closed
4 tasks

Review the architecture split #237

jlamypoirier opened this issue Apr 17, 2025 · 0 comments · Fixed by #252
Labels
enhancement New feature or request

Comments

@jlamypoirier
Copy link
Collaborator

🎯 Goal (What & Why)

Base model configs are currently split into architecture/non-architecture. This is a legacy behaviour that was introduced as a stopgap solution for:

  1. Loading pretrained configs while allowing changes in parameters that support it.
  2. Marking a subset of the parameters as essential to a model, which absolutely need to be kept fixed when loading a pretrained model (i.e. changing it would cause a crash or otherwise break the model).

(1) Is made obsolete by #211, and will become irrelevant after #231. (2) On the other hand needs a replacement. Ignoring the problem is not recommended because it would make the loading of pretrained checkpoints unsafe, causing cryptic crashes and unexpected breakdown. My suggestion (which I've been planning for a while) would be to replace the hard class split by a simple field descriptor, i.e. something like field hints. This would address the issue with relatively low added complexity (and could be made opt-in, ex. for experimental models).

🚀 Execution Plan

(This section may start as an incomplete draft but must be defined before implementation begins.)

Step 1: What is the smallest working version?

(Describe the simplest way to implement this feature with minimal effort.)

Step 2: What additional optimizations are possible (but optional)?

(List potential refinements that can be added in later PRs if needed.)

📌 Acceptance Criteria (Must-Haves for Completion)

  • The feature must be functional and tested.
  • The implementation must be documented in practical terms.
  • The PR must include a performance/impact summary.
  • No refactors unless directly necessary for feature completion.

🛠️ Project Management

  • Assign the project to the Fast-LLM project.
  • Set the Estimate field (in days) in the GitHub project.
  • Use the Size field to categorize the PR size (Small/Medium/Large).
  • Assign an owner when opening the issue.
@jlamypoirier jlamypoirier added the enhancement New feature or request label Apr 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant