[WIP] [GenAI] Lora Finetune #7288
Labels: AutoML.NET, enhancement
LoRA fine-tuning is an adapter-based technique for fine-tuning an LLM. It changes the model architecture by adding learnable LoRA layers to the transformer blocks. During fine-tuning, only the LoRA weights are trainable and the LLM weights are frozen, so it requires much less GPU memory than full fine-tuning. Based on this table, fine-tuning a 7B model in 16-bit precision requires 16GB of memory, which fits on an RTX 3090, 4080, or 4090. A wider range of GPUs can handle 3.8B LLMs like phi-3.5-mini.
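To make the frozen-weights-plus-adapter idea concrete, here is a minimal, language-neutral sketch in plain Python. It is not the proposed Microsoft.ML.GenAI.Lora API: the names `LoraLinear` and `lora_param_counts`, and the `rank` and `alpha` parameters, are hypothetical and chosen only for illustration. It assumes the usual LoRA formulation, where a frozen weight `W` is augmented with a low-rank update `B @ A` scaled by `alpha / rank`, and `B` starts at zero so the adapter is initially a no-op.

```python
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoraLinear:
    """Hypothetical linear layer: frozen weight W plus a trainable low-rank update B @ A."""

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        out_features, in_features = len(weight), len(weight[0])
        self.weight = weight          # frozen: never updated during fine-tuning
        self.scale = alpha / rank     # assumed scaling convention
        # A starts small and random, B starts at zero, so the adapter
        # contributes nothing until training updates B.
        self.lora_a = [[rng.gauss(0.0, 0.01) for _ in range(in_features)]
                       for _ in range(rank)]
        self.lora_b = [[0.0] * rank for _ in range(out_features)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.lora_b, matvec(self.lora_a, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        rank = len(self.lora_a)
        return rank * len(self.lora_a[0]) + len(self.lora_b) * rank

    def frozen_params(self):
        return len(self.weight) * len(self.weight[0])


def lora_param_counts(in_features, out_features, rank):
    """Return (frozen, trainable) parameter counts for one adapted linear layer."""
    frozen = in_features * out_features
    trainable = rank * in_features + out_features * rank
    return frozen, trainable


if __name__ == "__main__":
    frozen, trainable = lora_param_counts(4096, 4096, rank=8)
    print(f"frozen={frozen} trainable={trainable} ratio={trainable / frozen:.4%}")
```

For a single 4096 x 4096 projection with rank 8, the adapter adds 65,536 trainable parameters against 16,777,216 frozen ones (about 0.39%). Gradients and optimizer state are needed only for the adapter, which is where the memory saving over full fine-tuning comes from.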
API design (wip)
Package:
Microsoft.ML.GenAI.Lora