
PERP: Rethinking the Prune-Retrain Paradigm in the Era of LLMs

Paper: arxiv.org/abs/2312.15230

Authors: Max Zimmer, Megi Andoni, Christoph Spiegel, Sebastian Pokutta

This repository contains the official implementation of PERP, a framework for pruning and retraining/reconstructing Large Language Models (LLMs). The code is built on PyTorch and uses Weights & Biases for experiment tracking.

Key Features

  • Support for multiple LLM architectures (OPT, LLaMA-2, Mistral, Mixtral)
  • Various pruning methods (magnitude, random, WANDA, SparseGPT)
  • Sparsity-preserving reconstruction/retraining approaches such as MaskLoRA and ScaleLoRA
  • Semi-structured and unstructured sparsity patterns (a sketch of a semi-structured mask follows this list)
  • Parameter-Efficient Fine-Tuning (PEFT) integration
  • Comprehensive experiment tracking with W&B
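
To make the semi-structured pattern concrete, here is a minimal sketch of a 2:4 magnitude mask (keep the two largest-magnitude weights in each group of four along the input dimension). The helper function is an illustrative assumption, not this repository's implementation:

```python
import torch

def two_four_magnitude_mask(weight: torch.Tensor) -> torch.Tensor:
    """Illustrative 2:4 semi-structured mask (hypothetical helper)."""
    out_f, in_f = weight.shape
    assert in_f % 4 == 0, "2:4 sparsity groups weights in blocks of four"
    groups = weight.abs().reshape(out_f, in_f // 4, 4)
    # Keep the two largest-magnitude entries in every group of four.
    keep = groups.topk(2, dim=-1).indices
    mask = torch.zeros_like(groups, dtype=torch.bool).scatter_(-1, keep, True)
    return mask.reshape(out_f, in_f)
```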

Running Experiments

The main entry point is main.py. When passed the --debug flag, the code runs in debug mode, executing the default configuration specified in the defaults dictionary. Otherwise, the code expects to be started by a WandB sweep agent. Hence, to run an experiment, either configure the parameters in main.py or launch a Weights & Biases sweep, as in the sketch below.
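
A minimal sketch of how such a sweep could be defined programmatically; the project name and the value grids are assumptions for illustration, and only the parameter names are documented below:

```python
import wandb

# Hypothetical sweep over the hyperparameters described below;
# the project name and value grids are illustrative assumptions.
sweep_config = {
    "program": "main.py",
    "method": "grid",
    "parameters": {
        "training_mode": {"values": ["retrain", "reconstruct"]},
        "peft_strategy": {"values": ["SelectivePEFT"]},
        "lora_type": {"values": ["mask_lora", "scale_lora"]},
    },
}

sweep_id = wandb.sweep(sweep_config, project="PERP")
# A sweep agent then starts main.py with the sampled parameters, e.g.:
#   wandb agent <entity>/PERP/<sweep_id>
```

The essential hyperparameters such a sweep would set are described next.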

Essential hyperparameters

  • training_mode: Whether to retrain or reconstruct the model; must be one of retrain, reconstruct, or None (in which case the model is neither retrained nor reconstructed). Retraining performs a full retraining of the model using the overall loss; reconstruction operates layerwise by minimizing the per-layer $L_2$-deviation loss (see the loss sketch after this list).
  • peft_strategy: Whether to use SelectivePEFT or FullFT. FullFT retrains all parameters and consequently requires more memory. SelectivePEFT enables more parameter-efficient fine-tuning, which can be further specified via the peft_use_* parameters.
  • lora_type: When SelectivePEFT is used and peft_use_lora is True, this parameter specifies the LoRA variant to use. Options are lora, lora_prune, scale_lora, and mask_lora, corresponding to the variants proposed in the paper (a mask_lora-style sketch follows this list).
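
To make these options concrete, below are two minimal sketches. First, the layerwise reconstruction objective: the pruned layer is trained to match the dense layer's output on calibration inputs. Function and argument names are illustrative assumptions, not this repository's API:

```python
import torch
import torch.nn.functional as F

def layerwise_reconstruction_loss(dense_layer, pruned_layer, calib_inputs):
    # Squared-error proxy for the per-layer L2 deviation; only the
    # pruned layer's trainable parameters receive gradients.
    with torch.no_grad():
        target = dense_layer(calib_inputs)  # dense reference activations
    return F.mse_loss(pruned_layer(calib_inputs), target)
```

Second, a sparsity-preserving low-rank update in the spirit of mask_lora: the low-rank update B @ A is multiplied by the frozen pruning mask, so merging it into the weight cannot re-densify pruned entries. Class and parameter names are hypothetical, not the repository's implementation:

```python
import torch
import torch.nn as nn

class MaskedLoRALinear(nn.Module):
    """Illustrative sparsity-preserving LoRA layer (hypothetical names)."""

    def __init__(self, weight: torch.Tensor, mask: torch.Tensor, r: int = 8):
        super().__init__()
        self.register_buffer("weight", weight * mask)  # frozen pruned weight
        self.register_buffer("mask", mask)             # frozen pruning mask
        out_f, in_f = weight.shape
        self.A = nn.Parameter(torch.randn(r, in_f) * 0.01)  # trainable LoRA factor
        self.B = nn.Parameter(torch.zeros(out_f, r))        # zero-init, trainable

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Masking the update keeps the merged weight exactly as sparse as the mask.
        delta = (self.B @ self.A) * self.mask
        return x @ (self.weight + delta).T
```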

Citation

If you find this work useful for your research, please consider citing:

```bibtex
@article{zimmer2023perp,
  title={{PERP}: Rethinking the Prune-Retrain Paradigm in the Era of {LLMs}},
  author={Zimmer, Max and Andoni, Megi and Spiegel, Christoph and Pokutta, Sebastian},
  journal={arXiv preprint arXiv:2312.15230},
  year={2023}
}
```
