QNI-SCFT

This repository contains the code for the QNI-SCFT paper. QNI-SCFT updates only the salient columns of the weight matrices during fine-tuning and injects Gaussian noise, scaled by the quantization step size, into the non-salient columns.
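
The noise-injection idea can be illustrated with a small sketch. This is not the repository's implementation: the function names `quant_step` and `inject_noise`, the per-row min-max step size, the choice to compute the step over non-salient entries only, and the use of a standard deviation equal to one step are all assumptions made for illustration. Plain Python lists stand in for weight tensors.

```python
import random

def quant_step(values, bits=4):
    """Step size of a uniform min-max quantizer with 2**bits levels (assumed scheme)."""
    return (max(values) - min(values)) / (2 ** bits - 1)

def inject_noise(weight, salient_cols, bits=4, rng=None):
    """Return a noisy copy of `weight`, given as a list of rows.

    Entries in `salient_cols` are copied unchanged. Every other entry receives
    Gaussian noise whose standard deviation equals the step size of its row,
    computed over the non-salient entries only (an assumption of this sketch).
    """
    rng = rng or random.Random(0)
    salient = set(salient_cols)
    noisy = []
    for row in weight:
        rest = [w for j, w in enumerate(row) if j not in salient]
        step = quant_step(rest, bits) if rest else 0.0
        noisy.append([
            w if j in salient else w + step * rng.gauss(0.0, 1.0)
            for j, w in enumerate(row)
        ])
    return noisy

# Toy 2x4 weight matrix; column 3 is treated as salient (hypothetical choice).
weight = [[0.5, -1.0, 0.25, 2.0],
          [1.5, 0.0, -0.75, 0.5]]
noisy = inject_noise(weight, salient_cols=[3], bits=4)
```

In this sketch the salient column passes through untouched, which mirrors the description above: those columns are the ones that would receive gradient updates, while the remaining columns only see simulated quantization noise.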

Installation

Preliminaries

We highly recommend using a Docker image that supports CUDA. For our experiments, we used the following image:

# pull image
docker pull pytorch/pytorch:2.1.0-cuda12.1-cudnn8-devel

Run the container and install Git:

# run container
docker run -it --gpus all --ipc=host -v {local_storage}:{docker_container_storage} pytorch/pytorch:2.1.0-cuda12.1-cudnn8-devel

# install git
apt update && apt install git -y

Packages installation

Clone the QNI-SCFT repository:

git clone https://github.com/ZhMax/qni_scft.git
cd qni_scft

Install the QNI-SCFT integration into Hugging Face's Transformers library, along with the additional packages:

# transformers
cd transformers_modified
pip install .

pip install sentencepiece
pip install protobuf

# configs
pip install ml_collections

# logging
pip install wandb

Install lm-evaluation-harness for evaluation:

pip install lm-eval

Dependencies

python 3.10.13
pytorch 2.1.0
cuda 12.1
cudnn 8

Experiments were conducted on an NVIDIA A100 GPU with 80 GB of memory.

Salient Columns

Selecting the salient columns for fine-tuning requires sensitivity metrics. To estimate these metrics for a full-precision LLM, run the following script:

bash sensitivity_metrics/salientcolumns/scripts/salient_metric.sh
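
Once per-column sensitivity scores are available, choosing the salient columns amounts to a top-k selection. The sketch below assumes one score per column and a hypothetical helper `top_k_columns`; the scores are made up, and the metric actually computed by the script is not reproduced here.

```python
def top_k_columns(scores, k):
    """Indices of the k highest-scoring columns, returned in ascending index order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

# Hypothetical per-column sensitivity scores for a layer with five input columns.
scores = [0.02, 1.70, 0.15, 0.90, 0.01]
salient = top_k_columns(scores, k=2)
print(salient)  # [1, 3]
```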

Train

Note: Only LLaMA models can be fine-tuned.

To fine-tune salient columns of a full-precision LLM with quantization noise injection, run the following command:

python llm_tune/train_instruct.py --config_path=llm_tune/configs/llama_scft_with_bitnoise_4bit.py

Eval

Finally, evaluate the fine-tuned model on zero-shot benchmarks with lm-evaluation-harness using the following command:

lm_eval --model hf \
  --model_args "pretrained=<path to the directory with the fine-tuned model>" \
  --tasks winogrande,hellaswag,swag,boolq,xwinograd_en \
  --batch_size 16 \
  --num_fewshot 0 \
  --device cuda
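
When evaluating several checkpoints, it can be convenient to assemble the same command from Python. The helper below is a hypothetical convenience wrapper, not part of this repository or of lm-evaluation-harness; it only reuses the flags shown above, and `/path/to/model` is a placeholder.

```python
import shlex

def build_lm_eval_command(model_dir, tasks, batch_size=16, num_fewshot=0, device="cuda"):
    """Build the lm_eval argument list for one fine-tuned model directory."""
    return [
        "lm_eval", "--model", "hf",
        "--model_args", f"pretrained={model_dir}",
        "--tasks", ",".join(tasks),
        "--batch_size", str(batch_size),
        "--num_fewshot", str(num_fewshot),
        "--device", device,
    ]

tasks = ["winogrande", "hellaswag", "swag", "boolq", "xwinograd_en"]
cmd = build_lm_eval_command("/path/to/model", tasks)
print(shlex.join(cmd))
# To launch the evaluation: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems if the model path contains spaces.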

Citation

If you plan to use our work in your projects, please consider citing our paper:
