```python
# `model` is assumed to have been loaded earlier in this example
output = model.generate(texts=["Why LLM models are becoming so important?"])
print("Generated output by the model: {}".format(output))
```
You can find the data folder [here](examples/models/llama/alpaca_data).
<br>
## 🌟 What's new?
We are excited to announce the latest enhancements to our `xTuring` library:
1. __`LLaMA 2` integration__ - You can use and fine-tune the _`LLaMA 2`_ model in different configurations: _off-the-shelf_, _off-the-shelf with INT8 precision_, _LoRA fine-tuning_, _LoRA fine-tuning with INT8 precision_ and _LoRA fine-tuning with INT4 precision_ using the `GenericModel` wrapper, and/or you can use the `Llama2` class from `xturing.models` to test and fine-tune the model, as in the example and the variant-loading sketch below.
```python
from xturing.models import BaseModel
model = BaseModel.create('llama2')
```
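A minimal sketch of loading the other configurations (the variant keys below, such as `llama2_lora` and `llama2_lora_int8`, are assumptions based on the library's naming pattern; check the supported-models list for the exact keys):

```python
from xturing.models import BaseModel

# LoRA fine-tunable version (key assumed)
lora_model = BaseModel.create('llama2_lora')

# LoRA with INT8 precision (key assumed)
lora_int8_model = BaseModel.create('llama2_lora_int8')
```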
2. __`Evaluation`__ - You can now evaluate any `Causal Language Model` on any dataset. The only metric currently supported is [`perplexity`](https://en.wikipedia.org/wiki/Perplexity).
```python
# Make the necessary imports
from xturing.datasets import InstructionDataset
from xturing.models import BaseModel

# Load a dataset and a model (path and name are illustrative), then evaluate
dataset = InstructionDataset('./alpaca_data')
model = BaseModel.create('gpt2')
result = model.evaluate(dataset)
```
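The returned `result` should hold the computed perplexity; lower values mean the model assigns higher likelihood to the dataset.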
**docs/docs/advanced/api_server.md** (1 addition, 1 deletion)
description: FastAPI inference server
sidebar_position: 3
---
# ⚡️ Running model inference with FastAPI Server
<!-- Once you have fine-tuned your model, you can run the inference using a FastAPI server. -->
After successfully fine-tuning your model, you can perform inference using a FastAPI server. The following steps guide you through launching and utilizing the API server for your fine-tuned model.
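A minimal sketch of querying such a server from Python (the launch command, port, endpoint path, and payload shape below are all assumptions for illustration; match them to the routes your server actually exposes):

```python
# Launch the server first, e.g. via the CLI (command is an assumption):
#   xturing api -m "/path/to/the/fine-tuned/model"
import requests

# POST a prompt to the (assumed) inference endpoint
response = requests.post(
    "http://localhost:5000/api",
    json={"prompt": ["Why are LLM models becoming so important?"]},
)
print(response.json())
```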
**docs/docs/overview/supported_models.md** (1 addition, 0 deletions)
### INT4 Precision model versions
> In order to load any model's __`INT4+LoRA`__ version, you will need to make use of the `GenericLoraKbitModel` class from `xturing.models`. Below is how to use it:
```python
from xturing.models import GenericLoraKbitModel
model = GenericLoraKbitModel('/path/to/model')
```
The `/path/to/model` can be replaced with your local directory or any HuggingFace library model such as `facebook/opt-1.3b`.
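For context, here is a sketch of fine-tuning such a model end to end (the dataset path is illustrative, and `finetune`/`generate` are assumed to work on `GenericLoraKbitModel` as they do on the library's other model classes):

```python
from xturing.datasets import InstructionDataset
from xturing.models import GenericLoraKbitModel

# Dataset path and base model are illustrative
dataset = InstructionDataset('./alpaca_data')
model = GenericLoraKbitModel('facebook/opt-1.3b')

# Fine-tune with LoRA in INT4 precision, then generate from the tuned model
model.finetune(dataset=dataset)
output = model.generate(texts=["Why are LLM models becoming so important?"])
```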