docs/source/en/model_doc/pixtral.md (+3 -3)
@@ -24,7 +24,7 @@ The Pixtral model was released by the Mistral AI team on [Vllm](https://github.c
Tips:
- Pixtral is a multimodal model. Its main contributions are 2D RoPE on the images and support for arbitrary image sizes (images are neither padded together nor resized).
-
- This model follows the `Llava` family, meaning image embeddings are inserted in place of the `[IMG]` token placeholders.
+
- This model follows the `Llava` family, meaning image embeddings are inserted in place of the `[IMG]` token placeholders.
- The format for one or multiple prompts is the following:
```
"<s>[INST][IMG]\nWhat are the things I should be cautious about when I visit this place?[/INST]"
@@ -35,7 +35,7 @@ This model was contributed by [amyeroberts](https://huggingface.co/amyeroberts)
Here is an example of how to run it:
-
```python
+
```python
from transformers import LlavaForConditionalGeneration, AutoProcessor
from PIL import Image
@@ -51,7 +51,7 @@ IMG_URLS = [
]
PROMPT = "<s>[INST]Describe the images.\n[IMG][IMG][IMG][IMG][/INST]"