chat history #85
hoarycrippl3
started this conversation in
General
Replies: 1 comment 1 reply
-
It has been replaced by the "maximum prompt size in tokens" option, as suggested in #77. The idea is that, since message sizes can vary a lot, it is more robust to limit the prompt size in terms of tokens. Try sending lots of messages with this parameter set to a low value (like 500), then increase it by 100 at a time before each new message until the script crashes. That will tell you the limit for your GPU.
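The token-budget approach described above can be sketched as follows. This is a minimal illustration, not the project's actual code: `trim_history` and `count_tokens` are hypothetical names, and a naive whitespace split stands in for the model's real tokenizer.

```python
# Drop the oldest chat messages until the prompt fits a token budget,
# instead of keeping a fixed number of messages.

def count_tokens(text: str) -> int:
    # Assumption: whitespace split as a crude stand-in for a real tokenizer.
    return len(text.split())

def trim_history(messages: list[str], max_prompt_tokens: int) -> list[str]:
    """Keep the most recent messages whose total token count fits the budget."""
    kept = []
    total = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(msg)
        if total + cost > max_prompt_tokens:
            break  # adding this older message would exceed the budget
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order

history = ["hello there", "how are you doing today", "fine thanks and you"]
print(trim_history(history, 8))  # only the newest message fits an 8-token budget
```

Because the cutoff is measured in tokens rather than message count, a few long messages consume the budget just as fast as many short ones, which is why this is more robust against out-of-memory errors.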
-
Hello, I was able to run the text generation webui with the pygmalion-6b model on my RTX 2070 Super with 8 GB of VRAM by using the following options:
--load-in-8bit --auto-devices --disk --gpu-memory 6 --no-stream --share
but I had to set the chat history to "6" instead of "0" (unlimited). Now that option seems to be gone. I can still run the model, but I hit out-of-memory errors much sooner, and sometimes it doesn't want to run at all. Am I overlooking this option, or has the chat history option been removed? Thank you!