Replies: 1 comment
-
Doing e.g. a `ShiftLeft(3)` discards the first three tokens and moves everything else in the KV cache down to lower positions. If your sequence is the tokens A@0 B@1 C@2 D@3 E@4 F@5 and you have already prompted G, there is batched work queued up to store G@6. After the shift the cache holds D@0 E@1 F@2, so next time you prompt, it'll store the new token at position 3. Which means the batched work (still storing G@6) makes no sense! That's the mistake which the `CannotModifyWhileRequiresInferenceException` is there to prevent.

As you've found this is a bit awkward; if you have just one active Conversation you're in an almost unrecoverable situation. Your best option is probably to look at how much space is left *before* you prompt, and shift pre-emptively (see the sketch below). That's definitely a bit of an API design wart that needs cleaning up! Any ideas that would make it nicer to work with?
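A minimal sketch of that pre-emptive check, assuming LLamaSharp's `BatchedExecutor`/`Conversation` style API as quoted in this thread. Only `ShiftLeft` and the exception name appear in the thread itself; `TokenCount`, `Context.ContextSize`, `Prompt` and `Infer` are my assumptions about how to read usage/capacity and run the pending batch:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using LLama.Batched;
using LLama.Native;

static async Task PromptWithHeadroom(
    BatchedExecutor executor, Conversation conversation, List<LLamaToken> tokens)
{
    var capacity = (int)executor.Context.ContextSize; // total KV cache slots (assumed accessor)
    var used = (int)conversation.TokenCount;          // tokens currently in this sequence (assumed accessor)

    // Leave room for the new prompt tokens plus at least one generated token.
    if (used + tokens.Count + 1 > capacity)
    {
        // Nothing has been prompted since the last Infer(), so the cache may
        // still be modified: CannotModifyWhileRequiresInferenceException
        // cannot be thrown here.
        conversation.ShiftLeft(count: capacity / 2);
    }

    conversation.Prompt(tokens);
    await executor.Infer();
}
```

Shifting by half the context, as in the call quoted below, keeps the shifts infrequent; the exact threshold and shift size are tuning choices.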
-
I used to do `conversation.ShiftLeft(count: (int)this.ContextSize / 2)`, but just now I got a `CannotModifyWhileRequiresInferenceException`. So what to do in this case? Inference is impossible because the KV cache has run out of space, and shifting is impossible because of this exception. Can I check the space in advance somehow? Any other suggestions?
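To make the failure mode concrete, here is a sketch of the call ordering involved, under the same API assumptions as the sketch above (only `ShiftLeft` and the exception name come from this thread; the surrounding names are assumed):

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using LLama.Batched;
using LLama.Native;

static async Task OrderingDemo(
    BatchedExecutor executor, Conversation conversation, List<LLamaToken> tokens)
{
    // Prompt() queues batched work at fixed KV cache positions.
    conversation.Prompt(tokens);

    // Shifting now would move those positions out from under the queued work:
    // conversation.ShiftLeft(count: 8); // throws CannotModifyWhileRequiresInferenceException

    // Draining the pending batch first makes the shift legal again, but
    // running the batch is exactly what fails once the cache is already
    // full, hence the deadlock described above.
    await executor.Infer();
    conversation.ShiftLeft(count: 8);
}
```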