fix bug when resuming generation with OpenAI models #775
base: main
Conversation
@RaoulDrake, we would love this feature, but unfortunately the OpenAI API prohibits pre-filling/partially completing an assistant message for the last turn in a conversation (which significantly hampers our ability to enforce constraints). There's a chance this has changed recently, but to my understanding, this will run into failures on the OpenAI API for now.
@Harsha-Nori, thanks for the quick feedback. Here's a minimal example that works for me, i.e., no troubles on the OpenAI API side and the model successfully completes the response with "Paris":
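A minimal sketch of what such an example looks like, assuming the guidance chat API with role blocks; the model name and prompt are illustrative rather than taken verbatim from the original comment, and an `OPENAI_API_KEY` must be set in the environment:

```python
from guidance import models, gen, system, user, assistant

# Illustrative model name; any OpenAI chat model should behave the same way.
gpt = models.OpenAI("gpt-3.5-turbo")

with system():
    lm = gpt + "You are a helpful assistant."

with user():
    lm += "What is the capital of France?"

with assistant():
    # Start the assistant turn with a partial response and let the model
    # resume it; with the fix, the partial text is included in the API
    # request and the generation completes it.
    lm += "The capital of France is " + gen("capital", max_tokens=10)

print(lm["capital"])
```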
Output: the partial response is completed with "Paris".
@RaoulDrake can you add a test to show how this works, please?
@riedgar-ms Sure, would the minimal example above, turned into a test case in tests/models/test_openai.py, suffice? For anything beyond that, I'm afraid I'm a bit short on time at the moment, so I'd have to get back to you in a couple of weeks, if it's still of interest then.
That would be great, thanks!
Codecov Report

@@            Coverage Diff             @@
##             main     #775      +/-   ##
==========================================
- Coverage   69.04%   62.34%   -6.71%
==========================================
  Files          55       55
  Lines        4071     4074       +3
==========================================
- Hits         2811     2540     -271
- Misses       1260     1534     +274
@riedgar-ms I have added the minimal example as a test case.
@RaoulDrake as @Harsha-Nori said, we would love to add the ability to set a prefix for OpenAI calls, but unfortunately OpenAI does not support that currently. They do allow you to end your request with an assistant block, but any generation from the model begins a new assistant block (it seems); it does not continue the last one given. So we sometimes get continuation-like behaviour, but often not:
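For illustration, here is roughly what that looks like with the openai Python client (v1-style); the model name and messages are placeholders, not taken from this thread:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder model name
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
        # Trailing assistant message we would like the model to continue:
        {"role": "assistant", "content": "The capital of France is"},
    ],
)

# The request is accepted, but the reply comes back as a *new* assistant
# message rather than a continuation of the one above, so it may or may not
# read like a direct continuation of the partial text.
print(response.choices[0].message.content)
```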
I've come across what I think must be a bug when resuming generation with OpenAI models, i.e., letting a model complete a partial response. Without the fix in this pull request, the partial response is discarded, which is likely not the intended behaviour and can also lead to an exception being raised further down the line. With the fix, everything works fine for me: the partial response is included in the OpenAI API request and the model can successfully complete it.
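To make the described difference concrete, here is a rough sketch (hypothetical names, not the actual patch) of the chat payload that ends up being sent with and without the fix:

```python
# Hypothetical illustration of the request payload, not the actual guidance code.
history = [
    {"role": "user", "content": "What is the capital of France?"},
]
partial_response = "The capital of France is"

# Without the fix: the partial assistant text is discarded, so the request
# ends after the user turn and the model answers from scratch.
messages_without_fix = list(history)

# With the fix: the partial text is appended as the trailing assistant
# message, so the model can resume and complete it (e.g. with "Paris").
messages_with_fix = history + [{"role": "assistant", "content": partial_response}]
```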