
fix bug when resuming generation with OpenAI models #775

Open · wants to merge 4 commits into base: main
Conversation

RaoulDrake

I've come across what I think is a bug when resuming generation with OpenAI models, i.e., letting a model complete a partial response. Without the fix in this pull request, the partial response is discarded, which is likely not the intended behaviour and can also lead to an exception being raised a little further down the line. With the fix, everything works for me: any partial response is included in the OpenAI API request, and the model successfully completes it.
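For anyone skimming, the essence of the fix can be sketched as follows: any pending partial assistant text has to be sent as the trailing message of the Chat Completions request rather than dropped. This is an illustrative sketch only — `build_messages`, `history`, and `partial_assistant_text` are hypothetical names, not the actual guidance internals:

```python
# Sketch: build a Chat Completions `messages` payload that keeps a
# partial assistant response as the trailing message instead of
# discarding it. Names are illustrative, not guidance internals.

def build_messages(history, partial_assistant_text):
    """history: list of (role, content) tuples for completed turns."""
    messages = [{"role": role, "content": content} for role, content in history]
    if partial_assistant_text:
        # Without the fix, this partial text was dropped; with it, the
        # model is asked to continue from the partial assistant turn.
        messages.append({"role": "assistant", "content": partial_assistant_text})
    return messages

history = [
    ("system", "You are a helpful assistant."),
    ("user", "What is the capital of France?"),
]
payload = build_messages(history, "The capital of France is ")
```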

@Harsha-Nori
Collaborator

@RaoulDrake, we would love this feature, but unfortunately the OpenAI API prohibits pre-filling/partially completing an assistant message for the last turn in a conversation, which significantly hampers our ability to enforce constraints.

There's a chance this has changed recently, but to my understanding, this will run into failures on the OpenAI API for now.

@RaoulDrake
Author

RaoulDrake commented Apr 23, 2024

@Harsha-Nori, thanks for the quick feedback.

Here's a minimal example that works for me, i.e., no troubles on the OpenAI API side and the model successfully completes the response with "Paris":

import os
from guidance import models, gen, system, user, assistant

api_key = os.environ.get('OPENAI_API_KEY')
gpt = models.OpenAI("gpt-3.5-turbo", api_key=api_key, echo=False)

with system():
    lm = gpt + "You are a helpful assistant."

with user():
    lm += "What is the capital of France?"

# Start the assistant turn with a partial response and let the model
# complete it; gen() stops once it has produced the "." suffix.
with assistant():
    lm += "The capital of France is " + gen(name="capital", temperature=0.7, suffix=".")

print(lm, end="\n\n")
print(lm["capital"])

Output:

<|im_start|>system
You are a helpful assistant.<|im_end|><|im_start|>user
What is the capital of France?<|im_end|><|im_start|>assistant
The capital of France is Paris.<|im_end|>

Paris

@riedgar-ms
Collaborator

@RaoulDrake can you add a test to show how this works, please?

@RaoulDrake
Author

@riedgar-ms Sure — would turning the minimal example above into a test case in tests/models/test_openai.py suffice? For anything beyond that, I'm afraid I'm a little short on time at the moment, so I'd have to get back to you in a couple of weeks, if it's still of interest then.

@riedgar-ms
Collaborator

That would be great, thanks!

@codecov-commenter

codecov-commenter commented Apr 25, 2024

Codecov Report

Attention: Patch coverage is 0%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 62.34%. Comparing base (f5ad01d) to head (12ab89b).

Files                         Patch %   Lines
guidance/models/_openai.py    0.00%     3 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #775      +/-   ##
==========================================
- Coverage   69.04%   62.34%   -6.71%     
==========================================
  Files          55       55              
  Lines        4071     4074       +3     
==========================================
- Hits         2811     2540     -271     
- Misses       1260     1534     +274     


@RaoulDrake
Author

@riedgar-ms I have added the minimal example as a test case test_openai_prefill to test_openai.py.

@slundberg
Collaborator

@RaoulDrake as @Harsha-Nori said, we would love to add the ability to set a prefix for OpenAI calls, but unfortunately OpenAI does not support that currently. They do allow you to end your request with an assistant block, but any generation from the model begins a new assistant block (it seems); it does not continue the last one given. This means we sometimes get continuation-like behaviour, but often not:
[Screenshot: example output where the model starts a new assistant turn instead of continuing the prefilled one]
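To make that distinction concrete at the payload level, here is an illustrative sketch (plain data, not the guidance implementation): the request may end with a prefilled assistant message, but the completion comes back as a fresh assistant turn, so the client has to stitch the two together and hope the model actually continued:

```python
# Illustrative only: what "resuming" means at the Chat Completions
# payload level. The request may end with an assistant message, but the
# API returns a *new* assistant message rather than extending the last
# one, so the client must concatenate the two itself.

request_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is "},  # prefill
]

# Suppose the API happens to return this as a brand-new assistant turn:
returned_text = "Paris."

# Client-side stitching to approximate a continuation:
resumed = request_messages[-1]["content"] + returned_text
```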
