-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
translator: support the developer role in Bedrock translator #127
Conversation
Signed-off-by: Ignasi Barrera <[email protected]>
Im no expert here - assigning @yuzisun |
I think aws bedrock does not explicitly support the developer role as defined in OpenAI schema which is mainly to give developers more control over the assistants behavior. You can still achieve the same by providing the instruction in the input prompt to guide the model behavior. |
The intention in this project is to provide an OpenAI compatible API layer. If aws bedrock does not explicitly support this role, an error is more appropriate imo. @nacx what do you think ? |
I'm not sure erroring is a good idea. Also, erroring would be IMHO a bad UX. One could configure, for the same routing header, an OpenAI backend and an AWS Bedrock backend. In that case, user requests would fail intermittently (based on the configured weights), because depending on the selected backend the request could be translated or not. |
According to chat completion API spec, system and developer are still separate roles, it’s not marked deprecated unless I am missing something. |
In the creation API they're still separate roles, but I think
That previously called system prompt makes me think that it is in fact a replacement for the system role and that it should be safe to translate it into system when converting to Bedrock. Also we need to account for the user experience we want to provide in this project as a whole: we're providing an OpenAI façade for different AI backends, and there are already features to load balance between them. If we return errors for things like this, requests will start failing intermittently, depending on the backend they hit. (and this may be unpredictable behavior for App teams as in many orgs the gateways would be owned by a different team). In this concrete case, I think the translation is OK and safe, given the OpenAI docs. Beyond this PR, however, we need to think about what should be the behaviour when something is non-translateable to all possible backends and how that would affect live traffic when N backends are configured for the same model. |
If certain feature is not supported by multiple backends for the same model then application user should not use it if they want to load balance among them. An explicit error is better than processing with inconsistent behaviors imo, user gets more explicit feedback. For this particular case it might be fine, I do not want to block you without a concrete scenario but we need to document such possible incompatible translations. |
+1 but after adding some additional features that are useful without load balancing to multiple providers (like prompt guard etc?), then enabling certain feature for only some backend might make sense... let's see how this unfolds later |
This is the main tradeoff when building an abstraction API: not all backends will support the same features and the safe thing to do is to support the common denominator. Reallity, however, is that as the API becomes richer in features, that common denominator gets smaller and smaller, and the value provided by the abstraction if only supporting the common denominator gets diminished very quickly. I agree we can't support everything, but one of the lessons I learned when building OSS cloud compute abstraction APIs in the past is that if we want the project to succeed, we need to try providing more than just the common things to be valuable. |
This is a follow-up PR for: #117 (comment) I think it makes sense to do the same conversions and set it as "system"? The "developer" role is a replacement for "system" in OpenAI, so it should be fine? Signed-off-by: Ignasi Barrera <[email protected]>
This is a follow-up PR for: #117 (comment)
I think it makes sense to do the same conversions and set it as "system"? The "developer" role is a replacement for "system" in OpenAI, so it should be fine?