Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

translator: support the developer role in Bedrock translator #127

Merged
merged 1 commit into from
Jan 20, 2025

Conversation

nacx
Copy link
Contributor

@nacx nacx commented Jan 18, 2025

This is a follow-up PR for: #117 (comment)

I think it makes sense to do the same conversions and set it as "system"? The "developer" role is a replacement for "system" in OpenAI, so it should be fine?

@nacx nacx requested a review from a team as a code owner January 18, 2025 19:38
@mathetake
Copy link
Member

Im no expert here - assigning @yuzisun

@yuzisun
Copy link
Contributor

yuzisun commented Jan 18, 2025

I think aws bedrock does not explicitly support the developer role as defined in OpenAI schema which is mainly to give developers more control over the assistants behavior. You can still achieve the same by providing the instruction in the input prompt to guide the model behavior.

@yuzisun
Copy link
Contributor

yuzisun commented Jan 18, 2025

The intention in this project is to provide an OpenAI compatible API layer. If aws bedrock does not explicitly support this role, an error is more appropriate imo. @nacx what do you think ?

@nacx
Copy link
Contributor Author

nacx commented Jan 18, 2025

I'm not sure erroring is a good idea. developer is the replacement for system, and we can even see system not being mentioned in the docs anymore: https://platform.openai.com/docs/guides/text-generation#messages-and-roles. I think it is reasonable to use developer as system.

Also, erroring would be IMHO a bad UX. One could configure, for the same routing header, an OpenAI backend and an AWS Bedrock backend. In that case, user requests would fail intermittently (based on the configured weights), because depending on the selected backend the request could be translated or not.

@yuzisun
Copy link
Contributor

yuzisun commented Jan 19, 2025

I'm not sure erroring is a good idea. developer is the replacement for system, and we can even see system not being mentioned in the docs anymore: https://platform.openai.com/docs/guides/text-generation#messages-and-roles. I think it is reasonable to use developer as system.

Also, erroring would be IMHO a bad UX. One could configure, for the same routing header, an OpenAI backend and an AWS Bedrock backend. In that case, user requests would fail intermittently (based on the configured weights), because depending on the selected backend the request could be translated or not.

According to chat completion API spec, system and developer are still separate roles, it’s not marked deprecated unless I am missing something.
https://platform.openai.com/docs/api-reference/chat/create. I think it only applies to new models “With o1 models and newer, use developer messages for this purpose instead.”

@nacx
Copy link
Contributor Author

nacx commented Jan 20, 2025

In the creation API they're still separate roles, but I think developer will replace `system1. In the link I provided, where the available messages and roles are described, developer is explained as follows:

Instructions to the model that are prioritized ahead of user messages, following chain of command. Previously called the system prompt.

That previously called system prompt makes me think that it is in fact a replacement for the system role and that it should be safe to translate it into system when converting to Bedrock.

Also we need to account for the user experience we want to provide in this project as a whole: we're providing an OpenAI façade for different AI backends, and there are already features to load balance between them. If we return errors for things like this, requests will start failing intermittently, depending on the backend they hit. (and this may be unpredictable behavior for App teams as in many orgs the gateways would be owned by a different team).

In this concrete case, I think the translation is OK and safe, given the OpenAI docs. Beyond this PR, however, we need to think about what should be the behaviour when something is non-translateable to all possible backends and how that would affect live traffic when N backends are configured for the same model.

@nacx nacx changed the title Support the developer role in Bedrock translator translator: support the developer role in Bedrock translator Jan 20, 2025
@yuzisun
Copy link
Contributor

yuzisun commented Jan 20, 2025

If certain feature is not supported by multiple backends for the same model then application user should not use it if they want to load balance among them. An explicit error is better than processing with inconsistent behaviors imo, user gets more explicit feedback. For this particular case it might be fine, I do not want to block you without a concrete scenario but we need to document such possible incompatible translations.

@mathetake mathetake merged commit a4c27bd into envoyproxy:main Jan 20, 2025
11 checks passed
@mathetake
Copy link
Member

mathetake commented Jan 20, 2025

If certain feature is not supported by multiple backends for the same model then application user should not use it if they want to load balance among them. An explicit error is better than processing with inconsistent behaviors imo

+1 but after adding some additional features that are useful without load balancing to multiple providers (like prompt guard etc?), then enabling certain feature for only some backend might make sense... let's see how this unfolds later

@nacx
Copy link
Contributor Author

nacx commented Jan 20, 2025

This is the main tradeoff when building an abstraction API: not all backends will support the same features and the safe thing to do is to support the common denominator. Reallity, however, is that as the API becomes richer in features, that common denominator gets smaller and smaller, and the value provided by the abstraction if only supporting the common denominator gets diminished very quickly.

I agree we can't support everything, but one of the lessons I learned when building OSS cloud compute abstraction APIs in the past is that if we want the project to succeed, we need to try providing more than just the common things to be valuable.

@nacx nacx deleted the dev-bedrock branch January 20, 2025 23:55
aabchoo pushed a commit that referenced this pull request Jan 21, 2025
This is a follow-up PR for:
#117 (comment)

I think it makes sense to do the same conversions and set it as
"system"? The "developer" role is a replacement for "system" in OpenAI,
so it should be fine?

Signed-off-by: Ignasi Barrera <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants