-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tgi server :: tool_choice="auto" behaves like tool_choice="required" from OpenAI spec #2549
Comments
gentle ping @drbh |
I am running into this issue as well. I am not knowledgable enough in rust to deal with this, but I would very much appreciate if you take this on @mottoslo ! |
I think handling this issue may involve (breaking) changes in feature and needs to be discussed beforehand, hence I do not know where to start. |
Either way, it would be a huge improvement. As it stands, we can't easily build agents based on models deployed with TGI because of this. At least not using the Messages API. I tried manually applying the chat template and using the generate endpoints, and the model appears to be able to choose not to use a tool. The downside of this approach is that the manual chat template handling makes it much harder to integrate in existing frameworks. Being able to use TGI as a drop-in replacement for OpenAI models would be fantastic. |
#2614 has been merged into main and is part of the latest release. Has anyone already had a chance to test if this solves the issue? |
Hey guys, seen the PRs related to this that were in the recent release. It doesn't look like this has fixed the issue for me. I'm using HuggingFaceEndpoint wrapped up inside ChatHuggingFace to try and do ToolCalling with langchain. I am finding that the LLM will always cool that one tool and never stop in order to respond when it has the information it needs. So, it appears to me that it's still behaving as if tool_choice='required'. A general side note, I find using langchain for anything absolutely horrific but it's the easiest way out there. I would have preferred to suffer less and use the OpenAI classes from langchain to do this (rather than the chathuggingface, hfendpoint classes) but it seems that the API for TGI is not yet fully married with that which OpenAI uses. |
System Info
tgi version : 2.3.0
model : Meta-Llama-3-8B-Instruct
Information
Tasks
Reproduction
0. tool definition to use for reproduction
1. Using OpenAI with tool_choice="auto"
=> responds with normal chat message since prompt does not need tool_call
2. Using tgi with tool_choice="auto" (model = llama)
=> tries to call a function anyway
3. Using OpenAI with tool_choice="required"
=> tries to call a function anyway
Expected behavior
When consuming tgi, I expect the server to be able to respond both with and without tool_call, when provided with tool definitions. As of now, application needs to be aware that tool calling is required before calling tgi, which In my opinion is not something LLM applications should aim for.
I am curious if the above behavior is intended. I have found that someone has raised this issue, (#1587 (comment)) but it wasn't addressed anywhere.
Maybe something can be done with tool prompt, ToolType enum and chat_completions logic in server.rs ?
If this behavior is not intended and needs fixing, I would love to give it a shot !
Thank you :)
The text was updated successfully, but these errors were encountered: