
Support apply_chat_template() #6889

Closed
1 task done
Melancholy495 opened this issue Feb 10, 2025 · 3 comments · Fixed by #6905
Labels
solved This problem has been already solved

Comments

Melancholy495 commented Feb 10, 2025

Reminder

  • I have read the above rules and searched the existing issues.

Description

Use the apply_chat_template() function from the transformers library directly to template messages for inference or fine-tuning.

The current templating approach matches messages against manually predefined prefixes and suffixes.
While this works well enough for mainstream models, many models fine-tuned from them use templates that differ substantially from the original, sometimes with more complex logic. Wrapping messages with prefixes and suffixes can hardly express that logic, especially for tool calling or other complex templates. A manually written template is also hard to keep exactly in sync with the predefined one, which may quietly hurt performance.

So, when a model already ships with a chat template, calling apply_chat_template() directly on the message history may be simpler and more efficient, avoiding tedious manual reconstruction and extra handling.
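For contrast, the prefix/suffix approach can be sketched in a few lines. This is a simplified illustration using the Llama-3 markers from the example below, not LLaMA-Factory's actual implementation:

```python
# Simplified sketch of prefix/suffix templating (illustration only, not
# LLaMA-Factory's actual code): each message is wrapped with hand-written
# role markers.
def render_with_affixes(messages):
    """Wrap each message with manually defined role prefixes/suffixes."""
    out = ["<|begin_of_text|>"]
    for msg in messages:
        out.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    return "".join(out)


# With transformers, the same string would instead come from the tokenizer's
# bundled Jinja template: tokenizer.apply_chat_template(messages, tokenize=False).
messages = [
    {"role": "system", "content": "You are a bot that responds to weather queries."},
    {"role": "user", "content": "Hey, what's the temperature in Paris right now?"},
]
print(render_with_affixes(messages))
```

The hardcoded markers are exactly where this approach breaks down: any logic beyond "prefix + content + suffix" (dates, tool sections, conditional placement) has no natural home here.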

Example: LLaMA-3.1-8B when using tools
In the original llama-3.1-8B template, a tool call should first declare the ipython environment and then the current date; the system message stands on its own, while the tool-calling instruction and tools_text are prepended to the first user message, as shown below:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Environment: ipython
Cutting Knowledge Date: December 2023
Today Date: 26 Jul 2024

You are a bot that responds to weather queries.<|eot_id|><|start_header_id|>user<|end_header_id|>

Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.

Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.

{{Functions}}

Hey, what's the temperature in Paris right now?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{"name": "get_current_temperature", "parameters": {"unit": "celsius", "location": "Paris, France"}}<|eom_id|>

However, the template added in llamafactory cannot insert the correct date, nor can it place the right text (such as the tool-calling instruction and tool information) in the right positions.
Such incorrect parsing carries potential risks. It can be worked around with extra conditionals (as in the image below), but that remains cumbersome and hard to maintain, and may still fail to match the original version, especially after multiple rounds of tool calls.

[Image: workaround with additional conditionals]

Pull Request

None yet.

Melancholy495 added the enhancement (New feature or request) and pending (This problem is yet to be addressed) labels on Feb 10, 2025
hiyouga (Owner) commented Feb 10, 2025

We have implemented the correct tool prompt; you may not have updated to the latest version:

LLAMA3_TOOL_PROMPT = (
    "Cutting Knowledge Date: December 2023\nToday Date: {date}\n\n"
    "You have access to the following functions. To call a function, please respond with JSON for a function call. "
    """Respond in the format {{"name": function name, "parameters": dictionary of argument name and its value}}. """
    "Do not use variables.\n\n{tool_text}"
)


class Llama3ToolUtils(ToolUtils):
    r"""
    Llama 3.x tool using template with `tools_in_user_message=False`.
    Reference: https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1/#json-based-tool-calling
    """

    @override
    @staticmethod
    def tool_formatter(tools: List[Dict[str, Any]]) -> str:
        date = datetime.now().strftime("%d %b %Y")
        tool_text = ""
        for tool in tools:
            wrapped_tool = {"type": "function", "function": tool}
            tool_text += json.dumps(wrapped_tool, indent=4, ensure_ascii=False) + "\n\n"
        return LLAMA3_TOOL_PROMPT.format(date=date, tool_text=tool_text)

    @override
    @staticmethod
    def function_formatter(functions: List["FunctionCall"]) -> str:
        if len(functions) > 1:
            raise ValueError("Llama-3 does not support parallel functions.")
        return f'{{"name": "{functions[0].name}", "parameters": {functions[0].arguments}}}'

    @override
    @staticmethod
    def tool_extractor(content: str) -> Union[str, List["FunctionCall"]]:
        try:
            tool = json.loads(content.strip())
        except json.JSONDecodeError:
            return content
        if "name" not in tool or "parameters" not in tool:
            return content
        return [FunctionCall(tool["name"], json.dumps(tool["parameters"], ensure_ascii=False))]

Besides, apply chat template causes a significant performance degradation for multi-turn training, so we do not use that function.
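For illustration, here is a self-contained sketch of the prompt that tool_formatter above produces; the sample tool definition is hypothetical, mirroring the weather example earlier in this issue:

```python
import json
from datetime import datetime

# Same prompt string as in the snippet above.
LLAMA3_TOOL_PROMPT = (
    "Cutting Knowledge Date: December 2023\nToday Date: {date}\n\n"
    "You have access to the following functions. To call a function, please respond with JSON for a function call. "
    'Respond in the format {{"name": function name, "parameters": dictionary of argument name and its value}}. '
    "Do not use variables.\n\n{tool_text}"
)

# Hypothetical sample tool (illustration only).
tools = [{"name": "get_current_temperature", "parameters": {"type": "object", "properties": {}}}]

# Mirrors the loop in tool_formatter: wrap each tool and append its JSON.
tool_text = ""
for tool in tools:
    wrapped_tool = {"type": "function", "function": tool}
    tool_text += json.dumps(wrapped_tool, indent=4, ensure_ascii=False) + "\n\n"

prompt = LLAMA3_TOOL_PROMPT.format(date=datetime.now().strftime("%d %b %Y"), tool_text=tool_text)
print(prompt)
```

Note that the `{{` / `}}` pairs in the prompt string are escaped braces for str.format, so they render as literal `{` and `}` around the JSON format instructions.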

Melancholy495 (Author) commented Feb 10, 2025

> We have implemented the correct tool prompt; you may not have updated to the latest version. […]

Thanks for the reply; it seems I'll have to add it manually each time, then 😢. That said, this version of the prompt still differs slightly from the official one: the official prompt splits tools into two kinds, built-in tools, which go directly into the system message, and custom tools, which go into the first user message (not sure whether this actually matters; intuitively it doesn't 😊).

----- Edit

It just occurred to me: is there a way to automatically parse the model's bundled chat_template and convert it into the corresponding llamafactory template format?

hiyouga added the solved (This problem has been already solved) label and removed the enhancement and pending labels on Feb 11, 2025
hiyouga (Owner) commented Feb 11, 2025

I've drafted a first version: if the template argument is not specified, the HF chat template is parsed automatically. We can look into how to support tools later.
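A hedged sketch of that fallback behavior; the function and class names here are illustrative only, not LLaMA-Factory's actual API:

```python
# Illustrative sketch of the fallback described above (hypothetical names,
# not LLaMA-Factory's actual API): use a manually defined template when one
# is named, otherwise delegate to the tokenizer's built-in chat template.
def format_messages(tokenizer, messages, template=None, named_templates=None):
    if template is not None:
        # Manually defined template registered under this name.
        return named_templates[template](messages)
    # Fall back to the Hugging Face chat template bundled with the tokenizer.
    return tokenizer.apply_chat_template(messages, tokenize=False)


# Stub standing in for a transformers tokenizer in this sketch.
class StubTokenizer:
    def apply_chat_template(self, messages, tokenize=False):
        return "".join(f"<{m['role']}>{m['content']}</{m['role']}>" for m in messages)


msgs = [{"role": "user", "content": "hi"}]
print(format_messages(StubTokenizer(), msgs))
```

With a real transformers tokenizer, the fallback path would render the Jinja chat_template shipped in the model's tokenizer_config.json.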
