Conversation

@isLinXu (Contributor) commented on Jan 10, 2026

Model Introduction

Hunyuan Translation Model Version 1.5 includes a 1.8B translation model, HY-MT1.5-1.8B, and a 7B translation model, HY-MT1.5-7B. Both models support mutual translation across 33 languages, including 5 ethnic-minority languages and dialects. HY-MT1.5-7B is an upgraded version of our WMT25 championship model, optimized for explanatory translation and mixed-language scenarios, with newly added support for terminology intervention, contextual translation, and formatted translation. Despite having less than one-third the parameters of HY-MT1.5-7B, HY-MT1.5-1.8B delivers translation performance comparable to its larger counterpart, achieving both high speed and high quality. After quantization, the 1.8B model can be deployed on edge devices and supports real-time translation scenarios, making it widely applicable.
 

Key Features and Advantages

  • HY-MT1.5-1.8B achieves industry-leading performance among models of its size, surpassing most commercial translation APIs.
  • HY-MT1.5-1.8B supports deployment on edge devices and real-time translation scenarios, offering broad applicability.
  • Compared with its September open-source version, HY-MT1.5-7B has been optimized for annotated and mixed-language scenarios.
  • Both models support terminology intervention, contextual translation, and formatted translation (a minimal usage sketch follows this list).
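
As a quick sanity check of the released checkpoint, the sketch below loads HY-MT1.5-1.8B with the transformers library and asks it to translate one sentence. It assumes the tencent/HY-MT1.5-1.8B repo ships a chat template usable via apply_chat_template; the prompt wording is an illustrative assumption, not the model's official prompt format.

# Hedged sketch: basic translation with HY-MT1.5-1.8B via transformers.
# Assumptions: repo id "tencent/HY-MT1.5-1.8B", a bundled chat template, and a free-form
# translation instruction (the official prompt format may differ).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tencent/HY-MT1.5-1.8B"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Translate the following text into English:\n\n今天天气很好。"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))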
     

Create a new file examples/train_lora/hunyuan_1_5_sft.yaml with the following content:

### model
model_name_or_path: tencent/HY-MT1.5-1.8B
trust_remote_code: true

### method
stage: sft
do_train: true
finetuning_type: full

### dataset
dataset: identity,alpaca_en_demo
template: hunyuan_1_5
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4

### output
output_dir: saves/hunyuan_1_5/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: false
ddp_timeout: 180000000
resume_from_checkpoint: null

# report_to: wandb
# run_name: hunyuan_1_5 # optional

### eval
# eval_dataset: alpaca_en_demo
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500

Commands used to launch training:

# single GPU
DISABLE_VERSION_CHECK=1 llamafactory-cli train examples/train_lora/hunyuan_1_5_sft.yaml
# single node, multi-GPU (launched via torchrun)
DISABLE_VERSION_CHECK=1 FORCE_TORCHRUN=1 llamafactory-cli train examples/train_lora/hunyuan_1_5_sft.yaml
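
Since the config uses finetuning_type: full, the output_dir holds a complete checkpoint that can be loaded directly with transformers once training finishes. The sketch below is illustrative; the path simply mirrors the output_dir in the YAML above.

# Hedged sketch: load the fine-tuned full checkpoint produced by the run above.
# Assumption: training has completed and output_dir contains the saved model and tokenizer files.
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt_dir = "saves/hunyuan_1_5/lora/sft"  # output_dir from the YAML config
tokenizer = AutoTokenizer.from_pretrained(ckpt_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(ckpt_dir, trust_remote_code=True, device_map="auto")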

fixes #9728

@gemini-code-assist commented

Summary of Changes

Hello @isLinXu, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates the Hunyuan Translation Model Version 1.5 into the system, expanding the range of supported translation models. This enhancement allows users to leverage the advanced capabilities of HY-MT1.5, which offers robust mutual translation across numerous languages, optimized performance for edge devices, and features like terminology intervention and contextual translation. The changes include necessary code modifications for template registration and model configuration, alongside documentation updates to reflect the new model's availability.

Highlights

  • New Model Support: Added support for the Hunyuan Translation Model Version 1.5 (HY-MT1.5), including both the 1.8B and 7B parameter versions, as well as their FP8 quantized variants.
  • Template Registration: A new template, hunyuan_1_5, has been registered to correctly format inputs and outputs for the HY-MT1.5 models, defining specific tokens for user, assistant, system, and prefix roles.
  • Documentation Updates: The README.md and README_zh.md files have been updated to include the newly supported Hunyuan1.5 (MT) models in the list of available models.
  • Model Group Configuration: A new model group has been registered in src/llamafactory/extras/constants.py to associate the HY-MT1.5 models with their respective download sources and the new hunyuan_1_5 template.

@gemini-code-assist (bot) left a comment

Code Review

This pull request adds support for the Hunyuan-MT 1.5 models. The changes are generally good, but I've found a critical issue and a few areas for improvement. There's a typo in a special character within the stop_words for the new hunyuan_1_5 template, which could lead to generation failures. I've also suggested some improvements for consistency and better user experience, such as adding a space to the model name in the README files for readability, and appending an -Instruct suffix to the model names in constants.py to enable automatic template selection. Please review the detailed comments.

@hiyouga added the "solved" label on Jan 11, 2026
@hiyouga merged commit 15b87f3 into hiyouga:main on Jan 11, 2026
15 of 17 checks passed
@hopkin-ghp commented on Jan 11, 2026

@hiyouga Hello,
From what I can tell, the HY-MT1.5-7B model uses the same template as hunyuan-mt-7b (version 1.0), i.e. hunyuan, while the HY-MT1.5-1.8B version uses a separate one, i.e. hunyuan_1_5.

@hiyouga (Owner) commented on Jan 11, 2026

> @hiyouga Hello, from what I can tell, the HY-MT1.5-7B model uses the same template as hunyuan-mt-7b (version 1.0), i.e. hunyuan, while the HY-MT1.5-1.8B version uses a separate one, i.e. hunyuan_1_5.

@isLinXu Could you please confirm this?

@isLinXu (Contributor, Author) commented on Jan 11, 2026

@hopkin-ghp You are right, thank you for pointing this out!
It appears that HY-MT1.5-7B was incrementally trained on top of the original base, which is why it retains the legacy template. The 1.8B version, on the other hand, was developed after the "Hunyuan" abbreviation was officially changed to "HY", which led to the new template structure.
To resolve this, we will:
  1. Assign a dedicated template name to the 1.8B version (hunyuan_small).
  2. Merge HY-MT1.5-7B and hunyuan-mt-7b (v1.0) into a single unified template to maintain consistency.
Thanks again for your contribution!
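
One quick, hedged way to check the claim that HY-MT1.5-7B shares the v1.0 chat format is to compare the chat templates bundled with the two tokenizers. The repo ids below are assumptions based on the model names used in this thread.

# Hedged sketch: compare the chat templates shipped with the two 7B checkpoints.
# Assumption: the Hugging Face repo ids are "tencent/HY-MT1.5-7B" and "tencent/Hunyuan-MT-7B".
from transformers import AutoTokenizer

tok_v15 = AutoTokenizer.from_pretrained("tencent/HY-MT1.5-7B", trust_remote_code=True)
tok_v10 = AutoTokenizer.from_pretrained("tencent/Hunyuan-MT-7B", trust_remote_code=True)

# True would support reusing the existing hunyuan template for HY-MT1.5-7B.
print(tok_v15.chat_template == tok_v10.chat_template)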

@hopkin-ghp commented

@isLinXu Thanks for your work.
I'm not entirely sure whether the 1.8B template is correct, as I inferred it from the official chat_template.jinja and from example outputs of the pretrained tokenizer. I would greatly appreciate it if you could verify the correctness of the hunyuan_small template.
Thank you very much.
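
A hedged way to spot-check hunyuan_small is to render a conversation with the official chat template from the model repo and inspect the role markers and stop tokens it emits, then compare them with the prompt LLaMA-Factory builds for the same messages. The repo id below is an assumption; the system turn is included only to surface its marker and can be dropped if the template rejects it.

# Hedged sketch: render the official chat template and inspect the special tokens it emits.
# Assumption: repo id "tencent/HY-MT1.5-1.8B"; compare the printed string against the prompt
# LLaMA-Factory constructs with template=hunyuan_small for the same conversation.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("tencent/HY-MT1.5-1.8B", trust_remote_code=True)
messages = [
    {"role": "system", "content": "You are a translation assistant."},
    {"role": "user", "content": "Translate to English: 你好，世界"},
]
rendered = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(repr(rendered))  # shows role markers, BOS/EOS handling, and the assistant prefix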
