Upload testing suite for DistillationTrainer #5615
cmpatino wants to merge 5 commits into huggingface:main
Conversation
qgallouedec
left a comment
Thanks for adding tests for `DistillationTrainer`; it had none, so this is directionally welcome. A few things worth addressing before merge:
- Heavy use of `DistillationTrainer.__new__(...)` + manual attribute assignment, plus five `Dummy*`/custom mock classes. Every test assembles a fake trainer by bypassing `__init__` and setting ~10 private attributes inline. This couples every test to internal attribute names. Any rename or new attribute used in `compute_loss` silently breaks the suite. It also cuts against two principles: consistency (the rest of `tests/experimental/`, e.g. `test_gkd_trainer.py`, loads a tiny real model and exercises the real `__init__`) and simplicity (the mock scaffolding and duplicated attribute setup in `test_server_teacher_path_handles_variable_prompt_lengths`/`..._padded_completions` is exactly that). A tiny-model fixture would remove most of it.
- No end-to-end test. One short `trainer.train()` on a tiny student + teacher would catch far more than the current mocked suite combined, and would make most of the attribute-juggling tests redundant. It would also be more aligned with the principle of testing behavior over implementation (see the sketch below).
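For illustration, a behavior-level test of that kind might look roughly like the sketch below. The import path, tiny model id, dataset name, and the `teacher_model`/`processing_class` argument names are assumptions to be checked against the actual `DistillationTrainer` API, not a copy of it:

```python
# Minimal end-to-end sketch (assumed API): adapt the import path, model/dataset ids,
# and argument names to whatever DistillationTrainer actually exposes.
import tempfile

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

from trl.experimental.distillation import DistillationConfig, DistillationTrainer  # assumed path


def test_train_runs_end_to_end():
    model_id = "trl-internal-testing/tiny-Qwen2ForCausalLM-2.5"  # assumed tiny test model
    student = AutoModelForCausalLM.from_pretrained(model_id)
    teacher = AutoModelForCausalLM.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    dataset = load_dataset("trl-internal-testing/zen", "standard_prompt_only", split="train")  # assumed dataset

    with tempfile.TemporaryDirectory() as tmp_dir:
        args = DistillationConfig(
            output_dir=tmp_dir,
            per_device_train_batch_size=2,
            max_steps=2,
            report_to="none",
        )
        trainer = DistillationTrainer(
            model=student,
            teacher_model=teacher,  # assumed argument name
            args=args,
            train_dataset=dataset,
            processing_class=tokenizer,
        )
        trainer.train()

        # Behavior-level check: training actually stepped, without touching any internals.
        assert trainer.state.global_step == 2
```

One test along these lines exercises the real `__init__`, the real `compute_loss`, and the data wiring at once, which is exactly what the attribute-level mocks cannot guarantee.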
| torch.testing.assert_close(local_loss, server_loss) |
| def test_sampled_mode_keeps_teacher_argmax_for_forward_support(): |
This is tautological. The expected value is computed by calling the same private helper `_jsd_divergence` with the same support/mask construction that `compute_loss` uses internally, so it tests wiring, not semantics. Any bug inside `_jsd_divergence` is invisible. Derive the expected value from first principles (plain JSD formula) or drop it.
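For example, the expected value could come from the plain generalized-JSD definition, independent of any trl helper. A minimal sketch, assuming the convention JSD_beta(P || Q) = beta * KL(P || M) + (1 - beta) * KL(Q || M) with M = beta * P + (1 - beta) * Q and the teacher playing the role of P; whichever convention the trainer actually uses would need to be matched:

```python
import torch
import torch.nn.functional as F


def reference_jsd(student_logits: torch.Tensor, teacher_logits: torch.Tensor, beta: float = 0.5) -> torch.Tensor:
    """Generalized JSD computed straight from the definition (teacher as P, student as Q)."""
    p = F.softmax(teacher_logits, dim=-1)
    q = F.softmax(student_logits, dim=-1)
    m = beta * p + (1 - beta) * q
    # KL terms computed directly; fine for random test logits where probabilities stay away from 0.
    kl_pm = torch.sum(p * (p.log() - m.log()), dim=-1)
    kl_qm = torch.sum(q * (q.log() - m.log()), dim=-1)
    return (beta * kl_pm + (1 - beta) * kl_qm).mean()
```

The test would then compare this reference against the loss the trainer produces for the same inputs with `torch.testing.assert_close`, so a bug inside `_jsd_divergence` actually shows up.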
| report_to="none", |
| ) |
| assert caught == [] |
Fragile: any unrelated `DeprecationWarning` from a dependency will break it. The `pytest.raises` is already the real assertion.
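If the empty-warnings assertion is kept, one way to make it less brittle is to only count deprecation warnings that originate from trl itself. A sketch, assuming `caught` is the record list from `warnings.catch_warnings(record=True)` and that filtering on the emitting filename is good enough:

```python
# Ignore DeprecationWarnings coming from dependencies; only fail on trl's own.
trl_deprecations = [
    w
    for w in caught
    if issubclass(w.category, DeprecationWarning) and "trl" in (w.filename or "")
]
assert trl_deprecations == []
```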
Thank you for the feedback! I addressed your comments with the following changes:
- Followed GKD's tests and used a small model and dataset to test the trainer.
- Improved the Liger tests by following the testing approach from the SFTTrainer.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 1033b7c.

What does this PR do?
Uploads tests for the DistillationTrainer.

Before submitting
AI writing disclosure
We welcome the use of AI tools to help with contributions. For transparency and to help us improve our review process, please indicate the level of AI involvement in this PR.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
Note
Low Risk
Test-only change; no production code paths are modified, with risk limited to potential CI flakiness due to model/dataset dependencies and optional accelerator coverage.
Overview
Adds a new experimental test module covering `DistillationTrainer` and related utilities.

The suite validates `DistillationConfig` argument constraints (teacher server vs. Liger, server URL required, reverse-KL mode restrictions), correctness of `build_teacher_request_inputs` trimming/prompt-length logic, and `generalized_jsd_loss` behavior across `beta`, temperature, and reductions. It also adds integration-style tests that run a short train with a local teacher (including checkpoint output), optionally exercises the Liger path, verifies parity between local-teacher and mocked vLLM-teacher-server losses in sampled top-1 mode, and checks that `_RepeatBatchDataLoader` forwards `set_epoch`.

Reviewed by Cursor Bugbot for commit 43950bb. Bugbot is set up for automated code reviews on this repo.
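As an illustration of the config-constraint tests described above, they might be parameterized roughly as follows. The import path and the field names `teacher_server_url` / `use_liger_kernel` / `use_teacher_server` are hypothetical stand-ins, not the real `DistillationConfig` fields:

```python
# Hypothetical sketch: field names and import path are placeholders, not the real API.
import pytest

from trl.experimental.distillation import DistillationConfig  # assumed import path


@pytest.mark.parametrize(
    "bad_kwargs",
    [
        # Teacher server and Liger kernels together (assumed to be mutually exclusive).
        {"teacher_server_url": "http://localhost:8000", "use_liger_kernel": True},
        # Server mode requested without a URL (assumed to be rejected).
        {"use_teacher_server": True, "teacher_server_url": None},
    ],
)
def test_config_rejects_invalid_combinations(tmp_path, bad_kwargs):
    with pytest.raises(ValueError):
        DistillationConfig(output_dir=str(tmp_path), report_to="none", **bad_kwargs)
```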
DistillationConfigargument constraints (teacher server vs liger, server URL required, reverse-KL mode restrictions), correctness ofbuild_teacher_request_inputstrimming/prompt-length logic, andgeneralized_jsd_lossbehavior acrossbeta, temperature, and reductions. It also adds integration-style tests that run a short train with a local teacher (including checkpoint output), optionally exercises the liger path, verifies parity between local-teacher and mocked vLLM-teacher-server losses in sampled top-1 mode, and checks_RepeatBatchDataLoaderforwardsset_epoch.Reviewed by Cursor Bugbot for commit 43950bb. Bugbot is set up for automated code reviews on this repo. Configure here.