[codex] docs: complete translation configuration reference#2132
Open
lbliii wants to merge 4 commits into
Open
Conversation
Signed-off-by: Lawrence Lane <llane@nvidia.com>
Contributor
| - `backend_type="llm"` requires both `client` and a non-empty `model_name`, even in dry-run mode. | ||
| - FAITH always uses an LLM client. With Google, AWS, NMT, or a custom translation backend, pass a separate `client` and set `faith_model_name`. | ||
| - `merge_scores=True` with `output_mode="replaced"` is rejected. With FAITH disabled, score merging is skipped with a warning. | ||
| - `output_mode="raw"` removes `output_field` after building `translation_metadata`. Do not combine raw mode with message reconstruction; use `"both"` when you need metadata and `translated_messages`. |
Contributor
There was a problem hiding this comment.
The guidance correctly says not to combine
output_mode="raw" with reconstruct_messages, but doesn't explain the concrete consequence. In FormatTranslationOutputStage.process, output_field is dropped from the DataFrame before _build_translated_messages is called, so every translated_messages entry ends up as an empty string. Describing that failure mode here will save users from a confusing silent-empty result.
Suggested change
| - `output_mode="raw"` removes `output_field` after building `translation_metadata`. Do not combine raw mode with message reconstruction; use `"both"` when you need metadata and `translated_messages`. | |
| - `output_mode="raw"` removes `output_field` after building `translation_metadata`. Do not combine raw mode with message reconstruction: `output_field` is dropped before `_build_translated_messages` runs, so every `translated_messages` entry will be an empty string. Use `output_mode="both"` when you need metadata and `translated_messages`. |
Signed-off-by: Lawrence Lane <llane@nvidia.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changed
TranslationStageparameter reference covering inputs, segmentation, inference, FAITH evaluation, output, and resume controlsWhy
The translation controls introduced in #2038 were not fully represented in the user guide. Users had to inspect implementation details to discover defaults, constraints, backend requirements, and the behavior of resumed translation runs.
Impact
Users can configure and validate experimental translation pipelines directly from the guide, including mixed-provider translation and quality evaluation workflows. This is a documentation-only change; runtime behavior is unchanged.
Validation
cd fern && npm run check— passes with 0 errorscd fern && fern docs broken-links— changed page has no broken links; the command reports 22 pre-existing errors in older API-reference pagesgit diff --check— passesCloses #2126
Parent workstream: #2118