Skip to content

Conversation

@mali-git
Copy link
Member

No description provided.

@mali-git mali-git requested a review from Copilot June 28, 2025 17:22
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds a new translation evaluation feature and updates import paths to the new translation submodule.

  • Updates test imports to use ml_filter.translation.translate.
  • Introduces translation_evaluation.py with functions to load references, prepare inputs, and compute COMET scores.
  • Extends the CLI (__main__.py) with an evaluate_translations command.

Reviewed Changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 3 comments.

File Description
tests/test_translate.py Updated import path for Translator.
tests/conftest.py Updated import paths for translation clients and Translator.
src/ml_filter/translation/translation_evaluation.py Added new module for reference loading, input prep, and COMET evaluation.
src/ml_filter/main.py Imported evaluate_translations and added a CLI subcommand for translation evaluation.
Comments suppressed due to low confidence (1)

src/ml_filter/translation/translation_evaluation.py:33

  • The docstring for _prepare_translation_input lists a lang parameter that isn’t in the signature. Please update the docstring to match the actual parameters.
        lang: Language code.

@click.option("--gold-path", required=True, help="Path to gold reference JSONL file")
@click.option("--model-name", default="Unbabel/wmt22-cometkiwi-da", help="COMET model to use")
@click.option("--languages", type=str, required=True, help="Comma-separated list of supported language codes")
@click.option("--batch-size", help="Batch size for processing translations")
Copy link

Copilot AI Jun 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI option --batch-size has no type specified; it will be parsed as a string. Consider adding type=int to ensure batch_size is passed as an integer.

Suggested change
@click.option("--batch-size", help="Batch size for processing translations")
@click.option("--batch-size", type=int, help="Batch size for processing translations")

Copilot uses AI. Check for mistakes.
for filename in os.listdir(data_dir):
if filename.endswith(".jsonl"):
file_path = os.path.join(data_dir, filename)
lang = filename.split("_")[5]
Copy link

Copilot AI Jun 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting by underscore and accessing index 5 is brittle and may raise IndexError for unexpected filenames. Add a check on segment length or use a more robust parsing method.

Copilot uses AI. Check for mistakes.
Copy link
Member

@le1nux le1nux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionalitywise I think it is all correct. There a few stylistic issues.
Did not get to review the plotting functions themselves, yet.

batch_size: int,
model_name: str = "Unbabel/wmt22-cometkiwi-da",
) -> None:
"""Evaluate translation quality for a set of files using a COMET model.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe explain how we actually want to evaluate it? Like 2-3 sentences about pitching the idea.

target_texts = []
with open(file_path, "r") as f:
for line_num, line in enumerate(f, 1):
if not line:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line is never a boolean, but it treated here like one. It also works with "None" but in my opinion it is bad style.

Suggested change
if not line:
if line is None:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we maybe even raise an exception here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line will never be None since it is a string.
The current solution checks for empty strings.

languages: str,
batch_size: int,
):
"""CLI entry point for evaluating translation quality."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should document that the files in data_dir need to folllow a certain convention


target_texts = _prepare_translation_input(file_path, gold_dict)

if target_texts:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if target_texts:
if len(target_texts) > 0:

except json.JSONDecodeError as e:
logging.warning(f"Skipping invalid line {line_num} in {file_path}: {e}")
continue
return target_texts
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should add a warning if len(target_texts) != len(gold_dict)
If some languages misses certain translations it introduces biases.


# Step 3: Count translation scores
for sample_id, trans_score in id_to_translation_score.items():
if sample_id in id_to_gt_quality_score:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if this is not the case? Should we issue a warning?

if sample_id in id_to_gt_quality_score:
gt_score = float(id_to_gt_quality_score[sample_id])
trans_score = trans_score.lower()
if trans_score in score_classes:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also what about the else case?

if filename.endswith(".json"):
parts = filename.split("_")
if len(parts) != 8:
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no warning?

languages: list[str],
output_dir: Path,
) -> None:
"""Plot histograms for translation quality results.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should also mention the convention of the file names in data_dir here

output_path=output_path,
)

output_path = os.path.join(output_dir, f"{lang}_translation_quality_vs_gt_histogram.png")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want this for the paper, I would always plot as PDF

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants