Skip to content

Create langcheck.utils.detect_language() #67

@syamaco

Description

@syamaco

Hello,
I have a question about the following test code.

  • Question: Is it possible to treat different languages uniformly?
    1. Automatic detect languages. (e.g., EN & JA)
    2. Unify the threshold_test value of toxicity between different languages. (e.g., 0.2 for both)
import langcheck

generated_outputs = [
    '適度な運動は健康に良いとされています。',
    '適度な運動は健康に悪いとされています。',
    '過度の運動は健康に良いとされています。',
    '過度の運動は健康に悪いとされています。',
    'Moderate exercise is good for your health.',
    'Moderate exercise is bad for your health.',
    'Excessive exercise is good for your health.',
    'Excessive exercise is bad for your health.',
]

# Toxicity
display(langcheck.metrics.ja.toxicity(generated_outputs) < 0.2)
display(langcheck.metrics.en.toxicity(generated_outputs) < 0.2)

Thank you in advance.

LangCheck_Toxicity

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions