
feat(gen ai): showcase different options for computation-based metric #12756


Open
wants to merge 1 commit into base: main

Conversation

Valeriy-Burlaka (Member)

Description

Fixes #

Note: Before submitting a pull request, please open an issue for discussion if you are not associated with Google.

Checklist

@Valeriy-Burlaka Valeriy-Burlaka self-assigned this Nov 8, 2024
@Valeriy-Burlaka Valeriy-Burlaka requested review from a team as code owners November 8, 2024 15:05
@Valeriy-Burlaka Valeriy-Burlaka marked this pull request as draft November 8, 2024 15:05
@product-auto-label product-auto-label bot added the samples Issues that are directly related to samples. label Nov 8, 2024
@@ -37,7 +39,37 @@ def get_rouge_score() -> EvalResult:
life, including endangered species, it faces serious threats from
climate change, ocean acidification, and coral bleaching."""

# Compare pre-generated model responses against the reference (ground truth).
# Option1: Run model inference and evaluate model response against the reference (ground truth)
Member

The code sample looks too big now!

Valeriy-Burlaka (Member Author)

Yep, I understand

Valeriy-Burlaka (Member Author), Nov 8, 2024

@msampathkumar, I'm thinking about showcasing two different ways of using the computation-based metrics: bring-your-own-response (BYOR) and running model inference as part of the evaluation.
The reason is that, for me as a developer, the line between these options wasn't immediately obvious (hence this issue with the "prompt" column being silently unused), so I want to make it crystal clear.
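
Roughly, the two options I have in mind (a minimal sketch, assuming the `vertexai.preview.evaluation` SDK; the column names "prompt"/"response"/"reference", the `rouge_l_sum` metric, and the `gemini-1.5-pro` model name are illustrative here, not the final sample code):

```python
# Sketch only, not the code from this PR: assumes google-cloud-aiplatform
# with the preview Gen AI evaluation SDK installed.
import pandas as pd
import vertexai
from vertexai.generative_models import GenerativeModel
from vertexai.preview.evaluation import EvalTask

vertexai.init(project="your-project-id", location="us-central1")  # placeholder values

reference = (
    "The Great Barrier Reef is the world's largest coral reef system and is "
    "home to an incredible diversity of marine life."
)

# Option 1: bring-your-own-response (BYOR). The dataset already contains
# model responses, so no inference is run; a "prompt" column would be ignored.
byor_dataset = pd.DataFrame(
    {
        "response": ["The Great Barrier Reef is the largest coral reef system."],
        "reference": [reference],
    }
)
byor_result = EvalTask(dataset=byor_dataset, metrics=["rouge_l_sum"]).evaluate()

# Option 2: run model inference as part of the evaluation. The dataset
# provides prompts, and passing a model to evaluate() makes the SDK generate
# the responses before computing the metric.
inference_dataset = pd.DataFrame(
    {
        "prompt": [f"Summarize in one sentence: {reference}"],
        "reference": [reference],
    }
)
inference_result = EvalTask(
    dataset=inference_dataset, metrics=["rouge_l_sum"]
).evaluate(model=GenerativeModel("gemini-1.5-pro"))

print(byor_result.summary_metrics)
print(inference_result.summary_metrics)
```

Showing both side by side makes the distinction explicit: the "prompt" column only takes effect when a model is passed to `evaluate()`, while in BYOR mode it is silently ignored.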

Member

While I understand your point, this code sample is still too big (100 lines). Let me check with the tech writing team.

Member

Also note: I don't see an example response section for this part of the code.

@msampathkumar msampathkumar self-assigned this Feb 4, 2025
@msampathkumar msampathkumar marked this pull request as ready for review February 14, 2025 09:32
@msampathkumar msampathkumar requested a review from a team as a code owner February 14, 2025 09:32
msampathkumar (Member)

Waiting for Kokoro CI - Python 3.13 to complete

msampathkumar (Member)

@Valeriy-Burlaka, can you check and address the unresolved comments?

@msampathkumar msampathkumar added the waiting-response Waiting for the author's response. label Feb 14, 2025
@msampathkumar msampathkumar added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Feb 24, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Feb 24, 2025