Skip to content

Conversation

@tianxiao-baai
Copy link
Contributor

OP Test

New Feature

Description

add unified benchmark summary script for operator testing

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @tianxiao-baai, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers a new, unified benchmark summary script for operator testing within the FlagGems framework. The script automates the execution of both accuracy and performance tests, leveraging parallel processing across specified GPUs to efficiently gather comprehensive data. Its primary goal is to provide a streamlined method for evaluating operator performance and correctness, consolidating all critical metrics into an easily digestible CSV report.

Highlights

  • New Test Script: Introduces run_flaggems_tests.sh, a new bash script designed to automate the end-to-end testing of operators within the FlagGems framework.
  • Parallel Execution: The script enables concurrent accuracy and performance testing of operators by distributing them across multiple specified GPUs, significantly improving testing efficiency.
  • Unified Reporting: It generates a comprehensive CSV summary of test results, including accuracy status (passed, failed, skipped) and detailed speedup metrics for various data types such as float16, float32, bfloat16, int16, int32, bool, and cfloat.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new shell script for running FlagGems operator tests in parallel across multiple GPUs. The script is a valuable addition for automating the testing process. My review focuses on improving the script's robustness, efficiency, and maintainability. I've identified several areas for improvement, including unsafe variable handling that could lead to errors, inefficient calculations within loops, and brittle logic for file and output parsing. Applying the suggested changes will make the script more reliable and performant.

Comment on lines 170 to 180
sum=0; cnt=0
for v in "$float16_speedup" "$float32_speedup" "$bfloat16_speedup" "$int16_speedup" "$int32_speedup" "$bool_speedup" "$cfloat_speedup"; do
if [[ $v =~ $is_number_regex ]]; then
# 只把大于 0 的计入
gt=$(awk -v x="$v" 'BEGIN{ print (x+0>0) ? 1 : 0 }')
if [ "$gt" -eq 1 ]; then
sum=$(awk -v a="$sum" -v b="$v" 'BEGIN{ printf "%.12f", a + b }')
cnt=$((cnt+1))
fi
fi
done
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This block calculates a fallback average speedup by calling awk multiple times inside a for loop. Spawning a new process for each arithmetic operation and comparison is highly inefficient. This entire calculation can be performed in a single, much faster awk command that processes all values at once.

Suggested change
sum=0; cnt=0
for v in "$float16_speedup" "$float32_speedup" "$bfloat16_speedup" "$int16_speedup" "$int32_speedup" "$bool_speedup" "$cfloat_speedup"; do
if [[ $v =~ $is_number_regex ]]; then
# 只把大于 0 的计入
gt=$(awk -v x="$v" 'BEGIN{ print (x+0>0) ? 1 : 0 }')
if [ "$gt" -eq 1 ]; then
sum=$(awk -v a="$sum" -v b="$v" 'BEGIN{ printf "%.12f", a + b }')
cnt=$((cnt+1))
fi
fi
done
local values="$float16_speedup $float32_speedup $bfloat16_speedup $int16_speedup $int32_speedup $bool_speedup $cfloat_speedup"
read sum cnt < <(echo "$values" | awk '{
s=0; c=0;
for(i=1; i<=NF; i++) {
if ($i > 0) { s+=$i; c++ }
}
print s, c
}')
sum=${sum:-0}
cnt=${cnt:-0}

Comment on lines +158 to +165
float16_speedup=$(echo "$float16_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
float32_speedup=$(echo "$float32_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
bfloat16_speedup=$(echo "$bfloat16_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
int16_speedup=$(echo "$int16_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
int32_speedup=$(echo "$int32_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
bool_speedup=$(echo "$bool_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
cfloat_speedup=$(echo "$cfloat_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
avg_speedup=$(echo "$avg_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

These lines repeat the same sed command for eight different variables. This violates the DRY (Don't Repeat Yourself) principle and makes the code harder to maintain. Furthermore, the awk script's trim function (lines 108-113) already removes surrounding quotes and whitespace, which makes these sed calls redundant. I recommend removing these lines for cleaner code. If they are intended as a defensive measure, they should be refactored into a single function.

# ---------------- 执行测试 ----------------
for gpu_id in "${GPUS[@]}"; do
{
for op in ${GPU_TASKS[$gpu_id]}; do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The loop for op in ${GPU_TASKS[$gpu_id]} relies on word splitting to iterate through operator names. If an operator name contains whitespace, it will be incorrectly split into multiple parts, causing test failures. To make this more robust, you could store operators as a newline-separated string (e.g., GPU_TASKS[$gpu_id]+="${OPS[$i]}"$'\n' on line 220) and then iterate using a while read loop: while IFS= read -r op; do [[ -n "$op" ]] && run_op_test "$op" "$gpu_id"; done <<< "${GPU_TASKS[$gpu_id]}".

@meinie0826
Copy link
Collaborator

Please install the pre-commit, and format your code. You can refer to https://github.com/FlagOpen/FlagGems/blob/master/docs/code_countribution.md.

@meinie0826
Copy link
Collaborator

Gemini provides some effective suggestions that can be referenced.

meinie0826
meinie0826 previously approved these changes Nov 5, 2025
meinie0826
meinie0826 previously approved these changes Nov 5, 2025
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants